Crawling Your Application

The Knowledge Graph crawls your application to discover all reachable pages and states, building a visual map of user navigation paths.

Starting a Crawl

1. Navigate to Knowledge Graph

From your project dashboard, select Knowledge Graph in the sidebar.

2. Select Target Environment

Choose where to crawl:

  • Localhost - Your local dev server (requires a tunnel connection)
  • Staging - Staging environment URL
  • Production - Live site (read-only recommended)

3. Launch the Crawl

Click Build & Crawl to start a fresh exploration, or Start Crawl to continue from previously discovered state.

The crawler begins at your configured start URL and explores outward, clicking links and buttons to discover new states.
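
Conceptually, the exploration is a breadth-first traversal over states. The TypeScript sketch below illustrates the idea only; discoverActions is a hypothetical stand-in for the crawler's real page-loading and interaction logic, not an actual API.

// Minimal sketch of a breadth-first crawl, assuming a hypothetical
// discoverActions(url) that loads a page and returns the URLs reachable
// by clicking its links and buttons.
declare function discoverActions(url: string): Promise<string[]>;

async function crawl(startUrl: string, maxDepth = 10): Promise<Set<string>> {
  const queue = [{ url: startUrl, depth: 0 }];
  const discovered = new Set<string>([startUrl]);

  while (queue.length > 0) {
    const { url, depth } = queue.shift()!;
    if (depth >= maxDepth) continue; // honor the Max Depth limit

    for (const next of await discoverActions(url)) {
      if (!discovered.has(next)) {
        discovered.add(next);
        queue.push({ url: next, depth: depth + 1 });
      }
    }
  }
  return discovered;
}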

Monitoring Progress

Live Dashboard

During crawling, you'll see:

  • Discovered States - Total unique pages/states found
  • Exploration Progress - Percentage of discovered states fully explored
  • Current URL - Page currently being analyzed
  • Actions Queued - Remaining interactions to try

Graph View

The live graph updates as new states are discovered:

  • Nodes appear as pages are found
  • Edges draw as navigation paths are confirmed
  • The graph expands in real time as exploration proceeds

Understanding the Graph

Nodes (States)

Each node represents a unique application state:

  • URL-based states - Different pages (/home, /products, /checkout)
  • Dynamic states - Same URL, different content (modal open, form step 2)
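
Because dynamic states share a URL, a state's identity must include more than the URL alone. One plausible approach (purely illustrative, not the product's actual scheme) is to key each state on the URL plus a fingerprint of the rendered DOM:

// Illustrative state key: URL plus a coarse hash of the rendered DOM,
// so "modal open" and "modal closed" on the same URL become distinct
// states. A real implementation would normalize far more carefully.
function stateKey(url: string, domSnapshot: string): string {
  const normalized = domSnapshot.replace(/\d+/g, "#"); // drop volatile numbers
  let hash = 0;
  for (const ch of normalized) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0;
  }
  return url + "::" + (hash >>> 0).toString(16);
}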

Edges (Transitions)

Edges show how users navigate between states:

  • Link clicks - Standard navigation
  • Button clicks - Form submissions, modals
  • Programmatic - JavaScript navigation
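
Put together, you can picture the graph as typed nodes and edges. The shapes below are a mental model only; the product does not expose this schema.

// Hypothetical data model for the graph, for intuition only.
type TransitionKind = "link" | "button" | "programmatic";

interface GraphNode {
  id: string;        // state key (URL, plus a fingerprint for dynamic states)
  url: string;
  dynamic: boolean;  // true when the state shares its URL with other states
}

interface GraphEdge {
  from: string;          // source node id
  to: string;            // target node id
  kind: TransitionKind;  // how the navigation was triggered
  selector?: string;     // the element clicked, when applicable
}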

Color Coding

  • Green - Fully tested with passing tests
  • Yellow - Discovered but untested
  • Red - Tests exist but failing
  • Gray - Excluded from testing

Configuring Crawls

Authentication

For protected pages, configure auth in Project Settings > Authentication:

Login URL: /login
Username field: #email
Password field: #password
Submit button: button[type="submit"]
Credentials: Use environment variables

The crawler logs in before exploring, maintaining the session throughout.
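
To make the settings concrete, the login step amounts to roughly the following, sketched here with Playwright. The sketch and the LOGIN_EMAIL/LOGIN_PASSWORD variable names are assumptions for illustration, not the product's implementation.

import { chromium } from "playwright";

// Sketch of the login step using the selectors configured above.
// Credentials are read from environment variables, never hard-coded.
async function loginBeforeCrawl(baseUrl: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto(baseUrl + "/login");
  await page.fill("#email", process.env.LOGIN_EMAIL!);
  await page.fill("#password", process.env.LOGIN_PASSWORD!);
  await page.click('button[type="submit"]');
  await page.waitForLoadState("networkidle"); // session is now established

  return { browser, page }; // the crawl continues with this session
}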

Depth Limits

Control crawl scope:

  • Max Depth: How many clicks from start URL (default: 10)
  • Max States: Stop after discovering N states (default: 500)
  • Timeout: Maximum crawl duration (default: 30 minutes)
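
Taken together, the documented defaults correspond to a configuration like this (the field names are illustrative, not the product's actual schema):

// Illustrative shape of the crawl limits, using the documented defaults.
interface CrawlLimits {
  maxDepth: number;       // clicks from the start URL
  maxStates: number;      // stop after discovering this many states
  timeoutMinutes: number; // maximum crawl duration
}

const defaultLimits: CrawlLimits = {
  maxDepth: 10,
  maxStates: 500,
  timeoutMinutes: 30,
};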

Exclude Patterns

Prevent the crawler from triggering destructive actions:

Exclude URLs:
/logout
/delete/*
/admin/destroy/*

Exclude Selectors:
button.danger
[data-action="delete"]
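
In effect, the crawler checks each candidate element against the excluded selectors before clicking it, along these lines (a hypothetical guard, using the standard Element.matches API):

// Hypothetical pre-click guard: skip any element that matches an
// excluded selector.
const excludedSelectors = ["button.danger", '[data-action="delete"]'];

function isExcluded(element: Element): boolean {
  return excludedSelectors.some((selector) => element.matches(selector));
}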

Include Patterns

Restrict crawling to specific sections:

Include URLs:
/app/*
/dashboard/*
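
Include and exclude URL patterns combine into a single visit filter: a URL is crawled only if it matches an include pattern (when any are configured) and matches no exclude pattern. A sketch, treating the globs above as simple prefix patterns:

// Hypothetical URL filter combining include and exclude patterns.
const includePatterns = [/^\/app\//, /^\/dashboard\//];
const excludePatterns = [/^\/logout$/, /^\/delete\//, /^\/admin\/destroy\//];

function shouldCrawl(path: string): boolean {
  const included =
    includePatterns.length === 0 || includePatterns.some((p) => p.test(path));
  const excluded = excludePatterns.some((p) => p.test(path));
  return included && !excluded;
}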

Common Scenarios

Crawling Authenticated Sections

  1. Configure login credentials in project settings
  2. Set start URL to a page requiring auth
  3. Add /logout to exclude patterns
  4. Start the crawl - it authenticates before exploring

Handling Dynamic Content

The crawler waits for content to load before analyzing. For slow-loading content:

  1. Increase Page Load Timeout in settings
  2. Add Wait Selectors - elements that indicate the page is ready (see the sketch after this list):
    .content-loaded
    [data-ready="true"]
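
A wait selector simply delays analysis until the element appears. In Playwright terms (illustrative only; the timeout value here is arbitrary):

import { Page } from "playwright";

// Sketch: block analysis until the configured readiness selectors appear.
async function waitUntilReady(page: Page): Promise<void> {
  await page.waitForSelector(".content-loaded", { timeout: 15000 });
  await page.waitForSelector('[data-ready="true"]', { timeout: 15000 });
}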

Multi-Step Forms

The crawler discovers form states automatically:

  • Each form step becomes a node
  • Validation states (errors shown) become separate nodes
  • Success/confirmation pages link back to the flow

To ensure complete coverage:

  1. Start crawl from the form entry point
  2. Provide test data in Form Fill Settings (example below)
  3. Review discovered states for completeness
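
The exact Form Fill Settings format is configured in the UI; as an illustration, think of it as a map from field selectors to the values the crawler should enter (the selectors and values below are examples only):

// Illustrative form-fill data: field selectors mapped to test values.
const formFillData: Record<string, string> = {
  "#first-name": "Test",
  "#last-name": "User",
  "#email": "test.user@example.com",
  "#company": "Example Inc.",
};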

Single-Page Applications (SPAs)

SPAs work seamlessly. The crawler detects client-side route changes, waits for dynamic content to render, and captures state even without URL changes.
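
For intuition, detecting client-side route changes typically means hooking the History API, roughly as follows (a hypothetical sketch, not the crawler's actual mechanism):

// Hypothetical route-change hook: wrap history.pushState and listen for
// popstate so navigations without full page loads are still observed.
function onRouteChange(handler: (path: string) => void): void {
  const originalPushState = history.pushState.bind(history);
  history.pushState = (data, unused, url) => {
    originalPushState(data, unused, url);
    handler(location.pathname);
  };
  window.addEventListener("popstate", () => handler(location.pathname));
}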

Stopping and Resuming

  • Pause: Halt exploration while preserving discovered state
  • Resume: Continue from where you stopped
  • Reset: Clear all discovered states via Settings > Reset Knowledge Graph

Best Practices

  • Start small - Test with limited depth first to verify exclude patterns
  • Exclude destructive actions - Always exclude logout, delete, and admin endpoints
  • Use staging - Crawl staging environments when possible
  • Review results - Check the graph for unexpected states or missing sections