BridgeToAgent
Explainer6 min read

Inside Chrome 146 — what lighthouse --only-categories=agentic-browsing actually does

Running Lighthouse Agentic Browsing from the command line skips Chrome DevTools, scripts cleanly into CI, and emits JSON that you can pipe into dashboards. This post unpacks what the CLI flag does internally, the JSON output shape, the flags that matter (--throttling-method, --emulated-form-factor, --screenEmulation), and a reference CI workflow that turns the audit into a per-PR gate.

BridgeToAgentEditorial team

Inside Chrome 146 — what lighthouse --only-categories=agentic-browsing actually does

If you're shipping the kit files programmatically — generating them in a build pipeline, validating them in CI, gating PRs on the audit passing — the right tool is Lighthouse on the command line, not Chrome DevTools. This post is the technical deep-dive on what the CLI flag does, the JSON it emits, the flags that matter for stability, and the CI shape that turns the audit into an enforced gate.

This is the cluster-2 supporting post written for developers — if you're an SMB-owner trying to manually verify your install, the 5-minute install audit is the right tab. If you're wiring Lighthouse into your build, keep reading.


Install

npm install -g lighthouse
# or
npx lighthouse --version

The Agentic Browsing category requires Lighthouse 12.4 or higher (matches Chrome 146's category-bundling). Older versions silently ignore the --only-categories=agentic-browsing flag and run nothing.

Confirm:

lighthouse --help | grep -A 2 only-categories

Should list agentic-browsing alongside performance, accessibility, best-practices, seo, and pwa.


The basic invocation

lighthouse https://yourdomain.com \
  --only-categories=agentic-browsing \
  --output=json \
  --output-path=./audit.json \
  --chrome-flags="--headless"

What this does:

  1. Launches a headless Chrome instance.
  2. Navigates to the URL.
  3. Runs the Agentic Browsing category audits — all nine of them.
  4. Emits a JSON report at the output path.
  5. Exits 0 (success) or non-zero (audit failures, depending on --fail-on=).

--chrome-flags="--headless" is the load-bearing flag in CI environments where there's no display server. Locally you can drop it to watch Chrome actually load the page.


The JSON output shape

Lighthouse's JSON output is structured around categories and audits:

{
  "categories": {
    "agentic-browsing": {
      "id": "agentic-browsing",
      "title": "Agentic Browsing",
      "score": 0.78,
      "auditRefs": [
        { "id": "llms-txt-present", "weight": 10 },
        { "id": "llms-txt-well-formed", "weight": 10 },
        { "id": "agents-json-present", "weight": 10 },
        { "id": "agents-json-actions-typed", "weight": 15 },
        { "id": "agent-runbook-present", "weight": 10 },
        { "id": "auto-discovery-links", "weight": 10 },
        { "id": "sitemap-discoverable", "weight": 10 },
        { "id": "schema-org-density", "weight": 20 },
        { "id": "webmcp-annotations", "weight": 5 }
      ]
    }
  },
  "audits": {
    "llms-txt-present": {
      "id": "llms-txt-present",
      "title": "...",
      "score": 1,
      "displayValue": "Present"
    },
    "agents-json-actions-typed": {
      "id": "agents-json-actions-typed",
      "score": 0,
      "details": {
        "items": [
          {
            "action": "search",
            "issue": "Parameter 'q' has no declared type"
          }
        ]
      }
    }
  }
}

Per-audit weights inside the category as of the experimental spec: schema-org-density weighs the most (20%), agents-json-actions-typed next (15%), the file-presence audits 10% each, webmcp-annotations lowest (5% because the spec is still moving). Weights shift as the category stabilizes — re-read the spec page before relying on weights for prioritization.

The details.items array on failing audits is the load-bearing diagnostic — it tells you exactly which file/action/parameter triggered the failure. Pipe it into your CI logs.


The flags that matter

--throttling-method

Default is simulate (Lighthouse models a slower network in software). For agent-readiness audits, the kit files are tiny and the network throttling doesn't change scores — set --throttling-method=devtools or --throttling-method=provided to skip the network simulation and shave 5-10 seconds off the run.

--emulated-form-factor

Lighthouse defaults to mobile emulation. For Agentic Browsing audits this rarely matters (the audits check files and head tags, not viewport), but if your site renders different HTML to mobile vs desktop, run twice with --emulated-form-factor=desktop and --emulated-form-factor=mobile and compare the per-audit scores.

--screenEmulation.disabled

If your CI runner is non-graphical and you're seeing screen-emulation errors, pass --screenEmulation.disabled=true. The Agentic Browsing category doesn't rely on visual rendering — the audits inspect HTML structure and file existence, not pixels.

--max-wait-for-load

Default 45000ms. If your site has slow third-party scripts (chat widgets, analytics), Lighthouse waits for the load event before scoring. Set to a lower value (--max-wait-for-load=15000) to time-bound the audit in CI. Files served quickly will pass; pages with slow loads might get partial scores.

--save-assets

Useful for debugging — emits screenshots, traces, and DevTools metrics alongside the JSON. Generates ~50MB of artifacts per run; skip in production CI.


CI shape — gating PRs on the audit

A minimal GitHub Actions workflow that runs the audit on every PR and fails if the Agentic Browsing score drops below 80:

name: Lighthouse Agentic Browsing audit

on:
  pull_request:
    branches: [main]

jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Build site
        run: npm ci && npm run build

      - name: Start preview server
        run: npm run preview &
        env:
          PORT: 3000

      - name: Wait for server
        run: npx wait-on http://localhost:3000

      - name: Install Lighthouse
        run: npm install -g lighthouse

      - name: Run Agentic Browsing audit
        run: |
          lighthouse http://localhost:3000 \
            --only-categories=agentic-browsing \
            --output=json \
            --output-path=./audit.json \
            --chrome-flags="--headless --no-sandbox" \
            --quiet

      - name: Check score threshold
        run: |
          SCORE=$(node -e "
            const r = require('./audit.json');
            console.log(r.categories['agentic-browsing'].score * 100);
          ")
          echo "Agentic Browsing score: $SCORE"
          if (( $(echo "$SCORE < 80" | bc -l) )); then
            echo "Score below 80, failing PR"
            exit 1
          fi

      - name: Upload audit JSON
        uses: actions/upload-artifact@v4
        with:
          name: lighthouse-audit
          path: audit.json

This runs on every PR, builds the site, starts a preview server, runs Lighthouse against the local URL, parses the score from the JSON, and fails the PR if the score drops below 80. The audit JSON is uploaded as an artifact for review.

Caveats:

  • The audit needs publicly-resolvable kit files. If your build process generates agents.json / llms.txt / agent-instructions.md at runtime (not at build time), they won't be present in the preview server. Generate at build time so they're served as static files.
  • Headless Chrome on Ubuntu needs --no-sandbox — the GitHub Actions Ubuntu runner doesn't allow sandboxed Chrome. The flag is GitHub-Actions-specific; remove it for self-hosted runners.
  • Score thresholds are noisy. Lighthouse scores vary ±2 points between runs even on identical content. Set the threshold conservatively (75 or 80) rather than tight (90+) unless you know your audits all pass at maximum scores.

Comparing the CLI to the DevTools panel

DimensionDevToolsCLI
SetupBuilt into Chromenpm install -g lighthouse
OutputVisual reportJSON / HTML
ScriptingManualScriptable into CI
ThrottlingReal networkSimulated by default
Comparing runsManualjq against multiple JSON files
CI integrationNoneNative
First-time learning curveLowMedium

The DevTools panel is right for ad-hoc audits during development. The CLI is right for build pipelines, PR gates, and tracking scores over time. Many teams use both — DevTools for "is this fix working" iteration, CLI for "the audit must stay green" enforcement.


Tracking scores over time

If you want a dashboard of your Agentic Browsing score over time, the easiest path is Lighthouse CI — Google's reference server for ingesting CLI output. Self-host it on Vercel or Render in an afternoon. Configure your CI workflow to upload audit JSON after every successful run, and the LHCI dashboard plots score deltas per-build.

Alternative: log scores to your existing observability stack (Grafana, Datadog, simple Postgres table) as a separate lighthouse_score metric per category. The data shape is straightforward — timestamp, branch, commit SHA, category, score, per-audit pass/fail. Five rows per run.


Related

All posts →