Inside Chrome 146 — what `lighthouse --only-categories=agentic-browsing` actually does

If you're shipping the kit files programmatically — generating them in a build pipeline, validating them in CI, gating PRs on the audit passing — the right tool is Lighthouse on the command line, not Chrome DevTools. This post is the technical deep-dive on what the CLI flag does, the JSON it emits, the flags that matter for stability, and the CI shape that turns the audit into an enforced gate.

This post is written for developers. If you're an SMB owner trying to verify your install by hand, the 5-minute install audit is the better fit. If you're wiring Lighthouse into your build, keep reading.

Install

npm install -g lighthouse
# or
npx lighthouse --version

The Agentic Browsing category requires Lighthouse 12.4 or higher (matches Chrome 146's category-bundling). Older versions silently ignore the --only-categories=agentic-browsing flag and run nothing.

Confirm:

lighthouse --help | grep -A 2 only-categories

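Should list agentic-browsing alongside performance, accessibility, best-practices, and seo. (The pwa category was removed in Lighthouse 12, so it won't appear on any version new enough to run this audit.)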
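If you'd rather check the version requirement in a script than by eye, a minimal sketch follows. It assumes the 12.4 minimum stated above and that the version string looks like the output of `lighthouse --version`; the function names are illustrative, not part of Lighthouse.

```javascript
// Minimum Lighthouse version this post cites for the Agentic Browsing category.
const MIN_VERSION = '12.4.0';

// Parse "12.4.0" (or "12.4") into [major, minor, patch]; any suffix is ignored.
function parseVersion(text) {
  const match = /(\d+)\.(\d+)(?:\.(\d+))?/.exec(String(text));
  if (!match) throw new Error(`Unrecognized version string: ${text}`);
  return [Number(match[1]), Number(match[2]), Number(match[3] || 0)];
}

// True when `actual` is at least `minimum`, comparing numerically part by part.
function meetsMinimum(actual, minimum = MIN_VERSION) {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== m[i]) return a[i] > m[i];
  }
  return true; // identical versions
}

console.log(meetsMinimum('12.3.0')); // false
console.log(meetsMinimum('12.4.0')); // true
console.log(meetsMinimum('12.10.1')); // true
```

In CI you could feed it the trimmed output of `execSync('lighthouse --version')` and fail early, which avoids the silent no-op described above.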

The basic invocation

lighthouse https://yourdomain.com \
  --only-categories=agentic-browsing \
  --output=json \
  --output-path=./audit.json \
  --chrome-flags="--headless"

What this does:

Launches a headless Chrome instance.
Navigates to the URL.
Runs the Agentic Browsing category audits — all nine of them.
Emits a JSON report at the output path.
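Exits 0 when the run completes, or non-zero when Lighthouse itself hits a runtime error. The Lighthouse CLI has no built-in score gate, so a low score still exits 0; enforce thresholds by parsing the JSON afterwards.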

--chrome-flags="--headless" is the load-bearing flag in CI environments where there's no display server. Locally you can drop it to watch Chrome actually load the page.
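The same invocation can be driven from a Node script, which is handy when the URL or output path comes from your build. This is a sketch, assuming `lighthouse` is already on PATH; `buildArgs` and `runAudit` are hypothetical helper names.

```javascript
const { spawnSync } = require('node:child_process');

// Build the argument list for the invocation shown above.
// `headless` defaults to true because CI runners usually have no display server.
function buildArgs(url, { outputPath = './audit.json', headless = true } = {}) {
  const args = [
    url,
    '--only-categories=agentic-browsing',
    '--output=json',
    `--output-path=${outputPath}`,
  ];
  if (headless) args.push('--chrome-flags=--headless');
  return args;
}

// Run the CLI and return its exit status (null if the process could not start).
// Assumes `lighthouse` is on PATH, e.g. after `npm install -g lighthouse`.
function runAudit(url, options) {
  const result = spawnSync('lighthouse', buildArgs(url, options), { stdio: 'inherit' });
  return result.status;
}

// Print the command line that would be executed, without running it.
console.log(['lighthouse', ...buildArgs('https://yourdomain.com')].join(' '));
```

Keeping argument construction separate from execution makes the flag set easy to unit test without launching Chrome.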

The JSON output shape

Lighthouse's JSON output is structured around categories and audits:

{
  "categories": {
    "agentic-browsing": {
      "id": "agentic-browsing",
      "title": "Agentic Browsing",
      "score": 0.78,
      "auditRefs": [
        { "id": "llms-txt-present", "weight": 10 },
        { "id": "llms-txt-well-formed", "weight": 10 },
        { "id": "agents-json-present", "weight": 10 },
        { "id": "agents-json-actions-typed", "weight": 15 },
        { "id": "agent-runbook-present", "weight": 10 },
        { "id": "auto-discovery-links", "weight": 10 },
        { "id": "sitemap-discoverable", "weight": 10 },
        { "id": "schema-org-density", "weight": 20 },
        { "id": "webmcp-annotations", "weight": 5 }
      ]
    }
  },
  "audits": {
    "llms-txt-present": {
      "id": "llms-txt-present",
      "title": "...",
      "score": 1,
      "displayValue": "Present"
    },
    "agents-json-actions-typed": {
      "id": "agents-json-actions-typed",
      "score": 0,
      "details": {
        "items": [
          {
            "action": "search",
            "issue": "Parameter 'q' has no declared type"
          }
        ]
      }
    }
  }
}

Per-audit weights inside the category as of the experimental spec: schema-org-density weighs the most (20%), agents-json-actions-typed next (15%), the file-presence audits 10% each, webmcp-annotations lowest (5% because the spec is still moving). Weights shift as the category stabilizes — re-read the spec page before relying on weights for prioritization.
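To see how the weights feed into the category score, here is a small sketch that recomputes a weighted mean from `auditRefs` and `audits`. Skipping audits with a null score is an assumption of this sketch, and the three-audit sample is made up for illustration.

```javascript
// Weighted mean of audit scores, using the weights listed in auditRefs.
// Audits with a missing or null score are skipped (an assumption of this sketch).
function weightedScore(category, audits) {
  let weightSum = 0;
  let total = 0;
  for (const ref of category.auditRefs) {
    const audit = audits[ref.id];
    if (!audit || typeof audit.score !== 'number') continue;
    weightSum += ref.weight;
    total += ref.weight * audit.score;
  }
  return weightSum === 0 ? null : total / weightSum;
}

// Hypothetical three-audit example, not real Lighthouse output.
const category = {
  auditRefs: [
    { id: 'schema-org-density', weight: 20 },
    { id: 'agents-json-actions-typed', weight: 15 },
    { id: 'webmcp-annotations', weight: 5 },
  ],
};
const audits = {
  'schema-org-density': { score: 1 },
  'agents-json-actions-typed': { score: 0 },
  'webmcp-annotations': { score: 1 },
};

console.log(weightedScore(category, audits)); // 0.625
```

Here the failing 15-weight audit costs 15 of the 40 available weight points, which is why fixing the heaviest failing audit first usually moves the score most.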

The details.items array on failing audits is the load-bearing diagnostic — it tells you exactly which file/action/parameter triggered the failure. Pipe it into your CI logs.
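One way to get those diagnostics into CI logs is a short Node helper like the sketch below. It relies only on the fields shown in the JSON above; treating any score below 1 as failing is an assumption, and the helper names are illustrative.

```javascript
// Return the category's audits that did not score 1, with their detail items.
function failingAudits(report, categoryId = 'agentic-browsing') {
  const category = report.categories[categoryId];
  if (!category) return [];
  return category.auditRefs
    .map((ref) => report.audits[ref.id])
    .filter((audit) => audit && typeof audit.score === 'number' && audit.score < 1)
    .map((audit) => ({
      id: audit.id,
      score: audit.score,
      items: (audit.details && audit.details.items) || [],
    }));
}

// One line per failing audit, followed by one indented line per detail item.
function formatForLog(failures) {
  const lines = [];
  for (const failure of failures) {
    lines.push(`FAIL ${failure.id} (score ${failure.score})`);
    for (const item of failure.items) lines.push(`  ${JSON.stringify(item)}`);
  }
  return lines.join('\n');
}

// Trimmed-down version of the report shape shown above.
const sampleReport = {
  categories: {
    'agentic-browsing': {
      auditRefs: [{ id: 'llms-txt-present' }, { id: 'agents-json-actions-typed' }],
    },
  },
  audits: {
    'llms-txt-present': { id: 'llms-txt-present', score: 1 },
    'agents-json-actions-typed': {
      id: 'agents-json-actions-typed',
      score: 0,
      details: { items: [{ action: 'search', issue: "Parameter 'q' has no declared type" }] },
    },
  },
};

console.log(formatForLog(failingAudits(sampleReport)));
```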

The flags that matter

`--throttling-method`

Default is simulate (Lighthouse loads the page unthrottled, then models a slower network in software). For agent-readiness audits, the kit files are tiny and network throttling doesn't change scores, so set --throttling-method=provided to skip throttling entirely and shave 5-10 seconds off the run. Avoid --throttling-method=devtools for this purpose: it applies real throttling during the page load, which makes the run slower, not faster.

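`--form-factor` and `--preset=desktop`

Lighthouse defaults to mobile emulation. For Agentic Browsing audits this rarely matters (the audits check files and head tags, not viewport), but if your site renders different HTML to mobile vs desktop, run twice, once with the default mobile settings and once with --preset=desktop, and compare the per-audit scores. The older --emulated-form-factor flag was removed in Lighthouse 7; current versions use --form-factor together with --screenEmulation, and --preset=desktop sets both for you.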
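Comparing the two runs by hand gets tedious, so here is a sketch of a per-audit diff between two parsed reports. It assumes both reports contain the agentic-browsing category in the shape shown earlier; `diffAuditScores` is a hypothetical helper.

```javascript
// List audits whose score differs between two reports (e.g. mobile vs desktop).
// An audit missing from a report is treated as a null score.
function diffAuditScores(reportA, reportB, categoryId = 'agentic-browsing') {
  const refs = reportA.categories[categoryId].auditRefs;
  const diffs = [];
  for (const { id } of refs) {
    const a = reportA.audits[id]?.score ?? null;
    const b = reportB.audits[id]?.score ?? null;
    if (a !== b) diffs.push({ id, a, b });
  }
  return diffs;
}

// Hypothetical reports: the desktop HTML is missing the discovery links.
const refs = [{ id: 'llms-txt-present' }, { id: 'auto-discovery-links' }];
const mobile = {
  categories: { 'agentic-browsing': { auditRefs: refs } },
  audits: { 'llms-txt-present': { score: 1 }, 'auto-discovery-links': { score: 1 } },
};
const desktop = {
  categories: { 'agentic-browsing': { auditRefs: refs } },
  audits: { 'llms-txt-present': { score: 1 }, 'auto-discovery-links': { score: 0 } },
};

console.log(diffAuditScores(mobile, desktop));
// [ { id: 'auto-discovery-links', a: 1, b: 0 } ]
```

An empty array means the two form factors agree on every audit, in which case a single run per build is enough.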

`--screenEmulation.disabled`

If your CI runner is non-graphical and you're seeing screen-emulation errors, pass --screenEmulation.disabled=true. The Agentic Browsing category doesn't rely on visual rendering — the audits inspect HTML structure and file existence, not pixels.

`--max-wait-for-load`

Default 45000ms. If your site has slow third-party scripts (chat widgets, analytics), Lighthouse waits for the load event before scoring. Set to a lower value (--max-wait-for-load=15000) to time-bound the audit in CI. Files served quickly will pass; pages with slow loads might get partial scores.
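When you shorten the wait, it helps to detect runs that were cut short before trusting the score. The sketch below looks at `runtimeError`, `runWarnings`, and audits whose `scoreDisplayMode` is `error`; these field names follow the Lighthouse result format as commonly documented, so verify them against your version. The warning text in the sample is made up.

```javascript
// Collect signs that a run was cut short or partially failed.
// Field names are assumptions based on the documented Lighthouse result format.
function runProblems(report) {
  const problems = [];
  const err = report.runtimeError;
  if (err && err.code && err.code !== 'NO_ERROR') {
    problems.push(`runtime error: ${err.code}`);
  }
  for (const warning of report.runWarnings || []) {
    problems.push(`warning: ${warning}`);
  }
  for (const audit of Object.values(report.audits || {})) {
    if (audit.score === null && audit.scoreDisplayMode === 'error') {
      problems.push(`audit errored: ${audit.id}`);
    }
  }
  return problems;
}

// Hypothetical report fragment with one warning and one errored audit.
const partialRun = {
  runWarnings: ['Page load timed out (example text)'],
  audits: {
    'llms-txt-present': { id: 'llms-txt-present', score: 1, scoreDisplayMode: 'binary' },
    'schema-org-density': { id: 'schema-org-density', score: null, scoreDisplayMode: 'error' },
  },
};

console.log(runProblems(partialRun).join('\n'));
```

A non-empty result is a reasonable signal to retry the run or fail the job outright rather than gate on a partial score.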

`--save-assets`

Useful for debugging — emits screenshots, traces, and DevTools metrics alongside the JSON. Generates ~50MB of artifacts per run; skip in production CI.

CI shape — gating PRs on the audit

A minimal GitHub Actions workflow that runs the audit on every PR and fails if the Agentic Browsing score drops below 80:

name: Lighthouse Agentic Browsing audit

on:
  pull_request:
    branches: [main]

jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Build site
        run: npm ci && npm run build

      - name: Start preview server
        run: npm run preview &
        env:
          PORT: 3000

      - name: Wait for server
        run: npx wait-on http://localhost:3000

      - name: Install Lighthouse
        run: npm install -g lighthouse

      - name: Run Agentic Browsing audit
        run: |
          lighthouse http://localhost:3000 \
            --only-categories=agentic-browsing \
            --output=json \
            --output-path=./audit.json \
            --chrome-flags="--headless --no-sandbox" \
            --quiet

      - name: Check score threshold
        run: |
          SCORE=$(node -e "
            const r = require('./audit.json');
            console.log(r.categories['agentic-browsing'].score * 100);
          ")
          echo "Agentic Browsing score: $SCORE"
          if (( $(echo "$SCORE < 80" | bc -l) )); then
            echo "Score below 80, failing PR"
            exit 1
          fi

      - name: Upload audit JSON
        uses: actions/upload-artifact@v4
        with:
          name: lighthouse-audit
          path: audit.json

This runs on every PR, builds the site, starts a preview server, runs Lighthouse against the local URL, parses the score from the JSON, and fails the PR if the score drops below 80. The audit JSON is uploaded as an artifact for review.
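If you'd rather keep the threshold logic out of bash, the check can live in a small Node function. This sketch treats a missing category or a null score as a failure, which is an assumption on my part but covers the case where an older Lighthouse ignores the category; `checkThreshold` is a hypothetical name.

```javascript
// Decide whether a parsed report passes the gate.
// A missing category or non-numeric score counts as a failure.
function checkThreshold(report, threshold = 80, categoryId = 'agentic-browsing') {
  const category = report.categories ? report.categories[categoryId] : undefined;
  if (!category || typeof category.score !== 'number') {
    return { ok: false, score: null, message: `no numeric score for "${categoryId}"` };
  }
  // Scores are 0..1 in the JSON; round to avoid float noise such as 78.00000000000001.
  const score = Math.round(category.score * 100);
  return {
    ok: score >= threshold,
    score,
    message: `${categoryId} score ${score} (threshold ${threshold})`,
  };
}

const low = checkThreshold({ categories: { 'agentic-browsing': { score: 0.78 } } });
console.log(low.ok, low.message); // false agentic-browsing score 78 (threshold 80)

const missing = checkThreshold({ categories: {} });
console.log(missing.ok, missing.message); // false no numeric score for "agentic-browsing"
```

In the workflow you would parse `./audit.json` with `JSON.parse(fs.readFileSync(...))`, print the message, and set `process.exitCode = 1` when `ok` is false.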

Caveats:

The audit needs publicly-resolvable kit files. If your build process generates agents.json / llms.txt / agent-instructions.md at runtime (not at build time), they won't be present in the preview server. Generate at build time so they're served as static files.
Headless Chrome in CI often needs --no-sandbox. On GitHub-hosted Ubuntu runners and in many containers, Chrome's sandbox fails to start, which is why the workflow above passes the flag. It turns off a security boundary, so drop it on self-hosted runners where the sandbox works.
Score thresholds are noisy. Lighthouse scores vary ±2 points between runs even on identical content. Set the threshold conservatively (75 or 80) rather than tight (90+) unless you know your audits all pass at maximum scores.
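One common way to dampen run-to-run noise is to audit several times and gate on the median rather than a single result. The sketch below assumes you have already parsed each run's JSON into an object; the function names are illustrative.

```javascript
// Median of a list of numbers; null for an empty list.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Median category score across several parsed reports, ignoring runs
// where the category is missing or has no numeric score.
function medianCategoryScore(reports, categoryId = 'agentic-browsing') {
  const scores = reports
    .map((report) => report.categories?.[categoryId]?.score)
    .filter((score) => typeof score === 'number');
  return median(scores);
}

// Three hypothetical runs of the same build.
const runs = [0.78, 0.82, 0.8].map((score) => ({
  categories: { 'agentic-browsing': { score } },
}));

console.log(medianCategoryScore(runs)); // 0.8
```

Three runs is a reasonable starting point; an odd count keeps the median equal to an actual observed score.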

Comparing the CLI to the DevTools panel

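| Dimension | DevTools | CLI |
| --- | --- | --- |
| Setup | Built into Chrome | `npm install -g lighthouse` |
| Output | Visual report | JSON / HTML |
| Scripting | Manual | Scriptable into CI |
| Throttling | Simulated by default, changed in the panel settings | Simulated by default, changed with `--throttling-method` |
| Comparing runs | Manual | `jq` against multiple JSON files |
| CI integration | None | Native |
| First-time learning curve | Low | Medium |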

The DevTools panel is right for ad-hoc audits during development. The CLI is right for build pipelines, PR gates, and tracking scores over time. Many teams use both — DevTools for "is this fix working" iteration, CLI for "the audit must stay green" enforcement.

Tracking scores over time

If you want a dashboard of your Agentic Browsing score over time, the easiest path is Lighthouse CI (LHCI), the Chrome team's companion tooling, which includes a server for storing and charting reports. Self-host it on Vercel or Render in an afternoon; the server needs persistent storage, so check that your host provides a database. Configure your CI workflow to upload results after every successful run, and the LHCI dashboard plots score deltas per build.

Alternative: log scores to your existing observability stack (Grafana, Datadog, simple Postgres table) as a separate lighthouse_score metric per category. The data shape is straightforward — timestamp, branch, commit SHA, category, score, per-audit pass/fail. Five rows per run.
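As a sketch of that data shape, the helper below flattens a parsed report into one row per category. The column names, the `toMetricRows` helper, and the use of the report's `fetchTime` as the timestamp are assumptions for illustration; adapt them to your own schema.

```javascript
// Flatten a parsed report into one metric row per category.
// `branch` and `commitSha` come from your CI environment, not from Lighthouse.
function toMetricRows(report, { branch, commitSha }) {
  const timestamp = report.fetchTime || new Date().toISOString();
  return Object.values(report.categories).map((category) => ({
    timestamp,
    branch,
    commit_sha: commitSha,
    category: category.id,
    score: category.score,
    // Per-audit pass/fail: an audit passes only with a score of exactly 1.
    audits_passed: Object.fromEntries(
      category.auditRefs.map((ref) => [ref.id, report.audits[ref.id]?.score === 1]),
    ),
  }));
}

// Hypothetical single-category report.
const exampleReport = {
  fetchTime: '2026-01-01T00:00:00.000Z',
  categories: {
    'agentic-browsing': {
      id: 'agentic-browsing',
      score: 0.78,
      auditRefs: [{ id: 'llms-txt-present' }, { id: 'agents-json-actions-typed' }],
    },
  },
  audits: {
    'llms-txt-present': { score: 1 },
    'agents-json-actions-typed': { score: 0 },
  },
};

console.log(JSON.stringify(toMetricRows(exampleReport, { branch: 'main', commitSha: 'abc123' }), null, 2));
```

Each row can be inserted into a Postgres table as-is or emitted as a tagged metric, with `audits_passed` stored as a JSON column.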

Related

Lighthouse Agentic Browsing FAQ — every common question, answered: long-tail Q&A coverage, including this audit's most-searched questions
Lighthouse Agentic Browsing — every audit, every fix: the canonical per-audit fix reference
Chrome added an Agentic Browsing audit to Lighthouse: the category announcement
Cloudflare's Agent Readiness Score vs Chrome Lighthouse: when to use each
How the BridgeToAgent kit maps to every audit: what's kit-fixable, what's CMS/theme-side
The 5-minute install audit for SMBs who can't read code: the non-developer companion to this CLI post
Agent-readiness scoring frameworks reference: Lighthouse, Cloudflare, Bridge AI, The Optimisers compared
Lighthouse Agentic landing page: the BTA Lighthouse-focused reference page
