GEO Audit — 100-Site Survey — 2026-04-29
Executive Summary
Between 2026-03-30 and 2026-04-29 we audited 100 websites across 31 industries against a 13-signal protocol — 8 binary readiness signals plus 5 quantitative metrics — using the v3.4 audit endpoint. The cohort spans real estate, finance, government, news, education, e-commerce, healthcare, AI infrastructure and reference platforms. Every site receives the same scan, the same thresholds, and the same arithmetic. The result is a structural map of where AI retrieval pipelines can ingest the open web today, and where they cannot.
The headline finding is uncomfortable: Source Grounding Ratio — the verifiable-claim density signal — clears at just 11.1% of the cohort. Roughly one site in nine writes pages that an AI can cite with attribution. Top10Lists.us is the sole 13/13 result; the next-highest score is 9/13 (elevenlabs.io). 28 of 100 sites — including The Wall Street Journal, The New York Times, CoStar, Apartments.com, Glassdoor and Uber — block AI crawlers at the WAF and produce no measurable signals at all. This isn't an edge case: roughly one in three-and-a-half large web properties is structurally invisible to ChatGPT, Claude, Perplexity, and Gemini. Among the 69 sites that complete a full audit, the binary readiness layer alone has a median of 2/8 — bots get a meaningful page, sometimes JSON-LD, and very little else.
Per-signal, the failure modes are specific, and they sort cleanly into three layers, not one. Only one of the five quantitative metrics is purely infrastructure; the other four live in the data-structure and content layers, where configuration alone cannot move the score, and they do not yield to infrastructure work.
- Sitemap throughput (RPS ≥ 1,000), the lone purely-infrastructure metric, passes at 74.1%, the highest of the five. RPS is a hosting-and-pipeline problem: a properly configured sitemap on competent infrastructure clears it.
- Relevance Ratio (RR ≥ 0.85) passes at 44.6%. This is a data-structure problem: more than half the cohort renders templates where the majority of bytes are scaffold, not text.
- Retrieval Token Cost (RTC ≤ 0.50) passes at 29.2%. This is also a data-structure problem at root, with content as the secondary lever: seven sites in ten ship more JS scaffolding and chrome than usable content per byte delivered.
- Last-Modified Recency (LMR ≤ 30 days) clears at 43.2%. This is a content-discipline problem: stale sitemaps mean editorial cadence isn't producing fresh material the pipeline can detect, regardless of how the sitemap is hooked up.
- Source Grounding Ratio (SGR ≥ 0.30, original calibration) clears at 11.1%. It is the moat metric, and the hardest of the five to fake.
SGR is hard not because it's the only non-infrastructure signal (most of these are non-infrastructure) but because of which non-infrastructure layer it sits in: it demands a content discipline of attributed, verifiable claims. You cannot configure your way to high SGR; you have to write that way.
The failures are not concentrated in any one industry — they are universal. Government sites (NASA, data.gov, CDC) clear SGR but lag on AI-facing infrastructure files (no llms.txt, no MCP, no AI content feed). E-commerce and proptech (Shopify, Stripe, AppFolio, Yardi) clear infrastructure plumbing but fail on the data-structure layer (RR), grounding (SGR), and the composite (RTC) — the three problems that infrastructure investments alone don't solve. Major news (WSJ, NYT, Bloomberg) blocks bots at the perimeter. Major social and SaaS (LinkedIn, Netflix, Salesforce) sit in the same band as much smaller properties. The notable exceptions prove the moat: data.gov and nasa.gov clear SGR (0.9762 and 0.5000 respectively) on the strength of attributed primary-source content, despite missing several infrastructure signals — confirming SGR measures genuine content legibility, not plumbing.
For AI discovery, this is the citation problem in concrete form. Retrieval pipelines need four kinds of signal to cite a source efficiently: clean ingestion paths (the 8 binary signals — infrastructure), efficient page structure (RR — data structure), verifiable claim density (SGR — content), and a low overall retrieval cost (RTC — the composite of all three). The 2026-04-29 cohort shows almost no site clearing all four at once. The few that do are over-represented in AI citations relative to their domain authority — the thesis the GeoLocus whitepaper develops in detail. The 9.8% user-bot crawl share observed at Top10Lists.us, the only 13/13 site, is what happens when a small domain solves all four problems the rest of the cohort hasn't: AI assistants reach for it because the path of least resistance leads there.
The full methodology, every command issued, and the script that produces this scorecard are reproducible two ways. Get a free audit of your site in ∼60 seconds against the same 13 signals, or download the reproduction runbook to run the audit on your own machine.
Quantitative Metric Pass Rates
Each of the 5 quantitative metrics is binarized at a threshold. A site earns +1 toward its 13-signal score for each threshold it clears. Pass rates are over the full 98-site measured cohort (null values treated as fail).
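The binarization described above can be sketched in a few lines of Python. This is an illustrative sketch, not the audit script itself: the function names (`passes`, `score_13`, `pass_rate`) are hypothetical, the thresholds are the ones stated in this report (SGR at the original 0.30 calibration), and the sample rows are copied from the per-site scorecard with "K" values expanded to plain numbers.

```python
from typing import Optional

# Thresholds as stated in this report; SGR uses the original 0.30 calibration.
# Each entry maps a metric to (comparison, threshold).
THRESHOLDS = {
    "RR": (">=", 0.85),
    "SGR": (">=", 0.30),
    "RTC": ("<=", 0.50),
    "RPS": (">=", 1000.0),
    "LMR": ("<=", 30.0),
}


def passes(metric: str, value: Optional[float]) -> bool:
    """True if the value clears the metric's threshold; None (N/A) counts as fail."""
    if value is None:
        return False
    op, threshold = THRESHOLDS[metric]
    return value >= threshold if op == ">=" else value <= threshold


def score_13(binary_signals: list, metrics: dict) -> int:
    """Passing binary signals (S1..S8) plus passing quantitative metrics."""
    return sum(bool(s) for s in binary_signals) + sum(
        passes(name, metrics.get(name)) for name in THRESHOLDS
    )


def pass_rate(metric: str, values: list) -> float:
    """Share of sites clearing the threshold, with N/A treated as fail."""
    return sum(passes(metric, v) for v in values) / len(values)


# Rows copied from the scorecard (hypothetical variable names).
top10lists = score_13(
    [True] * 8,
    {"RR": 1.000, "SGR": 0.4847, "RTC": 0.086, "RPS": 351_800, "LMR": 0.7},
)
elevenlabs = score_13(
    [True, True, True, False, True, True, False, True],
    {"RR": 0.902, "SGR": 0.0, "RTC": 7.208, "RPS": 8_300, "LMR": 1.0},
)
data_gov = score_13(
    [True, False, False, True, False, True, False, False],
    {"RR": 0.884, "SGR": 0.9762, "RTC": 0.080, "RPS": 41_700, "LMR": 0.0},
)
print(top10lists, elevenlabs, data_gov)  # 13 9 8
```

The three sample rows reproduce the scorecard totals for those sites. Counting `None` as a fail mirrors the "null values treated as fail" rule, so a site with missing measurements can never be helped by them.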
SGR is the differentiator: at an 11.1% cohort pass rate, Source Grounding Ratio is the hardest metric to clear and the strongest competitive moat in the 13-signal rubric. Sites with high external brand authority (data.gov 0.9762, nasa.gov 0.5000) pass SGR despite lower binary signal counts, confirming that SGR measures genuine AI-legible content quality, not just infrastructure signals.
Per-Site Scorecard — All 98 Sites — Sorted by 13-Signal Score
| # | URL | Outcome | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | RR | SGR ★ | RTC | RPS | LMR (days) | 13★ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 001 | top10lists.us | audited | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1.000 | 0.4847 | 0.086 | 351.8K | 0.7 | 13 |
| 002 | elevenlabs.io | audited | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | 0.902 | 0.0000 | 7.208 | 8.3K | 1.0 | 9 |
| 003 | data.gov | audited | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | 0.884 | 0.9762 | 0.080 | 41.7K | 0.0 | 8 |
| 004 | edx.org | audited | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | 0.420 | N/A | 72.022 | 67.4K | 0.1 | 7 |
| 005 | supabase.com | audited | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.941 | 0.0000 | 0.355 | 17.0K | N/A | 7 |
| 006 | apartmentlist.com | audited | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | 0.941 | 0.0000 | 1.681 | 857.6K | 0.1 | 7 |
| 007 | nasa.gov | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.892 | 0.5000 | 0.457 | 515.7K | 1007.9 | 7 |
| 008 | walmart.com | audited | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | 0.755 | 0.0000 | 6.550 | N/A | N/A | 6 |
| 009 | shopify.com | audited | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | 0.518 | 0.0000 | 2.289 | 1.31M | 451.7 | 6 |
| 010 | stripe.com | audited | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.893 | 0.0000 | 1.587 | 2.0K | N/A | 6 |
| 011 | yardi.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.965 | N/A | 0.256 | 255 | 14.5 | 6 |
| 012 | bankofamerica.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.992 | N/A | 0.099 | 131 | 12.7 | 6 |
| 013 | huggingface.co | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.925 | 0.0000 | 0.186 | 37.8K | 2.7 | 6 |
| 014 | cloudflare.com | audited | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | 0.711 | 0.0000 | 6.081 | 29.1K | 70.6 | 5 |
| 015 | appfolio.com | audited | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.818 | 0.0000 | 0.822 | 2.1K | 636.7 | 5 |
| 016 | coursera.org | audited | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.811 | 0.0000 | 27.620 | 46.6K | N/A | 5 |
| 017 | espn.com | audited | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | 0.789 | 0.0000 | 0.160 | N/A | N/A | 5 |
| 018 | homelight.com | audited | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.695 | 0.0000 | 7.187 | 2.0K | 91.0 | 5 |
| 019 | harvard.edu | audited | — | — | — | — | — | — | — | — | 0.584 | 0.0000 | 0.351 | 523 | 257.9 | 5 |
| 020 | netflix.com | audited | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | 0.868 | 0.0000 | 186.255 | N/A | N/A | 5 |
| 021 | bbb.org | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.567 | N/A | 8.199 | 1.56M | 21.7 | 5 |
| 022 | buildium.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.914 | 0.0000 | 0.878 | 2.5K | 1793.3 | 5 |
| 023 | who.int | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.881 | 0.0000 | 0.408 | 8 | N/A | 5 |
| 024 | khanacademy.org | audited | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | 1.000 | N/A | 0.816 | 72.9K | 106.1 | 5 |
| 025 | statefarm.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 1.000 | 0.0000 | 0.938 | 66.0K | 1045.7 | 5 |
| 026 | wsj.com | audited | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | 1.000 | N/A | 0.420 | 1.5K | 0.0 | 5 |
| 027 | github.com | audited | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | 0.696 | 0.0000 | 1.664 | N/A | N/A | 4 |
| 028 | lendingtree.com | audited | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | 0.774 | 0.0625 | 1.017 | N/A | 37.0 | 4 |
| 029 | ieee.org | audited | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | N/A | N/A | N/A | N/A | N/A | 4 |
| 030 | notion.so | audited | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | 0.690 | 0.0000 | 10.222 | N/A | N/A | 4 |
| 031 | apple.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.265 | N/A | 3.323 | 23.5K | N/A | 4 |
| 032 | fastexpert.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.806 | 0.0000 | 2.732 | 12.0K | 804.9 | 4 |
| 033 | realpage.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.192 | N/A | 5.091 | 3.0K | 1045.1 | 4 |
| 034 | progressive.com | audited | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | 0.589 | 0.0000 | 3.406 | 5.3K | 581.7 | 4 |
| 035 | redfin.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 1.000 | N/A | 1.209 | N/A | N/A | 4 |
| 036 | bankrate.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.546 | 0.0000 | 0.525 | 29.0K | 344.8 | 4 |
| 037 | nih.gov | audited | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | 0.531 | N/A | 1.034 | 6.0K | 390.1 | 4 |
| 038 | salesforce.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 039 | bbc.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.899 | 0.0000 | 0.422 | 109 | N/A | 4 |
| 040 | arxiv.org | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.964 | 0.0000 | 0.051 | N/A | N/A | 4 |
| 041 | cdc.gov | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.429 | 0.0833 | 0.198 | 38.7K | 644.2 | 4 |
| 042 | census.gov | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.855 | 0.0000 | 5.224 | 4.2K | 4228.7 | 4 |
| 043 | stanford.edu | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.868 | 0.0000 | 0.399 | N/A | N/A | 4 |
| 044 | cnn.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.734 | 0.0000 | 5.658 | 1.3K | 1.0 | 4 |
| 045 | chase.com | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.060 | 0.5000 | 50.349 | 21.2K | N/A | 4 |
| 046 | fidelity.com | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.406 | N/A | 123.086 | 1.8K | 0.7 | 4 |
| 047 | reddit.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 3 |
| 048 | wellsfargo.com | audited | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.843 | 0.0000 | 0.822 | N/A | N/A | 3 |
| 049 | imdb.com | audited | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | N/A | N/A | N/A | N/A | N/A | 3 |
| 050 | x.com | audited | ✗ | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | 0.490 | N/A | 74.213 | N/A | N/A | 3 |
| 051 | ratemyagent.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.582 | 0.0000 | 1.207 | 47.08M | N/A | 3 |
| 052 | wikidata.org | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.539 | 0.9500 | 0.723 | N/A | N/A | 3 |
| 053 | mozilla.org | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.447 | N/A | 0.467 | N/A | N/A | 3 |
| 054 | acm.org | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 055 | stackoverflow.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.881 | 0.0000 | 1.239 | N/A | N/A | 3 |
| 056 | jstor.org | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 1.000 | 0.0000 | 1.198 | N/A | N/A | 3 |
| 057 | mit.edu | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.795 | 0.0000 | 0.472 | N/A | N/A | 3 |
| 058 | microsoft.com | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 1.000 | N/A | 1.127 | N/A | N/A | 3 |
| 059 | forbes.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 0.518 | 0.0000 | 1.196 | 747 | 0.8 | 3 |
| 060 | perplexity.ai | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 061 | realtor.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 062 | npr.org | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.988 | 0.0000 | 0.605 | 110.3K | 8389.2 | 3 |
| 063 | archive.org | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | 1.000 | N/A | 0.287 | N/A | N/A | 3 |
| 064 | wikipedia.org | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.149 | N/A | 1.530 | N/A | N/A | 2 |
| 065 | pitchbook.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 2 |
| 066 | hud.gov | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.394 | N/A | 15.678 | N/A | N/A | 2 |
| 067 | turbotenant.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 2 |
| 068 | anthropic.com | audited | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | 0.388 | 0.0000 | 3.267 | 52 | 187.7 | 2 |
| 069 | google.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 070 | substack.com | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | 1.000 | N/A | 8.220 | N/A | 37.8 | 2 |
| 071 | youtube.com | audited | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | 1.000 | N/A | 403.929 | 726 | N/A | 2 |
| 072 | openai.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 073 | wikimedia.org | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | 1.000 | 0.0000 | 0.434 | N/A | N/A | 2 |
| 074 | nytimes.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 075 | airbnb.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 1 |
| 076 | webmd.com | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.765 | N/A | 3.325 | N/A | N/A | 1 |
| 077 | zillow.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 078 | nerdwallet.com | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.536 | 0.0323 | 3.713 | 862 | 99.3 | 1 |
| 079 | linkedin.com | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | 0.842 | 0.0000 | 2.345 | N/A | N/A | 1 |
| 080 | indeed.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 1 |
| 081 | rentcafe.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 1 |
| 082 | medium.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 083 | theguardian.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 084 | facebook.com | audited | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | 1.000 | N/A | 34.532 | N/A | N/A | 1 |
| 085 | bloomberg.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 086 | w3.org | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 087 | fda.gov | unreachable | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 088 | amazon.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 089 | healthgrades.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 090 | yelp.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 091 | tripadvisor.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 092 | crunchbase.com | audited | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | 1.000 | 0.0000 | 0.285 | N/A | N/A | 4 |
| 093 | martindale.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 094 | avvo.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 095 | sec.gov | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 096 | glassdoor.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 097 | uber.com | error | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 098 | costar.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 099 | apartments.com | blocked | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
| 100 | rent.com | error | — | — | — | — | — | — | — | — | N/A | N/A | N/A | N/A | N/A | 0 |
Methodology — 13-Signal Protocol v3.4
8 Binary Signals
Each signal is true/false. A site earns 1 point per passing signal. Blocked/unreachable sites receive 0 for quantitative metrics but may receive partial credit on sitemap-derivable binary signals.
| ID | Signal | Pass condition |
|---|---|---|
| S1 | Robots AI bots allowed | robots.txt does not Disallow GPTBot, ClaudeBot, or PerplexityBot |
| S2 | llms.txt present | HTTP 200 on /llms.txt |
| S3 | llms-full.txt present | HTTP 200 on /llms-full.txt |
| S4 | Sitemap fresh | Sitemap lastmod median ≤ 30 days across all URLs |
| S5 | JSON-LD structured data | Valid JSON-LD <script> block present on homepage |
| S6 | Pre-rendered HTML | Homepage delivers meaningful HTML to bot UA (not SPA shell or JS-dependent render) |
| S7 | MCP server live | HTTP 200 on /.well-known/mcp.json |
| S8 | AI content feed | HTTP 200 on /ai-content-index.json or /for-ai |
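As a rough illustration of how the robots.txt and path-based signals could be checked, here is a minimal Python sketch. It is not the audit-v3.js implementation. Two assumptions are worth flagging: S1 is interpreted here as "each of the three bots may fetch the site root", which may differ from how the real audit reads Disallow rules, and `get_status` is a hypothetical injected callable (URL in, HTTP status out) so the logic can be exercised without network access.

```python
from typing import Callable
from urllib.robotparser import RobotFileParser

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Paths taken from the signal table. A signal passes if any of its paths
# returns HTTP 200 (S8 lists two alternative paths).
PATH_SIGNALS = {
    "S2": ("/llms.txt",),
    "S3": ("/llms-full.txt",),
    "S7": ("/.well-known/mcp.json",),
    "S8": ("/ai-content-index.json", "/for-ai"),
}


def s1_ai_bots_allowed(robots_txt: str, site_root: str = "https://example.com/") -> bool:
    """Assumed reading of S1: every listed AI bot may fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(bot, site_root) for bot in AI_BOTS)


def path_signals(base_url: str, get_status: Callable[[str], int]) -> dict:
    """Evaluate S2, S3, S7 and S8 using an injected status-lookup callable."""
    base = base_url.rstrip("/")
    return {
        signal: any(get_status(base + path) == 200 for path in paths)
        for signal, paths in PATH_SIGNALS.items()
    }


# Usage with a stubbed status lookup instead of live HTTP requests.
fake_statuses = {
    "https://example.com/llms.txt": 200,
    "https://example.com/for-ai": 200,
}
print(path_signals("https://example.com/", lambda url: fake_statuses.get(url, 404)))
print(s1_ai_bots_allowed("User-agent: GPTBot\nDisallow: /\n"))
```

In a live run, `get_status` could wrap an HTTP client of your choice; keeping it injectable makes the pass/fail logic easy to test on its own.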
5 Quantitative Metrics (binarized at threshold for 13-signal score)
| Metric | Name | Definition and threshold |
|---|---|---|
| RR | Relevance Ratio | Clean text chars / total response chars. Threshold: ≥ 0.85 |
| SGR | Source Grounding Ratio ★ | Verifiable claims / total claims extracted via Sonnet LLM (claim-extraction-v1). Threshold: ≥ 0.30 (original calibration; current threshold >0.00, see the Threshold recalibration note below). The moat signal. |
| RTC | Retrieval Token Cost | Response tokens / useful chars × 4. Lower = more efficient retrieval. Threshold: ≤ 0.50 |
| RPS | Sitemap Throughput | URLs indexable per wall-clock second via parallel sitemap tree crawl. Threshold: ≥ 1,000 |
| LMR | Last-Modified Recency | Median days since sitemap lastmod across all indexed URLs. Lower = fresher. Threshold: ≤ 30 days |
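To make the LMR definition concrete, the sketch below computes the median age in days of the lastmod entries in a single sitemap document. It is a simplified illustration under stated assumptions: the function name `lmr_days` and the sample XML are hypothetical, timestamps without a timezone are treated as UTC, and the real audit crawls a whole sitemap tree rather than one file.

```python
import statistics
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def lmr_days(sitemap_xml: str, now: datetime) -> Optional[float]:
    """Median days since <lastmod> across all entries; None if there are none."""
    root = ET.fromstring(sitemap_xml)
    ages = []
    for node in root.iter(SITEMAP_NS + "lastmod"):
        text = (node.text or "").strip()
        if not text:
            continue
        # Accept both date-only values and full timestamps with a trailing Z.
        stamp = datetime.fromisoformat(text.replace("Z", "+00:00"))
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        ages.append((now - stamp).total_seconds() / 86400)
    return statistics.median(ages) if ages else None


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc><lastmod>2026-04-28</lastmod></url>
  <url><loc>https://example.com/b</loc><lastmod>2026-04-19T00:00:00Z</lastmod></url>
  <url><loc>https://example.com/c</loc><lastmod>2026-01-19</lastmod></url>
</urlset>"""

run_time = datetime(2026, 4, 29, tzinfo=timezone.utc)
median_age = lmr_days(SAMPLE, run_time)
print(median_age, median_age <= 30)  # 10.0 True
```

Because the median is used, one very old URL (100 days in the sample) does not by itself push a sitemap over the 30-day threshold. A sitemap with no lastmod values yields `None`, which corresponds to the N/A cells in the scorecard.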
Threshold recalibration — SGR (2026-05-05)
The SGR pass threshold has since been recalibrated from ≥ 0.30 to >0.00 for forward-looking audits. The original 0.30 threshold proved unachievable for ∼94% of the cohort — max observed SGR across the measured 100-site survey (excluding Top10Lists.us) was 0.0620, so 0.30 produced a near-universal fail that masked meaningful differences between sites that have attribution discipline at low density and sites that have none. This page reports findings under the original 0.30 calibration — the 11.1% cohort pass rate, the per-site SGR scores in the scorecard, and the “1 in 9 sites clear” framing all reflect the original threshold. Live prospect audits at staging.geolocus.ai/api/audit use the recalibrated >0.00 threshold; a separate methodology page documenting the recalibration is in progress.
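The practical effect of the two calibrations can be seen by applying both to a few SGR values from the scorecard. This is a small illustrative sketch; the function names are hypothetical, and `None` stands for an N/A measurement, which is assumed to fail under either rule.

```python
from typing import Optional


def sgr_pass_original(sgr: Optional[float]) -> bool:
    """Original calibration: SGR >= 0.30."""
    return sgr is not None and sgr >= 0.30


def sgr_pass_recalibrated(sgr: Optional[float]) -> bool:
    """Recalibrated rule for forward-looking audits: SGR > 0.00."""
    return sgr is not None and sgr > 0.00


# SGR values copied from the per-site scorecard.
samples = {
    "data.gov": 0.9762,
    "lendingtree.com": 0.0625,
    "elevenlabs.io": 0.0,
    "edx.org": None,
}

for site, sgr in samples.items():
    print(site, sgr_pass_original(sgr), sgr_pass_recalibrated(sgr))
```

Only the low-but-nonzero case (lendingtree.com at 0.0625) changes outcome between the two rules, which is the distinction the recalibration is meant to surface.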
Reproduce Any Cell
The v3.4 13-signal audit is fully reproducible two ways:
- Free in-page audit. Fill the form above and we'll audit your site live in ∼60 seconds against the same 13 signals. Output renders inline, byte-for-byte matching the per-site scorecard schema above.
- Self-host. Download the reproduction runbook to get the script, the 100-site cohort definition, the threshold definitions, and the full output schema. Run it locally with Node 20.x — no paid API keys required for the 8 binary signals or RR/RTC/RPS/LMR.
The endpoint behind both paths:
curl "https://staging.geolocus.ai/api/audit?url=https%3A%2F%2Fyour-site.com"
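For scripted use, the same request can be built in Python with the standard library. This is a hedged sketch: the helper names are hypothetical, and because the response schema is not documented on this page, the fetch helper returns the raw body text rather than assuming any fields.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

AUDIT_ENDPOINT = "https://staging.geolocus.ai/api/audit"


def build_audit_url(site_url: str) -> str:
    """Percent-encode the target site into the endpoint's url query parameter."""
    return f"{AUDIT_ENDPOINT}?{urlencode({'url': site_url})}"


def fetch_audit(site_url: str, timeout: float = 120.0) -> str:
    """Call the endpoint and return the raw response body (schema not assumed)."""
    with urlopen(build_audit_url(site_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(build_audit_url("https://your-site.com"))
# https://staging.geolocus.ai/api/audit?url=https%3A%2F%2Fyour-site.com
```

The printed URL matches the one in the curl command. `fetch_audit` is not called here, since it needs network access; its timeout is set generously because a live audit is described as taking about 60 seconds.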
Source: audit-v3.js — v3.4 deploy on Cloudflare Workers via staging.geolocus.ai. Run: 2026-04-29T16:23:48Z • 98 of 100 sites measured. top10lists.us SGR updated to 0.4847 from live re-run (post-PR-308/309 self-citation improvements; original batch measurement was 0.3577).