
AI Extractability Funnel

Visualise exactly where your content is lost between the raw DOM and an AI citation. The funnel has three stages: Total DOM Tokens → Extractable Content → Contextualized Content.


How the funnel works

Stage 1 — Total DOM Tokens (N₁): The raw number of tokens visible to an AI crawler after stripping scripts. This is your page's "token weight."
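As a rough illustration of Stage 1, the sketch below counts the tokens left in a page once script content is dropped. It is not the tool's actual tokenizer: whitespace splitting stands in for a real LLM tokenizer, skipping `<style>` as well as `<script>` is an added assumption, and the names `VisibleTextParser` and `count_dom_tokens` are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}  # <style> is an assumption; the text only names scripts

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_dom_tokens(html: str) -> int:
    """Approximate N1 as whitespace-delimited tokens after stripping scripts."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())
```

A production count would swap the whitespace split for the tokenizer of the target model, since token counts differ between tokenizers.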

Stage 2 — Extractable Content (N₂ = N₁ − Boilerplate): After removing navigation, footer chrome, sidebars, and repeated UI boilerplate, what remains is the actual content an LLM sees. If more than 60% of the Stage 1 tokens are lost here, the stage is flagged as a Critical Drop-off.
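The Stage 2 arithmetic can be sketched as follows. The 60% threshold comes from the description above; the function names are hypothetical, and the boilerplate count is taken as a given input because the text does not say how boilerplate is detected.

```python
CRITICAL_DROP_OFF = 0.60  # from the text: more than 60% of tokens lost


def extractable_tokens(n1: int, boilerplate: int) -> int:
    """N2 = N1 - Boilerplate, clamped at zero."""
    return max(n1 - boilerplate, 0)


def stage_loss(before: int, after: int) -> float:
    """Fraction of tokens lost between two consecutive stages."""
    if before <= 0:
        return 0.0
    return (before - after) / before


def is_critical_drop_off(n1: int, n2: int) -> bool:
    """True when strictly more than 60% of the Stage 1 tokens are lost."""
    return stage_loss(n1, n2) > CRITICAL_DROP_OFF
```

For example, a page with 5,000 DOM tokens of which 3,500 are boilerplate keeps 1,500 extractable tokens, a 70% loss, so it would be flagged. A page that loses exactly 60% would not, since the rule says "more than".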

Stage 3 — Contextualized Content (N₃ = min(N₂, RAG Limit)): Most RAG pipelines truncate retrieved content to a fixed token limit (the RAG Limit, typically ~1,000 tokens). If your extractable content exceeds this limit, the AI sees only a slice of it.
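A minimal sketch of the Stage 3 formula, assuming the ~1,000-token figure above as the default limit. Real pipelines vary, so `RAG_LIMIT` is an adjustable assumption, and both function names are hypothetical.

```python
RAG_LIMIT = 1000  # assumed default, taken from the "typically ~1,000 tokens" figure


def contextualized_tokens(n2: int, rag_limit: int = RAG_LIMIT) -> int:
    """N3 = min(N2, RAG Limit)."""
    return min(n2, rag_limit)


def visible_fraction(n2: int, rag_limit: int = RAG_LIMIT) -> float:
    """Share of the extractable content that survives truncation."""
    if n2 <= 0:
        return 0.0
    return contextualized_tokens(n2, rag_limit) / n2
```

With 4,000 extractable tokens and the default limit, only a quarter of the content reaches the model. A page with 800 extractable tokens passes through untouched.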

Bottleneck Detection: The stage transition with the largest percentage loss is flagged as your primary bottleneck. Token Bloat (Stage 1→2) means your DOM carries too much non-content boilerplate. RAG Truncation (Stage 2→3) means your content is too long for single-chunk retrieval.
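Putting the three stages together, bottleneck detection might look like the sketch below. Two points are assumptions rather than documented behaviour: each loss is measured relative to the preceding stage, and a tie goes to the earlier transition. `Funnel`, `build_funnel`, and `primary_bottleneck` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Funnel(NamedTuple):
    n1: int  # Total DOM Tokens
    n2: int  # Extractable Content
    n3: int  # Contextualized Content


def build_funnel(n1: int, boilerplate: int, rag_limit: int = 1000) -> Funnel:
    """Apply N2 = N1 - Boilerplate and N3 = min(N2, RAG Limit)."""
    n2 = max(n1 - boilerplate, 0)
    n3 = min(n2, rag_limit)
    return Funnel(n1, n2, n3)


def _loss(before: int, after: int) -> float:
    """Fraction lost relative to the preceding stage (an assumption)."""
    return (before - after) / before if before > 0 else 0.0


def primary_bottleneck(funnel: Funnel) -> Optional[str]:
    """Return the label of the transition with the largest percentage loss."""
    losses = {
        "Token Bloat": _loss(funnel.n1, funnel.n2),     # Stage 1 -> 2
        "RAG Truncation": _loss(funnel.n2, funnel.n3),  # Stage 2 -> 3
    }
    # max() keeps the first maximal entry, so ties go to the earlier transition.
    label, worst = max(losses.items(), key=lambda item: item[1])
    return label if worst > 0 else None
```

A page with 5,000 DOM tokens and 3,500 boilerplate tokens loses 70% at Stage 1→2 and about 33% at Stage 2→3, so it is labelled Token Bloat. With only 1,000 boilerplate tokens the losses are 20% and 75%, and the label becomes RAG Truncation.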