Run a paired before/after experiment on any URL. See exactly how HTML structure affects LLM fact extraction — token cost, accuracy, and hallucination rate.
Large language models like GPT-4, Claude, and Gemini consume raw HTML when extracting facts from web pages. Bloated markup — inline styles, deeply nested divs, tracking scripts — inflates token counts and degrades extraction accuracy. The Extraction Lab quantifies this effect by comparing how an LLM processes your original page versus a structurally optimized twin with the same visible content but cleaner HTML.
The experiment is fully deterministic: no LLM API calls are made. Instead, we simulate extraction using token counting, structural analysis, and a fact-density heuristic to estimate accuracy, cost savings, and hallucination risk before and after optimization.
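To make the idea concrete, here is a minimal sketch of what a fact-density heuristic could look like. The function names, regexes, and thresholds below are invented for illustration; they are not the Lab's calibrated values.

```python
import re

def fact_density(text: str) -> float:
    """Hypothetical heuristic: the share of words that look like
    concrete facts (numbers, dates, prices, capitalized entities)."""
    words = text.split()
    if not words:
        return 0.0
    facty = [w for w in words
             if re.search(r"\d", w)            # digits: dates, prices, specs
             or (w[0].isupper() and len(w) > 1)]  # likely named entities
    return len(facty) / len(words)

def estimate_extraction(text: str, tokens: int) -> dict:
    """Map density and page size to illustrative scores.
    The constants here are made up for the sketch."""
    d = fact_density(text)
    # Fewer noise tokens per fact -> higher simulated accuracy,
    # lower simulated hallucination risk.
    accuracy = min(0.99, 0.5 + d)
    hallucination_risk = max(0.01, (1 - d) * min(1.0, tokens / 8000))
    return {"fact_density": round(d, 3),
            "est_accuracy": round(accuracy, 2),
            "est_hallucination_risk": round(hallucination_risk, 2)}
```

A fact-dense sentence like "Price is $4.99 on 2024-01-02" scores far higher than filler prose, which is the signal a deterministic simulation can lean on instead of calling a model.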
We fetch the page, create a structurally optimized twin, and compare extraction metrics side by side: zero LLM API calls, fully deterministic.
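The before/after comparison can be sketched with nothing but the standard library. The cleaning rules below (drop `script`/`style`/`svg` content, strip inline `style`, `on*`, and `data-*` attributes) and the ~4-characters-per-token estimate are simplifying assumptions, not the Lab's actual pipeline:

```python
from html.parser import HTMLParser

# Tags whose contents an LLM never needs for fact extraction (assumed rule set).
NOISE_TAGS = {"script", "style", "noscript", "svg", "iframe"}

class StructuralCleaner(HTMLParser):
    """Re-emits HTML, dropping noise tags and presentational attributes."""
    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0  # >0 while inside a noise tag

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = [(k, v) for k, v in attrs
                if not (k == "style" or k.startswith("on") or k.startswith("data-"))]
        attr_str = "".join(f' {k}="{v}"' for k, v in kept)
        self.out.append(f"<{tag}{attr_str}>")

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

def approx_tokens(text: str) -> int:
    # Rough GPT-style estimate: ~4 characters per token.
    return max(1, len(text) // 4)

def compare(html: str) -> dict:
    cleaner = StructuralCleaner()
    cleaner.feed(html)
    cleaned = "".join(cleaner.out)
    before, after = approx_tokens(html), approx_tokens(cleaned)
    return {"tokens_before": before, "tokens_after": after,
            "savings_pct": round(100 * (before - after) / before, 1)}
```

Run on a div with inline styles, an onclick handler, and a tracking script, the twin keeps the visible text ("Price: $4.99") while the token count drops, which is exactly the side-by-side the Lab reports.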
| Field | Value | Match |
|---|---|---|
The Golden Semantic String is what an LLM actually reads: Title + Meta + H1 + H2s + first ~600 words of body content + JSON-LD entities. Everything else is noise.
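A sketch of how the fields named above could be assembled, again with only the standard library. The class name, the 600-word cap parameter, and the decision to include raw JSON-LD blocks verbatim are assumptions for illustration:

```python
from html.parser import HTMLParser

class GoldenStringBuilder(HTMLParser):
    """Collects title, meta description, H1, H2s, body words,
    and JSON-LD blocks (hypothetical helper, not the Lab's code)."""
    def __init__(self):
        super().__init__()
        self.title, self.meta, self.h1 = "", "", ""
        self.h2s, self.body_words, self.jsonld = [], [], []
        self.mode = "body"

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name") == "description":
            self.meta = a.get("content", "")
        elif tag == "script":
            self.mode = "jsonld" if a.get("type") == "application/ld+json" else "skip"
        elif tag == "style":
            self.mode = "skip"
        elif tag in ("title", "h1", "h2"):
            self.mode = tag

    def handle_endtag(self, tag):
        if tag in ("title", "h1", "h2", "script", "style"):
            self.mode = "body"

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.mode == "title":
            self.title += text
        elif self.mode == "h1":
            self.h1 += text
        elif self.mode == "h2":
            self.h2s.append(text)
        elif self.mode == "jsonld":
            self.jsonld.append(text)
        elif self.mode == "body":
            self.body_words.extend(text.split())
        # "skip" mode drops analytics scripts and CSS

    def golden_string(self, word_cap: int = 600) -> str:
        parts = [self.title, self.meta, self.h1, *self.h2s,
                 " ".join(self.body_words[:word_cap]), *self.jsonld]
        return "\n".join(p for p in parts if p)
```

Everything the builder discards (tracking scripts, styles, markup) is the "noise" the section refers to; what remains is the compact string the model actually has to reason over.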
Run a comprehensive crawl with template-level analysis, historical tracking, and automated remediation snippets.
Start Deep Audit →

Our whitepaper documents the full methodology, case studies, and statistical evidence that HTML structure causally affects LLM extraction accuracy.
Read the Whitepaper →