PROOF scores your page across nine dimensions, totalling 100 points. Each section below explains what we check, why it matters, and how to fix it. Use this as your reference when working through the recommendations PROOF generates.
01
Technical SEO
The plumbing of your page. Search engines and AI crawlers need clean, well-structured HTML signals to understand what the page is and who it serves.
Browser title length
What: The title tag in your page head, shown in browser tabs and search results.
Why: Google truncates titles by pixel width, which in practice works out to roughly 60 characters. Too short and you miss ranking opportunities. Too long and your full pitch never shows.
Fix: Aim for 50-60 characters. Lead with your keyword, end with brand.
Meta description length
What: The meta description tag, often used as the snippet under your link in search results.
Why: A compelling 150-160 character description can lift click-through rate. Google has not confirmed click-through rate as a direct ranking signal and sometimes rewrites the snippet, but more clicks mean more traffic either way.
Fix: Write 150-160 characters covering the value proposition, the keyword, and a verb that drives action.
Single H1 tag
What: The main heading at the top of the page.
Why: Multiple H1s are valid HTML, but they can blur the page's primary topic for both search engines and screen reader users.
Fix: Use exactly one H1 per page. Demote any others to H2.
URL length and structure
What: The full web address of the page.
Why: Short, descriptive URLs are easier to share, look more trustworthy, and rank slightly better.
Fix: Keep under 75 characters. Use hyphens between words. Include your keyword.
Mobile viewport tag
What: The viewport meta tag that tells mobile browsers how to scale your page.
Why: Without it, mobile users see a tiny desktop view. Google indexes the mobile version of your page first, so poor mobile usability can hurt rankings.
Fix: Add this in your head: <meta name="viewport" content="width=device-width, initial-scale=1">.
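The checks in this section can be approximated with a short script. The sketch below is not PROOF's own implementation; it is a minimal illustration that uses Python's standard html.parser, and the names HeadSignals and check_technical are hypothetical. The thresholds are the ones recommended above.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the title text, meta description, H1 count, and viewport tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1_count = 0
        self.has_viewport = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name == "description":
                self.description = attrs.get("content") or ""
            elif name == "viewport":
                self.has_viewport = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_technical(html: str) -> dict:
    """Apply the length and presence rules from this section (hypothetical helper)."""
    parser = HeadSignals()
    parser.feed(html)
    return {
        "title_ok": 50 <= len(parser.title.strip()) <= 60,
        "description_ok": 150 <= len(parser.description.strip()) <= 160,
        "single_h1": parser.h1_count == 1,
        "viewport": parser.has_viewport,
    }
```

Feed it the raw HTML of a page and each key reports whether the corresponding check would pass. URL length is left out because it only needs len(url) < 75.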
02
On-page SEO
How well the page signals what it is about for a specific search intent. Where your target keyword shows up, and how naturally.
Keyword in browser title
What: Whether your target keyword appears in the page title tag.
Why: The title is one of the strongest on-page ranking signals. If your keyword is not here, you are fighting uphill.
Fix: Add the keyword near the start of your title.
Keyword in H1
What: Whether your target keyword appears in the main page heading.
Why: The H1 reinforces topical relevance to both users and crawlers when it matches the title intent.
Fix: Include the keyword in your H1 phrasing, naturally.
Keyword in meta description
What: Whether your keyword appears in the meta description tag.
Why: When users see their search term highlighted in the snippet, click-through rates tend to rise.
Fix: Mention the keyword once in your description, naturally, near the start.
Keyword in URL slug
What: Whether the keyword appears in the page URL.
Why: Search engines use URL words as a relevance signal. Users also trust descriptive URLs more.
Fix: Use a slug like /your-keyword-phrase rather than /post-12345.
Keyword in opening 100 words
What: Whether the keyword appears in the first 100 words.
Why: Search engines weight the opening of the page heavily. LLMs disproportionately quote opening paragraphs.
Fix: Mention your keyword naturally in the intro paragraph.
Keyword density
What: How often your keyword appears relative to total word count.
Why: Too low and the page looks irrelevant. Too high and it looks like keyword stuffing.
Fix: Aim for 0.5 to 3 percent. For 600 words that is 3 to 18 mentions.
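To make the arithmetic concrete, here is a minimal sketch of a density calculation. It assumes density means keyword mentions divided by total words, which is one common reading of the definition above and not necessarily the exact formula PROOF uses. The function names are hypothetical.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return keyword mentions as a percentage of total words (assumed formula)."""
    lowered = text.lower()
    words = re.findall(r"[\w'-]+", lowered)
    if not words:
        return 0.0
    # Match the keyword as a whole phrase, not as part of a longer word.
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    mentions = len(re.findall(pattern, lowered))
    return 100.0 * mentions / len(words)


def target_mentions(word_count: int, low: float = 0.5, high: float = 3.0) -> tuple[int, int]:
    """Convert the 0.5 to 3 percent range into a mention count for a given length."""
    return round(word_count * low / 100), round(word_count * high / 100)


print(target_mentions(600))  # (3, 18), matching the example above
```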
03
Topical coverage
Whether your page demonstrates topical authority by using multiple related phrases across different page zones, not just repeating one keyword.
Phrases distributed across page zones
What: How many distinct phrases related to your target keyword appear across multiple page zones (title, H1, H2, meta description, body).
Why: Pages that score highly with both Google and AI engines use 3 to 5 related phrases that appear in multiple zones, not piled in body alone. Distribution across zones signals genuine topical depth, not keyword stuffing.
Fix: Identify 5 to 7 variations of your target keyword. Make sure each variation appears in at least 2 places, for example title plus body, or H2 plus meta description.
Total breadth of related phrases
What: Total count of distinct related phrases that appear at least twice anywhere in your content.
Why: AI engines look for vocabulary breadth around a topic. A page mentioning only your exact keyword 20 times looks thin compared to one that uses 7 related phrases naturally.
Fix: Brainstorm related phrases: synonyms, sub-topics, common questions about the keyword, related products or services. Weave 5 to 7 of these naturally into your content.
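One way to audit zone distribution yourself is sketched below. It assumes you have already extracted the text of each zone into a dictionary and that simple case-insensitive substring matching is good enough. Both are simplifications, and the function names are hypothetical.

```python
def zone_coverage(zones: dict[str, str], phrases: list[str]) -> dict[str, list[str]]:
    """Map each phrase to the zones whose text contains it (case-insensitive)."""
    return {
        phrase: [name for name, text in zones.items() if phrase.lower() in text.lower()]
        for phrase in phrases
    }


def well_distributed(coverage: dict[str, list[str]], min_zones: int = 2) -> list[str]:
    """Return the phrases that appear in at least min_zones zones."""
    return [phrase for phrase, found in coverage.items() if len(found) >= min_zones]


# Illustrative zone text, not taken from a real page.
zones = {
    "title": "Answer engine optimization guide",
    "h1": "AEO guide",
    "body": "Answer engine optimization helps AI search visibility.",
}
coverage = zone_coverage(zones, ["answer engine optimization", "ai search"])
print(well_distributed(coverage))  # ['answer engine optimization']
```

Phrases missing from the result are candidates to work into a second zone.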
04
Citation worthiness
How likely your page is to be cited by ChatGPT, Claude, Gemini, and Perplexity. AI engines cite complex, nuanced content 77 percent of the time versus only 23 percent for simple content.
Multi-word question depth
What: How many headings on your page ask substantive, multi-word questions (5 or more words).
Why: AI engines cite content that answers specific, nuanced questions. A heading like "What is SEO?" is too generic. "What is the difference between SEO and AEO?" is the kind of question AI engines quote.
Fix: Rewrite at least 3 H2 headings as multi-word questions your readers actually ask. Aim for 5 or more words per question.
Sentence depth and nuance
What: The percentage of your sentences that are long (15+ words) or contain multi-clause structures (a comma followed by because, while, although, etc).
Why: Short, simple sentences read as surface-level content. AI engines prefer pages that demonstrate analytical depth through layered sentence structures, signalling genuine expertise.
Fix: Mix sentence lengths intentionally. Keep some short and punchy. Use longer multi-clause sentences when you need to explain causation, comparison, or qualification.
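The definition above translates into a rough heuristic like the following sketch. The connective list beyond because, while, and although is an assumption filling in the "etc", the sentence splitter is deliberately naive, and the function names are hypothetical.

```python
import re

# "because", "while", "although" come from the definition above;
# the remaining connectives are assumed additions.
CONNECTIVES = ("because", "while", "although", "whereas", "since", "unless")


def split_sentences(text: str) -> list[str]:
    """Naive split on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_deep(sentence: str, min_words: int = 15) -> bool:
    """True if the sentence is long or has a comma followed by a connective."""
    if len(sentence.split()) >= min_words:
        return True
    lowered = sentence.lower()
    return any(re.search(r",\s+" + c + r"\b", lowered) for c in CONNECTIVES)


def depth_ratio(text: str) -> float:
    """Percentage of sentences that count as long or multi-clause."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return 100.0 * sum(is_deep(s) for s in sentences) / len(sentences)


print(depth_ratio("Short one. It ranks well, because the answer is direct."))  # 50.0
```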
Concrete data and specifics
What: How many specific numbers, dates, statistics, and proper nouns appear in your content.
Why: AI engines cite content rich in concrete data because there is something specific to extract. Vague content with no numbers or named examples rarely gets cited.
Fix: Add specific numbers (percentages, dollar amounts, counts), year references, and named examples (companies, people, places) wherever you can do so honestly.
05
Content quality
Whether the page is substantial enough to rank and structured cleanly enough for both humans and AI to extract from.
Word count
What: Total words in the body content.
Why: Thin pages rarely rank. Most ranking pages for competitive terms are 800+ words. LLMs need substance to extract anything quotable.
Fix: Expand any page under 600 words. For competitive terms, 1500+ words tends to perform best.
Average sentence length
What: Mean words per sentence.
Why: Long sentences are harder for humans to parse and harder for LLMs to chunk into citation-quality snippets.
Fix: Keep the average under 25 words. Break up anything over 30.
Heading hierarchy
What: How many H2 sub-headings structure your content.
Why: Pages with 2 to 5 H2s scan better, rank better, and give LLMs natural extraction points.
Fix: Add 2-3 H2 headings for any article over 500 words.
Image alt text coverage
What: Percentage of images that have descriptive alt text attributes.
Why: Alt text is required for accessibility, helps images rank in image search, and gives LLMs context.
Fix: Add descriptive alt text to every meaningful image. Avoid generic values like "image1.jpg". Reserve an empty alt attribute for purely decorative images.
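Alt text coverage is easy to measure on your own pages. The sketch below counts img tags with a non-empty alt attribute using Python's standard html.parser. It treats empty alt as missing, so decorative images will lower the score. The names AltAudit and alt_coverage are hypothetical.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Counts <img> tags and how many carry non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.described = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        alt = (dict(attrs).get("alt") or "").strip()
        if alt:
            self.described += 1


def alt_coverage(html: str) -> float:
    """Percentage of images with alt text; 100.0 when the page has no images."""
    audit = AltAudit()
    audit.feed(html)
    if audit.total == 0:
        return 100.0  # nothing is missing alt text
    return 100.0 * audit.described / audit.total
```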
Visual content present
What: Whether the page has at least one image.
Why: Pages with visuals get more dwell time and shares, which improves rankings indirectly.
Fix: Add at least one relevant image, infographic, or diagram.
06
Structured data
Schema.org JSON-LD markup tells search engines and AI crawlers exactly what kind of thing your page is, who created it, and what facts it contains.
Schema.org JSON-LD present
What: Whether the page has any JSON-LD structured data.
Why: JSON-LD is the structured data format Google recommends, and Bing and AI crawlers read it too. Without it, machines have to guess at your page meaning.
Fix: Add a <script type="application/ld+json"> block in your head with at least Article or WebPage schema.
Article or WebPage schema
What: Specific declaration that this is an article or content page.
Why: Tells crawlers how to interpret the rest of your data. It is a sensible base to have in place before adding richer schemas like FAQ.
Fix: Add @type Article (or WebPage) with headline, datePublished, and author fields.
Organization or Author schema
What: Schema that declares who published the content.
Why: Establishes E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness). LLMs cite content from named, credible sources more often.
Fix: Add @type Organization or Person with name, url, and optionally logo and sameAs links.
FAQ or QAPage schema
What: Structured Q&A data attached to the page.
Why: One of the highest-impact AEO moves available. LLMs preferentially extract from FAQ-shaped content.
Fix: Add 3-5 FAQ entries via FAQPage schema, each with a question and a clear answer.
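As an illustration of how these pieces fit together, the sketch below builds one JSON-LD block containing an Article and a FAQPage. The schema.org types and property names are real; the function name, the use of @graph to combine both items, and the sample values are assumptions made for the example.

```python
import json


def faq_article_jsonld(headline, date_published, author_name, faqs):
    """Build a <script> block with Article and FAQPage schema (hypothetical helper).

    faqs is a list of (question, answer) pairs.
    """
    graph = [
        {
            "@type": "Article",
            "headline": headline,
            "datePublished": date_published,
            "author": {"@type": "Person", "name": author_name},
        },
        {
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": question,
                    "acceptedAnswer": {"@type": "Answer", "text": answer},
                }
                for question, answer in faqs
            ],
        },
    ]
    payload = {"@context": "https://schema.org", "@graph": graph}
    # Escape "</" so answer text can never close the script element early.
    body = json.dumps(payload, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values for illustration only.
print(faq_article_jsonld(
    "What is the difference between SEO and AEO?",
    "2024-01-15",
    "Your Name",
    [("What is AEO?", "AEO stands for Answer Engine Optimization.")],
))
```

Paste the output into your page head, then validate it with a structured data testing tool before shipping.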
07
AEO readiness
Answer Engine Optimization. How well the page is shaped to be quoted, cited, or summarised by ChatGPT, Claude, Gemini, and Perplexity.
Page answers a clear question
What: Whether the page has at least one explicit question pattern in headings or body.
Why: LLMs are trained to answer questions. Pages that pre-frame the question get extracted more often.
Fix: Add at least one question, in an H2 or in the opening line.
Direct answer in opening
What: Whether the first 40-80 words give a substantive, quotable answer.
Why: LLMs heavily weight opening content. A wandering intro means LLMs grab a bad snippet, or none at all.
Fix: Lead with a direct answer to the page topic. The first paragraph should be self-contained.
Question-style sub-headings
What: Whether your H2s are framed as actual questions.
Why: H2 questions match the way users phrase queries to LLMs, which can make your page more likely to be retrieved.
Fix: Reframe at least 2 H2s as questions: "What is X?" rather than "About X".
Lists and structured content
What: Whether the page uses bulleted or numbered lists.
Why: Lists are pre-chunked content. LLMs preferentially extract from list items because they are self-contained.
Fix: Convert at least one prose paragraph into a 3-5 item bulleted or numbered list.
Citation or source language
What: Whether the page references named external sources, studies, or reports.
Why: LLMs rate well-cited content as more authoritative. Phrases like "according to" signal credibility.
Fix: Reference at least one external study, report, or named source by name.
Author or attribution signal
What: Whether a named author byline is visible.
Why: Anonymous content gets cited less. LLMs use authorship as a credibility filter (E-E-A-T).
Fix: Add a visible byline plus a <meta name="author"> tag.
Date or freshness signal
What: Whether the page declares when it was published or updated.
Why: LLMs and search engines favour fresh content. Without dates, content looks stale.
Fix: Show the published or updated date in visible content and in the article:published_time meta tag.
Definitions of key terms
What: Whether the page explicitly defines what its key terms mean.
Why: LLMs cite definitions because they are extractable as standalone facts.
Fix: Define your key terms with explicit phrases like "X is defined as" or "X refers to".
08
AI Discoverability
Whether the AI crawlers from major LLM providers can actually access your site. Many sites accidentally block them and have no idea.
robots.txt present
What: A robots.txt file at the root of your domain.
Why: Without it, crawlers assume everything is allowed, and you have no place to state your intent. Best practice is to declare your rules explicitly.
Fix: Add a robots.txt at yoursite.com/robots.txt with explicit User-agent rules.
GPTBot allowed
What: OpenAI's crawler that collects content for GPT model training.
Why: Block this and your content is left out of the data OpenAI's models learn from, which reduces your visibility in ChatGPT. ChatGPT has hundreds of millions of weekly users.
Fix: In robots.txt, do NOT include "Disallow: /" under "User-agent: GPTBot".
ClaudeBot allowed
What: Anthropic's crawler for Claude.
Why: Block this and Anthropic cannot crawl your content for Claude. Claude has a fast-growing enterprise user base.
Fix: In robots.txt, do NOT block "User-agent: ClaudeBot".
PerplexityBot allowed
What: Perplexity's crawler.
Why: Perplexity is a fast-growing AI search engine. Blocking it removes a real growth channel.
Fix: In robots.txt, do NOT block "User-agent: PerplexityBot".
Google-Extended allowed
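What: Google's robots.txt control token for Gemini and AI training. It is not a separate crawler, and content is used by default unless you disallow it.
Why: Disallow this and Google will not use your content for Gemini. Regular Google Search crawling is unaffected.
Fix: In robots.txt, do NOT block "User-agent: Google-Extended".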
CCBot allowed
What: Common Crawl's bot. Many AI training datasets derive from Common Crawl.
Why: Block this and you reduce the chance of appearing in next-generation models.
Fix: In robots.txt, do NOT block "User-agent: CCBot".
Applebot-Extended allowed
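What: Apple's robots.txt control token that governs whether content crawled by Applebot may be used to train Apple's AI models. It is not a separate crawler.
Why: Apple Intelligence is rolling out across iOS and macOS. Long-term visibility signal.
Fix: In robots.txt, do NOT block "User-agent: Applebot-Extended".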
llms.txt declaration
What: A proposed standard file at yoursite.com/llms.txt that gives AI a curated map of your important content.
Why: Proposed in 2024 by Jeremy Howard of Answer.AI, and not yet a formally adopted standard. Helps AI tools navigate your site efficiently.
Fix: Create an llms.txt file with markdown links to your most important pages.
Sitemap declared in robots.txt
What: A Sitemap directive in your robots.txt pointing to your XML sitemap.
Why: Helps both Google and AI crawlers discover all your pages quickly without crawling your entire site.
Fix: Add a line at the bottom of robots.txt: Sitemap: https://yoursite.com/sitemap.xml
Explicit AI crawler rules
What: Whether your robots.txt specifically names AI crawlers (GPTBot, ClaudeBot, etc.) rather than just having no rules.
Why: Explicitly naming the AI crawlers signals intentional permission. To a crawler, an explicit Allow behaves the same as having no rule, but it documents the difference between "you forgot to set rules" and "you actively welcome us".
Fix: Add explicit User-agent blocks for GPTBot, ClaudeBot, PerplexityBot, etc., each followed by Allow: / to show you intentionally permit them.
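You can test a robots.txt against these checks before deploying it. The sketch below uses Python's standard urllib.robotparser to report, for each AI user agent named in this section, whether the root URL is allowed and whether the agent is named explicitly. The function name audit_robots and the report shape are hypothetical, and the standard parser's matching may differ slightly from how each crawler interprets the file.

```python
import re
from urllib.robotparser import RobotFileParser

AI_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "CCBot", "Applebot-Extended",
]


def audit_robots(robots_txt: str, url: str = "https://yoursite.com/") -> dict:
    """Report access, explicit naming, and sitemaps for a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    agents = {}
    for agent in AI_AGENTS:
        # An agent counts as "named" only if it has its own User-agent line.
        named = re.search(
            rf"^\s*user-agent\s*:\s*{re.escape(agent)}\s*$",
            robots_txt,
            re.IGNORECASE | re.MULTILINE,
        )
        agents[agent] = {
            "allowed": parser.can_fetch(agent, url),
            "named": named is not None,
        }
    return {"agents": agents, "sitemaps": parser.site_maps() or []}
```

An agent that shows allowed but not named passes the access checks yet would still miss the explicit-rules check described above.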
llms.txt quality
What: Whether your llms.txt has substantial content beyond just a title: a description, structured links, context.
Why: A sparse llms.txt is barely better than none. AI engines use llms.txt as a curated map of your site, so empty or thin files give them nothing to work with.
Fix: Include a description of your site, markdown links to your most important pages organised by section, and a brief context line for each link.
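A file with that shape can be generated from a simple data structure. The sketch below follows the layout in the llms.txt proposal: an H1 title, a blockquote summary, then H2 sections of markdown links with a short note each. The function name and all sample values are hypothetical.

```python
def build_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render llms.txt content.

    sections maps a section heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder site details for illustration only.
print(build_llms_txt(
    "Your Site",
    "Guides and tools for improving page visibility in search and AI engines.",
    {
        "Guides": [
            ("Getting started", "https://yoursite.com/start", "Overview for new users"),
        ],
    },
))
```

Save the output as llms.txt at the root of your domain.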
09
Authority signals
External and internal signals that the page is trustworthy and well-connected.
Internal link depth
What: How many links the page has to other pages on your own site.
Why: Internal links spread authority across your site and help crawlers discover related content.
Fix: Link to 3-5 related pages on your site. Use descriptive anchor text.
External citations
What: Links to credible sources outside your domain.
Why: External links to authoritative sources signal that you have done your research.
Fix: Cite at least 2 credible external sources with descriptive anchor text.
Author metadata
What: A meta author tag in the page head.
Why: Reinforces authorship for crawlers that do not parse visible bylines.
Fix: Add <meta name="author" content="Your Name"> to the head.
Date metadata
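What: The article:published_time and article:modified_time meta tags.
Why: They give crawlers machine-readable dates, which visible dates do not always provide.
Fix: Add <meta property="article:published_time"> and <meta property="article:modified_time"> tags, each with an ISO 8601 date in its content attribute.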
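The author and date tags can be generated together so the formats stay consistent. The sketch below is a minimal illustration using Python's standard library. The function name authority_meta and the sample author and dates are hypothetical, and ISO 8601 timestamps are assumed for the date values.

```python
from datetime import datetime, timezone
from html import escape


def authority_meta(author: str, published: datetime, modified: datetime) -> str:
    """Return the author and article date meta tags as HTML (hypothetical helper)."""

    def tag(attr: str, key: str, value: str) -> str:
        # Escape the value so quotes in a name cannot break the attribute.
        return f'<meta {attr}="{key}" content="{escape(value, quote=True)}">'

    return "\n".join([
        tag("name", "author", author),
        tag("property", "article:published_time", published.isoformat()),
        tag("property", "article:modified_time", modified.isoformat()),
    ])


# Placeholder author and dates for illustration only.
print(authority_meta(
    "Your Name",
    datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 14, 30, tzinfo=timezone.utc),
))
```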