On-Page SGEO Optimization
Optimize individual pages for both search engine ranking and AI citation — covering title tags, meta descriptions, heading hierarchy, URL structure, internal linking, image optimization, structured data, direct-answer formatting, and knowledge block structure.
$ npx skills add The-AI-Directory-Company/(…) --skill on-page-sgeo
| 1 | |
| 2 | # On-Page SGEO Optimization |
| 3 | |
| 4 | On-page SGEO (Search Generative Engine Optimization) is the practice of optimizing individual page elements so the page both ranks in traditional search results and gets cited by AI platforms (ChatGPT, Perplexity, Gemini, Copilot). Every section below addresses both dimensions together — SEO impact and GEO impact are not separate concerns. |
| 5 | |
| 6 | This is skill 2 of 4 in the SGEO series: technical-sgeo > **on-page-sgeo** > content-sgeo > off-page-sgeo. |
| 7 | |
| 8 | ## Tool discovery |
| 9 | |
| 10 | Before gathering project details, confirm which tools are available. Ask the user directly — do not assume access to any external service. |
| 11 | |
| 12 | **Free tools (no API key required):** |
| 13 | - [ ] WebFetch (fetch any public URL — robots.txt, sitemaps, pages) |
| 14 | - [ ] WebSearch (search engine queries for competitive analysis) |
| 15 | - [ ] Google PageSpeed Insights API (CWV data, no key needed for basic usage) |
| 16 | - [ ] Google Rich Results Test (structured data validation) |
| 17 | - [ ] Playwright MCP or Chrome DevTools MCP (browser automation) |
| 18 | |
| 19 | **Paid or credentialed tools (API key, OAuth, or MCP required):** |
| 20 | - [ ] Google Search Console API (requires OAuth) |
| 21 | - [ ] DataForSEO MCP (SERP data, keyword metrics, backlinks) |
| 22 | - [ ] Ahrefs API (backlink profiles, keyword research) |
| 23 | - [ ] Semrush API (competitive analysis, keyword data) |
| 24 | |
| 25 | **The agent must:** |
| 26 | 1. Present this checklist to the user |
| 27 | 2. Record which tools are available |
| 28 | 3. Pass the inventory to scripts as context |
| 29 | 4. Fall back gracefully — every check has a free-tier path using WebFetch/WebSearch |
| 30 | |
| 31 | ## Before you start |
| 32 | |
| 33 | Gather the following from the user. If anything is missing, ask before proceeding: |
| 34 | |
| 35 | 1. **Target page URL** — The live URL being optimized, or a description of the page being created. |
| 36 | 2. **Primary keyword / topic** — The main query or subject the page should rank for and be cited on. |
| 37 | 3. **Search intent** — Informational, navigational, commercial, or transactional. This determines the optimal page structure. |
| 38 | 4. **Target audience** — Who this page is for (developers, marketers, executives, general consumers, etc.). |
| 39 | 5. **Existing performance data** — Google Search Console impressions, average position, CTR, and top queries if the page already exists. Omit for new pages. |
| 40 | 6. **AI citation priority** — Whether this page should be optimized for AI citation (high, medium, low). High priority pages get extra GEO formatting. |
| 41 | 7. **Related pages on the site** — Pages that should link to/from this one. Needed for internal linking recommendations. |
| 42 | 8. **Competitor pages ranking for the same keyword** — Top 3-5 URLs currently ranking, for gap analysis. |
| 43 | |
| 44 | ## On-page optimization template |
| 45 | |
| 46 | ### 1. Title Tag & Meta Description |
| 47 | |
| 48 | > **Automate:** Run `scripts/extract-meta-tags.py --url <URL>` to extract and validate all meta tags. See `references/meta-optimization.md` for industry-specific examples and OG tag requirements. |
| 49 | |
| 50 | The title tag is one of the strongest on-page ranking signals. The meta description does not directly affect ranking, but it influences CTR from search results and is often extracted by AI engines as a page summary. |
| 51 | |
| 52 | **Title tag rules:** |
| 53 | |
| 54 | - Include the primary keyword within the first 50 characters |
| 55 | - Keep total length under 60 characters (Google truncates at ~580px) |
| 56 | - Front-load the most important words |
| 57 | - Make it specific and outcome-oriented — avoid vague labels |
| 58 | - Do not stuff multiple keywords separated by pipes |
| 59 | |
| 60 | **Meta description rules:** |
| 61 | |
| 62 | - Summarize the page's value proposition in under 155 characters |
| 63 | - Include the primary keyword naturally (Google bolds matching terms) |
| 64 | - Use active voice and a clear benefit statement |
| 65 | - For GEO: write the description as a complete, factual sentence — AI engines sometimes pull meta descriptions as summary text |
| 66 | |
| 67 | **Examples:** |
| 68 | |
| 69 | ``` |
| 70 | BAD title: "SEO Guide | Best SEO Tips 2026 | SEO Company" |
| 71 | WHY: Keyword-stuffed, no specificity, no compelling reason to click. |
| 72 | |
| 73 | GOOD title: "Technical SEO Checklist: 15 Fixes That Improve Crawlability" |
| 74 | WHY: Specific topic, concrete number, clear outcome, keyword up front. |
| 75 | ``` |
| 76 | |
| 77 | ``` |
| 78 | BAD meta: "We are the best SEO company. Learn about SEO on our blog." |
| 79 | WHY: Self-promotional, no value proposition, no keyword alignment. |
| 80 | |
| 81 | GOOD meta: "A 15-point technical SEO checklist covering crawl errors, indexation, |
| 82 | Core Web Vitals, and structured data — with fix instructions for each." |
| 83 | WHY: Describes exactly what the reader gets. Complete sentence an AI could cite. |
| 84 | ``` |
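The length and keyword rules above can be checked mechanically. The following is a minimal sketch, not the logic of `scripts/extract-meta-tags.py`; the function names and messages are hypothetical, and character counts are only a proxy for the pixel width at which Google truncates.

```python
TITLE_MAX_CHARS = 60    # title rule: keep total length under 60 characters
KEYWORD_WINDOW = 50     # title rule: primary keyword within the first 50 characters
META_MAX_CHARS = 155    # meta rule: summarize in under 155 characters


def check_title(title: str, keyword: str) -> list[str]:
    """Return rule violations for a title tag (an empty list means it passes)."""
    issues = []
    if len(title) >= TITLE_MAX_CHARS:
        issues.append(f"title is {len(title)} chars; keep it under {TITLE_MAX_CHARS}")
    position = title.lower().find(keyword.lower())
    if position == -1:
        issues.append("primary keyword is missing from the title")
    elif position + len(keyword) > KEYWORD_WINDOW:
        issues.append(f"keyword ends after character {KEYWORD_WINDOW}")
    if title.count("|") >= 2:
        issues.append("looks like pipe-separated keyword stuffing")
    return issues


def check_meta_description(description: str, keyword: str) -> list[str]:
    """Return rule violations for a meta description (an empty list means it passes)."""
    issues = []
    if len(description) >= META_MAX_CHARS:
        issues.append(f"description is {len(description)} chars; keep it under {META_MAX_CHARS}")
    if keyword.lower() not in description.lower():
        issues.append("primary keyword is missing from the description")
    if not description.rstrip().endswith((".", "!", "?")):
        issues.append("description does not end like a complete sentence")
    return issues


print(check_title("SEO Guide | Best SEO Tips 2026 | SEO Company", "technical seo checklist"))
print(check_title("Technical SEO Checklist: 15 Fixes That Improve Crawlability",
                  "technical seo checklist"))
```

The exact-substring keyword match is deliberately strict; a real check would probably also accept close variants of the keyword.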
| 85 | |
| 86 | ### 2. URL Structure |
| 87 | |
| 88 | URLs are a minor ranking factor but a major usability and crawlability signal. AI engines also parse URLs to understand page topic. |
| 89 | |
| 90 | **Rules:** |
| 91 | |
| 92 | - Keep it short: 3-5 words after the domain |
| 93 | - Use hyphens to separate words (not underscores or camelCase) |
| 94 | - Include the primary keyword naturally |
| 95 | - Use lowercase only |
| 96 | - No dates, IDs, session parameters, or query strings |
| 97 | - Match the URL to the page's core topic, not its category hierarchy |
| 98 | |
| 99 | **Examples:** |
| 100 | |
| 101 | ``` |
| 102 | BAD: /blog/2026/03/27/post-id-4827 |
| 103 | WHY: Date will make the URL look stale. ID is meaningless. Deeply nested. |
| 104 | |
| 105 | BAD: /resources/guides/seo/technical-seo-comprehensive-beginners-advanced-guide |
| 106 | WHY: Too long. Redundant words. Dilutes keyword signal. |
| 107 | |
| 108 | GOOD: /technical-seo-checklist |
| 109 | WHY: Short, descriptive, keyword-inclusive, no unnecessary nesting. |
| 110 | ``` |
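As a rough illustration of the URL rules above, the sketch below flags the most common violations. The function name and messages are hypothetical, and the numeric-token check is a heuristic: it will also flag legitimate numbers in a slug, so treat its output as a prompt for review rather than a verdict.

```python
import re
from urllib.parse import urlsplit

MAX_WORDS = 5  # rule: 3-5 words after the domain


def check_url(url: str) -> list[str]:
    """Return rule violations for a URL (an empty list means it passes)."""
    parts = urlsplit(url)
    issues = []
    if parts.query:
        issues.append("contains a query string")
    if parts.path != parts.path.lower():
        issues.append("contains uppercase characters")
    if "_" in parts.path:
        issues.append("uses underscores instead of hyphens")
    # Split the path into words on slashes, hyphens, and underscores.
    tokens = [t for t in re.split(r"[-/_]", parts.path) if t]
    if any(t.isdigit() for t in tokens):
        issues.append("contains a numeric token (possible date or ID)")
    if len(tokens) > MAX_WORDS:
        issues.append(f"has {len(tokens)} words; aim for {MAX_WORDS} or fewer")
    return issues


print(check_url("https://example.com/blog/2026/03/27/post-id-4827"))
print(check_url("https://example.com/technical-seo-checklist"))
```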
| 111 | |
| 112 | ### 3. Heading Hierarchy |
| 113 | |
| 114 | > **Automate:** Run `scripts/analyze-headings.py --url <URL>` to validate the hierarchy and get a GEO heading score. See `references/heading-and-structure.md` for question-format templates and 15 before/after heading rewrites. |
| 115 | |
| 116 | Headings define page structure for both search crawlers and AI parsers. A well-structured heading hierarchy helps search engines understand topic relationships and helps AI engines extract specific sections. |
| 117 | |
| 118 | **Rules:** |
| 119 | |
| 120 | - One H1 per page — it should match the primary topic and closely align with the title tag |
| 121 | - H2s for major sections — each should cover a distinct subtopic |
| 122 | - H3s for subsections within an H2 |
| 123 | - Never skip heading levels (for example, jumping from H1 to H3 without an H2 in between) |
| 124 | - Do not use headings for visual styling — use CSS instead |
| 125 | |
| 126 | **GEO-specific rule: use question-format H2s where natural.** AI engines match user queries to headings. A heading phrased as a question directly matches how users ask AI platforms. |
| 127 | |
| 128 | **Examples:** |
| 129 | |
| 130 | ``` |
| 131 | WEAK heading: "## Technical SEO Overview" |
| 132 | WHY: Generic. Does not match any natural query pattern. |
| 133 | |
| 134 | STRONG heading: "## What Is Technical SEO?" |
| 135 | WHY: Matches "what is technical seo" — a common AI query. |
| 136 | Perplexity, ChatGPT, and Gemini are more likely to pull the |
| 137 | content under this heading when answering that question. |
| 138 | ``` |
| 139 | |
| 140 | ``` |
| 141 | WEAK heading: "## Pricing" |
| 142 | STRONG heading: "## How Much Does [Product] Cost?" |
| 143 | WHY: Matches commercial query format AI users actually type. |
| 144 | ``` |
| 145 | |
| 146 | Not every heading should be a question — use questions for informational and commercial intent sections, and declarative headings for procedural or reference sections. |
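The two structural rules (exactly one H1, no skipped levels) are easy to verify with the standard library. This is a minimal sketch using `html.parser`, not the implementation of `scripts/analyze-headings.py`; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level (1-6) of every heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Matches h1..h6 only; tags such as <hr> or <header> are ignored.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Return hierarchy violations (an empty list means the structure passes)."""
    collector = HeadingCollector()
    collector.feed(html)
    issues = []
    h1_count = collector.levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one H1, found {h1_count}")
    # Going deeper by more than one level at a time is a skipped level.
    for previous, current in zip(collector.levels, collector.levels[1:]):
        if current > previous + 1:
            issues.append(f"skipped level: H{previous} followed by H{current}")
    return issues


print(heading_issues("<h1>Technical SEO</h1><h3>Crawl budget</h3>"))
```

Moving back up the hierarchy (an H3 followed by an H2) is not flagged, since that is how a new major section normally starts.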
| 147 | |
| 148 | ### 4. Direct-Answer-First Pattern (GEO) |
| 149 | |
| 150 | > **Automate:** Run `scripts/check-direct-answer.py --url <URL>` to score the opening content on the 0-8 GEO rubric. See `references/geo-formatting.md` for the scoring criteria and before/after examples. |
| 151 | |
| 152 | This is the most important GEO on-page technique. AI engines tend to synthesize answers from the opening content of a page. If your answer is buried in paragraph 5 after a lengthy introduction, it is unlikely to be cited. |
| 153 | |
| 154 | **The rule: the first 200 words of the page (or of each major section) must directly and completely answer the primary question.** |
| 155 | |
| 156 | Lead with the TL;DR. Then elaborate. |
| 157 | |
| 158 | **Bad — meandering introduction:** |
| 159 | |
| 160 | ``` |
| 161 | Search engine optimization has evolved significantly over the past decade. |
| 162 | With the rise of AI-powered search, marketers face new challenges. In this |
| 163 | comprehensive guide, we will explore the many facets of technical SEO and |
| 164 | help you understand why it matters for your business. Before we dive in, |
| 165 | let's take a step back and consider the history of search engines... |
| 166 | |
| 167 | [The actual answer appears 600 words later] |
| 168 | ``` |
| 169 | |
| 170 | **Good — direct answer first:** |
| 171 | |
| 172 | ``` |
| 173 | Technical SEO is the practice of optimizing a website's infrastructure so |
| 174 | search engines can crawl, index, and render its pages efficiently. It |
| 175 | covers server configuration, site architecture, structured data, page |
| 176 | speed, and mobile usability — everything that is not content or backlinks. |
| 177 | |
| 178 | Why it matters: if search engines cannot access your pages, no amount of |
| 179 | content quality or link building will help you rank. |
| 180 | |
| 181 | [Elaboration, details, and supporting sections follow] |
| 182 | ``` |
| 183 | |
| 184 | The direct-answer version gives an AI engine a self-contained, citable passage in the first two sentences. The meandering version gives it nothing usable. |
| 185 | |
| 186 | **Apply this pattern to every H2 section**, not just the page opening. Each section should open with its key point, then expand. |
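A quick heuristic can catch the most obvious violations of this pattern before a human review. The sketch below is an illustration only and does not reproduce the 0-8 rubric in `references/geo-formatting.md`; the preamble patterns are a hypothetical list chosen for this example.

```python
import re

OPENING_WORDS = 200  # rule: the first 200 words must answer the primary question

# Hypothetical "throat-clearing" patterns, chosen for illustration only.
PREAMBLE_PATTERNS = [
    r"\bin this (\w+ )?(guide|article|post)\b",
    r"\bbefore we dive in\b",
    r"\bin today's\b",
    r"\blet's take a step back\b",
]


def opening(text: str, limit: int = OPENING_WORDS) -> str:
    """Return the first `limit` words of the text, whitespace-normalized."""
    return " ".join(text.split()[:limit])


def opening_report(text: str, keyword: str) -> dict:
    """Check whether the opening names the topic early and avoids preamble."""
    head = opening(text)
    first_sentence = re.split(r"(?<=[.!?])\s+", head, maxsplit=1)[0]
    return {
        "keyword_in_first_sentence": keyword.lower() in first_sentence.lower(),
        "preamble_patterns": [
            p for p in PREAMBLE_PATTERNS if re.search(p, head, re.IGNORECASE)
        ],
    }


print(opening_report(
    "Technical SEO is the practice of optimizing a website's infrastructure "
    "so search engines can crawl, index, and render its pages efficiently.",
    "technical seo",
))
```

A passing report does not prove the opening answers the question completely; it only rules out the meandering-introduction failure shown above.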
| 187 | |
| 188 | ### 5. Self-Contained Knowledge Blocks (GEO) |
| 189 | |
| 190 | > See `references/geo-formatting.md` for the self-contained knowledge block rubric, anaphoric reference checklist, and citation-worthy passage construction formula. |
| 191 | |
| 192 | AI engines do not always cite an entire page. They extract specific passages — typically 50-150 words — and present them as part of a synthesized answer. Each H2 section on your page should function as a standalone knowledge block that makes sense without surrounding context. |
| 193 | |
| 194 | **Rules for knowledge blocks:** |
| 195 | |
| 196 | - Each H2 section should open with 50-150 words of self-contained, factual content (before elaboration) |
| 197 | - Include specific, citable data points within each block — numbers, percentages, named entities |
| 198 | - Avoid anaphoric references ("As mentioned above...", "This approach...", "It...") in the opening sentences of a section |
| 199 | - Front-load the most important fact or definition |
| 200 | |
| 201 | **Examples:** |
| 202 | |
| 203 | ``` |
| 204 | WEAK (not self-contained): |
| 205 | "As we discussed in the previous section, this approach can significantly |
| 206 | improve your results. Many companies have seen positive outcomes." |
| 207 | |
| 208 | WHY: An AI extracting this passage has no idea what "this approach" refers |
| 209 | to or what "positive outcomes" means. Zero citation value. |
| 210 | ``` |
| 211 | |
| 212 | ``` |
| 213 | STRONG (self-contained): |
| 214 | "Internal linking passes PageRank between pages and helps search engines |
| 215 | discover content. Sites that increase internal links to key pages by 40% |
| 216 | see a median ranking improvement of 3.2 positions within 60 days, |
| 217 | according to a 2025 Ahrefs study of 14,000 domains." |
| 218 | |
| 219 | WHY: Complete topic sentence. Specific data. Named source. An AI can |
| 220 | extract this passage and it stands alone as a useful answer. |
| 221 | ``` |
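The knowledge-block rules can be approximated with three simple checks: length, an anaphoric opening, and the presence of at least one number. This is a sketch under stated assumptions: the opener list is hypothetical and incomplete, and "contains a digit" is only a crude stand-in for "includes a citable data point".

```python
import re

MIN_WORDS, MAX_WORDS = 50, 150  # rule: 50-150 words per knowledge block

# Hypothetical, incomplete list of openers that depend on earlier context.
ANAPHORIC_OPENERS = (
    "as mentioned",
    "as we discussed",
    "as we saw",
    "this approach",
    "using the same",
    "it ",
)


def block_report(passage: str) -> dict:
    """Score one opening passage against the knowledge-block rules."""
    text = passage.strip()
    word_count = len(text.split())
    return {
        "word_count": word_count,
        "within_length": MIN_WORDS <= word_count <= MAX_WORDS,
        "anaphoric_opening": text.lower().startswith(ANAPHORIC_OPENERS),
        "has_data_point": bool(re.search(r"\d", text)),
    }


weak = ("As we discussed in the previous section, this approach can significantly "
        "improve your results. Many companies have seen positive outcomes.")
print(block_report(weak))
```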
| 222 | |
| 223 | ### 6. Internal Linking |
| 224 | |
| 225 | > **Automate:** Run `scripts/check-internal-links.py --url <URL>` to count links, evaluate anchor text quality, and detect broken links. See `references/internal-linking.md` for anchor text taxonomy and click depth optimization. |
| 226 | |
| 227 | Internal links distribute link equity, help crawlers discover pages, establish topical relationships, and guide users through your site. They also help AI engines understand your site's knowledge structure. |
| 228 | |
| 229 | **Rules:** |
| 230 | |
| 231 | - Aim for 3-5 internal links per 1,000 words of content |
| 232 | - Use descriptive anchor text that tells the reader (and crawlers) what the destination page covers |
| 233 | - Link to your most important pages from your most authoritative pages |
| 234 | - Link contextually within body content — not just in sidebars or footers |
| 235 | - Ensure every important page is reachable within 3 clicks from the homepage |
| 236 | |
| 237 | **Anchor text examples:** |
| 238 | |
| 239 | ``` |
| 240 | BAD: "For more information, click here." |
| 241 | WHY: "click here" tells crawlers and AI nothing about the destination. |
| 242 | |
| 243 | BAD: "Read our technical SEO guide for a comprehensive overview of |
| 244 | technical SEO best practices for technical SEO." |
| 245 | WHY: Over-optimized. Keyword-stuffed anchor text triggers spam signals. |
| 246 | |
| 247 | GOOD: "Run a technical SEO audit to identify crawl and indexation issues |
| 248 | before optimizing individual pages." |
| 249 | WHY: Descriptive, natural, tells the reader and crawlers what to expect. |
| 250 | ``` |
| 251 | |
| 252 | **GEO consideration:** AI crawlers can follow internal links to build context about your site's expertise. A well-linked cluster of pages on a topic signals topical authority, which makes any individual page more likely to be cited. |
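The link-density and anchor-text rules can be measured from raw HTML. The following is a minimal sketch rather than the behavior of `scripts/check-internal-links.py`: the names are hypothetical, the generic-anchor list is an assumption, and the word count includes any text inside `<script>` or `<style>` tags, which a production check would exclude.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Assumed list of anchors that say nothing about the destination.
GENERIC_ANCHORS = {"click here", "learn more", "read more", "here"}


class LinkCollector(HTMLParser):
    """Count words and collect (url, anchor text) for same-host links."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.host = urlsplit(base_url).netloc
        self.words = 0
        self.links = []
        self._href = None
        self._anchor = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._anchor = []

    def handle_data(self, data):
        self.words += len(data.split())
        if self._href is not None:
            self._anchor.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if urlsplit(self._href).netloc == self.host:
                anchor = " ".join("".join(self._anchor).split())
                self.links.append((self._href, anchor))
            self._href = None


def link_report(html: str, base_url: str) -> dict:
    collector = LinkCollector(base_url)
    collector.feed(html)
    density = len(collector.links) / collector.words * 1000 if collector.words else 0.0
    return {
        "internal_links": len(collector.links),
        "links_per_1000_words": round(density, 1),
        "generic_anchors": [a for _, a in collector.links if a.lower() in GENERIC_ANCHORS],
    }
```

Compare `links_per_1000_words` against the 3-5 target, and rewrite anything listed under `generic_anchors`.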
| 253 | |
| 254 | ### 7. Image Optimization |
| 255 | |
| 256 | > **Automate:** Run `scripts/check-images.py --url <URL>` to audit alt text, file sizes, formats, dimensions, and lazy loading. See `references/image-and-media.md` for the format decision tree and compression targets. |
| 257 | |
| 258 | Images affect page speed, accessibility, search visibility (image search), and CLS. AI engines that process visual content (Google's multimodal search, Bing visual search) also use image metadata. |
| 259 | |
| 260 | **Rules:** |
| 261 | |
| 262 | - **Alt text:** Descriptive, concise (under 125 characters). Include the primary keyword only if the image genuinely relates to it. Do not keyword-stuff alt attributes. |
| 263 | - **File format:** Use WebP or AVIF for photographs, SVG for icons and illustrations. Serve fallback formats for older browsers. |
| 264 | - **File size:** Compress images to under 200KB where possible. Use tools like Squoosh, Sharp, or your build pipeline's image optimization. |
| 265 | - **Lazy loading:** Add `loading="lazy"` to all images below the fold. Do NOT lazy-load the LCP image (usually the hero image). |
| 266 | - **Dimensions:** Always set explicit `width` and `height` attributes (or use CSS `aspect-ratio`) to prevent Cumulative Layout Shift. |
| 267 | - **File names:** Use descriptive, hyphenated file names (`technical-seo-audit-results.webp`, not `IMG_4827.jpg`). |
| 268 | |
| 269 | **Examples:** |
| 270 | |
| 271 | ``` |
| 272 | BAD alt: alt="image" or alt="" (on a meaningful image) or alt="SEO SEO guide SEO tips" |
| 273 | GOOD alt: alt="Screaming Frog crawl report showing 47 pages with redirect chains" |
| 274 | ``` |
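Several of the image rules can be audited from markup alone. The sketch below is illustrative and is not the logic of `scripts/check-images.py`. It assumes the first `<img>` on the page is the LCP image and that every later image sits below the fold, which will not hold for every layout; file size and format checks are left out because they need the image bytes.

```python
import re
from html.parser import HTMLParser

ALT_MAX_CHARS = 125  # rule: alt text under 125 characters
GENERIC_ALTS = {"image", "photo", "picture"}  # assumed list of meaningless alts
GENERIC_NAME = re.compile(r"(img|dsc|image)[_-]?\d+\.", re.IGNORECASE)


class ImageAuditor(HTMLParser):
    """Collect (src, problem) pairs for every <img> tag."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._seen = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or f"<img #{self._seen + 1}>"
        alt = attr.get("alt")
        if alt is None:
            self.issues.append((src, "alt attribute is missing"))
        elif alt.strip().lower() in GENERIC_ALTS:
            self.issues.append((src, "alt text is generic"))
        elif len(alt) >= ALT_MAX_CHARS:
            self.issues.append((src, "alt text is too long"))
        if not (attr.get("width") and attr.get("height")):
            self.issues.append((src, "width/height not set (CLS risk)"))
        lazy = (attr.get("loading") or "").lower() == "lazy"
        if self._seen == 0 and lazy:
            self.issues.append((src, "first image is lazy-loaded (likely LCP)"))
        elif self._seen > 0 and not lazy:
            self.issues.append((src, "missing loading=lazy"))
        if GENERIC_NAME.match(src.rsplit("/", 1)[-1]):
            self.issues.append((src, "file name is not descriptive"))
        self._seen += 1


def audit_images(html: str) -> list[tuple[str, str]]:
    auditor = ImageAuditor()
    auditor.feed(html)
    return auditor.issues
```

An empty `alt=""` is not flagged, because it is valid on decorative images; whether an image is meaningful still needs a human call.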
| 275 | |
| 276 | ### 8. Structured Data Per Page |
| 277 | |
| 278 | > **Automate:** Run `scripts/extract-structured-data.py --url <URL>` to extract JSON-LD, identify schema types, and validate required properties. |
| 279 | |
| 280 | Structured data (JSON-LD) helps search engines understand page content and enables rich results. For GEO, structured data pre-packages information in a machine-readable format that AI engines can directly parse. |
| 281 | |
| 282 | **Page type to schema mapping:** |
| 283 | |
| 284 | ``` |
| 285 | | Page Type | Schema Type | Key Properties | |
| 286 | |-----------------|---------------|---------------------------------------------------| |
| 287 | | Homepage | Organization | name, url, logo, sameAs (social profiles) | |
| 288 | | Product page | Product | name, offers (price, availability), review, aggregateRating | |
| 289 | | Blog post | Article | headline, author, datePublished, dateModified | |
| 290 | | FAQ page | FAQPage | mainEntity (array of Question + acceptedAnswer) | |
| 291 | | How-to guide | HowTo | name, step (array of HowToStep with text + image) | |
| 292 | | Local business | LocalBusiness | address, geo, openingHours, telephone | |
| 293 | | Event page | Event | name, startDate, location, offers | |
| 294 | | Person/bio page | Person | name, jobTitle, worksFor, sameAs | |
| 295 | ``` |
| 296 | |
| 297 | **Implementation rules:** |
| 298 | |
| 299 | - Use JSON-LD format (Google's preferred format), placed in a `<script type="application/ld+json">` tag |
| 300 | - One primary schema type per page — do not overload a single page with unrelated schema types |
| 301 | - Validate every page with Google's Rich Results Test (https://search.google.com/test/rich-results) |
| 302 | - Keep schema data consistent with visible on-page content — mismatches can trigger manual actions |
| 303 | |
| 304 | **GEO-specific note:** FAQPage schema is particularly valuable for AI citation. It pre-structures question/answer pairs in exactly the format AI engines consume. If your page answers common questions, implement FAQPage schema even if the page is not a traditional FAQ — blog posts and product pages can include FAQ sections with matching schema. |
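To make the FAQPage structure concrete, the sketch below builds the JSON-LD from question/answer pairs and wraps it in the script tag described in the implementation rules. The helper names are hypothetical; the property names (`mainEntity`, `Question`, `acceptedAnswer`, `Answer`) follow schema.org. The answers passed in should match the text visible on the page.

```python
import json


def faq_schema(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage JSON-LD dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def to_script_tag(schema: dict) -> str:
    """Serialize the schema into a JSON-LD script tag."""
    # Escape "</" so answer text can never close the script element early.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


schema = faq_schema([
    ("What is technical SEO?",
     "Technical SEO is the practice of optimizing a website's infrastructure "
     "so search engines can crawl, index, and render its pages efficiently."),
])
print(to_script_tag(schema))
```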
| 305 | |
| 306 | ### 9. Freshness Signals |
| 307 | |
| 308 | > **Automate:** Run `scripts/check-freshness.py --url <URL>` to verify visible dates, schema dates, author bylines, and freshness score. |
| 309 | |
| 310 | Both search engines and AI engines weight content recency. A page last updated in 2022 is less likely to be cited for a 2026 query than one updated this month — even if the underlying information has not changed. |
| 311 | |
| 312 | **Rules:** |
| 313 | |
| 314 | - Display a visible "Last updated: [date]" timestamp on content pages — this signals recency to both users and AI crawlers |
| 315 | - Include `dateModified` in your Article schema markup (not just `datePublished`) |
| 316 | - Show author name and credentials on content pages — this feeds E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) signals that both Google and AI engines evaluate |
| 317 | - When updating content, change substantive information — do not just change the date. Search engines can detect superficial updates. |
| 318 | - Review and update high-value pages on a quarterly cycle at minimum |
| 319 | |
| 320 | **Example Article schema with freshness signals:** |
| 321 | |
| 322 | ```json |
| 323 | { |
| 324 | "@context": "https://schema.org", |
| 325 | "@type": "Article", |
| 326 | "headline": "Technical SEO Checklist: 15 Fixes That Improve Crawlability", |
| 327 | "author": { |
| 328 | "@type": "Person", |
| 329 | "name": "Jane Smith", |
| 330 | "jobTitle": "Senior SEO Engineer", |
| 331 | "url": "https://example.com/team/jane-smith" |
| 332 | }, |
| 333 | "datePublished": "2025-06-15", |
| 334 | "dateModified": "2026-03-20", |
| 335 | "publisher": { |
| 336 | "@type": "Organization", |
| 337 | "name": "Example Company", |
| 338 | "logo": { |
| 339 | "@type": "ImageObject", |
| 340 | "url": "https://example.com/logo.png" |
| 341 | } |
| 342 | } |
| 343 | } |
| 344 | ``` |
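The freshness rules can be checked against a parsed Article schema like the one above. This is a minimal sketch, not the scoring used by `scripts/check-freshness.py`; the function name is hypothetical, the quarterly cycle is approximated as 90 days, and `author` is assumed to be a single object rather than a list.

```python
from datetime import date

REVIEW_CYCLE_DAYS = 90  # assumption: "quarterly" approximated as 90 days


def freshness_issues(schema: dict, today: date) -> list[str]:
    """Return freshness problems in an Article schema (empty list means it passes)."""
    issues = []
    author = schema.get("author")
    if not (isinstance(author, dict) and author.get("name")):
        issues.append("author name is missing")
    modified = schema.get("dateModified")
    if not modified:
        issues.append("dateModified is missing")
        return issues
    # Keep only the YYYY-MM-DD part so full timestamps are accepted too.
    modified_on = date.fromisoformat(modified[:10])
    published = schema.get("datePublished")
    if published and modified_on < date.fromisoformat(published[:10]):
        issues.append("dateModified is earlier than datePublished")
    if modified_on > today:
        issues.append("dateModified is in the future")
    age_days = (today - modified_on).days
    if age_days > REVIEW_CYCLE_DAYS:
        issues.append(f"last modified {age_days} days ago; review is due")
    return issues


article = {
    "author": {"@type": "Person", "name": "Jane Smith"},
    "datePublished": "2025-06-15",
    "dateModified": "2026-03-20",
}
print(freshness_issues(article, today=date(2026, 3, 27)))
```

Passing `today` explicitly keeps the check reproducible; a date that passes here still needs to match the "Last updated" date shown on the page.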
| 345 | |
| 346 | ### 10. On-Page SGEO Audit Table |
| 347 | |
| 348 | > **Automate:** Run `scripts/audit-page.py --url <URL> --format md` to generate this table automatically. The orchestrator runs all 7 scripts and aggregates results. |
| 349 | |
| 350 | Use this consolidated table to audit any existing page. Walk through each element, note the current state, and flag items that need work. |
| 351 | |
| 352 | ``` |
| 353 | | Element | SEO Impact | GEO Impact | What to Check | Status | |
| 354 | |----------------------------|------------|------------|------------------------------------------------------------|--------| |
| 355 | | Title tag | High | Medium | Under 60 chars, keyword up front, specific and compelling | [ ] | |
| 356 | | Meta description | Medium | Medium | Under 155 chars, value proposition, complete sentence | [ ] | |
| 357 | | URL structure | Medium | Low | Short, descriptive, keyword-inclusive, no parameters | [ ] | |
| 358 | | H1 tag | High | Medium | One per page, matches primary topic, aligns with title | [ ] | |
| 359 | | Heading hierarchy | Medium | High | Logical H2/H3 structure, question-format where natural | [ ] | |
| 360 | | Direct-answer opening | Low | High | First 200 words directly answer the primary question | [ ] | |
| 361 | | Knowledge blocks | Low | High | Each H2 section is self-contained, 50-150 words, specific | [ ] | |
| 362 | | Internal links | High | Medium | 3-5 per 1000 words, descriptive anchors, contextual | [ ] | |
| 363 | | Image alt text | Medium | Low | Descriptive, concise, keyword where natural | [ ] | |
| 364 | | Image file size & format | Medium | Low | Under 200KB, WebP/AVIF, lazy loading below fold | [ ] | |
| 365 | | Image dimensions | Medium | Low | Explicit width/height set, no CLS | [ ] | |
| 366 | | Structured data | Medium | High | Correct schema type for page, validates in Rich Results | [ ] | |
| 367 | | FAQPage schema | Medium | High | Applied to pages with Q&A content, matches visible content | [ ] | |
| 368 | | dateModified in schema | Low | High | Present and reflects actual last substantive update | [ ] | |
| 369 | | Visible last-updated date | Low | Medium | Displayed on page, matches schema dateModified | [ ] | |
| 370 | | Author & credentials | Medium | High | Name, role, and expertise visible on content pages | [ ] | |
| 371 | ``` |
| 372 | |
| 373 | ## Quality checklist |
| 374 | |
| 375 | Before delivering the optimized page, verify: |
| 376 | |
| 377 | - [ ] Title tag is under 60 characters, includes primary keyword, and is specific |
| 378 | - [ ] Meta description is under 155 characters, reads as a complete sentence, and conveys the page's value |
| 379 | - [ ] URL is short, descriptive, and keyword-inclusive |
| 380 | - [ ] Page has exactly one H1 and a logical H2/H3 hierarchy with question-format headings where appropriate |
| 381 | - [ ] The first 200 words directly answer the primary question without preamble |
| 382 | - [ ] Each H2 section opens with a self-contained knowledge block (50-150 words that stand alone) |
| 383 | - [ ] 3-5 internal links per 1,000 words with descriptive anchor text |
| 384 | - [ ] All images have descriptive alt text, are compressed, use modern formats, and have explicit dimensions |
| 385 | - [ ] Correct structured data type is implemented and validates in Rich Results Test |
| 386 | - [ ] Freshness signals are present: visible date, dateModified in schema, author credentials |
| 387 | |
| 388 | ## Common mistakes to avoid |
| 389 | |
| 390 | 1. **Keyword stuffing in title tags and headings.** Repeating the keyword 3+ times does not help — it triggers spam signals and reads poorly. Use the keyword once naturally and use semantic variations elsewhere. |
| 391 | |
| 392 | 2. **Burying the answer below a long introduction.** AI engines tend to extract from the opening content. If your first 200 words are throat-clearing ("In today's digital landscape..."), your page is unlikely to be cited. Lead with the answer. |
| 393 | |
| 394 | 3. **Generic headings instead of question-format headers.** "Overview" and "Introduction" tell crawlers and AI nothing. "What Is [Topic]?" and "How Does [Topic] Work?" match real queries and increase citation likelihood. |
| 395 | |
| 396 | 4. **Missing structured data for the page type.** A product page without Product schema, a blog post without Article schema, an FAQ without FAQPage schema — each is a missed opportunity for rich results and AI extraction. |
| 397 | |
| 398 | 5. **Internal links with "click here" or "learn more" as anchor text.** These waste the anchor text signal entirely. Describe the destination: "Run a technical SEO audit" is both more useful to users and more informative to crawlers. |
| 399 | |
| 400 | 6. **No freshness signals on content pages.** Pages without visible dates, author information, or dateModified schema appear stale to both search engines and AI engines. A 2024 Semrush study found that pages with visible update dates had 23% higher click-through rates in search results. |
| 401 | |
| 402 | 7. **Ignoring search intent mismatch.** A product page rarely ranks for "how to" queries. A tutorial rarely ranks for "buy" queries. Match the page format to the intent: informational queries need guides, commercial queries need comparison pages, transactional queries need product/pricing pages. |
| 403 | |
| 404 | 8. **Writing H2 sections that depend on prior context.** If an AI extracts a section that starts with "As we saw above..." or "Using the same approach...", the extracted passage is incoherent. Every section must open with enough context to stand alone. |
| 405 | |
| 406 | ## Available scripts |
| 407 | |
| 408 | For a complete page audit, run `scripts/audit-page.py --url <URL>` — it runs all other scripts and aggregates results into the audit table from Section 10. |
| 409 | |
| 410 | | Script | What it checks | Run it when | |
| 411 | |--------|---------------|-------------| |
| 412 | | `audit-page.py` | **Orchestrator** — runs all 7 scripts below, outputs audit table | You want a full on-page SGEO audit of any URL | |
| 413 | | `extract-meta-tags.py` | Title, meta description, OG tags, canonical, robots meta | Optimizing title/description or diagnosing CTR issues | |
| 414 | | `analyze-headings.py` | Heading hierarchy, H1 count, question-format ratio, GEO score | Restructuring page headings or evaluating GEO readiness | |
| 415 | | `check-direct-answer.py` | First 200 words, direct-answer scoring (0-8 rubric), meandering patterns | Evaluating whether a page's opening is citation-ready | |
| 416 | | `check-internal-links.py` | Link count per 1000 words, anchor text quality, broken links | Auditing internal link structure or fixing anchor text | |
| 417 | | `check-images.py` | Alt text, formats, file sizes, dimensions, lazy loading | Optimizing images for speed, accessibility, and SEO | |
| 418 | | `extract-structured-data.py` | JSON-LD extraction, schema type validation, required properties | Implementing or auditing structured data markup | |
| 419 | | `check-freshness.py` | Visible dates, schema dates, author bylines, freshness score | Checking freshness signals or planning content updates | |
| 420 | |
| 421 | All scripts accept `--url <URL>` and `--tools <tools.json>`. Output is JSON by default. `audit-page.py` also accepts `--format md` for a markdown table. |
| 422 | |
| 423 | ## References |
| 424 | |
| 425 | | File | Covers | |
| 426 | |------|--------| |
| 427 | | `references/meta-optimization.md` | Title/description craft, OG tags, canonical, robots meta, industry examples | |
| 428 | | `references/heading-and-structure.md` | Hierarchy rules, GEO question-format patterns, 15 before/after rewrites | |
| 429 | | `references/geo-formatting.md` | Direct-answer-first, knowledge blocks, citation-worthy passages, 0-8 scoring rubric | |
| 430 | | `references/internal-linking.md` | Link equity, anchor text taxonomy, click depth, audit workflow | |
| 431 | | `references/image-and-media.md` | Format decision tree, compression targets, responsive images, alt text guide | |
| 432 |