
On-Page SGEO Optimization

Optimize individual pages for both search engine ranking and AI citation — covering title tags, meta descriptions, heading hierarchy, URL structure, internal linking, image optimization, structured data, direct-answer formatting, and knowledge block structure.

Tags: SEO, GEO, SGEO, on-page-SEO, meta-tags, headings, internal-linking, AI-citation, structured-data

Works well with agents: SEO Specialist Agent, Content Strategist Agent, Frontend Engineer Agent, Copywriter Agent

Works well with skills: Technical SGEO Setup, Content SGEO Strategy, Off-Page SGEO Authority, Technical SEO Audit

$ npx skills add The-AI-Directory-Company/(…) --skill on-page-sgeo
on-page-sgeo/
  • references/
    • geo-formatting.md (10.3 KB)
    • heading-and-structure.md (5.8 KB)
    • image-and-media.md (5.7 KB)
    • internal-linking.md (6.3 KB)
    • meta-optimization.md (5.6 KB)
  • scripts/
    • analyze-headings.py (5.9 KB)
    • audit-page.py (14.6 KB)
    • check-direct-answer.py (9.1 KB)
    • check-freshness.py (12.1 KB)
    • check-images.py (7.3 KB)
    • check-internal-links.py (7.6 KB)
    • extract-meta-tags.py (7.1 KB)
    • extract-structured-data.py (8.1 KB)
  • SKILL.md (25.2 KB)

# On-Page SGEO Optimization

On-page SGEO (Search Generative Engine Optimization) is the practice of optimizing individual page elements so the page both ranks in traditional search results and gets cited by AI platforms (ChatGPT, Perplexity, Gemini, Copilot). Every section below addresses both dimensions together: SEO impact and GEO (generative engine optimization) impact are not separate concerns.

This is skill 2 of 4 in the SGEO series: technical-sgeo > **on-page-sgeo** > content-sgeo > off-page-sgeo.

## Tool discovery

Before gathering project details, confirm which tools are available. Ask the user directly and do not assume access to any external service.

**Free tools (no API key required):**
- [ ] WebFetch (fetch any public URL: robots.txt, sitemaps, pages)
- [ ] WebSearch (search engine queries for competitive analysis)
- [ ] Google PageSpeed Insights API (CWV data, no key needed for basic usage)
- [ ] Google Rich Results Test (structured data validation)
- [ ] Playwright MCP or Chrome DevTools MCP (browser automation)

**Paid or credentialed tools (API key, OAuth, or MCP required):**
- [ ] Google Search Console API (requires OAuth)
- [ ] DataForSEO MCP (SERP data, keyword metrics, backlinks)
- [ ] Ahrefs API (backlink profiles, keyword research)
- [ ] Semrush API (competitive analysis, keyword data)

**The agent must:**
1. Present this checklist to the user
2. Record which tools are available
3. Pass the inventory to scripts as context
4. Fall back gracefully: every check has a free-tier path using WebFetch/WebSearch
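
The inventory step above could be captured as a small JSON document. The following is a minimal sketch only: the tool keys and the `build_inventory` and `needs_fallback` helpers are hypothetical, because the skill does not document the schema of the file that the scripts receive through `--tools`.

```python
import json

# Hypothetical tool keys; the real scripts may expect a different schema.
FREE_TOOLS = ["webfetch", "websearch", "pagespeed_insights", "rich_results_test", "browser_mcp"]
KEYED_TOOLS = ["search_console", "dataforseo", "ahrefs", "semrush"]


def build_inventory(confirmed):
    """Map every known tool to True/False based on what the user confirmed."""
    confirmed = set(confirmed)
    unknown = confirmed - set(FREE_TOOLS) - set(KEYED_TOOLS)
    if unknown:
        raise ValueError(f"Unrecognized tools: {sorted(unknown)}")
    return {name: name in confirmed for name in FREE_TOOLS + KEYED_TOOLS}


def needs_fallback(inventory):
    """True when no credentialed data source is available, so checks use the free path."""
    return not any(inventory[name] for name in KEYED_TOOLS)


inventory = build_inventory(["webfetch", "websearch", "ahrefs"])
tools_json = json.dumps(inventory, indent=2)  # text that could be saved as tools.json
```

Recording every tool explicitly, including the unavailable ones, lets a script distinguish "not available" from "never asked".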

## Before you start

Gather the following from the user. If anything is missing, ask before proceeding:

1. **Target page URL**: The live URL being optimized, or a description of the page being created.
2. **Primary keyword / topic**: The main query or subject the page should rank for and be cited on.
3. **Search intent**: Informational, navigational, commercial, or transactional. This determines the optimal page structure.
4. **Target audience**: Who this page is for (developers, marketers, executives, general consumers, etc.).
5. **Existing performance data**: Google Search Console impressions, average position, CTR, and top queries if the page already exists. Omit for new pages.
6. **AI citation priority**: Whether this page should be optimized for AI citation (high, medium, low). High-priority pages get extra GEO formatting.
7. **Related pages on the site**: Pages that should link to/from this one. Needed for internal linking recommendations.
8. **Competitor pages ranking for the same keyword**: Top 3-5 URLs currently ranking, for gap analysis.

## On-page optimization template

### 1. Title Tag & Meta Description

> **Automate:** Run `scripts/extract-meta-tags.py --url <URL>` to extract and validate all meta tags. See `references/meta-optimization.md` for industry-specific examples and OG tag requirements.

The title tag is one of the strongest on-page ranking signals. The meta description does not directly affect ranking, but it influences CTR from search results and is often extracted by AI engines as a page summary.

**Title tag rules:**

- Include the primary keyword within the first 50 characters
- Keep total length under 60 characters (Google truncates by pixel width, at ~580px)
- Front-load the most important words
- Make it specific and outcome-oriented; avoid vague labels
- Do not stuff multiple keywords separated by pipes

**Meta description rules:**

- Summarize the page's value proposition in under 155 characters
- Include the primary keyword naturally (Google bolds matching terms)
- Use active voice and a clear benefit statement
- For GEO: write the description as a complete, factual sentence, because AI engines sometimes pull meta descriptions as summary text

**Examples:**

```
BAD title: "SEO Guide | Best SEO Tips 2026 | SEO Company"
WHY: Keyword-stuffed, no specificity, no compelling reason to click.

GOOD title: "Technical SEO Checklist: 15 Fixes That Improve Crawlability"
WHY: Specific topic, concrete number, clear outcome, keyword up front.
```

```
BAD meta: "We are the best SEO company. Learn about SEO on our blog."
WHY: Self-promotional, no value proposition, no keyword alignment.

GOOD meta: "A 15-point technical SEO checklist covering crawl errors, indexation,
           Core Web Vitals, and structured data, with fix instructions for each."
WHY: Describes exactly what the reader gets. Complete sentence an AI could cite.
```
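
The length and keyword rules above are mechanical enough to check in a few lines. The sketch below is an illustration, not the logic of `extract-meta-tags.py`: the function names, the messages, and the "two or more pipes" heuristic for stuffing are assumptions made for this example.

```python
def check_title(title, keyword, max_len=60, keyword_window=50):
    """Return a list of rule violations for a title tag (empty list means it passes)."""
    issues = []
    if len(title) >= max_len:
        issues.append(f"title is {len(title)} characters; keep it under {max_len}")
    pos = title.lower().find(keyword.lower())
    if pos == -1:
        issues.append("primary keyword missing")
    elif pos + len(keyword) > keyword_window:
        issues.append(f"primary keyword ends after character {keyword_window}")
    if title.count("|") >= 2:  # assumed heuristic for pipe-separated keyword lists
        issues.append("multiple pipe-separated segments")
    return issues


def check_description(description, keyword, max_len=155):
    """Return a list of rule violations for a meta description."""
    issues = []
    if len(description) >= max_len:
        issues.append(f"description is {len(description)} characters; keep it under {max_len}")
    if keyword.lower() not in description.lower():
        issues.append("primary keyword missing")
    return issues
```

Character counts are only a proxy for the pixel-based truncation mentioned above, so a title that passes here can still be cut off if it uses many wide characters.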

### 2. URL Structure

URLs are a minor ranking factor but a major usability and crawlability signal. AI engines also parse URLs to understand page topic.

**Rules:**

- Keep it short: 3-5 words after the domain
- Use hyphens to separate words (not underscores or camelCase)
- Include the primary keyword naturally
- Use lowercase only
- No dates, IDs, session parameters, or query strings
- Match the URL to the page's core topic, not its category hierarchy

**Examples:**

```
BAD: /blog/2026/03/27/post-id-4827
WHY: Date will make the URL look stale. ID is meaningless. Deeply nested.

BAD: /resources/guides/seo/technical-seo-comprehensive-beginners-advanced-guide
WHY: Too long. Redundant words. Dilutes keyword signal.

GOOD: /technical-seo-checklist
WHY: Short, descriptive, keyword-inclusive, no unnecessary nesting.
```
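
A rough way to apply these URL rules programmatically is sketched below. The `check_url` helper and its thresholds (more than five words in the final slug, more than two path segments counted as deep nesting) are assumptions chosen to mirror the rules and examples above, not part of the skill's scripts.

```python
import re
from urllib.parse import urlsplit


def check_url(url):
    """Return a list of URL-structure issues (empty list means the URL passes)."""
    parts = urlsplit(url)
    path = parts.path
    issues = []
    if parts.query:
        issues.append("has a query string")
    if path != path.lower():
        issues.append("contains uppercase characters")
    if "_" in path:
        issues.append("uses underscores instead of hyphens")
    segments = [s for s in path.split("/") if s]
    if any(re.fullmatch(r"\d+", s) for s in segments):
        issues.append("purely numeric path segment (date or ID)")
    words = segments[-1].split("-") if segments else []
    if len(words) > 5:
        issues.append(f"slug has {len(words)} words; aim for 3-5")
    if len(segments) > 2:  # assumed threshold for "deeply nested"
        issues.append(f"nested {len(segments)} levels deep")
    return issues
```

For instance, `/technical-seo-checklist` returns no issues, while both BAD examples above are flagged for nesting plus a numeric segment or an overlong slug.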

### 3. Heading Hierarchy

> **Automate:** Run `scripts/analyze-headings.py --url <URL>` to validate the hierarchy and get a GEO heading score. See `references/heading-and-structure.md` for question-format templates and 15 before/after heading rewrites.

Headings define page structure for both search crawlers and AI parsers. A well-structured heading hierarchy helps search engines understand topic relationships and helps AI engines extract specific sections.

**Rules:**

- One H1 per page: it should match the primary topic and closely align with the title tag
- H2s for major sections, each covering a distinct subtopic
- H3s for subsections within an H2
- Never skip heading levels (for example, an H1 followed directly by an H3 with no H2)
- Do not use headings for visual styling; use CSS instead

**GEO-specific rule: use question-format H2s where natural.** AI engines match user queries to headings. A heading phrased as a question directly matches how users ask AI platforms.

**Examples:**

```
WEAK heading: "## Technical SEO Overview"
WHY: Generic. Does not match any natural query pattern.

STRONG heading: "## What Is Technical SEO?"
WHY: Matches "what is technical seo", a common AI query.
     Perplexity, ChatGPT, and Gemini are more likely to pull the content
     under this heading when answering that question.
```

```
WEAK heading: "## Pricing"
STRONG heading: "## How Much Does [Product] Cost?"
WHY: Matches commercial query format AI users actually type.
```

Not every heading should be a question. Use questions for informational and commercial intent sections, and declarative headings for procedural or reference sections.
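
The hierarchy rules can be checked against a flat list of headings. The sketch below assumes the headings have already been extracted as `(level, text)` pairs in document order; `analyze_headings` and the question ratio it returns are hypothetical and are not the GEO heading score computed by `analyze-headings.py`.

```python
def analyze_headings(headings):
    """headings: list of (level, text) tuples in document order.

    Returns (issues, question_ratio), where question_ratio is the share
    of H2 headings that end with a question mark.
    """
    issues = []
    h1_count = sum(1 for level, _ in headings if level == 1)
    if h1_count != 1:
        issues.append(f"expected exactly one H1, found {h1_count}")

    previous = 0
    for level, text in headings:
        # Going deeper by more than one level means a level was skipped.
        if previous and level > previous + 1:
            issues.append(f"H{previous} -> H{level} skips a level at {text!r}")
        previous = level

    h2_texts = [text for level, text in headings if level == 2]
    questions = sum(1 for text in h2_texts if text.strip().endswith("?"))
    question_ratio = questions / len(h2_texts) if h2_texts else 0.0
    return issues, question_ratio
```

A ratio close to 1.0 is not the goal: as noted above, procedural and reference sections read better with declarative headings.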

### 4. Direct-Answer-First Pattern (GEO)

> **Automate:** Run `scripts/check-direct-answer.py --url <URL>` to score the opening content on the 0-8 GEO rubric. See `references/geo-formatting.md` for the scoring criteria and before/after examples.

This is the most important GEO on-page technique. AI engines synthesize answers from the opening content of a page. If your answer is buried in paragraph 5 after a lengthy introduction, it is unlikely to be cited.

**The rule: the first 200 words of the page (or of each major section) must directly and completely answer the primary question.**

Lead with the TL;DR. Then elaborate.

**Bad: meandering introduction**

```
Search engine optimization has evolved significantly over the past decade.
With the rise of AI-powered search, marketers face new challenges. In this
comprehensive guide, we will explore the many facets of technical SEO and
help you understand why it matters for your business. Before we dive in,
let's take a step back and consider the history of search engines...

[The actual answer appears 600 words later]
```

**Good: direct answer first**

```
Technical SEO is the practice of optimizing a website's infrastructure so
search engines can crawl, index, and render its pages efficiently. It
covers server configuration, site architecture, structured data, page
speed, and mobile usability: everything that is not content or backlinks.

Why it matters: if search engines cannot access your pages, no amount of
content quality or link building will help you rank.

[Elaboration, details, and supporting sections follow]
```

The direct-answer version gives an AI engine a self-contained, citable passage in the first two sentences. The meandering version gives it nothing usable.

**Apply this pattern to every H2 section**, not just the page opening. Each section should open with its key point, then expand.
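
One simple way to spot a meandering opening is to scan the first 200 words for preamble phrases. This is a minimal sketch under stated assumptions: the phrase list is illustrative, the helper names are hypothetical, and it does not reproduce the 0-8 rubric used by `check-direct-answer.py`.

```python
import re

# Illustrative preamble patterns; extend or replace them for your own content.
PREAMBLE_PATTERNS = [
    r"\bin today's\b",
    r"\bin this (comprehensive )?(guide|article|post)\b",
    r"\bbefore we dive in\b",
    r"\bhas evolved\b",
    r"\blet's take a step back\b",
]


def opening_words(text, limit=200):
    """Return the first `limit` words, with whitespace normalized to single spaces."""
    return " ".join(text.split()[:limit])


def preamble_hits(text, limit=200):
    """Return the preamble patterns found in the opening of the text."""
    opening = opening_words(text, limit).lower()
    return [pattern for pattern in PREAMBLE_PATTERNS if re.search(pattern, opening)]
```

Run against the bad example above, this flags four patterns; the good example produces none. An empty result only shows the absence of known filler, so a human or a rubric still has to confirm that the opening actually answers the question.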

### 5. Self-Contained Knowledge Blocks (GEO)

> See `references/geo-formatting.md` for the self-contained knowledge block rubric, anaphoric reference checklist, and citation-worthy passage construction formula.

AI engines do not always cite an entire page. They extract specific passages, typically 50-150 words, and present them as part of a synthesized answer. Each H2 section on your page should function as a standalone knowledge block that makes sense without surrounding context.

**Rules for knowledge blocks:**

- Each H2 section should open with 50-150 words of self-contained, factual content (before elaboration)
- Include specific, citable data points within each block: numbers, percentages, named entities
- Avoid anaphoric references ("As mentioned above...", "This approach...", "It...") in the opening sentences of a section
- Front-load the most important fact or definition

**Examples:**

```
WEAK (not self-contained):
"As we discussed in the previous section, this approach can significantly
improve your results. Many companies have seen positive outcomes."

WHY: An AI extracting this passage has no idea what "this approach" refers
     to or what "positive outcomes" means. Zero citation value.
```

```
STRONG (self-contained):
"Internal linking passes PageRank between pages and helps search engines
discover content. Sites that increase internal links to key pages by 40%
see a median ranking improvement of 3.2 positions within 60 days,
according to a 2025 Ahrefs study of 14,000 domains."

WHY: Complete topic sentence. Specific data. Named source. An AI can
     extract this passage and it stands alone as a useful answer.
```
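
The knowledge-block rules lend themselves to a quick lint pass over the opening of each section. The sketch below is a rough approximation: the phrase lists and the `check_block` helper are assumptions for illustration, not the rubric from `references/geo-formatting.md`, and a digit check is only a crude stand-in for "contains a citable data point".

```python
import re

# Illustrative lists of openers that depend on earlier context.
CONTEXT_PHRASES = ("as mentioned", "as we discussed", "as we saw", "this approach", "using the same")
PRONOUN_OPENERS = {"it", "they", "these", "those"}


def check_block(text, min_words=50, max_words=150):
    """Return a list of issues for the opening block of an H2 section."""
    issues = []
    word_count = len(text.split())
    if not min_words <= word_count <= max_words:
        issues.append(f"{word_count} words; expected {min_words}-{max_words}")

    # Take the first sentence: split once at the first sentence-ending punctuation.
    first_sentence = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0].lower()
    first_word = first_sentence.split()[0].strip(",") if first_sentence else ""
    if first_word in PRONOUN_OPENERS or any(p in first_sentence for p in CONTEXT_PHRASES):
        issues.append("opening sentence depends on earlier context")

    if not re.search(r"\d", text):
        issues.append("no numeric data point")
    return issues
```

Applied to the WEAK example above, this reports all three issues: the passage is too short, opens with a back-reference, and contains no numbers.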

### 6. Internal Linking

> **Automate:** Run `scripts/check-internal-links.py --url <URL>` to count links, evaluate anchor text quality, and detect broken links. See `references/internal-linking.md` for anchor text taxonomy and click depth optimization.

Internal links distribute link equity, help crawlers discover pages, establish topical relationships, and guide users through your site. They also help AI engines understand your site's knowledge structure.

**Rules:**

- Aim for 3-5 internal links per 1,000 words of content
- Use descriptive anchor text that tells the reader (and crawlers) what the destination page covers
- Link to your most important pages from your most authoritative pages
- Link contextually within body content, not just in sidebars or footers
- Ensure every important page is reachable within 3 clicks from the homepage

**Anchor text examples:**

```
BAD: "For more information, click here."
WHY: "click here" tells crawlers and AI nothing about the destination.

BAD: "Read our technical SEO guide for a comprehensive overview of
     technical SEO best practices for technical SEO."
WHY: Over-optimized. Keyword-stuffed anchor text triggers spam signals.

GOOD: "Run a technical SEO audit to identify crawl and indexation issues
      before optimizing individual pages."
WHY: Descriptive, natural, tells the reader and crawlers what to expect.
```

**GEO consideration:** AI crawlers can follow internal links to build context about your site's expertise. A well-linked site on a topic signals topical authority, which makes any individual page more likely to be cited.
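
Link density and anchor quality can be estimated from a list of links and a word count. The sketch below assumes the links were already extracted as `(href, anchor_text)` pairs; `link_report`, the generic-anchor list, and the returned keys are hypothetical names for this example, and broken-link detection is left out.

```python
from urllib.parse import urlsplit

# Illustrative set of anchors that say nothing about the destination.
GENERIC_ANCHORS = {"click here", "here", "learn more", "read more"}


def link_report(word_count, links, site_host):
    """links: list of (href, anchor_text). Returns density and anchor findings."""
    internal = []
    for href, anchor in links:
        parts = urlsplit(href)
        # Relative links and links to the site's own host count as internal.
        if parts.scheme in ("", "http", "https") and parts.netloc in ("", site_host):
            internal.append((href, anchor))

    per_1000 = len(internal) / word_count * 1000 if word_count else 0.0
    generic = [a for _, a in internal if a.strip().lower().rstrip(".") in GENERIC_ANCHORS]
    return {
        "internal": len(internal),
        "per_1000_words": round(per_1000, 1),
        "in_target_range": 3 <= per_1000 <= 5,
        "generic_anchors": generic,
    }
```

The 3-5 range is a guideline from the rules above; a page slightly outside it is not necessarily a problem if every link is contextual and descriptive.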

### 7. Image Optimization

> **Automate:** Run `scripts/check-images.py --url <URL>` to audit alt text, file sizes, formats, dimensions, and lazy loading. See `references/image-and-media.md` for the format decision tree and compression targets.

Images affect page speed, accessibility, search visibility (image search), and Cumulative Layout Shift (CLS). AI engines that process visual content (Google's multimodal search, Bing visual search) also use image metadata.

**Rules:**

- **Alt text:** Descriptive, concise (under 125 characters). Include the primary keyword only if the image genuinely relates to it. Do not keyword-stuff alt attributes.
- **File format:** Use WebP or AVIF for photographs, SVG for icons and illustrations. Serve fallback formats for older browsers.
- **File size:** Compress images to under 200 KB where possible. Use tools like Squoosh, Sharp, or your build pipeline's image optimization.
- **Lazy loading:** Add `loading="lazy"` to all images below the fold. Do NOT lazy-load the LCP image (usually the hero image).
- **Dimensions:** Always set explicit `width` and `height` attributes (or use CSS `aspect-ratio`) to prevent layout shift.
- **File names:** Use descriptive, hyphenated file names (`technical-seo-audit-results.webp`, not `IMG_4827.jpg`).

**Examples:**

```
BAD alt: alt="image" or alt="" (on a meaningful image) or alt="SEO SEO guide SEO tips"
GOOD alt: alt="Screaming Frog crawl report showing 47 pages with redirect chains"
```
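
The attribute-level rules (alt text and explicit dimensions) can be checked with the standard library's HTML parser. This sketch is an assumption-laden illustration rather than the behavior of `check-images.py`: the `ImageAudit` class and its messages are hypothetical, it cannot tell decorative images (where an empty alt is valid) from meaningful ones, and it does not measure file size or format.

```python
from html.parser import HTMLParser

GENERIC_ALTS = {"", "image", "photo", "picture"}  # illustrative list


class ImageAudit(HTMLParser):
    """Collect (src, issue) pairs for every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        alt = attributes.get("alt")
        if alt is None or alt.strip().lower() in GENERIC_ALTS:
            self.issues.append((src, "missing or generic alt text"))
        elif len(alt) >= 125:
            self.issues.append((src, "alt text is 125 characters or longer"))
        if not ("width" in attributes and "height" in attributes):
            self.issues.append((src, "no explicit width/height"))


def audit_images(html):
    parser = ImageAudit()
    parser.feed(html)
    return parser.issues
```

Images sized purely through CSS `aspect-ratio` would be reported as missing dimensions here, so treat that finding as a prompt to look, not as a definite defect.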

### 8. Structured Data Per Page

> **Automate:** Run `scripts/extract-structured-data.py --url <URL>` to extract JSON-LD, identify schema types, and validate required properties.

Structured data (JSON-LD) helps search engines understand page content and enables rich results. For GEO, structured data pre-packages information in a machine-readable format that AI engines can directly parse.

**Page type to schema mapping:**

```
| Page Type       | Schema Type   | Key Properties                                     |
|-----------------|---------------|----------------------------------------------------|
| Homepage        | Organization  | name, url, logo, sameAs (social profiles)          |
| Product page    | Product       | name, price, availability, review, aggregateRating |
| Blog post       | Article       | headline, author, datePublished, dateModified      |
| FAQ page        | FAQPage       | mainEntity (array of Question + acceptedAnswer)    |
| How-to guide    | HowTo         | name, step (array of HowToStep with text + image)  |
| Local business  | LocalBusiness | address, geo, openingHours, telephone              |
| Event page      | Event         | name, startDate, location, offers                  |
| Person/bio page | Person        | name, jobTitle, worksFor, sameAs                   |
```

**Implementation rules:**

- Use JSON-LD format (Google's preferred format), placed in a `<script type="application/ld+json">` tag
- One primary schema type per page; do not overload a single page with unrelated schema types
- Validate every page with Google's Rich Results Test (https://search.google.com/test/rich-results)
- Keep schema data consistent with visible on-page content, because mismatches can trigger manual actions

**GEO-specific note:** FAQPage schema is particularly valuable for AI citation. It pre-structures question/answer pairs in a format that is easy for AI engines to consume. If your page answers common questions, implement FAQPage schema even if the page is not a traditional FAQ: blog posts and product pages can include FAQ sections with matching schema. Since 2023 Google has shown FAQ rich results only for a limited set of authoritative government and health sites, so treat this markup as a machine-readability aid rather than a guaranteed search feature.
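
As an illustration of the FAQPage structure named in the mapping table, the sketch below builds the JSON-LD from question/answer pairs and wraps it in the script tag described in the implementation rules. The helper names `faq_jsonld` and `script_tag` are hypothetical; the property names (`mainEntity`, `Question`, `acceptedAnswer`, `Answer`) come from schema.org.

```python
import json


def faq_jsonld(pairs):
    """Build a FAQPage object from (question, answer) pairs shown on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def script_tag(data):
    """Serialize the object into a JSON-LD script element."""
    # Escape "</" so text inside the JSON cannot close the script element early;
    # "\/" is a valid JSON escape for "/", so parsers still read the same value.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


data = faq_jsonld([
    ("What is technical SEO?",
     "Technical SEO is the practice of optimizing a website's infrastructure "
     "so search engines can crawl, index, and render its pages efficiently."),
])
tag = script_tag(data)
```

Generating the markup from the same question/answer pairs that render on the page is one way to keep schema and visible content consistent, as the implementation rules require.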

### 9. Freshness Signals

> **Automate:** Run `scripts/check-freshness.py --url <URL>` to verify visible dates, schema dates, author bylines, and freshness score.

Both search engines and AI engines weight content recency. A page last updated in 2022 is less likely to be cited for a 2026 query than one updated this month, even if the underlying information has not changed.

**Rules:**

- Display a visible "Last updated: [date]" timestamp on content pages; this signals recency to both users and AI crawlers
- Include `dateModified` in your Article schema markup (not just `datePublished`)
- Show author name and credentials on content pages; this feeds E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) signals that both Google and AI engines evaluate
- When updating content, change substantive information; do not just change the date. Search engines can detect superficial updates.
- Review and update high-value pages on a quarterly cycle at minimum

**Example Article schema with freshness signals:**

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Technical SEO Checklist: 15 Fixes That Improve Crawlability",
  "author": {
    "@type": "Person",
    "name": "Jane Smith",
    "jobTitle": "Senior SEO Engineer",
    "url": "https://example.com/team/jane-smith"
  },
  "datePublished": "2025-06-15",
  "dateModified": "2026-03-20",
  "publisher": {
    "@type": "Organization",
    "name": "Example Company",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png"
    }
  }
}
```
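
A parsed Article object like the one above can be checked for the freshness rules in a few lines. This is a sketch under assumptions: `freshness_issues` is a hypothetical helper, the 90-day threshold is one reading of "quarterly", and dates are assumed to be plain `YYYY-MM-DD` strings as in the example (full timestamps would need different parsing).

```python
from datetime import date


def freshness_issues(article, today, max_age_days=90):
    """Return freshness problems found in a parsed Article JSON-LD object."""
    issues = []
    if not article.get("author", {}).get("name"):
        issues.append("author name missing")

    modified = article.get("dateModified")
    if not modified:
        issues.append("dateModified missing")
        return issues

    modified_date = date.fromisoformat(modified)
    published = article.get("datePublished")
    if published and modified_date < date.fromisoformat(published):
        issues.append("dateModified precedes datePublished")

    age_days = (today - modified_date).days
    if age_days > max_age_days:
        issues.append(f"last modified {age_days} days ago")
    return issues


article = {
    "datePublished": "2025-06-15",
    "dateModified": "2026-03-20",
    "author": {"@type": "Person", "name": "Jane Smith"},
}
```

Passing `today` explicitly keeps the check deterministic; in practice it would usually be `date.today()`. A recent `dateModified` only passes this check, it does not show that the update was substantive.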

### 10. On-Page SGEO Audit Table

> **Automate:** Run `scripts/audit-page.py --url <URL> --format md` to generate this table automatically. The orchestrator runs all 7 scripts and aggregates results.

Use this consolidated table to audit any existing page. Walk through each element, note the current state, and flag items that need work.

```
| Element                    | SEO Impact | GEO Impact | What to Check                                              | Status |
|----------------------------|------------|------------|------------------------------------------------------------|--------|
| Title tag                  | High       | Medium     | Under 60 chars, keyword up front, specific and compelling  | [ ]    |
| Meta description           | Medium     | Medium     | Under 155 chars, value proposition, complete sentence      | [ ]    |
| URL structure              | Medium     | Low        | Short, descriptive, keyword-inclusive, no parameters       | [ ]    |
| H1 tag                     | High       | Medium     | One per page, matches primary topic, aligns with title     | [ ]    |
| Heading hierarchy          | Medium     | High       | Logical H2/H3 structure, question-format where natural     | [ ]    |
| Direct-answer opening      | Low        | High       | First 200 words directly answer the primary question       | [ ]    |
| Knowledge blocks           | Low        | High       | Each H2 section is self-contained, 50-150 words, specific  | [ ]    |
| Internal links             | High       | Medium     | 3-5 per 1000 words, descriptive anchors, contextual        | [ ]    |
| Image alt text             | Medium     | Low        | Descriptive, concise, keyword where natural                | [ ]    |
| Image file size & format   | Medium     | Low        | Under 200KB, WebP/AVIF, lazy loading below fold            | [ ]    |
| Image dimensions           | Medium     | Low        | Explicit width/height set, no CLS                          | [ ]    |
| Structured data            | Medium     | High       | Correct schema type for page, validates in Rich Results    | [ ]    |
| FAQPage schema             | Medium     | High       | Applied to pages with Q&A content, matches visible content | [ ]    |
| dateModified in schema     | Low        | High       | Present and reflects actual last substantive update        | [ ]    |
| Visible last-updated date  | Low        | Medium     | Displayed on page, matches schema dateModified             | [ ]    |
| Author & credentials       | Medium     | High       | Name, role, and expertise visible on content pages         | [ ]    |
```

## Quality checklist

Before delivering the optimized page, verify:

- [ ] Title tag is under 60 characters, includes primary keyword, and is specific
- [ ] Meta description is under 155 characters, reads as a complete sentence, and conveys the page's value
- [ ] URL is short, descriptive, and keyword-inclusive
- [ ] Page has exactly one H1 and a logical H2/H3 hierarchy with question-format headings where appropriate
- [ ] The first 200 words directly answer the primary question without preamble
- [ ] Each H2 section opens with a self-contained knowledge block (50-150 words that stand alone)
- [ ] 3-5 internal links per 1,000 words with descriptive anchor text
- [ ] All images have descriptive alt text, are compressed, use modern formats, and have explicit dimensions
- [ ] Correct structured data type is implemented and validates in Rich Results Test
- [ ] Freshness signals are present: visible date, dateModified in schema, author credentials

## Common mistakes to avoid

1. **Keyword stuffing in title tags and headings.** Repeating the keyword 3+ times does not help; it triggers spam signals and reads poorly. Use the keyword once naturally and use semantic variations elsewhere.

2. **Burying the answer below a long introduction.** AI engines extract from the opening content. If your first 200 words are throat-clearing ("In today's digital landscape..."), your page is unlikely to be cited. Lead with the answer.

3. **Generic headings instead of question-format headers.** "Overview" and "Introduction" tell crawlers and AI nothing. "What Is [Topic]?" and "How Does [Topic] Work?" match real queries and increase citation likelihood.

4. **Missing structured data for the page type.** A product page without Product schema, a blog post without Article schema, an FAQ without FAQPage schema: each is a missed opportunity for rich results and AI extraction.

5. **Internal links with "click here" or "learn more" as anchor text.** These waste the anchor text signal entirely. Describe the destination: "Run a technical SEO audit" is both more useful to users and more informative to crawlers.

6. **No freshness signals on content pages.** Pages without visible dates, author information, or dateModified schema appear stale to both search engines and AI engines. A 2024 Semrush study found that pages with visible update dates had 23% higher click-through rates in search results.

7. **Ignoring search intent mismatch.** A product page is unlikely to rank for "how to" queries, and a tutorial is unlikely to rank for "buy" queries. Match the page format to the intent: informational queries need guides, commercial queries need comparison pages, transactional queries need product/pricing pages.

8. **Writing H2 sections that depend on prior context.** If an AI extracts a section that starts with "As we saw above..." or "Using the same approach...", the extracted passage is incoherent. Every section must open with enough context to stand alone.
406## Available scripts
407 
408For a complete page audit, run `scripts/audit-page.py --url <URL>` — it runs all other scripts and aggregates results into the audit table from Section 10.
409 
410| Script | What it checks | Run it when |
411|--------|---------------|-------------|
412| `audit-page.py` | **Orchestrator** — runs all 7 scripts below, outputs audit table | You want a full on-page SGEO audit of any URL |
413| `extract-meta-tags.py` | Title, meta description, OG tags, canonical, robots meta | Optimizing title/description or diagnosing CTR issues |
414| `analyze-headings.py` | Heading hierarchy, H1 count, question-format ratio, GEO score | Restructuring page headings or evaluating GEO readiness |
415| `check-direct-answer.py` | First 200 words, direct-answer scoring (0-8 rubric), meandering patterns | Evaluating whether a page's opening is citation-ready |
416| `check-internal-links.py` | Link count per 1000 words, anchor text quality, broken links | Auditing internal link structure or fixing anchor text |
417| `check-images.py` | Alt text, formats, file sizes, dimensions, lazy loading | Optimizing images for speed, accessibility, and SEO |
418| `extract-structured-data.py` | JSON-LD extraction, schema type validation, required properties | Implementing or auditing structured data markup |
419| `check-freshness.py` | Visible dates, schema dates, author bylines, freshness score | Checking freshness signals or planning content updates |
420 
421All scripts accept `--url <URL>` and `--tools <tools.json>`. Output is JSON by default. `audit-page.py` also accepts `--format md` for a markdown table.
422 
423## References
424 
425| File | Covers |
426|------|--------|
427| `references/meta-optimization.md` | Title/description craft, OG tags, canonical, robots meta, industry examples |
428| `references/heading-and-structure.md` | Hierarchy rules, GEO question-format patterns, 15 before/after rewrites |
429| `references/geo-formatting.md` | Direct-answer-first, knowledge blocks, citation-worthy passages, 0-8 scoring rubric |
430| `references/internal-linking.md` | Link equity, anchor text taxonomy, click depth, audit workflow |
431| `references/image-and-media.md` | Format decision tree, compression targets, responsive images, alt text guide |
432 

©2026 ai-directory.company
