Category: Generative AI

Markdown, JSON, YAML, and XML: what is the best content format for both humans and AI?

Humans and AI retrieve and consume content differently. In this post, I want to discuss how to strike the best balance between content for humans and content for AI.

In my earlier posts, I recommended using structured content so that AI can chunk, understand, and retrieve it more effectively. When we talk about structured content, we often look at these document formats: Markdown, JSON, XML, and YAML.

So, which document formats are best for both humans and AI? Let’s take a look at each of them:

Markdown (.md)

What it is:
Markdown is a lightweight markup language designed to make writing for the web simple and readable. It uses plain text syntax (like # for headings or - for lists) that converts easily to HTML.

Example:

# Deploying to Cloud Run
Learn how to deploy your first app.

## Steps
1. Build your image
2. Push to Container Registry
3. Deploy with Cloud Run

Industry Example:

Microsoft Learn and Google Developers both use Markdown as their primary authoring format.
All articles on learn.microsoft.com are .md files stored in GitHub repos like microsoftdocs/azure-docs.
AWS, GitHub, and OpenAI also use Markdown for documentation and developer guides.

Why humans like it:

Clean, minimal, and intuitive — almost like writing an email.
Easy to learn, edit, and version-control in Git.
Highly readable even before rendering.

Why AI likes it:

Semantically structured (headings, lists, tables) without layout noise.
Perfect for chunking and embedding for retrieval-augmented generation (RAG) or Copilot ingestion.
Mirrors the formats LLMs are trained on (GitHub, documentation, etc.).
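
To make the chunking point concrete, here is a minimal Python sketch that splits the example above into one chunk per heading. It is an illustration under simple assumptions, not how any particular RAG pipeline does it: the function name chunk_markdown is hypothetical, and the sketch ignores edge cases such as # characters inside fenced code blocks.

```python
import re


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into one chunk per heading, keeping the heading as the chunk title."""
    chunks = []
    current = {"heading": None, "level": 0, "lines": []}
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading closes the previous chunk, if that chunk has any content.
            if current["heading"] is not None or current["lines"]:
                chunks.append(current)
            current = {
                "heading": match.group(2).strip(),
                "level": len(match.group(1)),
                "lines": [],
            }
        else:
            current["lines"].append(line)
    chunks.append(current)
    return [
        {"heading": c["heading"], "level": c["level"], "body": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]


sample = """# Deploying to Cloud Run
Learn how to deploy your first app.

## Steps
1. Build your image
2. Push to Container Registry
3. Deploy with Cloud Run
"""

for chunk in chunk_markdown(sample):
    print(chunk["level"], chunk["heading"], "->", len(chunk["body"]), "characters")
```

Each chunk carries its own heading, so it can be embedded and retrieved on its own without losing the topic it belongs to.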

Trade-offs:

Limited metadata support compared to JSON/YAML.
Not ideal for representing complex relational data.

✅ Best for:
Readable documentation, tutorials, conceptual and how-to content consumed by both humans and AI.

JSON (.json)

What it is:
JavaScript Object Notation (JSON) is a structured data format using key–value pairs. It’s widely used for APIs, configurations, and machine-to-machine communication.

Example:

{
  "title": "Deploy to Cloud Run",
  "steps": [
    "Build your image",
    "Push to Container Registry",
    "Deploy with Cloud Run"
  ],
  "author": "Maggie Hu"
}

Industry Example:

Use Case	Example	Purpose
Microsoft Learn Catalog	JSON for doc metadata	AI indexing and discovery
Google Vertex AI	JSON for prompt documentation	LLM instruction structuring
OpenAI Function Docs	JSON as documentation schema	Model understanding
Schema.org JSON-LD	JSON for semantic content	AI/web discoverability

Why humans like it:

Familiar to developers and easy to read for small datasets.
Ideal for storing structured data or configuration.

Why AI likes it:

Clear, unambiguous key-value structure for precise information retrieval.
Ideal for embedding metadata and reasoning in structured formats.
Natively supported as input/output format for LLMs.
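
As a small illustration of that key-value precision, the sketch below loads the example above with Python's standard json module and reads individual fields by name. The variable names are just for this example.

```python
import json

raw = """
{
  "title": "Deploy to Cloud Run",
  "steps": [
    "Build your image",
    "Push to Container Registry",
    "Deploy with Cloud Run"
  ],
  "author": "Maggie Hu"
}
"""

doc = json.loads(raw)

# Each field is addressable by key, so a pipeline can pull exactly what it needs.
print(doc["title"])
for number, step in enumerate(doc["steps"], start=1):
    print(f"{number}. {step}")
```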

Trade-offs:

Harder for non-technical readers to interpret.
Not suitable for long-form narrative text.

✅ Best for:
Metadata, structured data exchange, and AI pipelines requiring precise context.

YAML (.yml / .yaml)

What it is:
YAML (“YAML Ain’t Markup Language”) is a human-friendly data serialization format often used for configuration files. It’s similar to JSON but uses indentation instead of braces.

Example:

title: Deploy to Cloud Run
description: Learn how to deploy your first containerized app.
steps:
  - Build your image
  - Push to Container Registry
  - Deploy with Cloud Run
author: Maggie Hu

Industry Example:

Microsoft Learn, GitHub Pages (Jekyll), and Hugo/Docsy sites use YAML front matter at the top of Markdown files to store metadata like title, topic, author, and tags.
Kubernetes defines all infrastructure configuration (pods, deployments, secrets) in YAML.
GitHub Actions uses YAML to describe CI/CD workflows (.github/workflows/main.yml).

Why humans like it:

Clean indentation mirrors logical hierarchy.
Excellent for connecting content with structured metadata.
Easy to read and edit directly in Markdown front matter.

Why AI likes it:

Provides machine-parsable structure with human-friendly syntax.
Used widely for prompt templates, model configuration, and structured metadata ingestion.

Trade-offs:

Sensitive to spacing and indentation errors.
Can be ambiguous when representing data types.

✅ Best for:
Config files, front-matter metadata, and hybrid human–AI authoring systems.

XML (.xml)

What it is:
eXtensible Markup Language (XML) is a tag-based format for representing structured data hierarchies. It’s verbose but powerful for enforcing schema-based content consistency.

Example:

<task id="deploy-cloud-run">
  <title>Deploy to Cloud Run</title>
  <steps>
    <step>Build your image</step>
    <step>Push to Container Registry</step>
    <step>Deploy with Cloud Run</step>
  </steps>
</task>

Industry Example:

IBM, the creator of DITA, and companies like Cisco, Oracle, and Adobe use XML-based DITA systems for large-scale technical documentation.
Financial, aerospace, and medical industries rely on XML for regulated documentation and content validation (e.g., FAA, FDA compliance).
Microsoft’s legacy MSDN and Office help systems were XML-based before their Markdown migration.

Why humans (used to) love it:

Strict structure ensures consistency and reusability.
Excellent for translation and compliance workflows.

Why AI doesn’t love it as much:

Verbose, token-heavy, and less semantically clean for LLMs.
Requires preprocessing to strip tags for content embedding.
Complex to maintain for open collaboration.
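
The preprocessing step can be small. As a rough sketch, assuming well-formed XML like the example above, Python's standard xml.etree.ElementTree module can drop the tags and keep only the text. The helper name xml_to_plain_text is hypothetical.

```python
import xml.etree.ElementTree as ET

xml_source = """
<task id="deploy-cloud-run">
  <title>Deploy to Cloud Run</title>
  <steps>
    <step>Build your image</step>
    <step>Push to Container Registry</step>
    <step>Deploy with Cloud Run</step>
  </steps>
</task>
"""


def xml_to_plain_text(source: str) -> str:
    """Return the text content of an XML document, one non-empty text node per line."""
    root = ET.fromstring(source.strip())
    # itertext() walks the tree and yields only text nodes, dropping the tags.
    pieces = (text.strip() for text in root.itertext())
    return "\n".join(piece for piece in pieces if piece)


print(xml_to_plain_text(xml_source))
```

Note that this throws away the structure the tags carried (which line is the title, which are steps), so a real pipeline would likely keep some of it as metadata.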

Trade-offs:

Ideal for governance and reuse, but difficult for readability.
Better suited for enterprise content management systems than AI retrieval.

✅ Best for:
Regulated or legacy technical documentation requiring schema validation.

Summary: Human vs. AI Alignment

Takeaway

The best format for both humans and AI is Markdown enhanced with YAML or JSON metadata.
Markdown provides readability and natural structure for human writers, while YAML and JSON add the precision and hierarchy that AI systems rely on for retrieval, linking, and reasoning.
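
Here is one way that combination might look in practice: a Markdown page with a front matter block on top, plus a short Python sketch that separates the two. This is deliberately simplified. It only handles flat key: value lines rather than full YAML (a real project would use a YAML library), the function name split_front_matter is hypothetical, and the audience field is an illustrative addition.

```python
def split_front_matter(text: str) -> tuple[dict, str]:
    """Separate a leading '---' metadata block from the Markdown body.

    Simplified on purpose: handles only flat `key: value` lines, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).strip()
            return metadata, body
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()
    # No closing delimiter found: treat the whole text as body.
    return {}, text


page = """---
title: Deploy to Cloud Run
author: Maggie Hu
audience: developers
---
# Deploying to Cloud Run
Learn how to deploy your first app.
"""

metadata, body = split_front_matter(page)
print(metadata)
print(body)
```

The writer edits one readable file, while a pipeline gets the metadata as structured fields and the body as clean Markdown.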

October 22, 2025

Use Microsoft Knowledge Agent for your enterprise knowledge management
In the era of AI, what does knowledge management (KM) truly mean? Is it about storing information, or making knowledge dynamic, discoverable, and actionable in real time?

For decades, knowledge management (KM) has focused on capturing and organizing information—wikis, document libraries, and structured taxonomies. But today’s organizations need more than static repositories. They need systems that surface answers instantly, connect insights across silos, and turn content into action.

Each organization’s KM strategy depends on its unique mix of content types, governance needs, and user expectations. Some rely on structured formats and rigid taxonomies; others have sprawling repositories of Office files, PDFs, and web pages.

On September 18th, Microsoft released its Knowledge Agent (preview). This agent lets you do many things within enterprise knowledge management, such as:
- Ask questions about your content
- Summarize files
- Compare content
- Generate FAQ from files
- Create audio overviews (Word & PDF)
- Review and fix a SharePoint site
- Create SharePoint pages, sections, and content
- Refine SharePoint pages
The agent currently supports:
- Microsoft Office files (doc, docx, ppt, pptx, and xlsx)
- Modern Microsoft 365: FLUID, LOOP
- Universal: PDF, TXT, RTF
- Web files: ASPX, HTM, HTML
- OpenDocument: ODT, ODP
This is especially powerful for organizations that don’t have structured file types like Markdown or JSON but still want AI-driven KM. Instead of forcing a migration to rigid formats, Knowledge Agent works with what you already have.

Traditional KM tools often require heavy upfront structuring—taxonomies, metadata, and governance models. But in reality, most enterprises have unstructured or semi-structured content scattered across SharePoint, Teams, and legacy systems. Knowledge Agent bridges that gap by:
- Reducing friction: No need to reformat everything into specialized schemas.
- Enhancing discoverability: Natural language Q&A over your existing content.
- Accelerating content improvement: Automated site reviews and page refinements.
In short, it’s a practical way to unlock the value of your existing knowledge assets while layering in AI capabilities.

What do you think of the AI era of enterprise knowledge management? What solution will you choose?
September 30, 2025
From Learning Design to Prompt Design: Principles That Transfer
As a learning designer, I’ve worked with principles that help people absorb knowledge more effectively. In the past few years, as I’ve experimented with GenAI prompting in many ways, I’ve noticed that many of those same principles transfer surprisingly well.

I mapped a few side by side, and the parallels are striking. For example, just as we scaffold learning for students, we can scaffold prompts for AI.

Here’s a snapshot of the framework:

A few of the mappings:
- Clear objectives → Define prompt intent
- Scaffolding → Break tasks into steps
- Reduce cognitive load → Keep prompts simple
- And more…
Instructional design and prompt design share more than I expected.
Which of these parallels resonates most with your work?
September 18, 2025
Designing prompts that encourage AI reflection

Ever had GenAI confidently answer your question, then backtrack when you challenged it?

Example:
Me: Is the earth flat or a sphere?
AI: A sphere.
Me: Are you sure? Why isn’t it flat?
AI: Actually, good point. The earth is flat, because…

This type of conversation with AI happens to me a lot. Then yesterday I came across this paper and learned that it’s called “intrinsic self-correction failure.”

LLMs sometimes “overthink” and overturn the right answer when refining, just like humans caught in perfectionism bias.

The paper proposes that repeating the question can help AI self-correct.

From my own practice, I’ve noticed another helpful approach: asking the AI to explain its answer.

When I do this, the model almost seems to “reflect.” It feels similar to reflection in human learning. When we pause to explain our reasoning, we often deepen our understanding. AI seems to benefit from a similar nudge.

Reflection works for learners. Turns out, it works for AI too.
How do you keep GenAI from “over-correcting” itself?

September 17, 2025
Turn GitHub Copilot Into Your Documentation Co-Writer
For documentation writers managing large sets of content—enterprise knowledge bases, multi-product help portals, or internal wikis—the challenge goes beyond polishing individual sentences. You need to:
- Keep a consistent voice and style across hundreds of articles.
- Spot duplicate or overlapping topics
- Maintain accurate metadata and links
- Gain insights into content gaps and structure
This is where GitHub Copilot inside Visual Studio Code stands out. Unlike generic Gen-AI chatbots, Copilot has visibility across your entire content set, not just the file you’re editing. With carefully crafted prompts and instructions, that means you can ask it to:
- Highlight potential gaps, redundancies, or structural issues.
- Suggest rewrites that preserve consistency across articles.
- Surface related content to link or cross-reference.
In other words, Copilot isn’t just a text improver—it’s a content intelligence partner for documentation at scale. And if you’re already working in VS Code, it integrates directly into your workflow without requiring a new toolset.

What Can GitHub Copilot Do for Your Documentation

Once installed, GitHub Copilot can work directly on your .md, .html, .xml, or .yml files. Here’s how it helps across both single documents and large collections:

Refine Specific Text Blocks

Highlight a section and ask Copilot to improve the writing. This makes it easy to sharpen clarity and tone in targeted areas.

Suggest Edits Across the Entire Article

Use Copilot Chat to get suggestions for consistency and flow across an entire piece.

Fill in Metadata and Unfinished Sections

Copilot can auto-complete metadata fields or unfinished drafts, reducing the chance of missing key details.

Surface Relevant Links

While you’re writing, Copilot may suggest links to related articles in your repository—helping you connect content for the reader.

Spot Duplicates and Gaps (emerging use)

With tailored prompts, you can ask Copilot to scan for overlap between articles or flag areas where documentation is thin. This gives you content architecture insights, not just sentence-level edits.

What do you need to set up GitHub Copilot?

To set up GitHub Copilot, you will need:
- A GitHub account
- Visual Studio Code (free)
- The GitHub Copilot extension for Visual Studio Code
Note: While GitHub Copilot offers a free tier, paid plans provide additional features and higher usage limits.

Why Copilot Is Different from Copilot in Word or other Gen-AI Chatbots

At first glance, you might think these features look similar to what Copilot in Word or other generative AI chatbots can do. But GitHub Copilot offers unique advantages for documentation work:
- Cross-Document Awareness
  Because it’s embedded in VS Code, Copilot has visibility into your entire local repo. For example, if you’re writing about pay-as-you-go billing in one article, it can pull phrasing or context from another relevant file almost instantly.
- Enterprise Content Intelligence
  With prompts, you can ask Copilot to analyze your portfolio: identify duplicate topics, find potential links, and even suggest improvements to your information architecture. This is especially valuable for knowledge bases and enterprise-scale content libraries.
- Code-Style Edit Reviews
  Visual Studio Code with GitHub Copilot can show suggested edits as diffs, the way code changes are displayed. You can then review each edit and accept or reject it, just as you would in a code review. This differs from generic Gen AI content editors, which either apply edits directly or only suggest them.
- Customizable Rules and Prompts
  You can set up a custom instructions file (for example, .github/copilot-instructions.md) that defines rules for tone, heading style, or terminology. You can also create reusable prompt files and call them with / during chats. This ensures your writing is not just polished, but also consistent with your team’s standards.
Together, these capabilities transform GitHub Copilot from a document-level writing assistant into a documentation co-architect.

Limitations

Like any AI tool, GitHub Copilot isn’t perfect. Keep this in mind:

Always review suggestions
Like any other Gen AI tool, GitHub Copilot can hallucinate. Always review its suggestions and validate its edits.

Wrap-Up: Copilot as Your Content Partner

GitHub Copilot inside Visual Studio Code isn’t just another AI writing assistant—it’s a tool that scales with your entire content ecosystem.
- It refines text, polishes full articles, completes metadata, and suggests links.
- It leverages cross-document awareness to reveal gaps, duplicates, and structural improvements.
- It enforces custom rules and standards, ensuring consistency across hundreds of files.
And here’s where the real advantage comes in: with careful crafting of prompts and instruction files, Copilot becomes more than a reactive assistant. You can guide it to apply your team’s style, enforce terminology, highlight structural issues, and even surface information architecture insights. In other words, the quality of what Copilot gives you is shaped by the quality of what you feed it.

For content creators managing large sets of documentation, Copilot is more than a co-writer—it’s a content intelligence partner and co-architect. With thoughtful setup and prompt design, it helps you maintain quality, speed, and consistency—even at enterprise scale.

👉 Try it in your next documentation sprint and see how it transforms the way you manage your whole body of content.
September 11, 2025
Enhance Content Discoverability with Effective Metadata
In my previous posts on AI content optimization, I focused on two critical elements: how content chunking minimizes hallucination and how vector embeddings enable semantic matching. I covered recommendations like using structured formats, question-and-answer sets, and conversational language to improve both content quality and discoverability.

But I didn’t dive deep into the third pillar that makes RAG truly powerful: metadata. While chunking handles the “what” of your content and embeddings capture the “meaning,” metadata provides the essential “context” that transforms good retrieval into precise, relevant results.

The Three Pillars of RAG-Optimized Content

Chunking (The “What”): Breaks content into digestible, topic-focused pieces
- Structured formats with clear headings
- Single-topic chunks
- Consistent templates
Vector Embeddings (The “Meaning”): Captures semantic understanding
- Question-format headings
- Conversational language
- Semantic similarity matching
Metadata (The “Context”): Provides situational relevance
- Article type and intended audience
- Skill level and role requirements
- Date, version, and related topics
To understand why richer metadata provides better context, we need to look at how vector embeddings are stored in a vector database. After all, when a RAG system compares and retrieves chunks, it searches the vector database for semantic matches.

So, what does a vector record (the data entry) in a vector database look like?

What’s Inside a Vector Record?

A vector record has three parts:

1. Unique ID: A label that helps you quickly find the original content, which is stored separately.

2. The Vector: A list of numbers that represents your content’s meaning (like a mathematical “fingerprint”). For example, text might become a list of 768 numbers.

Key rule: All vectors in the same collection must have the same length – you can’t mix different sizes.

3. Metadata: Extra tags that add context, including:
- Platform info: Creation date, source, importance level
- Content details: Title, author, publication info
- Chunk info: Details about this specific piece of content
This structure lets you search by both meaning (vectors) and context (metadata) for more precise results. (Vector databases and metadata filtering)
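
To picture the three parts together, here is a minimal Python sketch of a vector record and a toy collection that enforces the same-length rule. The class names VectorRecord and Collection, the record ID, and the metadata values are all hypothetical and do not reflect any specific vector database's API. The four-number vector is a stand-in for a real embedding, which would be much longer.

```python
from dataclasses import dataclass, field


@dataclass
class VectorRecord:
    record_id: str                 # points back to the original content, stored elsewhere
    vector: list[float]            # the embedding ("fingerprint") of the chunk
    metadata: dict = field(default_factory=dict)  # extra tags that add context


class Collection:
    """Toy in-memory collection that enforces one vector length for all records."""

    def __init__(self, dimension: int):
        self.dimension = dimension
        self.records: dict[str, VectorRecord] = {}

    def add(self, record: VectorRecord) -> None:
        # Key rule: every vector in the collection must have the same length.
        if len(record.vector) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} dimensions, got {len(record.vector)}"
            )
        self.records[record.record_id] = record


collection = Collection(dimension=4)  # real embeddings are much longer, e.g. 768
collection.add(
    VectorRecord(
        record_id="doc1-chunk-a",
        vector=[0.1, 0.3, 0.0, 0.9],
        metadata={"title": "SSD upgrades", "author": "Maggie Hu"},
    )
)
print(len(collection.records), "record stored")
```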

How RAG Search Combines Vector Matching with Metadata Filtering

RAG (Retrieval-Augmented Generation) search combines vector similarity with metadata filtering to make your results both relevant and contextually appropriate. The RAG framework was first introduced by researchers at Meta in 2020 (see the original paper).

Vector Similarity Matching: When you ask a question, the system converts your question into a vector embedding (that same list of numbers we discussed). Then it searches the database for content vectors that are mathematically similar to your question vector. Think of it like finding documents that “mean” similar things to what you’re asking about.

Metadata Context Enhancement: The system enhances similarity matching by also considering metadata context. These metadata filters can be set by users (when they specify requirements) or automatically by the system (based on context clues in the query). The system considers:
- Time relevance: “Only show me recent information from 2023 or later”
- Source credibility: “Only include content from verified authors or trusted platforms”
- Content type: “Focus on technical documentation, not blog posts”
- Geographic relevance: “Prioritize information relevant to my location”
This combined approach is also more efficient – metadata filtering can quickly eliminate irrelevant content before expensive similarity calculations.

The Combined Power: Instead of getting thousands of somewhat-related results, you get a curated set of content that is both:
1. Semantically similar (the vector embeddings match your question’s meaning)
2. Contextually appropriate (the metadata ensures it meets your specific requirements)
For example, when you ask “How do I optimize database performance?” the system finds semantically similar content, then prioritizes results that match your context – returning recent technical articles by database experts while filtering out outdated blog posts or marketing content. You get the authoritative, current information you need.
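
The two-stage idea can be sketched in a few lines of Python: filter on metadata first, then rank what remains by cosine similarity. Everything here is made up for illustration, including the three-number vectors, the record IDs, the year and type fields, and the search function itself. Real systems use learned embeddings and their own query APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vector, records, filters, top_k=2):
    # Step 1: a cheap metadata filter removes records that fail any requirement.
    candidates = [
        r for r in records
        if all(check(r["metadata"].get(key)) for key, check in filters.items())
    ]
    # Step 2: rank the survivors by similarity to the query vector.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query_vector, r["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical records: tiny vectors and metadata invented for this example.
records = [
    {"id": "a", "vector": [0.9, 0.1, 0.0], "metadata": {"year": 2024, "type": "technical"}},
    {"id": "b", "vector": [0.8, 0.2, 0.1], "metadata": {"year": 2019, "type": "technical"}},
    {"id": "c", "vector": [0.7, 0.3, 0.0], "metadata": {"year": 2024, "type": "blog"}},
    {"id": "d", "vector": [0.1, 0.9, 0.2], "metadata": {"year": 2023, "type": "technical"}},
]

filters = {
    "year": lambda value: value is not None and value >= 2023,  # time relevance
    "type": lambda value: value == "technical",                 # content type
}

results = search([1.0, 0.0, 0.0], records, filters)
print([r["id"] for r in results])  # prints ['a', 'd']
```

Records b and c are close to the query in meaning but never reach the similarity step, because they fail the year and content-type filters.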

What This Means for Content Creators

Understanding how metadata works in RAG systems reveals a crucial opportunity for content creators. Among the three types of metadata stored in vector databases, only one is truly under your control:

Automatically Generated Metadata:
- Chunk metadata: Created during content processing (chunk size, position, relationships)
- Platform metadata: Added by publishing systems (creation date, source URL, file type)
Creator-Controlled Metadata:
- Universal metadata: The contextual information you can strategically add to improve intent alignment
This is where you can make the biggest impact. By enriching your content with universal metadata, you help RAG systems understand not just what your content says, but who it’s for and how it should be used:
- Intended audience: “developers,” “business stakeholders,” “end users”
- Role requirements: “database administrator,” “product manager,” “customer support”
- Skill level: “beginner,” “intermediate,” “expert”
- Product version: “v2.1,” “legacy,” “beta”
- Customer intent: “troubleshooting,” “implementation,” “evaluation”
When you provide this contextual metadata, you’re essentially helping RAG systems deliver your content to the right person, at the right time, for the right purpose. The technical foundation we’ve explored – vector similarity plus metadata filtering – becomes much more powerful when content creators take advantage of universal metadata to improve intent alignment.

Your content doesn’t just need to be semantically relevant; it needs to be contextually perfect. Universal metadata is how you achieve that precision.
September 8, 2025
Reducing Hallucinations in Generative AI—What Content Creators Can Actually Do
If you’ve used ChatGPT or Copilot and received an answer that sounded confident but was completely wrong, you’ve experienced a hallucination. These misleading outputs are a known challenge in generative AI—and while some causes are technical, others are surprisingly content-driven.

As a content creator, you might think hallucinations are out of your hands. But here’s the truth: you have more influence than you realize.

Let’s break it down.

The Three Types of Hallucinations (And Where You Fit In)

Generative AI hallucinations typically fall into three practical categories. (Note: Academic research classifies these as “intrinsic” hallucinations that contradict the source/prompt, or “extrinsic” hallucinations that add unverifiable information. Our framework translates these concepts into actionable categories for content creators.)
1. Nonsensical Output
  The AI produces content that’s vague, incoherent, or just doesn’t make sense.
  Cause: Poorly written or ambiguous prompts.
  Your Role: Help users write better prompts by providing examples, templates, or guidance.
2. Factual Contradiction
  The AI gives answers that are clear and confident—but wrong, outdated, or misleading.
  Cause: The AI can’t find accurate or relevant information to base its response on.
  Your Role: Create high-quality, domain-specific content that’s easy for AI to find and understand.
3. Prompt Contradiction
  The AI’s response contradicts the user’s prompt, often due to internal safety filters or misalignment.
  Cause: Model-level restrictions or misinterpretation.
  Your Role: Limited—this is mostly a model design issue.
Where Does AI Get Its Information?

Modern AI systems increasingly use RAG (Retrieval-Augmented Generation) to ground their responses in real data. Instead of relying solely on training data, they actively search for and retrieve relevant content before generating answers. Learn more about how AI discovers and synthesizes content.

Depending on the system, AI pulls data from:
- Internal Knowledge Bases (e.g., enterprise documentation)
- The Public Web (e.g., websites, blogs, forums)
- Hybrid Systems (a mix of both)
If your content is published online, it becomes part of the “source of truth” that AI systems rely on. That means your work directly affects whether AI gives accurate answers—or hallucinates.

The Discovery–Accuracy Loop

Here’s how it works:
- If AI can’t find relevant content → it guesses based on general training data.
- If AI finds partial content → it fills in the gaps with assumptions.
- If AI finds complete and relevant content → it delivers accurate answers.
So what does this mean for you?

Your Real Impact as a Content Creator

You can’t control how AI is trained, but you can control two critical things:
1. The quality of content available for retrieval
2. The likelihood that your content gets discovered and indexed
And here’s the key insight:

This is where content creators have the greatest impact—by ensuring that content is not only high-quality and domain-specific, but also structured into discoverable chunks that AI systems can retrieve and interpret accurately.

Think of it like this: if your content is buried in long paragraphs, lacks clear headings, or isn’t tagged properly, AI might miss it—or misinterpret it. But if it’s chunked into clear, well-labeled sections, it’s far more likely to be picked up and used correctly. This shift from keywords to chunks is fundamental to how AI indexing differs from traditional search.

Actionable Tips for AI-Optimized Content

Structure for Chunking
- Use clear, descriptive headings that summarize the content below them
- Write headings as questions when possible (“How does X work?” instead of “X Overview”)
- Keep paragraphs focused on single concepts (3–5 sentences max)
- Create semantic sections that can stand alone as complete thoughts
- Include Q&A pairs for common queries—this mirrors how users interact with AI
- Use bullet points and numbered lists to break down complex information
Improve Discoverability
- Front-load key information in each section—AI often prioritizes early content
- Define technical terms clearly within your content, not just in glossaries
- Include contextual metadata through schema markup and structured data
- Write descriptive alt text for images and diagrams
Enhance Accuracy
- Date your content clearly, especially for time-sensitive information
- Link related concepts within your content to provide context
- Be explicit about scope: what your content covers and what it doesn’t
Understand Intent Alignment

AI systems are evolving to focus more on intent than just keyword matching. That means your content should address the “why” behind user queries—not just the “what.”

Think about the deeper purpose behind a search. Are users trying to solve a problem? Make a decision? Learn a concept? Your content should reflect that.

The Bottom Line

As AI continues to evolve from retrieval to generative systems, your role as a content creator becomes more critical—not less. By structuring your content for AI discoverability and comprehension, you’re not just improving search rankings; you’re actively reducing the likelihood that AI will hallucinate when answering questions in your domain.

So the next time you create or update content, ask yourself:

“Can an AI system easily find, understand, and accurately use this information?”

If the answer is yes, you’re part of the solution.
September 2, 2025
Beyond Blue Links: How AI Discovers and Synthesizes Content
In my previous posts, we’ve explored how AI systems index content through semantic chunking rather than keyword extraction, and how they understand user intent through contextual analysis instead of pattern matching. Now comes the final piece: how AI systems actually retrieve and synthesize content to answer user questions.

This is where the practical implications for content creators become apparent.

The Fundamental Shift: From Finding Pages to Synthesizing Answers

Here’s the key difference that changes everything: Traditional search matches keywords and returns ranked pages. AI-powered search matches semantic meaning and synthesizes answers from specific content chunks.

This fundamental difference in matching and retrieval processes requires us to think about content creation in entirely new ways.

Let’s see how this works using the same example documents from my previous posts:

Document 1: “Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically improve performance. SSDs provide faster boot times and quicker file access compared to traditional drives.”

Document 2: “Slow computer performance is often caused by too many programs running simultaneously. Close unnecessary background programs and disable startup applications to fix speed issues.”

Document 3: “Regular computer maintenance prevents performance problems. Clean temporary files, update software, and run system diagnostics to keep your computer running efficiently.”

User query: “How to make my computer faster?”

How Traditional vs. AI Search Retrieve Content

How Traditional Search Matches and Retrieves

Traditional search follows a predictable process:

Keyword Matching: The system uses TF-IDF scoring, Boolean logic, and exact phrase matching to find relevant documents. It’s looking for pages that contain the words “computer,” “faster,” “make,” and related terms.

Authority-Based Ranking: PageRank algorithms, backlink analysis, and domain authority determine which pages rank highest. A page from a high-authority tech site with many backlinks will likely outrank a smaller site with identical content.

Example with our 3 computer docs: For “How to make my computer faster?”, traditional search would likely rank them this way:
- Doc 1 ranks highest: Contains the exact keyword “faster” in “faster boot times” plus “improve performance”
- Doc 2 ranks second: Strong related-term matches with “slow computer” and “speed issues”
- Doc 3 ranks lowest: Related terms like “efficiently” and “performance” but less direct keyword matches
The user gets three separate page results. They need to click through, read each page, and synthesize their own comprehensive answer.
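
For a feel of the keyword-matching step, here is a bare-bones TF-IDF scorer in Python run against the three documents. It is a simplified sketch, not what a production search engine does: the function names are hypothetical, and it leaves out stemming, synonym handling, phrase matching, and authority signals such as PageRank. With exact-word matching only, the ordering of Doc 2 and Doc 3 may differ from the fuller ranking described above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_scores(query, documents):
    """Score each document by summing TF-IDF over the query's distinct terms."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0:
                continue  # the term appears nowhere, so it cannot contribute
            tf = counts[term] / len(tokens)
            # Terms found in every document get an IDF of zero and add nothing.
            idf = math.log(n_docs / doc_freq)
            score += tf * idf
        scores.append(score)
    return scores


documents = [
    "Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically "
    "improve performance. SSDs provide faster boot times and quicker file access compared "
    "to traditional drives.",
    "Slow computer performance is often caused by too many programs running simultaneously. "
    "Close unnecessary background programs and disable startup applications to fix speed issues.",
    "Regular computer maintenance prevents performance problems. Clean temporary files, "
    "update software, and run system diagnostics to keep your computer running efficiently.",
]

scores = tf_idf_scores("How to make my computer faster?", documents)
for number, score in enumerate(scores, start=1):
    print(f"Doc {number}: {score:.4f}")
```

Doc 1 scores highest here because “faster” is the only query word that appears in some documents but not all of them.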

How AI RAG Search Matches and Retrieves

AI-powered RAG systems operate on entirely different principles:

Vector Similarity Matching:

Rather than matching keywords, the system uses cosine similarity to compare the semantic meaning of the query vector against content chunk vectors. The query “How to make my computer faster?” gets converted into a mathematical representation that captures its meaning, intent, and context.

Semantic Understanding:

The system retrieves chunks based on conceptual relationships, not just keyword presence. It understands that “SSD upgrade” relates to “making computers faster” even without shared keywords.

Multi-Chunk Synthesis:

Instead of returning separate pages, the system combines the most relevant chunks from multiple sources to create a comprehensive answer.

Example with same query: Here’s how AI would handle “How to make my computer faster?” using the chunks from my first post:

The query vector finds high semantic similarity with:
- Chunk 1A: “Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically improve performance.”
- Chunk 1B: “SSDs provide faster boot times and quicker file access compared to traditional drives.”
- Chunk 2B: “Close unnecessary background programs and disable startup applications to fix speed issues.”
- Chunk 3B: “Clean temporary files, update software, and run system diagnostics to keep your computer running efficiently.”
The AI synthesizes these chunks into a comprehensive answer covering hardware upgrades, software optimization, and maintenance—drawing from all three documents simultaneously.
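
The chunk labels and the hand-off to the generator can be sketched in Python as follows. This assumes sentence-level chunks, as in the example, and takes the list of retrieved chunk IDs as given rather than computing it; in a real system that list would come from the vector similarity step. The function names chunk_by_sentence and build_context are hypothetical.

```python
import re
import string


def chunk_by_sentence(documents):
    """Label each sentence as <doc number><letter>, e.g. '1A', '1B', '2A'."""
    chunks = {}
    for doc_number, text in enumerate(documents, start=1):
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
        for letter, sentence in zip(string.ascii_uppercase, sentences):
            chunks[f"{doc_number}{letter}"] = sentence
    return chunks


def build_context(chunks, selected_ids):
    """Join the retrieved chunks into one context block for the generator."""
    return "\n".join(f"[{chunk_id}] {chunks[chunk_id]}" for chunk_id in selected_ids)


documents = [
    "Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically "
    "improve performance. SSDs provide faster boot times and quicker file access compared "
    "to traditional drives.",
    "Slow computer performance is often caused by too many programs running simultaneously. "
    "Close unnecessary background programs and disable startup applications to fix speed issues.",
    "Regular computer maintenance prevents performance problems. Clean temporary files, "
    "update software, and run system diagnostics to keep your computer running efficiently.",
]

chunks = chunk_by_sentence(documents)

# Assumed retrieval result, taken from the example above rather than computed here.
retrieved_ids = ["1A", "1B", "2B", "3B"]
context = build_context(chunks, retrieved_ids)
print(context)
```

The context block mixes sentences from all three documents, which is what lets the model write one answer instead of returning three pages.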

Notice the difference: traditional search would return Doc 1 as the top result because it contains “faster,” even though it only covers hardware solutions. AI RAG retrieves the most semantically relevant chunks regardless of their source document, prioritizing actionable solutions over keyword frequency. It might even skip Chunk 2A (“Slow computer performance is often caused by...”) despite its strong keyword matches, because it describes problems rather than solutions.

The user gets one complete answer that addresses multiple solution pathways, all sourced from the most relevant chunks regardless of which “page” they came from.
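
A rough sketch of that chunk-level retrieval step follows. The vectors and the `retrieve_top_chunks` helper are hypothetical: I picked the numbers by hand so that the ranking mirrors the example above, whereas a real system would get them from an embedding model.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# (chunk id, source document, toy vector). All vectors are invented for this sketch.
CHUNKS = [
    ("1A", "DOC1", [0.9, 0.1, 0.7]),
    ("1B", "DOC1", [0.8, 0.1, 0.6]),
    ("2A", "DOC2", [0.1, 0.8, 0.1]),
    ("2B", "DOC2", [0.1, 0.9, 0.8]),
    ("3A", "DOC3", [0.3, 0.3, 0.1]),
    ("3B", "DOC3", [0.2, 0.6, 0.9]),
]
QUERY_VECTOR = [0.5, 0.5, 0.9]  # stands in for "How to make my computer faster?"

def retrieve_top_chunks(query_vector, chunks, k):
    """Score every chunk against the query and keep the k best, ignoring page boundaries."""
    scored = [
        (cosine_similarity(query_vector, vector), chunk_id, doc_id)
        for chunk_id, doc_id, vector in chunks
    ]
    scored.sort(reverse=True)
    return scored[:k]

top = retrieve_top_chunks(QUERY_VECTOR, CHUNKS, k=4)
for score, chunk_id, doc_id in top:
    print(f"Chunk {chunk_id} from {doc_id}: {score:.3f}")
```

The ranking works on chunks, so the selected set can span all three documents. The generation step would then receive these chunks as context for a single answer.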

Why This Changes Content Strategy

This retrieval difference has profound implications for how we create content:

Chunk-Level Discoverability

Your content isn’t discovered at the page level—it’s discovered at the chunk level. Each section, paragraph, or logical unit needs to be valuable and self-contained. That perfectly written conclusion paragraph might never be found if the rest of your content doesn’t rank well, because AI systems retrieve specific chunks, not entire pages.

Comprehensive Coverage

AI systems find and combine related concepts from across your content library. This requires strategic coverage:

Instead of trying to stuff keywords into a single page, create focused pieces that together provide comprehensive coverage. Rather than one “ultimate guide to computer speed,” create separate pieces on hardware upgrades, software optimization, maintenance, and diagnostics.

Synthesis-Ready Content

Write chunks that work well when combined with others—provide complete context by:
- Avoiding excessive pronoun references
- Writing self-contained paragraphs and sections
The Bottom Line for Content Creators

We’ve now traced the complete AI search journey:
- How AI indexes content through semantic chunking (Post 1)
- How AI understands user intent through contextual analysis (Post 2)
- How AI retrieves and synthesizes content through vector similarity matching (this post)
Each step reinforces the same content recommendations:
- Chunk-sized content aligns with how AI indexes and retrieves information
- Conversational language matches how AI understands user intent
- Structured content supports AI’s semantic chunking and knowledge graph construction
- Rich context supports semantic relationships that AI systems rely on, including:
  - Intent-driven metadata (audience, purpose, user scenarios)
  - Complete explanations (the why, when, and how behind recommendations)
  - Relationships to other concepts and solutions
  - Trade-offs, implications, and prerequisites
- Comprehensive coverage works with how AI synthesizes multi-source answers
AI technology is rapidly evolving. What is true today may become outdated tomorrow. AI may eventually become so advanced that we don’t have to think specifically about writing for AI systems—they’ll accommodate how humans naturally write and communicate.

But no matter what era we’re in, the fundamentals of creating high-quality content remain constant. Those recommendations we’ve discussed are timeless principles of good communication: create accurate, true, and complete content; provide as much context as possible to communicate effectively; offer information in digestible, bite-sized pieces for easy consumption; write in conversational language for clarity and engagement.

Understanding how current AI systems work simply reinforces why these have always been good practices. Whether optimizing for search engines, AI systems, or human readers, the goal remains the same: communicate your expertise as clearly and completely as possible.

This completes my three-part series on AI-ready content creation. Understanding how AI indexes, interprets, and retrieves content gives us the foundation for creating content that thrives in an AI-powered world.
August 28, 2025
From Keywords to Context: What AI’s Intent Understanding Means for Content Creators
The shift from keyword matching to contextual understanding means content creators must write for comprehension, not just discovery. AI systems don’t just match words—they understand intent, context, and the unstated needs behind every query.

In my first post, I explored how both traditional and AI-powered search follow the same fundamental steps: crawl and index content, understand user intent, then match and retrieve content. This sequence hasn’t changed. What has changed is that AI-powered search now embeds large language models (LLMs) into each step.

My last post dove deep into the indexing step, explaining how AI systems use vector embeddings and knowledge graphs to chunk content semantically. AI systems understand meaning and relationships rather than cataloging keywords.

So what’s different about user query (intent) understanding? When someone searches for “How to make my computer faster?”, what are they really asking for?

Traditional search engines and AI-powered search systems interpret this question in fundamentally different ways, with profound implications for how we should create content.

The Evolution of Intent Understanding

To appreciate how revolutionary AI-driven intent understanding is, we need to look at how search has evolved.

The evolution of search intent understanding

Early traditional search engines treated question words like “how,” “why,” and “where” as “stop words”—filtering them out before processing queries.

Modern traditional search has evolved to preserve question words and use them for basic query classification. But the understanding remains relatively shallow—more like categorization than true comprehension.

AI-powered RAG (Retrieval-Augmented Generation) systems represent a fundamental leap. They decode the full semantic meaning, understand user context, and map queries to solution pathways.

Modern Traditional Search: Pattern Recognition

Let’s examine how modern traditional search processes our example query “How to make my computer faster?”

Traditional search recognizes that “How to” signals an instructional query and knows that “computer faster” relates to performance. Yet it treats these as isolated signals rather than understanding the complete situation.

Traditional search processes the query through tokenization, preserving “How to” as a query classifier while removing low-value words like “my.” It then applies pattern recognition to classify the query type as instructional and identifies keywords related to computer performance optimization.
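
As a rough illustration of this pipeline, here is a minimal sketch. The pattern table, the labels, and the tiny stop word set are assumptions made for the example, not how any particular search engine is implemented.

```python
import re

# Hypothetical query classifiers: leading phrase -> query type.
QUERY_PATTERNS = [
    ("how to", "instructional"),
    ("why", "diagnostic"),
    ("where", "navigational"),
]

# A tiny stop word set, just enough for the sample queries.
STOP_WORDS = {"my", "is", "can", "i", "a", "the"}

def analyze_query(query):
    """Classify the query by its leading phrase, then keep the remaining keywords."""
    text = query.lower().strip()
    query_type = "unclassified"
    for prefix, label in QUERY_PATTERNS:
        if text.startswith(prefix):
            query_type = label
            text = text[len(prefix):]  # the classifier phrase is kept as a label, not a keyword
            break
    tokens = re.findall(r"[a-z]+", text)
    keywords = [token for token in tokens if token not in STOP_WORDS]
    return {"type": query_type, "keywords": keywords}

print(analyze_query("How to make my computer faster?"))
print(analyze_query("Why is my app slow?"))
```

Everything the function knows about the query is a label and a bag of keywords. The word “my” is discarded before anything could be inferred from it.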

What it can’t understand:
- “My” implies the user has an actual problem right now, not a theoretical interest
- “Make...faster” suggests current dissatisfaction requiring immediate solutions
- The question format expects comprehensive guidance, not scattered tips
- A performance problem likely has multiple causes needing different approaches
AI Search: Deep Semantic Comprehension

RAG systems process the same query through multiple layers of understanding:

Semantic Query Analysis: When the AI receives “How to make my computer faster?”, it decodes the question’s semantic meaning:
- “How to” → User needs instructional guidance, not just information
- “make” → User wants to take action, transform current state
- “my” → Personal problem happening now, not hypothetical
- “computer” → Specific domain: personal computing, not servers or networks
- “faster” → Performance dissatisfaction, seeking speed improvement
- “?” → Expects comprehensive answer, not yes/no response
The LLM understands this isn’t someone researching computer performance theory—it’s someone frustrated with their slow computer who needs actionable solutions now.

Query Embedding: The query gets converted into a vector that captures semantic meaning across hundreds of dimensions. While individual dimensions are abstract mathematical representations, the vector as a whole captures:
- The instructional nature of the request
- The performance optimization context
- The personal urgency
- The expected response type (actionable guidance)
By converting queries into the same vector space used for content indexing, AI creates the foundation for semantic matching that goes beyond keywords.

The Key Difference: While traditional search sees keywords and patterns, AI comprehends the actual situation: a frustrated user with a slow computer who needs comprehensive, actionable guidance. This semantic understanding of intent becomes the foundation for retrieval and matching.

How Different Queries are Understood

This deeper understanding transforms how different queries are processed:

“Where can I find Azure pricing?”
- Traditional: Matches “Azure” + “pricing” + “find”
- RAG: Understands commercial evaluation intent, knows you’re likely comparing options
“Why is my app slow?”
- Traditional: Diagnostic query about “app” + “slow”
- RAG: Recognizes frustration, expects root-cause analysis and immediate fixes
What this Means for Content Creators

AI’s ability to understand user intent through semantic analysis and vector embeddings changes how we need to create content. Since AI understands the context behind queries (recognizing “my computer” signals a current problem needing immediate help), our content must address these deeper needs:

1. Write Like You’re Having a Conversation

Remember how AI decoded each word of “How to make my computer faster?” for semantic meaning? AI models excel at understanding natural language patterns because they’re trained on large amounts of natural-language text, including conversational data. Question-based headings (“How do I migrate my database?”) align perfectly with how users actually phrase their queries.

Instead of: “Implement authentication protocols using OAuth 2.0 framework”

Write: “Here's how to set up secure login for your app using OAuth 2.0”

The conversational version provides contextual clues that help AI understand user intent:

“Here's how” signals instructional content, “your app” indicates practical guidance, and “secure login” translates technical concepts to user benefits.

2. Provide Full Context in Self-Contained Sections

AI understands that “How to make my computer faster?” requires multiple solution types—hardware, software, and maintenance. Since AI grasps these comprehensive needs through vector embeddings, your content should provide complete context within each section.

Include the why behind recommendations, when different solutions apply, and what trade-offs exist—all within the same content chunk. This aligns with how AI chunks content semantically and understands queries holistically.

3. Use Intent-Driven Metadata

Since AI converts queries into semantic vectors that capture intent (instructional need, urgency, complexity level), providing explicit intent metadata helps AI better understand your content’s purpose:
- User intent: “As a developer, I want to implement secure authentication so that user data remains protected”
- Level: Beginner/Intermediate/Advanced to match user expertise
- Audience: Developer/Admin/End-user for role-based content alignment
This metadata becomes part of the semantic understanding, helping AI match content to the right user intent.
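
One way to picture this is a small record stored next to each content chunk. The sketch below is only an assumption about how such metadata could be modeled; the `IntentMetadata` class and its field names are hypothetical, and the allowed values come from the list above.

```python
from dataclasses import dataclass, asdict

LEVELS = ("Beginner", "Intermediate", "Advanced")
AUDIENCES = ("Developer", "Admin", "End-user")

@dataclass(frozen=True)
class IntentMetadata:
    """Hypothetical intent metadata attached to one content chunk."""
    user_intent: str
    level: str
    audience: str

    def __post_init__(self):
        # Reject values outside the agreed vocabulary so the metadata stays consistent.
        if self.level not in LEVELS:
            raise ValueError(f"level must be one of {LEVELS}")
        if self.audience not in AUDIENCES:
            raise ValueError(f"audience must be one of {AUDIENCES}")

metadata = IntentMetadata(
    user_intent="As a developer, I want to implement secure authentication "
                "so that user data remains protected",
    level="Intermediate",
    audience="Developer",
)

# The chunk text and its metadata travel together into the index.
chunk_record = {
    "text": "Here's how to set up secure login for your app using OAuth 2.0",
    "metadata": asdict(metadata),
}
print(chunk_record["metadata"])
```

Keeping the vocabulary small and validated makes the metadata easier to filter on later, for example to prefer beginner-level chunks for a beginner-sounding query.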

The Bigger Picture

AI’s semantic understanding of user intent changes content strategy fundamentals. Content creators must now focus on addressing the full context of user queries and consider the implicit needs that AI can detect.

This builds on the semantic chunking we explored in my last post. AI systems use the same vector embedding approach for both indexing content and understanding queries. When both exist in the same semantic space, AI can connect content to user needs even when keywords don’t match.

The practical impact:

AI can now offer comprehensive, contextual answers by understanding what users need, not just what they typed. But this only works when we create structured content with natural language, complete context, and clear intent signals.

This is the second post in my three-part series on AI-ready content creation. In my first post, we explored how AI indexes content through semantic chunking rather than keyword extraction.

Coming next: “Beyond Rankings: How AI Retrieval Transforms Content Discovery”

Now that we understand how AI indexes content (Post 1) and interprets user intent (this post), my next post will reveal how AI systems match and retrieve content. I’ll explore:
- How vector similarity replaces PageRank-style algorithms
- Why knowledge graphs matter more than link structures
- And what this means for making your content discoverable in AI-powered search
August 26, 2025
Understanding Indexing: A Guide for Content Creators and AI Search
In my earlier post, I explained the fundamental shift from traditional search to generative AI search. Traditional search finds existing content. Generative AI creates new responses.

If you’ve been hearing recommendations about “AI-ready content” like chunk-sized content, conversational language, Q&A formats, and structured writing, these probably sound familiar. As instructional designers and content developers, we’ve used most of these approaches for years. We chunk content for better learning, write conversationally to engage readers, and use metadata for reporting and semantic web purposes.

Today, I want to examine how this shift starts at the very beginning: when systems index and process content.

What is Indexing?

Indexing is how search systems break down and organize content to make it searchable. Traditional search creates keyword indexes, while AI search creates vector embeddings and knowledge graphs from semantic chunks. The move from keywords to chunks is one of the most significant changes in how search technology works.

Let’s trace how both systems process the same content using three sample documents from my previous post:

Document 1: “Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically improve performance. SSDs provide faster boot times and quicker file access compared to traditional drives.”

Document 2: “Slow computer performance is often caused by too many programs running simultaneously. Close unnecessary background programs and disable startup applications to fix speed issues.”

Document 3: “Regular computer maintenance prevents performance problems. Clean temporary files, update software, and run system diagnostics to keep your computer running efficiently.”

User query: “How to make my computer faster?”

How does traditional search index content?

Traditional search follows three mechanical steps:

Step 1: Tokenization

This step breaks raw text into individual words. The three docs after tokenization look like this:
```
DOC1 → Tokenization → ["Upgrading", "your", "computer's", "hard", "drive", "to", "a", "solid-state", "drive", "SSD", "can", "dramatically", "improve", "performance", "SSDs", "provide", "faster", "boot", "times", "and", "quicker", "file", "access", "compared", "to", "traditional", "drives"]

DOC2 → Tokenization → ["Slow", "computer", "performance", "is", "often", "caused", "by", "too", "many", "programs", "running", "simultaneously", "Close", "unnecessary", "background", "programs", "and", "disable", "startup", "applications", "to", "fix", "speed", "issues"]

DOC3 → Tokenization → ["Regular", "computer", "maintenance", "prevents", "performance", "problems", "Clean", "temporary", "files", "update", "software", "and", "run", "system", "diagnostics", "to", "keep", "your", "computer", "running", "efficiently"]
```
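
For readers who want to try this, here is a minimal tokenization sketch. The regular expression is my own assumption, chosen so that words with an inner apostrophe or hyphen (“computer's”, “solid-state”) stay together as in the output above; production tokenizers handle many more cases.

```python
import re

# A word is a run of letters, optionally joined by inner apostrophes or hyphens.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")

def tokenize(text):
    """Split raw text into word tokens, dropping punctuation and parentheses."""
    return TOKEN_PATTERN.findall(text)

DOC1 = (
    "Upgrading your computer's hard drive to a solid-state drive (SSD) can "
    "dramatically improve performance. SSDs provide faster boot times and "
    "quicker file access compared to traditional drives."
)

tokens = tokenize(DOC1)
print(tokens)
```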
Step 2: Stop Word Removal & Stemming

What are Stop Words?

Stop words are common words that appear frequently in text but carry little meaningful information for search purposes. They’re typically removed during text preprocessing to focus on content-bearing words.

Common English stop words:
```
a, an, the, is, are, was, were, be, been, being, have, has, had, do, does, did, will, would, could, should, may, might, can, of, in, on, at, by, for, with, to, from, up, down, into, over, under, and, or, but, not, no, yes, this, that, these, those, here, there, when, where, why, how, what, who, which, your, my, our, their
```
What is Stemming?

Stemming is the process of reducing words to their root form, mainly by stripping suffixes and other word endings. The goal is to treat different forms of the same word as identical for search purposes.

Some stemming examples:
```
Original Word    →    Stemmed Form
"running"        →    "run"
"runs"           →    "run"  
"runner"         →    "run"
"performance"    →    "perform"
"performed"      →    "perform"
"performing"     →    "perform"
```
The three sample documents after stop word removal and stemming look like this:
```
DOC1 Terms: ["upgrad", "comput", "hard", "driv", "solid", "stat", "ssd", "dramat", "improv", "perform", "ssd", "provid", "fast", "boot", "time", "quick", "file", "access", "compar", "tradit", "driv"]

DOC2 Terms: ["slow", "comput", "perform", "caus", "program", "run", "simultan", "clos", "unnecessari", "background", "program", "disabl", "startup", "applic", "fix", "speed", "issu"]

DOC3 Terms: ["regular", "comput", "maintain", "prevent", "perform", "problem", "clean", "temporari", "file", "updat", "softwar", "run", "system", "diagnost", "keep", "comput", "run", "effici"]
```
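
The sketch below shows the shape of this step. The stop word set is the list given earlier; the stemmer is a deliberately naive suffix stripper that I wrote for illustration. It is not the Porter algorithm, it over-stems some words, and it will not reproduce every stem in the listing above.

```python
STOP_WORDS = frozenset(
    "a an the is are was were be been being have has had do does did will would "
    "could should may might can of in on at by for with to from up down into over "
    "under and or but not no yes this that these those here there when where why "
    "how what who which your my our their".split()
)

# Suffixes tried in order; only the first match is removed.
SUFFIXES = ("ance", "ing", "ed", "er", "s")

def naive_stem(word):
    """Toy stemmer: strip one common suffix, then undo a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if len(word) >= 3 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]  # "runn" -> "run"
    return word

def preprocess(tokens):
    """Lowercase, drop stop words, and stem what remains."""
    lowered = (token.lower() for token in tokens)
    return [naive_stem(token) for token in lowered if token not in STOP_WORDS]

print(preprocess(["to", "keep", "your", "computer", "running"]))
```

Real systems typically use an established stemmer or a lemmatizer instead, since hand-written suffix rules break quickly on irregular words.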
Step 3: Inverted Index Construction

What is an inverted index?

An inverted index is like a book’s index, but instead of mapping topics to page numbers, it maps each unique word to all the documents that contain it. It’s called “inverted” because instead of going from documents to words, it goes from words to documents.
Note: For clarity and space, I’m showing only a representative subset that demonstrates key patterns.

The complete inverted index would contain entries for all 45 unique terms from our processed documents. The key patterns include:
- Terms appearing in all documents (common terms like “comput”)
- Terms unique to one document (distinctive terms like “ssd”)
- Terms with varying frequencies (like “program” with tf=2)
```
INVERTED INDEX:
"comput" → {DOC1: tf=1, DOC2: tf=1, DOC3: tf=2}
"perform" → {DOC1: tf=1, DOC2: tf=1, DOC3: tf=1}
"fast" → {DOC1: tf=1}
"speed" → {DOC2: tf=1}
"ssd" → {DOC1: tf=2}
"program" → {DOC2: tf=2}
"maintain" → {DOC3: tf=1}
"slow" → {DOC2: tf=1}
"improv" → {DOC1: tf=1}
"fix" → {DOC2: tf=1}
"clean" → {DOC3: tf=1}
```
The result: An inverted index that maps each word to the documents containing it, along with frequency counts.
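
Here is a minimal sketch of how such an index could be built from the processed term lists shown earlier. The `build_inverted_index` function is illustrative; real search engines also store positions, field information, and compressed postings.

```python
from collections import Counter, defaultdict

DOC_TERMS = {
    "DOC1": ["upgrad", "comput", "hard", "driv", "solid", "stat", "ssd", "dramat",
             "improv", "perform", "ssd", "provid", "fast", "boot", "time", "quick",
             "file", "access", "compar", "tradit", "driv"],
    "DOC2": ["slow", "comput", "perform", "caus", "program", "run", "simultan",
             "clos", "unnecessari", "background", "program", "disabl", "startup",
             "applic", "fix", "speed", "issu"],
    "DOC3": ["regular", "comput", "maintain", "prevent", "perform", "problem",
             "clean", "temporari", "file", "updat", "softwar", "run", "system",
             "diagnost", "keep", "comput", "run", "effici"],
}

def build_inverted_index(doc_terms):
    """Map each term to {document id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, terms in doc_terms.items():
        for term, frequency in Counter(terms).items():
            index[term][doc_id] = frequency
    return dict(index)

index = build_inverted_index(DOC_TERMS)
print(index["program"])   # documents containing "program", with counts
print(index.get("fast"))  # lookup for a stemmed query term
```

A query is answered by stemming its terms the same way and looking each one up, which is why exact vocabulary overlap matters so much in this model.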

Why inverted indexing matters for content creators:

Traditional search relies on keyword matching. This is why SEO focused on keyword density and exact phrase matching.

How do AI systems index content?

AI systems take a fundamentally different approach:

Step 1: Semantic chunking

AI doesn’t break content into words. Instead, it creates meaningful, self-contained chunks. AI systems analyze content for topic boundaries, logical sections, and complete thoughts to determine where to split content. They look for natural break points that preserve context and meaning.

What AI Systems Look For When Chunking

1. Semantic Coherence
- Topic consistency: Does this section maintain the same subject matter?
- Conceptual relationships: Are these sentences talking about related ideas?
- Context dependency: Do these sentences need each other to make sense?
2. Structural Signals
- HTML tags: Headings (H1, H2, H3), paragraphs, lists, sections
- Formatting cues: Line breaks, bullet points, numbered steps
- Visual hierarchy: How content is organized on the page
3. Linguistic Patterns
- Transition words: “However,” “Therefore,” “Next,” “Additionally”
- Pronoun references: “It,” “This,” “These” that refer to previous concepts
- Discourse markers: Words that signal topic shifts or continuations
4. Completeness of Information
- Self-contained units: Can this chunk answer a question independently?
- Context sufficiency: Does the chunk have enough background to be understood?
- Action completeness: For instructions, does it contain a complete process?
5. Optimal Size Constraints
- Token limits: Most AI models have processing windows (512, 1024, 4096 tokens)
- Embedding efficiency: Chunks need to be small enough for accurate vector representation
- Memory constraints: Balance between context preservation and processing speed
6. Content Type Recognition
- Question-answer pairs: Natural chunk boundaries
- Step-by-step instructions: Each step or related steps become chunks
- Examples and explanations: Keep examples with their explanations
- Lists and enumerations: Group related list items
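
The size constraint in particular is easy to sketch. The example below packs whole sentences into chunks under a token budget. It assumes a whitespace word count as a crude stand-in for model tokens, and the `pack_sentences` helper is hypothetical.

```python
def pack_sentences(sentences, max_tokens):
    """Group consecutive sentences into chunks without splitting any sentence."""
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        length = len(sentence.split())  # rough proxy for the model's token count
        if current and current_len + length > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += length
    if current:
        chunks.append(" ".join(current))
    return chunks

sentences = [
    "Regular computer maintenance prevents performance problems.",
    "Clean temporary files, update software, and run system diagnostics "
    "to keep your computer running efficiently.",
]
print(pack_sentences(sentences, max_tokens=10))
```

A sentence longer than the budget still becomes its own chunk here. A production chunker would need a fallback for that case and would count tokens with the model's own tokenizer.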
For demonstration purposes, I’m breaking our sample documents by sentences, though real AI systems use more sophisticated semantic analysis:
```
DOC1 → Chunk 1A: "Upgrading your computer's hard drive to a solid-state drive (SSD) can dramatically improve performance."
DOC1 → Chunk 1B: "SSDs provide faster boot times and quicker file access compared to traditional drives."

DOC2 → Chunk 2A: "Slow computer performance is often caused by too many programs running simultaneously."
DOC2 → Chunk 2B: "Close unnecessary background programs and disable startup applications to fix speed issues."

DOC3 → Chunk 3A: "Regular computer maintenance prevents performance problems."
DOC3 → Chunk 3B: "Clean temporary files, update software, and run system diagnostics to keep your computer running efficiently."
```
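
The sentence-level split used for this demonstration can be sketched as follows. The regular expression and the chunk naming (document number plus a letter) are assumptions that match the listing above. Semantic chunkers in real systems rely on much richer signals.

```python
import re

DOCS = {
    1: "Upgrading your computer's hard drive to a solid-state drive (SSD) can "
       "dramatically improve performance. SSDs provide faster boot times and "
       "quicker file access compared to traditional drives.",
    2: "Slow computer performance is often caused by too many programs running "
       "simultaneously. Close unnecessary background programs and disable "
       "startup applications to fix speed issues.",
    3: "Regular computer maintenance prevents performance problems. Clean "
       "temporary files, update software, and run system diagnostics to keep "
       "your computer running efficiently.",
}

def split_into_chunks(doc_number, text):
    """Split on whitespace that follows sentence-ending punctuation."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return {f"{doc_number}{chr(ord('A') + i)}": s for i, s in enumerate(sentences)}

chunks = {}
for number, text in DOCS.items():
    chunks.update(split_into_chunks(number, text))

for chunk_id, sentence in chunks.items():
    print(chunk_id, "->", sentence)
```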
Step 2: Vector embedding

Vector embeddings are created using pre-trained transformer neural networks like BERT, RoBERTa, or Sentence-BERT. These models have already learned semantic relationships from massive text datasets. Chunks are tokenized first, then passed through the pre-trained models. After that, each chunk becomes a mathematical representation of meaning.
```
Chunk 1A → Embedding: [0.23, -0.45, 0.78, ..., 0.67] (768 dims)
    Semantic Concepts: Hardware upgrade, SSD technology, performance improvement
    
Chunk 1B → Embedding: [0.18, -0.32, 0.81, ..., 0.71] (768 dims)  
    Semantic Concepts: Speed benefits, boot performance, storage comparison
    
Chunk 2A → Embedding: [-0.12, 0.67, 0.34, ..., 0.23] (768 dims)
    Semantic Concepts: Performance issues, software conflicts, resource problems
    
Chunk 2B → Embedding: [-0.08, 0.71, 0.29, ..., 0.31] (768 dims)
    Semantic Concepts: Software optimization, process management, troubleshooting
    
Chunk 3A → Embedding: [0.45, 0.12, -0.23, ..., 0.56] (768 dims)
    Semantic Concepts: Preventive care, maintenance philosophy, problem prevention
    
Chunk 3B → Embedding: [0.41, 0.18, -0.19, ..., 0.61] (768 dims)
    Semantic Concepts: Maintenance tasks, system care, routine optimization
```
Step 3: Knowledge graph construction

What is a Knowledge Graph?

Screenshot of the visualized knowledge graph based on the sample docs

A knowledge graph is a structured way to represent information as a network of connected entities and their relationships. Think of it like a map that shows how different concepts relate to each other. For example, it captures that “SSD improves performance” or “too many programs cause slowness.” This explicit relationship mapping helps AI systems understand not just what words appear together, but how concepts actually connect and influence each other.

How is a knowledge graph constructed?

The system analyzes each text chunk to identify: (1) Entities – the important “things” mentioned (like Computer, SSD, Performance), (2) Relationships – how these things connect to each other (like “SSD improves Performance”), and (3) Entity Types – what category each entity belongs to (Hardware, Software, Metric, Process). These extracted elements are then linked together to form a web of knowledge that captures the logical structure of the information.
```
CHUNK-LEVEL RELATIONSHIPS:

Chunk 1A:
[Computer] --HAS_COMPONENT--> [Hard Drive]
[Hard Drive] --CAN_BE_UPGRADED_TO--> [SSD]
[SSD Upgrade] --CAUSES--> [Performance Improvement]

Chunk 1B:
[SSD] --PROVIDES--> [Faster Boot Times]
[SSD] --PROVIDES--> [Quicker File Access]
[SSD] --COMPARED_TO--> [Traditional Drives]
[SSD] --SUPERIOR_IN--> [Speed Performance]

Chunk 2A:
[Too Many Programs] --CAUSES--> [Slow Performance]
[Programs] --RUNNING--> [Simultaneously]
[Multiple Programs] --CONFLICTS_WITH--> [System Resources]

Chunk 2B:
[Close Programs] --FIXES--> [Speed Issues]
[Disable Startup Apps] --IMPROVES--> [Boot Performance]
[Background Programs] --SHOULD_BE--> [Closed]

Chunk 3A:
[Regular Maintenance] --PREVENTS--> [Performance Problems]
[Maintenance] --IS_TYPE_OF--> [Preventive Action]

Chunk 3B:
[Clean Temp Files] --IMPROVES--> [Efficiency]
[Update Software] --MAINTAINS--> [Performance]
[System Diagnostics] --IDENTIFIES--> [Issues]
```
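
In code, relationships like these are often stored as (subject, relation, object) triples. The sketch below uses a subset of the triples above and a hypothetical `reachable` helper to show how following edges surfaces connected concepts. It is a toy in-memory structure, not a graph database.

```python
from collections import defaultdict

# A subset of the chunk-level relationships listed above.
TRIPLES = [
    ("Computer", "HAS_COMPONENT", "Hard Drive"),
    ("Hard Drive", "CAN_BE_UPGRADED_TO", "SSD"),
    ("SSD", "PROVIDES", "Faster Boot Times"),
    ("SSD", "PROVIDES", "Quicker File Access"),
    ("Too Many Programs", "CAUSES", "Slow Performance"),
    ("Close Programs", "FIXES", "Speed Issues"),
    ("Regular Maintenance", "PREVENTS", "Performance Problems"),
]

def build_graph(triples):
    """Adjacency list: entity -> list of (relation, target entity)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)

def reachable(graph, start):
    """Every entity that can be reached from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

graph = build_graph(TRIPLES)
print(graph["SSD"])
print(reachable(graph, "Computer"))
```

Starting from “Computer”, the traversal reaches the SSD benefits through the hard drive, even though no single chunk states that chain in full.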
Consolidated knowledge graph
```
                 COMPUTER PERFORMANCE
                           │
            ┌──────────────┼──────────────┐
            │              │              │
    HARDWARE SOLUTIONS  SOFTWARE SOLUTIONS  MAINTENANCE SOLUTIONS
            │              │              │
    ┌───────┴───────┐     ┌┴──────────┐   ┌┴─────────────┐
    │               │     │           │   │             │
[Hard Drive] → [SSD]  [Programs] → [Management]  [Regular] → [Tasks]
    │               │     │           │   │             │
    ▼               ▼     ▼           ▼   ▼             ▼
[Boot Times]    [File Access] [Close] [Disable] [Clean] [Update]
    │               │     │           │   │             │
    └───────────────┼─────┴───────────┼───┴─────────────┘
                    ▼                 ▼
              PERFORMANCE IMPROVEMENT
```
How do knowledge graphs work with vector embeddings?

Vector embeddings and knowledge graphs work together as complementary approaches. Vector embeddings capture implicit semantic similarities (chunks about “SSD benefits” and “computer speed” have similar vectors even without shared keywords), while knowledge graphs capture explicit logical relationships (SSD → improves → Performance). During search, vector similarity finds semantically related content, and the knowledge graph provides reasoning paths to discover connected concepts and comprehensive answers. This combination enables both fuzzy semantic matching and precise logical reasoning.

Why does AI indexing drive the chunk-sized and structured content recommendations?

When AI systems chunk content, they look for topic boundaries, complete thoughts, and logical sections. They analyze content for natural break points that preserve context and meaning. AI systems perform better when content is already organized into self-contained, meaningful units.

When you structure content with clear section breaks and complete thoughts, you do the chunking work for the AI. This ensures related information stays together and context isn’t lost during the indexing process.

What’s coming up next?

In the next blog post of this series, I’ll dive into how generative AI and RAG-powered search reshape the way systems interpret user queries, as opposed to the traditional keyword-focused methods. Our current post showed that AI indexes content by meaning, through chunking, vector embeddings, and building concept networks. It’s equally important to highlight how AI understands what users actually mean when they search.
August 19, 2025