Underlying Google’s AI Overviews is a powerful class of AI systems known as Large Language Models (LLMs). These models, which power tools you’ve probably heard of like Google Gemini, Copilot, and ChatGPT, are designed to process and understand natural language. LLMs enable machines to comprehend and generate human-like text, a critical capability for powering AI-driven summaries in search. Google Gemini, one of Google’s most advanced LLMs, builds on previous models by offering improved accuracy, better handling of complex queries, and a deeper understanding of context. Even with these improvements, it still faces challenges in all three areas.

Why is this important? Understanding how LLMs and Google Gemini function is necessary if you want to improve your content’s visibility in AI Overviews. These models do more than just match keywords—they analyze intent, process context, and generate responses based on the most relevant documents. By optimizing your content to align with how these models retrieve and summarize information, you increase your chances of appearing in these prominent AI-generated summaries. As AI Overviews play an increasingly important role in search, knowing how to tailor your content to fit these systems will give you a competitive edge.

I previously wrote a detailed explanation of how Google AI Overviews works, drawing insights from my analysis of the Google patent. While I briefly mentioned LLMs in that article, this article is dedicated to exploring how the system specifically utilizes LLMs. It serves as an expansion and complement to the earlier discussion.

In this article, we will explore how Google Gemini works as the engine behind AI Overviews, looking into the role of LLMs, how they work, and the unique features that set Google Gemini apart. By understanding these foundational elements, we can better grasp how AI Overviews function—and also examine how and why the system gets things wrong.

What are Large Language Models (LLMs)?

Large Language Models are advanced AI systems designed to understand and generate human language. Trained on huge datasets, these models can process language, recognize context, and generate coherent text, making them ideal for responding to complex queries. The technology behind LLMs, such as the transformer architecture, allows the models to focus on the most relevant parts of a query and generate accurate summaries. There is much more to discuss about LLMs, which I will cover in more depth in a future article.

In the context of Google AI Overviews, LLMs play a central role in analyzing user queries and producing the summaries we see in the search results by processing vast amounts of relevant information. Google Gemini, one of Google’s most advanced LLMs, continues to refine these capabilities, offering more accurate and contextually rich search results.

Visit the Google AI Overview Library


Why Understanding How LLMs Work Improves Your AI Overview Visibility

Google’s AI Overviews rely on LLMs to analyze and summarize search results. These models interpret user queries and generate summaries based on the most relevant documents. Here’s why I think knowing how LLMs work helps boost your visibility in AI Overviews.

Content Targeting: LLMs don’t just match queries—they interpret user intent. By creating content that thoroughly addresses both the primary query and related topics, you improve the chances of your content being selected by these models.

Optimizing Structure: LLMs likely process content based on structure and clarity. Well-organized content, with clear headers and concise sections, allows the model to extract key points more easily, increasing its chances of being included in AI Overviews.

Relevance and Authority: LLMs likely prioritize content that is authoritative and up-to-date. By regularly updating your content and ensuring its accuracy, you align better with what LLMs are programmed to retrieve and summarize.

By understanding how LLMs work and implementing best practices for how to optimize for AI Overviews, you can significantly improve your content’s chances of being featured in AI-driven summaries.

Introduction to Google Gemini

The Google patent shares that the system can select from a variety of LLMs depending on the task it is completing. It specifically calls out the Pathways Language Model (PaLM) and Language Model for Dialogue Applications (LaMDA). Liz Reid, Google VP and Head of Google Search, has recently confirmed that Gemini is the primary LLM powering AI Overviews.

What is Google Gemini? Google Gemini is a family of LLMs developed by Google, which can handle text, images, audio, and video. Most modern LLMs, including Gemini, are built on the Transformer architecture, a deep learning model. Transformers are designed to capture the relationships between different words or elements in a sentence, helping LLMs like Gemini grasp context and generate more accurate responses.

Screenshot of Google's Gemini LLM
Google Gemini is a Large Language Model (LLM) developed by Google, capable of processing text, images, video, and audio. This image represents the LLM version of Gemini, which powers AI Overviews in Google Search.

Gemini operates using a few key components, such as attention mechanisms that allow the model to focus on the most relevant parts of the input while generating a response. This is important because not all parts of a query or input text are equally important—attention helps the model prioritize what’s most critical.
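To make the idea of attention concrete, here is a toy sketch of the scaled dot-product attention calculation that Transformer models are built on. This is a simplified illustration of the mechanism, not Gemini’s actual implementation—real models use learned, high-dimensional embeddings and many attention heads.

```python
import math

def softmax(xs):
    """Convert raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention over toy embedding vectors:
    score each key against the query, then normalize with softmax."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    return softmax(scores)

# Toy 2-d "embeddings": the second key points the same way as the query,
# so it receives the largest attention weight.
q = [1.0, 0.0]
keys = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
weights = attention_weights(q, keys)
```

The key intuition: the input most aligned with the query gets the highest weight, which is how the model "focuses" on the most relevant parts of its input.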

Tokenization is another key element, breaking text into smaller units or “tokens” to help the model process language more effectively. Gemini uses these tools to understand vast amounts of data, from single sentences to entire documents, movies, or even days of audio, enabling it to provide rich, contextually accurate responses.
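A naive version of tokenization can be sketched in a few lines. Note that this word-and-punctuation splitter is a deliberate simplification—production LLM tokenizers use subword methods (SentencePiece/BPE-style), so a single word can become several tokens—but the pipeline is the same: text in, token IDs out.

```python
import re

def naive_tokenize(text):
    """Split text into word and punctuation tokens (a simplification;
    real LLM tokenizers operate on learned subword units)."""
    return re.findall(r"\w+|[^\w\s]", text)

def to_ids(tokens, vocab):
    """Map tokens to integer IDs, growing the vocabulary as new tokens appear."""
    return [vocab.setdefault(t, len(vocab)) for t in tokens]

vocab = {}
tokens = naive_tokenize("Gemini breaks text into tokens.")
ids = to_ids(tokens, vocab)
```

Context-window limits like "10 million tokens" are counted in exactly these units, which is why token counts, not word counts, determine how much content a model can process at once.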

One of Gemini’s most prominent features is its ability to process up to 10 million tokens at once. This allows it to handle enormous amounts of information, such as five days of audio or an entire movie, without losing track of context.

How Google Gemini Fits into Google’s AI Overview System

Google Gemini’s value is in its ability to support AI Overviews by synthesizing information across various formats. As the patent explains, content from a “search result document can include, for example, text content, image content, and/or video content.” By integrating text, images, and potentially video or audio into the summaries, Gemini offers users a more comprehensive understanding of their query.

How AI Overviews Utilize LLMs and Google Gemini

Google Gemini generates AI Overviews by interpreting user queries and quickly summarizing vast amounts of relevant information. According to Google’s patent, when a user enters a query, the LLM generates a prompt to retrieve query-responsive Search Result Documents (SRDs) from Google’s index. This prompt guides the system in selecting the most relevant documents that contain the information needed to address the user’s search intent​. Beyond Google’s index, the LLM also has access to its own training data and Google’s knowledge graph.

In this process, Retrieval-Augmented Generation (RAG) enables the system to retrieve information from its own training data and from external sources like Google’s index and knowledge graph. RAG allows the system to dynamically pull in the most relevant and up-to-date data from these sources, ensuring that the generated AI Overview is accurate, comprehensive, and contextually relevant to the user’s query.
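The retrieve-then-generate loop can be sketched as follows. This is a minimal illustration of the RAG pattern, assuming a toy keyword-overlap relevance score as a stand-in for Google’s actual ranking systems, and a prompt template of my own invention:

```python
def score(query, doc):
    """Toy relevance: fraction of query words that appear in the document.
    A stand-in for Google's far more sophisticated ranking signals."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words)

def retrieve(query, corpus, k=2):
    """Return the k highest-scoring documents (the query-responsive SRDs)."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs):
    """Combine retrieved documents into a grounding prompt for the LLM."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Using the sources below, answer: {query}\n{context}"

corpus = [
    "Sleep hygiene tips improve sleep quality over time.",
    "Core Web Vitals affect how Google ranks pages.",
    "Regular exercise can improve sleep quality.",
]
docs = retrieve("how to improve sleep quality", corpus)
prompt = build_prompt("how to improve sleep quality", docs)
```

The design point RAG captures: the model’s answer is grounded in freshly retrieved documents rather than relying solely on what it memorized during training.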

Role of LLMs in Summarizing and Interpreting Search Queries

When a user enters a search query, the LLM creates a specialized prompt designed to locate SRDs and related-query documents, which are processed and analyzed for relevance. This prompt enables the LLM to find documents that not only match the user’s query directly but also respond to implied or related queries. This ensures that the generated AI Overview provides a comprehensive and accurate response to the query.

To handle large datasets and ensure the content fits within memory constraints, the system may use a summarization LLM to reduce the size of the SRDs. The summarized content is then processed by the primary LLM to generate a concise, relevant response. This step is essential for maintaining efficiency, especially when working with larger amounts of data or complex queries.
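This two-stage pipeline can be sketched as below. The `summarize_stub` function is a placeholder where a real system would call the summarization LLM; the word budget is an invented stand-in for the primary model’s context limit:

```python
def summarize_stub(text, max_words=8):
    """Stand-in for the summarization LLM: keep the first few words.
    A real system would call a model here, not truncate."""
    words = text.split()
    return " ".join(words[:max_words]) + ("..." if len(words) > max_words else "")

def fit_to_budget(srds, budget_words=16):
    """Give each SRD an equal share of the word budget, mirroring the
    patent's pre-summarization step before the primary LLM runs."""
    per_doc = budget_words // len(srds)
    return [summarize_stub(d, max_words=per_doc) for d in srds]

srds = [
    "A long article about structured data markup and how it helps search engines.",
    "An in-depth guide covering Core Web Vitals and page experience signals.",
]
condensed = fit_to_budget(srds)
```

The takeaway: before the primary LLM ever sees the documents, each one may already have been compressed, which is one more reason clear, front-loaded content survives the pipeline better.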

When you enter a query, the system provides additional context to enhance its search capabilities. According to the patent, the LLM doesn’t just focus on the initial query—it expands the search through a multi-layered approach. By incorporating Search Result Documents (SRDs) relevant to related queries, the LLM broadens the range of information it can access, leading to a more comprehensive and nuanced summary. This allows the system to better interpret the user’s intent, even for complex or ambiguous queries.

An example prompt the system might use to retrieve documents:

“Retrieve documents related to improving sleep quality. Include results from recent scientific studies, expert health articles, and wellness blogs that provide actionable sleep improvement strategies. Expand search to include related queries such as ‘sleep hygiene tips,’ ‘how to fall asleep faster,’ and ‘best practices for a good night’s sleep.’ Filter results to prioritize content published within the last 12 months, focusing on authoritative sources like medical journals, health organizations, and sleep experts. If available, include supporting visuals or diagrams that illustrate sleep cycles or optimal sleep environments. Analyze these documents to extract key strategies and insights, and summarize the most relevant information in a concise format for the user.”

How Google Gemini Generates AI Overviews

Google Gemini specifically uses this prompt-based approach to generate AI Overviews. After the LLM constructs a query-specific prompt, it retrieves SRDs and processes their content using its multimodal capabilities. These documents are then synthesized into a concise, easy-to-understand summary that directly addresses the query​.

For example, if a user searches for “best practices for website SEO,” Google Gemini will generate a prompt that retrieves SRDs related to SEO strategies, recent SEO updates, and expert advice. The LLM then processes these documents to generate a natural language summary, presenting the user with a concise overview of SEO best practices​.

An example prompt the system might use to create the summary:

“Summarize the content from the following Search Result Documents (SRDs) that are responsive to the query ‘best practices for website SEO.’ Include relevant insights about recent SEO strategies, algorithm updates, and expert recommendations, while omitting any introductory explanations about SEO basics. Assume the user already has foundational knowledge of SEO, and focus on recent developments and advanced tactics such as structured data usage, core web vitals, and backlink strategies. Additionally, integrate content from related queries such as ‘on-page SEO techniques’ and ‘Google algorithm updates 2024.’ Generate a concise natural language summary that highlights the most up-to-date best practices in SEO, ensuring that technical and non-technical readers can easily understand the strategies.”

An AI Overview for the query, "best practices for website SEO."

Below, we can see how the LLMs are used throughout the process of creating an AI Overview summary.

How Google Gemini generates AI Overviews: From Query to Summary
This diagram illustrates the process through which Google Gemini, a Large Language Model (LLM), generates AI Overviews. From query input to final summary, the system retrieves and processes Search Result Documents (SRDs), ensuring relevant and concise answers are delivered to the user.

It’s important to note here that after the summary has been created, the system then seeks out sources to verify each of the statements included in the summary. It’s quite possible that the sources used to create the summary are not the same sources used to verify its content.

While we haven’t seen personalization in the live version of AI Overviews yet, the system is capable of it. The patent provides an example of this by describing prompts such as, “assuming the user is familiar with [description of the certain content], summarize [search result document content].” This suggests that the system can tailor summaries based on the user’s prior searches on the topic.

Combining Quotes, Paraphrasing, and Confidence: How Google Gemini Builds AI Overview Summaries

When the AI Overview system creates summaries, it doesn’t just copy and paste content from the top SRDs. Instead, Gemini processes the information in a much more nuanced way. The system can directly quote sections from SRDs, but it can also paraphrase content, combining information from a variety of documents. As explained in the patent, “the NL based summary that is generated can include direct quotes from the content that is processed using the LLM and/or can paraphrase the content that is processed using the LLM.” This method ensures that the summary is both comprehensive and easy to understand.

Additionally, Gemini can draw on its training data to fill in gaps where the SRDs might lack clarity or detail. The patent further explains, “the NL based summary […] can also include content that is not directly (or even indirectly) derivable from the content processed using the LLM, but is relevant to the content and is generated based on world knowledge of the LLM.” This means that the summary can include relevant, factually accurate information, even if it’s not directly found in the SRDs, as long as it aligns with the user’s query.

Ensuring Transparency with Source Identifiers

To ensure transparency, each part of the summary is linked to its source through source identifiers. These markers allow users to trace specific portions of the summary back to the original SRD. According to the patent, “the system can additionally or alternatively include, as part of the content, a source identifier of the SRD […] enabling the LLM output, generated based on processing the content using the LLM, to reflect which portion(s) of the NL based summary are supported by which SRD(s).” By incorporating these markers, Google makes it clear which sections are supported by external content, while still allowing the model to use its broader understanding to provide context and fill in missing details.
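A simplified sketch of this attribution step is shown below. The word-overlap matching and the threshold of three shared words are my own illustrative assumptions—the patent does not specify how portions are matched to SRDs—but the output shape mirrors the idea: each summary portion either carries a source identifier or falls back to the model’s world knowledge (no identifier):

```python
def annotate_with_sources(summary_parts, srds):
    """Attach a source identifier to each summary portion when an SRD
    shares enough words with it; unmatched portions get None, standing
    in for content drawn from the model's world knowledge."""
    annotated = []
    for part in summary_parts:
        part_words = set(part.lower().split())
        source = None
        for srd_id, text in srds.items():
            overlap = part_words & set(text.lower().split())
            if len(overlap) >= 3:  # illustrative threshold, not from the patent
                source = srd_id
                break
        annotated.append((part, source))
    return annotated

srds = {
    "srd-1": "structured data helps search engines understand page content",
    "srd-2": "core web vitals measure loading interactivity and visual stability",
}
summary = [
    "structured data helps search engines parse your pages",
    "seo is always changing",
]
annotated = annotate_with_sources(summary, srds)
```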

Example of an AI Overview with highlighted links (source identifiers), showing how the system traces portions of the summary back to their supporting documents, ensuring transparency and accuracy.

Confidence Scores and Content Verification

However, not all parts of the summary are treated equally. Google’s AI system can assign confidence scores to different sections of the summary, based on how well-supported the information is by the SRDs. The patent details this process: “the NL based summary can be annotated with confidence measure(s) associated with corresponding portion(s) of the NL based summary.” For instance, sections with high confidence are more likely to be shown prominently, perhaps highlighted in green, while lower-confidence sections might be flagged with colors like orange or red, signaling to users that the information may require further verification.
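The flagging logic the patent illustrates could look something like this sketch. The numeric cut-offs are invented for illustration—the patent describes the green/orange/red scheme but not the underlying values:

```python
def flag_color(confidence):
    """Map a portion-level confidence measure to a display flag,
    following the patent's green/orange/red illustration.
    The cut-off values here are assumptions, not from the patent."""
    if confidence >= 0.8:
        return "green"
    if confidence >= 0.5:
        return "orange"
    return "red"

portions = [("Statement well supported by SRDs.", 0.92),
            ("Statement with partial support.", 0.61),
            ("Statement with little support.", 0.30)]
flags = [(text, flag_color(c)) for text, c in portions]
```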

Finalizing the Summary: Displaying Additional Search Results

Once the model has processed and verified the content, the final natural language summary is produced. This summary blends direct quotes, paraphrased material, and additional context provided by the LLM’s understanding. If the AI has a high level of confidence in the summary, it may display it alone, without showing additional search results. As noted in the patent, “if confidence measure(s) […] satisfies upper threshold(s) most indicative of confidence, the NL based summary can be rendered […] without any initial rendering of any additional search results.” In cases where confidence is lower, traditional search results will appear alongside the summary, giving users more avenues to explore. “If confidence measure(s) […] satisfies lower threshold(s) […] the NL based summary can be rendered […] with initial rendering of additional search results.”
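The patent’s two-threshold rendering decision can be sketched as a simple function. The upper and lower threshold values here are placeholders—the patent describes the comparison, not the numbers:

```python
def render_plan(confidence, upper=0.85, lower=0.5):
    """Decide how to render the AI Overview, following the patent's
    threshold logic. Threshold values are illustrative assumptions."""
    if confidence >= upper:
        return "summary_only"          # high confidence: no additional results
    if confidence >= lower:
        return "summary_with_results"  # medium: summary plus search results
    return "results_only"              # low: fall back to classic results

plans = [render_plan(c) for c in (0.95, 0.6, 0.2)]
```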

The Role of Confidence Scores Behind the Scenes

While I previously discussed this process in my original patent article, I haven’t observed this particular behavior in the live version of the system. It’s possible that the confidence scores described in the patent aren’t meant to be directly visible to users but are used behind the scenes to shape the search experience.

These scores may determine not only whether additional search results are displayed but also how many supporting links are included. In fact, according to my upcoming study with Authoritas on the variability of AI Overviews across industries, the number of links varies by industry. More complex or informative sectors like Education, Health, and Food & Beverage tend to feature more links—likely reflecting the system’s need to boost confidence with additional resources. Meanwhile, industries like Entertainment and Sports & Recreation often require fewer links, as the AI can generate confident summaries with less external support.

This implies that confidence levels may play a very important role in determining both the number and visibility of links in AI Overviews, based on the complexity of the industry and content.

Why Google AI Overviews Might Not Include the User Query

You might have noticed a peculiar quality about AI Overview summaries—they don’t usually repeat the query. Recent studies by Kevin Indig and Surfer SEO found that only 5-6% of AI Overviews contain the query. This behavior is actually anticipated in the Google patent.

A specific example from the patent explains that “a prompt of ‘Summarize <Content A>, <Content B>, <Content C>, and <Content D>’ (which omits the query itself) can be processed using the LLM to generate the NL based summary.” This means the system can focus entirely on relevant content rather than using the user query directly.
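Reconstructing that query-free prompt is straightforward; here is a minimal sketch that produces the exact string the patent gives as its example (the helper function itself is my own construction):

```python
def build_summary_prompt(contents):
    """Build the patent's query-free prompt: 'Summarize <Content A>,
    <Content B>, ...' — the user's query is deliberately omitted.
    Assumes at least two content items."""
    labels = [f"<Content {chr(ord('A') + i)}>" for i in range(len(contents))]
    return "Summarize " + ", ".join(labels[:-1]) + ", and " + labels[-1]

prompt = build_summary_prompt(["doc1", "doc2", "doc3", "doc4"])
```

Because the query never enters the prompt, the model summarizes what the documents say rather than echoing what the user asked.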

It appears that this decision is deliberate, as the goal is to generate summaries that prioritize delivering the most useful and relevant information rather than simply mirroring the query.

Take a question-based query like “Do you need business insurance.” From a user perspective, you’ve already asked the question, so repeating it in the summary would add unnecessary characters (30 characters in this case) and wouldn’t enhance the value of the response. Instead, the system focuses on answering the question, prioritizing the user’s intent over keyword repetition. For such queries, it might make more sense for Google to directly provide the answer rather than repeat the question back to the user. You can see in the AI Overview summary below, it does just that. It answers the question directly instead of restating it.

A Google AI Overview for the query "Do you need business insurance."

By avoiding the direct inclusion of the query, the LLM ensures that the summary is contextually rich and provides a broader, more nuanced understanding of the topic. This approach helps Google Gemini create more accurate and comprehensive summaries, although it can sometimes lead to confusion when users expect to see their query mirrored in the output.

Challenges and Limitations of LLMs in AI Overviews

While Google Gemini improves search experiences, several limitations still impact the quality and accuracy of AI Overview summaries. A key issue is the inclusion of biased, irrelevant, or incorrect information, often from unrelated geographical regions or context-specific sources. Recognizing these shortcomings is essential for identifying areas where LLMs can be further refined and improved.

Known Limitations of LLMs: Bias, Accuracy, and Complexity

LLMs are trained on immense datasets, which can include biases, resulting in skewed summaries that reflect certain viewpoints or regions. Additionally, accuracy issues arise when LLMs generate responses based on incomplete, outdated, or unreliable data, leading to incorrect or misleading summaries.

As mentioned in an article by MIT Technology Review, the suggestion to “add glue to pizza” and the claim that Andrew Johnson earned university degrees after his death are extreme but notable examples of how generative models can produce nonsensical or factually incorrect answers.

Moreover, LLMs struggle with complex or nuanced queries, particularly when multiple layers of meaning or intent are involved. This can cause models to oversimplify or misinterpret the intricacies of certain queries, leading to summaries that fail to address the user’s expectations.

Why Irrelevant Information Appears in AI Overviews

A significant issue in AI Overviews is the inclusion of irrelevant information. We’ve seen a lot of this lately, especially content drawn from inappropriate geographical or contextual sources. According to the patent, the system selects Search Result Documents (SRDs) using query-dependent, user-dependent, and query-independent factors, prioritizing highly ranked documents. However, if these highly ranked documents include irrelevant or off-topic information (such as a humorous or satirical source), the LLM may still pull them into the summary.

Recently, Lily Ray shared a post on X about the query “what is the recommended weekly alcohol intake,” which displayed sources from the UK, despite being searched from the US.

It’s likely that Gemini selected these sources because they were highly ranked and offered a diverse, in this case problematic, perspective on the topic. It’s important to remember that the system prioritizes diversity of information when generating summaries.

How These Limitations Affect AI Overview Summaries

These challenges impact the quality of AI Overviews by including irrelevant or inaccurate information, diminishing the usefulness of the responses. AI Overviews can misunderstand and misinterpret legitimate sources, leading to incorrect outputs. In one case, the system incorrectly stated that Barack Obama was a Muslim president after misinterpreting the title of an academic essay.

This issue is especially prominent in geographically specific queries, where users may receive information from international sources that don’t align with their location, creating confusion or frustration.

Moreover, when the system encounters conflicting information from multiple sources, it may combine or generate a misleading response. As pointed out in the MIT Technology Review, conflicting information, such as different versions of content, can result in outputs that mix old and new data, providing inaccurate summaries.

Steps Google May Be Taking to Mitigate These Issues

Google has stated that they are actively working on addressing these issues. According to Liz Reid, head of Google Search, technical improvements are being made to reduce nonsensical or incorrect answers, with a focus on improving the system’s ability to detect irrelevant content. Additionally, Google has introduced restrictions on certain types of queries where AI Overviews might not be helpful, such as in medical or sensitive contexts, where incorrect information could pose greater risks.

Conclusion

This article explored how Large Language Models (LLMs), in particular Google Gemini, power AI Overviews in Google Search. We covered how these models work, their impact on summarizing content, and challenges such as bias, accuracy issues, and the inclusion of irrelevant information.

Google Gemini can generate fast, comprehensive summaries, but its limitations highlight areas for improvement, like handling complex or region-specific queries more effectively. As AI evolves, refining models like Gemini will be key to enhancing search accuracy and relevance. Looking ahead, AI will continue to shape search, offering more personalized, precise, and dynamic experiences.




Further Reading on Google AI Overviews

Subscribe to the SEO, AI & Pizza Newsletter

Receive weekly updates on the intersection of SEO, Search, and AI, directly in your inbox.

References

AI on Innovation – part 2
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Generative AI in Search: Let Google do the searching for you
Generative summaries for search results
Google AI Overviews Study: 25+ Statistics from 405,576 searches
Introducing Gemini 1.5, Google’s next-generation AI model
Introducing Gemini: our largest and most capable AI model
Transformer: A Novel Neural Network Architecture for Language Understanding
