
re:Invent 2025: Grounding GenAI on Enterprise Data with AWS AgentCore and Coveo


Introduction

By transcribing various overseas lectures into Japanese articles, we aim to make valuable but hard-to-find information more accessible. This project is driven by that idea, and the presentation we are covering this time is the following!

For re:Invent 2025 transcription articles, information is summarized in this Spreadsheet. Please check it as well.

📖re:Invent 2025: AWS re:Invent 2025 - Grounding GenAI on Enterprise Data with AWS AgentCore + Coveo (MAM221)

In this video, Nicolas Bordeleau from Coveo explains how to ground GenAI with enterprise data using AgentCore and Coveo. He introduces case studies from Dell, NVIDIA, Intuit, Vanguard, and others, emphasizing the importance of reducing LLM hallucinations, factual accuracy, source attribution, and dynamic knowledge updates. He lists the conditions for an excellent retriever as depth of knowledge, contextual awareness, quality of relevance, execution speed, and support for multiple formats. He also describes the toolset provided through MCP, including passage retrieval, answer generation, and document search. For agent building architecture, he illustrates the integration of AgentCore, gateway, and Coveo MCP server, stressing that the key to success lies in the proper design of prompts and MCP descriptions. He recommends using snake case for tool naming and placing concise and accurate descriptions upfront.

https://www.youtube.com/watch?v=8WH3zCwlLj0

  • Please note that this article is automatically generated while maintaining the content of the existing lecture as much as possible. There may be typos or incorrect information.

Main Content


Coveo's Achievements and the Importance of GenAI Grounding with Enterprise Data

Hello everyone. Thank you for taking the time to come to this session in your busy re:Invent schedule. Today, I'm going to talk about grounding GenAI with enterprise data using AgentCore and Coveo. My name is Nicolas Bordeleau. I work for Coveo. I'm on the Product Relations team. I'm excited to provide you with information on grounding with enterprise data.


So, for today's agenda, first I'll talk about why grounding is so important. There are a few concepts we need to cover there. I'll also touch a little bit on the architecture of how we integrate Coveo with Bedrock. The secret sauce, spoiler alert, it's prompting. Everything in the LLM world is about prompting. And then we'll talk about some next steps for those of you who want to learn more.


So first and foremost, not to brag, but just to give you a little bit of context so you understand who we are, what we do, and you might consider that I'm worthy of your time. We are an enterprise search company. We've worked with a lot of customers to bring GenAI into production. This is an example of Dell. Dell is powering multiple portals with Coveo. If you want to buy a laptop, you want to buy some equipment from them, it's powered by Coveo. If you're looking at their support section, that's also powered by Coveo. They have a very complex product, and when users are trying to solve their problems, they're using us to provide them with answers.


Same thing for NVIDIA. NVIDIA has all the money in the world. They could have built that solution themselves. But they decided to partner with Coveo to provide question answering to their customers through a Coveo solution. Intuit decided to integrate Coveo within their own applications. So if you're using Intuit and you want to find out how to use the product, you have a question, Coveo is also there. And Vanguard, in the financial services space, is also a big customer of Coveo. They're using Coveo pretty much across the board, internally and externally, and this is an example of their Personal Investor portal. If you're looking for information about their product, not investment information, but information about their product, it's powered by Coveo. So we help customers bring GenAI into production.

We built the full solution, but we also have options for customers if they want to integrate with AWS solutions so they can build their own. What I've been talking about so far is the left-hand side of this slide, where we basically do everything. We index the content, we build the index, we ground the prompt, we build the prompt, and then we provide UI components so you can serve it in a portal you want to deploy. We decided that in order to help customers, we're going to provide the rest of our platform as a retriever, and we're going to provide integration points with Bedrock, AgentCore, Q Business, and also QuickSight. And what we're going to cover today is basically how to integrate Coveo with the second one, which is AgentCore, but it's going to be quite similar if you want to work with QuickSight or any other solutions from AWS.


This is a slide that I borrowed from AWS from a couple of weeks ago when they announced it. I thought it was interesting. I completely agree with what's written here. Agentic AI is probably how LLMs will fully realize their potential. I was really intrigued to see that in any case, there's always a dependency on data. When dealing with GenAI, you need to ground your LLMs based on enterprise data. You need to ground them so they don't hallucinate. So there's a strong dependency on data. That's where we come into play. Same thing for agents. There's always a strong dependency on tools and data for these agents and models to know how they should behave and where they should get fresh information from. So let's get into the meat of this presentation. Why do LLMs need to be grounded?


Why LLMs Need Grounding: From Factual Accuracy to Hallucination Reduction

Because they need to be factually accurate. In order for an LLM to be factually accurate, you need to provide it with some sort of information. LLMs are basically trained on stale data, and they don't know what's true or not true. You ask them a question, they will return an answer, but they will basically return whatever information was ingested during training. They don't care if that's not what your brand wants to tell your customer. They just return the information to the user on the other side who's trying to get that information. By grounding them, you give them a source of truth, and you can provide more accurate information, and they're factually accurate.


You also need grounding if you want your LLMs to be contextually relevant. LLMs don't know much about the user on the other side who's asking the question. You want to be able to provide project information, user history. You need to add that to the prompt. That's easier to do by grounding your LLMs.

Traceability and trustworthiness. When you ground your LLM, you can do source attribution. If you want a user to trust what they read, you want them to be able to see where that information came from. If you don't ground your LLMs, the information comes from the LLM itself, and there's no way to trace back where this information came from.


Grounding an LLM basically means providing it with the information that it should use to return an answer. That's where you can do source attribution, and users can follow those links to verify if the information is accurate, and they gain trust in the system. Dynamic knowledge updating is also a key point. Basically, LLMs are trained on data sets that are fixed in time. If you want to be able to provide them with updated data, you need to provide that as grounding information.


So the information you want to expose with your LLM might already be available externally, might already be published, but it's from when the model was last trained. So for these reasons, you need to be able to provide grounding information to those models. The ultimate goal of grounding is to reduce hallucination. You want your LLM to answer factually using trusted information so it doesn't provide incorrect answers to your customers.


LLMs are really good at lying to us and making us believe that the information they're providing is true. So by grounding them, you reduce the amount of hallucination. And in an enterprise, it's a must. There's not enough information about your enterprise externally for a model to know about your data. You need to ground them to be fully accurate.


Qualities of a Good Retriever and Coveo MCP Server Toolset

So let's dive a little deeper into what makes a good retriever. A good retriever needs to have sufficient depth of knowledge. It needs to be able to look at a large amount of data. It's relatively easy to build a vector database and be able to deal with a small data set, be able to ground your model based on a small data set. But what we see with users lately is that they interact with LLMs, they ask a first question, and then they don't come back.

In the era of Google, you would look at results, you would navigate through the results, and then you would leave. You would try to find the information yourself. With LLMs, people go there, they ask one question, they refine it, they ask a second question, so you can't know exactly the scope of what they're trying to find. So you want to be able to provide a larger set of information so that when they interact with the LLM, they can always get an answer from that LLM.


So they don't hit a dead end where there's no information to retrieve. A good retriever needs to be contextually aware, just as an LLM needs to be grounded. You want the retriever to be able to know about the user at hand. You need that information in order to personalize the information that's returned to them based on who you are, what you have access to, and what you've done in the past.


The retriever needs to be able to take that information into consideration and retrieve a set of information that will be used to ground the LLM based on who you are and what you're trying to do right now. It's super important for the retriever to be contextually aware. Quality of relevance is probably the most important thing here. Because you're actually providing the information to the LLM prior to inference. You're basically telling the LLM to only use its language capabilities.

You don't want the LLM to use its own information. You're basically telling it to answer the user's question or to determine a course of action based on the information that you provide. Don't use anything else. Use what I'm providing here. So if what you're providing is not relevant, if it's not accurate, it's going to provide incorrect answers, and that's by design. So the quality of relevance is super important when dealing with retrievers.
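That "use only what I'm providing" instruction can be sketched as a prompt template. The following is an illustrative sketch, not Coveo's actual prompt; the function name and wording are assumptions:

```python
from typing import List

def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Tell the model to answer ONLY from the retrieved passages, so the
    retriever's relevance directly bounds the answer's quality."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the user's question using ONLY the context below.\n"
        "If the context is insufficient, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because the model is told to ignore its own parametric knowledge, an irrelevant passage here produces a wrong answer by design, which is exactly the speaker's point about relevance quality.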


Execution speed is also important. The retrieval part of the RAG pipeline basically happens as the first step. When you interact with an LLM that's not grounded, the answer comes back fast, and you're used to seeing the answer stream: as soon as an answer is generated, it gets returned to you, and you consume that information.


When you ground an LLM, that information is retrieved first. So there's a first step where the user is basically waiting for the LLM to start generating its answer. So you need that retriever to be really fast so that the user can start seeing the answer as fast as possible. Supported formats are also important. A retriever can do multiple things. An LLM can do multiple things. It can do deep investigation. It can ask questions for you to be guided towards exploration.

The classical way that retrievers work lately is that they return chunks of information, passages of information, which is basically a piece of information that's useful for the LLM to be able to answer the question. But sometimes you just want to have links so people can navigate to for further exploration. Sometimes you want to do deep investigation, or you want to answer a really complex question, and you don't need a few passages of a document.


You need the whole document for the LLM to look at the full information, make its own decision, and answer the question fully. So being able to provide different formats of information in your retriever, I think, is super important. And then, concisely and accurately, return the right information in the smallest form possible so that it makes the LLM's job easier. Smaller, more focused information is easier to digest: the LLM doesn't have to decipher the returned information, because it's already receiving the right passages to answer the question.

That also helps in terms of cost. The more information you put in your prompt, the more you pay for input tokens. So a concise and accurate retriever will ultimately reduce the amount you pay for LLM inference. Input tokens aren't the most expensive part of the bill, but it adds up.
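The cost argument is simple arithmetic. A minimal sketch, with a made-up per-1K-token price and made-up prompt sizes purely for illustration:

```python
def input_token_cost_usd(tokens: int, price_per_1k_usd: float = 0.003) -> float:
    """Cost of a prompt's input tokens; the price here is an invented example,
    not any real model's pricing."""
    return tokens / 1000 * price_per_1k_usd

# Illustrative prompt sizes: whole documents vs. a few focused passages.
full_documents = 20_000
focused_passages = 1_500

saving = input_token_cost_usd(full_documents) - input_token_cost_usd(focused_passages)
print(f"Saving per request: ${saving:.4f}")  # → Saving per request: $0.0555
```

A fraction of a cent per request sounds small, but multiplied across every query an enterprise assistant serves, the concise retriever pays for itself.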


We provide a toolset of search tools through our MCP server. In that toolset, we have passage retrieval: we can extract passages from documents and retrieve them, so you can use them to ground whatever you're building with your LLM. We also have an answer generation tool within that MCP server. If you want to get an answer, or you want to use Coveo more as a question-answering agent, you can skip building the prompt yourself and simply get an answer from Coveo's API, then provide that answer to an LLM on the other side. That LLM might be specialized in doing something a little simpler than answering complex questions.

We also have search and document search. So if the user just wants to explore and get a list of results for them to do their own thing with the data, we can also provide that. And as I mentioned before, we also have full document search. For deeper investigation or for more complex questions, you have access to the full document, not just passages, not just stacking up passages one after the other. For instance, procedures on how to rebuild a complex engine, we can get that from Coveo, which makes the LLM's job a lot easier.
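The four capabilities described above can be thought of as a small tool catalog. The tool names and routing below are hypothetical, mirroring the talk rather than Coveo's actual MCP server schema:

```python
# Hypothetical catalog of the four capabilities described in the talk.
# The real Coveo MCP server's tool names and schemas may differ.
COVEO_TOOLS = {
    "passage_retrieval": "Retrieve short, relevant passages to ground an LLM answer.",
    "answer_generation": "Return a fully generated answer for a user question.",
    "document_search": "Return a ranked list of result links for user exploration.",
    "full_document_retrieval": "Return the complete document for deep investigation.",
}

def pick_tool(need: str) -> str:
    """Toy router mapping a need onto a tool. In a real agent this choice is
    made by the LLM itself, based on the tool descriptions."""
    routing = {
        "ground": "passage_retrieval",
        "answer": "answer_generation",
        "explore": "document_search",
        "deep_dive": "full_document_retrieval",
    }
    return routing[need]
```

Note how each capability maps to a distinct need: passages for grounding, a finished answer for simple delegation, links for exploration, and whole documents for complex, multi-step questions.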


Integration Architecture with AgentCore: The Practice of Prompting and MCP Descriptions

This is the architecture that we propose to our customers who want to start building agents. It's fairly simple, there's nothing groundbreaking about it. You have your agent in the middle. That agent is obviously using an LLM. On top you have your long-term and short-term memory, so you can have sessions. You can work with context over a few interactions. Basically, you have a conversation with that agent.

That agent is connected through a gateway. This is an AgentCore service that's connected to the Coveo MCP server. And on the side, you have all the tools that I've been talking about, that are provided through the MCP server. So the gateway will register those tools. So now the agent has a set of tools that it can use to do the job that you want it to do. And we're also leveraging the identity provider of AgentCore, so actions that are done on Coveo will be done as the authenticated user on the other side.

So if I'm trying to ask a question to an agent, and I don't have access to a specific piece of information that's on the Coveo index, those pieces of information will not be returned to the agent. So there's no information leakage. This is also an important piece to add, and basically, you can have more contextual information based on who you are.
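The "no information leakage" behavior amounts to permission trimming at retrieval time: documents the authenticated user cannot see never enter the agent's context. A minimal sketch, with invented document and group names (real systems resolve permissions against the identity provider, as the talk describes with AgentCore identity):

```python
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    allowed_groups: set  # groups permitted to see this document

@dataclass
class User:
    name: str
    groups: set

# Toy index with one public and one restricted document.
INDEX = [
    Document("Public FAQ", {"everyone"}),
    Document("Internal runbook", {"support-engineers"}),
]

def search_as(user: User, query: str) -> list:
    """Return only documents the authenticated user may see. (A real search
    would also match `query`; permission trimming is the point here.)"""
    visible_to = user.groups | {"everyone"}
    return [d for d in INDEX if d.allowed_groups & visible_to]
```

Because trimming happens inside the retriever, the agent never holds restricted text in its prompt, so it cannot leak it no matter how the user phrases the question.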


The secret sauce is the combination of the prompt and the MCP description. Basically, you're building an agent. You're giving it a set of tools, so you have to be super explicit about what those tools are, when to use those tools, what they're for, how they work, and all these things. At the end of the day, it's the model that will decide what to do with those descriptions, how to use the tools.

This is kind of a metaprompt that we worked on with a couple of customers. It starts with the global directive: who the agent is and what its main job is. This is where you determine what it's going to do. You also want to talk about grounding, memory, and sources: grounding will be available, and it will be done with the XYZ tool. You have to be explicit about it.

Also make a clear distinction between what comes from memory and what comes from fresh information from a retriever. It's a different source of information. Memory is also information, but it's much more limited. But it's a different source of information, so you have to be explicit about when to use both. If you want to cite sources, you have to be explicit about it. So it's kind of holding the hand of a little kid and guiding it, but at the end of the day, you do it once. You test it multiple times, but you do it once. And at some point, the agent will be autonomous and able to autonomously use those tools.

You can even go as far as defining types of questions and whether each should be answered from memory or from retrieval. You can always go deeper, but building a good top layer is a really good starting point for the agent to properly use the retriever.
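The metaprompt structure described above can be sketched as a small template builder. The section wording below is illustrative, not the actual template Coveo uses with customers:

```python
def build_system_prompt(agent_role: str, grounding_tool: str) -> str:
    """Assemble a metaprompt with the sections described in the talk:
    global directive, grounding, memory vs. retrieval, and sources."""
    return "\n\n".join([
        f"# Global directive\nYou are {agent_role}.",
        f"# Grounding\nAlways call the `{grounding_tool}` tool before answering "
        "factual questions, and answer ONLY from the passages it returns.",
        "# Memory vs. retrieval\nUse conversation memory for what the user already "
        "told you; use the retriever for fresh, authoritative knowledge. "
        "Be explicit about which one you are drawing on.",
        "# Sources\nCite the source URL of every passage you rely on.",
    ])
```

Writing this once (and testing it many times) is the hand-holding the speaker describes: after that, the agent uses the tools autonomously.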


The other part is the MCP description. You have an agent, you told them how to use, well, not how to use, but basically when to use the tools, what to do with these tools. On the other side, you have the MCP server that contains these tools. What we provide by default is fairly simple. We have a retriever called Coveo. We have a tool for search, a tool for retrieval. We also have a tool for full documents, but we don't know what you're exposing in your index on the other side. So you need to be explicit about what will be available through those tools.


So, first, follow sensible tool naming standards. We've seen a lot of customers whose agents get really confused by dots in their tool names, so avoid dots and use snake case. You can use dashes, but tool naming is fairly important. Probably the most important thing is to have a good description for your tool. Stick to concise descriptions. Again, this is going to be used by the LLM on your side, so the more accurate the description of the tool, the easier it will be for the LLM to use those tools properly.

Stick to one or two sentences, and bring the key information forward. If the tool creates a case and requires authentication, start with "authentication is required to create a case" as the primary information, and then place secondary details after that. Use verbs and specify the type of object to be retrieved. There are many guidelines you can follow, but correctly defining the tools will make a real difference in how the LLM agent uses them.
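These naming and description rules are easy to lint mechanically. A minimal sketch of such checks, with the exact rules chosen here as assumptions rather than an official standard:

```python
import re

def valid_tool_name(name: str) -> bool:
    """Naming advice from the talk: lowercase snake_case, no dots.
    (Dashes also work in practice; this sketch enforces snake_case only.)"""
    return bool(re.fullmatch(r"[a-z][a-z0-9_]*", name))

def concise_description(desc: str, max_sentences: int = 2) -> bool:
    """Rough sentence count: keep descriptions to one or two sentences."""
    return 0 < desc.count(".") <= max_sentences
```

Running checks like these over an MCP server's tool list before wiring it into an agent catches the dot-in-name and rambling-description problems early.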

Regarding schemas and descriptions: when you connect to the MCP server, you get the entire JSON schema of that server, which includes the tool descriptions. The entire schema is what the agent on the other side uses at runtime to call a tool; it includes each tool's description, arguments, and everything needed to invoke it. However, the part the LLM uses to decide when to use a tool, and what to use it for, is just the description. They are part of the same bundle but serve different purposes, so please provide good descriptions for your tools.
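The split between "description for selection" and "schema for invocation" is easiest to see in a concrete tool definition. This one is hypothetical; the field layout follows the MCP tool convention (`name`, `description`, `inputSchema`), but the tool itself is invented:

```python
import json

# Hypothetical MCP tool definition. The `description` is what the LLM reads
# to decide WHEN to use the tool; the `inputSchema` is what the agent runtime
# uses to actually CALL it.
tool_definition = {
    "name": "passage_retrieval",
    "description": "Retrieve short, relevant passages from the index to ground an answer.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The user's search query."},
            "max_passages": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

print(json.dumps(tool_definition, indent=2))
```

A vague `description` with a perfect `inputSchema` still fails: the runtime could call the tool flawlessly, but the model never decides to.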

That's basically what I had planned for today. Our booth is around the Atlassian booth. If you walk around the Atlassian booth, you'll find the Coveo booth, number 1529. We also host a webinar series called AI Masterclass. I think we hold it approximately every two weeks or once a month. You can use this QR code or visit Coveo.com to see the latest AI Masterclass.

Also, I must ask you to access the app and fill out the survey. In fact, we would be very grateful if you could fill out the survey to help us know how to improve. We still have a few minutes, so if anyone has any questions, I'd be happy to answer them. Or, if anyone would like to speak directly, I am available. I will be at the booth for the rest of today. Thank you very much.


  • This article was automatically generated using Amazon Bedrock, maintaining the original video's information as much as possible.
