Description Tips for Bedrock × Redshift Knowledge Bases
Hi, I’m Dang, an AI/ML engineer at Knowledgelabo, Inc. We provide a service called "Manageboard", which supports our clients in aggregating, analyzing, and managing scattered internal business data. Manageboard is set to enhance its AI capabilities in the future.
In this article, I’ll introduce a constraint you may face when building a structured knowledge base using AWS Bedrock integrated with Amazon Redshift—and how to work around it.
Background: What is a Structured Knowledge Base?
When building Retrieval-Augmented Generation (RAG) agents in AWS Bedrock, you can use a knowledge base. In addition to standard vector-store knowledge bases built from unstructured sources (such as PDFs and documents), you can also connect structured, SQL-based databases.
Key characteristics of structured knowledge bases:
- Redshift is supported (not RDS, etc.)
- Automatically generates SQL queries from natural language input
- Executes the generated SQL on Redshift, inserts results into the prompt, and sends them to the LLM
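To make the three characteristics above concrete, here is a minimal sketch of the same flow in Python. It is not how Bedrock is implemented: an in-memory SQLite database stands in for Redshift, and `generate_sql` is a hypothetical stub that returns a fixed query where the real service would call an LLM. The `sales` table and all function names are assumptions made up for illustration.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the text-to-SQL step.

    The real knowledge base derives SQL from the question and schema metadata;
    this stub always returns the same query.
    """
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )


def run_query(conn: sqlite3.Connection, sql: str) -> list[dict]:
    """Execute the SQL and return each row as a column-name -> value dict."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def build_prompt(question: str, sql: str, rows: list[dict]) -> str:
    """Insert the query results into the text that would be sent to the LLM."""
    lines = [f"Question: {question}", f"SQL: {sql}", "Results:"]
    lines += [", ".join(f"{k}={v}" for k, v in row.items()) for row in rows]
    return "\n".join(lines)


def demo() -> str:
    # SQLite is only a local stand-in for Redshift here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100), ("west", 250), ("east", 50)],
    )
    question = "What are total sales by region?"
    sql = generate_sql(question)
    return build_prompt(question, sql, run_query(conn, sql))


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the ordering: the quality of the final answer depends on the generated SQL, which in turn depends on how well the model understands the schema.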
Steps to Create an Agent Using a Structured Knowledge Base
To create a Bedrock agent using a Redshift-integrated structured knowledge base, follow these basic steps:
1. Create a Knowledge Base that connects to Redshift
- In the Bedrock console, go to “Knowledge Bases” → “Create” → Choose “Knowledge Base with structured data store”
- Configure the service role
- In the query engine settings, connect to your Redshift cluster and database (we verified this with Redshift Serverless)
- Create the knowledge base
- On Redshift, grant permissions to the service role:
CREATE USER "IAMR:[role name]" WITH PASSWORD DISABLE;
GRANT USAGE ON SCHEMA public TO "IAMR:[role name]";
GRANT SELECT ON [table name] TO "IAMR:[role name]";
- Sync the knowledge base
2. Create an Agent
- In the Bedrock “Agents” screen, create a new agent
- Select the knowledge base you just created
- Select an LLM model (e.g., Claude or Nova)
- Customize constraints and system instructions, then deploy
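When the knowledge base needs access to several tables, the Redshift grants from step 1 become repetitive. The sketch below generates the same three kinds of statements for a list of tables. The helper name `grant_statements`, the role name `kb-role`, and the table names are hypothetical, and the sketch only builds strings: it does not connect to Redshift or run anything.

```python
def grant_statements(role_name: str, tables: list[str], schema: str = "public") -> list[str]:
    """Build the CREATE USER / GRANT statements for a knowledge base service role.

    Mirrors the statements shown in step 1; table names are schema-qualified.
    """
    if '"' in role_name:
        # The role name is embedded in a double-quoted identifier.
        raise ValueError("role name must not contain double quotes")
    user = f'"IAMR:{role_name}"'
    statements = [
        f"CREATE USER {user} WITH PASSWORD DISABLE;",
        f"GRANT USAGE ON SCHEMA {schema} TO {user};",
    ]
    statements += [f"GRANT SELECT ON {schema}.{table} TO {user};" for table in tables]
    return statements


if __name__ == "__main__":
    # "kb-role", "sales", and "customers" are placeholder names.
    for statement in grant_statements("kb-role", ["sales", "customers"]):
        print(statement)
```

Review the generated statements before running them, and grant SELECT only on the tables the agent actually needs.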
Note: Description Field Limited to 200 Characters
As part of a knowledge base's metadata, you need to register instructions describing what kind of data the agent should handle, but this field is limited to 200 characters. That is rarely enough room in cases such as the following:
- If tables or columns are given abstract names, it becomes difficult to grasp their meaning.
- If many abbreviations or coined terms specific to a business domain are used, it will be harder for the LLM to understand.
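To see how quickly 200 characters run out, a small check like the following can be run against a draft description. It is a sketch that assumes the limit counts characters the same way Python's `len` does, which may not match the console exactly. The function name and the sample text are made up for illustration.

```python
MAX_DESCRIPTION_CHARS = 200  # limit described in this article


def check_description(text: str, limit: int = MAX_DESCRIPTION_CHARS) -> tuple[bool, int]:
    """Return (fits, overflow), where overflow is the number of characters over the limit."""
    overflow = max(0, len(text) - limit)
    return overflow == 0, overflow


if __name__ == "__main__":
    # Hypothetical draft that tries to explain domain abbreviations inline.
    draft = (
        "Monthly sales and cost data per department. "
        "'uri' means sales amount, 'genka' means cost of goods sold, "
        "'bumon_cd' is the department code, and 'ym' is the fiscal year-month. "
        "Amounts are in JPY and exclude consumption tax."
    )
    fits, overflow = check_description(draft)
    print(f"length={len(draft)} fits={fits} overflow={overflow}")
```

Explaining even a handful of abbreviations already pushes a draft past the limit, which is why the per-table and per-column descriptions are the better place for that detail.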
Solution: Use “Table and Column Descriptions”
To mitigate the risk of inaccurate SQL generation due to vague metadata, you can do the following:
- In the knowledge base creation flow, proceed to the “Query Engine Settings” step
- Open the “Table and column descriptions” section
- Add detailed explanations for each table and column individually
This provides extra context for the LLM, helping it to generate more accurate SQL even from ambiguous user prompts.
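If you keep these descriptions in version control rather than typing them into the console, a small builder like the one below can turn a plain mapping into a nested structure. This is only a sketch: the output key names (`tables`, `name`, `description`, `columns`) are an assumption loosely modeled on the query-generation context of the Bedrock knowledge base API and should be verified against the current API reference before use. The table and column names are hypothetical.

```python
def build_generation_context(tables: dict) -> dict:
    """Convert {table: {"description": str, "columns": {column: description}}}
    into a nested list-of-dicts structure (key names are assumptions)."""
    return {
        "tables": [
            {
                "name": table_name,
                "description": spec["description"],
                "columns": [
                    {"name": column, "description": description}
                    for column, description in spec["columns"].items()
                ],
            }
            for table_name, spec in tables.items()
        ]
    }


if __name__ == "__main__":
    # Hypothetical table with abbreviated, domain-specific column names.
    context = build_generation_context(
        {
            "public.t_uri_m": {
                "description": "Monthly sales totals per department.",
                "columns": {
                    "bumon_cd": "Department code; joins to the department master.",
                    "ym": "Fiscal year-month in YYYYMM format.",
                    "uri": "Sales amount in JPY, excluding tax.",
                },
            }
        }
    )
    print(context)
```

Each description can spell out abbreviations, units, and join keys in full, which is exactly the detail that does not fit in the 200-character field.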
Summary
AWS Bedrock allows you to build structured knowledge bases by integrating with Redshift. However, the 200-character instruction limit can negatively impact agent accuracy.
You can avoid this issue by using the “Table and column descriptions” feature to provide more detailed metadata.
By keeping this in mind from the initial design phase, you’ll build a more reliable and effective structured knowledge base.