🙌
A Very Comprehensive Collection of RAG Resources

Published on 2024/12/24
Git
GitHub
RAG
Article contest: "This Year's Biggest Challenge"
tech
This post shares two excellent RAG resource collections that I found on GitHub. The links are below, and I recommend opening them and reading the projects directly.
Project 1: https://github.com/lizhe2004/Awesome-LLM-RAG-Application
Project 2: https://github.com/jxzhangjhu/Awesome-LLM-RAG

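 Overview
Paper: Retrieval-Augmented Generation for Large Language Models: A Survey[1]
Chinese translation: Retrieval-Augmented Generation for Large Language Models: A Survey[2]
GitHub repository[3]

Advanced RAG Techniques: an Illustrated Overview[4]
Chinese translation: Advanced RAG Techniques: an Illustrated Overview[5]

A Cheat Sheet and Some Recipes for Building Advanced RAG[6]
Patterns for Building LLM-based Systems & Products[7]
Chinese translation: Patterns for Building LLM-based Systems & Products[8]

RAG Primer[9]
Chinese translation[10]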


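 Introduction
Microsoft: Retrieval Augmented Generation (RAG) in Azure AI Search[11]
Chinese translation: Retrieval Augmented Generation (RAG) in Azure AI Search[12]

Azure OpenAI Design Patterns: RAG[13]

IBM: What is retrieval-augmented generation?[14]
Chinese translation: What is retrieval-augmented generation?[15]

Amazon: Retrieval Augmented Generation (RAG)[16]

NVIDIA: What Is Retrieval-Augmented Generation?[17]
Chinese translation: What Is Retrieval-Augmented Generation?[18]

Meta: Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models[19]
Chinese translation: Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models[20]

Cohere: Introducing Chat with Retrieval-Augmented Generation (RAG)[21]

Pinecone: Retrieval-Augmented Generation[22]

Milvus: Building AI Apps with Retrieval-Augmented Generation (RAG)[23]

Knowledge Retrieval Takes Center Stage[24]
Chinese translation: Knowledge Retrieval Takes Center Stage[25]

Disadvantages of RAG[26]
Chinese translation: Disadvantages of RAG[27]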


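 Comparison
Retrieval-Augmented Generation (RAG) or Fine-tuning: Which Is the Best Tool to Boost Your LLM Application?[28]
Chinese translation: RAG or Fine-tuning: Which Is the Best Tool to Optimize Your LLM Application?[29]

Prompt Engineering, RAG, and Fine-tuning Compared[30]
RAG vs Finetuning: Which Is the Best Tool to Boost Your LLM Application?[31]
Chinese translation: RAG vs Fine-tuning: Which Is the Best Tool to Optimize Your LLM Application?[32]

A Survey on In-context Learning[33]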

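 Application References
Kimi Chat[34]
Answers questions based on web links you send and files you upload.

GPTs[35]
Supports RAG-like functionality through document upload.

Baichuan Knowledge Base[36]
Create a new knowledge base and obtain its knowledge base ID.
Upload a file and obtain its file ID.
Associate the file with the knowledge base using the file ID and the knowledge base ID. Multiple documents can be associated with a single knowledge base.
When calling the chat interface, pass the list of knowledge base IDs in the knowledge_base field. The large model then answers the question using the retrieved knowledge.

COZE[37]
An application-building platform for developing next-generation AI chatbots. Even without programming experience, you can quickly create many kinds of chatbots and deploy them to various social platforms and messaging apps.

Devv-ai[38]
A new-generation AI search engine that aims to understand programmers best. It is built on the RAG application pattern for large models, and the underlying LLM is its own fine-tuned model.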


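 Open-Source Tools

 RAG Frameworks
LangChain[39]
langchain4j[40]
LlamaIndex[41]
GPT-RAG[42]
GPT-RAG provides a robust architecture tailored to enterprise deployment of the RAG pattern. It ensures robust responses and is built on a foundation of zero-trust security and responsible AI, providing availability, scalability, and auditability. It is well suited to organizations moving from the exploration and PoC stages to full production and MVPs.

QAnything[43]
A local knowledge-base Q&A system that supports files in any format as well as databases, and can be installed and used offline. Simply drop in local files of any format to get an accurate, fast, and reliable Q&A experience. Currently supported formats: PDF, Word (doc/docx), PPT, Markdown, Eml, TXT, images (jpg, png, etc.), and web links.

Quivr[44]
Your second brain: a personal assistant powered by generative AI, with enhanced AI capabilities.
Quivr[45]

Dify[46]
Combines the concepts of Backend as a Service and LLMOps, covering the core technology stack needed to build generative-AI-native applications, including a built-in RAG engine. With Dify, you can self-deploy capabilities similar to the Assistants API and GPTs on top of any model.

Verba[47]
An open-source RAG application from the vector database Weaviate. It aims to provide an end-to-end, streamlined, user-friendly interface for out-of-the-box Retrieval-Augmented Generation (RAG). In a few simple steps, you can explore your datasets and extract insights, either locally or through LLM providers such as OpenAI, Cohere, and HuggingFace.

danswer[48]
Lets you ask natural-language questions against internal documents and get reliable answers backed by quotes and references from the source material, so that you can trust the results. It connects to many common tools such as Slack, GitHub, and Confluence.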


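 Preprocessing
Unstructured[49]
This library provides open-source components for ingesting and preprocessing images and documents (PDF, HTML, Word documents, and so on). Its use cases center on streamlining and optimizing LLM data-processing workflows. The modular functions and connectors of unstructured form a cohesive system that simplifies data ingestion and preprocessing, adapts to different platforms, and effectively transforms unstructured data into structured output.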


 Routing
semantic-router[50]

 Evaluation Frameworks
ragas[51]
Ragas is a framework for evaluating RAG applications. Its metrics include faithfulness, answer relevance, context precision, context relevancy, and context recall.
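
To make two of these metric names concrete, here is a minimal sketch of context precision and context recall computed over chunk IDs. It is a simplified, set-based illustration only: the function names and the sample IDs are hypothetical, and this is not the Ragas API or its exact metric definitions.

```python
def context_precision(retrieved, relevant):
    """Share of retrieved chunks that are actually relevant (rank order ignored)."""
    if not retrieved:
        return 0.0
    hits = sum(1 for chunk_id in retrieved if chunk_id in relevant)
    return hits / len(retrieved)


def context_recall(retrieved, relevant):
    """Share of relevant chunks that the retriever managed to return."""
    if not relevant:
        return 0.0
    hits = sum(1 for chunk_id in relevant if chunk_id in retrieved)
    return hits / len(relevant)


# Hypothetical example: 4 chunks retrieved, 3 chunks labeled relevant.
retrieved = ["c1", "c2", "c3", "c4"]
relevant = {"c1", "c3", "c5"}

precision = context_precision(retrieved, relevant)  # 2 of 4 retrieved are relevant
recall = context_recall(retrieved, relevant)        # 2 of 3 relevant were retrieved
print(precision, recall)
```

In practice an evaluation framework would judge relevance with an LLM or labeled data rather than with exact ID matches.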

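tonic_validate[52]
A platform for RAG development and experiment tracking that provides metrics for evaluating the response quality of Retrieval-Augmented Generation (RAG) applications.

deepeval[53]
A simple, easy-to-use open-source evaluation framework for LLM applications. It is similar to Pytest but specialized for unit-testing LLM applications. DeepEval evaluates performance on metrics such as hallucination, answer relevance, and RAGAS, using LLMs and various other NLP models that run locally on your machine.

trulens[54]
TruLens provides a set of tools for developing and monitoring neural networks, including large language models. This includes TruLens-Eval for evaluating LLMs and LLM-based applications, and TruLens-Explain for deep-learning explainability. TruLens-Eval and TruLens-Explain are separate packages and can be used independently.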

langchain-evaluation[55]
Llamaindex-evaluation[56]

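 Embeddings
BCEmbedding[57]
A library of bilingual and cross-lingual semantic representation models developed by NetEase Youdao. It contains two base models, EmbeddingModel and RerankerModel. EmbeddingModel is dedicated to generating semantic vectors and plays a key role in semantic search and Q&A, while RerankerModel specializes in refining semantic search results and their relevance ordering.

BGE-Embedding[58]
A general-purpose embedding model open-sourced by the Beijing Academy of Artificial Intelligence (BAAI). The model is pre-trained with RetroMAE and then trained on large-scale pair data with contrastive learning.

bge-reranker-large[59]
A cross-encoder open-sourced by the Beijing Academy of Artificial Intelligence (BAAI) that computes a relevance score for a query and an answer at query time. It is more accurate than an embedding model (bi-encoder) but slower to compute, so it is typically used to rerank the top-k documents returned by the embedding model.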
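
The retrieve-then-rerank pattern described above can be sketched as follows. This is a toy illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, and `joint_score` (shared word bigrams) stands in for a real cross-encoder such as bge-reranker-large. All function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bi-encoder stand-in: bag-of-words vector, computed per text in isolation."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_top_k(query, docs, k):
    """Stage 1: cheap similarity search over all documents."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def joint_score(query, doc):
    """Toy cross-encoder stand-in: reads query and doc together (shared bigrams)."""
    q, d = query.lower().split(), doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)


def rerank(query, candidates):
    """Stage 2: slower, more precise scoring applied to the top-k only."""
    return sorted(candidates, key=lambda d: joint_score(query, d), reverse=True)


docs = [
    "documents top k",
    "the retriever returns the top k documents",
    "cooking pasta takes ten minutes",
]
query = "top k documents"

candidates = retrieve_top_k(query, docs, k=2)
reranked = rerank(query, candidates)
print(candidates)
print(reranked)
```

The expensive scorer only sees k candidates instead of the whole corpus, which is the trade-off the description above points to.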

gte-base-zh[60]
GTE text embedding: a general-purpose Chinese text representation model from the GTE family, provided by Tongyi Lab.


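 Prompt Engineering
YiVal[61]
YiVal is an automatic prompt engineering assistant for GenAI applications, designed to streamline the tuning of prompts and of any other configuration in the loop. With YiVal, manual tuning is meant to become a thing of the past. Its data-driven, evaluation-centric approach targets optimal prompts, accurate RAG configurations, and fine-tuned model parameters, so that applications can achieve better results, lower latency, and minimized inference cost.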


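 SQL Enhancement
vanna[62]
Vanna is an MIT-licensed open-source Python RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.
Vanna works in two simple steps: train a RAG "model" on your data, then ask a question, which returns a SQL query. The training data consists mainly of DDL schemas, business documentation, and sample SQL, and "training" here means embedding that data.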
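
The two-step flow above (store training items, then retrieve the relevant ones for a question) can be sketched in a few lines. This is a hypothetical toy, not the Vanna API: the class `ToySqlRag`, its methods, and the word-overlap retrieval are assumptions standing in for real embeddings, and the final LLM call that would turn the prompt into SQL is left out.

```python
class ToySqlRag:
    """Toy store for DDL, documentation, and sample SQL (not the Vanna API)."""

    def __init__(self):
        self.items = []  # list of (kind, text) pairs

    def train(self, kind, text):
        # "Training" only stores the item; a real system would embed it.
        self.items.append((kind, text))

    def retrieve(self, question, k=2):
        question_words = set(question.lower().split())

        def overlap(item):
            text = item[1].lower()
            for ch in "(),":
                text = text.replace(ch, " ")
            return len(question_words & set(text.split()))

        # Word overlap stands in for vector similarity.
        return sorted(self.items, key=overlap, reverse=True)[:k]

    def build_prompt(self, question, k=2):
        context = "\n".join(
            f"[{kind}] {text}" for kind, text in self.retrieve(question, k)
        )
        return f"{context}\nQuestion: {question}\nSQL:"


rag = ToySqlRag()
rag.train("ddl", "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL)")
rag.train("ddl", "CREATE TABLE employees (id INT, name TEXT, salary INT)")
rag.train("sql", "SELECT SUM(total) FROM orders")

prompt = rag.build_prompt("what is the total of all orders", k=2)
print(prompt)
```

The prompt would then be sent to an LLM, which returns the SQL query; only the schema and examples relevant to the question are included.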


 LLM Deployment and Serving
vllm
OpenLLM[63]

 Observability
llamaindex-observability[64]
langfuse[65]
phoenix[66]
openllmetry[67]

lunary[68]

 Other
RAGxplorer[69]
RAGxplorer is an interactive Streamlit tool that supports building Retrieval-Augmented Generation (RAG) applications by visualizing document chunks and queries in the embedding vector space.


 Papers
Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models[70]
Lost in the Middle: How Language Models Use Long Contexts[71]
Paper: Seven Failure Points When Engineering a Retrieval Augmented Generation System[72]

Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents[73]
RankGPT Reranker Demonstration (Van Gogh Wiki)[74]

Bridging the Preference Gap between Retrievers and LLMs[75]
Tuning Language Models by Proxy[76]
Zero-Shot Listwise Document Reranking with a Large Language Model[77]
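This paper discusses two reranking approaches: pointwise reranking and listwise reranking.
Pointwise reranking: given a list of documents, the query and each document are passed to the LLM separately, and the LLM produces a relevance score for each document.
Listwise reranking: given a list of documents, the query and the whole document list are passed to the LLM together, and the LLM reorders the documents by relevance.
For documents retrieved in RAG, listwise reranking is recommended because it outperforms pointwise reranking.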
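
The difference between the two approaches is mostly in how the model is called, which the following sketch tries to show. The functions `fake_score` and `fake_order` are hypothetical stand-ins for LLM calls (a real system would prompt a model and parse its output); nothing here is taken from the paper's code.

```python
def pointwise_rerank(query, docs, score_fn):
    """One model call per (query, document) pair; sort by the returned scores."""
    scored = [(score_fn(query, doc), doc) for doc in docs]
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]


def listwise_rerank(query, docs, order_fn):
    """One model call with the whole list; the model returns an index permutation."""
    order = order_fn(query, docs)
    if sorted(order) != list(range(len(docs))):
        raise ValueError("model output is not a valid permutation of the documents")
    return [docs[i] for i in order]


# Hypothetical stand-ins for LLM calls, so that the sketch runs offline.
def fake_score(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))


def fake_order(query, docs):
    return sorted(range(len(docs)), key=lambda i: fake_score(query, docs[i]), reverse=True)


docs = [
    "cats sleep a lot",
    "rag combines retrieval and generation",
    "retrieval finds documents",
]
query = "what is retrieval augmented generation"

pointwise = pointwise_rerank(query, docs, fake_score)
listwise = listwise_rerank(query, docs, fake_order)
print(pointwise)
print(listwise)
```

With a real LLM the listwise call can compare documents against each other, while the pointwise calls score each document in isolation; the permutation check guards against malformed model output.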


 RAG Construction Strategies

 Preprocessing
From Good to Great: How Pre-processing Documents Supercharges AI’s Output[78]
Chinese translation: From Good to Great: How Pre-processing Documents Supercharges AI’s Output[79]

5 Levels Of Text Splitting[80]
Semantic Chunker[81]


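 Retrieval
Foundations of Vector Retrieval[82]
This monograph of more than 200 pages summarizes the major algorithmic milestones in the vector retrieval literature and is intended as a standalone reference for both new and established researchers.

Query Transformations[83]
Chinese translation: Query transformation techniques for LLM-based RAG applications[84]

Query Construction[85]
Chinese translation: Query Construction[86]

Improving Retrieval Performance in RAG Pipelines with Hybrid Search[87]
Chinese translation: Improving retrieval performance in RAG pipelines with hybrid search that fuses traditional keyword search and modern vector search[88]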
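
Hybrid search, as the titles above describe it, merges a keyword ranking with a vector ranking. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below. The choice of RRF, the constant `k=60`, and the document IDs are assumptions for illustration and are not taken from the linked article.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a keyword (e.g. BM25) search and a vector search.
keyword_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

RRF needs only ranks, not raw scores, which avoids having to normalize keyword scores and vector similarities onto one scale.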

Multi-Vector Retriever for RAG on tables, text, and images[89]
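Chinese translation: Multi-Vector Retriever for RAG on tables, text, and images[90]

Relevance and ranking in vector search[91]
Chinese translation: Relevance and ranking in vector search[92]

Boosting RAG: Picking the Best Embedding & Reranker models[93]
Chinese translation: Boosting RAG: Picking the Best Embedding & Reranker models[94]

Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities[95]
Chinese translation: Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities[96]

Optimizing Retrieval Augmentation with Dynamic Top-K Tuning for Efficient Question Answering[97]
Chinese translation: Optimizing Retrieval Augmentation with Dynamic Top-K Tuning for Efficient Question Answering[98]

Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval[99]
Chinese translation: Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval[100]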


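 Post-Retrieval Processing

 Reranking
RankGPT Reranker Demonstration[101]

 Context (Prompt) Compression
How to Cut RAG Costs by 80% Using Prompt Compression[102]
Chinese translation: How to Cut RAG Costs by 80% Using Prompt Compression[102]
The first compression method is AutoCompressors. It summarizes long text into short vector representations called summary vectors. These compressed summary vectors then serve as soft prompts for the model.

LangChain Contextual Compression[103]

 Other
Bridging the rift in Retrieval Augmented Generation[104]
Bridging the rift in Retrieval Augmented Generation: rather than directly fine-tuning underperforming base modules such as the retriever or the language model, this approach introduces an intermediate bridge module that sits between the existing components. Related techniques include ranking, compression, context frameworks, conditional reasoning scaffolds, and interactive questioning (see the follow-up papers).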


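 Evaluation
Evaluating RAG Applications with RAGAs[105]
Chinese translation: Evaluating RAG (Retrieval-Augmented Generation) Applications with RAGAs (Retrieval-Augmented Generation Assessment)[106]

Best Practices for LLM Evaluation of RAG Applications[107]
Chinese translation: Best Practices for LLM Evaluation of RAG Applications[108]

Exploring End-to-End Evaluation of RAG Pipelines[109]
Chinese translation: Exploring End-to-End Evaluation of RAG Pipelines[110]

Evaluating Multi-Modal Retrieval-Augmented Generation[111]
Chinese translation: Evaluating Multi-Modal Retrieval-Augmented Generation[112]

RAG Evaluation[113]
Chinese translation: RAG Evaluation[114]

Evaluation - LlamaIndex[115]
RAG faithfulness of different models at different data scales
Faithfulness of different models with and without RAG (internal knowledge only)
RAG faithfulness of different models after combining internal and external knowledge
RAG answer relevance of different models
Chinese translation: Evaluation - LlamaIndex[116]
RAG Evaluation by Pinecone[117]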

zilliz: Optimizing RAG Applications: A Guide to Methodologies, Metrics, and Evaluation Tools for Enhanced Reliability[118]

 Practice
Practice[119]

 Hallucination
Let’s Talk About LLM Hallucinations[120]
Chinese translation: Let’s Talk About LLM Hallucinations[121]

 Courses
Short course: Building and Evaluating Advanced RAG Applications[122]
Retrieval Augmented Generation for Production with LangChain & LlamaIndex[123]

 Videos
A Survey of Techniques for Maximizing LLM Performance[124]
How do domain-specific chatbots work? An overview of retrieval augmented generation (RAG)[125]
Text version[126]

nvidia: Augmenting LLMs Using Retrieval Augmented Generation[127]
How to Choose a Vector Database[128]

 Other
Lessons Learned from Building an Enterprise AI Assistant[129]
Chinese translation: How to Build an Enterprise AI Assistant[130]

Large Language Model (LLM) Disruption of Chatbots[131]
Chinese translation: The Impact of Large Language Models (LLMs) on Chatbots[132]

Gen AI: why does simple Retrieval Augmented Generation (RAG) not work for insurance?[133]
Chinese translation: Generative AI: Why Does RAG Not Work for the Insurance Industry?[134]

How OpenAI Optimizes LLM Performance[135]
End-to-End LLMOps Platform[136]
End-to-End LLMOps Platform[136]

 Reference Links
[1] Paper: Retrieval-Augmented Generation for Large Language Models: A Survey: https://arxiv.org/abs/2312.10997
[2] Chinese translation: Retrieval-Augmented Generation for Large Language Models: A Survey: https://baoyu.io/translations/ai-paper/2312.10997-retrieval-augmented-generation-for-large-language-models-a-survey
[3] GitHub repository: https://github.com/Tongji-KGLLM/RAG-Survey/tree/main
[4] Advanced RAG Techniques: an Illustrated Overview: https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6
[5] Chinese translation: Advanced RAG Techniques: an Illustrated Overview: https://baoyu.io/translations/rag/advanced-rag-techniques-an-illustrated-overview
[6] A Cheat Sheet and Some Recipes for Building Advanced RAG: https://blog.llamaindex.ai/a-cheat-sheet-and-some-recipes-for-building-advanced-rag-803a9d94c41b
[7] Patterns for Building LLM-based Systems & Products: https://eugeneyan.com/writing/llm-patterns/
[8] Chinese translation: Patterns for Building LLM-based Systems & Products: https://tczjw7bsp1.feishu.cn/docx/Z6vvdyAdXou7XmxuXt2cigZUnTb?from=from_copylink
[9] RAG Primer: https://aman.ai/primers/ai/RAG/
[10] Chinese translation: https://tczjw7bsp1.feishu.cn/docx/GfwOd3rASo6lI4xoFsycUiz8nhg
[11] Microsoft: Retrieval Augmented Generation (RAG) in Azure AI Search: https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
[12] Chinese translation: Microsoft, Retrieval Augmented Generation (RAG) in Azure AI Search: https://tczjw7bsp1.feishu.cn/docx/JJ7ldrO4Zokjq7xZIJcc5IZjnFh?from=from_copylink
[13] Azure OpenAI Design Patterns: RAG: https://github.com/microsoft/azure-openai-design-patterns/tree/main/patterns/03-retrieval-augmented-generation
[14] IBM: What is retrieval-augmented generation?: https://research.ibm.com/blog/retrieval-augmented-generation-RAG
[15] Chinese translation: IBM, What is retrieval-augmented generation?: https://tczjw7bsp1.feishu.cn/wiki/OMUVwsxlSiqjj4k4YkicUQbcnDg?from=from_copylink
[16] Amazon: Retrieval Augmented Generation (RAG): https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html
[17] NVIDIA: What Is Retrieval-Augmented Generation?: https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/?ncid=so-twit-174237&=&linkId=100000226744098
[18] Chinese translation: NVIDIA, What Is Retrieval-Augmented Generation?: https://tczjw7bsp1.feishu.cn/docx/V6ysdAewzoflhmxJDwTcahZCnYI?from=from_copylink
[19] Meta: Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models: https://ai.meta.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models/
[20] Chinese translation: Meta, Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models: https://tczjw7bsp1.feishu.cn/wiki/TsL8wAsbtiLfDmk1wFJcQsiGnQb?from=from_copylink
[21] Cohere: Introducing Chat with Retrieval-Augmented Generation (RAG): https://txt.cohere.com/chat-with-rag/
[22] Pinecone: Retrieval-Augmented Generation: https://www.pinecone.io/learn/series/rag/
[23] Milvus: Building AI Apps with Retrieval-Augmented Generation (RAG): https://zilliz.com/learn/Retrieval-Augmented-Generation?utm_source=twitter&utm_medium=social&utm_term=zilliz
[24] Knowledge Retrieval Takes Center Stage: https://towardsdatascience.com/knowledge-retrieval-takes-center-stage-183be733c6e8
[25] Chinese translation: Knowledge Retrieval Takes Center Stage: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[26] Disadvantages of RAG: https://medium.com/@kelvin.lu.au/disadvantages-of-rag-5024692f2c53
[27] Chinese translation: Disadvantages of RAG: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[28] Retrieval-Augmented Generation (RAG) or Fine-tuning: Which Is the Best Tool to Boost Your LLM Application?: https://www.linkedin.com/pulse/retrieval-augmented-generation-rag-fine-tuning-which-best-victoria-s-
[29] Chinese translation: RAG or Fine-tuning: Which Is the Best Tool to Optimize Your LLM Application?: https://tczjw7bsp1.feishu.cn/wiki/TEtHwkclWirBwqkWeddcY8HXnZf?chunked=false
[30] Prompt Engineering, RAG, and Fine-tuning Compared: https://github.com/lizhe2004/Awesome-LLM-RAG-Application/blob/main/Prompting-RAGs-Fine-tuning.md
[31] RAG vs Finetuning: Which Is the Best Tool to Boost Your LLM Application?: https://webcache.googleusercontent.com/search?q=cache:https://towardsdatascience.com/rag-vs-finetuning-which-is-the-best-tool-to-boost-your-llm-application-94654b1eaba7
[32] Chinese translation: RAG vs Fine-tuning: Which Is the Best Tool to Optimize Your LLM Application?: https://tczjw7bsp1.feishu.cn/wiki/Cs9ywwzJSiFrg9kX2r1ch4Nxnth
[33] A Survey on In-context Learning: https://arxiv.org/abs/2301.00234
[34] Kimi Chat: https://kimi.moonshot.cn/
[35] GPTs: https://chat.openai.com/gpts/mine
[36] Baichuan Knowledge Base: https://platform.baichuan-ai.com/knowledge
[37] COZE: https://www.coze.com/
[38] Devv-ai: https://devv.ai/zh
[39] LangChain: https://github.com/langchain-ai/langchain/
[40] langchain4j: https://github.com/langchain4j/langchain4j
[41] LlamaIndex: https://github.com/run-llama/llama_index/
[42] GPT-RAG: https://github.com/Azure/GPT-RAG
[43] QAnything: https://github.com/netease-youdao/QAnything/tree/master
[44] Quivr: https://github.com/StanGirard/quivr
[45] Quivr: https://www.quivr.app/chat
[46] Dify: https://github.com/langgenius/dify
[47] Verba: https://github.com/weaviate/Verba
[48] danswer: https://github.com/danswer-ai/danswer
[49] Unstructured: https://github.com/Unstructured-IO/unstructured
[50] semantic-router: https://github.com/aurelio-labs/semantic-router
[51] ragas: https://github.com/explodinggradients/ragas?tab=readme-ov-file
[52] tonic_validate: https://github.com/TonicAI/tonic_validate
[53] deepeval: https://github.com/confident-ai/deepeval
[54] trulens: https://github.com/truera/trulens
[55] langchain-evaluation: https://python.langchain.com/docs/guides/evaluation/
[56] Llamaindex-evaluation: https://docs.llamaindex.ai/en/stable/optimizing/evaluation/evaluation.html
[57] BCEmbedding: https://github.com/netease-youdao/BCEmbedding/tree/master
[58] BGE-Embedding: https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding
[59] bge-reranker-large: https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker
[60] gte-base-zh: https://modelscope.cn/models/iic/nlp_gte_sentence-embedding_chinese-base/summary
[61] YiVal: https://github.com/YiVal/YiVal
[62] vanna: https://github.com/vanna-ai/vanna
[63] OpenLLM:
[64] llamaindex-observability: https://docs.llamaindex.ai/en/stable/module_guides/observability/observability.html
[65] langfuse: https://github.com/langfuse/langfuse
[66] phoenix: https://github.com/Arize-ai/phoenix
[67] openllmetry: https://github.com/traceloop/openllmetry
[68] lunary: https://lunary.ai/
[69] RAGxplorer: https://github.com/gabrielchua/RAGxplorer
[70] Retrieval Augmented Generation: Streamlining the creation of intelligent natural language processing models: https://ai.meta.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models/
[71] Lost in the Middle: How Language Models Use Long Contexts: https://arxiv.org/abs/2307.03172
[72] Paper: Seven Failure Points When Engineering a Retrieval Augmented Generation System: https://arxiv.org/abs/2401.05856
[73] Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents: https://arxiv.org/abs/2304.09542
[74] RankGPT Reranker Demonstration (Van Gogh Wiki): https://github.com/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/rankGPT.ipynb
[75] Bridging the Preference Gap between Retrievers and LLMs: https://arxiv.org/abs/2401.06954
[76] Tuning Language Models by Proxy: https://arxiv.org/abs/2401.08565
[77] Zero-Shot Listwise Document Reranking with a Large Language Model: https://arxiv.org/pdf/2305.02156.pdf
[78] From Good to Great: How Pre-processing Documents Supercharges AI’s Output: https://towardsdatascience.com/from-good-to-great-how-pre-processing-documents-supercharges-ais-output-2c0c8b0c5c4a
[79] Chinese translation: From Good to Great: How Pre-processing Documents Supercharges AI’s Output: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[80] 5 Levels Of Text Splitting: https://towardsdatascience.com/5-levels-of-text-splitting-3c8f8e7d4c8f
[81] Semantic Chunker: https://github.com/your-repo/semantic-chunker
[82] Foundations of Vector Retrieval: https://arxiv.org/abs/2301.00001
[83] Query Transformations: https://arxiv.org/abs/2301.00002
[84] Chinese translation: Query transformation techniques for LLM-based RAG applications: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[85] Query Construction: https://arxiv.org/abs/2301.00003
[86] Chinese translation: Query Construction: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[87] Improving Retrieval Performance in RAG Pipelines with Hybrid Search: https://arxiv.org/abs/2301.00004
[88] Chinese translation: Improving retrieval performance in RAG pipelines with hybrid search that fuses traditional keyword search and modern vector search: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[89] Multi-Vector Retriever for RAG on tables, text, and images: https://arxiv.org/abs/2301.00005
[90] Chinese translation: Multi-Vector Retriever for RAG on tables, text, and images: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[91] Relevance and ranking in vector search: https://arxiv.org/abs/2301.00006
[92] Chinese translation: Relevance and ranking in vector search: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[93] Boosting RAG: Picking the Best Embedding & Reranker models: https://arxiv.org/abs/2301.00007
[94] Chinese translation: Boosting RAG: Picking the Best Embedding & Reranker models: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[95] Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities: https://learn.microsoft.com/en-us/azure/search/search-howto-azure-cognitive-search
[96] Chinese translation: Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[97] Optimizing Retrieval Augmentation with Dynamic Top-K Tuning for Efficient Question Answering: https://arxiv.org/abs/2301.00008
[98] Chinese translation: Optimizing Retrieval Augmentation with Dynamic Top-K Tuning for Efficient Question Answering: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[99] Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval: https://arxiv.org/abs/2301.00009
[100] Chinese translation: Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink

[101] RankGPT Reranker Demonstration: https://github.com/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/rankGPT.ipynb
[102] How to Cut RAG Costs by 80% Using Prompt Compression: https://towardsdatascience.com/how-to-cut-rag-costs-by-80-using-prompt-compression-2c0c8b0c5c4a
[102] Chinese translation: How to Cut RAG Costs by 80% Using Prompt Compression: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[103] LangChain Contextual Compression: https://github.com/langchain-ai/langchain/tree/main/langchain/compression
[104] Bridging the rift in Retrieval Augmented Generation: https://arxiv.org/abs/2301.00010
[105] Evaluating RAG Applications with RAGAs: https://arxiv.org/abs/2301.00011
[106] Chinese translation: Evaluating RAG (Retrieval-Augmented Generation) Applications with RAGAs (Retrieval-Augmented Generation Assessment): https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[107] Best Practices for LLM Evaluation of RAG Applications: https://arxiv.org/abs/2301.00012
[108] Chinese translation: Best Practices for LLM Evaluation of RAG Applications: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[109] Exploring End-to-End Evaluation of RAG Pipelines: https://arxiv.org/abs/2301.00013
[110] Chinese translation: Exploring End-to-End Evaluation of RAG Pipelines: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[111] Evaluating Multi-Modal Retrieval-Augmented Generation: https://arxiv.org/abs/2301.00014
[112] Chinese translation: Evaluating Multi-Modal Retrieval-Augmented Generation: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcLkZZn8d?from=from_copylink
[113] RAG Evaluation: https://arxiv.org/abs/2301.00015
[114] Chinese translation: RAG Evaluation: https://tczjw7bsp1.feishu.cn/docx/UZCCdKmLEo7VHQxWPdNcGzICnEd?from=from_copylink
[115] Evaluation - LlamaIndex: https://docs.llamaindex.ai/en/stable/optimizing/evaluation/evaluation.html
RAG faithfulness of different models at different data scales: https://tczjw7bsp1.feishu.cn/docx/VELQdaizVoknrrxND3jcL