Trying out the ragas tutorial


Installing the libraries
This time I use Bedrock as the LLM provider.
pip install ragas
pip install langchain-aws
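
Tutorial code can behave differently across library versions, so it may be worth recording what actually got installed. A minimal sketch using only the standard library; the helper name `installed_version` is my own and not part of ragas or langchain-aws:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the two packages installed above (None if missing).
for name in ("ragas", "langchain-aws"):
    print(name, installed_version(name))
```

Noting these versions next to the results makes it easier to compare against the tutorial later.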

As far as I can tell, this checks whether the response is a correct summary of the input.
from langchain_aws import ChatBedrockConverse
from langchain_aws import BedrockEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas import SingleTurnSample
from ragas.metrics import AspectCritic
import asyncio

config = {
    "credentials_profile_name": "your-profile-name",  # E.g "default"
    "region_name": "your-region-name",  # E.g. "us-east-1"
    "llm": "your-llm-model-id",  # E.g "anthropic.claude-3-5-sonnet-20241022-v2:0"
    "embeddings": "your-embedding-model-id",  # E.g "amazon.titan-embed-text-v2:0"
    "temperature": 0.4,
}

# LLM that acts as the judge
evaluator_llm = LangchainLLMWrapper(ChatBedrockConverse(
    credentials_profile_name=config["credentials_profile_name"],
    region_name=config["region_name"],
    base_url=f"https://bedrock-runtime.{config['region_name']}.amazonaws.com",
    model=config["llm"],
    temperature=config["temperature"],
))

# Embeddings wrapper (set up here, but not used by the AspectCritic metric below)
evaluator_embeddings = LangchainEmbeddingsWrapper(BedrockEmbeddings(
    credentials_profile_name=config["credentials_profile_name"],
    region_name=config["region_name"],
    model_id=config["embeddings"],
))


async def main():
    # AspectCritic: evaluates whether the summary is accurate
    metric = AspectCritic(
        name="summary_accuracy",
        llm=evaluator_llm,
        definition="Verify if the summary is accurate."
    )
    test_data = SingleTurnSample(
        # "Tanaka-san lives in Nagatacho, Chiyoda-ku, Tokyo."
        user_input="田中さんは東京都千代田区永田町に住んでいます。",
        # "Nagatacho, Chiyoda-ku, Tokyo"
        response="東京都千代田区永田町",
    )
    # Score a single sample and print the verdict
    res = await metric.single_turn_ascore(test_data)
    print(res)


if __name__ == "__main__":
    asyncio.run(main())
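
To score several samples in one run, the same `single_turn_ascore` coroutine can be awaited concurrently with `asyncio.gather`. The sketch below only illustrates that async pattern. `StubMetric` and `StubSample` are hypothetical stand-ins so the snippet runs without Bedrock credentials, and the substring check is a toy rule, not how `AspectCritic` judges a summary. With the real setup you would pass the actual `metric` and `SingleTurnSample` objects instead.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubSample:
    # Hypothetical stand-in for ragas' SingleTurnSample (same two fields).
    user_input: str
    response: str


class StubMetric:
    # Hypothetical stand-in for AspectCritic; exposes the same coroutine name.
    async def single_turn_ascore(self, sample: StubSample) -> int:
        await asyncio.sleep(0)  # yield control, as a real LLM call would
        # Toy rule for illustration only: 1 if the response appears in the input.
        return int(sample.response in sample.user_input)


async def score_all(metric, samples):
    # Schedule one scoring coroutine per sample and wait for all of them.
    # gather returns the results in the same order as the inputs.
    return await asyncio.gather(*(metric.single_turn_ascore(s) for s in samples))


samples = [
    StubSample("Tanaka lives in Nagatacho, Chiyoda, Tokyo.", "Nagatacho, Chiyoda, Tokyo"),
    StubSample("Tanaka lives in Nagatacho, Chiyoda, Tokyo.", "Osaka"),
]
scores = asyncio.run(score_all(StubMetric(), samples))
print(scores)  # [1, 0]
```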

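Next, run an actual evaluation using a dataset from Hugging Face.
It contains 50 samples, each with two fields:
- user_input: the input provided to the application
- response: the output generated by the application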
from datasets import load_dataset
from ragas import EvaluationDataset
eval_dataset = load_dataset("explodinggradients/earning_report_summary", split="train")
eval_dataset = EvaluationDataset.from_hf_dataset(eval_dataset)
print("Features in dataset:", eval_dataset.features())
print("Total samples in dataset:", len(eval_dataset))
There are indeed 50 samples.
Features in dataset: ['user_input', 'response']
Total samples in dataset: 50

Now evaluate the dataset loaded above.
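from ragas import evaluate

# `metric` is the AspectCritic instance from the earlier snippet; it must be in scope here
# (for example, defined at module level rather than inside main()).
results = evaluate(eval_dataset, metrics=[metric])
results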
In the tutorial's example, summary_accuracy apparently comes out to 0.84, but here I got:
{'summary_accuracy': 1.0000}
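
To put the two numbers side by side: assuming the aggregate is simply the mean of per-sample 0/1 verdicts (that is my reading of the output, not something confirmed here), 0.84 over 50 samples would correspond to 42 passes and 8 failures, while 1.0 means all 50 passed. A small sketch of that arithmetic, with `mean_score` as a hypothetical helper:

```python
def mean_score(verdicts):
    """Average a list of binary (0/1) verdicts."""
    return sum(verdicts) / len(verdicts)


# Illustrative verdict lists, constructed only to reproduce the two aggregates.
tutorial_like = [1] * 42 + [0] * 8  # 42 of 50 pass
all_pass = [1] * 50                 # every sample passes

print(mean_score(tutorial_like))  # 0.84
print(mean_score(all_pass))       # 1.0
```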
I printed the per-sample results, and every score is 1.
user_input response summary_accuracy
0 summarise given text\nThe Q2 earnings report r... The Q2 earnings report showed a 15% revenue in... 1
1 summarise given text\nIn 2023, North American ... Companies are strategizing to adapt to market ... 1
2 summarise given text\nIn 2022, European expans... Many companies experienced a notable 15% growt... 1
3 summarise given text\nSupply chain challenges ... Supply chain challenges in North America, caus... 1
4 summarise given text\nIn Q2 2023, the company ... The company experienced a notable increase in ... 1
5 summarise given text\nIn 2023, marketing campa... In 2023, marketing campaigns in North America ... 1
6 summarise given text\nThe company's internatio... The company's international expansion strategy... 1
7 summarise given text\nIn 2024, companies are i... Companies are using data analytics to customiz... 1
8 summarise given text\nIn 2023, logistics inves... Driven by technological and infrastructural ad... 1
9 summarise given text\nIn 2023, the company exp... The company faced challenges due to competitio... 1
10 summarise given text\nThe company reported a 5... The company faced challenges in the European m... 1
11 summarise given text\nThe company reported a s... The company's significant profit in Q3 2024, d... 1
12 summarise given text\nThe global market has ex... The recent downturn has raised concerns among ... 1
13 summarise given text\nThe logistics industry i... The industry is expected to grow by 20% in 202... 1
14 summarise given text\nThe company reported an ... The company experienced an 8% increase in Q3 2... 1
15 summarise given text\nIn 2022, the Asian marke... In 2022, the Asian market experienced a signif... 1
16 summarise given text\nThe global market has wi... The global market experienced a 10% increase i... 1
17 summarise given text\nThe company reported a 1... The company experienced significant growth due... 1
18 summarise given text\nThe company's revenue sa... The company's financial success was significan... 1
19 summarise given text\nThe Marketing team is st... The team is strategizing to address the challe... 1
20 summarise given text\nIn 2023, the global mark... In 2023, the global market saw a 5% sales decr... 1
21 summarise given text\nIn 2022, there was an 8%... Economic factors led to increased expenses, pr... 1
22 summarise given text\nIn 2022, the global mark... In 2022, a remarkable 20% growth significantly... 1
23 summarise given text\nIn 2022, the European ma... In 2022, the European market experienced a 5% ... 1
24 summarise given text\nIn 2022, companies opera... In 2022, companies in Latin America faced a 15... 1
25 summarise given text\nIn 2023, the European ma... In 2023, the European market experienced a 15%... 1
26 summarise given text\nThe global market is poi... A significant shift is expected with a 20% gro... 1
27 summarise given text\nThe company reported a s... The company's 8% rise in 2022 was driven by ex... 1
28 summarise given text\nThe logistics industry i... The logistics industry is expected to face a 5... 1
29 summarise given text\nThe company is facing a ... The company is anticipating a difficult year d... 1
30 summarise given text\nThe company is projectin... The company's growth is fueled by strategic in... 1
31 summarise given text\nIn 2023, the Asian marke... In 2023, the Asian market experienced a 10% in... 1
32 summarise given text\nThe global market has wi... The global market experienced an 8% rise in Q1... 1
33 summarise given text\nIn 2022, there was an 8%... Expenses increased across various sectors in L... 1
34 summarise given text\nIn 2023, companies opera... In 2023, companies in Latin America faced a 15... 1
35 summarise given text\nSales in Latin America e... Sales in Latin America saw a remarkable 20% gr... 1
36 summarise given text\nIn 2022, the company exp... In 2022, the company faced a 5% revenue decrea... 1
37 summarise given text\nIn Q3 2024, the company ... In Q3 2024, the company reported a 15% decline... 1
38 summarise given text\nThe logistics sector is ... The sector is set for major expansion due to r... 1
39 summarise given text\nIn 2022, North America e... A significant economic boost was observed due ... 1
40 summarise given text\nIn 2022, the company exp... The company's financial success was significan... 1
41 summarise given text\nThe company is preparing... The company is planning to address a projected... 1
42 summarise given text\nThe European market is p... The European market's projected 8% rise in 202... 1
43 summarise given text\nThe logistics sector in ... The logistics sector in Latin America is proje... 1
44 summarise given text\nIn North America, compan... In North America, companies report a 5% decrea... 1
45 summarise given text\nIn 2023, the company exp... In 2023, the company faced a 5% revenue decrea... 1
46 summarise given text\nIn Q3 2024, the company ... In Q3 2024, the company reported an 8% rise in... 1
47 summarise given text\nThe European market expe... The European market's 5% decrease in Q3 2024 h... 1
48 summarise given text\nIn 2022, Sales in North ... A remarkable increase was achieved through str... 1
49 summarise given text\nThe logistics sector exp... The logistics sector underwent a major transfo... 1
The tutorial is only showing that the results can be inspected this way, so I'll accept this outcome.
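
When the scores are not all 1, the interesting rows are the ones that failed. A minimal sketch of pulling those out, assuming the per-sample results are available as a list of dicts with the same three columns as the table above (with pandas that could presumably be `results.to_pandas().to_dict("records")`, which I have not verified here). `failing_rows` is a hypothetical helper and the rows are made-up placeholders, not the scores reported above.

```python
def failing_rows(rows, column="summary_accuracy"):
    """Return (index, row) pairs whose score in `column` is not 1."""
    return [(i, row) for i, row in enumerate(rows) if row[column] != 1]


# Made-up placeholder rows, NOT the results from this run.
rows = [
    {"user_input": "text A", "response": "summary A", "summary_accuracy": 1},
    {"user_input": "text B", "response": "summary B", "summary_accuracy": 0},
    {"user_input": "text C", "response": "summary C", "summary_accuracy": 1},
]

# List the index and response of each sample the judge rejected.
for i, row in failing_rows(rows):
    print(i, row["response"])
```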

That completes the tutorial.
In a separate scrap, I'll try evaluating with ragas on real data.