iTranslated by AI
Implementing Reciprocal Rank Fusion (RRF) in C#
Introduction
In Azure AI Search, hybrid search allows you to combine the results of traditional full-text search and vector search to return better results. A mechanism called Reciprocal Rank Fusion (RRF) is used to merge the results scored by these different algorithms. The following documentation provides a clear explanation of RRF.
To explain it more simply, it works as follows (let's set aside the coefficient k for a moment):
- Sort each list by the order of their scores.
- For each item in the list, calculate the value of 1 divided by its rank. The results will be values like 1, 1/2 (0.5), 1/3 (0.333...).
- If the same item appears in multiple lists, sum the calculated values.
- Sort the combined list by the summed values.
If you want to implement hybrid search outside of Azure AI Search, you need to implement RRF yourself.
Sample Code
The code is as follows:
// Assume the tuple consists of key, score, and document
public static IEnumerable<Tuple<string, double, T>> Fuse<T>(IEnumerable<IEnumerable<Tuple<string, double, T>>> rankings, double k = 60)
{
var result = new List<Tuple<string, double, T>>();
foreach (var ranking in rankings)
{
// Sort the list by the order of their scores
var ordered = ranking.OrderByDescending(x => x.Item2).Select((value, order) => Tuple.Create(value, order + 1.0));
foreach (var (value, order) in ordered)
{
// Search for the result list by key
var index = result.FindIndex(x => x.Item1 == value.Item1);
if (index < 0)
{
// If the item does not exist, add 1 divided by rank
result.Add(Tuple.Create(value.Item1, 1.0 / (order + k), value.Item3));
}
else
{
// If the item exists, update the current value by adding 1 divided by rank
result[index] = Tuple.Create(result[index].Item1, result[index].Item2 + 1.0 / (order + k), result[index].Item3);
}
}
}
// Sort and return the results
return result.OrderByDescending(x => x.Item2);
}
Executing this yields the following results (when k = 0):
-
Lists before fusion
Rank List 1 List 2 List 3 1 A B C 2 B A A 3 C C B -
Combined list
Rank Name Score Calculation 1 A 2 1 + 0.5 + 0.5 2 B 1.833... 0.5 + 1 + 0.333... 3 C 1.666... 0.333... + 0.333... + 0.5
Conclusion
I had GitHub Copilot write the code, and while it looked correct, the algorithm was flawed, so I ended up rewriting it myself.
Discussion