iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
⚙️

Implementing Reciprocal Rank Fusion (RRF) in C#

に公開

Introduction

In Azure AI Search, hybrid search allows you to combine the results of traditional full-text search and vector search to return better results. A mechanism called Reciprocal Rank Fusion (RRF) is used to merge the results scored by these different algorithms. The following documentation provides a clear explanation of RRF.

https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking?WT.mc_id=M365-MVP-5002941

To explain it more simply, it works as follows (let's set aside the coefficient k for a moment):

  • Sort each list by the order of their scores.
  • For each item in the list, calculate the value of 1 divided by its rank. The results will be values like 1, 1/2 (0.5), 1/3 (0.333...).
  • If the same item appears in multiple lists, sum the calculated values.
  • Sort the combined list by the summed values.

If you want to implement hybrid search outside of Azure AI Search, you need to implement RRF yourself.

Sample Code

https://github.com/karamem0/samples/tree/main/csharp-reciprocal-rank-fusion

The code is as follows:

// Assume the tuple consists of key, score, and document
public static IEnumerable<Tuple<string, double, T>> Fuse<T>(IEnumerable<IEnumerable<Tuple<string, double, T>>> rankings, double k = 60)
{
    var result = new List<Tuple<string, double, T>>();

    foreach (var ranking in rankings)
    {
        // Sort the list by the order of their scores
        var ordered = ranking.OrderByDescending(x => x.Item2).Select((value, order) => Tuple.Create(value, order + 1.0));
        foreach (var (value, order) in ordered)
        {
            // Search for the result list by key
            var index = result.FindIndex(x => x.Item1 == value.Item1);
            if (index < 0)
            {
                // If the item does not exist, add 1 divided by rank
                result.Add(Tuple.Create(value.Item1, 1.0 / (order + k), value.Item3));
            }
            else
            {
                // If the item exists, update the current value by adding 1 divided by rank
                result[index] = Tuple.Create(result[index].Item1, result[index].Item2 + 1.0 / (order + k), result[index].Item3);
            }
        }
    }

    // Sort and return the results
    return result.OrderByDescending(x => x.Item2);
}

Executing this yields the following results (when k = 0):

  • Lists before fusion

    Rank List 1 List 2 List 3
    1 A B C
    2 B A A
    3 C C B
  • Combined list

    Rank Name Score Calculation
    1 A 2 1 + 0.5 + 0.5
    2 B 1.833... 0.5 + 1 + 0.333...
    3 C 1.666... 0.333... + 0.333... + 0.5

Conclusion

I had GitHub Copilot write the code, and while it looked correct, the algorithm was flawed, so I ended up rewriting it myself.

Discussion