🐍

Azure Cognitive Search のベクトル検索を Python で行う

2023/07/09に公開

はじめに

2023 年 7 月 に Azure Cognitive Search のベクトル検索のパブリックプレビューが開始しました。これに合わせて Azrue Cognitive Search のドキュメントにベクトル検索のクイックスタートが追加されましたが、執筆時点で REST API のサンプルしか記載されていませんでした。そこで、Pythonista のために同じことを Python で行うための一連の方法をまとめました。
https://learn.microsoft.com/en-us/azure/search/whats-new#june-2023

なお、本記事の内容はほぼ Azure Cognitive Search の公式サンプルコードを参考にクイックスタートと同じことをやってみただけの内容になっています。元ネタが気になる方は公式の Python サンプルコードを直接ご覧ください。

【追記】2023 年 11 月にベクトル検索とセマンティック検索が一般提供 (GA) になりました。また、Azure Cognitive Search は Azure AI Search、セマンティック検索 (Semantic Search) はセマンティックランク付け (Semantic Ranker) に名称が変更されました。
https://learn.microsoft.com/ja-jp/azure/search/whats-new#november-2023

方法

1. Python ライブラリ

Python 実行環境に以下のライブラリをインストールします。

  • Azrue Cognitive Search SDK
  • azure-identity (Azrue Cognitive Search に認証情報を与える際に使用)
  • openai (Embedding に使用)
pip install azure-search-documents
pip install azure-identity
pip install openai

参考

2.1. Azure Cognitive Search リソース作成

Azure Cognitive Search のリソースを作成します。今回は、ベクトル検索と同じくプレビュー中の機能であるセマンティック検索を使うため、価格レベルは S (Standard) 以上を選択します。また、セマンティック検索に対応しているリージョンを選択します。

参考

2.2. セマンティック検索の有効化

ドキュメントの手順に従ってセマンティック検索を有効化します。

参考

3. Azrue OpenAI Service

執筆時点で Azure Cognitive Search は テキストの Embedding (ベクトル化) を行う機能を持っていません。本記事では Azure OpenAI Service を使って Embedding を行います。

Vector data (text embeddings) is used for vector search. Currently, Cognitive Search doesn't generate vectors for you.
ベクトル検索にはベクトルデータ(テキスト埋め込み)を使用します。現在、Cognitive Search はベクトルを生成しません。

【追記】2023 年 11 月にベクトル化とチャンク化を Azure Cognitive Search のインデクサーで行う機能のプレビューが開始しました。

参考

3.1. Azure OpenAI Service リソース作成

Azure OpenAI Service を使うには利用申請が必要です。利用申請が承認されたら、ドキュメントの手順に従ってリソースを作成します。

参考

3.2. Embedding モデルデプロイ

執筆時点での推奨モデルである text-embedding-ada-002 (Version 2) を選択してデプロイを行います。

参考

4. サンプルデータダウンロード

Azure Cognitive Search サンプルコードの中で使われているこちらのファイルをダウンロードして任意のフォルダに移動しておきます。

5. インデックス作成

5.1. 環境変数

Azrue Cognitive Search と Azrue OpenAI Service のエンドポイントと API キーを環境変数に登録しておきます。

import os

os.environ['SEARCH_ENDPOINT'] = 'https://<your-cognitive-search-resource-name>.search.windows.net'
os.environ['SEARCH_API_KEY'] = '<your-cognitive-search-api-key>'
os.environ['OPENAI_API_BASE'] = 'https://<your-aoai-resource-name>.openai.azure.com/'
os.environ['OPENAI_API_KEY'] = '<your-aoai-api-key>'

参考

5.2. 下準備

以下のコードでサンプルデータの読み込みからベクトル化までを行います。以下のように値を置き換えて実行します。

  • <your-data-folder>: サンプルデータを配置したフォルダのパス
  • <your-index-name>: 任意のインデックス名
  • <your-deployment-name>: text-embedding-ada-002 のデプロイ名
import os
import json
import openai
from azure.core.credentials import AzureKeyCredential  
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.models import Vector
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    SearchFieldDataType,
    SimpleField,
    SearchableField,
    SearchIndex,
    SemanticConfiguration,
    PrioritizedFields,
    SemanticField,
    SearchField,
    SemanticSettings,
    VectorSearch,
    VectorSearchAlgorithmConfiguration,
)

search_endpoint = os.environ['SEARCH_ENDPOINT']
search_credential = AzureKeyCredential(os.environ['SEARCH_API_KEY'])
index_name = '<your-index-name>'
openai.api_type = 'azure'  
openai.api_key = os.getenv('OPENAI_API_KEY')  
openai.api_base = os.getenv('OPENAI_API_BASE')  
openai.api_version = '2023-05-15'

with open('<your-data-folder>/text-sample.json', 'r', encoding='utf-8') as file:
    documents = json.load(file)

def generate_embeddings(text):
    response = openai.Embedding.create(
        input=text,
        engine='<your-deployment-name>'  # text-embedding-ada-002 のデプロイ名
    )
    embeddings = response['data'][0]['embedding']
    return embeddings

for document in documents:
    title = document['title']
    content = document['content']
    title_embeddings = generate_embeddings(title)
    content_embeddings = generate_embeddings(content)
    document['titleVector'] = title_embeddings
    document['contentVector'] = content_embeddings

# データをファイルに保存
with open('<your-data-folder>/text-sample-with-vector.json', 'w') as f:
    json.dump(documents, f)

[参考] 保存したデータの読み込み

with open('<your-data-folder>/text-sample-with-vector.json', 'r') as file:
    documents = json.load(file)

5.3. インデックス作成

ベクトルフィールドを持つインデックスを作成します。クイックスタートの インデックスを作成する に相当します。

なお、ベクトル探索の探索アルゴリズムは執筆時点で Hierarchical Navigable Small World (HNSW) のみに対応しています。

The "vectorSearch" object is an array of algorithm configurations used by vector fields. Currently, only HNSW is supported. HNSW is a graph-based Approximate Nearest Neighbors (ANN) algorithm optimized for high-recall, low-latency applications.

index_client = SearchIndexClient(
    endpoint=search_endpoint,
    credential=search_credential
)

fields = [
    SimpleField(
        name='id',
        type=SearchFieldDataType.String,
        key=True
    ),
    SearchableField(
        name='category',
        type=SearchFieldDataType.String,filterable=True,
        searchable=True,
        retrievable=True
    ),
    SearchableField(
        name='title',
        type=SearchFieldDataType.String,searchable=True,
        retrievable=True
    ),
    SearchField(
        name='titleVector',
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        dimensions=1536, # text-embedding-ada-002 の出力の次元数
        vector_search_configuration='vectorConfig'
    ),
    SearchableField(
        name='content',
        type=SearchFieldDataType.String,searchable=True,
        retrievable=True
    ),
    SearchField(
        name='contentVector',
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        dimensions=1536, # text-embedding-ada-002 の出力の次元数
        vector_search_configuration='vectorConfig'
    )
]

vector_search = VectorSearch(
    algorithm_configurations=[
        VectorSearchAlgorithmConfiguration(
            name='vectorConfig',
            kind='hnsw',
        )
    ]
)

semantic_config = SemanticConfiguration(
    name='my-semantic-config',
    prioritized_fields=PrioritizedFields(
        title_field=SemanticField(field_name='title'),
        prioritized_content_fields=[SemanticField(field_name='content')],
        prioritized_keywords_fields=[],
    )
)

semantic_settings = SemanticSettings(configurations=[semantic_config])

index = SearchIndex(
    name=index_name,
    fields=fields,
    vector_search=vector_search,
    semantic_settings=semantic_settings
)

result = index_client.create_or_update_index(index)
print(f" {result.name} created")
<your-index-name> created

参考

[補足] 言語アナライザーの指定

本記事で使用しているサンプルデータは英語であるため、既定の言語アナライザー (en.lucene) を使用しています。フィールドに格納するデータの言語に応じてアナライザーを変更したい場合は以下のように指定します。

Python SDK

index_client = SearchIndexClient(
    endpoint=search_endpoint,
    credential=search_credential
)

fields = [
    SimpleField(
        name='id',
        type=SearchFieldDataType.String,
        key=True
    ),
    SearchableField(
        name='category',
        type=SearchFieldDataType.String,
        filterable=True,
        searchable=True,
        retrievable=True,
        analyzer_name='ja.microsoft' # Microsoft日本語アナライザーを指定
    ),
    SearchableField(
        name='title',
        type=SearchFieldDataType.String,
        searchable=True,
        retrievable=True,
        analyzer_name='ja.microsoft' # Microsoft日本語アナライザーを指定
    ),
    SearchField(
        name='titleVector',
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        dimensions=1536,
        vector_search_configuration='vectorConfig'
    ),
    SearchableField(
        name='content',
        type=SearchFieldDataType.String,
        searchable=True,
        retrievable=True,
        analyzer_name='ja.microsoft' # Microsoft日本語アナライザーを指定
    ),
    SearchField(
        name='contentVector',
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        dimensions=1536,
        vector_search_configuration='vectorConfig'
    )
]

vector_search = VectorSearch(
    algorithm_configurations=[
        VectorSearchAlgorithmConfiguration(
            name='vectorConfig',
            kind='hnsw',
        )
    ]
)

semantic_config = SemanticConfiguration(
    name='my-semantic-config',
    prioritized_fields=PrioritizedFields(
        title_field=SemanticField(field_name='title'),
        prioritized_content_fields=[SemanticField(field_name='content')],
        prioritized_keywords_fields=[],
    )
)

semantic_settings = SemanticSettings(configurations=[semantic_config])

index = SearchIndex(
    name=index_name,
    fields=fields,
    vector_search=vector_search,
    semantic_settings=semantic_settings
)

result = index_client.create_or_update_index(index)
print(f" {result.name} created")

REST API

PUT https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}?api-version=2023-07-01-preview
Content-Type: application/json
api-key: {{admin-api-key}}
{
    "name": "{{index-name}}",
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true,
            "filterable": true
        },
        {
            "name": "category",
            "type": "Edm.String",
            "filterable": true,
            "searchable": true,
            "retrievable": true,
            "analyzer": "ja.microsoft"
        },
        {
            "name": "title",
            "type": "Edm.String",
            "searchable": true,
            "retrievable": true,
            "analyzer": "ja.microsoft"
        },
        {
            "name": "titleVector",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "retrievable": true,
            "dimensions": 1536,
            "vectorSearchConfiguration": "vectorConfig"
        },
        {
            "name": "content",
            "type": "Edm.String",
            "searchable": true,
            "retrievable": true,
            "analyzer": "ja.microsoft"
        },
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "retrievable": true,
            "dimensions": 1536,
            "vectorSearchConfiguration": "vectorConfig"
        }
    ],
    "corsOptions": {
        "allowedOrigins": [
            "*"
        ],
        "maxAgeInSeconds": 60
    },
    "vectorSearch": {
        "algorithmConfigurations": [
            {
                "name": "vectorConfig",
                "kind": "hnsw"
            }
        ]
    },
    "semantic": {
        "configurations": [
            {
                "name": "my-semantic-config",
                "prioritizedFields": {
                    "titleField": {
                        "fieldName": "title"
                    },
                    "prioritizedContentFields": [
                        {
                            "fieldName": "content"
                        }
                    ],
                    "prioritizedKeywordsFields": []
                }
            }
        ]
    }
}

参考

6. ドキュメントアップロード

ベクトルを追加したドキュメントデータをアップロードします。クイックスタートの ドキュメントのアップロード に相当します。

search_client = SearchClient(
    endpoint=search_endpoint,
    index_name=index_name,
    credential=search_credential
)
result = search_client.upload_documents(documents)
print(f'Uploaded {len(documents)} documents')
Uploaded 108 documents

参考

7. 検索

作成したインデックスに対して様々な方法で検索を行います。

参考

7.1. Single vector search (単一ベクトル検索)

クイックスタートの 単一ベクトル検索 に相当します。

この例では "what Azure services support full text search" というテキストをベクトル化したものをクエリとして、単一のベクトルフィールド (contentVector) を対象に検索を行っています。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text='', # ベクトル検索のみ行うためテキストクエリは空
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=5,
        fields='contentVector'
    )
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
Title: Azure Cognitive Search
Score: 0.8807836
Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.
Category: AI + Machine Learning

Title: Azure Cognitive Services
Score: 0.8504988
Content: Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.
Category: AI + Machine Learning

Title: Azure Data Explorer
Score: 0.8435348
Content: Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data. It provides features like ingestion, querying, and visualization. Data Explorer supports various data sources, such as Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. You can use Data Explorer to analyze logs, monitor applications, and gain insights into your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Machine Learning.
Category: Analytics

Title: Azure Cognitive Services
Score: 0.84273267
Content: Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning.
Category: AI + Machine Learning

Title: Azure SQL Managed Instance
Score: 0.84138477
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

7.2. Single vector search with filter (フィルターを使用した単一ベクトル検索)

クイックスタートの フィルターを使用した単一ベクトル検索 に相当します。

この例では 単一フィールドベクトル検索 と合わせて selectfilter を使っています。select により検索結果で返すフィールド (titlecontentcategory) を選択して、さらに filter により category フィールドの値が Database のものに絞り込んでします。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text="", # ベクトル検索のみ行うためテキストクエリは空
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=10,
        fields='contentVector'
    ),
    select=['title', 'content', 'category'],
    filter="category eq 'Databases'"
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
Title: Azure SQL Managed Instance
Score: 0.84138477
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure SQL Database
Score: 0.8405633
Content: Azure SQL Database is a fully managed relational database service based on the latest stable version of Microsoft SQL Server. It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection. SQL Database supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing SQL Server databases to Azure SQL Database with minimal downtime.
Category: Databases

Title: Azure SQL Data Warehouse
Score: 0.8358454
Content: Azure SQL Data Warehouse is a fully managed, petabyte-scale cloud data warehouse service that enables you to store and analyze your structured and semi-structured data. It provides features like automatic scaling, data movement, and integration with Azure Machine Learning. SQL Data Warehouse supports various data sources, such as Azure Blob Storage, Azure Data Lake Storage, and Azure SQL Database. You can use Azure SQL Data Warehouse to build data lakes, develop big data analytics solutions, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Data Factory.
Category: Databases

Title: Azure Database for MySQL
Score: 0.8357488
Content: Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Database Migration Service
Score: 0.83318764
Content: Azure Database Migration Service is a fully managed, end-to-end migration service that enables you to migrate your databases to Azure with minimal downtime. It supports various source and target platforms, such as SQL Server, MySQL, PostgreSQL, and Azure SQL Database. Database Migration Service provides features like schema conversion, data migration, and performance monitoring. You can use Azure Database Migration Service to migrate your databases, ensure the compatibility of your workloads, and minimize your migration risks. It also integrates with other Azure services, such as Azure Migrate and Azure SQL Managed Instance.
Category: Databases

Title: Azure Cosmos DB
Score: 0.83215886
Content: Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for building highly responsive and scalable applications. It offers turnkey global distribution, automatic and instant scalability, and guarantees low latency, high availability, and consistency. Cosmos DB supports popular NoSQL APIs, including MongoDB, Cassandra, Gremlin, and Azure Table Storage. You can build globally distributed applications with ease, without having to deal with complex configuration and capacity planning. Data stored in Cosmos DB is automatically indexed, enabling you to query your data with SQL, JavaScript, or other supported query languages.
Category: Databases

Title: Azure Database for MariaDB
Score: 0.8320385
Content: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Database for PostgreSQL
Score: 0.82775676
Content: Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that enables you to build and manage PostgreSQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for PostgreSQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Cosmos DB
Score: 0.82391936
Content: Azure Cosmos DB is a globally distributed, multi-model database service that enables you to build and manage NoSQL applications in Azure. It provides features like automatic scaling, low-latency access, and multi-master replication. Cosmos DB supports various data models, such as key-value, document, graph, and column-family. You can use Azure Cosmos DB to build globally distributed applications, ensure high availability and performance, and manage your data at scale. It also integrates with other Azure services, such as Azure Functions and Azure App Service.
Category: Databases

Title: Azure Cache for Redis
Score: 0.814518
Content: Azure Cache for Redis is a fully managed, in-memory data store that enables you to build highly responsive and scalable applications. It provides features like data persistence, high availability, and automatic scaling. Cache for Redis supports various data structures, such as strings, lists, sets, and hashes. You can use Azure Cache for Redis to improve the performance of your applications, offload your databases, and reduce the latency of your data access. It also integrates with other Azure services, such as Azure App Service and Azure Kubernetes Service.
Category: Databases

7.3. Cross-field vector search (クロスフィールドベクトル検索)

クイックスタートの クロスフィールド ベクトル検索 に相当します。

この例では "what Azure services support full text search" というテキストをベクトル化したものをクエリとして、複数のフィールド (titleVectorcontentVector) を対象に検索を行っています。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text='', # ベクトル検索のみ行うためテキストクエリは空
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=5,
        fields='titleVector, contentVector'
    )
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
Title: Azure Cognitive Search
Score: 0.03333333507180214
Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.
Category: AI + Machine Learning

Title: Azure Cognitive Services
Score: 0.032522473484277725
Content: Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.
Category: AI + Machine Learning

Title: Azure Cognitive Services
Score: 0.03226646035909653
Content: Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning.
Category: AI + Machine Learning

Title: Azure Data Explorer
Score: 0.016129031777381897
Content: Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data. It provides features like ingestion, querying, and visualization. Data Explorer supports various data sources, such as Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. You can use Data Explorer to analyze logs, monitor applications, and gain insights into your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Machine Learning.
Category: Analytics

Title: Azure App Service
Score: 0.01587301678955555
Content: Azure App Service is a fully managed platform for building, deploying, and scaling web apps. You can host web apps, mobile app backends, and RESTful APIs. It supports a variety of programming languages and frameworks, such as .NET, Java, Node.js, Python, and PHP. The service offers built-in auto-scaling and load balancing capabilities. It also provides integration with other Azure services, such as Azure DevOps, GitHub, and Bitbucket.
Category: Web

Title: Azure SQL Database
Score: 0.015625
Content: Azure SQL Database is a fully managed relational database service based on the latest stable version of Microsoft SQL Server. It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection. SQL Database supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing SQL Server databases to Azure SQL Database with minimal downtime.
Category: Databases

Title: Azure SQL Managed Instance
Score: 0.015625
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

7.4. Multi-query vector search (マルチクエリベクトル検索)

クイックスタートの マルチクエリ ベクトル検索 に相当します。

こちらに関しては Python から行う方法を見つけることができませんでした。テキストベクトルフィールドと画像ベクトルフィールドを持つようなインデックスに対して、テキストベクトルクエリと画像ベクトルクエリを同時に投げるようなマルチモーダルな状況を想定しているようですが、単に別々にリクエストを投げれば良いだけではないかと感じます。

7.5. Hybrid search (ハイブリッド検索)

クイックスタートの ハイブリッド検索 に相当します。

この例では "what Azure services support full text search" というテキストのクエリによる従来のキーワード検索と、単一フィールドベクトル検索 を同時に行うハイブリッド検索を行っています。ハイブリッド検索の結果は Azure Cognitive Search によってランク付けされ、検索ランクが高い順に上位 10 件が返されています。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text=query, # テキストクエリ
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=10,
        fields='contentVector'
    ),
    top=10
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
Title: Azure Cognitive Search
Score: 0.03333333507180214
Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.
Category: AI + Machine Learning

Title: Azure Cognitive Services
Score: 0.032786883413791656
Content: Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.
Category: AI + Machine Learning

Title: Azure Cognitive Services
Score: 0.0314980149269104
Content: Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning.
Category: AI + Machine Learning

Title: Azure SQL Managed Instance
Score: 0.03100961446762085
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Data Catalog
Score: 0.027497194707393646
Content: Azure Data Catalog is a fully managed metadata service that enables you to discover, understand, and use your data sources. It provides features like data asset registration, metadata discovery, and data lineage. Data Catalog supports various data sources, such as Azure SQL Database, Azure Blob Storage, and on-premises file systems. You can use Data Catalog to build a unified data catalog, improve data governance, and streamline your data discovery process. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Data Factory.
Category: Analytics

Title: Azure Time Series Insights
Score: 0.0233423113822937
Content: Azure Time Series Insights is a fully managed analytics, storage, and visualization service that enables you to explore and analyze time-series data. It supports various data sources, such as Azure IoT Hub, Azure Event Hubs, and Azure Blob Storage. Time Series Insights provides features like real-time data streaming, advanced querying, and pattern recognition. You can use Time Series Insights to monitor your IoT devices, detect anomalies, and gain insights into your data. It also integrates with other Azure services, such as Azure Stream Analytics and Azure Machine Learning.
Category: Analytics

Title: Azure Data Explorer
Score: 0.02262253873050213
Content: Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data. It provides features like ingestion, querying, and visualization. Data Explorer supports various data sources, such as Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. You can use Data Explorer to analyze logs, monitor applications, and gain insights into your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Machine Learning.
Category: Analytics

Title: Azure HDInsight
Score: 0.02155519649386406
Content: Azure HDInsight is a fully managed, open-source analytics service for processing big data workloads. It provides popular open-source frameworks, such as Apache Hadoop, Apache Spark, Apache Kafka, and Apache HBase. HDInsight supports various data sources, such as Azure Blob Storage, Azure Data Lake Storage, and Azure Cosmos DB. You can use HDInsight to analyze and process large volumes of data, build real-time analytics solutions, and develop machine learning models. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Machine Learning.
Category: Analytics

Title: Azure SQL Database
Score: 0.021372638642787933
Content: Azure SQL Database is a fully managed relational database service based on the latest stable version of Microsoft SQL Server. It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection. SQL Database supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing SQL Server databases to Azure SQL Database with minimal downtime.
Category: Databases

Title: Azure Table Storage
Score: 0.02129479870200157
Content: Azure Table Storage is a fully managed, NoSQL datastore that enables you to store and query large amounts of structured, non-relational data. It provides features like automatic scaling, schema-less design, and a RESTful API. Table Storage supports various data types, such as strings, numbers, and booleans. You can use Azure Table Storage to store and manage your data, build scalable applications, and reduce the cost of your storage. It also integrates with other Azure services, such as Azure Functions and Azure Cosmos DB.
Category: Storage

参考

7.6. Hybrid search with filter (フィルターを使用したハイブリッド検索)

クイックスタートの フィルターを使用したハイブリッド検索 に相当します。

この例では ハイブリッド検索filter を併用して、検索結果を category フィールドの値が Database のものに絞り込んでします。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text=query, # テキストクエリ
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=10,
        fields='contentVector'
    ),
    filter="category eq 'Databases'",
    top=10
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")
Title: Azure SQL Managed Instance
Score: 0.03279569745063782
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Database for MySQL
Score: 0.03226646035909653
Content: Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Database for MariaDB
Score: 0.03181818127632141
Content: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure SQL Data Warehouse
Score: 0.03151364624500275
Content: Azure SQL Data Warehouse is a fully managed, petabyte-scale cloud data warehouse service that enables you to store and analyze your structured and semi-structured data. It provides features like automatic scaling, data movement, and integration with Azure Machine Learning. SQL Data Warehouse supports various data sources, such as Azure Blob Storage, Azure Data Lake Storage, and Azure SQL Database. You can use Azure SQL Data Warehouse to build data lakes, develop big data analytics solutions, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Data Factory.
Category: Databases

Title: Azure SQL Database
Score: 0.030886195600032806
Content: Azure SQL Database is a fully managed relational database service based on the latest stable version of Microsoft SQL Server. It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection. SQL Database supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing SQL Server databases to Azure SQL Database with minimal downtime.
Category: Databases

Title: Azure Database for PostgreSQL
Score: 0.03079839050769806
Content: Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that enables you to build and manage PostgreSQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for PostgreSQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases

Title: Azure Database Migration Service
Score: 0.03077651560306549
Content: Azure Database Migration Service is a fully managed, end-to-end migration service that enables you to migrate your databases to Azure with minimal downtime. It supports various source and target platforms, such as SQL Server, MySQL, PostgreSQL, and Azure SQL Database. Database Migration Service provides features like schema conversion, data migration, and performance monitoring. You can use Azure Database Migration Service to migrate your databases, ensure the compatibility of your workloads, and minimize your migration risks. It also integrates with other Azure services, such as Azure Migrate and Azure SQL Managed Instance.
Category: Databases

Title: Azure Cosmos DB
Score: 0.030330881476402283
Content: Azure Cosmos DB is a globally distributed, multi-model database service that enables you to build and manage NoSQL applications in Azure. It provides features like automatic scaling, low-latency access, and multi-master replication. Cosmos DB supports various data models, such as key-value, document, graph, and column-family. You can use Azure Cosmos DB to build globally distributed applications, ensure high availability and performance, and manage your data at scale. It also integrates with other Azure services, such as Azure Functions and Azure App Service.
Category: Databases

Title: Azure Cosmos DB
Score: 0.03009049780666828
Content: Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for building highly responsive and scalable applications. It offers turnkey global distribution, automatic and instant scalability, and guarantees low latency, high availability, and consistency. Cosmos DB supports popular NoSQL APIs, including MongoDB, Cassandra, Gremlin, and Azure Table Storage. You can build globally distributed applications with ease, without having to deal with complex configuration and capacity planning. Data stored in Cosmos DB is automatically indexed, enabling you to query your data with SQL, JavaScript, or other supported query languages.
Category: Databases

Title: Azure Cache for Redis
Score: 0.02941812574863434
Content: Azure Cache for Redis is a fully managed, in-memory data store that enables you to build highly responsive and scalable applications. It provides features like data persistence, high availability, and automatic scaling. Cache for Redis supports various data structures, such as strings, lists, sets, and hashes. You can use Azure Cache for Redis to improve the performance of your applications, offload your databases, and reduce the latency of your data access. It also integrates with other Azure services, such as Azure App Service and Azure Kubernetes Service.
Category: Databases

7.7. Semantic hybrid search (セマンティックハイブリッド検索)

クイックスタートの セマンティック ハイブリッド検索 に相当します。

前述の ハイブリッド検索セマンティック検索 の合わせ技です。Semantic ranking により検索精度が向上し、さらにセマンティック検索特有の Semantic AnswerSemantic Caption を受け取ることができます。

Semantic Answer

Semantic Answer: <em>Azure Cognitive</em> Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure<em> Cognitive Search</em> supports various data sources, such as<em> Azure SQL Database,</em><em> Azure Blob Storage,</em> and<em> Azure Cosmos DB.</em>
Semantic Answer Score: 0.9228515625

Semantic Caption (末尾の "Caption: xxx" の部分)

Title: Azure Cognitive Search
Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.
Category: AI + Machine Learning
Caption: <em>Azure Cognitive Search</em> is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters.<em> Azure Cognitive Search</em> supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB.
query = 'what Azure services support full text search'
results = search_client.search(
    search_text=query, # テキストクエリ
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=10,
        fields='contentVector'
    ),
    select=['title', 'content', 'category'],
    query_type='semantic',
    query_language='en-us',
    semantic_configuration_name='my-semantic-config',
    query_caption='extractive',
    query_answer='extractive',
    top=10
)

semantic_answers = results.get_answers()
for answer in semantic_answers:
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}\n")

for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}")

    captions = result['@search.captions']
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}\n")
        else:
            print(f"Caption: {caption.text}\n")
Semantic Answer: <em>Azure Cognitive</em> Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure<em> Cognitive Search</em> supports various data sources, such as<em> Azure SQL Database,</em><em> Azure Blob Storage,</em> and<em> Azure Cosmos DB.</em>
Semantic Answer Score: 0.9228515625

Title: Azure Cognitive Search
Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.
Category: AI + Machine Learning
Caption: <em>Azure Cognitive Search</em> is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters.<em> Azure Cognitive Search</em> supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB.

Title: Azure Cognitive Services
Content: Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.
Category: AI + Machine Learning
Caption: You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search..

Title: Azure Cognitive Services
Content: Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning.
Category: AI + Machine Learning
Caption: Azure Cognitive Services. Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning..

Title: Azure SQL Managed Instance
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: <em>Azure SQL Managed Instance.</em><em> Azure SQL Managed Instance</em> is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability.<em> SQL Managed Instance</em> supports various data types, such as JSON, spatial, and full-text. You can use<em> Azure SQL Managed Instance</em> to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory..

Title: Azure Database for MariaDB
Content: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage<em> MariaDB</em> applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text.

Title: Azure Database for MySQL
Content: Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: <em>Azure</em> Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text.

Title: Azure Data Catalog
Content: Azure Data Catalog is a fully managed metadata service that enables you to discover, understand, and use your data sources. It provides features like data asset registration, metadata discovery, and data lineage. Data Catalog supports various data sources, such as Azure SQL Database, Azure Blob Storage, and on-premises file systems. You can use Data Catalog to build a unified data catalog, improve data governance, and streamline your data discovery process. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Data Factory.
Category: Analytics
Caption: Azure Data Catalog is a fully managed metadata service that enables you to discover, understand, and use your data sources. It provides features like data asset registration, metadata discovery, and data lineage. Data Catalog supports various data sources, such as Azure SQL Database, Azure Blob Storage, and on-premises file systems.

Title: Azure CDN
Content: Azure Content Delivery Network (CDN) is a global content delivery network that enables you to deliver content to users with low latency and high availability. It caches and serves content from edge servers located near your users, improving the performance and reliability of your web applications. Azure CDN supports various content types, including static files, videos, and images. It provides advanced features like geo-filtering, custom domains, and SSL certificates. You can use Azure CDN with other Azure services, such as Azure App Service and Azure Storage.
Category: Networking
Caption: <em>Azure Content</em> Delivery Network (CDN) is a global<em> content delivery</em> network that enables you to deliver content to users with low latency and high availability. It caches and serves content from edge servers located near your users, improving the performance and reliability of your web applications.

Title: Azure SignalR Service
Content: Azure SignalR Service is a fully managed, real-time messaging service that enables you to build and scale real-time web applications. It provides features like automatic scaling, WebSocket support, and serverless integration. SignalR Service supports various programming languages, such as C#, JavaScript, and Java. You can use Azure SignalR Service to build chat applications, real-time dashboards, and collaborative tools. It also integrates with other Azure services, such as Azure Functions and Azure App Service.
Category: Web
Caption: Azure SignalR Service. Azure SignalR Service is a fully managed, real-time messaging service that enables you to build and scale real-time web applications. It provides features like automatic scaling, WebSocket support, and serverless integration. SignalR Service supports various programming languages, such as C#, JavaScript, and Java. You can use<em> Azure SignalR Service</em> to build chat applications, real-time dashboards, and collaborative tools. It also integrates with other Azure services, such as Azure Functions and Azure App Service..

Title: Azure Data Explorer
Content: Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data. It provides features like ingestion, querying, and visualization. Data Explorer supports various data sources, such as Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. You can use Data Explorer to analyze logs, monitor applications, and gain insights into your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Machine Learning.
Category: Analytics
Caption: Azure Data Explorer. Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data. It provides features like ingestion, querying, and visualization. Data Explorer supports various data sources, such as Azure Event Hubs, Azure IoT Hub, and Azure Blob Storage. You can use<em> Data Explorer</em> to analyze logs, monitor applications, and gain insights into your data. It also integrates with other Azure services, such as <em>Azure Synapse Analytics</em> and Azure Machine Learning..

7.8. Semantic hybrid search with filter (フィルターを使用したセマンティック ハイブリッド検索)

クイックスタートの フィルターを使用したセマンティック ハイブリッド検索 に相当します。

この例では セマンティックハイブリッド検索filter を併用して、検索結果を category フィールドの値が Database のものに絞り込んでします。

query = 'what Azure services support full text search'
results = search_client.search(
    search_text=query, # テキストクエリ
    vector=Vector(
        value=generate_embeddings(query), # ベクトルクエリ
        k=10,
        fields='contentVector'
    ),
    select=['title', 'content', 'category'],
    query_type='semantic',
    query_language='en-us',
    semantic_configuration_name='my-semantic-config',
    query_caption='extractive',
    query_answer='extractive',
    filter="category eq 'Databases'",
    top=10
)

semantic_answers = results.get_answers()
for answer in semantic_answers:
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}\n")

for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}")

    captions = result['@search.captions']
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}\n")
        else:
            print(f"Caption: {caption.text}\n")
Title: Azure SQL Managed Instance
Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: <em>Azure SQL Managed Instance.</em><em> Azure SQL Managed Instance</em> is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability.<em> SQL Managed Instance</em> supports various data types, such as JSON, spatial, and full-text. You can use<em> Azure SQL Managed Instance</em> to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory..

Title: Azure Database for MariaDB
Content: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage<em> MariaDB</em> applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text.

Title: Azure Database for MySQL
Content: Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: <em>Azure</em> Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text.

Title: Azure Database for PostgreSQL
Content: Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that enables you to build and manage PostgreSQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for PostgreSQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.
Category: Databases
Caption: <em>Azure</em> Database for<em> PostgreSQL</em> is a fully managed, scalable, and secure relational database service that enables you to build and manage<em> PostgreSQL</em> applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text.

Title: Azure Cache for Redis
Content: Azure Cache for Redis is a fully managed, in-memory data store that enables you to build highly responsive and scalable applications. It provides features like data persistence, high availability, and automatic scaling. Cache for Redis supports various data structures, such as strings, lists, sets, and hashes. You can use Azure Cache for Redis to improve the performance of your applications, offload your databases, and reduce the latency of your data access. It also integrates with other Azure services, such as Azure App Service and Azure Kubernetes Service.
Category: Databases
Caption: Azure Cache for Redis is a fully managed, in-memory data store that enables you to build highly responsive and scalable applications. It provides features like data persistence, high availability, and automatic scaling. Cache for Redis supports various data structures, such as strings, lists, sets, and hashes.

Title: Azure SQL Data Warehouse
Content: Azure SQL Data Warehouse is a fully managed, petabyte-scale cloud data warehouse service that enables you to store and analyze your structured and semi-structured data. It provides features like automatic scaling, data movement, and integration with Azure Machine Learning. SQL Data Warehouse supports various data sources, such as Azure Blob Storage, Azure Data Lake Storage, and Azure SQL Database. You can use Azure SQL Data Warehouse to build data lakes, develop big data analytics solutions, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure Synapse Analytics and Azure Data Factory.
Category: Databases
Caption: Azure SQL Data Warehouse is a fully managed, petabyte-scale cloud data warehouse service that enables you to store and analyze your structured and semi-structured data. It provides features like automatic scaling, data movement, and integration with Azure Machine Learning.

Title: Azure SQL Database
Content: Azure SQL Database is a fully managed relational database service based on the latest stable version of Microsoft SQL Server. It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection. SQL Database supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing SQL Server databases to Azure SQL Database with minimal downtime.
Category: Databases
Caption: Azure SQL Database.<em> Azure SQL Database</em> is a fully managed relational database service based on the latest stable version of<em> Microsoft SQL Server.</em> It offers built-in intelligence that learns your application patterns and adapts to maximize performance, reliability, and data protection.<em> SQL Database</em> supports elastic scaling, allowing you to dynamically adjust resources to match your workload. It provides advanced security features, such as encryption, auditing, and threat detection. You can migrate your existing<em> SQL Server</em> databases to<em> Azure SQL Database</em> with minimal downtime..

Title: Azure Database Migration Service
Content: Azure Database Migration Service is a fully managed, end-to-end migration service that enables you to migrate your databases to Azure with minimal downtime. It supports various source and target platforms, such as SQL Server, MySQL, PostgreSQL, and Azure SQL Database. Database Migration Service provides features like schema conversion, data migration, and performance monitoring. You can use Azure Database Migration Service to migrate your databases, ensure the compatibility of your workloads, and minimize your migration risks. It also integrates with other Azure services, such as Azure Migrate and Azure SQL Managed Instance.
Category: Databases
Caption: Azure Database Migration Service is a fully managed, end-to-end migration service that enables you to migrate your databases to Azure with minimal downtime. It supports various source and target platforms, such as<em> SQL Server, MySQL, PostgreSQL,</em> and Azure SQL Database.

Title: Azure Cosmos DB
Content: Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for building highly responsive and scalable applications. It offers turnkey global distribution, automatic and instant scalability, and guarantees low latency, high availability, and consistency. Cosmos DB supports popular NoSQL APIs, including MongoDB, Cassandra, Gremlin, and Azure Table Storage. You can build globally distributed applications with ease, without having to deal with complex configuration and capacity planning. Data stored in Cosmos DB is automatically indexed, enabling you to query your data with SQL, JavaScript, or other supported query languages.
Category: Databases
Caption: Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for building highly responsive and scalable applications. It offers turnkey global distribution, automatic and instant scalability, and guarantees low latency, high availability, and consistency.

Title: Azure Cosmos DB
Content: Azure Cosmos DB is a globally distributed, multi-model database service that enables you to build and manage NoSQL applications in Azure. It provides features like automatic scaling, low-latency access, and multi-master replication. Cosmos DB supports various data models, such as key-value, document, graph, and column-family. You can use Azure Cosmos DB to build globally distributed applications, ensure high availability and performance, and manage your data at scale. It also integrates with other Azure services, such as Azure Functions and Azure App Service.
Category: Databases
Caption: Azure Cosmos<em> DB</em> is a globally distributed, multi-model database service that enables you to build and manage NoSQL applications in Azure. It provides features like automatic scaling, low-latency access, and multi-master replication. Cosmos DB supports various data models, such as key-value, document, graph, and column-family.

[補足] クエリ言語の指定

クエリ言語は以下のように変更できます。

Python SDK

query = 'what Azure services support full text search'
results = search_client.search(
    search_text=query,
    vector=Vector(
        value=generate_embeddings(query),
        k=10,
        fields='contentVector'
    ),
    query_language='ja-jp', # 日本語
    top=10
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}\n")

REST API

POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version=2023-07-01-preview
Content-Type: application/json
api-key: {{admin-api-key}}
{
    "vectors": [
        {
            "value": [
                -0.009154141,
                0.018708462,
                . . .
                -0.02178128,
                -0.00086512347
            ],
            "fields": "contentVector",
            "k": 10
        }
    ],
    "search": "what azure services support full text search",
    "queryLanguage": "ja-jp",
    "top": "10"
}

参考

おわりに

ベクトル検索の登場によって Azure Cognitive Search による検索体験が大きく変わりそうです。また、将来的には、同じくプレビュー中の Azure OpenAI Service の On your data との連携 (On your data + Semantic hybrid search という特盛構成) に対する期待も膨らみます。一方で、周辺のツールとしても LangChain の Vectorestore に Azure Cognitive Search が既に対応しているため、こちらも別途操作性を確認してみたいと思います。

引き続き Azure Cognitive Search や Azure OpenAI Service 周辺から目が離せなそうです。

以上です。🍵

Microsoft (有志)

Discussion