Azure AI Search
Azure AI Search (formerly known as Azure Search and Azure Cognitive Search) is a distributed, RESTful search engine optimized for speed and relevance on production-scale workloads on Azure. It also supports vector search using the k-nearest neighbor (kNN) algorithm, as well as semantic search.
This vector store integration supports full text search, vector search and hybrid search for best ranking performance.
Learn how to leverage the vector search capabilities of Azure AI Search from this page. If you don't have an Azure account, you can create a free account to get started.
Setup
You'll first need to install the @azure/search-documents SDK and the @langchain/community package:
- npm: npm install -S @langchain/community @azure/search-documents
- Yarn: yarn add @langchain/community @azure/search-documents
- pnpm: pnpm add @langchain/community @azure/search-documents
You'll also need a running Azure AI Search instance. You can deploy a free version from the Azure Portal by following this guide.
Once your instance is running, make sure you have the endpoint and the admin key (query keys can only be used to search documents, not to index, update, or delete them). The endpoint is the URL of your instance, which you can find in the Azure Portal under the "Overview" section of your instance. The admin key can be found under the "Keys" section. Then set the following environment variables:
# Azure AI Search connection settings
AZURE_AISEARCH_ENDPOINT=
AZURE_AISEARCH_KEY=
# If you're using Azure OpenAI API, you'll need to set these variables
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_INSTANCE_NAME=
AZURE_OPENAI_API_DEPLOYMENT_NAME=
AZURE_OPENAI_API_EMBEDDINGS_DEPLOYMENT_NAME=
AZURE_OPENAI_API_VERSION=
# Or you can use the OpenAI API directly
OPENAI_API_KEY=
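Before creating the vector store, it can help to fail fast when one of these settings is missing. The following is a minimal sketch: `findMissingEnvVars` is a hypothetical helper (not part of LangChain or the Azure SDK), the list of required names assumes Azure AI Search combined with the OpenAI API used directly, and the endpoint URL is a placeholder.

```typescript
// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment map.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Assumption: Azure AI Search with the OpenAI API used directly.
const requiredVars = [
  "AZURE_AISEARCH_ENDPOINT",
  "AZURE_AISEARCH_KEY",
  "OPENAI_API_KEY",
];

// Stand-in environment for illustration; a real script would pass process.env.
const missing = findMissingEnvVars(
  { AZURE_AISEARCH_ENDPOINT: "https://example.search.windows.net" },
  requiredVars
);
console.log(missing); // [ 'AZURE_AISEARCH_KEY', 'OPENAI_API_KEY' ]
```

If you use Azure OpenAI instead, swap `OPENAI_API_KEY` for the `AZURE_OPENAI_API_*` names listed above.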
About hybrid search
Hybrid search is a feature that combines the strengths of full text search and vector search to provide the best ranking performance. It's enabled by default in Azure AI Search vector stores, but you can select a different search query type by setting the search.type property when creating the vector store.
You can read more about hybrid search and how it may improve your search results in the official documentation.
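For intuition on how two rankings can be merged into one, Azure AI Search documents Reciprocal Rank Fusion (RRF) as the method it uses to combine full text and vector results in a hybrid query. The sketch below only illustrates the idea and is not the service's implementation; the function name, the constant `k = 60`, and the document IDs are assumptions made for this example.

```typescript
// Illustrative Reciprocal Rank Fusion: each ranking contributes
// 1 / (k + rank) to a document's score, with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  // Sort document IDs by descending fused score.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists for the same query.
const fullTextRanking = ["doc-a", "doc-b", "doc-c"];
const vectorRanking = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([fullTextRanking, vectorRanking]);
console.log(fused); // [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

Documents that rank well in both lists (here `doc-b` and `doc-a`) rise above documents that appear in only one, which is the behavior hybrid search relies on.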
In some scenarios like retrieval-augmented generation (RAG), you may want to enable semantic ranking in addition to hybrid search to improve the relevance of the search results. You can enable semantic ranking by setting the search.type property to AzureAISearchQueryType.SemanticHybrid when creating the vector store.
Note that semantic ranking capabilities are only available in the Basic and higher pricing tiers, and are subject to regional availability.
You can read more about the performance of using semantic ranking with hybrid search in this blog post.
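Because semantic ranking depends on pricing tier and region, one option is to choose the query type from a configuration flag and fall back to plain hybrid search otherwise. The following is a rough sketch under stated assumptions: `buildSearchOptions` is a hypothetical helper, and `QueryTypes` is a local stand-in with placeholder values so the snippet runs without the package installed. In real code you would use `AzureAISearchQueryType.SemanticHybrid` and `AzureAISearchQueryType.SimilarityHybrid` from `@langchain/community/vectorstores/azure_aisearch`.

```typescript
// Local stand-in with placeholder values (NOT the library's actual values).
// Replace with AzureAISearchQueryType in real code.
const QueryTypes = {
  SimilarityHybrid: "SimilarityHybrid",
  SemanticHybrid: "SemanticHybrid",
} as const;

type QueryType = (typeof QueryTypes)[keyof typeof QueryTypes];

// Hypothetical helper: builds the `search` options passed when creating the
// vector store, falling back to hybrid search without semantic ranking.
function buildSearchOptions(semanticRankingAvailable: boolean): {
  search: { type: QueryType };
} {
  return {
    search: {
      type: semanticRankingAvailable
        ? QueryTypes.SemanticHybrid
        : QueryTypes.SimilarityHybrid,
    },
  };
}

console.log(buildSearchOptions(true).search.type); // SemanticHybrid
console.log(buildSearchOptions(false).search.type); // SimilarityHybrid
```

The returned object has the same `{ search: { type } }` shape that the vector store configuration expects.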
Example: index docs, vector search and LLM integration
Below is an example that indexes documents from a file in Azure AI Search, runs a hybrid search query, and finally uses a chain to answer a question in natural language based on the retrieved documents.
import {
  AzureAISearchVectorStore,
  AzureAISearchQueryType,
} from "@langchain/community/vectorstores/azure_aisearch";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
// Load documents from file
const loader = new TextLoader("./state_of_the_union.txt");
const rawDocuments = await loader.load();
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 0,
});
const documents = await splitter.splitDocuments(rawDocuments);
// Create Azure AI Search vector store
const store = await AzureAISearchVectorStore.fromDocuments(
  documents,
  new OpenAIEmbeddings(),
  {
    search: {
      type: AzureAISearchQueryType.SimilarityHybrid,
    },
  }
);
// The first time you run this, the index will be created.
// You may need to wait a bit for the index to be created before you can perform
// a search, or you can create the index manually beforehand.
// Performs a similarity search
const resultDocuments = await store.similaritySearch(
  "What did the president say about Ketanji Brown Jackson?"
);
console.log("Similarity search results:");
console.log(resultDocuments[0].pageContent);
/*
Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections.
Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.
One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.
And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
*/
// Use the store as part of a chain
const model = new ChatOpenAI({ model: "gpt-3.5-turbo-1106" });
const questionAnsweringPrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "Answer the user's questions based on the below context:\n\n{context}",
  ],
  ["human", "{input}"],
]);
const combineDocsChain = await createStuffDocumentsChain({
  llm: model,
  prompt: questionAnsweringPrompt,
});
const chain = await createRetrievalChain({
  retriever: store.asRetriever(),
  combineDocsChain,
});
const response = await chain.invoke({
  input: "What is the president's top priority regarding prices?",
});
console.log("Chain response:");
console.log(response.answer);
/*
The president's top priority is getting prices under control.
*/
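The example's comments note that a search issued right after the index is first created may run before the index is ready. One way to handle that is to retry the query a few times with a short delay. The sketch below is a hypothetical helper, not part of LangChain; it assumes that an unready index shows up either as an empty result or as a thrown error, and the attempt count and delay are arbitrary defaults.

```typescript
// Hypothetical helper: calls `search` until it returns a non-empty result or
// the attempts run out, waiting `delayMs` between attempts. Errors count as
// failed attempts; the last error is rethrown if every attempt fails.
async function searchWithRetry<T>(
  search: () => Promise<T[]>,
  attempts = 5,
  delayMs = 2000
): Promise<T[]> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const results = await search();
      if (results.length > 0) return results;
      lastError = undefined;
    } catch (error) {
      lastError = error;
    }
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  if (lastError !== undefined) throw lastError;
  return [];
}

// Stand-in search that returns nothing on its first two calls.
let calls = 0;
const fakeSearch = async (): Promise<string[]> => {
  calls += 1;
  return calls < 3 ? [] : ["first matching chunk"];
};

searchWithRetry(fakeSearch, 5, 10).then((results) => {
  console.log(results, `after ${calls} calls`);
});
```

With the store from the example above, the call would look like `searchWithRetry(() => store.similaritySearch("..."))`. Creating the index manually beforehand, as the comments suggest, avoids the need for retries altogether.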
API Reference:
- AzureAISearchVectorStore from @langchain/community/vectorstores/azure_aisearch
- AzureAISearchQueryType from @langchain/community/vectorstores/azure_aisearch
- ChatPromptTemplate from @langchain/core/prompts
- ChatOpenAI from @langchain/openai
- OpenAIEmbeddings from @langchain/openai
- createStuffDocumentsChain from langchain/chains/combine_documents
- createRetrievalChain from langchain/chains/retrieval
- TextLoader from langchain/document_loaders/fs/text
- RecursiveCharacterTextSplitter from langchain/text_splitter