Skip to main content

Returning structured output

Here is a simple example of an agent which uses LCEL, a web search tool (Tavily) and a structured output parser to create an OpenAI functions agent that returns source chunks.

The first step is to import necessary modules

npm install @langchain/openai
import { zodToJsonSchema } from "zod-to-json-schema";
import { z } from "zod";
import {
type BaseMessage,
AIMessage,
FunctionMessage,
type AgentFinish,
type AgentStep,
} from "langchain/schema";
import { RunnableSequence } from "langchain/runnables";
import { ChatPromptTemplate, MessagesPlaceholder } from "langchain/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { AgentExecutor } from "langchain/agents";
import { DynamicTool } from "@langchain/core/tools";
import type { FunctionsAgentAction } from "langchain/agents/openai/output_parser";
import { convertToOpenAIFunction } from "@langchain/core/utils/function_calling";

import { TavilySearchAPIRetriever } from "@langchain/community/retrievers/tavily_search_api";

Next, we initialize an LLM and a search tool that wraps our web search retriever. We will later bind this as an OpenAI function:

const llm = new ChatOpenAI({
model: "gpt-4-1106-preview",
});

const searchTool = new DynamicTool({
name: "web-search-tool",
description: "Tool for getting the latest information from the web",
func: async (searchQuery: string, runManager) => {
const retriever = new TavilySearchAPIRetriever();
const docs = await retriever.invoke(searchQuery, runManager?.getChild());
return docs.map((doc) => doc.pageContent).join("\n-----\n");
},
});

Now we can define our prompt template. We'll use a simple ChatPromptTemplate with placeholders for the user's question, and the agent scratchpad.

const prompt = ChatPromptTemplate.fromMessages([
[
"system",
"You are a helpful assistant. You must always call one of the provided tools.",
],
["user", "{input}"],
new MessagesPlaceholder("agent_scratchpad"),
]);

After that, we define our structured response schema using Zod. This schema defines the structure of the final response from the agent.

const responseSchema = z.object({
answer: z.string().describe("The final answer to return to the user"),
sources: z
.array(z.string())
.describe(
"List of page chunks that contain answer to the question. Only include a page chunk if it contains relevant information"
),
});

Once our response schema is defined, we can construct it as an OpenAI function to later be passed to the model. This is an important step regarding consistency as the model will always respond in this schema when it successfully completes a task

const responseOpenAIFunction = {
name: "response",
description: "Return the response to the user",
parameters: zodToJsonSchema(responseSchema),
};

Next, we construct a custom structured output parsing function that can detect when the model has called our final response function. This is similar to the method in the stock JSONOutputFunctionsParser, but with a change to directly return a response when the final response function is called.

const structuredOutputParser = (
message: AIMessage
): FunctionsAgentAction | AgentFinish => {
if (message.content && typeof message.content !== "string") {
throw new Error("This agent cannot parse non-string model responses.");
}
if (message.additional_kwargs.function_call) {
const { function_call } = message.additional_kwargs;
try {
const toolInput = function_call.arguments
? JSON.parse(function_call.arguments)
: {};
// If the function call name is `response` then we know it's used our final
// response function and can return an instance of `AgentFinish`
if (function_call.name === "response") {
return { returnValues: { ...toolInput }, log: message.content };
}
return {
tool: function_call.name,
toolInput,
log: `Invoking "${function_call.name}" with ${
function_call.arguments ?? "{}"
}\n${message.content}`,
messageLog: [message],
};
} catch (error) {
throw new Error(
`Failed to parse function arguments from chat model response. Text: "${function_call.arguments}". ${error}`
);
}
} else {
return {
returnValues: { output: message.content },
log: message.content,
};
}
};

After this, we can bind our two functions to the LLM, and create a runnable sequence which will be used as the agent.

Important - note here we pass in agent_scratchpad as an input variable, which formats all the previous steps using the formatForOpenAIFunctions function. This is very important as it contains all the context history the model needs to preform accurate tasks. Without this, the model would have no context on the previous steps taken. The formatForOpenAIFunctions function returns the steps as an array of BaseMessages. This is necessary as the MessagesPlaceholder class expects this type as the input.

const formatAgentSteps = (steps: AgentStep[]): BaseMessage[] =>
steps.flatMap(({ action, observation }) => {
if ("messageLog" in action && action.messageLog !== undefined) {
const log = action.messageLog as BaseMessage[];
return log.concat(new FunctionMessage(observation, action.tool));
} else {
return [new AIMessage(action.log)];
}
});

const llmWithTools = llm.bind({
functions: [convertToOpenAIFunction(searchTool), responseOpenAIFunction],
});
/** Create the runnable */
const runnableAgent = RunnableSequence.from<{
input: string;
steps: Array<AgentStep>;
}>([
{
input: (i) => i.input,
agent_scratchpad: (i) => formatAgentSteps(i.steps),
},
prompt,
llmWithTools,
structuredOutputParser,
]);

Finally, we can create an instance of AgentExecutor and run the agent.

const executor = AgentExecutor.fromAgentAndTools({
agent: runnableAgent,
tools: [searchTool],
});
/** Call invoke on the agent */
const res = await executor.invoke({
input: "what is the current weather in honolulu?",
});
console.log({
res,
});

The output will look something like this

{
res: {
answer: 'The current weather in Honolulu is 71 \bF with light rain and broken clouds.',
sources: [
'Currently: 71 \bF. Light rain. Broken clouds. (Weather station: Honolulu International Airport, USA). See more current weather'
]
}
}

Help us out by providing feedback on this documentation page: