Dexter is a powerful TypeScript library for working with Large Language Models (LLMs), with a focus on real-world Retrieval-Augmented Generation (RAG) applications. It provides a set of tools and utilities to interact with various AI models, manage caching, handle embeddings, and implement AI functions.
- Comprehensive Model Support: Implementations for Chat, Completion, Embedding, and Sparse Vector models, with efficient OpenAI API integration via openai-fetch.
- Advanced AI Function Utilities: Tools for creating and managing AI functions, including createAIFunction, createAIExtractFunction, and createAIRunner, with Zod integration for schema validation.
- Structured Data Extraction: Dexter supports OpenAI's structured output feature through createExtractFunction, which uses the response_format parameter with a JSON schema derived from a Zod schema.
- Flexible Caching and Tokenization: Built-in caching system with custom cache support, and advanced tokenization based on tiktoken for accurate token management.
- Robust Observability and Control: Customizable telemetry system, comprehensive event hooks, and specialized error handling for enhanced monitoring and control.
- Performance Optimization: Built-in support for batching, throttling, and streaming, optimized for handling large-scale operations and real-time responses.
- TypeScript-First and Environment Flexible: Fully typed for excellent developer experience, with minimal dependencies and compatibility across Node.js 18+, Deno, Cloudflare Workers, and Vercel edge functions.
To install Dexter, use your preferred package manager:
npm install @dexaai/dexter
This package requires node >= 18 or an environment with fetch support.
This package exports ESM. If your project uses CommonJS, consider switching to ESM or use the dynamic import() function.
Here's a basic example of how to use the ChatModel:
import { ChatModel } from '@dexaai/dexter';

const chatModel = new ChatModel({
  params: { model: 'gpt-3.5-turbo' },
});

const response = await chatModel.run({
  messages: [{ role: 'user', content: 'Tell me a short joke' }],
});

console.log(response.message.content);
import { ChatModel, MsgUtil } from '@dexaai/dexter';

const chatModel = new ChatModel({
  params: { model: 'gpt-4' },
});

const response = await chatModel.run({
  messages: [MsgUtil.user('Write a short story about a robot learning to love')],
  handleUpdate: (chunk) => {
    process.stdout.write(chunk);
  },
});

console.log('\n\nFull response:', response.message.content);
import { ChatModel, createExtractFunction } from '@dexaai/dexter';
import { z } from 'zod';

const extractPeopleNames = createExtractFunction({
  chatModel: new ChatModel({ params: { model: 'gpt-4o-mini' } }),
  systemMessage: `You extract the names of people from unstructured text.`,
  name: 'people_names',
  schema: z.object({
    names: z.array(
      z.string().describe(
        `The name of a person from the message. Normalize the name by removing suffixes, prefixes, and fixing capitalization`
      )
    ),
  }),
});

const peopleNames = await extractPeopleNames(
  `Dr. Andrew Huberman interviewed Tony Hawk, an idol of Andrew Hubermans.`
);

console.log('peopleNames', peopleNames);
// => { names: ['Andrew Huberman', 'Tony Hawk'] }
import { ChatModel, MsgUtil, createAIFunction } from '@dexaai/dexter';
import { z } from 'zod';

const getWeather = createAIFunction(
  {
    name: 'get_weather',
    description: 'Gets the weather for a given location',
    argsSchema: z.object({
      location: z.string().describe('The city and state e.g. San Francisco, CA'),
      unit: z.enum(['c', 'f']).optional().default('f').describe('The unit of temperature to use'),
    }),
  },
  async ({ location, unit }) => {
    // Simulate API call
    await new Promise((resolve) => setTimeout(resolve, 500));
    return {
      location,
      unit,
      temperature: Math.floor(Math.random() * 30) + 10,
      condition: ['sunny', 'cloudy', 'rainy'][Math.floor(Math.random() * 3)],
    };
  }
);

const chatModel = new ChatModel({
  params: {
    model: 'gpt-4',
    tools: [{ type: 'function', function: getWeather.spec }],
  },
});

const response = await chatModel.run({
  messages: [MsgUtil.user('What\'s the weather like in New York?')],
});

console.log(response.message);
import { EmbeddingModel } from '@dexaai/dexter';

const embeddingModel = new EmbeddingModel({
  params: { model: 'text-embedding-ada-002' },
});

const response = await embeddingModel.run({
  input: ['Hello, world!', 'How are you?'],
});

console.log(response.embeddings);
The Dexter library is organized into the following main directories:
- src/: Contains the source code for the library
  - model/: Core model implementations and utilities
  - ai-function/: AI function creation and handling
- examples/: Contains example scripts demonstrating library usage
- dist/: Contains the compiled JavaScript output (generated after build)
Key files:
- src/model/chat.ts: Implementation of the ChatModel
- src/model/completion.ts: Implementation of the CompletionModel
- src/model/embedding.ts: Implementation of the EmbeddingModel
- src/model/sparse-vector.ts: Implementation of the SparseVectorModel
- src/ai-function/ai-function.ts: AI function creation utilities
- src/model/utils/: Various utility functions and helpers
The ChatModel class is used for interacting with chat-based language models.
new ChatModel(args?: ChatModelArgs<CustomCtx>)
- args: Optional configuration object
  - params: Model parameters (e.g., model, temperature)
  - client: Custom OpenAI client (optional)
  - cache: Cache implementation (optional)
  - context: Custom context object (optional)
  - events: Event handlers (optional)
  - debug: Enable debug logging (optional)
run(params: ChatModelRun, context?: CustomCtx): Promise<ChatModelResponse>
- Executes the chat model with the given parameters and context
extend(args?: PartialChatModelArgs<CustomCtx>): ChatModel<CustomCtx>
- Creates a new instance of the model with modified configuration
The CompletionModel class is used for text completion tasks.
new CompletionModel(args?: CompletionModelArgs<CustomCtx>)
- args: Optional configuration object (similar to ChatModel)
run(params: CompletionModelRun, context?: CustomCtx): Promise<CompletionModelResponse>
- Executes the completion model with the given parameters and context
extend(args?: PartialCompletionModelArgs<CustomCtx>): CompletionModel<CustomCtx>
- Creates a new instance of the model with modified configuration
The EmbeddingModel class is used for generating embeddings from text.
new EmbeddingModel(args?: EmbeddingModelArgs<CustomCtx>)
- args: Optional configuration object (similar to ChatModel)
run(params: EmbeddingModelRun, context?: CustomCtx): Promise<EmbeddingModelResponse>
- Generates embeddings for the given input texts
extend(args?: PartialEmbeddingModelArgs<CustomCtx>): EmbeddingModel<CustomCtx>
- Creates a new instance of the model with modified configuration
The SparseVectorModel class is used for generating sparse vector representations.
new SparseVectorModel(args: SparseVectorModelArgs<CustomCtx>)
- args: Configuration object
  - serviceUrl: URL of the SPLADE service (required)
  - Other options similar to ChatModel
run(params: SparseVectorModelRun, context?: CustomCtx): Promise<SparseVectorModelResponse>
- Generates sparse vector representations for the given input texts
extend(args?: PartialSparseVectorModelArgs<CustomCtx>): SparseVectorModel<CustomCtx>
- Creates a new instance of the model with modified configuration
Creates a function to extract structured data from text using OpenAI's structured output feature.
This is a better way to extract structured data than using the legacy createAIExtractFunction function.
createExtractFunction<Schema extends z.ZodObject<any>>(args: {
  chatModel: Model.Chat.Model;
  name: string;
  schema: Schema;
  systemMessage: string;
}): (input: string | Msg) => Promise<z.infer<Schema>>
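To make the response_format mechanism more concrete, the sketch below writes out by hand a JSON schema equivalent to the people_names Zod schema used in the extraction example and wraps it in OpenAI's structured output shape. It is only an illustration: the schema Dexter actually derives from Zod may differ in detail, and peopleNamesSchema and responseFormat are names chosen for this example, not part of Dexter.

```typescript
// Hand-written JSON Schema roughly equivalent to:
//   z.object({ names: z.array(z.string().describe('...')) })
// Assumption: Dexter's generated schema may differ in detail.
const peopleNamesSchema = {
  type: 'object',
  properties: {
    names: {
      type: 'array',
      items: {
        type: 'string',
        description: 'The name of a person from the message.',
      },
    },
  },
  // Strict structured output expects every property to be required
  // and additional properties to be disallowed.
  required: ['names'],
  additionalProperties: false,
};

// Shape of the response_format parameter for OpenAI chat completions.
const responseFormat = {
  type: 'json_schema',
  json_schema: {
    name: 'people_names',
    strict: true,
    schema: peopleNamesSchema,
  },
};

console.log(JSON.stringify(responseFormat, null, 2));
```

Because the model is constrained to this schema, the parsed result is an object with a names array rather than a bare array.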
Creates a function meant to be used with OpenAI tool or function calling.
createAIFunction<Schema extends z.ZodObject<any>, Return>(
  spec: {
    name: string;
    description?: string;
    argsSchema: Schema;
  },
  implementation: (params: z.infer<Schema>) => Promise<Return>
): AIFunction<Schema, Return>
Creates a function to extract structured data from text using OpenAI function calling.
createAIExtractFunction<Schema extends z.ZodObject<any>>(
  {
    chatModel: Model.Chat.Model;
    name: string;
    description?: string;
    schema: Schema;
    maxRetries?: number;
    systemMessage?: string;
    functionCallConcurrency?: number;
  },
  customExtractImplementation?: (params: z.infer<Schema>) => z.infer<Schema> | Promise<z.infer<Schema>>
): ExtractFunction<Schema>
Creates a function to run a chat model in a loop, handling parsing, running, and inserting responses for function & tool call messages.
createAIRunner<Content = string>(args: {
  chatModel: Model.Chat.Model;
  functions?: AIFunction[];
  shouldBreakLoop?: (msg: Msg) => boolean;
  maxIterations?: number;
  functionCallConcurrency?: number;
  validateContent?: (content: string | null) => Content | Promise<Content>;
  mode?: Runner.Mode;
  systemMessage?: string;
  onRetriableError?: (error: Error) => void;
}): Runner<Content>
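The loop a runner performs can be sketched in a few lines. The example below is a simplified illustration of the general pattern (call the model, execute any requested tools, append the results, repeat), not Dexter's implementation: runLoop, FakeModel, LoopMsg, and ToolImpl are hypothetical names, and the model is a stub so the code runs without an API key.

```typescript
// Simplified, hypothetical message and tool shapes (not Dexter's types).
interface ToolCall { id: string; name: string; arguments: string }
type LoopMsg =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string | null; toolCalls?: ToolCall[] }
  | { role: 'tool'; content: string; toolCallId: string };
type FakeModel = (messages: LoopMsg[]) => Promise<LoopMsg>;
type ToolImpl = (args: unknown) => Promise<unknown>;

async function runLoop(
  model: FakeModel,
  tools: Map<string, ToolImpl>,
  input: string,
  maxIterations = 5
): Promise<LoopMsg[]> {
  const messages: LoopMsg[] = [{ role: 'user', content: input }];
  for (let i = 0; i < maxIterations; i++) {
    const reply = await model(messages);
    messages.push(reply);
    // Stop once the assistant answers without requesting any tool calls.
    const calls = reply.role === 'assistant' ? (reply.toolCalls ?? []) : [];
    if (calls.length === 0) return messages;
    for (const call of calls) {
      const impl = tools.get(call.name);
      const result = impl
        ? await impl(JSON.parse(call.arguments))
        : { error: `Unknown tool: ${call.name}` };
      // Feed each tool result back to the model as a tool message.
      messages.push({ role: 'tool', content: JSON.stringify(result), toolCallId: call.id });
    }
  }
  throw new Error(`No final answer after ${maxIterations} iterations`);
}

// Stub model: requests the "add" tool once, then answers with its result.
const fakeModel: FakeModel = async (messages) => {
  const last = messages[messages.length - 1];
  if (last && last.role === 'tool') {
    return { role: 'assistant', content: `Result: ${last.content}` };
  }
  return {
    role: 'assistant',
    content: null,
    toolCalls: [{ id: 'call_1', name: 'add', arguments: '{"a":2,"b":3}' }],
  };
};

const tools = new Map<string, ToolImpl>([
  ['add', async (args) => {
    const { a, b } = args as { a: number; b: number };
    return a + b;
  }],
]);

runLoop(fakeModel, tools, 'What is 2 + 3?').then((messages) => {
  console.log(messages[messages.length - 1]);
});
```

The maxIterations guard plays the same role as the option of that name above: it keeps a model that never stops calling tools from looping forever.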
Utility class for creating and checking message types.
MsgUtil.system(content: string, opts?): Msg.System
MsgUtil.user(content: string, opts?): Msg.User
MsgUtil.assistant(content: string, opts?): Msg.Assistant
MsgUtil.funcCall(function_call: { name: string; arguments: string }, opts?): Msg.FuncCall
MsgUtil.funcResult(content: Jsonifiable, name: string): Msg.FuncResult
MsgUtil.toolCall(tool_calls: Msg.Call.Tool[], opts?): Msg.ToolCall
MsgUtil.toolResult(content: Jsonifiable, tool_call_id: string, opts?): Msg.ToolResult
Utility for encoding, decoding, and counting tokens for various models.
createTokenizer(model: string): Tokenizer
Utilities for caching model responses.
type CacheStorage<KeyType, ValueType>
type CacheKey<Params extends Record<string, any>, KeyType = string>
defaultCacheKey<Params extends Record<string, any>>(params: Params): string
OpenAI Client (openai-fetch)
Dexter uses the openai-fetch library to interact with the OpenAI API. This client is lightweight, well-typed, and provides a simple interface for making API calls. Here's how it's used in Dexter:
- Default Client: By default, Dexter creates an instance of OpenAIClient from openai-fetch when initializing models.
- Custom Client: You can provide your own instance of OpenAIClient when creating a model:
  import { OpenAIClient } from 'openai-fetch';
  import { ChatModel } from '@dexaai/dexter';
  const client = new OpenAIClient({ apiKey: 'your-api-key' });
  const chatModel = new ChatModel({ client });
- Client Caching: Dexter implements caching for OpenAIClient instances to improve performance when creating multiple models with the same configuration.
- Streaming Support: The openai-fetch client supports streaming responses, which Dexter utilizes for real-time output in chat models.
- Structured Output: Dexter supports OpenAI's structured output feature through createExtractFunction, which uses the response_format parameter with a JSON schema derived from a Zod schema.
Dexter defines a set of message types (Msg) that closely align with the OpenAI API's message formats but with some enhancements for better type safety and easier handling. The MsgUtil class provides methods for creating, checking, and asserting these message types.
- Msg.System: System messages
- Msg.User: User messages
- Msg.Assistant: Assistant messages
- Msg.Refusal: Refusal messages (thrown as errors in Dexter)
- Msg.FuncCall: Function call messages
- Msg.FuncResult: Function result messages
- Msg.ToolCall: Tool call messages
- Msg.ToolResult: Tool result messages
These types are designed to be compatible with the ChatMessage type from openai-fetch, with some differences:
- Dexter throws a RefusalError for refusal messages instead of including them in the Msg union.
- The content property is always defined (string or null) in Dexter's types.
- Creation Methods: system, user, assistant, funcCall, funcResult, toolCall, toolResult
- Type Checking Methods: isSystem, isUser, isAssistant, isRefusal, isFuncCall, isFuncResult, isToolCall, isToolResult
- Type Assertion Methods: assertSystem, assertUser, assertAssistant, assertRefusal, assertFuncCall, assertFuncResult, assertToolCall, assertToolResult
- Conversion Method: fromChatMessage converts an openai-fetch ChatMessage to a Dexter Msg type
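The difference between the type checking and type assertion methods can be illustrated with plain TypeScript. The sketch below uses a reduced, hypothetical SketchMsg union and standalone isUser and assertUser functions; it shows the general pattern of type predicates and assertion functions and is not Dexter's implementation.

```typescript
// Simplified message shapes for illustration; Dexter's Msg types carry more fields.
type SketchMsg =
  | { role: 'system'; content: string }
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string };
type SketchUserMsg = Extract<SketchMsg, { role: 'user' }>;

// Checking style: a type predicate narrows the union when it returns true.
function isUser(msg: SketchMsg): msg is SketchUserMsg {
  return msg.role === 'user';
}

// Assertion style: throws on mismatch and narrows the type for the code that follows.
function assertUser(msg: SketchMsg): asserts msg is SketchUserMsg {
  if (!isUser(msg)) {
    throw new Error(`Expected a user message, got role "${msg.role}"`);
  }
}

const messages: SketchMsg[] = [
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'Hello' },
];

// filter() accepts the predicate, so the result is typed as SketchUserMsg[].
const userMessages = messages.filter(isUser);
console.log(userMessages.length); // 1
```

Checks suit branching logic, while assertions suit places where any other message type indicates a bug and should fail loudly.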
Dexter includes a telemetry system for tracking and logging model operations. The telemetry system is based on the OpenTelemetry standard and can be integrated with various observability platforms.
- Default Telemetry: By default, Dexter uses a no-op telemetry provider that doesn't perform any actual logging or tracing.
- Custom Telemetry: You can provide your own telemetry provider when initializing models. The provider should implement the Telemetry.Provider interface:
  interface Provider {
    startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T;
    setTags(tags: { [key: string]: Primitive }): void;
  }
- Span Attributes: Dexter automatically adds various attributes to telemetry spans, including model type, provider, input tokens, output tokens, and more.
- Usage: Telemetry is used internally in the AbstractModel class to wrap the run method, providing insights into model execution.
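As a rough idea of what a custom provider could look like, the sketch below implements the two Provider methods with an in-memory recorder. The SpanOptions, Span, and Primitive definitions here are minimal stand-ins assumed for the example (Dexter's real types are likely richer), and MemoryTelemetry is a hypothetical class name.

```typescript
// Minimal stand-ins for the types the Provider interface refers to.
// These are assumptions for this sketch, not Dexter's definitions.
type Primitive = string | number | boolean | null | undefined;
interface SpanOptions { name: string }
interface Span { setAttribute(key: string, value: Primitive): void }

interface Provider {
  startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T;
  setTags(tags: { [key: string]: Primitive }): void;
}

interface RecordedSpan {
  name: string;
  attributes: Record<string, Primitive>;
}

// Records spans and tags in memory so they can be inspected or forwarded later.
class MemoryTelemetry implements Provider {
  readonly spans: RecordedSpan[] = [];
  readonly tags: Record<string, Primitive> = {};

  startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T {
    const record: RecordedSpan = { name: options.name, attributes: {} };
    this.spans.push(record);
    // Hand the callback a span that writes attributes into the record.
    return callback({
      setAttribute: (key, value) => {
        record.attributes[key] = value;
      },
    });
  }

  setTags(tags: { [key: string]: Primitive }): void {
    Object.assign(this.tags, tags);
  }
}

const telemetry = new MemoryTelemetry();
telemetry.setTags({ service: 'demo' });
const answer = telemetry.startSpan({ name: 'chat.run' }, (span) => {
  span.setAttribute('model', 'gpt-4o-mini');
  return 42;
});
console.log(answer, telemetry.spans);
```

startSpan returns whatever the callback returns, which is what lets a model wrap its run method in a span without changing the method's result.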
Dexter provides a flexible caching system to improve performance and reduce API calls:
- Cache Interface: The cache must implement the CacheStorage interface:
  interface CacheStorage<KeyType, ValueType> {
    get: (key: KeyType) => Promise<ValueType | undefined> | ValueType | undefined;
    set: (key: KeyType, value: ValueType) => Promise<unknown> | unknown;
  }
- Default Cache Key: Dexter uses a default cache key function that creates a SHA512 hash of the input parameters.
- Custom Cache: You can provide your own cache implementation when initializing models:
  import { ChatModel } from '@dexaai/dexter';
  const customCache = new Map();
  const chatModel = new ChatModel({ cache: customCache });
- Cache Usage: Caching is automatically applied in the AbstractModel class. Before making an API call, it checks the cache for a stored response. After receiving a response, it stores it in the cache for future use.
- Cache Invalidation: Cache invalidation is left to the user. You can implement your own cache invalidation strategy based on your specific use case.
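Since invalidation is left to the user, one simple strategy is time-based expiry. The sketch below implements the CacheStorage shape with a Map whose entries expire after a fixed time to live. TtlCache and its injectable now clock are hypothetical helpers written for this example, not part of Dexter.

```typescript
// The CacheStorage shape described above, restated so the sketch is self-contained.
interface CacheStorage<KeyType, ValueType> {
  get: (key: KeyType) => Promise<ValueType | undefined> | ValueType | undefined;
  set: (key: KeyType, value: ValueType) => Promise<unknown> | unknown;
}

// Hypothetical helper: a Map-backed cache whose entries expire after ttlMs.
class TtlCache<KeyType, ValueType> implements CacheStorage<KeyType, ValueType> {
  private readonly entries = new Map<KeyType, { value: ValueType; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: KeyType): ValueType | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: KeyType, value: ValueType): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

let clock = 0;
const cache = new TtlCache<string, string>(1000, () => clock);
cache.set('greeting', 'hello');
console.log(cache.get('greeting')); // 'hello'
clock = 1500;
console.log(cache.get('greeting')); // undefined
```

A plain Map satisfies the same interface, which is why the custom cache example above can pass one directly; the wrapper only adds the expiry check.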
Dexter includes a tokenization system based on the tiktoken library, which is used by OpenAI for their models. This system is crucial for accurately counting tokens and managing model inputs and outputs.
- Tokenizer Creation: The createTokenizer function creates a Tokenizer instance for a specific model:
  const tokenizer = createTokenizer('gpt-3.5-turbo');
- Tokenizer Methods:
  - encode(text: string): Uint32Array: Encodes text to tokens
  - decode(tokens: number[] | Uint32Array): string: Decodes tokens to text
  - countTokens(input?: string | ChatMessage | ChatMessage[]): number: Counts tokens in various input formats
  - truncate({ text: string, max: number, from?: 'start' | 'end' }): string: Truncates text to a maximum number of tokens
- Model Integration: Each model instance has its own Tokenizer, which is used internally for token counting and management.
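Token counts matter most when a conversation has to fit a model's context window. The sketch below shows one possible way to use a count function for that purpose: it drops the oldest non-system messages until the total fits a budget. It is an illustration only. roughCount is a whitespace-based stand-in, not tiktoken, and fitToBudget and SimpleMessage are hypothetical names that do not exist in Dexter.

```typescript
// Stand-in token counter: splits on whitespace. This is NOT tiktoken and only
// approximates real token counts; swap in a real tokenizer in practice.
type CountTokens = (text: string) => number;
const roughCount: CountTokens = (text) =>
  text.split(/\s+/).filter((part) => part.length > 0).length;

interface SimpleMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Drops the oldest non-system messages until the total fits the budget.
function fitToBudget(
  messages: SimpleMessage[],
  maxTokens: number,
  countTokens: CountTokens = roughCount
): SimpleMessage[] {
  const kept = [...messages];
  const total = () =>
    kept.reduce((sum, msg) => sum + countTokens(msg.content), 0);
  while (total() > maxTokens) {
    const index = kept.findIndex((msg) => msg.role !== 'system');
    if (index === -1) break; // only system messages remain
    kept.splice(index, 1);
  }
  return kept;
}

const history: SimpleMessage[] = [
  { role: 'system', content: 'You are terse.' },
  { role: 'user', content: 'Tell me a short joke' },
  { role: 'assistant', content: 'Why did the chicken cross the road?' },
  { role: 'user', content: 'Another one please' },
];

// 18 rough tokens in total; a budget of 14 drops the first user message.
console.log(fitToBudget(history, 14).map((msg) => msg.role));
```

With a real tokenizer, the count function would also need to account for per-message overhead, which this sketch ignores.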
Dexter provides a system of event hooks that allow you to add custom logic at various points in the model execution process. These hooks are defined in the Model.Events interface:
- Available Hooks:
  - onStart: Called before the model execution starts
  - onApiResponse: Called after receiving a response from the API
  - onComplete: Called after the model execution is complete
  - onError: Called if an error occurs during model execution
- Hook Parameters: Each hook receives an event object with relevant information, such as timestamps, model parameters, responses, and context.
- Usage: Event hooks can be defined when creating a model instance:
  const chatModel = new ChatModel({
    events: {
      onStart: [(event) => console.log('Starting model execution', event)],
      onComplete: [(event) => console.log('Model execution complete', event)],
    },
  });
- Multiple Handlers: Each event can have multiple handlers, which are executed in the order they are defined.
- Async Handlers: Event handlers can be asynchronous functions. Dexter uses Promise.allSettled to handle multiple async handlers.
- Extending Models: When using the extend method to create a new model instance, event handlers are merged, allowing you to add new handlers without removing existing ones.
  const boringModel = new ChatModel({ params: { model: 'gpt-4o', temperature: 0 } });
  const funModel = boringModel.extend({ params: { temperature: 2 } });
  const cheapAndFunModel = funModel.extend({ params: { model: 'gpt-4o-mini' } });
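The handler behavior described above (merging on extend, in-order invocation, and Promise.allSettled) can be sketched independently of the library. The code below is an illustration under simplified assumptions: SketchEvents, mergeEvents, and emit are hypothetical names, and Dexter's real event payloads and merge logic may differ.

```typescript
// Hypothetical, simplified event shapes; Dexter's Model.Events carries richer payloads.
type Handler<E> = (event: E) => void | Promise<void>;
interface SketchEvents {
  onStart?: Handler<{ timestamp: string }>[];
  onComplete?: Handler<{ timestamp: string }>[];
}

// Merging keeps the existing handlers and appends the new ones.
function mergeEvents(base: SketchEvents, extra: SketchEvents): SketchEvents {
  return {
    onStart: [...(base.onStart ?? []), ...(extra.onStart ?? [])],
    onComplete: [...(base.onComplete ?? []), ...(extra.onComplete ?? [])],
  };
}

// Handlers are started in definition order. allSettled waits for all of them,
// so one failing handler does not reject the whole batch.
async function emit<E>(handlers: Handler<E>[] | undefined, event: E): Promise<number> {
  const results = await Promise.allSettled(
    (handlers ?? []).map(async (handler) => handler(event))
  );
  // Report how many handlers failed instead of throwing.
  return results.filter((result) => result.status === 'rejected').length;
}

const log: string[] = [];
const base: SketchEvents = { onStart: [() => { log.push('base'); }] };
const extended = mergeEvents(base, {
  onStart: [
    () => { throw new Error('handler failed'); },
    async () => { log.push('extra'); },
  ],
});

emit(extended.onStart, { timestamp: new Date().toISOString() }).then((failures) => {
  console.log(log, failures);
});
```

Wrapping each handler in an async function turns a synchronous throw into a rejected promise, so both sync and async failures are reported the same way.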
MIT © Dexa