
About
Azure AI Voice Live SDK for JavaScript/TypeScript. Build real-time voice AI applications with bidirectional WebSocket communication.
name: azure-ai-voicelive-ts
description: Azure AI Voice Live SDK for JavaScript/TypeScript. Build real-time voice AI applications with bidirectional WebSocket communication.
risk: unknown
source: community
date_added: '2026-02-27'
@azure/ai-voicelive (JavaScript/TypeScript)
A real-time voice AI SDK for building bidirectional voice assistants with Azure AI in Node.js and browser environments.
Installation
npm install @azure/ai-voicelive @azure/identity
# TypeScript users
npm install @types/node
Current version: 1.0.0-beta.3
Supported environments:
- Node.js LTS versions (20+)
- Modern browsers (Chrome, Firefox, Safari, Edge)
Environment variables
AZURE_VOICELIVE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
# Optional: API key, used when not authenticating with Entra ID
AZURE_VOICELIVE_API_KEY=<your-api-key>
# Optional: logging
AZURE_LOG_LEVEL=info
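The variables above imply a simple rule: the endpoint is required, and the API key is only needed when Entra ID is not used. A minimal sketch of that rule follows. `readVoiceLiveSettings` and `VoiceLiveSettings` are hypothetical names introduced here for illustration and are not part of `@azure/ai-voicelive`; the `https://` check is likewise an assumption based on the endpoint format shown above.

```typescript
// Hypothetical helper, not part of @azure/ai-voicelive.
type VoiceLiveSettings =
  | { endpoint: string; auth: "entra-id" }
  | { endpoint: string; auth: "api-key"; apiKey: string };

function readVoiceLiveSettings(
  env: Record<string, string | undefined>,
): VoiceLiveSettings {
  const endpoint = env["AZURE_VOICELIVE_ENDPOINT"]?.trim();
  if (!endpoint) {
    throw new Error("AZURE_VOICELIVE_ENDPOINT is not set");
  }
  // Assumption: the endpoint is always an https:// resource URL.
  if (!endpoint.startsWith("https://")) {
    throw new Error("AZURE_VOICELIVE_ENDPOINT must start with https://");
  }
  // A non-empty API key selects key-based auth; otherwise fall back to Entra ID.
  const apiKey = env["AZURE_VOICELIVE_API_KEY"]?.trim();
  return apiKey
    ? { endpoint, auth: "api-key", apiKey }
    : { endpoint, auth: "entra-id" };
}
```

In Node.js the helper could be called as `readVoiceLiveSettings(process.env)`, and the returned `auth` field used to choose between `DefaultAzureCredential` and `AzureKeyCredential`.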
Authentication
Microsoft Entra ID (recommended)
import { DefaultAzureCredential } from "@azure/identity";
import { VoiceLiveClient } from "@azure/ai-voicelive";
const credential = new DefaultAzureCredential();
const endpoint = "https://your-resource.cognitiveservices.azure.com";
const client = new VoiceLiveClient(endpoint, credential);
API key
import { AzureKeyCredential } from "@azure/core-auth";
import { VoiceLiveClient } from "@azure/ai-voicelive";
const endpoint = "https://your-resource.cognitiveservices.azure.com";
const credential = new AzureKeyCredential("your-api-key");
const client = new VoiceLiveClient(endpoint, credential);
Client hierarchy
VoiceLiveClient
└── VoiceLiveSession (WebSocket connection)
    ├── updateSession() → Configure session options
    ├── subscribe() → Event handlers (Azure SDK pattern)
    ├── sendAudio() → Stream audio input
    ├── addConversationItem() → Add messages/function outputs
    └── sendEvent() → Send raw protocol events
Quick start
import { DefaultAzureCredential } from "@azure/identity";
import { VoiceLiveClient } from "@azure/ai-voicelive";
const credential = new DefaultAzureCredential();
const endpoint = process.env.AZURE_VOICELIVE_ENDPOINT!;
// Create the client and start a session
const client = new VoiceLiveClient(endpoint, credential);
const session = await client.startSession("gpt-4o-mini-realtime-preview");
// Configure the session
await session.updateSession({
  modalities: ["text", "audio"],
  instructions: "You are a helpful AI assistant. Respond naturally.",
  voice: {
    type: "azure-standard",
    name: "en-US-AvaNeural",
  },
  turnDetection: {
    type: "server_vad",
    threshold: 0.5,
    prefixPaddingMs: 300,
    silenceDurationMs: 500,
  },
  inputAudioFormat: "pcm16",
  outputAudioFormat: "pcm16",
});
// Subscribe to events
const subscription = session.subscribe({
  onResponseAudioDelta: async (event, context) => {
    // Handle streamed audio output
    const audioData = event.delta;
    // playAudioChunk is an application-supplied playback helper (not defined here)
    playAudioChunk(audioData);
  },
  onResponseTextDelta: async (event, context) => {
    // Handle streamed text
    process.stdout.write(event.delta);
  },
  onInputAudioTranscriptionCompleted: async (event, context) => {
    console.log("User said:", event.transcript);
  },
});
// Send audio from the microphone
function sendAudioChunk(audioBuffer: ArrayBuffer) {
  session.sendAudio(audioBuffer);
}
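The session above is configured with `inputAudioFormat: "pcm16"`, while microphone capture APIs (for example the Web Audio API) commonly deliver 32-bit float samples. The following is a minimal sketch of the conversion and chunking that would typically happen before calling `sendAudioChunk`. `floatTo16BitPcm` and `splitPcm16` are hypothetical helpers, not SDK functions, and the sample rate and chunk duration are left as parameters because the source does not state them.

```typescript
// Hypothetical helpers, not part of @azure/ai-voicelive.

/** Convert float samples in [-1, 1] to 16-bit little-endian PCM. */
function floatTo16BitPcm(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the int16 range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // The int16 range is asymmetric: negatives scale by 32768, positives by 32767.
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  }
  return buffer;
}

/** Split a PCM16 mono buffer into chunks of roughly `chunkMs` milliseconds. */
function splitPcm16(
  buffer: ArrayBuffer,
  sampleRate: number,
  chunkMs: number,
): ArrayBuffer[] {
  // 2 bytes per sample; whole samples only so a chunk never splits a sample.
  const bytesPerChunk = Math.floor((sampleRate * chunkMs) / 1000) * 2;
  if (bytesPerChunk <= 0) {
    throw new Error("chunkMs is too small for the given sample rate");
  }
  const chunks: ArrayBuffer[] = [];
  for (let offset = 0; offset < buffer.byteLength; offset += bytesPerChunk) {
    // slice() clamps the end index, so the last chunk may be shorter.
    chunks.push(buffer.slice(offset, offset + bytesPerChunk));
  }
  return chunks;
}
```

Each chunk returned by `splitPcm16(floatTo16BitPcm(samples), sampleRate, chunkMs)` could then be passed to `sendAudioChunk`. The sample rate must match whatever the service expects for `pcm16` input, which should be confirmed against the service documentation.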
Session configuration
await session.updateSession({
  // Modalities
  modalities: ["audio", "text"],
  // System instructions
  instructions: "You are a customer service representative.",
  // Voice selection
  voice: {
    type: "azure-standard", // or "azure-custom", "openai"
    name: "en-US-AvaNeural",
  },
  // Turn detection (VAD)
  turnDetection: {
    type: "server_vad", // or "azure_semantic_vad"
    threshold: 0.5,
    prefixPaddingMs: 300,
    silenceDurationMs: 500,
  },
  // Audio formats
  inputAudioFormat: "pcm16",
  outputAudioFormat: "pcm16",
  // Tools (function calling)
  tools: [
    {
      type: "function",
      name: "get_weather",
      description: "Get current weather",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string" }
        },
        required: ["location"]
      }
    }
  ],
  toolChoice: "auto",
});
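Declaring a tool only tells the model that `get_weather` exists; the application still has to run the function and return its output. The sketch below shows one possible local dispatcher for that step. `toolHandlers` and `runTool` are hypothetical names, the weather result is placeholder data, and it assumes the call arrives as a tool name plus a JSON-encoded arguments string, which the source does not show.

```typescript
// Hypothetical dispatcher, not part of @azure/ai-voicelive.
type ToolHandler = (args: Record<string, unknown>) => unknown;

const toolHandlers: Record<string, ToolHandler> = {
  // Mirrors the "get_weather" declaration above; returns stub data only.
  get_weather: (args) => {
    const location = args["location"];
    if (typeof location !== "string") {
      throw new Error("location must be a string");
    }
    return { location, forecast: "unknown (stub)" };
  },
};

/** Run a declared tool and always return a JSON string, even on failure. */
async function runTool(name: string, argsJson: string): Promise<string> {
  const handler = toolHandlers[name];
  if (!handler) {
    return JSON.stringify({ error: `Unknown tool: ${name}` });
  }
  try {
    const parsed: unknown = JSON.parse(argsJson);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("arguments must be a JSON object");
    }
    // await also covers handlers that return a Promise.
    const result = await handler(parsed as Record<string, unknown>);
    return JSON.stringify(result);
  } catch (err) {
    // Report failures as data so the model can respond to them.
    const message = err instanceof Error ? err.message : String(err);
    return JSON.stringify({ error: message });
  }
}
```

The returned string is the kind of value that would be handed back to the session as a function output, for instance through `addConversationItem()`. The exact item shape is not covered in this document and should be taken from the SDK reference.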
Event handling (Azure SDK pattern)
The SDK uses a subscription-based event handling pattern:
const subscription = session.subscribe({
  // Connection lifecycle
  onConnected: async (args, context) => {
    console.log("Connected:", args.connectionId);
  },
  onDisconnected: async (args, context) => {
    console.log("Disconnected:", args.code, args.reason);
  },
  onError: async (args, context) => {
    console.error("Error:", args.error.message);
  },
  // Session events
  onSessionCreated: async (event, context) => {
    console.log("Session created:", context.sessionId);
  },
  onSessionUpdated: async (event, conte
