Speech-to-Text
OpenAiSttClient transcribes audio files using the OpenAI-compatible
/audio/transcriptions endpoint. Like all clients, it implements BotClient and uses
send() as its entry point.
Feature flag: api-clients
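Assuming `aitk` is the crate name (as in the `use aitk::prelude::*;` import below) and `api-clients` is a Cargo feature, enabling it in `Cargo.toml` might look like this (version is a placeholder):

```toml
[dependencies]
# Enable the HTTP API clients, including OpenAiSttClient.
aitk = { version = "*", features = ["api-clients"] }
```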
Setup
use aitk::prelude::*;
let mut client = OpenAiSttClient::new("https://api.openai.com/v1".into());
client.set_key("your-api-key").unwrap();
Transcribing audio
The audio must be provided as an Attachment on the last message. The response text
contains the transcription:
use futures::StreamExt;
let audio_bytes: Vec<u8> = std::fs::read("recording.mp3").unwrap();
let attachment = Attachment::from_bytes(
    "recording.mp3".into(),
    Some("audio/mpeg".into()),
    &audio_bytes,
);

let bot_id = BotId::new("whisper-1");
let messages = vec![Message {
    from: EntityId::User,
    content: MessageContent {
        attachments: vec![attachment],
        ..Default::default()
    },
    ..Default::default()
}];

let mut stream = client.send(&bot_id, &messages, &[]);
while let Some(result) = stream.next().await {
    if let Some(content) = result.into_value() {
        println!("Transcription: {}", content.text);
    }
}
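The MIME type passed to `Attachment::from_bytes` should match the actual audio format. A small helper for deriving it from the file extension — hypothetical, not part of `aitk` — could be sketched like this:

```rust
// Hypothetical helper (not part of aitk): map an audio file extension
// to a MIME type suitable for Attachment::from_bytes.
fn audio_mime(path: &str) -> Option<&'static str> {
    // rsplit('.') yields the extension first; for a path with no dot,
    // the whole name falls through to the wildcard arm and returns None.
    let ext = path.rsplit('.').next()?;
    match ext.to_ascii_lowercase().as_str() {
        "mp3" => Some("audio/mpeg"),
        "wav" => Some("audio/wav"),
        "ogg" => Some("audio/ogg"),
        "flac" => Some("audio/flac"),
        "m4a" => Some("audio/mp4"),
        "webm" => Some("audio/webm"),
        _ => None,
    }
}

fn main() {
    // "recording.mp3" resolves to "audio/mpeg".
    println!("{:?}", audio_mime("recording.mp3"));
}
```

This keeps the attachment construction honest when the input path is not hardcoded; unknown extensions return `None` so the caller can reject them instead of uploading a mislabeled file.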
The client sends the request as a multipart/form-data upload containing the model ID and the audio file.
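On the wire, that request resembles the following (the boundary, API key, and exact header layout are illustrative, not what the client literally emits):

```http
POST /v1/audio/transcriptions HTTP/1.1
Host: api.openai.com
Authorization: Bearer your-api-key
Content-Type: multipart/form-data; boundary=----boundary

------boundary
Content-Disposition: form-data; name="model"

whisper-1
------boundary
Content-Disposition: form-data; name="file"; filename="recording.mp3"
Content-Type: audio/mpeg

<binary MP3 bytes>
------boundary--
```

The `model` field carries the `BotId` and the `file` part carries the attachment's bytes, filename, and MIME type.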