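Speech Synthesis API (TTS)

Synchronous text-to-speech (TTS) synthesis. A single request accepts up to 10,000 characters of text, which suits short-sentence generation, voice dialogue, online social applications, and similar scenarios.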
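Text longer than the 10,000-character limit has to be spread over several requests. The following is a minimal sketch of one way to split it, assuming the limit is counted in Unicode code points (Python's `len`) and that cutting after sentence-ending punctuation is acceptable. `split_text` and the punctuation set are illustrative choices, not part of the API.

```python
import re

MAX_CHARS = 10000  # per-request limit stated above

# Split points: right after sentence-ending punctuation (illustrative set).
_SENTENCE_END = re.compile(r"(?<=[。!?!?.;;\n])")


def split_text(text: str, limit: int = MAX_CHARS) -> list:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    chunks, current = [], ""
    for sentence in _SENTENCE_END.split(text):
        # A single sentence longer than the limit is cut at fixed offsets.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        # Start a new chunk when this sentence would overflow the current one.
        if len(current) + len(sentence) > limit:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` of its own request, and the decoded audio concatenated in order.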

Endpoint overview

  • Endpoint: https://api.senseaudio.cn/v1/t2a_v2
  • Method: POST
  • Content-Type: application/json
  • Authentication: Bearer Token

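Request configuration

Request headers

| Name | Required | Description | Example |
|---|---|---|---|
| Authorization | Yes | Authentication token, in the form `Bearer API_KEY`. | `Bearer sk-123456…` |
| Content-Type | Yes | Content type; always `application/json`. | `application/json` |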
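The two required headers can be produced by a small helper so the key never has to be hard-coded next to the request logic. This is a sketch only; `build_headers` is a hypothetical name.

```python
def build_headers(api_key: str) -> dict:
    """Return the two required request headers for the given API key."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {key}",   # format: Bearer API_KEY
        "Content-Type": "application/json",  # fixed value
    }
```

A typical call would read the key from the environment, for example `build_headers(os.environ["SENSEAUDIO_API_KEY"])`, where the variable name is an assumption made for this illustration.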

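Request parameters (Request Body)

Core parameters

| Name | Type | Required | Description | Example |
|---|---|---|---|---|
| model | string | Yes | Model name; a fixed value. | `SenseAudio-TTS-1.0` |
| text | string | Yes | Text to synthesize. Chinese and English are supported, up to 10,000 characters. `<break time=500>` is explained in the pause tag section below. | `你好,<break time=500>世界` |
| stream | boolean | Yes | Streaming output switch. The examples on this page send `false` for non-streaming synthesis; `true` selects streamed output. | `false` |
| voice_setting | object | Yes | Voice settings; see the voice_setting table below. | `{ "voice_id": "…" }` |
| audio_setting | object | No | Audio format settings; see the audio_setting table below. | `{ "sample_rate": 32000 }` |
| dictionary | array | No | Pronunciation overrides for polyphonic characters; see the dictionary table below. Applies to cloned voices only and requires the model `SenseAudio-TTS-1.5`. | `[{"original": "好干净", "replacement": "[hao4]干净"}]` |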
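A request body built from these parameters might be assembled as in the sketch below. It assumes the 10,000-character limit is counted with Python's `len` (code points) and that an empty `text` is invalid. `build_payload` is a hypothetical helper, and the voice ID is the one used in the cURL example on this page.

```python
import json

MAX_CHARS = 10000  # documented maximum length of `text`


def build_payload(text, voice_id, model="SenseAudio-TTS-1.0", stream=False,
                  audio_setting=None, dictionary=None) -> dict:
    """Assemble the JSON body; optional objects are left out when not given."""
    if not text:
        raise ValueError("text must not be empty")  # assumption, not documented
    if len(text) > MAX_CHARS:
        raise ValueError(f"text has {len(text)} characters, limit is {MAX_CHARS}")
    payload = {
        "model": model,
        "text": text,
        "stream": stream,
        "voice_setting": {"voice_id": voice_id},
    }
    if audio_setting is not None:
        payload["audio_setting"] = audio_setting
    if dictionary is not None:
        payload["dictionary"] = dictionary
    return payload


# ensure_ascii=False keeps Chinese text readable in the serialized body.
body = json.dumps(build_payload("你好,<break time=500>世界", "child_0001_a"),
                  ensure_ascii=False)
```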

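`<break>` pause tag

`<break>` inserts a pause into the synthesized speech.

```xml
<break time=500>
```
  • `time` is given in milliseconds (ms).
  • `500` means a pause of 500 ms.
  • The minimum value is 100 ms; there is no upper limit.

Example:

```text
你好<break time=500>欢迎使用我们的服务
```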
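When pauses are generated programmatically, the 100 ms minimum is easy to enforce before the text is sent. A small sketch, with `break_tag` and `join_with_pauses` as hypothetical helper names:

```python
MIN_BREAK_MS = 100  # documented minimum; no documented maximum


def break_tag(ms: int) -> str:
    """Return a <break> tag for a pause of `ms` milliseconds."""
    if not isinstance(ms, int) or ms < MIN_BREAK_MS:
        raise ValueError(f"pause must be an integer >= {MIN_BREAK_MS} ms")
    return f"<break time={ms}>"


def join_with_pauses(parts, ms: int = 500) -> str:
    """Join text fragments with the same pause between each pair."""
    return break_tag(ms).join(parts)


text = join_with_pauses(["你好", "欢迎使用我们的服务"])
# text == "你好<break time=500>欢迎使用我们的服务"
```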

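voice_setting (voice settings)

| Name | Type | Required | Description | Default | Range |
|---|---|---|---|---|---|
| voice_id | string | Yes | ID of a voice included in your plan or of a cloned voice; see the API voice service documentation. | - | - |
| speed | float | No | Speaking rate. | 1.0 | [0.5, 2.0] |
| vol | float | No | Volume. | 1.0 | [0, 10] |
| pitch | int | No | Pitch. | 0 | [-12, 12] |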
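The ranges in this table can be checked on the client before a request is sent. The sketch below assumes the bracketed ranges are inclusive at both ends; `build_voice_setting` is a hypothetical helper.

```python
def build_voice_setting(voice_id: str, speed: float = 1.0,
                        vol: float = 1.0, pitch: int = 0) -> dict:
    """Validate values against the documented ranges and return the object."""
    if not voice_id:
        raise ValueError("voice_id is required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within [0.5, 2.0]")
    if not 0 <= vol <= 10:
        raise ValueError("vol must be within [0, 10]")
    if not isinstance(pitch, int) or not -12 <= pitch <= 12:
        raise ValueError("pitch must be an integer within [-12, 12]")
    return {"voice_id": voice_id, "speed": speed, "vol": vol, "pitch": pitch}
```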

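audio_setting (audio settings)

| Name | Type | Required | Description | Default | Options |
|---|---|---|---|---|---|
| format | string | No | Audio encoding format. | `mp3` | mp3, wav, pcm, flac |
| sample_rate | int | No | Audio sample rate (Hz). | 32000 | 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | int | No | Bitrate (MP3 only). | 128000 | 32000, 64000, 128000, 256000 |
| channel | int | No | Number of channels. | 2 | 1 (mono), 2 (stereo) |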
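The option lists above translate directly into a client-side check. This sketch also derives an output file name from the chosen format, since the code samples on this page hard-code `output.mp3`. Leaving `bitrate` out for non-MP3 formats is an assumption based on the "MP3 only" note, and both helper names are hypothetical.

```python
FORMATS = ("mp3", "wav", "pcm", "flac")
SAMPLE_RATES = (8000, 16000, 22050, 24000, 32000, 44100)
MP3_BITRATES = (32000, 64000, 128000, 256000)
CHANNELS = (1, 2)


def build_audio_setting(fmt: str = "mp3", sample_rate: int = 32000,
                        bitrate: int = 128000, channel: int = 2) -> dict:
    """Validate against the documented options and return the object."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if channel not in CHANNELS:
        raise ValueError(f"unsupported channel count: {channel}")
    setting = {"format": fmt, "sample_rate": sample_rate, "channel": channel}
    if fmt == "mp3":
        # bitrate is documented as MP3-only, so it is sent only for MP3.
        if bitrate not in MP3_BITRATES:
            raise ValueError(f"unsupported bitrate: {bitrate}")
        setting["bitrate"] = bitrate
    return setting


def output_filename(setting: dict, stem: str = "output") -> str:
    """Pick a file name whose extension matches the requested format."""
    return f"{stem}.{setting['format']}"
```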

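dictionary (polyphonic character correction)

| Name | Type | Required | Description | Default | Example |
|---|---|---|---|---|---|
| original | string | Yes | Original text. | None | `铺床铺地,量米量酒杯` |
| replacement | string | Yes | The same text with each corrected character replaced by its pronunciation tag. | None | `铺床铺[di4],[liang2]米[liang4]酒杯` |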
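In both examples on this page, each `[pinyin+tone]` tag in `replacement` stands in for exactly one character of `original`, and every other character is unchanged. The sketch below builds a dictionary entry and checks that pattern. The tag syntax (lowercase letters followed by a tone digit 1 to 5) is inferred from those examples and is an assumption, as is the helper name `make_entry`.

```python
import re

# Assumed tag syntax, inferred from "[hao4]", "[di4]", "[liang2]".
_TAG = re.compile(r"\[[a-z]+[1-5]\]")
_SLOT = "\x00"  # placeholder marking where a tag replaced one character


def make_entry(original: str, replacement: str) -> dict:
    """Return a dictionary entry after checking it lines up with the original."""
    if not _TAG.search(replacement):
        raise ValueError("replacement contains no [pinyin+tone] tag")
    skeleton = _TAG.sub(lambda m: _SLOT, replacement)
    aligned = len(skeleton) == len(original) and all(
        s == _SLOT or s == o for s, o in zip(skeleton, original)
    )
    if not aligned:
        raise ValueError("replacement does not line up with original")
    return {"original": original, "replacement": replacement}


dictionary = [
    make_entry("好干净", "[hao4]干净"),
    make_entry("铺床铺地,量米量酒杯", "铺床铺[di4],[liang2]米[liang4]酒杯"),
]
```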

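Response structure

| Field | Type | Description |
|---|---|---|
| data | object | Synthesis result object. May be null, so check it before use. |
| data.audio | string | Synthesized audio, hex-encoded, in the output format specified in the request. |
| data.status | int64 | Audio stream status: 1 = synthesis in progress, 2 = synthesis finished. |
| extra_info | object | Additional information about the audio. In streaming responses only the last chunk carries it. |
| extra_info.audio_length | int64 | Audio duration (ms). |
| extra_info.audio_sample_rate | int64 | Audio sample rate. |
| extra_info.audio_size | int64 | Audio file size (bytes). |
| extra_info.bitrate | int64 | Audio bitrate. |
| extra_info.audio_format | string | Format of the generated audio file: mp3, pcm, flac, or wav. |
| extra_info.audio_channel | int | Number of channels in the generated audio: 1 = mono, 2 = stereo. |
| extra_info.word_count | int64 | Word count: the synthesized text counted in grapheme clusters, excluding clusters that consist only of whitespace, punctuation, or control characters. |
| extra_info.character_count | int64 | Character count: the synthesized text counted in Unicode code points. |
| base_resp | object | Status code and details for this request. |
| base_resp.status_code | int64 | Status code (HTTP status code). Note that the response example on this page shows `0` on success. |
| base_resp.status_message | string | Status details. |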
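Because `data` may be null and the audio arrives hex-encoded, it helps to normalize the response in one place. A minimal sketch, in which the `TTSResult` type, `parse_response`, and the `-1` fallback for a missing status code are all illustrative choices rather than part of the API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TTSResult:
    audio: Optional[bytes]   # decoded audio, or None when no audio was returned
    status: Optional[int]    # 1 = in progress, 2 = finished
    extra_info: dict
    status_code: int
    status_message: str


def parse_response(body: dict) -> TTSResult:
    """Turn a decoded JSON response into a TTSResult, tolerating null fields."""
    data = body.get("data") or {}        # data may be null
    base = body.get("base_resp") or {}
    audio_hex = data.get("audio")
    return TTSResult(
        audio=bytes.fromhex(audio_hex) if audio_hex else None,
        status=data.get("status"),
        extra_info=body.get("extra_info") or {},
        status_code=base.get("status_code", -1),  # -1: hypothetical "missing" marker
        status_message=base.get("status_message", ""),
    )
```

Callers can then test `result.audio is not None` before writing a file and fall back to `result.status_message` otherwise.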

Response example

```json
{
  "data": {
    "audio": "hex-encoded audio data...",
    "status": 2
  },
  "extra_info": {
    "audio_length": 3500,
    "audio_sample_rate": 32000,
    "audio_size": 56000,
    "bitrate": 128000,
    "audio_format": "mp3",
    "audio_channel": 1,
    "word_count": 24,
    "character_count": 30
  },
  "base_resp": {
    "status_code": 0,
    "status_message": "success"
  }
}
```
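The `extra_info` values in this example are mutually consistent: at a constant bitrate, size in bytes is roughly bitrate × duration ÷ 8. The short check below works that out, assuming constant-bitrate MP3 with no extra metadata; real files can differ by a few bytes or more. `expected_mp3_size` is a hypothetical helper.

```python
def expected_mp3_size(audio_length_ms: int, bitrate_bps: int) -> int:
    """Approximate size in bytes: bits per second * seconds / 8."""
    return audio_length_ms * bitrate_bps // 8000  # 1000 ms/s * 8 bits/byte


extra_info = {"audio_length": 3500, "bitrate": 128000, "audio_size": 56000}
estimate = expected_mp3_size(extra_info["audio_length"], extra_info["bitrate"])
print(estimate)  # → 56000, matching audio_size in the example
```

A large gap between the estimate and `audio_size` in a real response is a hint that the saved file is truncated or in a different format than expected.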

Code examples

cURL

```bash
# 1. Send the request and save the response
curl -X POST https://api.senseaudio.cn/v1/t2a_v2 \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "SenseAudio-TTS-1.0",
    "text": "道可道,非常道。名可名,非常名。无名天地之始,有名万物之母。",
    "stream": false,
    "voice_setting": {
      "voice_id": "child_0001_a"
    }
  }' \
  -o response.json

# 2. Extract the hex-encoded audio and decode it into a binary file
jq -r '.data.audio' response.json | xxd -r -p > output.mp3

# 3. Inspect the audio metadata
jq '.extra_info' response.json
```

Python

```python
import requests

API_URL = "https://api.senseaudio.cn/v1/t2a_v2"
HEADERS = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}


def tts_non_stream():
    """Non-streaming synthesis: send one request and save the audio to output.mp3."""
    payload = {
        "model": "SenseAudio-TTS-1.0",
        "text": "道可道,非常道。名可名,非常名。",
        "stream": False,
        "voice_setting": {
            "voice_id": "child_0001_a",
        },
    }

    resp = requests.post(API_URL, json=payload, headers=HEADERS, timeout=60)
    if resp.status_code != 200:
        print(f"Request failed, HTTP status: {resp.status_code}")
        return

    result = resp.json()
    data = result.get("data")  # may be null
    if data and data.get("audio"):
        # Decode the hex-encoded audio into raw bytes
        audio_bytes = bytes.fromhex(data["audio"])
        with open("output.mp3", "wb") as f:
            f.write(audio_bytes)
        print("Synthesis succeeded")
        print(f"Audio length: {result['extra_info']['audio_length']}ms")
    else:
        print(f"Synthesis failed: {result['base_resp']['status_message']}")


if __name__ == "__main__":
    tts_non_stream()
```

JavaScript

```javascript
const axios = require('axios');
const fs = require('fs');

const API_URL = 'https://api.senseaudio.cn/v1/t2a_v2';
const HEADERS = {
  'Authorization': 'Bearer YOUR_API_KEY',
  'Content-Type': 'application/json'
};

// Non-streaming synthesis
async function tts() {
  const payload = {
    model: 'SenseAudio-TTS-1.0',
    text: '道可道,非常道。名可名,非常名。',
    stream: false,
    voice_setting: {
      voice_id: 'female_jiaomei'
    }
  };

  const res = await axios.post(API_URL, payload, { headers: HEADERS });
  const result = res.data;

  if (result.data && result.data.audio) {
    // Decode the hex-encoded audio into a binary buffer
    const audioBuffer = Buffer.from(result.data.audio, 'hex');
    fs.writeFileSync('output.mp3', audioBuffer);
    console.log('Synthesis succeeded');
    console.log(`Audio length: ${result.extra_info.audio_length}ms`);
  } else {
    console.log(`Synthesis failed: ${result.base_resp.status_message}`);
  }
}

// Report network and HTTP errors instead of leaving the rejection unhandled
tts().catch((err) => console.error(`Request failed: ${err.message}`));
```

Go

```go
package main

import (
	"bytes"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

const (
	APIURL = "https://api.senseaudio.cn/v1/t2a_v2"
	APIKey = "YOUR_API_KEY"
)

// TTSRequest is the JSON request body.
type TTSRequest struct {
	Model        string       `json:"model"`
	Text         string       `json:"text"`
	Stream       bool         `json:"stream"`
	VoiceSetting VoiceSetting `json:"voice_setting"`
}

type VoiceSetting struct {
	VoiceID string `json:"voice_id"`
}

// TTSResponse holds the response fields this example uses.
// A null "data" leaves Data at its zero value.
type TTSResponse struct {
	Data struct {
		Audio  string `json:"audio"`
		Status int64  `json:"status"`
	} `json:"data"`
	ExtraInfo struct {
		AudioLength int64 `json:"audio_length"`
	} `json:"extra_info"`
	BaseResp struct {
		StatusCode    int64  `json:"status_code"`
		StatusMessage string `json:"status_message"`
	} `json:"base_resp"`
}

func main() {
	payload := TTSRequest{
		Model:  "SenseAudio-TTS-1.0",
		Text:   "道可道,非常道。名可名,非常名。",
		Stream: false,
		VoiceSetting: VoiceSetting{
			VoiceID: "female_jiaomei",
		},
	}

	jsonData, err := json.Marshal(payload)
	if err != nil {
		fmt.Println("Failed to encode request:", err)
		return
	}

	req, err := http.NewRequest("POST", APIURL, bytes.NewBuffer(jsonData))
	if err != nil {
		fmt.Println("Failed to build request:", err)
		return
	}
	req.Header.Set("Authorization", "Bearer "+APIKey)
	req.Header.Set("Content-Type", "application/json")

	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("Request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Failed to read response:", err)
		return
	}

	var result TTSResponse
	if err := json.Unmarshal(body, &result); err != nil {
		fmt.Println("Failed to parse response:", err)
		return
	}

	if result.Data.Audio == "" {
		fmt.Printf("Synthesis failed: %s\n", result.BaseResp.StatusMessage)
		return
	}

	// Decode the hex-encoded audio into raw bytes
	audioBytes, err := hex.DecodeString(result.Data.Audio)
	if err != nil {
		fmt.Println("Failed to decode audio:", err)
		return
	}
	if err := os.WriteFile("output.mp3", audioBytes, 0644); err != nil {
		fmt.Println("Failed to write file:", err)
		return
	}
	fmt.Println("Synthesis succeeded")
	fmt.Printf("Audio length: %dms\n", result.ExtraInfo.AudioLength)
}
```

Java

```java
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import org.json.JSONObject;

public class SenseAudioTTS {
    private static final String API_URL = "https://api.senseaudio.cn/v1/t2a_v2";
    private static final String API_KEY = "YOUR_API_KEY";

    public static void main(String[] args) {
        try {
            JSONObject voiceSetting = new JSONObject();
            voiceSetting.put("voice_id", "female_jiaomei");

            JSONObject payload = new JSONObject();
            payload.put("model", "SenseAudio-TTS-1.0");
            payload.put("text", "道可道,非常道。名可名,非常名。");
            payload.put("stream", false);
            payload.put("voice_setting", voiceSetting);

            URL url = new URL(API_URL);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Authorization", "Bearer " + API_KEY);
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);

            // Send the JSON body as UTF-8
            try (OutputStream os = conn.getOutputStream()) {
                byte[] input = payload.toString().getBytes(StandardCharsets.UTF_8);
                os.write(input, 0, input.length);
            }

            if (conn.getResponseCode() == 200) {
                StringBuilder response = new StringBuilder();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        response.append(line);
                    }
                }

                JSONObject result = new JSONObject(response.toString());
                JSONObject data = result.optJSONObject("data"); // null when data is absent or null

                if (data != null && data.has("audio")) {
                    // Decode the hex-encoded audio into raw bytes
                    String audioHex = data.getString("audio");
                    byte[] audioBytes = hexStringToByteArray(audioHex);
                    try (FileOutputStream fos = new FileOutputStream("output.mp3")) {
                        fos.write(audioBytes);
                    }
                    System.out.println("Synthesis succeeded");
                    System.out.println("Audio length: "
                            + result.getJSONObject("extra_info").getLong("audio_length") + "ms");
                } else {
                    System.out.println("Synthesis failed: "
                            + result.getJSONObject("base_resp").getString("status_message"));
                }
            } else {
                System.out.println("Request failed, status code: " + conn.getResponseCode());
            }
        } catch (Exception e) {
            System.out.println("Request error: " + e.getMessage());
        }
    }

    // Convert a hex string to a byte array (two hex digits per byte)
    private static byte[] hexStringToByteArray(String hex) {
        int len = hex.length();
        byte[] data = new byte[len / 2];
        for (int i = 0; i + 1 < len; i += 2) {
            data[i / 2] = (byte) ((Character.digit(hex.charAt(i), 16) << 4)
                    + Character.digit(hex.charAt(i + 1), 16));
        }
        return data;
    }
}
```

Swift

```swift
import Foundation

struct TTSRequest: Codable {
    let model: String
    let text: String
    let stream: Bool
    let voiceSetting: VoiceSetting

    enum CodingKeys: String, CodingKey {
        case model, text, stream
        case voiceSetting = "voice_setting"
    }
}

struct VoiceSetting: Codable {
    let voiceId: String

    enum CodingKeys: String, CodingKey {
        case voiceId = "voice_id"
    }
}

struct TTSResponse: Codable {
    let data: AudioData?      // may be null
    let extraInfo: ExtraInfo?
    let baseResp: BaseResp

    enum CodingKeys: String, CodingKey {
        case data
        case extraInfo = "extra_info"
        case baseResp = "base_resp"
    }
}

struct AudioData: Codable {
    let audio: String
    let status: Int64
}

struct ExtraInfo: Codable {
    let audioLength: Int64

    enum CodingKeys: String, CodingKey {
        case audioLength = "audio_length"
    }
}

struct BaseResp: Codable {
    let statusCode: Int64
    let statusMessage: String

    enum CodingKeys: String, CodingKey {
        case statusCode = "status_code"
        case statusMessage = "status_message"
    }
}

// Build Data from a hex string; returns nil on odd length or invalid digits
extension Data {
    init?(hexString: String) {
        guard hexString.count % 2 == 0 else { return nil }
        let len = hexString.count / 2
        var data = Data(capacity: len)
        var index = hexString.startIndex
        for _ in 0..<len {
            let nextIndex = hexString.index(index, offsetBy: 2)
            guard let byte = UInt8(hexString[index..<nextIndex], radix: 16) else {
                return nil
            }
            data.append(byte)
            index = nextIndex
        }
        self = data
    }
}

func textToSpeech() {
    let apiURL = "https://api.senseaudio.cn/v1/t2a_v2"
    let apiKey = "YOUR_API_KEY"

    let request = TTSRequest(
        model: "SenseAudio-TTS-1.0",
        text: "道可道,非常道。名可名,非常名。",
        stream: false,
        voiceSetting: VoiceSetting(voiceId: "female_jiaomei")
    )

    guard let url = URL(string: apiURL),
          let jsonData = try? JSONEncoder().encode(request) else {
        return
    }

    var urlRequest = URLRequest(url: url)
    urlRequest.httpMethod = "POST"
    urlRequest.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    urlRequest.setValue("application/json", forHTTPHeaderField: "Content-Type")
    urlRequest.httpBody = jsonData

    // Keeps a command-line run alive until the request completes
    let done = DispatchSemaphore(value: 0)

    let task = URLSession.shared.dataTask(with: urlRequest) { data, response, error in
        defer { done.signal() }

        guard let data = data, error == nil else {
            print("Request failed: \(error?.localizedDescription ?? "Unknown error")")
            return
        }

        do {
            let result = try JSONDecoder().decode(TTSResponse.self, from: data)

            // Decode the hex-encoded audio into raw bytes
            if let audioData = result.data,
               let audioBytes = Data(hexString: audioData.audio) {
                let fileURL = FileManager.default
                    .urls(for: .documentDirectory, in: .userDomainMask)[0]
                    .appendingPathComponent("output.mp3")
                try audioBytes.write(to: fileURL)
                print("Synthesis succeeded")
                if let extraInfo = result.extraInfo {
                    print("Audio length: \(extraInfo.audioLength)ms")
                }
            } else {
                print("Synthesis failed: \(result.baseResp.statusMessage)")
            }
        } catch {
            print("Failed to parse or save the response: \(error)")
        }
    }

    task.resume()
    done.wait()
}

textToSpeech()
```