zw/keyboard

CodeST 77fd46aa34 1

2026-01-23 21:51:37 +08:00

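New Audio Flow Overview

📋 Requirements Overview

Remove the ElevenLabs integration: requestElevenLabsSpeechWithText is no longer used.
New flow:
- Call the chat/message endpoint; the backend returns an audioId.
- When the user taps the voice button, request /chat/audio/{audioId} with that audioId to get the MP3 URL.
- If the audio has not been generated yet, show a waiting state (it stops automatically after 3 seconds).

🔄 New Flow Diagram

User speaks
    ↓
Speech recognition (Deepgram)
    ↓
Add the user message to the chat list
    ↓
Request the /chat/message endpoint
    ↓
Backend returns: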
{
  "code": 0,
  "message": "ok",
  "data": {
    "aiResponse": "AI reply text",
    "audioId": "6e3e90575ce04658ab6c45d77a506100",
    "llmDuration": 1572
  }
}
    ↓
Add the AI message to the chat list (with audioId)
    ↓
User taps the voice button
    ↓
Request the /chat/audio/{audioId} endpoint
    ↓
Backend returns:
{
  "code": 0,
  "data": {
    "url": "http://example.com/audio.mp3"
  }
}
    ↓
Download the audio file
    ↓
Play the audio

📝 Change List

1. AiVM.h/m (network layer)

New fields

@interface KBAiMessageData : NSObject
@property(nonatomic, copy, nullable) NSString *aiResponse;  // new
@property(nonatomic, copy, nullable) NSString *audioId;     // new
@property(nonatomic, assign) NSInteger llmDuration;         // new
@end

New API

/// Fetch the audio URL for the given audioId
- (void)requestAudioWithAudioId:(NSString *)audioId
                     completion:(AiVMAudioURLCompletion)completion;
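
The declaration above does not show how the request is made, so here is a minimal sketch of one possible implementation. It assumes a plain GET with no authentication, a hypothetical base URL (`kAssumedBaseURL`), a hypothetical error domain, and that `AiVMAudioURLCompletion` takes `(audioURL, error)` as in the call site shown later in this document. The project's real networking layer may differ.

```objc
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

// Assumed block type, inferred from the (audioURL, error) callback used in this document.
typedef void (^AiVMAudioURLCompletion)(NSString * _Nullable audioURL, NSError * _Nullable error);

// Hypothetical host and error domain; substitute the project's real values.
static NSString * const kAssumedBaseURL = @"https://api.example.com";
static NSString * const kAssumedErrorDomain = @"AiVMAudioErrorDomain";

@interface AiVM : NSObject
- (void)requestAudioWithAudioId:(NSString *)audioId
                     completion:(AiVMAudioURLCompletion)completion;
@end

@implementation AiVM

- (void)requestAudioWithAudioId:(NSString *)audioId
                     completion:(AiVMAudioURLCompletion)completion {
    NSString *path = [NSString stringWithFormat:@"%@/chat/audio/%@", kAssumedBaseURL, audioId];
    NSURL *url = [NSURL URLWithString:path];
    if (url == nil) {
        completion(nil, [NSError errorWithDomain:kAssumedErrorDomain code:-1 userInfo:nil]);
        return;
    }

    NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:url
        completionHandler:^(NSData * _Nullable data, NSURLResponse * _Nullable response, NSError * _Nullable error) {
        NSString *audioURL = nil;
        if (error == nil && data != nil) {
            // Expected shape: { "code": 0, "data": { "url": "..." } }
            id root = [NSJSONSerialization JSONObjectWithData:data options:0 error:NULL];
            NSDictionary *rootDict = [root isKindOfClass:[NSDictionary class]] ? root : nil;
            id payload = rootDict[@"data"];
            NSDictionary *payloadDict = [payload isKindOfClass:[NSDictionary class]] ? payload : nil;
            id value = payloadDict[@"url"];
            if ([value isKindOfClass:[NSString class]]) {
                audioURL = value;
            }
        }
        NSError *result = error;
        if (audioURL == nil && result == nil) {
            // Any body without data.url is treated as a failure (an assumption of this sketch).
            result = [NSError errorWithDomain:kAssumedErrorDomain code:-2 userInfo:nil];
        }
        // Deliver on the main queue so callers can update UI directly.
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(audioURL, result);
        });
    }];
    [task resume];
}

@end

NS_ASSUME_NONNULL_END
```

Delivering the result on the main queue is a design choice of this sketch: it lets the chat view touch cells from the completion block without an extra dispatch.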

Removed API

// ❌ Removed
- (void)requestElevenLabsSpeechWithText:...

2. KBChatMessage.h/m (message model)

New field

/// Audio ID, used to load the audio asynchronously
@property (nonatomic, copy, nullable) NSString *audioId;

New factory method

/// Create an AI message (with an audioId; the audio is loaded asynchronously)
+ (instancetype)assistantMessageWithText:(NSString *)text
                                 audioId:(nullable NSString *)audioId;
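
As an illustration of the factory method declared above, here is a minimal sketch. The `text` property is an assumption (the real model is not shown in this document), and any sender or role field the real `KBChatMessage` carries is left out.

```objc
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface KBChatMessage : NSObject
@property (nonatomic, copy) NSString *text;               // assumed existing property
@property (nonatomic, copy, nullable) NSString *audioId;  // from this document
+ (instancetype)assistantMessageWithText:(NSString *)text
                                 audioId:(nullable NSString *)audioId;
@end

@implementation KBChatMessage

+ (instancetype)assistantMessageWithText:(NSString *)text
                                 audioId:(nullable NSString *)audioId {
    KBChatMessage *message = [[self alloc] init];
    message.text = text;
    // audioId may be nil; in that case the voice button has nothing to load.
    message.audioId = audioId;
    return message;
}

@end

NS_ASSUME_NONNULL_END
```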

3. KBChatTableView.h/m (chat view)

New API

/// Add an AI message (with an audioId; the audio is loaded asynchronously)
- (void)addAssistantMessage:(NSString *)text
                    audioId:(nullable NSString *)audioId;

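New behavior

Asynchronous audio loading: when the voice button is tapped and the message has an audioId, request the audio URL.
Waiting state: while the audio loads, the cell shows the playing state; it stops automatically after 3 seconds.
Audio caching: downloaded audio data is cached on the message object, so the next tap plays it directly.

New methods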

- (void)loadAndPlayAudioForMessage:(KBChatMessage *)message atIndexPath:(NSIndexPath *)indexPath;
- (void)downloadAndPlayAudioFromURL:(NSString *)urlString forMessage:(KBChatMessage *)message atIndexPath:(NSIndexPath *)indexPath;
- (void)startWaitingForCell:(NSIndexPath *)indexPath;
- (void)stopWaitingForCell:(NSIndexPath *)indexPath;
- (void)waitingTimeout;

4. KBAiMainVC.m (main view controller)

Removed code

// ❌ Removed
@property(nonatomic, copy) NSString *elevenLabsApiKey;
@property(nonatomic, copy) NSString *elevenLabsVoiceId;

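// ❌ Removed
self.elevenLabsVoiceId = @"...";
self.elevenLabsApiKey = @"...";

// ❌ Removed all ElevenLabs-related calls

Changed flow

// Before:
// 1. Request chat/message
// 2. Request ElevenLabs TTS
// 3. Add the message (with audio data)
// 4. Play the audio

// Now:
// 1. Request chat/message (returns audioId)
// 2. Add the message (with audioId)
// 3. Load the audio asynchronously when the user taps the voice button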

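🎯 Core Implementation

1. Voice button tap handling

- (void)assistantMessageCell:(KBChatAssistantMessageCell *)cell
    didTapVoiceButtonForMessage:(KBChatMessage *)message {

    // The index path is not passed to this delegate method, so look it up from the cell
    NSIndexPath *indexPath = [self.tableView indexPathForCell:cell];
    if (!indexPath) {
        return;
    }

    // If audioData is already cached, play it directly
    if (message.audioData.length > 0) {
        [self playAudioForMessage:message atIndexPath:indexPath];
        return;
    }

    // If there is an audioId, load the audio asynchronously
    if (message.audioId.length > 0) {
        [self loadAndPlayAudioForMessage:message atIndexPath:indexPath];
        return;
    }
}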

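2. Loading audio asynchronously

- (void)loadAndPlayAudioForMessage:(KBChatMessage *)message atIndexPath:(NSIndexPath *)indexPath {
    // 1. Start the waiting state (shows the playing indicator)
    [self startWaitingForCell:indexPath];

    // 2. Request the audio URL
    __weak typeof(self) weakSelf = self;
    [self.aiVM requestAudioWithAudioId:message.audioId
                            completion:^(NSString *audioURL, NSError *error) {
        // 3. Stop the waiting state
        [weakSelf stopWaitingForCell:indexPath];

        // Treat a missing URL the same as an error
        if (error || audioURL.length == 0) {
            NSLog(@"[KBChatTableView] Failed to load audio: %@", error);
            return;
        }

        // 4. Download the audio data
        [weakSelf downloadAndPlayAudioFromURL:audioURL
                                   forMessage:message
                                  atIndexPath:indexPath];
    }];
}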

3. Waiting state (3-second timeout)

- (void)startWaitingForCell:(NSIndexPath *)indexPath {
    // Cancel any earlier timer so a stale timeout cannot fire for this cell
    [self.waitingTimer invalidate];
    self.waitingCellIndexPath = indexPath;

    // Put the cell into the waiting state (shows the playing icon)
    KBChatAssistantMessageCell *cell =
        (KBChatAssistantMessageCell *)[self.tableView cellForRowAtIndexPath:indexPath];
    [cell updateVoicePlayingState:YES];

    // Stop automatically after 3 seconds
    self.waitingTimer = [NSTimer scheduledTimerWithTimeInterval:3.0
                                                         target:self
                                                       selector:@selector(waitingTimeout)
                                                       userInfo:nil
                                                        repeats:NO];
}

- (void)waitingTimeout {
    NSLog(@"Audio loading timed out");
    [self stopWaitingForCell:self.waitingCellIndexPath];
}
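
The counterpart `stopWaitingForCell:` is declared in the change list but not shown. A minimal sketch of what it might look like follows; the stand-in class declarations (including `UIView` as the superclass of `KBChatTableView`) are assumptions made only so the snippet compiles on its own, and the property names are taken from the code above.

```objc
#import <UIKit/UIKit.h>

NS_ASSUME_NONNULL_BEGIN

// Stand-in for the real cell class; only the method used here is declared.
@interface KBChatAssistantMessageCell : UITableViewCell
- (void)updateVoicePlayingState:(BOOL)playing;
@end

@implementation KBChatAssistantMessageCell
- (void)updateVoicePlayingState:(BOOL)playing {
    // The real cell would switch its voice icon here.
}
@end

// Stand-in for the real chat view; the UIView superclass is an assumption.
@interface KBChatTableView : UIView
@property (nonatomic, strong) UITableView *tableView;
@property (nonatomic, strong, nullable) NSIndexPath *waitingCellIndexPath;
@property (nonatomic, strong, nullable) NSTimer *waitingTimer;
- (void)stopWaitingForCell:(nullable NSIndexPath *)indexPath;
@end

@implementation KBChatTableView

- (void)stopWaitingForCell:(nullable NSIndexPath *)indexPath {
    // Cancel the pending timeout so it cannot fire after a successful load.
    [self.waitingTimer invalidate];
    self.waitingTimer = nil;
    self.waitingCellIndexPath = nil;

    if (indexPath == nil) {
        return; // Already stopped, e.g. the timeout fired after the request finished.
    }

    // cellForRowAtIndexPath: returns nil for off-screen rows; messaging nil is a no-op.
    UITableViewCell *cell = [self.tableView cellForRowAtIndexPath:indexPath];
    if ([cell isKindOfClass:[KBChatAssistantMessageCell class]]) {
        [(KBChatAssistantMessageCell *)cell updateVoicePlayingState:NO];
    }
}

@end

NS_ASSUME_NONNULL_END
```

Invalidating the timer here means both exit paths (the request completing and the 3-second timeout) converge on the same cleanup, so calling the method twice is harmless.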

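4. Audio caching

- (void)downloadAndPlayAudioFromURL:(NSString *)urlString
                         forMessage:(KBChatMessage *)message
                        atIndexPath:(NSIndexPath *)indexPath {
    NSURL *url = [NSURL URLWithString:urlString];
    if (!url) {
        NSLog(@"[KBChatTableView] Invalid audio URL: %@", urlString);
        return;
    }

    // Download the audio (the shared session is an assumption; the original omits the session setup)
    __weak typeof(self) weakSelf = self;
    NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:url
                                        completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
        if (error || data.length == 0) {
            NSLog(@"[KBChatTableView] Audio download failed: %@", error);
            return;
        }
        // The handler runs on a background queue; switch to the main queue for UI work
        dispatch_async(dispatch_get_main_queue(), ^{
            // Cache on the message object
            message.audioData = data;

            // Compute the duration
            AVAudioPlayer *player = [[AVAudioPlayer alloc] initWithData:data error:nil];
            message.audioDuration = player.duration;

            // Refresh the cell (the row animation is an assumption; the original elides it)
            [weakSelf.tableView reloadRowsAtIndexPaths:@[indexPath]
                                      withRowAnimation:UITableViewRowAnimationNone];

            // Play the audio
            [weakSelf playAudioForMessage:message atIndexPath:indexPath];
        });
    }];
    [task resume];
}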

📊 Data Structures

chat/message response

{
  "code": 0,
  "message": "ok",
  "data": {
    "aiResponse": "Ugh, seriously? It's Tiffany...",
    "audioId": "6e3e90575ce04658ab6c45d77a506100",
    "llmDuration": 1572
  }
}
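
To show how the response above maps onto `KBAiMessageData`, here is a minimal hand-written parsing sketch. `KBParseChatMessageResponse` is a hypothetical helper, not part of the project, and treating `code == 0` as success is an assumption based on the sample; the real code may well use a JSON mapping library instead.

```objc
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface KBAiMessageData : NSObject
@property(nonatomic, copy, nullable) NSString *aiResponse;
@property(nonatomic, copy, nullable) NSString *audioId;
@property(nonatomic, assign) NSInteger llmDuration;
@end

@implementation KBAiMessageData
@end

// Hypothetical helper: returns nil when the body is malformed or code != 0.
static KBAiMessageData * _Nullable KBParseChatMessageResponse(NSData *jsonData) {
    id root = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:NULL];
    if (![root isKindOfClass:[NSDictionary class]]) {
        return nil;
    }
    NSDictionary *rootDict = root;

    // Assumption: code 0 means success, as in the sample response.
    id code = rootDict[@"code"];
    if (![code isKindOfClass:[NSNumber class]] || [code integerValue] != 0) {
        return nil;
    }

    id payload = rootDict[@"data"];
    if (![payload isKindOfClass:[NSDictionary class]]) {
        return nil;
    }
    NSDictionary *payloadDict = payload;

    KBAiMessageData *result = [[KBAiMessageData alloc] init];
    id text = payloadDict[@"aiResponse"];
    if ([text isKindOfClass:[NSString class]]) {
        result.aiResponse = text;
    }
    id audioId = payloadDict[@"audioId"];
    if ([audioId isKindOfClass:[NSString class]]) {
        result.audioId = audioId;
    }
    id duration = payloadDict[@"llmDuration"];
    if ([duration isKindOfClass:[NSNumber class]]) {
        result.llmDuration = [duration integerValue];
    }
    return result;
}

NS_ASSUME_NONNULL_END
```

The type checks guard against `NSNull` or unexpected types in the JSON, which would otherwise crash when assigned to a typed property and used later.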

/chat/audio/{audioId} response

{
  "code": 0,
  "data": {
    "url": "http://127.0.0.1:4523/m1/7401033-7133645-default/chat/audio/1"
  }
}

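✅ Benefits

TTS handled by the backend: the client no longer needs an ElevenLabs API key.
Asynchronous loading: the UI is not blocked while the audio loads.
Audio caching: audio is cached after download, so tapping again plays it directly.
Waiting state: the 3-second timeout prevents waiting indefinitely.
Lower coupling: the client does not need to know how TTS is implemented.

🧪 Test Checklist

The chat/message request succeeds
The returned audioId is stored correctly
Tapping the voice button triggers asynchronous loading
The waiting state is displayed correctly (playing icon)
Waiting stops automatically after 3 seconds
The audio URL request succeeds
The audio download succeeds
The audio plays correctly
Tapping again plays directly (cache hit)
Errors are handled correctly (network failure, invalid URL, etc.)

🔧 Debugging Tips

1. Check the logs

NSLog(@"[KBChatTableView] Failed to load audio: %@", error);
NSLog(@"[KBChatTableView] Audio URL: %@", audioURL);
NSLog(@"[KBChatTableView] Audio loading timed out");

2. Check the response data

Confirm that audioId is not empty
Confirm that the URL returned by /chat/audio/{audioId} is well formed
Confirm that the URL is reachable

3. Test the timeout

Deliberately delay the backend response and verify that the 3-second timeout takes effect

🎉 Done!

The new audio flow is fully implemented and provides the following:

✅ TTS handled by the backend
✅ Asynchronous audio loading
✅ Waiting state (3-second timeout)
✅ Audio caching
✅ Error handling

Run the project to test the new flow.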
