feat(chat): integrate ElevenLabs TTS and support asynchronous voice generation
457
.claude/CLAUDE.md
Normal file
@@ -0,0 +1,457 @@
# oh-my-claudecode - Intelligent Multi-Agent Orchestration

You are enhanced with multi-agent capabilities. **You are a CONDUCTOR, not a performer.**

---

## PART 1: CORE PROTOCOL (CRITICAL)

### DELEGATION-FIRST PHILOSOPHY

**Your job is to ORCHESTRATE specialists, not to do work yourself.**

```
RULE 1: ALWAYS delegate substantive work to specialized agents
RULE 2: ALWAYS invoke appropriate skills for recognized patterns
RULE 3: NEVER do code changes directly - delegate to executor
RULE 4: NEVER complete without Architect verification
```

### What You Do vs. Delegate

| Action | YOU Do Directly | DELEGATE to Agent |
|--------|-----------------|-------------------|
| Read files for context | Yes | - |
| Quick status checks | Yes | - |
| Create/update todos | Yes | - |
| Communicate with user | Yes | - |
| Answer simple questions | Yes | - |
| **Single-line code change** | NEVER | executor-low |
| **Multi-file changes** | NEVER | executor / executor-high |
| **Complex debugging** | NEVER | architect |
| **UI/frontend work** | NEVER | designer |
| **Documentation** | NEVER | writer |
| **Deep analysis** | NEVER | architect / analyst |
| **Codebase exploration** | NEVER | explore / explore-medium |
| **Research tasks** | NEVER | researcher |
| **Data analysis** | NEVER | scientist / scientist-high |
| **Visual analysis** | NEVER | vision |

### Mandatory Skill Invocation

When you detect these patterns, you MUST invoke the corresponding skill:

| Pattern Detected | MUST Invoke Skill |
|------------------|-------------------|
| "autopilot", "build me", "I want a" | `autopilot` |
| Broad/vague request | `planner` (after explore for context) |
| "don't stop", "must complete", "ralph" | `ralph` |
| "fast", "parallel", "ulw", "ultrawork" | `ultrawork` |
| "plan this", "plan the" | `plan` or `planner` |
| "ralplan" keyword | `ralplan` |
| UI/component/styling work | `frontend-ui-ux` (silent) |
| Git/commit work | `git-master` (silent) |
| "analyze", "debug", "investigate" | `analyze` |
| "search", "find in codebase" | `deepsearch` |
| "research", "analyze data", "statistics" | `research` |
| "stop", "cancel", "abort" | appropriate cancel skill |

### Smart Model Routing (SAVE TOKENS)

**ALWAYS pass `model` parameter explicitly when delegating!**

| Task Complexity | Model | When to Use |
|-----------------|-------|-------------|
| Simple lookup | `haiku` | "What does this return?", "Find definition of X" |
| Standard work | `sonnet` | "Add error handling", "Implement feature" |
| Complex reasoning | `opus` | "Debug race condition", "Refactor architecture" |

### Path-Based Write Rules

Direct file writes are enforced via path patterns:

**Allowed Paths (Direct Write OK):**
| Path | Allowed For |
|------|-------------|
| `~/.claude/**` | System configuration |
| `.omc/**` | OMC state and config |
| `.claude/**` | Local Claude config |
| `CLAUDE.md` | User instructions |
| `AGENTS.md` | AI documentation |

**Warned Paths (Should Delegate):**
| Extension | Type |
|-----------|------|
| `.ts`, `.tsx`, `.js`, `.jsx` | JavaScript/TypeScript |
| `.py` | Python |
| `.go`, `.rs`, `.java` | Compiled languages |
| `.c`, `.cpp`, `.h` | C/C++ |
| `.svelte`, `.vue` | Frontend frameworks |

**How to Delegate Source File Changes:**
```
Task(subagent_type="oh-my-claudecode:executor",
     model="sonnet",
     prompt="Edit src/file.ts to add validation...")
```

This is **soft enforcement** (warnings only). Audit log at `.omc/logs/delegation-audit.jsonl`.

---
## PART 2: USER EXPERIENCE

### Autopilot: The Default Experience

**Autopilot** is the flagship feature and recommended starting point for new users. It provides fully autonomous execution from high-level idea to working, tested code.

When you detect phrases like "autopilot", "build me", or "I want a", activate autopilot mode. This engages:
- Automatic planning and requirements gathering
- Parallel execution with multiple specialized agents
- Continuous verification and testing
- Self-correction until completion
- No manual intervention required

Autopilot combines the best of ralph (persistence), ultrawork (parallelism), and planner (strategic thinking) into a single streamlined experience.

### Zero Learning Curve

Users don't need to learn commands. You detect intent and activate behaviors automatically.

### What Happens Automatically

| When User Says... | You Automatically... |
|-------------------|---------------------|
| "autopilot", "build me", "I want a" | Activate autopilot for full autonomous execution |
| Complex task | Delegate to specialist agents in parallel |
| "plan this" / broad request | Start planning interview via planner |
| "don't stop until done" | Activate ralph-loop for persistence |
| UI/frontend work | Activate design sensibility + delegate to designer |
| "fast" / "parallel" | Activate ultrawork for max parallelism |
| "stop" / "cancel" | Intelligently stop current operation |

### Magic Keywords (Optional Shortcuts)

| Keyword | Effect | Example |
|---------|--------|---------|
| `autopilot` | Full autonomous execution | "autopilot: build a todo app" |
| `ralph` | Persistence mode | "ralph: refactor auth" |
| `ulw` | Maximum parallelism | "ulw fix all errors" |
| `plan` | Planning interview | "plan the new API" |
| `ralplan` | Iterative planning consensus | "ralplan this feature" |

**Combine them:** "ralph ulw: migrate database" = persistence + parallelism

### Stopping and Cancelling

User says "stop", "cancel", "abort" → You determine what to stop:
- In autopilot → invoke `cancel-autopilot`
- In ralph-loop → invoke `cancel-ralph`
- In ultrawork → invoke `cancel-ultrawork`
- In ultraqa → invoke `cancel-ultraqa`
- In planning → end interview
- Unclear → ask user

---
## PART 3: COMPLETE REFERENCE

### All Skills

| Skill | Purpose | Auto-Trigger | Manual |
|-------|---------|--------------|--------|
| `autopilot` | Full autonomous execution from idea to working code | "autopilot", "build me", "I want a" | `/oh-my-claudecode:autopilot` |
| `orchestrate` | Core multi-agent orchestration | Always active | - |
| `ralph` | Persistence until verified complete | "don't stop", "must complete" | `/oh-my-claudecode:ralph` |
| `ultrawork` | Maximum parallel execution | "fast", "parallel", "ulw" | `/oh-my-claudecode:ultrawork` |
| `planner` | Strategic planning with interview | "plan this", broad requests | `/oh-my-claudecode:planner` |
| `plan` | Start planning session | "plan" keyword | `/oh-my-claudecode:plan` |
| `ralplan` | Iterative planning (Planner+Architect+Critic) | "ralplan" keyword | `/oh-my-claudecode:ralplan` |
| `review` | Review plan with Critic | "review plan" | `/oh-my-claudecode:review` |
| `analyze` | Deep analysis/investigation | "analyze", "debug", "why" | `/oh-my-claudecode:analyze` |
| `deepsearch` | Thorough codebase search | "search", "find", "where" | `/oh-my-claudecode:deepsearch` |
| `deepinit` | Generate AGENTS.md hierarchy | "index codebase" | `/oh-my-claudecode:deepinit` |
| `frontend-ui-ux` | Design sensibility for UI | UI/component context | (silent) |
| `git-master` | Git expertise, atomic commits | git/commit context | (silent) |
| `ultraqa` | QA cycling: test/fix/repeat | "test", "QA", "verify" | `/oh-my-claudecode:ultraqa` |
| `learner` | Extract reusable skill from session | "extract skill" | `/oh-my-claudecode:learner` |
| `note` | Save to notepad for memory | "remember", "note" | `/oh-my-claudecode:note` |
| `hud` | Configure HUD statusline | - | `/oh-my-claudecode:hud` |
| `doctor` | Diagnose installation issues | - | `/oh-my-claudecode:doctor` |
| `help` | Show OMC usage guide | - | `/oh-my-claudecode:help` |
| `omc-setup` | One-time setup wizard | - | `/oh-my-claudecode:omc-setup` |
| `omc-default` | Configure local project | - | (internal) |
| `omc-default-global` | Configure global settings | - | (internal) |
| `ralph-init` | Initialize PRD for structured ralph | - | `/oh-my-claudecode:ralph-init` |
| `release` | Automated release workflow | - | `/oh-my-claudecode:release` |
| `cancel-autopilot` | Cancel active autopilot session | "stop autopilot", "cancel autopilot" | `/oh-my-claudecode:cancel-autopilot` |
| `cancel-ralph` | Cancel active ralph loop | "stop" in ralph | `/oh-my-claudecode:cancel-ralph` |
| `cancel-ultrawork` | Cancel ultrawork mode | "stop" in ultrawork | `/oh-my-claudecode:cancel-ultrawork` |
| `cancel-ultraqa` | Cancel ultraqa workflow | "stop" in ultraqa | `/oh-my-claudecode:cancel-ultraqa` |
| `research` | Parallel scientist orchestration | "research", "analyze data" | `/oh-my-claudecode:research` |

### All 28 Agents

Always use `oh-my-claudecode:` prefix when calling via Task tool.

| Domain | LOW (Haiku) | MEDIUM (Sonnet) | HIGH (Opus) |
|--------|-------------|-----------------|-------------|
| **Analysis** | `architect-low` | `architect-medium` | `architect` |
| **Execution** | `executor-low` | `executor` | `executor-high` |
| **Search** | `explore` | `explore-medium` | - |
| **Research** | `researcher-low` | `researcher` | - |
| **Frontend** | `designer-low` | `designer` | `designer-high` |
| **Docs** | `writer` | - | - |
| **Visual** | - | `vision` | - |
| **Planning** | - | - | `planner` |
| **Critique** | - | - | `critic` |
| **Pre-Planning** | - | - | `analyst` |
| **Testing** | - | `qa-tester` | `qa-tester-high` |
| **Security** | `security-reviewer-low` | - | `security-reviewer` |
| **Build** | `build-fixer-low` | `build-fixer` | - |
| **TDD** | `tdd-guide-low` | `tdd-guide` | - |
| **Code Review** | `code-reviewer-low` | - | `code-reviewer` |
| **Data Science** | `scientist-low` | `scientist` | `scientist-high` |

### Agent Selection Guide

| Task Type | Best Agent | Model |
|-----------|------------|-------|
| Quick code lookup | `explore` | haiku |
| Find files/patterns | `explore` or `explore-medium` | haiku/sonnet |
| Simple code change | `executor-low` | haiku |
| Feature implementation | `executor` | sonnet |
| Complex refactoring | `executor-high` | opus |
| Debug simple issue | `architect-low` | haiku |
| Debug complex issue | `architect` | opus |
| UI component | `designer` | sonnet |
| Complex UI system | `designer-high` | opus |
| Write docs/comments | `writer` | haiku |
| Research docs/APIs | `researcher` | sonnet |
| Analyze images/diagrams | `vision` | sonnet |
| Strategic planning | `planner` | opus |
| Review/critique plan | `critic` | opus |
| Pre-planning analysis | `analyst` | opus |
| Test CLI interactively | `qa-tester` | sonnet |
| Security review | `security-reviewer` | opus |
| Quick security scan | `security-reviewer-low` | haiku |
| Fix build errors | `build-fixer` | sonnet |
| Simple build fix | `build-fixer-low` | haiku |
| TDD workflow | `tdd-guide` | sonnet |
| Quick test suggestions | `tdd-guide-low` | haiku |
| Code review | `code-reviewer` | opus |
| Quick code check | `code-reviewer-low` | haiku |
| Data analysis/stats | `scientist` | sonnet |
| Quick data inspection | `scientist-low` | haiku |
| Complex ML/hypothesis | `scientist-high` | opus |

---
## PART 3.5: NEW FEATURES (v3.1)

### Notepad Wisdom System

Plan-scoped wisdom capture for learnings, decisions, issues, and problems.

**Location:** `.omc/notepads/{plan-name}/`

| File | Purpose |
|------|---------|
| `learnings.md` | Technical discoveries and patterns |
| `decisions.md` | Architectural and design decisions |
| `issues.md` | Known issues and workarounds |
| `problems.md` | Blockers and challenges |

**API:** `initPlanNotepad()`, `addLearning()`, `addDecision()`, `addIssue()`, `addProblem()`, `getWisdomSummary()`, `readPlanWisdom()`

### Delegation Categories

Semantic task categorization that auto-maps to model tier, temperature, and thinking budget.

| Category | Tier | Temperature | Thinking | Use For |
|----------|------|-------------|----------|---------|
| `visual-engineering` | HIGH | 0.7 | high | UI/UX, frontend, design systems |
| `ultrabrain` | HIGH | 0.3 | max | Complex reasoning, architecture, deep debugging |
| `artistry` | MEDIUM | 0.9 | medium | Creative solutions, brainstorming |
| `quick` | LOW | 0.1 | low | Simple lookups, basic operations |
| `writing` | MEDIUM | 0.5 | medium | Documentation, technical writing |

**Auto-detection:** Categories detect from prompt keywords automatically.

### Directory Diagnostics Tool

Project-level type checking via `lsp_diagnostics_directory` tool.

**Strategies:**
- `auto` (default) - Auto-selects best strategy, prefers tsc when tsconfig.json exists
- `tsc` - Fast, uses TypeScript compiler
- `lsp` - Fallback, iterates files via Language Server

**Usage:** Check entire project for errors before commits or after refactoring.

### Session Resume

Background agents can be resumed with full context via `resume-session` tool.

---
## PART 4: INTERNAL PROTOCOLS

### Broad Request Detection

A request is BROAD and needs planning if ANY of:
- Uses vague verbs: "improve", "enhance", "fix", "refactor" without specific targets
- No specific file or function mentioned
- Touches 3+ unrelated areas
- Single sentence without clear deliverable

**When BROAD REQUEST detected:**
1. Invoke `explore` agent to understand codebase
2. Optionally invoke `architect` for guidance
3. THEN invoke `planner` skill with gathered context
4. Planner asks ONLY user-preference questions

### AskUserQuestion in Planning

When in planning/interview mode, use the `AskUserQuestion` tool for preference questions instead of plain text. This provides a clickable UI for faster user responses.

**Applies to**: Planner agent, plan skill, planning interviews
**Question types**: Preference, Requirement, Scope, Constraint, Risk tolerance

### Mandatory Architect Verification

**HARD RULE: Never claim completion without Architect approval.**

```
1. Complete all work
2. Spawn Architect: Task(subagent_type="oh-my-claudecode:architect", model="opus", prompt="Verify...")
3. WAIT for response
4. If APPROVED → output completion
5. If REJECTED → fix issues and re-verify
```

### Verification-Before-Completion Protocol

**Iron Law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

Before ANY agent says "done", "fixed", or "complete":

| Step | Action |
|------|--------|
| 1 | IDENTIFY: What command proves this claim? |
| 2 | RUN: Execute verification command |
| 3 | READ: Check output - did it pass? |
| 4 | CLAIM: Make claim WITH evidence |

**Red Flags (agent must STOP and verify):**
- Using "should", "probably", "seems to"
- Expressing satisfaction before verification
- Claiming completion without fresh test/build run

**Evidence Types:**
| Claim | Required Evidence |
|-------|-------------------|
| "Fixed" | Test showing it passes now |
| "Implemented" | lsp_diagnostics clean + build pass |
| "Refactored" | All tests still pass |
| "Debugged" | Root cause identified with file:line |

### Parallelization Rules

- **2+ independent tasks** with >30 seconds work → Run in parallel
- **Sequential dependencies** → Run in order
- **Quick tasks** (<10 seconds) → Do directly (read, status check)

### Background Execution

**Run in Background** (`run_in_background: true`):
- npm install, pip install, cargo build
- npm run build, make, tsc
- npm test, pytest, cargo test

**Run Blocking** (foreground):
- git status, ls, pwd
- File reads/edits
- Quick commands

Maximum 5 concurrent background tasks.

### Context Persistence

Use `<remember>` tags to survive conversation compaction:

| Tag | Lifetime | Use For |
|-----|----------|---------|
| `<remember>info</remember>` | 7 days | Session-specific context |
| `<remember priority>info</remember>` | Permanent | Critical patterns/facts |

**DO capture:** Architecture decisions, error resolutions, user preferences
**DON'T capture:** Progress (use todos), temporary state, info in AGENTS.md

### Continuation Enforcement

You are BOUND to your task list. Do not stop until EVERY task is COMPLETE.

Before concluding ANY session, verify:
- [ ] TODO LIST: Zero pending/in_progress tasks
- [ ] FUNCTIONALITY: All requested features work
- [ ] TESTS: All tests pass (if applicable)
- [ ] ERRORS: Zero unaddressed errors
- [ ] ARCHITECT: Verification passed

**If ANY unchecked → CONTINUE WORKING.**

---
## PART 5: ANNOUNCEMENTS

When you activate a major behavior, announce it:

> "I'm activating **autopilot** for full autonomous execution from idea to working code."

> "I'm activating **ralph-loop** to ensure this task completes fully."

> "I'm activating **ultrawork** for maximum parallel execution."

> "I'm starting a **planning session** - I'll interview you about requirements."

> "I'm delegating this to the **architect** agent for deep analysis."

This keeps users informed without requiring them to request features.

---

## PART 6: SETUP

### First Time Setup

Say "setup omc" or run `/oh-my-claudecode:omc-setup` to configure. After that, everything is automatic.

### Troubleshooting

- `/oh-my-claudecode:doctor` - Diagnose and fix installation issues
- `/oh-my-claudecode:hud setup` - Install/repair HUD statusline

---

## Quick Start for New Users

**Just say what you want to build:**
- "I want a REST API for managing tasks"
- "Build me a React dashboard with charts"
- "Create a CLI tool that processes CSV files"

Autopilot activates automatically and handles the rest. No commands needed.

---

## Migration from 2.x

All old commands still work:
- `/oh-my-claudecode:ralph "task"` → Still works (or just say "don't stop until done")
- `/oh-my-claudecode:ultrawork "task"` → Still works (or just say "fast" or use `ulw`)
- `/oh-my-claudecode:planner "task"` → Still works (or just say "plan this")

The difference? You don't NEED them anymore. Everything auto-activates.

**New in 3.x:** Autopilot mode provides the ultimate hands-off experience.
@@ -0,0 +1,66 @@
package com.yolo.keyborad.config;

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

/**
 * ElevenLabs TTS configuration
 *
 * @author ziin
 */
@Data
@Component
@ConfigurationProperties(prefix = "elevenlabs")
public class ElevenLabsProperties {

    /**
     * API Key
     */
    private String apiKey;

    /**
     * Base URL
     */
    private String baseUrl = "https://api.elevenlabs.io/v1";

    /**
     * Default voice ID
     */
    private String voiceId;

    /**
     * Model ID
     */
    private String modelId = "eleven_multilingual_v2";

    /**
     * Output format
     */
    private String outputFormat = "mp3_44100_128";

    /**
     * Stability (0-1)
     */
    private Double stability = 0.5;

    /**
     * Similarity boost (0-1)
     */
    private Double similarityBoost = 0.75;

    /**
     * Style (0-1)
     */
    private Double style = 0.0;

    /**
     * Speech speed (0.7-1.2)
     */
    private Double speed = 1.0;

    /**
     * Use speaker boost
     */
    private Boolean useSpeakerBoost = true;
}
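The properties above line up with the fields of an ElevenLabs text-to-speech request. The sketch below shows one way they might be turned into a request URL and JSON body. It is not code from this commit: the class `ElevenLabsRequestSketch` and its methods are hypothetical, and the endpoint layout (`/text-to-speech/{voiceId}` with an `output_format` query parameter) and the `voice_settings` field names are assumptions based on the public ElevenLabs API that should be checked against its reference before use. Authentication (normally an API-key header) is left out.

```java
// Hypothetical helper for illustration only; it is not part of this commit.
final class ElevenLabsRequestSketch {

    // Keeps a setting inside the range documented in ElevenLabsProperties.
    static double clamp(double value, double min, double max) {
        return Math.max(min, Math.min(max, value));
    }

    // Assumed endpoint layout: {baseUrl}/text-to-speech/{voiceId}?output_format=...
    static String buildUrl(String baseUrl, String voiceId, String outputFormat) {
        return baseUrl + "/text-to-speech/" + voiceId + "?output_format=" + outputFormat;
    }

    // Minimal JSON string escaping; real code would use a JSON library instead.
    static String escapeJson(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Assumed request body shape: text, model_id and a voice_settings object.
    static String buildBody(String text, String modelId, double stability,
                            double similarityBoost, double style, double speed,
                            boolean useSpeakerBoost) {
        return "{\"text\":\"" + escapeJson(text) + "\","
                + "\"model_id\":\"" + escapeJson(modelId) + "\","
                + "\"voice_settings\":{"
                + "\"stability\":" + clamp(stability, 0.0, 1.0) + ","
                + "\"similarity_boost\":" + clamp(similarityBoost, 0.0, 1.0) + ","
                + "\"style\":" + clamp(style, 0.0, 1.0) + ","
                + "\"speed\":" + clamp(speed, 0.7, 1.2) + ","
                + "\"use_speaker_boost\":" + useSpeakerBoost
                + "}}";
    }
}
```

Clamping at the call site is a design choice of this sketch: it keeps out-of-range configuration values from reaching the remote API, at the cost of silently adjusting them. Validating the properties at startup would be a stricter alternative.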
@@ -109,7 +109,9 @@ public class SaTokenConfigure implements WebMvcConfigurer {
             "/themes/listAllStyles",
             "/wallet/transactions",
             "/themes/restore",
-            "/chat/message"
+            "/chat/message",
+            "/chat/voice",
+            "/chat/audio/*"
         };
     }
     @Bean
@@ -11,6 +11,9 @@ import com.yolo.keyborad.mapper.QdrantPayloadMapper;
 import com.yolo.keyborad.model.dto.chat.ChatReq;
 import com.yolo.keyborad.model.dto.chat.ChatSaveReq;
 import com.yolo.keyborad.model.dto.chat.ChatStreamMessage;
+import com.yolo.keyborad.model.vo.AudioTaskVO;
+import com.yolo.keyborad.model.vo.ChatMessageVO;
+import com.yolo.keyborad.model.vo.ChatVoiceVO;
 import com.yolo.keyborad.service.ChatService;
 import com.yolo.keyborad.service.impl.QdrantVectorService;
 import io.qdrant.client.grpc.JsonWithInt;
@@ -46,19 +49,30 @@ public class ChatController {
 
 
     @PostMapping("/message")
-    @Operation(summary = "Synchronous chat", description = "Send a message to the LLM and return the reply synchronously")
-    public BaseResponse<String> message(@RequestParam("content") String content) {
+    @Operation(summary = "Synchronous chat", description = "Send a message to the LLM, return the AI response synchronously, and generate audio asynchronously")
+    public BaseResponse<ChatMessageVO> message(@RequestParam("content") String content) {
         if (StrUtil.isBlank(content)) {
             throw new BusinessException(ErrorCode.PARAMS_ERROR, "Message content must not be empty");
         }
 
         String userId = StpUtil.getLoginIdAsString();
-        String response = chatService.message(content, userId);
+        ChatMessageVO result = chatService.message(content, userId);
 
-        return ResultUtils.success(response);
+        return ResultUtils.success(result);
     }
 
 
+    @GetMapping("/audio/{audioId}")
+    @Operation(summary = "Query audio status", description = "Query the audio generation status and URL by audio ID")
+    public BaseResponse<AudioTaskVO> getAudioTask(@PathVariable("audioId") String audioId) {
+        if (StrUtil.isBlank(audioId)) {
+            throw new BusinessException(ErrorCode.PARAMS_ERROR, "Audio ID must not be empty");
+        }
+
+        AudioTaskVO result = chatService.getAudioTask(audioId);
+        return ResultUtils.success(result);
+    }
+
     @PostMapping("/talk")
     @Operation(summary = "Chat polishing endpoint", description = "Chat polishing endpoint")
     public Flux<ServerSentEvent<ChatStreamMessage>> talk(@RequestBody ChatReq chatReq){
37
src/main/java/com/yolo/keyborad/model/vo/AudioTaskVO.java
Normal file
@@ -0,0 +1,37 @@
package com.yolo.keyborad.model.vo;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
 * Audio task status
 *
 * @author ziin
 */
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Schema(description = "Audio task status")
public class AudioTaskVO {

    @Schema(description = "Audio task ID")
    private String audioId;

    @Schema(description = "Task status: pending/processing/completed/failed")
    private String status;

    @Schema(description = "Audio URL (returned when completed)")
    private String audioUrl;

    @Schema(description = "Error message (returned when failed)")
    private String errorMessage;

    public static final String STATUS_PENDING = "pending";
    public static final String STATUS_PROCESSING = "processing";
    public static final String STATUS_COMPLETED = "completed";
    public static final String STATUS_FAILED = "failed";
}
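Because `/chat/message` now returns an `audioId` immediately and the audio is produced later, a caller has to poll `/chat/audio/{audioId}` until the task reaches a terminal status. The sketch below shows one possible polling loop over the four status strings defined in `AudioTaskVO`. It is only an illustration: the class `AudioTaskPollingSketch`, the `statusSource` supplier (standing in for the HTTP call), and the attempt and delay parameters are all hypothetical and not part of this commit.

```java
import java.util.function.Supplier;

// Hypothetical client-side helper for illustration only; not part of this commit.
final class AudioTaskPollingSketch {

    // Terminal statuses, mirroring the constants declared in AudioTaskVO.
    static final String STATUS_COMPLETED = "completed";
    static final String STATUS_FAILED = "failed";

    // Asks statusSource for the task status until it is terminal or the
    // attempt budget is used up, then returns the last status seen.
    static String pollUntilDone(Supplier<String> statusSource, int maxAttempts, long delayMillis) {
        String status = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = statusSource.get();
            boolean terminal = STATUS_COMPLETED.equals(status) || STATUS_FAILED.equals(status);
            if (terminal || attempt == maxAttempts) {
                return status;
            }
            try {
                Thread.sleep(delayMillis); // wait before the next status query
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return status;
            }
        }
        return status; // only reached when maxAttempts is zero or negative
    }
}
```

A caller would treat a returned `pending` or `processing` as a timeout, read `audioUrl` on `completed`, and read `errorMessage` on `failed`.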
29
src/main/java/com/yolo/keyborad/model/vo/ChatMessageVO.java
Normal file
@@ -0,0 +1,29 @@
package com.yolo.keyborad.model.vo;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
 * Message response (with asynchronous audio)
 *
 * @author ziin
 */
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Schema(description = "Message response")
public class ChatMessageVO {

    @Schema(description = "AI response text")
    private String aiResponse;

    @Schema(description = "Audio task ID, used to query the audio status")
    private String audioId;

    @Schema(description = "LLM latency (ms)")
    private Long llmDuration;
}
32
src/main/java/com/yolo/keyborad/model/vo/ChatVoiceVO.java
Normal file
@@ -0,0 +1,32 @@
package com.yolo.keyborad.model.vo;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
 * Voice chat response
 *
 * @author ziin
 */
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Schema(description = "Voice chat response")
public class ChatVoiceVO {

    @Schema(description = "User input content")
    private String content;

    @Schema(description = "AI response text")
    private String aiResponse;

    @Schema(description = "AI voice audio URL (R2)")
    private String audioUrl;

    @Schema(description = "Processing time (ms)")
    private Long duration;
}
26
src/main/java/com/yolo/keyborad/model/vo/TextToSpeechVO.java
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
package com.yolo.keyborad.model.vo;
|
||||||
|
|
||||||
|
import io.swagger.v3.oas.annotations.media.Schema;
|
||||||
|
import lombok.AllArgsConstructor;
|
||||||
|
import lombok.Builder;
|
||||||
|
import lombok.Data;
|
||||||
|
import lombok.NoArgsConstructor;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* TTS 语音合成结果
|
||||||
|
*
|
||||||
|
* @author ziin
|
||||||
|
*/
|
||||||
|
@Data
|
||||||
|
@Builder
|
||||||
|
@NoArgsConstructor
|
||||||
|
@AllArgsConstructor
|
||||||
|
@Schema(description = "TTS 语音合成结果")
|
||||||
|
public class TextToSpeechVO {
|
||||||
|
|
||||||
|
@Schema(description = "Base64-encoded audio")
|
||||||
|
private String audioBase64;
|
||||||
|
|
||||||
|
@Schema(description = "Audio URL (R2)")
|
||||||
|
private String audioUrl;
|
||||||
|
}
|
||||||
@@ -2,6 +2,9 @@ package com.yolo.keyborad.service;
|
|||||||
|
|
||||||
import com.yolo.keyborad.model.dto.chat.ChatReq;
|
import com.yolo.keyborad.model.dto.chat.ChatReq;
|
||||||
import com.yolo.keyborad.model.dto.chat.ChatStreamMessage;
|
import com.yolo.keyborad.model.dto.chat.ChatStreamMessage;
|
||||||
|
import com.yolo.keyborad.model.vo.AudioTaskVO;
|
||||||
|
import com.yolo.keyborad.model.vo.ChatMessageVO;
|
||||||
|
import com.yolo.keyborad.model.vo.ChatVoiceVO;
|
||||||
import org.springframework.http.codec.ServerSentEvent;
|
import org.springframework.http.codec.ServerSentEvent;
|
||||||
import reactor.core.publisher.Flux;
|
import reactor.core.publisher.Flux;
|
||||||
|
|
||||||
@@ -13,11 +16,20 @@ public interface ChatService {
|
|||||||
Flux<ServerSentEvent<ChatStreamMessage>> talk(ChatReq chatReq);
|
Flux<ServerSentEvent<ChatStreamMessage>> talk(ChatReq chatReq);
|
||||||
|
|
||||||
/**
|
/**
|
||||||
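* Synchronous chat
|
* Synchronous chat (audio is generated asynchronously)
|
||||||
*
|
*
|
||||||
* @param content user message content
|
* @param content user message content
|
||||||
* @param userId user ID
|
* @param userId user ID
|
||||||
* @return AI response
|
* @return AI response + audio task ID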
|
||||||
*/
|
*/
|
||||||
String message(String content, String userId);
|
ChatMessageVO message(String content, String userId);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Queries the audio task status
|
||||||
|
*
|
||||||
|
* @param audioId audio task ID
|
||||||
|
* @return the audio task status
|
||||||
|
*/
|
||||||
|
AudioTaskVO getAudioTask(String audioId);
|
||||||
|
|
||||||
}
|
}
|
||||||
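The interface now splits a chat turn into an immediate text reply (`message`) and a later status lookup (`getAudioTask`), so a caller would typically poll the second until the task settles. A minimal sketch of such a loop follows. `AudioTaskPoller`, the `lookup` function, and the status strings `"completed"` and `"failed"` are assumptions for illustration; the actual `AudioTaskVO.STATUS_*` values are not shown in this diff.

```java
import java.util.function.Function;

/** Hypothetical client-side helper; not part of the commit. */
final class AudioTaskPoller {

    // Assumed status strings; the real AudioTaskVO.STATUS_* constants may differ.
    static final String COMPLETED = "completed";
    static final String FAILED = "failed";

    /**
     * Calls lookup(audioId) up to maxAttempts times, sleeping delayMillis after
     * each non-terminal attempt, and returns the last status seen.
     */
    static String pollStatus(String audioId, Function<String, String> lookup,
                             int maxAttempts, long delayMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = lookup.apply(audioId);
            if (COMPLETED.equals(status) || FAILED.equals(status)) {
                return status; // terminal state reached
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return status;
            }
        }
        return status;
    }
}
```

In a real client the `lookup` function would wrap whatever call reaches `getAudioTask`; a bounded attempt count keeps a stuck task from polling forever.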
|
|||||||
@@ -0,0 +1,28 @@
|
|||||||
|
package com.yolo.keyborad.service;
|
||||||
|
|
||||||
|
import com.yolo.keyborad.model.vo.TextToSpeechVO;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* ElevenLabs TTS speech synthesis service interface
|
||||||
|
*
|
||||||
|
* @author ziin
|
||||||
|
*/
|
||||||
|
public interface ElevenLabsService {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Converts text to speech (with timestamps)
|
||||||
|
*
|
||||||
|
* @param text the text to convert
|
||||||
|
* @return the speech synthesis result, including the Base64 audio
|
||||||
|
*/
|
||||||
|
TextToSpeechVO textToSpeechWithTimestamps(String text);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Converts text to speech (with timestamps) using the specified voice
|
||||||
|
*
|
||||||
|
* @param text the text to convert
|
||||||
|
* @param voiceId voice ID
|
||||||
|
* @return the speech synthesis result
|
||||||
|
*/
|
||||||
|
TextToSpeechVO textToSpeechWithTimestamps(String text, String voiceId);
|
||||||
|
}
|
||||||
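The service hands audio back as a Base64 string in `TextToSpeechVO.audioBase64`, so consumers decode it before storing or streaming it. A small sketch using the JDK `java.util.Base64` decoder follows; `AudioPayloads` is a hypothetical helper name, not part of the commit.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Base64;

/** Hypothetical helper; not part of the commit. */
final class AudioPayloads {

    /** Decodes a Base64 audio string into raw bytes; rejects null or blank input. */
    static byte[] decode(String audioBase64) {
        if (audioBase64 == null || audioBase64.trim().isEmpty()) {
            throw new IllegalArgumentException("audio data is empty");
        }
        // Throws IllegalArgumentException if the input is not valid Base64.
        return Base64.getDecoder().decode(audioBase64);
    }

    /** Wraps the decoded bytes in a stream, e.g. for an upload API that takes an InputStream. */
    static InputStream asStream(String audioBase64) {
        return new ByteArrayInputStream(decode(audioBase64));
    }
}
```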
@@ -14,21 +14,34 @@ import com.yolo.keyborad.model.entity.KeyboardCharacter;
|
|||||||
import com.yolo.keyborad.model.entity.KeyboardUser;
|
import com.yolo.keyborad.model.entity.KeyboardUser;
|
||||||
import com.yolo.keyborad.model.entity.KeyboardUserCallLog;
|
import com.yolo.keyborad.model.entity.KeyboardUserCallLog;
|
||||||
import com.yolo.keyborad.model.entity.KeyboardUserQuotaTotal;
|
import com.yolo.keyborad.model.entity.KeyboardUserQuotaTotal;
|
||||||
|
import com.yolo.keyborad.model.vo.AudioTaskVO;
|
||||||
|
import com.yolo.keyborad.model.vo.ChatMessageVO;
|
||||||
|
import com.yolo.keyborad.model.vo.ChatVoiceVO;
|
||||||
|
import com.yolo.keyborad.model.vo.TextToSpeechVO;
|
||||||
import com.yolo.keyborad.service.*;
|
import com.yolo.keyborad.service.*;
|
||||||
import jakarta.annotation.Resource;
|
import jakarta.annotation.Resource;
|
||||||
import lombok.extern.slf4j.Slf4j;
|
import lombok.extern.slf4j.Slf4j;
|
||||||
|
import org.dromara.x.file.storage.core.FileInfo;
|
||||||
|
import org.dromara.x.file.storage.core.FileStorageService;
|
||||||
import org.springframework.ai.chat.client.ChatClient;
|
import org.springframework.ai.chat.client.ChatClient;
|
||||||
import org.springframework.ai.openai.OpenAiChatOptions;
|
import org.springframework.ai.openai.OpenAiChatOptions;
|
||||||
|
import org.springframework.data.redis.core.StringRedisTemplate;
|
||||||
import org.springframework.http.codec.ServerSentEvent;
|
import org.springframework.http.codec.ServerSentEvent;
|
||||||
|
import org.springframework.scheduling.annotation.Async;
|
||||||
import org.springframework.stereotype.Service;
|
import org.springframework.stereotype.Service;
|
||||||
import reactor.core.publisher.Flux;
|
import reactor.core.publisher.Flux;
|
||||||
import reactor.core.publisher.Mono;
|
import reactor.core.publisher.Mono;
|
||||||
import reactor.core.scheduler.Schedulers;
|
import reactor.core.scheduler.Schedulers;
|
||||||
|
|
||||||
import java.math.BigDecimal;
|
import java.math.BigDecimal;
|
||||||
|
import java.io.ByteArrayInputStream;
|
||||||
import java.util.ArrayList;
|
import java.util.ArrayList;
|
||||||
|
import java.util.Base64;
|
||||||
import java.util.Date;
|
import java.util.Date;
|
||||||
import java.util.List;
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
import java.util.concurrent.CompletableFuture;
|
||||||
|
import java.util.concurrent.TimeUnit;
|
||||||
import java.util.concurrent.atomic.AtomicInteger;
|
import java.util.concurrent.atomic.AtomicInteger;
|
||||||
import java.util.concurrent.atomic.AtomicReference;
|
import java.util.concurrent.atomic.AtomicReference;
|
||||||
|
|
||||||
@@ -61,6 +74,18 @@ public class ChatServiceImpl implements ChatService {
|
|||||||
@Resource
|
@Resource
|
||||||
private UserService userService;
|
private UserService userService;
|
||||||
|
|
||||||
|
@Resource
|
||||||
|
private ElevenLabsService elevenLabsService;
|
||||||
|
|
||||||
|
@Resource
|
||||||
|
private FileStorageService fileStorageService;
|
||||||
|
|
||||||
|
@Resource
|
||||||
|
private StringRedisTemplate stringRedisTemplate;
|
||||||
|
|
||||||
|
private static final String AUDIO_TASK_PREFIX = "audio:task:";
|
||||||
|
private static final long AUDIO_TASK_EXPIRE_SECONDS = 3600; // expires after 1 hour
|
||||||
|
|
||||||
private final NacosAppConfigCenter.DynamicAppConfig cfgHolder;
|
private final NacosAppConfigCenter.DynamicAppConfig cfgHolder;
|
||||||
|
|
||||||
public ChatServiceImpl(NacosAppConfigCenter.DynamicAppConfig cfgHolder) {
|
public ChatServiceImpl(NacosAppConfigCenter.DynamicAppConfig cfgHolder) {
|
||||||
@@ -323,18 +348,43 @@ public class ChatServiceImpl implements ChatService {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
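* Synchronous chat
|
* Synchronous chat (audio is generated asynchronously)
|
||||||
*
|
*
|
||||||
* @param content user message content
|
* @param content user message content
|
||||||
* @param userId user ID
|
* @param userId user ID
|
||||||
* @return AI response
|
* @return AI response + audio task ID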
|
||||||
*/
|
*/
|
||||||
@Override
|
@Override
|
||||||
public String message(String content, String userId) {
|
public ChatMessageVO message(String content, String userId) {
|
||||||
log.info("Synchronous chat request, userId: {}, content: {}", userId, content);
|
log.info("Synchronous chat request, userId: {}, content: {}", userId, content);
|
||||||
|
|
||||||
long startTime = System.currentTimeMillis();
|
long startTime = System.currentTimeMillis();
|
||||||
|
|
||||||
|
String response = callLLM(content);
|
||||||
|
|
||||||
|
long llmDuration = System.currentTimeMillis() - startTime;
|
||||||
|
log.info("LLM finished, userId: {}, elapsed: {}ms, response length: {}", userId, llmDuration, response.length());
|
||||||
|
|
||||||
|
// Generate the audio task ID
|
||||||
|
String audioId = UUID.randomUUID().toString().replace("-", "");
|
||||||
|
|
||||||
|
// Initialize the audio task status to processing
|
||||||
|
setAudioTaskStatus(audioId, AudioTaskVO.STATUS_PROCESSING, null, null);
|
||||||
|
|
||||||
|
// Run TTS + R2 upload asynchronously
|
||||||
|
CompletableFuture.runAsync(() -> processAudioAsync(audioId, response, userId));
|
||||||
|
|
||||||
|
return ChatMessageVO.builder()
|
||||||
|
.aiResponse(response)
|
||||||
|
.audioId(audioId)
|
||||||
|
.llmDuration(llmDuration)
|
||||||
|
.build();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Calls the LLM to generate a response
|
||||||
|
*/
|
||||||
|
private String callLLM(String content) {
|
||||||
String systemPrompt = """
|
String systemPrompt = """
|
||||||
ROLE: You are "Tiffany", a 17-year-old popular high school student from Beverly Hills.
|
ROLE: You are "Tiffany", a 17-year-old popular high school student from Beverly Hills.
|
||||||
You are sassy, judgmental, impatient, but secretly caring.
|
You are sassy, judgmental, impatient, but secretly caring.
|
||||||
@@ -347,16 +397,103 @@ public class ChatServiceImpl implements ChatService {
|
|||||||
Keep responses concise (2-3 sentences max) for real-time conversation.
|
Keep responses concise (2-3 sentences max) for real-time conversation.
|
||||||
""";
|
""";
|
||||||
|
|
||||||
String response = client
|
return client
|
||||||
.prompt()
|
.prompt()
|
||||||
.system(systemPrompt)
|
.system(systemPrompt)
|
||||||
.user(content)
|
.user(content)
|
||||||
.call()
|
.call()
|
||||||
.content();
|
.content();
|
||||||
|
}
|
||||||
|
|
||||||
long duration = System.currentTimeMillis() - startTime;
|
/**
|
||||||
log.info("Synchronous chat finished, userId: {}, elapsed: {}ms, response length: {}", userId, duration, response.length());
|
* Processes audio asynchronously: TTS conversion + upload to R2
|
||||||
|
*/
|
||||||
|
private void processAudioAsync(String audioId, String text, String userId) {
|
||||||
|
try {
|
||||||
|
log.info("Starting asynchronous audio processing, audioId: {}", audioId);
|
||||||
|
long startTime = System.currentTimeMillis();
|
||||||
|
|
||||||
return response;
|
// 1. TTS conversion
|
||||||
|
long ttsStart = System.currentTimeMillis();
|
||||||
|
TextToSpeechVO ttsResult = elevenLabsService.textToSpeechWithTimestamps(text);
|
||||||
|
long ttsDuration = System.currentTimeMillis() - ttsStart;
|
||||||
|
log.info("TTS finished, audioId: {}, elapsed: {}ms", audioId, ttsDuration);
|
||||||
|
|
||||||
|
// 2. Upload to R2
|
||||||
|
long uploadStart = System.currentTimeMillis();
|
||||||
|
String audioUrl = uploadAudioToR2(ttsResult.getAudioBase64(), userId);
|
||||||
|
long uploadDuration = System.currentTimeMillis() - uploadStart;
|
||||||
|
log.info("R2 upload finished, audioId: {}, elapsed: {}ms, URL: {}", audioId, uploadDuration, audioUrl);
|
||||||
|
|
||||||
|
// 3. Mark the task as completed
|
||||||
|
setAudioTaskStatus(audioId, AudioTaskVO.STATUS_COMPLETED, audioUrl, null);
|
||||||
|
|
||||||
|
long totalDuration = System.currentTimeMillis() - startTime;
|
||||||
|
log.info("Asynchronous audio processing finished, audioId: {}, total: {}ms (TTS: {}ms, Upload: {}ms)",
|
||||||
|
audioId, totalDuration, ttsDuration, uploadDuration);
|
||||||
|
|
||||||
|
} catch (Exception e) {
|
||||||
|
log.error("Asynchronous audio processing failed, audioId: {}", audioId, e);
|
||||||
|
setAudioTaskStatus(audioId, AudioTaskVO.STATUS_FAILED, null, e.getMessage());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets the audio task status
|
||||||
|
*/
|
||||||
|
private void setAudioTaskStatus(String audioId, String status, String audioUrl, String errorMessage) {
|
||||||
|
String key = AUDIO_TASK_PREFIX + audioId;
|
||||||
|
String value = status + "|" + (audioUrl != null ? audioUrl : "") + "|" + (errorMessage != null ? errorMessage : "");
|
||||||
|
stringRedisTemplate.opsForValue().set(key, value, AUDIO_TASK_EXPIRE_SECONDS, TimeUnit.SECONDS);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Queries the audio task status
|
||||||
|
*/
|
||||||
|
@Override
|
||||||
|
public AudioTaskVO getAudioTask(String audioId) {
|
||||||
|
String key = AUDIO_TASK_PREFIX + audioId;
|
||||||
|
String value = stringRedisTemplate.opsForValue().get(key);
|
||||||
|
|
||||||
|
if (cn.hutool.core.util.StrUtil.isBlank(value)) {
|
||||||
|
return AudioTaskVO.builder()
|
||||||
|
.audioId(audioId)
|
||||||
|
.status(AudioTaskVO.STATUS_PENDING)
|
||||||
|
.build();
|
||||||
|
}
|
||||||
|
|
||||||
|
String[] parts = value.split("\\|", -1);
|
||||||
|
return AudioTaskVO.builder()
|
||||||
|
.audioId(audioId)
|
||||||
|
.status(parts[0])
|
||||||
|
.audioUrl(parts.length > 1 && !parts[1].isEmpty() ? parts[1] : null)
|
||||||
|
.errorMessage(parts.length > 2 && !parts[2].isEmpty() ? parts[2] : null)
|
||||||
|
.build();
|
||||||
|
}
|
||||||
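The task state is stored in Redis as a single `status|audioUrl|errorMessage` string. One thing to watch with this encoding is that an exception message may itself contain `|`. The sketch below shows a decoder that caps the split at three fields so the error text stays whole. `AudioTaskCodec` is a hypothetical name, not part of the commit, and the sketch assumes the URL never contains `|`.

```java
/** Hypothetical codec for the "status|audioUrl|errorMessage" value; not part of the commit. */
final class AudioTaskCodec {

    static String encode(String status, String audioUrl, String errorMessage) {
        return status + "|" + (audioUrl != null ? audioUrl : "")
                + "|" + (errorMessage != null ? errorMessage : "");
    }

    /** Returns {status, audioUrl, errorMessage}; empty fields come back as null. */
    static String[] decode(String value) {
        // A limit of 3 splits on the first two '|' only, so any '|' inside
        // the error message is kept in the last field.
        String[] parts = value.split("\\|", 3);
        String[] fields = new String[3];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = (i < parts.length && !parts[i].isEmpty()) ? parts[i] : null;
        }
        return fields;
    }
}
```

A JSON value or a Redis hash would avoid delimiter handling altogether; the pipe format is kept here only to stay close to the diff.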
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Uploads audio to R2
|
||||||
|
*/
|
||||||
|
private String uploadAudioToR2(String audioBase64, String userId) {
|
||||||
|
if (cn.hutool.core.util.StrUtil.isBlank(audioBase64)) {
|
||||||
|
throw new BusinessException(ErrorCode.SYSTEM_ERROR, "Audio data is empty");
|
||||||
|
}
|
||||||
|
|
||||||
|
byte[] audioBytes = Base64.getDecoder().decode(audioBase64);
|
||||||
|
String fileName = UUID.randomUUID() + ".mp3";
|
||||||
|
|
||||||
|
FileInfo fileInfo = fileStorageService.of(new ByteArrayInputStream(audioBytes))
|
||||||
|
.setPath(userId + "/")
|
||||||
|
.setPlatform("cloudflare-r2")
|
||||||
|
.setSaveFilename(fileName)
|
||||||
|
.setOriginalFilename(fileName)
|
||||||
|
.upload();
|
||||||
|
|
||||||
|
if (fileInfo == null || cn.hutool.core.util.StrUtil.isBlank(fileInfo.getUrl())) {
|
||||||
|
throw new BusinessException(ErrorCode.SYSTEM_ERROR, "Audio upload failed");
|
||||||
|
}
|
||||||
|
|
||||||
|
return fileInfo.getUrl();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
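`CompletableFuture.runAsync(Runnable)` with no executor argument runs on `ForkJoinPool.commonPool()`, which is shared across the JVM and sized for CPU-bound work. For blocking calls such as TTS requests and uploads, a dedicated bounded pool is a common alternative. A minimal sketch follows; `AudioExecutor`, the pool size of 4, and the thread name prefix are assumptions, not part of the commit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper around a dedicated pool for blocking audio work; not part of the commit. */
final class AudioExecutor {

    private static final AtomicInteger COUNTER = new AtomicInteger();

    // Fixed pool with named daemon threads; the size of 4 is an arbitrary placeholder.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread t = new Thread(runnable, "audio-task-" + COUNTER.incrementAndGet());
        t.setDaemon(true);
        return t;
    });

    /** Submits the task to the dedicated pool instead of the common ForkJoinPool. */
    static CompletableFuture<Void> submit(Runnable task) {
        return CompletableFuture.runAsync(task, POOL);
    }
}
```

In a Spring application the pool would more likely be a managed executor bean, so that its size is configurable and it shuts down with the context.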
|
|||||||
@@ -0,0 +1,175 @@
|
|||||||
|
package com.yolo.keyborad.service.impl;
|
||||||
|
|
||||||
|
import cn.hutool.core.util.StrUtil;
|
||||||
|
import com.alibaba.fastjson.JSON;
|
||||||
|
import com.alibaba.fastjson.JSONObject;
|
||||||
|
import com.yolo.keyborad.common.ErrorCode;
|
||||||
|
import com.yolo.keyborad.config.ElevenLabsProperties;
|
||||||
|
import com.yolo.keyborad.exception.BusinessException;
|
||||||
|
import com.yolo.keyborad.model.vo.TextToSpeechVO;
|
||||||
|
import com.yolo.keyborad.service.ElevenLabsService;
|
||||||
|
import jakarta.annotation.Resource;
|
||||||
|
import lombok.extern.slf4j.Slf4j;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import java.io.ByteArrayOutputStream;
|
||||||
|
import java.io.InputStream;
|
||||||
|
import java.io.OutputStream;
|
||||||
|
import java.net.HttpURLConnection;
|
||||||
|
import java.net.URL;
|
||||||
|
import java.nio.charset.StandardCharsets;
|
||||||
|
import java.util.HashMap;
|
||||||
|
import java.util.Map;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* ElevenLabs TTS speech synthesis service implementation
|
||||||
|
* Reference: https://elevenlabs.io/docs/api-reference/text-to-speech/convert-with-timestamps
|
||||||
|
*
|
||||||
|
* @author ziin
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
@Slf4j
|
||||||
|
public class ElevenLabsServiceImpl implements ElevenLabsService {
|
||||||
|
|
||||||
|
@Resource
|
||||||
|
private ElevenLabsProperties elevenLabsProperties;
|
||||||
|
|
||||||
|
private static final int MAX_TEXT_LENGTH = 5000;
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public TextToSpeechVO textToSpeechWithTimestamps(String text) {
|
||||||
|
return textToSpeechWithTimestamps(text, elevenLabsProperties.getVoiceId());
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public TextToSpeechVO textToSpeechWithTimestamps(String text, String voiceId) {
|
||||||
|
// 1. Validate parameters
|
||||||
|
if (StrUtil.isBlank(text)) {
|
||||||
|
throw new BusinessException(ErrorCode.PARAMS_ERROR, "Text content must not be empty");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (text.length() > MAX_TEXT_LENGTH) {
|
||||||
|
throw new BusinessException(ErrorCode.PARAMS_ERROR,
|
||||||
|
"Text length exceeds the limit; at most " + MAX_TEXT_LENGTH + " characters are supported");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (StrUtil.isBlank(voiceId)) {
|
||||||
|
voiceId = elevenLabsProperties.getVoiceId();
|
||||||
|
}
|
||||||
|
|
||||||
|
HttpURLConnection connection = null;
|
||||||
|
try {
|
||||||
|
// 2. Build the request URL
|
||||||
|
String requestUrl = buildRequestUrl(voiceId);
|
||||||
|
URL url = new URL(requestUrl);
|
||||||
|
|
||||||
|
// 3. Open the connection
|
||||||
|
connection = (HttpURLConnection) url.openConnection();
|
||||||
|
connection.setRequestMethod("POST");
|
||||||
|
connection.setDoOutput(true);
|
||||||
|
connection.setDoInput(true);
|
||||||
|
connection.setConnectTimeout(30000);
|
||||||
|
connection.setReadTimeout(60000);
|
||||||
|
|
||||||
|
// 4. Set the request headers
|
||||||
|
connection.setRequestProperty("Content-Type", "application/json");
|
||||||
|
connection.setRequestProperty("xi-api-key", elevenLabsProperties.getApiKey());
|
||||||
|
|
||||||
|
// 5. Build the request body
|
||||||
|
Map<String, Object> requestBody = buildRequestBody(text);
|
||||||
|
String jsonBody = JSON.toJSONString(requestBody);
|
||||||
|
|
||||||
|
log.info("Calling ElevenLabs TTS API, voiceId: {}, text length: {}", voiceId, text.length());
|
||||||
|
long startTime = System.currentTimeMillis();
|
||||||
|
|
||||||
|
// 6. Send the request
|
||||||
|
try (OutputStream os = connection.getOutputStream()) {
|
||||||
|
byte[] input = jsonBody.getBytes(StandardCharsets.UTF_8);
|
||||||
|
os.write(input, 0, input.length);
|
||||||
|
}
|
||||||
|
|
||||||
|
// 7. Read the response
|
||||||
|
int responseCode = connection.getResponseCode();
|
||||||
|
long duration = System.currentTimeMillis() - startTime;
|
||||||
|
log.info("ElevenLabs TTS API response code: {}, elapsed: {}ms", responseCode, duration);
|
||||||
|
|
||||||
|
if (responseCode == HttpURLConnection.HTTP_OK) {
|
||||||
|
// Read the response JSON
|
||||||
|
try (InputStream is = connection.getInputStream();
|
||||||
|
ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
|
||||||
|
byte[] buffer = new byte[8192];
|
||||||
|
int bytesRead;
|
||||||
|
while ((bytesRead = is.read(buffer)) != -1) {
|
||||||
|
baos.write(buffer, 0, bytesRead);
|
||||||
|
}
|
||||||
|
String responseJson = baos.toString(StandardCharsets.UTF_8);
|
||||||
|
JSONObject jsonResponse = JSON.parseObject(responseJson);
|
||||||
|
|
||||||
|
String audioBase64 = jsonResponse.getString("audio_base64");
|
||||||
|
|
||||||
|
log.info("Speech synthesis succeeded, Base64 length: {}", audioBase64 == null ? 0 : audioBase64.length());
|
||||||
|
|
||||||
|
return TextToSpeechVO.builder()
|
||||||
|
.audioBase64(audioBase64)
|
||||||
|
.build();
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// Read the error message
|
||||||
|
String errorMsg = "";
|
||||||
|
try (InputStream es = connection.getErrorStream()) {
|
||||||
|
if (es != null) {
|
||||||
|
ByteArrayOutputStream baos = new ByteArrayOutputStream();
|
||||||
|
byte[] buffer = new byte[1024];
|
||||||
|
int bytesRead;
|
||||||
|
while ((bytesRead = es.read(buffer)) != -1) {
|
||||||
|
baos.write(buffer, 0, bytesRead);
|
||||||
|
}
|
||||||
|
errorMsg = baos.toString(StandardCharsets.UTF_8);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
log.error("ElevenLabs TTS API call failed, status code: {}, error: {}", responseCode, errorMsg);
|
||||||
|
throw new BusinessException(ErrorCode.SYSTEM_ERROR, "Speech synthesis service error: " + responseCode);
|
||||||
|
}
|
||||||
|
|
||||||
|
} catch (BusinessException e) {
|
||||||
|
throw e;
|
||||||
|
} catch (Exception e) {
|
||||||
|
log.error("Exception while calling ElevenLabs TTS API", e);
|
||||||
|
throw new BusinessException(ErrorCode.SYSTEM_ERROR, "Speech synthesis service error: " + e.getMessage());
|
||||||
|
} finally {
|
||||||
|
if (connection != null) {
|
||||||
|
connection.disconnect();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Builds the ElevenLabs TTS API request URL (with timestamps)
|
||||||
|
*/
|
||||||
|
private String buildRequestUrl(String voiceId) {
|
||||||
|
StringBuilder url = new StringBuilder(elevenLabsProperties.getBaseUrl());
|
||||||
|
url.append("/text-to-speech/").append(voiceId).append("/with-timestamps");
|
||||||
|
url.append("?output_format=").append(elevenLabsProperties.getOutputFormat());
|
||||||
|
return url.toString();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Builds the request body
|
||||||
|
*/
|
||||||
|
private Map<String, Object> buildRequestBody(String text) {
|
||||||
|
Map<String, Object> requestBody = new HashMap<>();
|
||||||
|
requestBody.put("text", text);
|
||||||
|
requestBody.put("model_id", elevenLabsProperties.getModelId());
|
||||||
|
|
||||||
|
// Set the voice settings
|
||||||
|
Map<String, Object> voiceSettings = new HashMap<>();
|
||||||
|
voiceSettings.put("stability", elevenLabsProperties.getStability());
|
||||||
|
voiceSettings.put("similarity_boost", elevenLabsProperties.getSimilarityBoost());
|
||||||
|
voiceSettings.put("style", elevenLabsProperties.getStyle());
|
||||||
|
voiceSettings.put("speed", elevenLabsProperties.getSpeed());
|
||||||
|
voiceSettings.put("use_speaker_boost", elevenLabsProperties.getUseSpeakerBoost());
|
||||||
|
requestBody.put("voice_settings", voiceSettings);
|
||||||
|
|
||||||
|
return requestBody;
|
||||||
|
}
|
||||||
|
}
|
||||||
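The implementation assembles the call by hand with `HttpURLConnection`. On Java 11+ the same request can be described with `java.net.http.HttpRequest`. The sketch below only constructs the request and does not send it. `TtsRequests` is a hypothetical name, and the base URL, voice ID, key, and JSON body passed in are placeholders supplied by the caller.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical request builder; not part of the commit. */
final class TtsRequests {

    static HttpRequest build(String baseUrl, String voiceId, String outputFormat,
                             String apiKey, String jsonBody) {
        // Same path layout as buildRequestUrl in the diff.
        URI uri = URI.create(baseUrl + "/text-to-speech/" + voiceId
                + "/with-timestamps?output_format=" + outputFormat);
        return HttpRequest.newBuilder(uri)
                // Request timeout; plays roughly the role of the 60 s read timeout.
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .header("xi-api-key", apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }
}
```

Sending it would go through `HttpClient.send(request, HttpResponse.BodyHandlers.ofString())`, which also removes the manual stream-reading loops for both the success and error bodies.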
@@ -70,13 +70,13 @@ dromara:
|
|||||||
base-path: avatar/ # base path
|
base-path: avatar/ # base path
|
||||||
- platform: cloudflare-r2-apac # storage platform identifier
|
- platform: cloudflare-r2-apac # storage platform identifier
|
||||||
enable-storage: true # enable storage
|
enable-storage: true # enable storage
|
||||||
access-key: 550b33cc4d53e05c2e438601f8a0e209
|
access-key: eda135fe4fda649acecfa4bb49b0c30c
|
||||||
secret-key: df4d529cdae44e6f614ca04f4dc0f1f9a299e57367181243e8abdc7f7c28e99a
|
secret-key: ee557acaccf44caef985b5cac89db311a0923c72c9f4b8c5f32089c6ebb47a79
|
||||||
region: APAC # region
|
region: APAC # region
|
||||||
end-point: https://b632a61caa85401f63c9b32eef3a74c8.r2.cloudflarestorage.com/keyboardtest # endpoint
|
end-point: https://b632a61caa85401f63c9b32eef3a74c8.r2.cloudflarestorage.com/keyboardtest # endpoint
|
||||||
bucket-name: keyboardtest # bucket name
|
bucket-name: keyboardtest # bucket name
|
||||||
domain: https://cdn.loveamorkey.com/ # access domain; note the trailing '/', e.g. https://abcd.s3.ap-east-1.amazonaws.com/
|
domain: https://cdn.loveamorkey.com/ # access domain; note the trailing '/', e.g. https://abcd.s3.ap-east-1.amazonaws.com/
|
||||||
base-path: / # base path
|
base-path: tts/ # base path
|
||||||
|
|
||||||
############## Sa-Token configuration (documentation: https://sa-token.cc) ##############
|
############## Sa-Token configuration (documentation: https://sa-token.cc) ##############
|
||||||
sa-token:
|
sa-token:
|
||||||
@@ -100,3 +100,9 @@ nacos:
|
|||||||
server-addr: 127.0.0.1:8848
|
server-addr: 127.0.0.1:8848
|
||||||
group: DEFAULT_GROUP
|
group: DEFAULT_GROUP
|
||||||
data-id: keyboard_default-config.yaml
|
data-id: keyboard_default-config.yaml
|
||||||
|
|
||||||
|
elevenlabs:
|
||||||
|
api-key: sk_25339d32bb14c91f460ed9fce83a1951672f07846a7a10ce
|
||||||
|
voice-id: JBFqnCBsd6RMkjVDRZzb
|
||||||
|
model-id: eleven_turbo_v2_5
|
||||||
|
output-format: mp3_44100_128
|
||||||
@@ -1,13 +1,13 @@
|
|||||||
spring:
|
spring:
|
||||||
ai:
|
ai:
|
||||||
openai:
|
openai:
|
||||||
# api-key: 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
|
api-key: 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
|
||||||
api-key: sk-cf112f49cf4d4138a49575cda1f852b4
|
# api-key: sk-cf112f49cf4d4138a49575cda1f852b4
|
||||||
# base-url: https://gateway.ai.cloudflare.com/v1/b632a61caa85401f63c9b32eef3a74c8/aigetway/openrouter
|
base-url: https://gateway.ai.cloudflare.com/v1/b632a61caa85401f63c9b32eef3a74c8/aigetway/openrouter
|
||||||
base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/
|
# base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/
|
||||||
chat:
|
chat:
|
||||||
options:
|
options:
|
||||||
model: qwen-plus
|
model: google/gemini-2.5-flash-lite
|
||||||
embedding:
|
embedding:
|
||||||
options:
|
options:
|
||||||
model: text-embedding-v4
|
model: text-embedding-v4
|
||||||
|
|||||||