vision
A bidirectional voice loop wired into every phase of my agent's lifecycle. I talk; Whisper transcribes; the agent acts; ElevenLabs speaks back in a voice cloned from my own, so the agent sounds like a friend, not a robot. Phase transitions, status updates, error messages, and meaningful state changes all get a brief verbal note.
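The loop above can be sketched as one function. This is a minimal sketch, not the real implementation: all four callables are stand-ins (`transcribe` for Whisper, `run_agent` for the agent yielding status events, `speak` for ElevenLabs TTS in the cloned voice).

```python
# One turn of the voice loop. Every callable here is a placeholder for
# the real Whisper / agent / ElevenLabs pieces, not their actual APIs.
def voice_turn(audio, transcribe, run_agent, speak):
    """Audio in, spoken status events out; returns what was said."""
    text = transcribe(audio)
    spoken = []
    for event in run_agent(text):  # phase changes, errors, results
        speak(event)
        spoken.append(event)
    return spoken
```

In practice `run_agent` would stream events as phases start and finish, and `speak` would queue audio rather than block the agent.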
why
Most agent UIs assume you're staring at a screen. I'm usually doing something else (pacing, dishes, walking to a thing) and I want my agent to talk to me like a co-founder, not a chatbot. ElevenLabs voice cloning got uncomfortably good, so I cloned my own voice and now my agent narrates itself.
what it does today
- TTS at every PAI phase boundary ("Entering the Build phase")
- speech-to-text via Whisper for the input side
- voice ID lookup so the same voice plays consistently
- a `notify` HTTP endpoint any tool can hit
- graceful muting when I plug in headphones (most of the time)
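The phase-boundary TTS can be sketched against ElevenLabs' public HTTP API (`POST /v1/text-to-speech/{voice_id}` with an `xi-api-key` header). The voice ID, API key, and the `announce` helper below are placeholders for illustration, not my actual config.

```python
# Hedged sketch of the TTS call behind the phase announcements.
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"

def build_tts_request(text, voice_id, api_key):
    """Build (but do not send) a POST request for one spoken phrase."""
    return urllib.request.Request(
        f"{API_BASE}/{voice_id}",
        data=json.dumps({"text": text}).encode(),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

def announce(phase, send=urllib.request.urlopen):
    """Speak a phase transition, e.g. 'Entering the Build phase'."""
    req = build_tts_request(f"Entering the {phase} phase", "VOICE_ID", "API_KEY")
    with send(req) as resp:  # response body is the audio to play
        return resp.read()
```

Building the request separately from sending it keeps the speaking path testable without hitting the network.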
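The `notify` endpoint can be sketched as a tiny HTTP server. The payload shape (`{"message": ...}`), the `/notify` path, and the port are assumptions for illustration; in the real loop the `speak` callback would be the TTS call rather than `print`.

```python
# Minimal sketch of a notify endpoint any tool can hit.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_notify(body, speak):
    """Parse {"message": ...} and hand the text to a speak callback."""
    try:
        message = json.loads(body)["message"]
    except (ValueError, KeyError, TypeError):
        return 400, "expected JSON with a 'message' field"
    speak(message)
    return 200, "ok"

class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/notify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, reply = handle_notify(self.rfile.read(length), print)
        self.send_response(status)
        self.end_headers()
        self.wfile.write(reply.encode())

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8765), NotifyHandler).serve_forever()
```

Keeping the parse-and-speak logic in a plain function makes it easy to test and to reuse from other entry points.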
next steps
Always-on conversational mode (11.ai integration), screen-share context for the agent, and the long-term goal: a co-pilot I can pair-program with by talking, not typing.
voice loop
──────────
you → whisper → claude → tools → result
                                    │
    ┌───────────────────────────────┘
    ▼
ElevenLabs (cloned voice) → speakers
"I finished the migration. You should look at the schema file."