Deploy Voice Agents with Pipecat and Amazon Bedrock
This post is a collaboration between AWS and Pipecat. Deploying intelligent voice agents that maintain natural, human-like conversations requires streaming audio to users wherever they are, across web, mobile, and phone channels, even under heavy traffic and unreliable network conditions. Even small delays can break the conversational flow, causing users to perceive the agent as unresponsive or unreliable. For use cases such as customer support, virtual assistants, and outbound campaigns, natural conversational flow is critical to the user experience.
In this series of posts, you will learn how streaming architectures help address these challenges using Pipecat voice agents on Amazon Bedrock AgentCore Runtime. In Part 1, you will learn how to deploy Pipecat voice agents on AgentCore Runtime using different network transport approaches including WebSockets, WebRTC, and telephony integration, with practical deployment guidance and code samples.
Deploying real-time voice agents is challenging: you need low-latency streaming, strict isolation for security, and the ability to scale dynamically to unpredictable conversation volume. Without an appropriately designed architecture, you can experience audio jitter, scalability constraints, inflated costs due to over-provisioning, and increased complexity.
Amazon Bedrock AgentCore Runtime addresses these challenges by providing a secure, serverless environment for scaling dynamic AI agents. Each conversation session runs in isolated microVMs for security. It auto-scales for traffic spikes and handles continuous sessions for up to 8 hours, making it ideal for long, multi-turn voice interactions. It charges only for resources actively used, helping to minimize costs associated with idle infrastructure.
Pipecat, an agentic framework for building real-time voice AI pipelines, runs on AgentCore Runtime with minimal setup. Package your Pipecat voice pipeline as a container and deploy it directly to AgentCore Runtime. The runtime supports bidirectional streaming for real-time audio and built-in observability to trace agent reasoning and tool calls.
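As a minimal sketch of the packaging step, a container image for a Pipecat voice pipeline might look like the following. The base image, file names, and port are illustrative assumptions, not details from this post; adapt them to your own project layout and to the container requirements documented for AgentCore Runtime.

```dockerfile
# Hypothetical container image for a Pipecat voice agent.
# Base image, file names, and port are assumptions for illustration.
FROM python:3.12-slim

WORKDIR /app

# Install the Pipecat framework and the service dependencies
# listed in your own requirements.txt.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the entry point that builds and runs your voice pipeline.
COPY agent.py .

# Expose the port your server listens on (8080 here is an assumption;
# check the AgentCore Runtime container contract for the expected port).
EXPOSE 8080

CMD ["python", "agent.py"]
```

From here, the typical flow would be to build and push the image to Amazon ECR, then register it with AgentCore Runtime so each conversation session runs in its own isolated microVM.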