Calibrate: open-source no-code simulation and evaluation studio for voice agents

"How do you actually test a voice agent?"

This was the first question I asked when I joined ARTPARK (at IISc) five months ago to build infrastructure for solving public-health problems with voice agents, alongside my mentor and friend, Jigar.

If you have ever deployed a voice agent, you know the pain: it works perfectly in demos but falls apart in production.

Across teams, the pattern is the same: there is no systematic way to test voice agents before they reach real users.

We built Calibrate to fill this gap by applying a proven paradigm from software engineering to voice agents: unit tests + end-to-end tests.

๐ŸŽ™๏ธ Test the entire agent end-to-end

Calibrate lets you simulate conversations using realistic user personas ("who" the user is) and scenarios ("what" the user is doing). This lets you stress-test failure modes like users interrupting the agent, hesitating, giving partial answers, and more.
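As a rough illustration (the function and persona names below are my own, not Calibrate's actual API), a persona-plus-scenario simulation boils down to replaying a scripted user against the agent and recording the transcript:

```python
def simulate(agent, persona_turns):
    """Replay a scripted user persona against an agent, turn by turn."""
    transcript = []
    for user_msg in persona_turns:
        transcript.append((user_msg, agent(user_msg)))
    return transcript

# a hesitant user giving partial answers: one failure mode to stress-test
hesitant_user = ["umm...", "I need, uh, an appointment", "tomorrow maybe?"]

# toy agent stand-in; a real run would call your full voice stack
toy_agent = lambda msg: f"Understood: {msg}"

for user_msg, reply in simulate(toy_agent, hesitant_user):
    print(f"user: {user_msg!r} -> agent: {reply!r}")
```

A real harness would also score each transcript (did the agent recover from the interruption? did it re-prompt on a partial answer?), which is exactly what grows into a test suite.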

โš™๏ธ Test each component of your agent

You don't have to guess which provider is right for you anymore. With Calibrate, you can benchmark different providers (like Google, Sarvam, ElevenLabs and more) for each component of your voice stack (STT, TTS, LLM) on your dataset.
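For the STT component, for example, the standard comparison metric is word error rate (WER): word-level edit distance divided by reference length. Here is a minimal sketch, with two hypothetical provider outputs standing in for real transcriptions:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution
            dp[i][j] = min(dp[i - 1][j] + 1,        # delete
                           dp[i][j - 1] + 1,        # insert
                           dp[i - 1][j - 1] + cost) # match/substitute
    return dp[len(ref)][len(hyp)] / len(ref)

# compare two hypothetical STT outputs against one reference transcript
reference = "book me a clinic appointment for tomorrow"
outputs = {"provider_a": "book me a clinic appointment tomorrow",
           "provider_b": "book a cynic appointment for tomorrow"}
scores = {name: wer(reference, hyp) for name, hyp in outputs.items()}
print(scores)  # lower is better
```

Run the same comparison over your whole dataset rather than one sentence, since provider rankings often flip across accents and domains.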

🔄 The path to reliability

This creates a virtuous loop: simulate conversations, catch failures, fix your agent, and add each failure to your test suite.

Over time, your test suite grows to capture the key failure modes.

You not only ship with confidence, you also ensure the same bug never ships twice.
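The loop above amounts to a growing regression suite. A minimal sketch (the names here are illustrative, not Calibrate's API): every failure mode observed in production or simulation becomes a permanent check.

```python
def handles(agent_skills: set, failure_mode: str) -> bool:
    """Toy check: the agent 'passes' if it covers this failure mode."""
    return failure_mode in agent_skills

# what the current agent build can cope with (stand-in for real eval runs)
agent_skills = {"interruption", "hesitation", "partial_answer"}

# suite accumulated from previously observed failures; it only ever grows
regression_suite = ["interruption", "partial_answer"]

failed = [m for m in regression_suite if not handles(agent_skills, m)]
print("regressions:", failed)
```

In practice each entry would be a full persona/scenario simulation with a pass/fail score, but the invariant is the same: the suite is append-only, so a fixed bug stays fixed.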

Whether you are a PM, a founder, or an ML engineer, Calibrate is built for you.

Start Calibrating now: calibrate.artpark.ai

Calibrate comes with a CLI too:
pip install calibrate-agent

The best part: it is open-source. Forever.
https://github.com/artpark-sahai-org/calibrate

We are early and have a long roadmap ahead. If you are building voice agents and care about the future of responsible Voice AI, let's do it together!

Join our community 🤝

Discord: https://lnkd.in/gpKaY_np
WhatsApp: https://lnkd.in/gXF3w4bR