Workers AI Supercharges Apps With Leonardo and Deepgram
A New Era for AI-Powered Applications
Cloudflare is expanding its Workers AI platform, reinforcing the idea that the future of AI models is smaller, faster, and globally distributed. By integrating specialized GPUs into data centers worldwide, Cloudflare has built an infrastructure designed for high-speed, low-latency inference. This strategic expansion now includes premier partner models from Leonardo.Ai and Deepgram, specifically chosen for their exceptional speed-to-performance ratio, which aligns perfectly with the Workers AI architecture.
This isn't just about offering standalone AI models; it's about providing a comprehensive suite of developer tools to build entire applications. Whether you're creating an image generation service using Workers for logic, R2 for storage, and Images for media delivery, or a real-time voice agent powered by WebRTC and WebSockets, Cloudflare offers a holistic platform to bring your most ambitious AI projects to life.
Next-Generation Image Generation with Leonardo Models
Workers AI is proud to partner with Leonardo.Ai, a generative AI media lab renowned for its powerful, proprietary models. This collaboration brings two of their state-of-the-art image generation models to the Cloudflare network: @cf/leonardo/phoenix-1.0 and @cf/leonardo/lucid-origin.
“We’re excited to enable Cloudflare customers a new avenue to extend and use our image generation technology in creative ways such as creating character images for gaming, generating personalized images for websites, and a host of other uses... all through the Workers AI and the Cloudflare Developer Platform.” - Peter Runham, CTO, Leonardo.Ai
The Phoenix model, trained from the ground up by Leonardo, excels at complex tasks like rendering text and maintaining high prompt coherence. A 1024x1024 image with 25 steps can be generated in just 4.89 seconds.
```bash
curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/leonardo/phoenix-1.0 \
  --header 'Authorization: Bearer {TOKEN}' \
  --header 'Content-Type: application/json' \
  --data '{
    "prompt": "A 1950s-style neon diner sign glowing at night that reads \"OPEN 24 HOURS\" with chrome details and vintage typography.",
    "width": 1024,
    "height": 1024,
    "steps": 25,
    "seed": 1,
    "guidance": 4,
    "negative_prompt": "bad image, low quality, signature, overexposed, jpeg artifacts, undefined, unclear, Noisy, grainy, oversaturated, overcontrasted"
  }'
```
For photorealistic results, the Lucid Origin model is an excellent choice. This recent addition to Leonardo's model family generates a 1024x1024 image with 25 steps in just 4.38 seconds.
```bash
curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/leonardo/lucid-origin \
  --header 'Authorization: Bearer {TOKEN}' \
  --header 'Content-Type: application/json' \
  --data '{
    "prompt": "A 1950s-style neon diner sign glowing at night that reads \"OPEN 24 HOURS\" with chrome details and vintage typography.",
    "width": 1024,
    "height": 1024,
    "steps": 25,
    "seed": 1,
    "guidance": 4,
    "negative_prompt": "bad image, low quality, signature, overexposed, jpeg artifacts, undefined, unclear, Noisy, grainy, oversaturated, overcontrasted"
  }'
```
Real-Time Voice AI with Deepgram Models
Deepgram is a leader in voice AI, developing audio models that allow for natural, human-like interaction. By integrating Deepgram's models, Workers AI now provides the tools to build ultra-low-latency voice agents.
"By hosting our voice models on Cloudflare's Workers AI, we're enabling developers to create real-time, expressive voice agents with ultra-low latency. Cloudflare's global network brings AI compute closer to users everywhere, so customers can now deliver lightning-fast conversational AI experiences without worrying about complex infrastructure." - Adam Sypniewski, CTO, Deepgram
The new models include @cf/deepgram/nova-3, a highly accurate speech-to-text model, and @cf/deepgram/aura-1, a context-aware text-to-speech model that delivers natural pacing and expressiveness. These models also feature WebSocket support, enabling persistent connections for real-time, bi-directional communication.
Here is how you can use the Nova 3 model with the AI binding:
```javascript
const URL = "https://www.some-website.com/audio.mp3";
const mp3 = await fetch(URL);

const res = await env.AI.run("@cf/deepgram/nova-3", {
  audio: {
    body: mp3.body,
    contentType: "audio/mpeg",
  },
  detect_language: true,
});
```
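Aura 1 follows the same binding pattern in the other direction, turning text into speech. Here is a minimal sketch, assuming the model accepts a text input and returns playable audio; confirm both against the model's schema:

```javascript
// Synthesize a spoken reply with Aura 1. The "text" input name is an
// assumption for this sketch; check the model's input schema.
const speech = await env.AI.run("@cf/deepgram/aura-1", {
  text: "Your order has shipped and should arrive on Thursday.",
});

// Assumption: the response is audio bytes that can be served directly.
return new Response(speech, { headers: { "Content-Type": "audio/mpeg" } });
```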
Building Full-Stack AI on Cloudflare
These new models are key components in building sophisticated AI applications entirely on the Cloudflare platform. A typical workflow for a voice agent could include:
- Capture audio with Cloudflare Realtime from any WebRTC source.
- Pipe it via WebSocket to your processing pipeline.
- Transcribe audio with Deepgram models running on Workers AI, as sketched after this list.
- Process the text with an LLM hosted on Workers AI or proxied via the AI Gateway.
- Orchestrate the entire flow with Realtime Agents.
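As an illustration of the WebSocket and transcription steps above, the Worker below accepts a WebSocket connection, runs each incoming audio frame through Nova 3, and sends the transcript back. The framing is an assumption for this sketch: it treats each binary message as a complete audio chunk, and the shape of the transcription response should be checked against the model documentation.

```javascript
export default {
  async fetch(request, env) {
    if (request.headers.get("Upgrade") !== "websocket") {
      return new Response("Expected a WebSocket upgrade", { status: 426 });
    }

    const [client, server] = Object.values(new WebSocketPair());
    server.accept();

    server.addEventListener("message", async (event) => {
      // Assumption: each binary frame is a complete, self-contained audio chunk.
      const result = await env.AI.run("@cf/deepgram/nova-3", {
        audio: { body: event.data, contentType: "audio/mpeg" },
        detect_language: true,
      });

      // Send the transcription back to the caller. In a full voice agent this
      // text would instead go to an LLM, then to Aura 1 for a spoken reply.
      server.send(JSON.stringify(result));
    });

    return new Response(null, { status: 101, webSocket: client });
  },
};
```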
Try These Models Today
This expansion marks a significant step forward in making advanced, low-latency AI accessible to developers everywhere. To explore pricing, implementation details, and start building with these new partner models, check out the official Workers AI developer documentation.