
Workers AI Supercharges Apps With Leonardo and Deepgram

2025-08-27 | Michelle Chen, Nikhil Kothari | 4 minutes read
AI
Developer Platform
Generative AI

A New Era for AI-Powered Applications

Cloudflare is expanding its Workers AI platform, reinforcing the idea that the future of AI models is smaller, faster, and globally distributed. By integrating specialized GPUs into data centers worldwide, Cloudflare has built an infrastructure designed for high-speed, low-latency inference. This strategic expansion now includes premier partner models from Leonardo.Ai and Deepgram, specifically chosen for their exceptional speed-to-performance ratio, which aligns perfectly with the Workers AI architecture.

This isn't just about offering standalone AI models; it's about providing a comprehensive suite of developer tools to build entire applications. Whether you're creating an image generation service using Workers for logic, R2 for storage, and Images for media delivery, or a real-time voice agent powered by WebRTC and WebSockets, Cloudflare offers a holistic platform to bring your most ambitious AI projects to life.

Workers AI expands its model catalog

Next-Generation Image Generation with Leonardo Models

Workers AI is proud to partner with Leonardo.Ai, a generative AI media lab renowned for its powerful, proprietary models. This collaboration brings two of their state-of-the-art image generation models to the Cloudflare network: @cf/leonardo/phoenix-1.0 and @cf/leonardo/lucid-origin.

“We’re excited to enable Cloudflare customers a new avenue to extend and use our image generation technology in creative ways such as creating character images for gaming, generating personalized images for websites, and a host of other uses... all through the Workers AI and the Cloudflare Developer Platform.” - Peter Runham, CTO, Leonardo.Ai

The Phoenix model, trained from the ground up by Leonardo, excels at complex tasks like rendering text and maintaining high prompt coherence. A 1024x1024 image with 25 steps can be generated in just 4.89 seconds.

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/leonardo/phoenix-1.0 \
  --header 'Authorization: Bearer {TOKEN}' \
  --header 'Content-Type: application/json' \
  --data '{ "prompt": "A 1950s-style neon diner sign glowing at night that reads '\''OPEN 24 HOURS'\'' with chrome details and vintage typography.", "width":1024, "height":1024, "steps": 25, "seed":1, "guidance": 4, "negative_prompt": "bad image, low quality, signature, overexposed, jpeg artifacts, undefined, unclear, Noisy, grainy, oversaturated, overcontrasted" }'

A 1950s style neon diner sign generated by the Phoenix model

For photorealistic results, the Lucid Origin model is an excellent choice. This recent addition to Leonardo's model family generated a 1024x1024 image in just 4.38 seconds.

curl --request POST \
  --url https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/leonardo/lucid-origin \
  --header 'Authorization: Bearer {TOKEN}' \
  --header 'Content-Type: application/json' \
  --data '{ "prompt": "A 1950s-style neon diner sign glowing at night that reads '\''OPEN 24 HOURS'\'' with chrome details and vintage typography.", "width":1024, "height":1024, "steps": 25, "seed":1, "guidance": 4, "negative_prompt": "bad image, low quality, signature, overexposed, jpeg artifacts, undefined, unclear, Noisy, grainy, oversaturated, overcontrasted" }'

A photorealistic image of a neon diner sign generated by the Lucid Origin model
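If you would rather issue these requests from JavaScript than from curl, a small helper can assemble them. The following is a minimal sketch: `buildLeonardoRequest` and `LEONARDO_MODELS` are hypothetical names introduced here, and the endpoint path and body fields are simply copied from the curl examples. The response format is not covered in this post, so check the Workers AI documentation before parsing it.

```javascript
// Hypothetical helper (not part of any Cloudflare SDK): builds the REST request
// for one of the two Leonardo models named in this post.
const LEONARDO_MODELS = ["@cf/leonardo/phoenix-1.0", "@cf/leonardo/lucid-origin"];

function buildLeonardoRequest(accountId, token, model, prompt, options = {}) {
  if (!LEONARDO_MODELS.includes(model)) {
    throw new Error(`Unknown Leonardo model: ${model}`);
  }
  // Same endpoint shape as the curl examples.
  const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
  // Defaults mirror the curl examples; anything in `options` overrides them.
  const body = {
    prompt,
    width: 1024,
    height: 1024,
    steps: 25,
    seed: 1,
    guidance: 4,
    ...options,
  };
  return {
    url,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The returned pair can be passed straight to `fetch(url, init)`. Rejecting unknown model names up front catches a mistyped identifier before it reaches the API.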

Real-Time Voice AI with Deepgram Models

Deepgram is a leader in voice AI, developing audio models that allow for natural, human-like interaction. By integrating Deepgram's models, Workers AI now provides the tools to build ultra-low-latency voice agents.

"By hosting our voice models on Cloudflare's Workers AI, we're enabling developers to create real-time, expressive voice agents with ultra-low latency. Cloudflare's global network brings AI compute closer to users everywhere, so customers can now deliver lightning-fast conversational AI experiences without worrying about complex infrastructure." - Adam Sypniewski, CTO, Deepgram

The new models include @cf/deepgram/nova-3, a highly accurate speech-to-text model, and @cf/deepgram/aura-1, a context-aware text-to-speech model that delivers natural pacing and expressiveness. These models also feature WebSocket support, enabling persistent connections for real-time, bi-directional communication.

Here is how you can use the Nova 3 model with the AI binding:

// Fetch the audio file, then pass its body stream to Nova 3.
const URL = "https://www.some-website.com/audio.mp3";
const mp3 = await fetch(URL);

const res = await env.AI.run("@cf/deepgram/nova-3", {
  audio: {
    body: mp3.body,
    contentType: "audio/mpeg",
  },
  detect_language: true,
});
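Once the call returns, you will usually want the plain transcript. The sketch below assumes the result follows Deepgram's usual response layout (`results.channels[].alternatives[].transcript`); that shape is an assumption, not something this post confirms, so verify it against the model's documentation. `extractTranscript` is a hypothetical helper name.

```javascript
// Assumption: `res` looks like Deepgram's response,
// { results: { channels: [ { alternatives: [ { transcript: "..." } ] } ] } }.
// Returns one line per channel, using the first alternative of each.
function extractTranscript(res) {
  const channels = res?.results?.channels ?? [];
  return channels
    .map((channel) => channel?.alternatives?.[0]?.transcript ?? "")
    .filter((text) => text.length > 0)
    .join("\n");
}
```

The optional chaining means an unexpected or empty response yields an empty string rather than throwing.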

Building Full-Stack AI on Cloudflare

These new models are key components in building sophisticated AI applications entirely on the Cloudflare platform. A typical workflow for a voice agent could include:

  1. Capture audio with Cloudflare Realtime from any WebRTC source.
  2. Pipe it via WebSocket to your processing pipeline.
  3. Transcribe audio with Deepgram models running on Workers AI.
  4. Process the text with an LLM hosted on Workers AI or proxied via the AI Gateway.
  5. Orchestrate the entire flow with Realtime Agents.
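
Steps 3 and 4, plus a spoken reply, can be sketched as a single function. This is an illustration under stated assumptions rather than a reference implementation: `handleUtterance` is a hypothetical name, the LLM model name and its `response` field are assumptions, as are the Nova 3 result shape and the `text` input for Aura 1. Audio capture, WebSocket transport, and Realtime Agents (steps 1, 2, and 5) are left out.

```javascript
// Hypothetical orchestration of transcription, LLM reply, and speech synthesis.
// `ai` is any object exposing run(model, inputs), such as the env.AI binding.
async function handleUtterance(ai, audioStream, contentType = "audio/mpeg") {
  // Step 3: speech to text, using the same call shape as the Nova 3 example.
  const stt = await ai.run("@cf/deepgram/nova-3", {
    audio: { body: audioStream, contentType },
    detect_language: true,
  });
  // Assumption: Deepgram-style result layout.
  const transcript = stt?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
  if (!transcript) {
    // Nothing was recognized, so skip the LLM and speech calls.
    return { transcript: "", reply: "", speech: null };
  }

  // Step 4: generate a reply. Model name and `response` field are assumptions.
  const llm = await ai.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [
      { role: "system", content: "You are a concise voice assistant." },
      { role: "user", content: transcript },
    ],
  });
  const reply = llm?.response ?? "";

  // Turn the reply into audio with Aura 1. The `text` input name is an assumption.
  const speech = await ai.run("@cf/deepgram/aura-1", { text: reply });
  return { transcript, reply, speech };
}
```

Passing `ai` in as a parameter keeps the function easy to exercise with a stub before wiring it to a real binding.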

Try These Models Today

This expansion marks a significant step forward in making advanced, low-latency AI accessible to developers everywhere. To explore pricing, implementation details, and start building with these new partner models, check out the official Workers AI developer documentation.
