Doubao AI Evolves With Real-Time Video Interaction
Doubao Steps Up AI Interaction with Live Video Calls
ByteDance, the parent company of TikTok, has added a real-time video call function to Doubao, its popular artificial intelligence (AI) chatbot app in China. The new feature turns Doubao into an interactive digital assistant that can engage with users visually as well as by voice.
How to Use Doubao's Real-Time Video Assistance
The function lets users hold interactive video conversations directly with the AI behind Doubao. According to an announcement made last Friday on Doubao’s official WeChat channel, users activate it by turning on their smartphone's camera during a voice call with the chatbot.
Everyday Uses for Doubao's New Video Powers
Once the video function is active, Doubao can take on several roles. It can act as a real-time guide during a museum tour, offering insights as you explore. In the garden, it can serve as a tutor, helping identify plants or offering care tips. While you shop for groceries, it can suggest dishes or help you pick ingredients. Doubao can also serve as an analyst, helping users study charts, graphs, or even videos by visually processing the information presented.
The Tech Powering Doubao's Visual Intelligence
According to Doubao, the video call feature is built on ByteDance’s visual reasoning AI model, which is designed to combine visual information with language inputs. Besides powering the interactive assistance, this combination supports users in content creation and detailed study of various subjects. The feature also has online search capabilities, allowing Doubao to retrieve current information from the internet.
ByteDance Pushes Boundaries in Generative AI
The real-time video call function in Doubao reflects ByteDance's continued progress in generative AI (GenAI) and highlights the multimodal capabilities of products built on its proprietary AI models. Generative AI refers to algorithms that create new content, including audio, code, images, text, simulations, and videos.
Contextualizing ByteDance's AI Innovations
This latest update to Doubao follows other recent AI releases from ByteDance. Earlier this month, Doubao showcased its ability to convert any photograph into pixel art. In February, ByteDance unveiled its OmniHuman-1 multimodal AI model, which drew significant attention for its ability to turn static photos and audio clips into lifelike animated videos.
Chinese chatbot app icons displayed on a smartphone. These include, shown clockwise, iFlytek’s Spark, ByteDance’s Doubao, Zhipu AI’s Zhipu Qingyan and Baidu’s Ernie Bot. Photo: Shutterstock