Google Gemini Users Can Now Generate 30-Second Music Tracks from Text and Images

February 19, 2026

Users of Google's Gemini chatbot have gained access to a new music generation feature powered by an AI model developed by DeepMind. The update lets them create short, 30-second musical tracks from inputs such as textual descriptions, images, and video clips.

Multimodal AI Enables Creative Music Composition

The feature arrives as beta access to an AI model named Lyria 3, integrated into Google’s Gemini chatbot. Users can produce original audio compositions within the chat interface, without launching separate applications or tools.

The multimodal capability goes beyond simple text prompts: pictures or videos can also guide the generated music. Combining visual and textual input to produce audio marks a notable step forward for AI-driven creative tools.

Details such as the underlying architecture, quality metrics, and plans for commercial deployment have not been disclosed. Even so, the feature underscores ongoing efforts to make AI creativity more accessible to everyday users. The ability to produce bespoke music by describing a scene, mood, or concept, or by uploading multimedia content, opens a wide range of possibilities for casual creators, content producers, and hobbyists alike.

This development is part of a broader trend in the AI community, where generative models continue to expand from text and images into audio and music. By embedding the functionality directly in an established chatbot, Google lowers the technical barrier and streamlines the interactive experience.

As the beta continues, user feedback is likely to inform refinements and may shape future versions of music synthesis in conversational AI agents.
