Watch our AI talks at I/O 2025

Published: May 22, 2025

AI is transforming how web developers are building websites and web applications. At Google I/O 2025, we shared what we've been working on over the last year, demonstrated how our partners are making use of AI on the web, and announced new built-in AI APIs.

Did you miss the event? Good news, you can now watch the talks on-demand!

Practical built-in AI with Gemini Nano in Chrome

Our core mission is to make Chrome and the web smarter for all developers and all users. In this talk, Thomas Steiner shares updates to built-in AI, practical use cases, and a look at our future.

Built-in AI runs client-side models in the browser, which has several advantages:

  • Private: Sensitive user data remains on the device, never needing to leave the browser.
  • Offline: Applications can access AI capabilities, even without an internet connection.
  • Performant: Thanks to hardware acceleration, these APIs deliver excellent performance.

Take a look at code samples for each of the built-in AI APIs, get an update on their status, and see what companies are implementing this technology.
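As a taste of what those samples look like, here is a minimal sketch using the Summarizer API. The global `Summarizer` object and its option values reflect the experimental built-in AI surface, which is still changing between Chrome releases, so treat the exact names as assumptions to verify against the current documentation.

```ts
// Minimal sketch of calling a built-in AI API (the Summarizer API here).
// The exact surface is still evolving, so the `Summarizer` global and its
// options are assumptions to verify against current Chrome documentation.
declare const Summarizer: any; // experimental built-in AI global

async function summarizeOnDevice(article: string): Promise<string | null> {
  // Feature-detect first: not every browser or device can run the model.
  if (typeof Summarizer === "undefined") return null;

  const availability = await Summarizer.availability();
  if (availability === "unavailable") return null;

  // create() may trigger a one-time model download on first use.
  const summarizer = await Summarizer.create({
    type: "key-points",
    format: "plain-text",
    length: "short",
  });

  // The input never leaves the browser: inference runs on-device.
  return summarizer.summarize(article);
}
```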

Multimodal APIs

We are working on brand new multimodal APIs. This means you can ask Gemini Nano about what it "sees" in visual content or "hears" in audio content. For example, a blog platform could suggest alternative text for uploaded images, which users can then refine and tweak. Or, you could ask Gemini Nano to write descriptions or transcriptions for podcasts.
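The multimodal surface is still in flux, so the following is only a rough sketch of what asking Gemini Nano for alt text suggestions might look like. The `LanguageModel` global, the `expectedInputs` option, and the message shape are assumptions based on the in-progress Prompt API and may differ from what ultimately ships.

```ts
// Rough sketch of asking Gemini Nano for alt text via the experimental
// multimodal Prompt API. The global and the message/content shapes are
// assumptions and may differ from the final API.
declare const LanguageModel: any; // experimental built-in AI global

async function suggestAltText(image: Blob): Promise<string | null> {
  if (typeof LanguageModel === "undefined") return null;

  // Declare up front that the session will receive image input.
  const session = await LanguageModel.create({
    expectedInputs: [{ type: "image" }],
  });

  // The user can then refine or tweak the suggestion before saving it.
  return session.prompt([
    {
      role: "user",
      content: [
        { type: "text", value: "Suggest concise alt text for this image." },
        { type: "image", value: image },
      ],
    },
  ]);
}
```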

Hybrid AI

One challenge developers face with client-side AI is that not all platforms and browsers meet the hardware requirements to run a model on-device. Gemini and Firebase partnered to build the Firebase Web SDK so that when client-side inference is unavailable, you can fall back to Gemini running server-side.
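Conceptually, the hybrid pattern looks like the sketch below: try the on-device model first, and fall back to a server call when it isn't available. The `askGeminiOnServer` helper and its `/api/gemini` endpoint are hypothetical stand-ins for a server-side call, whether made through the Firebase Web SDK or your own backend, and the availability check mirrors the experimental built-in AI surface.

```ts
// Illustrative hybrid pattern: prefer Gemini Nano in the browser, and fall
// back to server-side Gemini when on-device inference isn't available.
declare const LanguageModel: any; // experimental built-in AI global

async function askGeminiOnServer(promptText: string): Promise<string> {
  // Hypothetical endpoint: in practice this could be a Firebase-backed call
  // or your own API.
  const response = await fetch("/api/gemini", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: promptText }),
  });
  return (await response.json()).text;
}

async function askHybrid(promptText: string): Promise<string> {
  if (typeof LanguageModel !== "undefined") {
    const availability = await LanguageModel.availability();
    if (availability !== "unavailable") {
      const session = await LanguageModel.create();
      return session.prompt(promptText); // on-device, private, offline-capable
    }
  }
  return askGeminiOnServer(promptText); // server-side fallback
}
```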

Working with you

We're so glad to have worked with so many developers on built-in AI APIs. Our efforts aren't possible without you.

  • Early Preview Program: More than 16,000 developers have joined the EPP, testing new APIs, discovering new use cases, and providing feedback to build better AI for the web.
  • Hackathons: We've hosted two hackathons, and you built some incredible websites and Extensions.

Your work isn't over. Keep sharing your feedback and testing the new built-in AI APIs, and we'll keep iterating. You can even help standardize these APIs by joining the W3C's Web Machine Learning Community Group.

The future of Chrome Extensions with Gemini in your browser

The number of AI-powered Extensions has doubled in the last two years. In fact, 10% of all Extensions installed from the Chrome Web Store use AI. In this talk, Sebastian Benz gives practical examples of why Chrome Extensions and Gemini are such a powerful combination.

The examples start with making the browser more helpful by extracting and processing data from websites on the client, using Chrome's newly launched Prompt API. They go on to demonstrate how the Prompt API's new multimodal capabilities let Chrome Extensions make audio and images more accessible to users. Finally, the talk looks at the future of browsing, explaining how Google DeepMind's Project Mariner uses Chrome Extensions and the latest Gemini Cloud APIs to build a full-blown browser agent.

Explore the potential of using Gemini in the cloud or in the browser in Chrome Extensions to build new browsing experiences and make the browser more helpful.
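As a hedged illustration of the first example, here's a rough sketch of a Manifest V3 extension pulling text from the active tab and handing it to the Prompt API on the client. The `LanguageModel` global and its options follow the experimental, still-changing built-in AI surface, and the system prompt and permissions shown here are assumptions made for the sake of the example.

```ts
// Sketch of a Chrome Extension using the Prompt API to extract structured
// data from the current page. Assumes a Manifest V3 extension context with
// the "scripting" and "activeTab" permissions, and the experimental
// `LanguageModel` global.
declare const LanguageModel: any;

async function extractEventsFromActiveTab(): Promise<string | null> {
  const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
  if (!tab?.id) return null;

  // Pull the visible text out of the page via an injected function.
  const [{ result: pageText }] = await chrome.scripting.executeScript({
    target: { tabId: tab.id },
    func: () => document.body.innerText,
  });

  if (typeof LanguageModel === "undefined") return null;
  const session = await LanguageModel.create({
    initialPrompts: [
      { role: "system", content: "Extract any event names and dates as a JSON array." },
    ],
  });

  // Everything stays on the client: the page text is never sent to a server.
  return session.prompt(pageText ?? "");
}
```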

Web AI use cases and strategies in the real world

Yuriko Hirota and Swetha Gopalakrishnan highlight real-world examples of companies using AI on the web to improve their business and user experience. Whether a solution uses client-side models, server-side models, or a hybrid of the two, what matters is the exciting new functionality and features you can make available to your users, right now.

BILIBILI made their video streams more engaging with a new feature: bullet-screen comments. They offer real-time user comments in the video, rendered behind the speaker. To do so, they use image segmentation, a well-understood machine learning concept. As a result, session duration increased by 30%! Tokopedia reduced friction in their seller verification process using a face detection model, to assess the quality of photos uploaded. As a result, they reduced manual approvals by almost 70%.
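To make the segmentation step concrete, here's an illustrative sketch (not necessarily BILIBILI's actual implementation) that segments the person in each video frame with MediaPipe's ImageSegmenter, so a comment layer can be masked wherever the speaker appears. The model asset path is a placeholder; check the @mediapipe/tasks-vision documentation for current model files and options.

```ts
// Illustrative person segmentation for a "comments behind the speaker"
// effect, using MediaPipe's ImageSegmenter on video frames.
import { FilesetResolver, ImageSegmenter } from "@mediapipe/tasks-vision";

async function createSpeakerSegmenter(): Promise<ImageSegmenter> {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );
  return ImageSegmenter.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "selfie_segmenter.tflite" }, // placeholder model file
    runningMode: "VIDEO",
    outputCategoryMask: true,
  });
}

function maskCommentsBehindSpeaker(
  segmenter: ImageSegmenter,
  video: HTMLVideoElement,
  commentCanvas: HTMLCanvasElement
) {
  segmenter.segmentForVideo(video, performance.now(), (result) => {
    // The category mask marks which pixels belong to the person; clear the
    // comment layer there so comments appear to pass behind the speaker.
    const mask = result.categoryMask;
    // ...apply the mask to commentCanvas (for example via canvas compositing)...
    mask?.close();
  });
}
```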

Vision Nanny, a web platform for children with Cerebral Visual Impairment (CVI), provides AI-powered vision stimulation activities. They use multiple MediaPipe libraries, including the hand landmark detection model, which locates key points of the hands in an image, video, or in real-time. A pilot with 50 children demonstrated that Vision Nanny delivered responses 5x faster than manual visual stimulation activities. Therapists reported saving an average of three hours per session by removing manual setup.
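For context, the hand landmark detection model mentioned above is available on the web through MediaPipe's HandLandmarker task. The sketch below shows roughly how an app could run it on a video feed; the model asset path is a placeholder rather than the exact setup Vision Nanny uses.

```ts
// Minimal sketch of hand landmark detection with MediaPipe's HandLandmarker.
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

async function detectHands(video: HTMLVideoElement) {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );
  const landmarker = await HandLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "hand_landmarker.task" }, // placeholder asset path
    runningMode: "VIDEO",
    numHands: 2,
  });

  // Each result contains key points per detected hand, which an app can use
  // to react to hand movements in real time.
  const result = landmarker.detectForVideo(video, performance.now());
  return result.landmarks;
}
```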

Google Meet has several features enabled by AI, from improving lighting to reducing blur in fuzzy video. The biggest challenge is that these features need to work in real time. That's where WebAssembly (Wasm) comes in, tapping into the full power of a computer's CPU to enable real-time video processing.
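To illustrate the kind of pipeline this enables (not Meet's actual implementation), here's a sketch that routes camera frames through a WebAssembly-backed transform using Chrome's insertable streams for MediaStreamTrack. The `wasmProcessFrame` function is a hypothetical export from a compiled Wasm module.

```ts
// Illustrative real-time, client-side video processing pipeline.
declare const MediaStreamTrackProcessor: any; // Chrome-only insertable streams API
declare const MediaStreamTrackGenerator: any;
declare function wasmProcessFrame(frame: any): any; // hypothetical Wasm export

async function processCameraInRealTime(): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [track] = stream.getVideoTracks();

  const processor = new MediaStreamTrackProcessor({ track });
  const generator = new MediaStreamTrackGenerator({ kind: "video" });

  const transform = new TransformStream({
    transform(frame, controller) {
      const processed = wasmProcessFrame(frame); // e.g. relight or deblur the frame
      frame.close();
      controller.enqueue(processed);
    },
  });

  // Every frame is processed locally, with no round trip to a server.
  processor.readable.pipeThrough(transform).pipeTo(generator.writable);
  return new MediaStream([generator]);
}
```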

These are just a few real-world examples of AI happening on the web. Several other companies experimented with the built-in AI APIs, some of which shared their work in case studies.

Client-side Web AI agents to build smarter future user experiences

Jason Mayes walks through the future of the internet: Web AI Agents. The web has an agentic future, bringing AI capabilities directly into the browser to perform useful work on your behalf, beyond what large language models (LLMs) can do on their own.

With a client-side approach, there's enhanced privacy, reduced latency, and potentially significant cost savings. Agents let you upgrade your existing website to perform tasks autonomously for a user, dynamically selecting and using exposed tools, often in a loop, to complete complex, multi-step tasks (see the sketch after the list below).

Agents can:

  • Plan and divide work into sub-tasks, using multi-step planning to break a complex problem down into logical steps to complete.
  • Select the best tools, whether that's functions, API calls, or datastore access to augment the language model's base knowledge, and then perform actions on the outside world.
  • Retain context-based memory, built from prior outputs of the agent or external tools. Short-term memory acts like a FIFO buffer of context history up to the model's context window size, while long-term memory can use a vector database to store and recall information from prior conversation sessions or entirely different data sources.
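Here is that sketch: a tool-selection loop with a FIFO-style short-term memory. `callModel` and the two tools are hypothetical stand-ins rather than any specific library's API.

```ts
// Minimal agent loop: the model picks a tool, the tool runs client-side,
// and the result is fed back until the task is done.
type Tool = {
  name: string;
  description: string;
  run: (input: string) => Promise<string>;
};

// Hypothetical call into an LLM (for example the Prompt API or a cloud model)
// that returns either a tool choice or a final answer.
declare function callModel(
  history: string[]
): Promise<{ tool?: string; input?: string; answer?: string }>;

const tools: Tool[] = [
  { name: "searchPage", description: "Find text on the current page", run: async () => "" }, // stub
  { name: "lookupOrders", description: "Query the site's order API", run: async () => "" }, // stub
];

async function runAgent(task: string, maxSteps = 5): Promise<string> {
  const history: string[] = [`Task: ${task}`]; // short-term, FIFO-style context

  for (let step = 0; step < maxSteps; step++) {
    const decision = await callModel(history);
    if (decision.answer) return decision.answer; // the model considers the task done

    const tool = tools.find((t) => t.name === decision.tool);
    if (!tool) break;
    const observation = await tool.run(decision.input ?? "");
    history.push(`Used ${tool.name}: ${observation}`);

    // Evict the oldest observations once the context grows too large (FIFO).
    while (history.length > 20) history.splice(1, 1);
  }
  return "Could not complete the task within the step budget.";
}
```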

Web AI agents are designed to integrate with existing web technologies using JavaScript. Ultimately, it's important that hardware keeps getting better at running models in the browser. Looking towards the future, technology like WebNN will play a key role in optimizing model execution across CPUs, GPUs, and NPUs. With the trend towards smaller LLMs and continued hardware advances, client-side AI will only become more powerful.

Consider using a hybrid approach, combining on-device processing with strategic cloud calls, so you can create intelligent, responsive, and personalized user experiences in the browser right now. Your investment in a Web AI approach should soon pay off as devices become ever more capable of running LLMs.

Catch up on Google I/O 2025

We've released all of the talks for Google I/O 2025, with a playlist dedicated to web developers. Watch even more on io.google/2025.