Chrome Built-in AI: Simple Multimodal Vision

This example shows Chrome's built-in AI handling multimodal input, in this case images. Upload an image file or take a picture with your webcam, and the AI generates a description of the image content.
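The image-to-description flow can be sketched in TypeScript roughly as follows. This is a minimal sketch, not this page's actual script. It assumes the experimental Prompt API shape (`create` with `expectedInputs`, and `prompt` taking a user message whose `content` mixes `text` and `image` parts), which has changed between Chrome versions. The names `buildImagePrompt`, `describeImage`, `ModelFactory`, and `ModelSession` are hypothetical and introduced only for illustration. The model object is passed in as a parameter so the helpers can be exercised with a stub outside Chrome.

```typescript
// Hypothetical, minimal types modelling the parts of the experimental
// Prompt API that this sketch assumes. They are not official typings.
type ContentPart =
  | { type: "text"; value: string }
  | { type: "image"; value: object }; // in Chrome: e.g. a Blob or a canvas

interface PromptMessage {
  role: "user";
  content: ContentPart[];
}

interface ModelSession {
  prompt(messages: PromptMessage[]): Promise<string>;
}

interface ModelFactory {
  create(options: { expectedInputs: { type: string }[] }): Promise<ModelSession>;
}

// Build a single user message that pairs a text instruction with an image.
function buildImagePrompt(
  image: object,
  instruction: string = "Describe this image."
): PromptMessage[] {
  return [
    {
      role: "user",
      content: [
        { type: "text", value: instruction },
        { type: "image", value: image },
      ],
    },
  ];
}

// Create a session that declares image input, then ask for a description.
async function describeImage(model: ModelFactory, image: object): Promise<string> {
  const session = await model.create({ expectedInputs: [{ type: "image" }] });
  return session.prompt(buildImagePrompt(image));
}
```

In a page, `image` could be the `File` chosen in an upload field or a canvas holding a webcam frame, and `model` would be the global `LanguageModel` object if the browser exposes it.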

Image Input

Upload an image file:


Capture with your webcam:

Webcam idle.

Status and Output

Ready. Choose an image input method.


A Note on Chrome Flags

To use this feature, the "Enable built-in AI" flag must be enabled in Chrome. Open a new tab, paste the link below, press Enter, enable the flag, and then restart Chrome.
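Once the flag is enabled and Chrome has restarted, a page can check whether image input is actually usable before showing the upload and webcam controls. The sketch below is one way to do that, assuming the experimental Prompt API exposes an `availability` method that resolves to `"unavailable"`, `"downloadable"`, `"downloading"`, or `"available"`. The helper names `statusMessage` and `checkImageSupport`, the `AvailabilityChecker` type, and most of the status strings are hypothetical.

```typescript
// Assumed availability states of the experimental Prompt API.
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

// Hypothetical minimal type for the object that reports availability.
interface AvailabilityChecker {
  availability(options: { expectedInputs: { type: string }[] }): Promise<Availability>;
}

// Status lines for the page; "missing" means the API object was not found.
const STATUS: Record<Availability | "missing", string> = {
  available: "Ready. Choose an image input method.",
  downloadable: "The model has not been downloaded yet.",
  downloading: "The model is still downloading.",
  unavailable: "Image input is unavailable in this browser or on this device.",
  missing: "Built-in AI not found. Enable the Chrome flag and restart Chrome.",
};

function statusMessage(state: Availability | "missing"): string {
  return STATUS[state];
}

// Ask whether a session with image input could be created, and return
// a status line. Pass undefined when the API object does not exist.
async function checkImageSupport(model: AvailabilityChecker | undefined): Promise<string> {
  if (!model) {
    return statusMessage("missing");
  }
  const state = await model.availability({ expectedInputs: [{ type: "image" }] });
  return statusMessage(state);
}
```

In a page, the argument would be the global `LanguageModel` object when it exists, and `undefined` otherwise.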



My GitHub

You can find more of my work on my hpssjellis GitHub page:

By Jeremy Ellis LinkedIn