Gemini Live Breaks The Fourth Wall As Your Own Personal Superpowered Eyes

  • Writer: Don Batsford
  • Jul 20
  • 3 min read

Google's Gemini Live has introduced the ability to use a device's camera to show the AI objects and environments.

This moves beyond simple voice interaction by integrating the camera and screen sharing directly into conversations. It allows the AI to "see" what the user sees through a continuous video feed that Gemini can interpret and respond to in real time. It is like having a tour guide, handyperson, and professor always with you.


For example, point your phone at a plant and ask for identification or ask why it looks sad (Is it water? Is it bugs? Too much or too little light?), or show the AI a broken appliance and receive troubleshooting steps.


Seeing the real world in real time is a notable step forward, transforming AI from a passive assistant to an active participant in the user's environment, and it feels different. It feels novel. It feels like AI will be able to help more people, in more places where tech hasn’t been able to go before.


The experience of walking around a space, showing the AI various objects, and asking questions about the environment has a sense of gravity; it feels like a tangible advancement in AI. The low-latency responses and the ability to interrupt Gemini at any point create a more natural and fluid interaction. This is particularly useful in scenarios that require hands-on assistance, such as following a complex recipe, board game, or craft, performing a DIY project, or even getting real-time feedback on an outfit. Or, in my case, asking questions like "How old is this?", "What is this called in a different language?", or "Just talk to me about this random thing…"


On Android devices, you can activate Gemini Live with a long press of the power button and, from there, start a "Live" session with camera or screen sharing. You can switch between the front and rear cameras, or share the screen to discuss on-screen content.

While the concept of using a camera to interact with an AI is not entirely new, the implementation in Gemini Live stands out. The ability to have a fluid, back-and-forth dialogue about one's immediate surroundings is a powerful tool for marketers and AI enthusiasts alike, and it offers a glimpse into a future where the line between the digital and physical worlds becomes increasingly blurred.


More examples:

  • “Hey Gemini! What can you tell me about this painting? Who painted it, and when and where was it painted?”

  • While pointing the camera at a circuit board: "Identify the components on this board and explain their functions. Based on the visible connections, what is the likely purpose of this device?"

  • During a live event: "Listen to the speaker and provide a real-time summary of their key points. Cross-reference their statements with recent industry news and identify any potential inaccuracies or noteworthy insights."

  • Screen-sharing a data dashboard: "Analyze the trends in this data. What are the key performance indicators, and are there any anomalies or correlations that I should be aware of? Project the likely outcomes for the next quarter based on this data."

  • While cooking: "I've substituted an ingredient in this recipe. How will this affect the cooking time and final texture? What adjustments should I make to compensate?"

  • Pointing the camera at a piece of code on a monitor: "Review this code for any errors or inefficiencies. Suggest optimizations for better performance and readability. How can this code be integrated with the existing codebase on my screen?"

  • During a workout: "Analyze my form for this exercise. What adjustments can I make to improve my technique and prevent injury? Track my repetitions and heart rate, and provide feedback on my performance."

  • Pointing the camera at a historical landmark: "Provide a detailed history of this landmark, including its architectural style and significant events that took place here. Are there any hidden details or features that I might be missing?"



 
 
 
