A slick editing job made that Google Gemini video so amazing

With Google Gemini, things are not as they seem.

On Wednesday, Google released Gemini, its latest AI model. The six-minute demo video showcased Gemini’s impressive capabilities, such as following the ball in a cup-shuffling trick, identifying countries on a map, and recognizing a simple drawing of a duck.

The video generated excitement among tech enthusiasts on social media, leading many to believe that progress toward artificial general intelligence (AGI) is accelerating. However, some viewers argue that the video doesn’t live up to the hype, suggesting that Gemini may not be as groundbreaking as initially thought.

Soon after the video’s release, experts found that it overstated what Gemini can do. Parmy Olson, writing for Bloomberg, reported that the video had been edited and manipulated in several ways.

How was the Gemini demo embellished?

Google has confirmed that the video was not recorded in real time. Instead, it was made using still image frames from the footage, with Gemini prompted via text, according to a Google spokesperson.

In the video, a person’s voice appears to guide Gemini, but the audio was added afterward. The Google spokesperson said that the voiceover reads real excerpts from the prompts used to generate Gemini’s output. Moreover, the YouTube description notes that latency has been reduced and Gemini’s outputs have been shortened for brevity. In other words, the quick responses seen in the video do not reflect real-time performance.

After it emerged that the video’s polished editing had made Gemini look more capable, Oriol Vinyals, VP of Research and Deep Learning Lead at Google DeepMind, responded on X (formerly Twitter). He stated, “All the user prompts and outputs in the video are real, shortened for brevity.” Vinyals emphasized that the video aims to illustrate potential multimodal user experiences with Gemini and was created to inspire developers.

Although the demo did show Gemini’s actual reasoning over text prompts and still photos, some users felt deceived. Some questioned whether prompts could honestly be described as “real but shortened,” with one commenter stating, “Sorry, ‘real but shortened’ isn’t a thing.”

The controversy surrounding the demo overshadowed some of Gemini’s notable accomplishments. The blog post detailing how the video was created highlighted Gemini’s impressive reasoning skills. Other promotional videos showcased specific use cases, such as extracting scientific data from a large number of research papers or helping parents assist their children with math and physics homework.

Ultimately, whether Gemini’s abilities meet, exceed, or fall short of expectations is left to the judgment of users.
