Skip to content
Advertisement
When Appliance Fail?

Google's Gemma 4 AI models get 3x speed boost by predicting future tokens

Up to 3x the speed with no loss of quality—is it too good to be true?

schedule 15:44 visibility 65 views
Google's Gemma 4 AI models get 3x speed boost by predicting future tokens
Source: Ars Technica

Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI. Google's take on edge AI could be getting even faster already with the release of Multi-Token Prediction (MTP) drafters for Gemma. Google says these experimental models leverage a form of speculative decoding to take a guess at future tokens, which can speed up generation compared to the way models generate tokens on their own.

The latest Gemma models are built on the same underlying technology that powers Google's frontier Gemini AI, but they're tuned to run locally. Gemini is optimized to run on Google's custom TPU chips, which operate in enormous clusters with super-fast interconnects and memory. A single high-power AI accelerator can run the largest Gemma 4 model at full precision, and quantizing will let it run on a consumer GPU.

Gemma allows users to tinker with AI on their hardware rather than sharing all their data with a cloud AI system from Google or someone else. Google also changed the license for Gemma 4 to Apache 2.0, which is much more permissive than the custom Gemma license Google employed for previous releases. However, there are inherent limitations in the hardware most people have to run local AI models. That's where MTP comes in.

Read full article

Comments

newspaper

Originally published at

Ars Technica

open_in_new Read Full Article

Related Articles

GOG apologizes for emailing people Nazi symbols
Technology

GOG apologizes for emailing people Nazi symbols

GOG sent a newsletter about the game The End of the Sun on June 5th that included symbols associated with the Nazi SS. The Steam competitor issued a statement attributing the inclusion to a "series of mistakes," including miscommunication with the...

The Verge
Meta made its own AI-generated clickbait news feed
Technology

Meta made its own AI-generated clickbait news feed

Facebook has long been filled with feeds of clickbait articles. Now, Meta is making its own clickbait articles with AI. The standalone Meta AI app now has a "For You" section that populates a list of clickbait-style stories for you to read. But the...

The Verge

Read More

Here comes new Siri again
Technology

Here comes new Siri again

Apple has been on its back foot, AI-wise, for the past few years. But in a strange way, playing from behind might not be such a bad move. At WWDC on Monday, Apple appears to be getting ready to reintroduce us to the new Siri. Again. As a reminder...

The Verge
The next YouTube phenomenon hitting the big screen
Technology

The next YouTube phenomenon hitting the big screen

Hi, friends! Welcome to Installer No. 131, your guide to the best and Verge-iest stuff in the world. (If you're new here, welcome, happy last week of productivity before the World Cup starts, and also you can read all the old editions at the...

The Verge
Your Appliance Broke?
Reliable Repair for