Trying Gemini Nano as an on-device LLM in Chrome Dev Channel

Gemini Nano is reportedly being considered for inclusion as an on-device LLM in an upcoming version of Chrome. It can already be tried in the Chrome Dev Channel.

`Gemini Nano`

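One of the models in "Gemini", the family of generative AI models developed by Google. It is a small model optimized specifically for on-device tasks. Despite its size, it is multimodal and can process images, audio, and so on. It is already being integrated into certain smartphones such as the Pixel 8/8a and the Samsung Galaxy S24.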

`Chrome Dev Channel`

The Chrome browser has five channels: Stable, Extended Stable, Beta, Dev, and Canary. Dev lets you try new features early.

Features in the Dev channel are not fully stable. Developers can use the Dev channel to preview new Chrome features (about 9 to 12 weeks ahead of Stable).

hata

The rough steps for using it are as follows.

Install the Dev Channel build of Chrome on your PC.
Change the following two flags and update one component.
1. chrome://flags/#optimization-guide-on-device-model:
  - Default → Enabled BypassPerfRequirement
2. chrome://flags/#prompt-api-for-gemini-nano:
  - Default → Enabled
3. Optimization Guide On Device Model in chrome://components
  - Check for an update. On my machine it ended up at version 2024.6.5.2205.

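At first "Optimization Guide On Device Model" did not show up at all, which left me puzzled. After restarting Chrome Dev and having some tea, though, it appeared at some point. If it does not show up for you, give that a try.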
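Once the flags and the component are in place, a quick check from the DevTools console can confirm whether the model is usable. The following is a minimal sketch, assuming the experimental `window.ai.canCreateTextSession()` call used later in this post. The helper name `checkGeminiNano` and the `"unsupported"` return value are made up for illustration, and the API itself may change in later Chrome versions.

```javascript
// Hypothetical helper: reports whether the experimental Prompt API looks usable.
// `ai` is expected to be `window.ai` in Chrome Dev; it is passed in as an
// argument so the function can also be exercised with a stub.
async function checkGeminiNano(ai) {
    if (!ai || typeof ai.canCreateTextSession !== "function") {
        // Flags not enabled, or a browser without the experimental API.
        // "unsupported" is an illustrative value, not part of the API.
        return "unsupported";
    }
    // This post only relies on the value "readily"; anything else is passed through as-is.
    return await ai.canCreateTextSession();
}

// In the DevTools console of Chrome Dev:
// checkGeminiNano(window.ai).then(console.log);
```

If this does not print `readily`, rechecking the two flags and the component version is probably the first thing to try.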

What is `Enabled BypassPerfRequirement`?

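I think it is short for "Enabled, bypass performance requirement": enable the feature while ignoring the performance requirements for now. Even a small model is still an LLM, so depending on the PC's CPU/GPU it may be barely usable, or could even have adverse effects. My understanding is that this option is for when you want to evaluate it anyway.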

References


I wrote some code that summarizes an esa article.

(async () => {
    // "readily" means the on-device model can be used right away.
    if ((await window.ai.canCreateTextSession()) === "readily") {
        console.log("Gemini Nano is ready.");
        // Grab the article body of the esa post.
        const postText = document.querySelector("div.post-body").innerText;
        // Measure session creation plus inference time.
        console.time("prompt");
        const session = await window.ai.createTextSession();
        // The prompt says: "Summarize the following text in three lines."
        const result = await session.prompt(`次の文章を３行に要約してください。\n\n----\n\n${postText}`);
        console.timeEnd("prompt");
        console.log(result);
        window.alert(result);
    } else {
        console.log("Gemini Nano is NOT ready.");
    }
})();

As a bookmarklet (JS saved as a bookmark so that it runs on click), it looks like this:

javascript: (async () => { if ((await window.ai.canCreateTextSession()) === "readily") { console.log("Gemini Nano is ready."); const postText = document.querySelector("div.post-body").innerText; console.time("prompt"); const session = await window.ai.createTextSession(); const result = await session.prompt(`次の文章を３行に要約してください。\n\n----\n\n${postText}`); console.timeEnd("prompt"); console.log(result); window.alert(result); } else { console.log("Gemini Nano is NOT ready."); } })();
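Turning a script into a bookmarklet by hand is mostly a matter of joining the lines and prefixing `javascript:`. A minimal sketch of that conversion follows; the helper name `toBookmarklet` is hypothetical. It assumes the source contains no `//` line comments and no multi-line template literals, and that every statement ends with a semicolon, since joining lines would otherwise change the meaning of the code.

```javascript
// Hypothetical helper: collapses a multi-line script into a one-line bookmarklet URL.
// Assumes: no `//` comments, no multi-line template literals, explicit semicolons.
function toBookmarklet(source) {
    const oneLine = source
        .split("\n")
        .map((line) => line.trim())          // drop indentation
        .filter((line) => line.length > 0)   // drop blank lines
        .join(" ");
    return `javascript:${oneLine}`;
}

// Example: paste the result into the URL field of a new bookmark.
console.log(toBookmarklet("(() => {\n    alert(1);\n})();"));
```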

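It varies a lot with the length of the article, but latency ranges from a few seconds to more than 30 seconds in some cases. There is no network traffic involved, yet processing still takes quite a while. That may be because I am on an M1 MacBook Air.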
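Since the latency seems to grow with the amount of text, one thing that might be worth trying is capping the input before building the prompt. This is only a sketch under that assumption and I have not measured its effect. The helper name `buildSummaryPrompt` and the `maxChars` parameter are hypothetical and are not part of the Prompt API.

```javascript
// Hypothetical helper: caps the article text before building the summary prompt.
// `maxChars` counts UTF-16 code units and is an illustrative parameter only.
function buildSummaryPrompt(text, maxChars = 4000) {
    const body = text.length > maxChars ? text.slice(0, maxChars) : text;
    // Same prompt as in this post: "Summarize the following text in three lines."
    return `次の文章を３行に要約してください。\n\n----\n\n${body}`;
}

// Usage with the session from the snippet above (browser only):
// const result = await session.prompt(buildSummaryPrompt(postText, 2000));
```

Cutting the text obviously risks dropping the part of the article that matters, so any limit would need tuning per site.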


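While I was at it, I also wrote code that summarizes articles on the Digital Agency (デジタル庁) site. Apart from the document.querySelector/querySelectorAll call that fetches the article body, it is exactly the same as before.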

Code:

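(async () => {
    // "readily" means the on-device model can be used right away.
    if ((await window.ai.canCreateTextSession()) === "readily") {
        console.log("Gemini Nano is ready.");
        // The second <div> directly under article.article holds the article body.
        const postText = document.querySelectorAll("article.article > div")[1].innerText;
        // Measure session creation plus inference time.
        console.time("prompt");
        const session = await window.ai.createTextSession();
        // The prompt says: "Summarize the following text in three lines."
        const result = await session.prompt(`次の文章を３行に要約してください。\n\n----\n\n${postText}`);
        console.timeEnd("prompt");
        console.log(result);
        window.alert(result);
    } else {
        console.log("Gemini Nano is NOT ready.");
    }
})();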

The bookmarklet for the above:

javascript:(async () => { if ((await window.ai.canCreateTextSession()) === "readily") { console.log("Gemini Nano is ready."); const postText = document.querySelectorAll("article.article > div")[1].innerText; console.time("prompt"); const session = await window.ai.createTextSession(); const result = await session.prompt(`次の文章を３行に要約してください。\n\n----\n\n${postText}`); console.timeEnd("prompt"); console.log(result); window.alert(result); } else { console.log("Gemini Nano is NOT ready."); } })();
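Because the two snippets in this post differ only in how the article text is extracted, the shared part could be factored out. The following is a rough sketch of that idea, not what I actually ran. The function name `summarize` and the `extractText` callback are hypothetical, while `canCreateTextSession`, `createTextSession`, and `prompt` are the same experimental calls used above.

```javascript
// Hypothetical refactoring: only the text extraction is site-specific.
// `ai` is expected to be `window.ai`; `extractText` returns the article text.
async function summarize(ai, extractText) {
    if ((await ai.canCreateTextSession()) !== "readily") {
        return null; // Gemini Nano is not ready.
    }
    const session = await ai.createTextSession();
    // Same prompt as above: "Summarize the following text in three lines."
    return session.prompt(`次の文章を３行に要約してください。\n\n----\n\n${extractText()}`);
}

// Usage in the browser, with the two selectors from this post:
// summarize(window.ai, () => document.querySelector("div.post-body").innerText).then(alert);
// summarize(window.ai, () => document.querySelectorAll("article.article > div")[1].innerText).then(alert);
```

Passing `ai` in as an argument also makes it possible to try the flow with a stub when the model is not available.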

Generated result and processing time:


Plain summarization like this will probably get much easier as more high-level APIs become available.

The referenced document hints at the following, among others:

Translation API
Summarization API
Fine-tuning (LoRA) API

This architecture diagram also gets the imagination going.
