Posted to the Greek Alphabet Software Academy Publication

🐕

Calling the Gemini API from a Web App

tmassh

Published 2023/12/20

This is the 12/19 entry in the Google Developer Groups in Japan Advent Calendar 2023.

Hello, I'm tmassh.
I mainly work as an Evangelist for Jagu'e'r, the Google Cloud user group for enterprises, and I also assist the organizers of Google Developer Groups Kobe (GDG Kobe).
I'm active in several other tech communities as well.
Today I quickly put together a multimodal Web App using the recently released Gemini API, so I'm writing it up here.

tl;dr

You can call the Gemini API from a Web App. I hope this is a useful reference when you are building a first, simple PoC.

Steps

Get an API key
Build a simple front end
Edit it so it can call the API

Get an API key

You can get one from Google AI Studio.

Just click "Create API Key in new project", so it's painless.
Copy the API key that is displayed.

Build a simple front end

This time I'll build it with React.
While I'm at it, I'll try out Material 3.

First create a React project, then move into its directory and install Material 3.

npm install @material/web

Place just a button and a text area.

App.js

import './App.css';
import '@material/web/button/filled-button';
import '@material/web/textfield/filled-text-field';

function App() {
  return (
    <div className="App">
      <md-filled-text-field
        type="textarea"
        rows="5">
      </md-filled-text-field>
      <md-filled-button id='button'>send</md-filled-button>
    </div>
  );
}

export default App;

Any CSS will do.

Install the SDK

npm install @google/generative-ai

Start by sending text

Wire it together.

gemini.js

import { GoogleGenerativeAI } from "@google/generative-ai";
import { API_KEY } from "./config";

const genAI = new GoogleGenerativeAI(API_KEY);

export async function geminiRun(prompt) {
  const model = genAI.getGenerativeModel({ model: "gemini-pro"});

  const result = await model.generateContent(prompt);
  const response = await result.response;
  const text = response.text();
  console.log(text);

  return text;
}
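The `./config` module imported above is not shown in this article. As a minimal sketch, assuming a Create React App style setup, it could read the key from an environment variable instead of hard-coding it. The helper `readApiKey` and the variable name `REACT_APP_GEMINI_API_KEY` are assumptions made for illustration, not something from the original code.

```javascript
// config.js (hypothetical sketch; the article does not show this file)

// Reads the key from an env-like object and fails loudly if it is missing.
// REACT_APP_GEMINI_API_KEY is an assumed name that follows Create React App's
// REACT_APP_ prefix convention for variables exposed to the browser bundle.
function readApiKey(env) {
  const key = (env.REACT_APP_GEMINI_API_KEY || "").trim();
  if (!key) {
    throw new Error("REACT_APP_GEMINI_API_KEY is not set");
  }
  return key;
}

// In the real file you would likely finish with something like:
//   export const API_KEY = readApiKey({
//     REACT_APP_GEMINI_API_KEY: process.env.REACT_APP_GEMINI_API_KEY,
//   });
```

Either way the key ends up inside the browser bundle, so this arrangement only suits a local PoC like this one.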

App.js

import './App.css';
import React,{useRef} from 'react';
import '@material/web/button/filled-button';
import '@material/web/textfield/filled-text-field';
import { geminiRun } from './gemini';

function App() {
  const inputRef = useRef(null);
  const outputRef = useRef(null);

  const onButtonClick = async () => {
    if (!inputRef.current) { return; }
    
    const inputText = inputRef.current.value;
    const response = await geminiRun(inputText);
    
    outputRef.current.value = response;
  }


  return (
    <div className="App">
      <p>Input</p>
      <md-filled-text-field
        id="input"
        type="textarea"
        rows="5"
        ref={inputRef}>
      </md-filled-text-field>

      <p>Output</p>
      <md-filled-text-field
        id="output"
        type="textarea"
        rows="5"
        readonly="true"
        ref={outputRef}>
      </md-filled-text-field>

      <md-filled-button type="button" id='button' onClick={onButtonClick}>send</md-filled-button>
    </div>
  );
}

export default App;

Here is how it looks so far.

Going multimodal

Add support for gemini-pro-vision.

gemini.js

import { GoogleGenerativeAI } from "@google/generative-ai";
import { API_KEY } from "./config";

const genAI = new GoogleGenerativeAI(API_KEY);

export async function geminiMultiRun(prompt, file) {
  const model = genAI.getGenerativeModel({ model: "gemini-pro-vision"});
  const imageFile = await fileToGenerativePart(file);

  const result = await model.generateContent([prompt, imageFile]);
  const response = await result.response;
  const text = response.text();
  console.log(text);

  return text;
}

async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}
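The `split(',')[1]` in `fileToGenerativePart` works because `readAsDataURL` returns a string of the form `data:<mime type>;base64,<payload>`, and only the payload after the comma is sent. The sketch below shows the same decomposition as a standalone function that runs without a browser. `dataUrlToInlinePart` is a hypothetical name introduced here for illustration.

```javascript
// Splits a base64 data URL into the { inlineData: { data, mimeType } } shape
// used by fileToGenerativePart above.
// Hypothetical helper for illustration; not part of the SDK or the article.
function dataUrlToInlinePart(dataUrl) {
  // Captures the MIME type, skips optional parameters such as ";charset=...",
  // then captures everything after ";base64," as the payload.
  const match = /^data:([^;,]+)(?:;[^;,]+)*;base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("not a base64 data URL");
  }
  return { inlineData: { data: match[2], mimeType: match[1] } };
}

// Example input: a truncated PNG header, base64-encoded.
const examplePart = dataUrlToInlinePart("data:image/png;base64,iVBORw0KGgo=");
console.log(examplePart.inlineData.mimeType);
```

The article's version takes the MIME type from `file.type` instead of parsing it out of the URL, which is simpler when a `File` object is at hand.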

Make the app able to handle images.

App.js

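// Requires: import React, { useRef, useState } from 'react';
function App() {
  // ... (refs from the text-only version)
  const [file, setFile] = useState(null);   // the selected File object
  const [image, setImage] = useState(null); // object URL for previewing it

  const onImageChange = (event) => {
    if (event.target.files && event.target.files[0]) {
      setFile(event.target.files[0]);
      setImage(URL.createObjectURL(event.target.files[0]));
    }
  }
  // ...
  const onButtonClick = async () => { /* ... */ }
}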
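The body of `onButtonClick` is elided above. One possible shape, which is an assumption and not the author's actual code, is to call `geminiMultiRun` when an image has been selected and fall back to `geminiRun` otherwise. In the sketch below, `runPrompt` and its `runners` parameter are hypothetical names. The two functions are passed in as arguments so that the snippet runs without the SDK.

```javascript
// Decides which of the article's two functions to call.
// Hypothetical helper: `runners` is expected to be { geminiRun, geminiMultiRun }.
// Returns null when there is nothing to send, otherwise whatever the chosen
// runner returns (a Promise of text when the real functions are passed in).
function runPrompt(prompt, file, runners) {
  const text = (prompt || "").trim();
  if (!text) {
    return null;
  }
  return file
    ? runners.geminiMultiRun(text, file) // image selected: multimodal call
    : runners.geminiRun(text);           // no image: text-only call
}

// Possible use inside the click handler (assumed, not from the article):
//   const response = await runPrompt(inputRef.current.value, file,
//                                    { geminiRun, geminiMultiRun });
//   if (response !== null) { outputRef.current.value = response; }
```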

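For now, I've got something that looks the part!

Closing

It's great that it integrates into a Web App so easily. Multimodal isn't scary either! PoCs should be quick to build.
The quickstarts are thorough, so please try playing with it over the winter break.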
https://ai.google.dev/tutorials?hl=ja
