iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
🍑

Technical Notes on "AI Anno" in the Tokyo Gubernatorial Election

に公開

Introduction

In the Tokyo gubernatorial election, "AI" has become a prominent keyword.
In particular, "AI Yuriko" and "AI Anno" are conducting election campaigns using the candidates themselves as AI avatars.

As someone who also develops AIVtubers on the YouTube platform, I am paying close attention to this movement!

This time, I will note down the technical aspects of "AI Anno" that I found interesting.

What is "AI Anno"?


It is an AI avatar that has learned Takahiro Anno's manifesto.
It responds to questions 24 hours a day via online (YouTube Live) and phone.
Here, we will look at AI Anno on YouTube Live.

How to ask "AI Anno" a question?

https://www.youtube.com/watch?v=pj6AHriidnU

"AI Anno" will answer your questions when you post them in the YouTube Live comment section.
There are a few precautions.
The following is a transcript of what is written in the YouTube Live description field.

Reception of "AI Anno"

https://x.com/takiyori0608/status/1804830508326519203
https://x.com/sa_yakusa/status/1804646605813117176
https://x.com/suzumaro2/status/1804300712245105085
https://x.com/seiseieiaieii/status/1804723747716567307

Technologies Likely Used in "AI Anno"

These are my predictions as an AIVtuber operator. For your reference!

Speech Synthesis AI

Speech synthesis AI trained on Takahiro Anno's own voice.
A voice AI model was created using technology that learns from one's own voice.
OpenAI also has "Voice Engine," which is a good technical example to refer to!
https://www.watch.impress.co.jp/docs/news/1580522.html

Text To Speech (TTS)

Technology to generate speech from text.

TTS (Text-to-Speech) is the task of generating natural-sounding speech from text input. TTS models can be extended to a single model that generates speech from multiple speakers and multiple languages.

LLM

Text generation AI such as GPT-4o.
I suspect that generative AI API services like OpenAI's are being used for stable streaming.

https://openai.com/index/hello-gpt-4o/

RAG

Creating answers using the manifesto as a database?

I'm not very familiar with RAG, so I'll omit the details below.
※ Someone please tell me!!! lol

3D Avatar

Creating 3D avatars with VRoid Studio

When I saw "AI Anno,"

"It's VRoid!!!"

I shouted (lol).

Since I have also created 3D avatars with VRoid Studio, the face looked particularly familiar.

What is VRoid Studio?
VRoid Studio is an application for Windows, macOS, and iPad that allows you to create 3D models of humanoid characters.
In addition, the character models created can be used for various purposes, whether commercial or non-commercial.
Whether you want to communicate with others in VR/AR space or work as a Virtual YouTuber, please try using characters as your own alter ego in VR/AR space or for producing works such as videos!
— From the official VRoid documentation

I wrote about how to make 3D avatars in detail here.
https://zenn.dev/yasuna/books/commentary_aituber/viewer/3dvrm

Animation Motion

Executing motions like "Handshake" in VRMA format
There is an animation format called "VRM Animation (.vrma)" that can be used with the "VRM (.vrm)" common format for 3D character models.
I suspect that motion files like "Handshake" are prepared using this technology.
https://vroid.com/news/6HozzBIV0KkcKf9dc1fZGW

Summary

I operate an AIVtuber as an individual developer, and it was a huge shock to see a candidate appearing in the Tokyo gubernatorial election using similar technology to communicate with the public.
I intend to continue keeping an eye on "AI Anno"!

I post about development and management on X like this, so please follow me!
https://twitter.com/yasun_ai

Oh, and of course, please subscribe to my explanation-based AITuber channel! 💪('ω'💪)
https://www.youtube.com/@sns-university

GitHubで編集を提案

Discussion