Translated by AI
AI VTuber Development Diary: From Creating AI Characters to YouTube Streaming with OBS
Introduction
Target Audience
- People who want to create AI characters in the future
- People who want to know how to use OBS Studio and stream on YouTube
- People who want to know how much can be achieved in a limited period of 15 days
Background
With the news that Shizuku AI received large-scale funding from a major US VC, bringing its corporate valuation to approximately 12 billion yen, I think many people have become interested in AI VTubers.
Prompted by this news, I have decided to publish the diary I kept while developing an AITuber over a limited period of 15 days, for anyone thinking of developing their own AI VTuber or AI character.
Since the content is from about six months ago, some of the information is outdated, but it also covers how to create AI characters and how to stream on YouTube using OBS Studio, so I hope it serves as a starting point for figuring out what to do to begin activities.
So, let's get started.
Day 1: Talking with AI Nike-chan using aituber-kit
On June 16, 2025, I applied for the AITuber Meetup organized by NukoNuko-san.
Because I applied late, I am currently on the waitlist. However, believing in the possibility of participating, I will challenge myself to develop an AITuber.
There are only 18 days left until Friday, July 4th. I'll start by learning how to make an AITuber immediately.
I won't think about what kind of character I want to make or what I want to do at this point; I'll proceed with the policy of first making something that works.
Trying out aituber-kit
To get a feel for it, I'll first try using Nike-chan's aituber-kit.
I will try to do the development on EVO-X2 (Ubuntu) as much as possible.
$ git clone https://github.com/tegnike/aituber-kit.git
$ cd aituber-kit
$ pnpm install
$ pnpm run dev
By accessing http://localhost:3000 in a web browser, you can summon AI Nike-chan.
On EVO-X2, it moves very smoothly!


LLM Settings (When using Ollama)
- Gear icon -> AI Settings
- Select AI Service: select Ollama
- Enter URL: enter http://localhost:11434/api
- Select Model: select a model already installed in Ollama
  - It would be more convenient if this could be selected from a dropdown menu. I believe the list of models can be loaded from http://localhost:11434/api/tags.
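To illustrate that idea, the list of installed models really can be fetched from the tags endpoint. A minimal sketch, assuming the default Ollama port and that jq is installed:

```shell
# List models installed in Ollama via its HTTP API
# (assumes the default endpoint http://localhost:11434 and that jq is available).
curl -s http://localhost:11434/api/tags | jq -r '.models[]?.name'
```

Each line of output is a model name that could populate such a dropdown.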

TTS Settings (When using AivisSpeech Engine)
This setting should be straightforward.

If you haven't installed AivisSpeech Engine, you can install it using the following steps. On Windows or Mac, I think it's easier to just install the desktop app.
Installing AivisSpeech Engine
If Docker is not installed, install it by referring to something like the following.
If already installed, you can start AivisSpeech Engine with the following commands.
I encountered a permission error halfway through, but solved it by asking Claude.
$ git clone https://github.com/Aivis-Project/AivisSpeech-Engine.git
$ cd AivisSpeech-Engine
$ docker pull ghcr.io/aivis-project/aivisspeech-engine:cpu-latest
$ docker run --rm -p '10101:10101' \
    -v ~/.local/share/AivisSpeech-Engine:/home/user/.local/share/AivisSpeech-Engine-Dev \
    ghcr.io/aivis-project/aivisspeech-engine:cpu-latest
Now it's possible to talk with the LLM model selected in Ollama and play the output text as synthesized speech. That's all for today.
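As a side note, the engine can also be exercised directly from the command line. This is a sketch assuming the engine is running on its default port 10101 and exposes a VOICEVOX-compatible API; the style ID 888753760 is an example value, so check your own with the /speakers call first:

```shell
# Smoke-test AivisSpeech Engine directly (VOICEVOX-compatible HTTP API,
# default port 10101). The style ID 888753760 is an assumption: list yours first.
ENGINE=http://localhost:10101

# List available voices and the ID of their first style.
curl -s "${ENGINE}/speakers" | jq -r '.[]? | "\(.name)\t\(.styles[0].id)"'

# Synthesize a short line: build an audio query, then render it to a WAV.
if curl -sf "${ENGINE}/version" > /dev/null; then
  curl -s -G -X POST \
    --data-urlencode 'text=こんにちは' \
    --data-urlencode 'speaker=888753760' \
    "${ENGINE}/audio_query" \
  | curl -s -X POST -H 'Content-Type: application/json' -d @- \
    -o voice.wav "${ENGINE}/synthesis?speaker=888753760"
fi
```

If the engine is running, voice.wav should contain the synthesized audio.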
Day 2: Character Creation
Nike-chan summarized how to make an AITuber without using AITuberKit in the article "Starting AITuber Creation from Scratch." So, as a learning exercise, I will follow this article until I get stuck.
First, I'll start by creating a character.
Installing VTubeStudio
First, install Steam.
It was a bit confusing, but I completed the installation while creating an account by referring to the following page.
Next, download and install VTubeStudio.
Trying to LAUNCH.


It... moves!

Steps to Run VTubeStudio with OpenSeeFace Facial Recognition (for Linux)
I feel like I'm drifting away from AITuber development, but I found a page that seems to describe how to run VTubeStudio on Linux. Looking closely, it seems facial recognition is available. I'll try installing it for now.
$ git clone https://github.com/emilianavt/OpenSeeFace
$ cd OpenSeeFace
$ mv pyproject.toml pyproject.toml.orig
$ uv init --python 3.10
$ uv add onnxruntime opencv-python pillow numpy==1.26.1
$ echo -e "ip=0.0.0.0\nport=11573" > "$HOME/.local/share/Steam/steamapps/common/VTube Studio/VTube Studio_Data/StreamingAssets/ip.txt"
$ uv run facetracker.py -W 1280 -H 720 --discard-after 0 --scan-every 0 --no-3d-adapt 1 --max-feature-updates 900 -c 0 --ip 127.0.0.1 --port 11573
With this, if you have a webcam, facial recognition will work in real-time.
Then, in VTubeStudio's Settings -> Webcam Tracking, if you turn Camera ON, distance, face orientation, and eye movement (?) will be reflected on the character in real-time. Unfortunately, limb and facial expression movements don't seem to be reflected.

Now I've joined the ranks of VTubers! I've also created a YouTube channel, and I'll do my best aiming for the Gold Creator Award first!
Creating Character Model (Part 1)
Jokes aside, after all this hard work, I still haven't made an original character model.
It seems Live2D or VRM can be used in AITuberKit, so I'll research how to make each. Is VTubeStudio only for Live2D?
Live2D
Of course, you can buy a model, but you can also make one yourself.
I'm not very confident, but from what I've researched, it can be made using the following steps:
- Create a PSD with separated parts using Photoshop, CLIP STUDIO PAINT, Krita, etc.
- Add motion using Live2D Cubism Editor.
It seems good that it can also move in VTubeStudio, but the difficulty seems a bit high, so I'll set it aside for now.
VRM
You can also buy pre-made models for this, but it seems you can create models using VRoid Studio.
Immediately installed VRoid Studio on Steam.

Launch.


I'll postpone making a model for now, load a sample character, and try exporting a VRM file.
I confirmed that by storing it in aituber-kit/public/vrm, I can select the stored model in the AITuber-Kit settings screen. I tried loading AvatarSample_M.vrm.

Since I've confirmed it loads without problems, I'll be able to move my own model in AITuberKit if I create one in VRoid Studio. So, I'll finally try making my own model. It looks like it will take a lot of time.

I want a service that automatically selects parts or outfits close to an image I provide.
Day 3: Character Creation 2
Creating Character Model (Part 2)
Anyway, character creation is complete!
I ended up making two different outfits for no reason. Yukata is powerful. I wanted to change the color of the skirt for the second one, but it seemed a bit difficult, so I left it as default for now.




It seems I can customize outfits and various other details, and honestly I'd like to be more particular, but no amount of time would be enough for that, so I'll use this VRM model for now. It doesn't have a name yet.
Reconfirming the Route
According to Nike-chan's article, the future steps are as follows.
- Create characters
- Model creation: Using a VRM model
- Character settings
- Prepare tools to use
- LLM: Ollama/gemini-2.5-flash/gpt-4o
- TTS: AivisSpeech
- Streaming software: OBS Studio
- YouTube API
- YouTube channel creation
- AITuber program creation
- ...
Since the tool preparation is already mostly done, the remaining tasks are character settings, YouTube API settings, and AITuber program creation.
However, what I can make with this is a standard AITuber—one that retrieves live chat and responds based on it—and I feel that it's somehow not what I want to make.
So, what do I want to make? More specifically, for whom and what kind of AITuber (or AI character) do I want to create? I might need to think about that carefully before moving to the next step.
Day 4: Learning from Successful AITubers
Although I don't want to have too many preconceptions, it's also not good to know too little about AITubers.
So today, I'll investigate previous research on "AITuberList," a site where Nike-chan has compiled information about AITubers.
Checking the site, there are a total of 208 people registered as of writing this note. Due to time constraints, I've decided to narrow my research down to AITubers with more than 10,000 subscribers. Applying this filter leaves about 18 people, though some AITubers can probably be excluded.
Aren't they good at singing?
I played several AITuber channels.
What particularly caught my attention among them were the "singing streams." While their AI-ness clearly shows when they are just talking, you can't feel it at all when they are singing. In fact, it's quite moving. For example, the video below.
What does this mean? Are they creating the character's voice with speech synthesis software originally meant for singing? Or is a human singing?
And the motion looks quite natural too. I felt that those in the 50,000-subscriber class are really on a different level.
Day 5: OBS Studio
Since I got home late from work today, I'll just install the streaming software, OBS Studio, and stop there.
Looking at the official OBS Studio website, there is a Linux version, so I'll download that. Apparently, on Ubuntu 24.04 and later, OBS Studio can be installed via the official PPA.
$ sudo add-apt-repository ppa:obsproject/obs-studio
$ sudo apt update
$ sudo apt install obs-studio
I'll proceed by referring to the article "Thorough Explanation of Settings and Streaming Methods for YouTube Live with OBS".
Enabling YouTube Live Streaming
First of all, I need to enable live streaming on the YouTube side, so I'll enable it immediately.
It seems that once you request it and register your phone number, there is a 24-hour waiting period.

OBS Settings
Launch OBS and configure the necessary settings.


For now, I'll proceed with the settings like shown above and click the "Connect Account" button to link it with my Google account.


When I clicked "Next" on the screen above, I couldn't connect to the server...

Is it because live streaming isn't enabled yet? It's getting late, so that's all for today.
Day 6: OBS Studio Continued
Confirm that live streaming is enabled on YouTube Studio.

Next, when I connected to the server again in OBS Studio, it was successful this time.

Output/Video Settings
Referring to the article, I configured the settings as follows.


Audio Settings
I don't plan to use a microphone, and since I'm not sure how to configure it, I'll leave it at the default for now.

I also proceeded with the microphone filter settings as described in the article.

Hmm, I want to output the synthesized voice of the AI character as the microphone input, but I'm somewhat skeptical if this will work.
Next seems to be finally "How to Start Streaming on YouTube". It's gotten a bit late, so let's end here today.
Day 7: Live Streaming, from Comment Retrieval to Utterance via Speech Synthesis
Setting up the YouTube Data API Key
Obtain the YouTube Data API v3 key based on the following article.

For now, I configured it according to A-Uta-san's notes. Thank you for the information!
Settings on the AITuberKit Side
Open the Settings (gear icon) -> YouTube settings screen and turn on YouTube Mode. Fields for entering the YouTube API Key and YouTube Live ID will appear.

Enter the API key obtained above into the API Key field, and for the YouTube Live ID, enter the ID assigned after creating the stream frame in OBS. Specifically, if the live stream URL is https://youtube.com/live/G_4m84Q_K3Y, entering G_4m84Q_K3Y is sufficient.
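For reference, what YouTube Mode does behind the scenes can be sketched with two API calls: resolving the video's active live chat ID, then polling the chat messages. This is a sketch, not AITuberKit's actual code; YT_API_KEY and LIVE_ID are placeholders you would fill in:

```shell
# Sketch of live-chat retrieval via YouTube Data API v3.
# YT_API_KEY and LIVE_ID are placeholders; LIVE_ID is the part after
# https://youtube.com/live/ in the stream URL.
YT_API_KEY='YOUR_API_KEY'
LIVE_ID='G_4m84Q_K3Y'

# 1) Resolve the video's active live chat ID.
CHAT_ID=$(curl -s "https://www.googleapis.com/youtube/v3/videos?part=liveStreamingDetails&id=${LIVE_ID}&key=${YT_API_KEY}" \
  | jq -r '.items[0].liveStreamingDetails.activeLiveChatId')

# 2) Poll the chat messages and print "author: message" lines.
curl -s "https://www.googleapis.com/youtube/v3/liveChat/messages?part=snippet,authorDetails&liveChatId=${CHAT_ID}&key=${YT_API_KEY}" \
  | jq -r '.items[]? | "\(.authorDetails.displayName): \(.snippet.displayMessage)"'
```

In a real loop you would also respect pollingIntervalMillis and nextPageToken from the response, which is exactly the kind of plumbing AITuberKit handles for you.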
The settings were quite a bit of work, but I finally managed to get the AI character to retrieve comments and return responses!
A point where I personally got stuck was that comments aren't loaded unless you click a button on the chat screen.

↑ In this state, comments are not loaded.
As shown above, there is an icon like a YouTube play button; clicking this finally enables comment retrieval. I ended up creating about four stream frames before noticing this.

↑ In this state, comments are being loaded.
I was able to get this far in a week thanks to AITuberKit; if I had started from scratch, I think it would have taken more than 10 times as long. Big thanks again to Nike-chan-san!!
Now, I want to continue making AI characters after reconsidering who I want to make them for and what exactly I want to create.
Day 8: A Short Break
I am writing this several days after Day 8, but around this time, I became busy with my main job due to business trips and other reasons.
Therefore, I will post my X (Twitter) post from that day and add any details I can recall.
Using Tailscale, I can access the EVO-X2 at home, but GUI-based development is tough on a clunker laptop released 13 years ago.
The article about motion was very interesting, and I would end up trying to create my own motions a few days later.
Day 9: Mastra + Ollama
Almost no noticeable progress today as I had very little time.
I realized there is a demand for "Mastra + Ollama" and decided to write an article about it.
Also, I noticed a few days later that the AI character of Sakura Kouya-san and the character I created were quite similar (lol).
Since my character design came later, I decided to make another outfit as well.
Day 10: Writing an Article on Local AI Agents
While writing an article on local AI agents using Mastra + Ollama + MCP, I finally attempted to connect it to AITuberKit.
Unfortunately, I found that ollama-ai-provider does not support Streaming mode when using Tools, while AITuberKit only supports Streaming mode...
Day 11: Challenging Ollama Tool Streaming Support
I finished and published the article on local AI agents.
After that, I challenged myself to support Ollama Tool Streaming in order to connect AITuberKit with the local AI agent of Mastra + Ollama.
Checking pull requests and forks, I found a fork that supported Tool Streaming.
However, it didn't work as-is, so I had Claude Code fix it and introduced it into AITuberKit, which allowed me to connect them!
However, since only I can use it in this state, I want to summarize it in an article after making it easily installable by anyone, such as by deploying it to npm.
Day 12: Trying to Change the AI Character's Animation
I checked where the motions for AITuberKit are defined and found that the following file is the animation file:
While researching vrma files, I found a VRM animation set released for free on Booth. Much appreciated!
I immediately downloaded it, placed it in the public folder, and was able to change the motion by editing the filename of /idle_loop.vrma below.
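The swap itself can be sketched as follows. The paths are assumptions based on a default aituber-kit checkout and a file from the downloaded animation set; adjust them to your environment:

```shell
# Swap the idle animation by replacing idle_loop.vrma with a downloaded file.
# PUBLIC_DIR and NEW_MOTION are assumptions; adjust to your environment.
PUBLIC_DIR=aituber-kit/public
NEW_MOTION=~/Downloads/VRMA_01.vrma   # one file from the downloaded animation set

if [ -f "${PUBLIC_DIR}/idle_loop.vrma" ]; then
  cp "${PUBLIC_DIR}/idle_loop.vrma" "${PUBLIC_DIR}/idle_loop.vrma.bak"  # keep a backup
  cp "${NEW_MOTION}" "${PUBLIC_DIR}/idle_loop.vrma"
fi
```

After restarting the dev server, the character should play the new idle motion.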
Furthermore, I learned that you can load and play VRM and vrma files in Blender.
By installing the following add-on to Blender, you can import VRMs.
Day 13: Reflecting Motions Captured by a Webcam onto the AI Character
I'll record screenshots of the procedure for importing a VRM into Blender as a reminder.




Once you've gotten this far, you can load a model simply by dragging and dropping the vrm file. Note that it's somewhat heavy.

Next, I'll show how to load a vrma animation. This was quite confusing.
First, either click the < button at the top right of the screen or press the n key.

Click the VRM tab, and if it's set to VRM 0.0, change it to VRM 1.0 to make Spring Bone clickable. Then, an "Enable Animation" option will appear below; click it to check the box.

Finally, select "Armature" even further to the right and drag and drop the vrma file to load the animation. In this state, clicking the play button ▶ at the bottom will play the animation.

However, perhaps because the GPU of the EVO-X2 (Ubuntu) is not yet supported, the playback was quite slow at just under 10fps.
Anyway, with this, animation playback on Blender is now possible.
Finally, I expected to be able to create original animations from Claude Desktop by introducing Blender MCP, but as expected, it didn't go well, and I gave up.
However, in this article, they seem to be controlling facial expressions with Blender MCP, so if proper instructions are given, motion creation might also be possible. No, this is just controlling, not creating... 🤔
Motion Generation from Webcam
There seem to be various ways to do this, but I was able to create a .vrma file by first generating a BVH motion file with XR Animator and then converting it with a tool called bvh2vrma.
I felt that XR Animator is a tool with some quirks, but I think you'll figure out how to use it if you play around with it.
Since bvh2vrma is simple and very easy to understand, you probably won't get lost in how to use it.
Day 14: Extending AITuberKit
I've started to form an image of the AI character I want to create.
To achieve that, I wanted a feature in AITuberKit to read images using Ollama, but it didn't seem to be supported, so I asked Claude Code to add it for me.
As a bonus, I also added a feature to paste images into the chat box. This was more for improving convenience rather than a feature I wanted for the AI character itself.
I'm putting it off for now, but I plan to publish the forked repository.
Day 15: Closing for Now
It's 3 days until the AITuber Meetup, but I'm busy this week and don't expect much progress.
And now, while writing this article, I have been organizing the trajectory of these two-plus weeks.
I won't write about what kind of AI character I want to make here; instead, I'd like to participate in the AITuber Meetup and talk about it there if possible.
For now, I should have earned the qualification to participate in the AITuber Meetup, but since I'm still on the waitlist, I recognize there's a relatively high possibility I won't be able to attend.
That said, the reason I've been able to challenge myself in so many ways up to this point is largely thanks to NukoNuko-san for organizing the AITuber Meetup and Nike-chan-san for releasing AITuberKit, an incredibly easy-to-use tool.
-> Overwhelming gratitude! Thank you very much!!!
The AI character I want to make is still mostly unfinished, but I'd like to continue developing it whenever I find the time.
Summary
I'll close by returning to the present with an afterword.
Actually, after writing this diary and participating in the AITuber Meetup, I haven't been able to do much AITuber development. The major reason is likely the remarkable performance improvement of video generation AIs like Sora2.
This is because AI characters using 2D or 3D models have limited expressiveness, and video generation wins in terms of high flexibility. However, video generation still has challenges regarding cost and real-time capabilities.
Besides the rise of video generation, there are other reasons why I haven't been able to continue AITuber development in earnest, but since this article has become packed with content, I'll talk about them on another occasion.
I hope that Shizuku AI will trigger a renewed excitement in the AI character community. I also hope this article serves as a starting point for those about to begin AI VTuber development.