🎬

I want to record a Live2D model lip-syncing and playing motions!

Published on 2024/12/25

FFmpeg

remotion

live2d

tech

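Last time, I showed how to run Live2D on the web.

This time I'll cover an even more niche use case: what I did to make a Live2D model lip-sync and play motions, and then record the result.

Concretely, you pass in an object like the one below (a "daihon", Japanese for script) that specifies which audio to read out or which motion to play at which time. The model lip-syncs and moves accordingly, and the whole thing is recorded.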

type DaihonItem = {
  fromSec: number;
  durationSec: number;
  audioUrl?: string;
  motion?: string;
};

type Daihon = DaihonItem[];

const daihon: Daihon = [
  {
    fromSec: 0,
    durationSec: 3,
    audioUrl: "/audio/test1.wav",
  },
  {
    fromSec: 3,
    durationSec: 1.9,
    audioUrl: "/audio/test2.wav",
    motion: "Flick",
  },
];

There is no audio in it, but the final result is a video like this.

https://youtu.be/1spTcGSB1mM

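Why would I want to do this?

Why do this in the first place? Because I produce videos programmatically.
I currently make videos with Remotion, a library that lets you create videos with React. (For more on Remotion, see the explainer article I wrote a while ago.)
With Remotion you build the video while rendering it in the web browser, and animations are specified in the form "at this frame, this property has this value".

For example, an animation that fades an element in by raising its opacity at the start is written as follows.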

export const FadeIn = () => {
  const frame = useCurrentFrame();
 
  const opacity = Math.min(1, frame / 60);
 
  return (
    <AbsoluteFill
      style={{
        justifyContent: "center",
        alignItems: "center",
        backgroundColor: "white",
        fontSize: 80,
      }}
    >
      <div style={{ opacity: opacity }}>Hello World!</div>
    </AbsoluteFill>
  );
};

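A Live2D model runs on its own clock and its movement cannot be controlled frame by frame, so running it directly inside Remotion gives disappointing results. I'm using Remotion as the example here, but I suspect the same applies to other programmatic ways of making videos.

So the plan is to record, in advance and with a transparent background, Live2D footage in which the audio and motions are already synchronized, and then use that recording as an asset inside Remotion.

I'll build it in the following order.

Record the rendered content with MediaRecorder
Run the recording headlessly with Playwright
Play the video in Remotion and check that it looks good

Recording the rendered content with MediaRecorder

First, I set up an environment that runs the Live2D model while recording it.
Here is the whole thing. For the Live2D part, see the previous article.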

Live2DRecorder.tsx

"use client";

import React, { useEffect, useRef, useState } from "react";
import { Application, Ticker, DisplayObject } from "pixi.js";
import { Live2DModel } from "pixi-live2d-display-lipsyncpatch/cubism4";

enum RecordingState {
  Idle,
  Recording,
  Stopped,
}

type DaihonItem = {
  fromSec: number;
  durationSec: number;
  audioUrl?: string;
  motion?: string;
};

type Daihon = DaihonItem[];

const daihon: Daihon = [
  {
    fromSec: 0,
    durationSec: 3,
    audioUrl: "/audio/test1.wav",
  },
  {
    fromSec: 3,
    durationSec: 1.9,
    audioUrl: "/audio/test2.wav",
    motion: "Flick",
  },
];

const SCREEN_WIDTH = 1920;
const SCREEN_HEIGHT = 1080;
const MODEL_WIDTH_PX = 880;

const setModelPosition = (
  app: Application,
  model: InstanceType<typeof Live2DModel>
) => {
  const scale = MODEL_WIDTH_PX / model.width;
  model.scale.set(scale);
  model.x = app.renderer.width - model.width * scale + 100;
  model.y = app.renderer.height - model.height * scale;
};

export default function Live2DRecorder() {
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const [app, setApp] = useState<Application | null>(null);
  const [model, setModel] = useState<InstanceType<typeof Live2DModel> | null>(
    null
  );

  const [recorder, setRecorder] = useState<MediaRecorder | null>(null);
  const [recordingState, setRecordingState] = useState<RecordingState>(
    RecordingState.Idle
  );
  const chunksRef = useRef<Blob[]>([]);

  const initLive2DModel = async (currentApp: Application) => {
    const loadedModel = await Live2DModel.from(
      "/live2d/hiyori/runtime/hiyori_pro_t11.model3.json",
      {
        ticker: Ticker.shared,
      }
    );
    currentApp.stage.addChild(loadedModel as unknown as DisplayObject);
    loadedModel.anchor.set(0.5, 0.5);

    // This stops the model from following the cursor with its eyes
    loadedModel.eventMode = "passive";

    setModelPosition(currentApp, loadedModel);
    setModel(loadedModel);
  };

  useEffect(() => {
    if (!canvasRef.current) return;

    const pixiApp = new Application({
      width: SCREEN_WIDTH,
      height: SCREEN_HEIGHT,
      view: canvasRef.current,
      backgroundAlpha: 0,
    });
    setApp(pixiApp);
    initLive2DModel(pixiApp);
  }, []);

  const playDaihonItem = async (
    item: DaihonItem,
    model: InstanceType<typeof Live2DModel>
  ) => {
    if (item.motion) {
      model.motion(item.motion);
    }
    if (item.audioUrl) {
      model.speak(item.audioUrl);
    }

    return new Promise((resolve) =>
      setTimeout(resolve, item.durationSec * 1000)
    );
  };

  const playAllDaihon = async (
    model: InstanceType<typeof Live2DModel>,
    mediaRecorder: MediaRecorder
  ) => {
    mediaRecorder.start();

    daihon.forEach((item, index) => {
      setTimeout(async () => {
        await playDaihonItem(item, model);
        if (index === daihon.length - 1) {
          mediaRecorder.stop();
          setRecordingState(RecordingState.Stopped);
        }
      }, item.fromSec * 1000);
    });
  };

  const startRecording = () => {
    if (!canvasRef.current) return;
    if (recordingState === RecordingState.Recording) return;

    const canvasStream = canvasRef.current.captureStream(60); // 60fps

    const options = {
      mimeType: "video/webm; codecs=vp9",
    } as MediaRecorderOptions;

    try {
      const mediaRecorder = new MediaRecorder(canvasStream, options);
      setRecorder(mediaRecorder);
      chunksRef.current = [];

      mediaRecorder.ondataavailable = (e) => {
        if (e.data && e.data.size > 0) {
          chunksRef.current.push(e.data);
        }
      };

      setRecordingState(RecordingState.Recording);

      // For some reason the motion does not play without a gap, so wait 1 second
      setTimeout(() => {
        playAllDaihon(model as InstanceType<typeof Live2DModel>, mediaRecorder);
      }, 1000);

      mediaRecorder.onstop = () => {
        // Combine the recorded data into a single Blob and download it
        const blob = new Blob(chunksRef.current, {
          type: mediaRecorder.mimeType,
        });
        const url = URL.createObjectURL(blob);

        const a = document.createElement("a");
        a.style.display = "none";
        a.href = url;
        a.download = `live2d_capture_${Date.now()}.webm`;

        document.body.appendChild(a);
        a.click();
        setTimeout(() => {
          document.body.removeChild(a);
        }, 100);
      };
    } catch (e) {
      console.error("MediaRecorder start error:", e);
    }
  };

  return (
    <div>
      <header className="flex justify-end items-center h-[64px] bg-black px-4">
        <div>
          {recordingState !== RecordingState.Recording && (
            <button
              id="startButton"
              onClick={startRecording}
              className="bg-blue-500 hover:bg-blue-700 text-white font-bold py-2 px-4 rounded"
            >
              Start recording
            </button>
          )}
          {recordingState === RecordingState.Recording && (
            <button
              id="inProgressButton"
              className="bg-red-500 hover:bg-red-700 text-white font-bold py-2 px-4 rounded disabled:opacity-50 disabled:cursor-not-allowed"
              disabled
            >
              Recording
            </button>
          )}
        </div>
      </header>
      <div
        style={{
          width: SCREEN_WIDTH,
          height: SCREEN_HEIGHT,
        }}
      >
        <canvas ref={canvasRef} />
      </div>
    </div>
  );
}

I use the MediaRecorder Web API for recording.

First, define the variables needed for recording.

enum RecordingState {
  Idle,
  Recording,
  Stopped,
}

export default function Live2DRecorder() {
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const [recorder, setRecorder] = useState<MediaRecorder | null>(null);
  const [recordingState, setRecordingState] = useState<RecordingState>(
    RecordingState.Idle
  );
  const chunksRef = useRef<Blob[]>([]);
...

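Next is the function that starts recording.
It captures the contents of the canvas the Live2D model is drawn on, as shown below. The recorded data accumulates in chunksRef, which is later combined into a single Blob and downloaded.

I also produce the video as WebM, because I want it to carry transparency (an alpha channel). The options for that are roughly the H.265 (HEVC) codec in mp4 or the VP9 codec in WebM, and I went with WebM, which also compresses well. If you cannot have an alpha channel, you will probably end up doing something like chroma keying.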

https://www.webmproject.org/about/
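Not every browser is guaranteed to accept the VP9 WebM MIME type, so it may be worth checking before constructing the recorder. The following is a minimal sketch, not part of the original component: it relies on the standard `MediaRecorder.isTypeSupported` static method, while the function name `pickRecordingMimeType` and the fallback candidates are assumptions made for illustration. Whether a fallback codec keeps the alpha channel is not covered here.

```typescript
// Candidate MIME types in order of preference.
// The fallbacks after VP9 are assumptions for this sketch.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "video/webm; codecs=vp9",
  "video/webm; codecs=vp8",
  "video/webm",
];

type SupportCheck = (mimeType: string) => boolean;

// Default check: delegates to MediaRecorder.isTypeSupported when it exists,
// and reports "unsupported" outside a browser (e.g. in Node).
const browserSupportCheck: SupportCheck = (mimeType) => {
  const recorderCtor = (globalThis as any).MediaRecorder;
  return typeof recorderCtor?.isTypeSupported === "function"
    ? Boolean(recorderCtor.isTypeSupported(mimeType))
    : false;
};

// Returns the first supported candidate, or undefined if none is supported.
function pickRecordingMimeType(
  isSupported: SupportCheck = browserSupportCheck,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES
): string | undefined {
  return candidates.find((mimeType) => isSupported(mimeType));
}
```

In startRecording, the result could be passed as `mimeType` in the MediaRecorder options, with an early return and an error message when it is undefined.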

  const startRecording = () => {
    if (!canvasRef.current) return;
    if (recordingState === RecordingState.Recording) return;

    const canvasStream = canvasRef.current.captureStream(60); // 60fps

    const options = {
      mimeType: "video/webm; codecs=vp9",
    } as MediaRecorderOptions;

    try {
      const mediaRecorder = new MediaRecorder(canvasStream, options);
      setRecorder(mediaRecorder);
      chunksRef.current = [];

      mediaRecorder.ondataavailable = (e) => {
        if (e.data && e.data.size > 0) {
          chunksRef.current.push(e.data);
        }
      };

      setRecordingState(RecordingState.Recording);

      // For some reason the motion does not play without a gap, so wait 1 second
      setTimeout(() => {
        playAllDaihon(model as InstanceType<typeof Live2DModel>, mediaRecorder);
      }, 1000);

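Next is the playAllDaihon function, which is called to start the recording.
It simply calls setTimeout with fromSec and then plays the specified motion, or plays the audio file and lip-syncs to it.
durationSec is needed because when I tried to stop the recording automatically at the end by waiting on the Promises returned from .motion and .speak, they resolved almost immediately and the video was cut off, so I decided it was better to specify the length in seconds explicitly. I fill in this value when building the daihon, for example by writing a program that reads the length of each wav file.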
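As a rough illustration of getting the length of a wav file for durationSec, here is a minimal sketch that reads the duration from the RIFF/WAVE header. It is not the program used for this article; the function name `wavDurationSec` is made up for this example, and it assumes an uncompressed PCM file whose fmt chunk comes before the data chunk.

```typescript
// Reads a 4-character ASCII chunk ID at the given byte offset.
function readChunkId(view: DataView, offset: number): string {
  return String.fromCharCode(
    view.getUint8(offset),
    view.getUint8(offset + 1),
    view.getUint8(offset + 2),
    view.getUint8(offset + 3)
  );
}

// Returns the playback length in seconds: data chunk size / byte rate.
function wavDurationSec(bytes: Uint8Array): number {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (
    bytes.byteLength < 12 ||
    readChunkId(view, 0) !== "RIFF" ||
    readChunkId(view, 8) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }

  let byteRate = 0;
  let offset = 12; // first chunk starts right after "RIFF<size>WAVE"
  while (offset + 8 <= view.byteLength) {
    const id = readChunkId(view, offset);
    const size = view.getUint32(offset + 4, true); // little-endian
    if (id === "fmt ") {
      // Byte rate sits 8 bytes into the fmt chunk body.
      byteRate = view.getUint32(offset + 8 + 8, true);
    } else if (id === "data") {
      if (byteRate === 0) throw new Error("fmt chunk missing before data chunk");
      return size / byteRate;
    }
    offset += 8 + size + (size % 2); // chunk bodies are padded to even length
  }
  throw new Error("data chunk not found");
}
```

In Node, the result of `fs.readFileSync` is a Buffer, which is a Uint8Array, so it can be passed in directly.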

type DaihonItem = {
  fromSec: number;
  durationSec: number;
  audioUrl?: string;
  motion?: string;
};

type Daihon = DaihonItem[];

const daihon: Daihon = [
  {
    fromSec: 0,
    durationSec: 3,
    audioUrl: "/audio/test1.wav",
  },
  {
    fromSec: 3,
    durationSec: 1.9,
    audioUrl: "/audio/test2.wav",
    motion: "Flick",
  },
];

export default function Live2DRecorder() {
  ...
  const playDaihonItem = async (
    item: DaihonItem,
    model: InstanceType<typeof Live2DModel>
  ) => {
    if (item.motion) {
      model.motion(item.motion);
    }
    if (item.audioUrl) {
      model.speak(item.audioUrl);
    }

    return new Promise((resolve) =>
      setTimeout(resolve, item.durationSec * 1000)
    );
  };

  const playAllDaihon = async (
    model: InstanceType<typeof Live2DModel>,
    mediaRecorder: MediaRecorder
  ) => {
    mediaRecorder.start();

    daihon.forEach((item, index) => {
      setTimeout(async () => {
        await playDaihonItem(item, model);
        if (index === daihon.length - 1) {
          mediaRecorder.stop();
          setRecordingState(RecordingState.Stopped);
        }
      }, item.fromSec * 1000);
    });
  };
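Note that playAllDaihon stops the recorder when the last element of the array finishes, which works as long as that element is also the one that ends latest. As a hedged alternative sketch (the helper name `daihonEndSec` is made up here and is not part of the original code), the stop time can be derived from every item instead:

```typescript
type DaihonItem = {
  fromSec: number;
  durationSec: number;
  audioUrl?: string;
  motion?: string;
};

// Latest end time across all items, in seconds.
// Returns 0 for an empty daihon.
function daihonEndSec(daihon: readonly DaihonItem[]): number {
  return daihon.reduce(
    (endSec, item) => Math.max(endSec, item.fromSec + item.durationSec),
    0
  );
}

// Example: the first item starts earlier but ends later than the second.
const overlapping: DaihonItem[] = [
  { fromSec: 0, durationSec: 6, motion: "Flick" },
  { fromSec: 3, durationSec: 1.5, audioUrl: "/audio/test2.wav" },
];
const stopAfterMs = daihonEndSec(overlapping) * 1000;
```

With this, a single `setTimeout(() => mediaRecorder.stop(), stopAfterMs)` could replace the index check inside the loop, so the order of items in the array no longer matters.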

Then comes the download step after recording finishes.
In MediaRecorder's onstop handler, the data accumulated in chunksRef is combined into a single Blob and downloaded.

      mediaRecorder.onstop = () => {
        // Combine the recorded data into a single Blob and download it
        const blob = new Blob(chunksRef.current, {
          type: mediaRecorder.mimeType,
        });
        const url = URL.createObjectURL(blob);

        const a = document.createElement("a");
        a.style.display = "none";
        a.href = url;
        a.download = `live2d_capture_${Date.now()}.webm`;

        document.body.appendChild(a);
        a.click();
        setTimeout(() => {
          document.body.removeChild(a);
        }, 100);
      };

All that's left is to add a button that starts the recording, and it's done!

  return (
    <div>
      <header className="flex justify-end items-center h-[64px] bg-black px-4">
        <div>
          {recordingState !== RecordingState.Recording && (
            <button
              id="startButton"
              onClick={startRecording}
              className="bg-blue-500 hover:bg-blue-700 text-white font-bold py-2 px-4 rounded"
            >
              Start recording
            </button>
          )}
          {recordingState === RecordingState.Recording && (
            <button
              id="inProgressButton"
              className="bg-red-500 hover:bg-red-700 text-white font-bold py-2 px-4 rounded disabled:opacity-50 disabled:cursor-not-allowed"
              disabled
            >
              Recording
            </button>
          )}
        </div>
      </header>
      <div
        style={{
          width: SCREEN_WIDTH,
          height: SCREEN_HEIGHT,
        }}
      >
        <canvas ref={canvasRef} />
      </div>
    </div>
  );

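The recording studio is ready.

Running the recording headlessly with Playwright

Next, opening the page every time I want to record is tedious, so I make the recording run headlessly.
It doesn't do anything complicated, so here is all the code at once. It uses Playwright.
The ffmpeg part is probably a bit puzzling, so I explain it afterwards.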

const { chromium } = require("playwright");
const path = require("path");
const fs = require("fs");

const { spawn } = require("child_process");

async function processVideo(inputPath: string, outputPath: string) {
  return new Promise((resolve, reject) => {
    const ffmpegArgs = [
      "-c:v",
      "libvpx-vp9",
      "-i",
      inputPath,
      "-c:v",
      "libvpx-vp9",
      "-auto-alt-ref",
      "0",
      "-pix_fmt",
      "yuva420p",
      "-c:a",
      "libvorbis",
      outputPath,
    ];

    const ffmpeg = spawn("ffmpeg", ffmpegArgs);

    ffmpeg.stdout.on("data", (data: Buffer) => {
      console.log(`FFmpeg stdout: ${data}`);
    });

    ffmpeg.stderr.on("data", (data: Buffer) => {
      console.error(`FFmpeg stderr: ${data}`);
    });

    ffmpeg.on("close", (code: number) => {
      if (code === 0) {
        resolve(outputPath);
      } else {
        reject(new Error(`FFmpeg process exited with code ${code}`));
      }
    });

    ffmpeg.on("error", (err: Error) => {
      reject(new Error(`Failed to start FFmpeg process: ${err.message}`));
    });
  });
}

async function main() {
  const browser = await chromium.launch({
    headless: true,
  });
  const context = await browser.newContext({
    acceptDownloads: true,
  });
  const page = await context.newPage();

  const downloadPath = path.resolve(__dirname, "downloads");
  if (!fs.existsSync(downloadPath)) {
    fs.mkdirSync(downloadPath);
  }

  // Open the Live2DRecorder page
  const targetUrl = "http://localhost:8080/playground";
  await page.goto(targetUrl);

  await page.waitForSelector("#startButton");

  // I don't know how to tell from outside whether Live2D has finished rendering, so wait 3 seconds for now
  await page.waitForTimeout(3000);

  console.log("=== Start Recording ===");

  // Start recording
  await page.click("#startButton");
  await page.waitForSelector("#inProgressButton");

  const downloadPromise = page.waitForEvent("download");
  const download = await downloadPromise;
  const filename = await download.suggestedFilename();
  await download.saveAs(path.join(downloadPath, filename));
  console.log(filename);

  console.log("=== Download Complete ===");
  await browser.close();
  console.log("=== Recording Complete ===");

  await processVideo(
    path.join(downloadPath, filename),
    path.join(downloadPath, "output.webm")
  );
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
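The fixed 3-second wait in the script could, in principle, be replaced by waiting for a signal from the page. The sketch below assumes the recorder page sets a hypothetical `window.__live2dReady = true` flag once the model has loaded (for example right after setModel); that flag does not exist in the original component. It uses Playwright's `page.waitForFunction`, typed here through a small structural interface so the sketch compiles without Playwright installed. A real Page object is expected to fit that interface, but I have not verified this against every Playwright version.

```typescript
// Minimal structural type covering the one Page method used here.
interface PageLike {
  waitForFunction(
    pageFunction: () => unknown,
    arg?: unknown,
    options?: { timeout?: number }
  ): Promise<unknown>;
}

// Resolves once the (hypothetical) readiness flag is true in the page,
// or rejects when the timeout elapses.
function waitForModelReady(
  page: PageLike,
  timeoutMs: number = 10_000
): Promise<unknown> {
  return page.waitForFunction(
    // Runs inside the browser, so it must not capture outer variables.
    () => (globalThis as any).__live2dReady === true,
    undefined,
    { timeout: timeoutMs }
  );
}
```

In main, `await waitForModelReady(page)` would then take the place of `await page.waitForTimeout(3000)`.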

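Why the video is re-encoded with ffmpeg

The captured WebM video is re-encoded with ffmpeg, for two reasons:

The video's duration becomes Infinity
A "PIPELINE_ERROR_DECODE: video decode error!" error appears when the generated video is used to render another video

The duration metadata of a video recorded with MediaRecorder ends up as Infinity. The file plays back for the time being, but an error occurs when you try to encode another video using it.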

Could not get video metadata Error: Unable to determine video duration for /test.webm - got Infinity. Re-encoding this video may fix this issue.

If you only need to add the metadata, simply copying the streams as follows is enough.

ffmpeg -i test.webm -c copy test-fixed.webm

With that alone, however, I got the following error. To be honest, I don't fully understand why it occurs, but I'll paste o1's explanation below.

PIPELINE_ERROR_DECODE: video decode error!

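"At the 'MediaRecorder → WebM' stage, the metadata and index are sometimes not fully written. Streaming playback in the browser may work, yet problems can occur when the file is read frame by frame."

So I re-encode with the following ffmpeg command.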

ffmpeg -c:v libvpx-vp9 -i test.webm \
  -c:v libvpx-vp9 \
  -auto-alt-ref 0 \
  -pix_fmt yuva420p \
  -c:a libvorbis \
  test-fixed.webm

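The key point is that -c:v libvpx-vp9 is specified before the input.
Without it, the input was recognized as yuv420p and the alpha was lost (that is, the background was not transparent).
With it, the video is re-encoded with the alpha channel intact.
I was saved by this scrap; many thanks.

Checking the result

The generated video could be composited so that the background image shows through properly. Hooray!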
https://youtu.be/dbBqtMVs8PI
