
[LLM] Just Getting Llama 3 Running on Google Colab


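Hello.
I'm a student in the middle of my graduation research.

Generative AI is a hot topic right now, and out of simple curiosity ("wouldn't it be fun to work generative AI into my graduation research?") I recently started playing with Llama, which the American company Meta makes available free of charge as an open model.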
https://www.llama.com/

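So in this post I'll walk through everything up to the point of getting Llama running on Google Colab.

0. Background

0.1 What is Llama?

Llama (pronounced "lah-ma") is a large language model (LLM) developed by Meta, the company behind Facebook, Instagram, and other services.

Put simply, it is the kind of model that serves as the brain inside AI products such as ChatGPT and Gemini.

Llama's appeal is that (depending on the model) it is free and can be used commercially.

That is why I'd very much like to use Llama in my graduation research.

Of the various Llama models, I'll use "Meta-Llama-3.1-8B-Instruct". It has the fewest parameters in the Llama 3.1 family, so it needs less compute while still giving fast, accurate responses for its size. Instruct models are additionally trained to respond naturally and helpfully to user instructions and questions.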
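To make "fewer parameters means less compute" a little more concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at different precisions. The parameter count of 8 billion is a round figure assumed from the "8B" in the model name, the function name is only for illustration, and memory for activations and caches is not included.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Round figure assumed from the "8B" in the model name, not an exact count.
PARAMS_8B = 8e9

for label, nbytes in [("16-bit", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    print(f"{label}: about {weight_memory_gb(PARAMS_8B, nbytes):.0f} GB")
```

The 16-bit estimate is in the same range as the roughly 16.1G `consolidated.00.pth` file that shows up in the download log in section 2.4.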

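0.2 What is Google Colab?

Google Colab (pronounced "Google co-lab") is a Python execution environment provided by Google that is free for basic use.

Its appeal is that there is no environment setup and that you get access to powerful compute at no cost for basic use.

On top of that, saving and sharing are easier than on a local machine, so I run Python on Colab more often than locally.

In this post, I use Google Colab to run Python.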

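0.3 What is Hugging Face?

Hugging Face is an open platform for sharing and discovering machine learning models and datasets.

In short, it is a site where people share AI models and training data so that anyone can use them easily.

In this post, I use Hugging Face to fetch the Llama model.

1. Preparing Hugging Face

1.1 Signing up for Hugging Face

First, create a new Hugging Face account from the following link.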
https://huggingface.co/

Click "Sign Up".

Then follow the steps shown to complete the registration.

1.2 Requesting access to Llama

Open the Llama page from this URL.
https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

Click "Expand to review and access" below "LLAMA 3.1 COMMUNITY LICENSE AGREEMENT".

At the very bottom of the expanded section, fill in every field of the form below "By agreeing you accept to share your contact information (email and username) with the repository authors."

The fields are as follows. I filled in everything in English.

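Field | What to enter | Notes
First Name | Given name | For Taro Yamada, "Taro"
Last Name | Family name (surname) | For Taro Yamada, "Yamada"
Date of birth | Your date of birth |
Country | |
Affiliation | Your affiliation | I entered my school name
Job title | Your position | I'm a student, so I selected "Student"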

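Once you've filled everything in, tick the checkbox and click "Submit" below it.

If you see the message "Your request to access this repository has been submitted and is awaiting a review from the repository authors. You can check the status of all your access requests in your settings.", the request went through!

After that, an email should arrive within a few minutes, and the request is complete! (You can continue with step 1.3 onward while you wait.)

1.3 Hugging Face access token

Next, continue with the setup on the Hugging Face side.

Go to the Hugging Face "Access Tokens" page from the following link.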
https://huggingface.co/settings/tokens

Click "+ Create new token" at the top right. (My account already had one token; you can ignore that.)

Enter any name you like under "Token Name", then under User permissions, in the Repositories group, tick "Read access to contents of all public gated repos you can access".

Leave everything else as is and click "Create token" at the bottom.

The access token is then displayed. Copy it and keep it somewhere safe.
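As one way of keeping the token out of the notebook itself, the sketch below reads it from an environment variable and only ever prints a masked version. The variable name `HF_TOKEN` and the helper `mask_token` are assumptions made for this example, not something the steps above require.

```python
import os


def mask_token(token: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the token can be printed safely."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


# HF_TOKEN is the environment variable name assumed for this sketch.
token = os.environ.get("HF_TOKEN", "")
print("token loaded:", mask_token(token) if token else "(not set)")
```

Printing only the masked form means a shared or screenshotted notebook does not leak the full token.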

2. Preparing Google Colab

Now we finally get to Google Colab.

Create a new Google Colab notebook, for example from the Google Drive page.

2.1 Changing the runtime type

We want to run Llama on a GPU, so change the Colab runtime type.

From the "Runtime" menu at the top, select "Change runtime type".

"CPU" is probably selected by default; change it to "T4 GPU".
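To confirm that the runtime change took effect, a small check like the following can be run in a cell. It is a sketch that assumes the `nvidia-smi` command is available on GPU runtimes; the helper name `gpu_name` is made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def gpu_name() -> Optional[str]:
    """Return the GPU name reported by nvidia-smi, or None if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    name = result.stdout.strip()
    return name if result.returncode == 0 and name else None


print(gpu_name() or "No GPU detected; check the runtime type")
```

If this prints the fallback message, the notebook is most likely still on a CPU runtime.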

2.2 Installing packages

Install the packages needed to run the model.

# Install packages
!pip install -U transformers accelerate bitsandbytes
!pip install torch
Output

This is just one example.

Requirement already satisfied: transformers in /usr/local/lib/python3.11/dist-packages (4.51.3)
Requirement already satisfied: accelerate in /usr/local/lib/python3.11/dist-packages (1.6.0)
Collecting accelerate
  Downloading accelerate-1.7.0-py3-none-any.whl.metadata (19 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.45.5-py3-none-manylinux_2_24_x86_64.whl.metadata (5.0 kB)
Requirement already satisfied: filelock in /usr/local/lib/python3.11/dist-packages (from transformers) (3.18.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.30.0 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.31.2)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.11/dist-packages (from transformers) (2.0.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.11/dist-packages (from transformers) (24.2)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.11/dist-packages (from transformers) (6.0.2)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.11/dist-packages (from transformers) (2024.11.6)
Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from transformers) (2.32.3)
Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.21.1)
Requirement already satisfied: safetensors>=0.4.3 in /usr/local/lib/python3.11/dist-packages (from transformers) (0.5.3)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.11/dist-packages (from transformers) (4.67.1)
Requirement already satisfied: psutil in /usr/local/lib/python3.11/dist-packages (from accelerate) (5.9.5)
Requirement already satisfied: torch>=2.0.0 in /usr/local/lib/python3.11/dist-packages (from accelerate) (2.6.0+cu124)
Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.11/dist-packages (from huggingface-hub<1.0,>=0.30.0->transformers) (2025.3.2)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.11/dist-packages (from huggingface-hub<1.0,>=0.30.0->transformers) (4.13.2)
Requirement already satisfied: networkx in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (3.4.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (3.1.6)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch>=2.0.0->accelerate)
  Downloading nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cusolver-cu12==11.6.1.9 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cusparse-cu12==12.3.1.170 (from torch>=2.0.0->accelerate)
  Downloading nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Requirement already satisfied: nvidia-cusparselt-cu12==0.6.2 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (0.6.2)
Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (2.21.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (12.4.127)
Collecting nvidia-nvjitlink-cu12==12.4.127 (from torch>=2.0.0->accelerate)
  Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Requirement already satisfied: triton==3.2.0 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (3.2.0)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.11/dist-packages (from torch>=2.0.0->accelerate) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from sympy==1.13.1->torch>=2.0.0->accelerate) (1.3.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (3.4.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (2.4.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests->transformers) (2025.4.26)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2->torch>=2.0.0->accelerate) (3.0.2)
Downloading accelerate-1.7.0-py3-none-any.whl (362 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 362.1/362.1 kB 12.1 MB/s eta 0:00:00
Downloading bitsandbytes-0.45.5-py3-none-manylinux_2_24_x86_64.whl (76.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 76.1/76.1 MB 10.8 MB/s eta 0:00:00
Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl (363.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 363.4/363.4 MB 2.7 MB/s eta 0:00:00
Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (13.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.8/13.8 MB 119.6 MB/s eta 0:00:00
Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (24.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.6/24.6 MB 87.9 MB/s eta 0:00:00
Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (883 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 883.7/883.7 kB 59.2 MB/s eta 0:00:00
Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 2.7 MB/s eta 0:00:00
Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl (211.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.5/211.5 MB 5.3 MB/s eta 0:00:00
Downloading nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl (56.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 16.9 MB/s eta 0:00:00
Downloading nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl (127.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.9/127.9 MB 7.5 MB/s eta 0:00:00
Downloading nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl (207.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.5/207.5 MB 5.4 MB/s eta 0:00:00
Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 43.6 MB/s eta 0:00:00
Installing collected packages: nvidia-nvjitlink-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, nvidia-cusolver-cu12, bitsandbytes, accelerate
  Attempting uninstall: nvidia-nvjitlink-cu12
    Found existing installation: nvidia-nvjitlink-cu12 12.5.82
    Uninstalling nvidia-nvjitlink-cu12-12.5.82:
      Successfully uninstalled nvidia-nvjitlink-cu12-12.5.82
  Attempting uninstall: nvidia-curand-cu12
    Found existing installation: nvidia-curand-cu12 10.3.6.82
    Uninstalling nvidia-curand-cu12-10.3.6.82:
      Successfully uninstalled nvidia-curand-cu12-10.3.6.82
  Attempting uninstall: nvidia-cufft-cu12
    Found existing installation: nvidia-cufft-cu12 11.2.3.61
    Uninstalling nvidia-cufft-cu12-11.2.3.61:
      Successfully uninstalled nvidia-cufft-cu12-11.2.3.61
  Attempting uninstall: nvidia-cuda-runtime-cu12
    Found existing installation: nvidia-cuda-runtime-cu12 12.5.82
    Uninstalling nvidia-cuda-runtime-cu12-12.5.82:
      Successfully uninstalled nvidia-cuda-runtime-cu12-12.5.82
  Attempting uninstall: nvidia-cuda-nvrtc-cu12
    Found existing installation: nvidia-cuda-nvrtc-cu12 12.5.82
    Uninstalling nvidia-cuda-nvrtc-cu12-12.5.82:
      Successfully uninstalled nvidia-cuda-nvrtc-cu12-12.5.82
  Attempting uninstall: nvidia-cuda-cupti-cu12
    Found existing installation: nvidia-cuda-cupti-cu12 12.5.82
    Uninstalling nvidia-cuda-cupti-cu12-12.5.82:
      Successfully uninstalled nvidia-cuda-cupti-cu12-12.5.82
  Attempting uninstall: nvidia-cublas-cu12
    Found existing installation: nvidia-cublas-cu12 12.5.3.2
    Uninstalling nvidia-cublas-cu12-12.5.3.2:
      Successfully uninstalled nvidia-cublas-cu12-12.5.3.2
  Attempting uninstall: nvidia-cusparse-cu12
    Found existing installation: nvidia-cusparse-cu12 12.5.1.3
    Uninstalling nvidia-cusparse-cu12-12.5.1.3:
      Successfully uninstalled nvidia-cusparse-cu12-12.5.1.3
  Attempting uninstall: nvidia-cudnn-cu12
    Found existing installation: nvidia-cudnn-cu12 9.3.0.75
    Uninstalling nvidia-cudnn-cu12-9.3.0.75:
      Successfully uninstalled nvidia-cudnn-cu12-9.3.0.75
  Attempting uninstall: nvidia-cusolver-cu12
    Found existing installation: nvidia-cusolver-cu12 11.6.3.83
    Uninstalling nvidia-cusolver-cu12-11.6.3.83:
      Successfully uninstalled nvidia-cusolver-cu12-11.6.3.83
  Attempting uninstall: accelerate
    Found existing installation: accelerate 1.6.0
    Uninstalling accelerate-1.6.0:
      Successfully uninstalled accelerate-1.6.0
Successfully installed accelerate-1.7.0 bitsandbytes-0.45.5 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nvjitlink-cu12-12.4.127
Requirement already satisfied: torch in /usr/local/lib/python3.11/dist-packages (2.6.0+cu124)
Requirement already satisfied: filelock in /usr/local/lib/python3.11/dist-packages (from torch) (3.18.0)
Requirement already satisfied: typing-extensions>=4.10.0 in /usr/local/lib/python3.11/dist-packages (from torch) (4.13.2)
Requirement already satisfied: networkx in /usr/local/lib/python3.11/dist-packages (from torch) (3.4.2)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.11/dist-packages (from torch) (3.1.6)
Requirement already satisfied: fsspec in /usr/local/lib/python3.11/dist-packages (from torch) (2025.3.2)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.127)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.127)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.127)
Requirement already satisfied: nvidia-cudnn-cu12==9.1.0.70 in /usr/local/lib/python3.11/dist-packages (from torch) (9.1.0.70)
Requirement already satisfied: nvidia-cublas-cu12==12.4.5.8 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.5.8)
Requirement already satisfied: nvidia-cufft-cu12==11.2.1.3 in /usr/local/lib/python3.11/dist-packages (from torch) (11.2.1.3)
Requirement already satisfied: nvidia-curand-cu12==10.3.5.147 in /usr/local/lib/python3.11/dist-packages (from torch) (10.3.5.147)
Requirement already satisfied: nvidia-cusolver-cu12==11.6.1.9 in /usr/local/lib/python3.11/dist-packages (from torch) (11.6.1.9)
Requirement already satisfied: nvidia-cusparse-cu12==12.3.1.170 in /usr/local/lib/python3.11/dist-packages (from torch) (12.3.1.170)
Requirement already satisfied: nvidia-cusparselt-cu12==0.6.2 in /usr/local/lib/python3.11/dist-packages (from torch) (0.6.2)
Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in /usr/local/lib/python3.11/dist-packages (from torch) (2.21.5)
Requirement already satisfied: nvidia-nvtx-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.127)
Requirement already satisfied: nvidia-nvjitlink-cu12==12.4.127 in /usr/local/lib/python3.11/dist-packages (from torch) (12.4.127)
Requirement already satisfied: triton==3.2.0 in /usr/local/lib/python3.11/dist-packages (from torch) (3.2.0)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.11/dist-packages (from torch) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from sympy==1.13.1->torch) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2->torch) (3.0.2)

This will likely take quite a while, but if a "Successfully installed ..." line appears in the output, it worked!
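Instead of scanning the long pip log, the installed versions can also be checked directly. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(
    ["transformers", "accelerate", "bitsandbytes", "torch"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

Any package reported as "not installed" means the corresponding pip command needs to be run again.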

2.3 Logging in to Hugging Face

Log in to Hugging Face from Google Colab.

Run the following code.

# Log in to Hugging Face
!huggingface-cli login

A "HUGGING FACE" banner appears along with the prompt "Enter your token". Enter the Hugging Face access token you copied in 1.3 and press Enter. (Ignore the "iCloud Passwords" popup visible in my screenshot.)

You are then asked "Add token as git credential? (Y/n)". Type "Y" and press Enter.

If you see "Your token has been saved to /root/.cache/huggingface/token" followed by "Login successful.", it worked! (The name of the access token you just entered should also appear here and there in the output.)
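Since the login message reports where the token was saved, one simple follow-up check is to see whether that file exists and is non-empty. The sketch below assumes the default location under the home directory (which is `/root` on Colab, matching the message above); the helper name `has_saved_token` is hypothetical.

```python
from pathlib import Path


def has_saved_token(token_path: Path) -> bool:
    """True if the token file exists and contains something other than whitespace."""
    return token_path.is_file() and token_path.read_text().strip() != ""


# Default location assumed from the login message; it may differ in other setups.
default_path = Path.home() / ".cache" / "huggingface" / "token"
print("token saved:", has_saved_token(default_path))
```

This only confirms that a token was written to disk, not that the token is valid or that access to the gated model has been granted.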

2.4 Downloading the model

Next, download the Llama model from Hugging Face.

Run the following code.

# Download the model
!huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include "original/*" --local-dir Meta-Llama-3.1-8B-Instruct

The download then starts.
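The same download can also be sketched from Python with `snapshot_download` from the `huggingface_hub` library, in case you prefer a function call over the CLI. The helper names below are made up for this example, and the actual download call is left commented out because it needs a completed login and granted access.

```python
def download_kwargs(repo_id: str, subdir: str = "original") -> dict:
    """Build arguments mirroring `--include "original/*" --local-dir <model name>`."""
    return {
        "repo_id": repo_id,
        "allow_patterns": [f"{subdir}/*"],
        "local_dir": repo_id.split("/")[-1],
    }


def download_original_weights(repo_id: str) -> str:
    """Download the matching files and return the local folder path."""
    # Imported lazily so download_kwargs works even without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(repo_id))


print(download_kwargs("meta-llama/Meta-Llama-3.1-8B-Instruct"))
# To actually download (requires login and approved access):
# download_original_weights("meta-llama/Meta-Llama-3.1-8B-Instruct")
```

Keeping the argument construction separate from the network call makes it easy to check what would be fetched before starting a download of this size.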

Looking at the progress output, you can see the completion percentage and the amount downloaded so far.

Progress (not code to run)
consolidated.00.pth:   2% 346M/16.1G [00:04<04:13, 62.0MB/s]
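To show which parts of such a progress line carry the percentage and the downloaded size, here is a small parsing sketch. The regular expression is written against the sample line above and is an assumption about the format, not an official specification of the CLI output.

```python
import re

# Pattern assumed from the sample line: "<file>: <percent>% <done>/<total> [...]"
PROGRESS_RE = re.compile(
    r"^(?P<file>\S+):\s+(?P<percent>\d+)%\s+"
    r"(?P<done>[\d.]+[kMGT]?)/(?P<total>[\d.]+[kMGT]?)"
)


def parse_progress(line: str):
    """Return (file, percent, done, total), or None if the line is not a progress line."""
    m = PROGRESS_RE.match(line)
    if m is None:
        return None
    return m.group("file"), int(m.group("percent")), m.group("done"), m.group("total")


sample = "consolidated.00.pth:   2% 346M/16.1G [00:04<04:13, 62.0MB/s]"
print(parse_progress(sample))
# → ('consolidated.00.pth', 2, '346M', '16.1G')
```

Here `2` is the completion percentage and `346M` out of `16.1G` is the amount downloaded so far.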
Output

This is just one example.

Fetching 3 files:   0% 0/3 [00:00<?, ?it/s]Downloading 'original/consolidated.00.pth' to 'Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/_dLw4ih-O1I9AkO57vYC89Z48Os=.ab33d910f405204e5d388bc3521503584800461dc96808e287821dd451c1edac.incomplete'

consolidated.00.pth:   0% 0.00/16.1G [00:00<?, ?B/s]Downloading 'original/tokenizer.model' to 'Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/7iVfz3cUOMr-hyjiqqRDHEwVBAM=.82e9d31979e92ab929cd544440f129d9ecd797b69e327f80f17e1c50d5551b55.incomplete'


tokenizer.model: 100% 2.18M/2.18M [00:00<00:00, 32.4MB/s]
Download complete. Moving file to Meta-Llama-3.1-8B-Instruct/original/tokenizer.model

consolidated.00.pth:   0% 10.5M/16.1G [00:00<03:37, 73.9MB/s]Downloading 'original/params.json' to 'Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/jqHB00sRqBVJXCrFOHz5gDS2Bg8=.f1131204e79d0c09d2bac93f11569a8a655d68ba.incomplete'

consolidated.00.pth:   0% 31.5M/16.1G [00:00<02:15, 118MB/s] 
consolidated.00.pth:   0% 62.9M/16.1G [00:00<01:25, 186MB/s]

params.json: 100% 199/199 [00:00<00:00, 787kB/s]
Download complete. Moving file to Meta-Llama-3.1-8B-Instruct/original/params.json

consolidated.00.pth:   1% 94.4M/16.1G [00:00<01:12, 219MB/s]
consolidated.00.pth:   1% 136M/16.1G [00:00<01:20, 197MB/s] 
consolidated.00.pth:   1% 168M/16.1G [00:00<01:11, 222MB/s]
consolidated.00.pth:   1% 199M/16.1G [00:00<01:06, 240MB/s]
consolidated.00.pth:   1% 231M/16.1G [00:01<01:07, 234MB/s]
consolidated.00.pth:   2% 262M/16.1G [00:01<01:04, 246MB/s]
consolidated.00.pth:   2% 294M/16.1G [00:01<01:27, 179MB/s]
consolidated.00.pth:   2% 325M/16.1G [00:01<01:16, 207MB/s]
consolidated.00.pth:   2% 357M/16.1G [00:01<01:08, 228MB/s]
consolidated.00.pth:   2% 388M/16.1G [00:01<01:11, 221MB/s]
consolidated.00.pth:   3% 419M/16.1G [00:01<01:08, 230MB/s]
consolidated.00.pth:   3% 451M/16.1G [00:02<01:08, 229MB/s]
consolidated.00.pth:   3% 482M/16.1G [00:02<01:05, 238MB/s]
consolidated.00.pth:   3% 514M/16.1G [00:02<01:05, 237MB/s]
consolidated.00.pth:   3% 545M/16.1G [00:03<03:36, 71.5MB/s]
consolidated.00.pth:   4% 566M/16.1G [00:05<08:29, 30.4MB/s]
consolidated.00.pth:   4% 619M/16.1G [00:05<04:57, 51.9MB/s]
consolidated.00.pth:   4% 661M/16.1G [00:05<03:30, 73.1MB/s]
consolidated.00.pth:   4% 692M/16.1G [00:05<02:48, 91.3MB/s]
consolidated.00.pth:   5% 724M/16.1G [00:06<02:16, 112MB/s] 
consolidated.00.pth:   5% 755M/16.1G [00:06<01:52, 137MB/s]
consolidated.00.pth:   5% 797M/16.1G [00:06<01:26, 177MB/s]
consolidated.00.pth:   5% 839M/16.1G [00:06<01:16, 199MB/s]
consolidated.00.pth:   5% 870M/16.1G [00:06<01:12, 211MB/s]
consolidated.00.pth:   6% 912M/16.1G [00:06<01:02, 242MB/s]
consolidated.00.pth:   6% 944M/16.1G [00:06<01:02, 244MB/s]
consolidated.00.pth:   6% 986M/16.1G [00:06<00:53, 282MB/s]
consolidated.00.pth:   6% 1.03G/16.1G [00:06<00:49, 302MB/s]
consolidated.00.pth:   7% 1.07G/16.1G [00:07<01:10, 214MB/s]
consolidated.00.pth:   7% 1.10G/16.1G [00:07<01:06, 224MB/s]
consolidated.00.pth:   7% 1.13G/16.1G [00:07<01:10, 213MB/s]
consolidated.00.pth:   7% 1.16G/16.1G [00:07<01:09, 213MB/s]
consolidated.00.pth:   8% 1.21G/16.1G [00:07<01:01, 240MB/s]
consolidated.00.pth:   8% 1.24G/16.1G [00:08<01:00, 245MB/s]
consolidated.00.pth:   8% 1.27G/16.1G [00:08<00:58, 252MB/s]
consolidated.00.pth:   8% 1.30G/16.1G [00:08<00:58, 250MB/s]
consolidated.00.pth:   8% 1.34G/16.1G [00:08<00:55, 265MB/s]
consolidated.00.pth:   9% 1.37G/16.1G [00:08<00:58, 252MB/s]
consolidated.00.pth:   9% 1.41G/16.1G [00:08<00:57, 257MB/s]
consolidated.00.pth:   9% 1.44G/16.1G [00:08<00:58, 252MB/s]
consolidated.00.pth:   9% 1.47G/16.1G [00:08<00:57, 254MB/s]
consolidated.00.pth:   9% 1.50G/16.1G [00:09<00:57, 253MB/s]
consolidated.00.pth:  10% 1.53G/16.1G [00:09<01:00, 242MB/s]
consolidated.00.pth:  10% 1.56G/16.1G [00:09<01:01, 237MB/s]
consolidated.00.pth:  10% 1.59G/16.1G [00:09<00:58, 247MB/s]
consolidated.00.pth:  10% 1.63G/16.1G [00:09<00:58, 249MB/s]
consolidated.00.pth:  10% 1.66G/16.1G [00:09<00:57, 252MB/s]
consolidated.00.pth:  11% 1.69G/16.1G [00:09<01:02, 229MB/s]
consolidated.00.pth:  11% 1.72G/16.1G [00:10<01:07, 212MB/s]
consolidated.00.pth:  11% 1.76G/16.1G [00:10<01:03, 225MB/s]
consolidated.00.pth:  11% 1.79G/16.1G [00:10<00:58, 244MB/s]
consolidated.00.pth:  11% 1.84G/16.1G [00:10<00:55, 254MB/s]
consolidated.00.pth:  12% 1.87G/16.1G [00:10<00:53, 266MB/s]
consolidated.00.pth:  12% 1.91G/16.1G [00:10<00:49, 286MB/s]
consolidated.00.pth:  12% 1.94G/16.1G [00:10<00:48, 290MB/s]
consolidated.00.pth:  12% 1.98G/16.1G [00:10<00:45, 306MB/s]
consolidated.00.pth:  13% 2.02G/16.1G [00:11<00:44, 314MB/s]
consolidated.00.pth:  13% 2.07G/16.1G [00:11<00:42, 333MB/s]
consolidated.00.pth:  13% 2.11G/16.1G [00:11<00:46, 302MB/s]
consolidated.00.pth:  13% 2.14G/16.1G [00:11<01:26, 162MB/s]
consolidated.00.pth:  14% 2.18G/16.1G [00:11<01:09, 198MB/s]
consolidated.00.pth:  14% 2.22G/16.1G [00:12<01:02, 221MB/s]
consolidated.00.pth:  14% 2.25G/16.1G [00:15<07:53, 29.2MB/s]
consolidated.00.pth:  14% 2.31G/16.1G [00:15<05:05, 44.9MB/s]
consolidated.00.pth:  15% 2.34G/16.1G [00:16<04:04, 56.2MB/s]
consolidated.00.pth:  15% 2.37G/16.1G [00:16<03:17, 69.3MB/s]
consolidated.00.pth:  15% 2.41G/16.1G [00:16<02:23, 95.0MB/s]
consolidated.00.pth:  15% 2.45G/16.1G [00:16<01:48, 125MB/s] 
consolidated.00.pth:  16% 2.50G/16.1G [00:16<01:25, 159MB/s]
consolidated.00.pth:  16% 2.54G/16.1G [00:16<01:15, 180MB/s]
consolidated.00.pth:  16% 2.58G/16.1G [00:16<01:11, 189MB/s]
consolidated.00.pth:  16% 2.61G/16.1G [00:17<01:08, 196MB/s]
consolidated.00.pth:  16% 2.64G/16.1G [00:17<01:08, 195MB/s]
consolidated.00.pth:  17% 2.67G/16.1G [00:17<01:22, 163MB/s]
consolidated.00.pth:  17% 2.69G/16.1G [00:21<10:48, 20.6MB/s]
consolidated.00.pth:  17% 2.74G/16.1G [00:22<06:59, 31.7MB/s]
consolidated.00.pth:  17% 2.79G/16.1G [00:22<04:23, 50.3MB/s]
consolidated.00.pth:  18% 2.83G/16.1G [00:22<03:10, 69.3MB/s]
consolidated.00.pth:  18% 2.87G/16.1G [00:22<02:30, 87.4MB/s]
consolidated.00.pth:  18% 2.90G/16.1G [00:22<02:03, 106MB/s] 
consolidated.00.pth:  18% 2.94G/16.1G [00:22<01:44, 126MB/s]
consolidated.00.pth:  18% 2.97G/16.1G [00:22<01:28, 148MB/s]
consolidated.00.pth:  19% 3.00G/16.1G [00:22<01:16, 171MB/s]
consolidated.00.pth:  19% 3.04G/16.1G [00:23<01:03, 205MB/s]
consolidated.00.pth:  19% 3.07G/16.1G [00:23<00:59, 219MB/s]
consolidated.00.pth:  19% 3.10G/16.1G [00:23<00:56, 227MB/s]
consolidated.00.pth:  20% 3.14G/16.1G [00:23<00:55, 234MB/s]
consolidated.00.pth:  20% 3.17G/16.1G [00:23<00:52, 244MB/s]
consolidated.00.pth:  20% 3.20G/16.1G [00:23<00:54, 237MB/s]
consolidated.00.pth:  20% 3.23G/16.1G [00:23<00:51, 247MB/s]
consolidated.00.pth:  20% 3.26G/16.1G [00:23<00:49, 256MB/s]
consolidated.00.pth:  21% 3.29G/16.1G [00:24<00:50, 253MB/s]
consolidated.00.pth:  21% 3.32G/16.1G [00:24<00:50, 253MB/s]
consolidated.00.pth:  21% 3.36G/16.1G [00:24<00:49, 255MB/s]
consolidated.00.pth:  21% 3.40G/16.1G [00:24<00:48, 264MB/s]
consolidated.00.pth:  21% 3.43G/16.1G [00:24<00:53, 236MB/s]
consolidated.00.pth:  22% 3.47G/16.1G [00:24<00:45, 274MB/s]
consolidated.00.pth:  22% 3.50G/16.1G [00:24<00:49, 253MB/s]
consolidated.00.pth:  22% 3.53G/16.1G [00:25<00:51, 245MB/s]
consolidated.00.pth:  22% 3.58G/16.1G [00:25<00:45, 276MB/s]
consolidated.00.pth:  22% 3.61G/16.1G [00:25<00:53, 235MB/s]
consolidated.00.pth:  23% 3.64G/16.1G [00:25<00:54, 230MB/s]
consolidated.00.pth:  23% 3.67G/16.1G [00:25<00:52, 234MB/s]
consolidated.00.pth:  23% 3.71G/16.1G [00:25<00:45, 269MB/s]
consolidated.00.pth:  23% 3.74G/16.1G [00:25<00:45, 270MB/s]
consolidated.00.pth:  24% 3.77G/16.1G [00:26<00:54, 223MB/s]
consolidated.00.pth:  24% 3.82G/16.1G [00:26<00:48, 252MB/s]
consolidated.00.pth:  24% 3.86G/16.1G [00:26<00:44, 272MB/s]
consolidated.00.pth:  24% 3.89G/16.1G [00:26<00:48, 249MB/s]
consolidated.00.pth:  24% 3.92G/16.1G [00:26<00:47, 254MB/s]
consolidated.00.pth:  25% 3.95G/16.1G [00:26<00:49, 246MB/s]
consolidated.00.pth:  25% 3.98G/16.1G [00:26<00:48, 248MB/s]
consolidated.00.pth:  25% 4.02G/16.1G [00:26<00:49, 245MB/s]
consolidated.00.pth:  25% 4.05G/16.1G [00:27<00:48, 246MB/s]
consolidated.00.pth:  25% 4.08G/16.1G [00:27<00:51, 230MB/s]
consolidated.00.pth:  26% 4.11G/16.1G [00:27<00:52, 228MB/s]
consolidated.00.pth:  26% 4.15G/16.1G [00:27<00:45, 259MB/s]
consolidated.00.pth:  26% 4.19G/16.1G [00:27<00:42, 281MB/s]
consolidated.00.pth:  26% 4.23G/16.1G [00:27<00:43, 269MB/s]
consolidated.00.pth:  27% 4.26G/16.1G [00:27<00:45, 257MB/s]
consolidated.00.pth:  27% 4.29G/16.1G [00:28<00:47, 249MB/s]
consolidated.00.pth:  27% 4.32G/16.1G [00:28<00:53, 219MB/s]
consolidated.00.pth:  27% 4.36G/16.1G [00:28<00:45, 255MB/s]
consolidated.00.pth:  27% 4.41G/16.1G [00:28<00:37, 307MB/s]
consolidated.00.pth:  28% 4.46G/16.1G [00:28<00:45, 256MB/s]
consolidated.00.pth:  28% 4.49G/16.1G [00:28<00:53, 218MB/s]
consolidated.00.pth:  28% 4.52G/16.1G [00:29<00:52, 220MB/s]
consolidated.00.pth:  28% 4.56G/16.1G [00:29<00:44, 256MB/s]
consolidated.00.pth:  29% 4.61G/16.1G [00:29<00:37, 302MB/s]
consolidated.00.pth:  29% 4.67G/16.1G [00:29<00:36, 309MB/s]
consolidated.00.pth:  29% 4.71G/16.1G [00:29<00:49, 231MB/s]
consolidated.00.pth:  30% 4.74G/16.1G [00:29<00:53, 213MB/s]
consolidated.00.pth:  30% 4.78G/16.1G [00:29<00:45, 246MB/s]
consolidated.00.pth:  30% 4.81G/16.1G [00:30<00:46, 240MB/s]
consolidated.00.pth:  30% 4.85G/16.1G [00:30<00:41, 272MB/s]
consolidated.00.pth:  30% 4.90G/16.1G [00:30<00:37, 298MB/s]
consolidated.00.pth:  31% 4.94G/16.1G [00:30<00:44, 251MB/s]
consolidated.00.pth:  31% 4.97G/16.1G [00:30<00:44, 251MB/s]
consolidated.00.pth:  31% 5.00G/16.1G [00:30<00:45, 244MB/s]
consolidated.00.pth:  31% 5.03G/16.1G [00:30<00:44, 248MB/s]
consolidated.00.pth:  32% 5.06G/16.1G [00:31<00:45, 244MB/s]
consolidated.00.pth:  32% 5.10G/16.1G [00:31<00:51, 215MB/s]
consolidated.00.pth:  32% 5.13G/16.1G [00:31<01:04, 169MB/s]
consolidated.00.pth:  32% 5.15G/16.1G [00:36<10:08, 17.9MB/s]
consolidated.00.pth:  32% 5.19G/16.1G [00:36<06:26, 28.2MB/s]
consolidated.00.pth:  33% 5.22G/16.1G [00:36<04:43, 38.3MB/s]
consolidated.00.pth:  33% 5.26G/16.1G [00:36<03:12, 56.0MB/s]
consolidated.00.pth:  33% 5.31G/16.1G [00:36<02:18, 77.9MB/s]
consolidated.00.pth:  33% 5.34G/16.1G [00:37<01:53, 94.8MB/s]
consolidated.00.pth:  33% 5.37G/16.1G [00:37<02:11, 81.1MB/s]
consolidated.00.pth:  34% 5.40G/16.1G [00:37<01:54, 92.9MB/s]
consolidated.00.pth:  34% 5.42G/16.1G [00:37<01:41, 105MB/s] 
consolidated.00.pth:  34% 5.45G/16.1G [00:38<01:23, 128MB/s]
consolidated.00.pth:  34% 5.49G/16.1G [00:38<01:06, 159MB/s]
consolidated.00.pth:  34% 5.53G/16.1G [00:38<01:00, 173MB/s]
consolidated.00.pth:  35% 5.56G/16.1G [00:40<03:48, 46.0MB/s]
consolidated.00.pth:  35% 5.58G/16.1G [00:42<06:50, 25.5MB/s]
consolidated.00.pth:  35% 5.61G/16.1G [00:42<04:50, 35.9MB/s]
consolidated.00.pth:  35% 5.64G/16.1G [00:42<03:33, 48.9MB/s]
consolidated.00.pth:  35% 5.67G/16.1G [00:42<02:42, 64.0MB/s]
consolidated.00.pth:  36% 5.70G/16.1G [00:42<02:05, 82.5MB/s]
consolidated.00.pth:  36% 5.75G/16.1G [00:43<01:28, 117MB/s] 
consolidated.00.pth:  36% 5.78G/16.1G [00:43<01:12, 141MB/s]
consolidated.00.pth:  36% 5.81G/16.1G [00:43<01:10, 146MB/s]
consolidated.00.pth:  36% 5.85G/16.1G [00:43<00:55, 184MB/s]
consolidated.00.pth:  37% 5.89G/16.1G [00:43<00:50, 203MB/s]
consolidated.00.pth:  37% 5.92G/16.1G [00:43<00:48, 211MB/s]
consolidated.00.pth:  37% 5.97G/16.1G [00:43<00:42, 236MB/s]
consolidated.00.pth:  37% 6.00G/16.1G [00:44<00:42, 236MB/s]
consolidated.00.pth:  38% 6.03G/16.1G [00:44<00:40, 245MB/s]
consolidated.00.pth:  38% 6.06G/16.1G [00:44<00:39, 252MB/s]
consolidated.00.pth:  38% 6.09G/16.1G [00:44<00:37, 265MB/s]
consolidated.00.pth:  38% 6.13G/16.1G [00:44<00:36, 275MB/s]
consolidated.00.pth:  38% 6.17G/16.1G [00:44<00:38, 259MB/s]
consolidated.00.pth:  39% 6.20G/16.1G [00:44<00:39, 251MB/s]
consolidated.00.pth:  39% 6.23G/16.1G [00:44<00:40, 245MB/s]
consolidated.00.pth:  39% 6.26G/16.1G [00:45<00:48, 201MB/s]
consolidated.00.pth:  39% 6.30G/16.1G [00:45<00:42, 230MB/s]
consolidated.00.pth:  39% 6.33G/16.1G [00:45<00:48, 201MB/s]
consolidated.00.pth:  40% 6.36G/16.1G [00:45<01:04, 151MB/s]
consolidated.00.pth:  40% 6.40G/16.1G [00:45<00:55, 175MB/s]
consolidated.00.pth:  40% 6.44G/16.1G [00:46<00:47, 203MB/s]
consolidated.00.pth:  40% 6.47G/16.1G [00:46<00:47, 201MB/s]
consolidated.00.pth:  40% 6.50G/16.1G [00:46<00:45, 209MB/s]
consolidated.00.pth:  41% 6.54G/16.1G [00:46<00:46, 203MB/s]
consolidated.00.pth:  41% 6.57G/16.1G [00:46<00:56, 167MB/s]
consolidated.00.pth:  41% 6.61G/16.1G [00:47<00:49, 191MB/s]
consolidated.00.pth:  41% 6.64G/16.1G [00:47<00:48, 193MB/s]
consolidated.00.pth:  42% 6.67G/16.1G [00:47<00:45, 207MB/s]
consolidated.00.pth:  42% 6.71G/16.1G [00:47<00:39, 238MB/s]
consolidated.00.pth:  42% 6.74G/16.1G [00:49<03:04, 50.6MB/s]
consolidated.00.pth:  42% 6.76G/16.1G [00:52<07:35, 20.4MB/s]
consolidated.00.pth:  42% 6.78G/16.1G [00:52<06:09, 25.1MB/s]
consolidated.00.pth:  43% 6.83G/16.1G [00:53<03:53, 39.5MB/s]
...
consolidated.00.pth: 100% 16.1G/16.1G [01:57<00:00, 137MB/s]
Download complete. Moving file to Meta-Llama-3.1-8B-Instruct/original/consolidated.00.pth
Fetching 3 files: 100% 3/3 [01:57<00:00, 39.20s/it] 
/content/Meta-Llama-3.1-8B-Instruct

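2.5 Building the pipeline

Next, we build a text-generation pipeline.

A "text-generation pipeline" is a mechanism provided by Hugging Face's transformers library that bundles the whole sequence of steps for generating a continuation of a text (preprocessing, generation by the model, and postprocessing). With it, you can generate text easily without any fiddly configuration.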
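To make the preprocessing stage a little more concrete, here is a rough sketch of how a list of chat messages can be flattened into a single prompt string in the style of the Llama 3 chat format. This is a simplified approximation written for illustration: `format_llama3_prompt` is a hypothetical helper, and in practice the tokenizer's chat template does this work for you (the real Llama 3.1 template contains additional details that are omitted here).

```python
# Simplified approximation of the Llama 3 chat format (assumption: the real
# template applied by the tokenizer contains additional details).
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def format_llama3_prompt(messages):
    """Flatten chat messages into one prompt string (hypothetical helper)."""
    parts = [BOS]
    for message in messages:
        header = f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        parts.append(header + message["content"] + EOT)
    # Leave an open assistant header so the model writes the next reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example = [{"role": "user", "content": "Hello"}]
prompt = format_llama3_prompt(example)
print(prompt)
```

The generation stage then continues this string token by token, and the postprocessing stage turns the generated tokens back into text.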

To build the pipeline, run the following code.

# Build the pipeline
import transformers
import torch

# ID of the pretrained model to use
model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Build the text-generation pipeline
pipeline = transformers.pipeline(
    "text-generation",                                  # task: text generation
    model=model_id,                                     # use the model specified above
    model_kwargs={"torch_dtype": torch.bfloat16},       # load weights in bfloat16 to reduce memory use
    device_map="auto",                                  # place the model on the GPU (or other devices) automatically
)
Execution result

↓ In progress

↓ After completion
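The `torch_dtype` setting matters because it determines how many bytes each weight occupies. Here is a back-of-the-envelope estimate, assuming roughly 8 billion parameters (an approximation taken from the "8B" in the model name, not an exact count) and ignoring everything other than the weights themselves:

```python
# Rough weight-memory estimate. PARAM_COUNT is an assumption based on the
# "8B" in the model name, not an exact figure.
PARAM_COUNT = 8_000_000_000
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(param_count, dtype):
    """Return the approximate weight size in GB (1 GB = 10**9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 10**9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: about {weight_memory_gb(PARAM_COUNT, dtype):.0f} GB")
```

The bfloat16 figure of about 16 GB is in the same ballpark as the 16.1 GB `consolidated.00.pth` file in the download log above. If the GPU has less free memory than that, `device_map="auto"` can place some of the layers on the CPU, which tends to slow generation down.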

2.6 Building a chat interface

At this point, Llama is finally ready to use!

All that's left is to build an interactive chat like ChatGPT or Gemini!

Run the following code.

messages = []       # conversation history

while True:
    user_input = input("【You】 \n")

    # Exit when the user enters exit, quit, or 終了 ("end" in Japanese)
    if user_input.lower() in ["exit", "quit", "終了"]:
        break

    # Append the user's input to the conversation history
    messages.append({"role": "user", "content": user_input})

    # Generate a reply with the pipeline
    # messages: the conversation so far (passing the history in order lets the model use the context)
    # max_new_tokens: maximum number of tokens to generate (this caps the length of the reply)
    outputs = pipeline(messages, max_new_tokens=256)

    # Extract the latest reply and assign it to response
    response = outputs[0]["generated_text"][-1]["content"]
    print(f"【Llama】\n {response}")
    print()

    # Append the reply to the history so the next turn has context
    messages.append({"role": "assistant", "content": response})

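Running it brings up a field for entering a prompt at the bottom of the cell output.

From there, just type in a prompt as you would with ChatGPT or Gemini!

As a test, I entered the prompt 「インスタのリールについて教えて」 ("Tell me about Insta Reels"), written in Japanese and using the abbreviation "Insta" rather than "Instagram".

It took quite a while, but the model answered properly and even referred to it as "Instagram".

However, the answer was cut off partway through 🥺 This is most likely because generation stops once it reaches the max_new_tokens=256 limit set in the code above, so raising that value should allow longer answers.

Because the code runs in a while loop, you can keep chatting.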
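One thing to watch in a long session is that `messages` grows with every turn, so each call sends a longer history to the model. Below is a minimal sketch of one way to cap it by keeping only the most recent exchanges. `trim_history`, `chat_turn`, and `fake_generate` are hypothetical names, and `fake_generate` only stands in for the pipeline so the logic can be tried without loading the model.

```python
def trim_history(messages, max_turns=3):
    """Keep only the last `max_turns` user/assistant pairs (hypothetical helper)."""
    if max_turns <= 0:
        return []
    return messages[-2 * max_turns:]


def chat_turn(generate, messages, user_input, max_turns=3):
    """Run one exchange and return (reply, updated history)."""
    history = trim_history(messages, max_turns)
    history = history + [{"role": "user", "content": user_input}]
    outputs = generate(history)
    # Same indexing as in the loop above: the last message is the new reply.
    reply = outputs[0]["generated_text"][-1]["content"]
    return reply, history + [{"role": "assistant", "content": reply}]


def fake_generate(history):
    """Stand-in for the pipeline: echoes the last user message."""
    reply = {"role": "assistant", "content": "echo: " + history[-1]["content"]}
    return [{"generated_text": history + [reply]}]


history = []
for text in ["a", "b", "c"]:
    reply, history = chat_turn(fake_generate, history, text, max_turns=1)
print(reply)
print(len(history))
```

With the real model, you would pass something like `lambda h: pipeline(h, max_new_tokens=256)` in place of `fake_generate`. Trimming in whole user/assistant pairs keeps the history starting with a user message, at the cost of the model forgetting older turns.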

Summary

That is how I got Llama running on Google Colab.

I plan to customize the code and build on it as I move forward with my graduation research.

Thank you for reading to the end.

Source code

Here is the link to the final Google Colab notebook.
https://colab.research.google.com/drive/13ZvyHaXnOACsG3d9LbRQ2zxOSVLpb7kD?usp=sharing

(References)

I referred to the following sites.
https://note.com/npaka/n/n73b0786f48e9
https://miralab.co.jp/media/hugging-face/#index_id8
