Sites that look useful when you want to build an LLM from scratch

Published on 2024/05/27

NLP

LLM

idea

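Introduction
I keep thinking "I'd like to build an LLM from scratch," but I can't find the time, so I'm collecting sites that look useful for when the moment (?) comes.
This is a personal memo. I haven't actually built anything yet, so I can't say which of these are good.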

NLP2024 tutorial
Looks good.
https://x.com/hmtd223/status/1775035068077215945?s=20
https://github.com/hiroshi-matsuda-rit/NLP2024-tutorial-3
NLP2024 Tutorial 3: 作って学ぶ 日本語大規模言語モデル (Learn by Building: Japanese Large Language Models)

Neural Networks: Zero to Hero
Videos by Andrej Karpathy. They are in English, but they look really good.
https://www.bioerrorlog.work/entry/andrej-karpathy-nn-zero-to-hero
https://www.youtube.com/watch?v=kCc8FmEb1nY
Karpathy's repository

https://github.com/karpathy/LLM101n
Also by Karpathy: nanoGPT

https://github.com/karpathy/nanoGPT

Llama
Information on Llama

https://github.com/naklecha/llama3-from-scratch
https://github.com/meta-llama/llama/tree/main/llama
https://github.com/meta-llama/llama3/tree/main/llama

GENIAC
https://zenn.dev/p/matsuolab
https://github.com/matsuolab/ucllm_nedo_prod
https://note.com/uchidama/n/na980f4f95e45

Pretraining and continued pretraining of small Llama models with Megatron-LM
https://www.youtube.com/watch?v=Wl76E8_3_6M
https://zenn.dev/matsuolab/articles/febe6150cee2b7
https://zenn.dev/matsuolab/articles/9f05f2be70cff8
https://zenn.dev/matsuolab/articles/528c67549c9771

Attention
A repository that builds Attention from scratch

https://github.com/SuperHotDogCat/Attention-from-scratch

Required knowledge and environment setup
Knowledge needed to build an LLM

https://qiita.com/SuperHotDogCat/items/003154de08610e1ee182
Environment setup

https://zenn.dev/elith/articles/e4dbbb62752e04
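The book below uses image recognition for its implementation examples, but I think it is a useful reference for the kind of coding involved in building a framework.

ゼロから作るDeep Learning ❸ ―フレームワーク編 (Deep Learning from Scratch 3: Framework Edition)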

Other
https://github.com/llm-jp/llm-jp-sft
LLaVA (image-related)

https://github.com/tosiyuki/LLaVA-JP
Model merging

https://note.com/ngc_shj/n/na9b41adb9131
https://zenn.dev/tokyotech_lm/articles/5f4211b9ed3197
https://zenn.dev/yuki127/articles/813e72d026f230
https://zenn.dev/selllous/articles/transformers_pretrain_to_ft
https://zenn.dev/fusic/articles/fd6fbe8a5e966d
https://tech-blog.abeja.asia/entry/training-gpt-2-202411
https://github.com/HandsOnLLM/Hands-On-Large-Language-Models

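Summary
This is not limited to LLMs: building something small from scratch may look like a detour, but I think it is the shortest path to a deeper understanding. It won't pay off right away, but I'd like to set aside a block of time to work through this kind of thing.
I'm also hoping a good Japanese-language book comes out.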

Reference links
https://note.com/npaka/n/n23e2a05cb650
https://note.com/kan_hatakeyama
A list of LLM-related papers to read first

https://eugeneyan.com/writing/llm-reading-list/

Change history
2024/07/09 Updated
