Closed on 2023/10/09

MemoryError: Unable to allocate 2.03 TiB for an array with shape (35167920,) and data type <U15882

Trying to allocate a NumPy array with a very large number of elements produced the following error.

paragraph_ids = np.array(cohere_dataset["paragraph_id"], dtype=int)
paragraph_texts = np.array(cohere_dataset["text"], dtype=str)

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
/ml-docker/working/kaggle-llm-exam/make_wikipedia_full_dataset.ipynb Cell 5 line 2
      1 paragraph_ids = np.array(cohere_dataset["paragraph_id"], dtype=int)
----> 2 paragraph_texts = np.array(cohere_dataset["text"], dtype=str)
MemoryError: Unable to allocate 2.03 TiB for an array with shape (35167920,) and data type <U15882
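
The 2.03 TiB figure can be reproduced by hand. NumPy stores `str` arrays as fixed-width Unicode (`<U15882` here), where the width is the length of the longest element and each character takes 4 bytes, so all 35,167,920 slots are sized for the longest text. A minimal sketch of that arithmetic follows; the toy `texts` list and the helper name `fixed_width_str_bytes` are illustrative and not part of the original notebook.

```python
import numpy as np

# Toy data (hypothetical): the longest string sets the width of every slot.
texts = ["a", "bb", "ccccc"]
arr = np.array(texts, dtype=str)

# Fixed-width Unicode: width = longest string, 4 bytes per character.
print(arr.dtype)     # <U5
print(arr.itemsize)  # 20


def fixed_width_str_bytes(n_items: int, max_chars: int) -> int:
    """Bytes needed for a NumPy '<U{max_chars}' array with n_items elements."""
    return n_items * max_chars * 4


# Values taken from the error message: shape (35167920,), dtype <U15882.
required = fixed_width_str_bytes(35_167_920, 15_882)
print(f"{required / 2**40:.2f} TiB")  # 2.03 TiB
```

In other words, a single 15,882-character paragraph is enough to make every row cost about 62 KiB, regardless of how short the other texts are.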

Investigation

A similar problem has been reported on Stack Overflow [1], and the cause appears to be the behavior of the system's overcommit handling [2].
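
Before changing anything, it can help to check which overcommit mode is currently active. The following is a minimal sketch, assuming a Linux system that exposes `/proc/sys/vm/overcommit_memory`; the function name `read_overcommit_mode` and the `OVERCOMMIT_MODES` table are hypothetical helpers written for this note, with the mode descriptions paraphrased from the Linux kernel documentation.

```python
from pathlib import Path
from typing import Optional

# Meanings of vm.overcommit_memory, paraphrased from the kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "never overcommit (strict accounting)",
}


def read_overcommit_mode(
    path: str = "/proc/sys/vm/overcommit_memory",
) -> Optional[int]:
    """Return the current overcommit mode, or None if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return None  # e.g. not running on Linux
    return int(p.read_text().strip())


mode = read_overcommit_mode()
print(mode, OVERCOMMIT_MODES.get(mode, "unknown"))
```

Reading the file does not require root, unlike writing to it.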

action1: Change the setting so that overcommit is always allowed

As in the Stack Overflow answer, change the overcommit policy from 0 to 1.

# echo 1 > /proc/sys/vm/overcommit_memory

Result

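The process used up all available memory and the error below appeared. The default behavior apparently exists to refuse a malloc that exceeds the system's free memory, so it seems there simply was not enough physical memory in the first place.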
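
The conclusion above can be checked up front by comparing the requested size with the memory the system reports as available. This is a rough sketch under stated assumptions: the helper names `parse_meminfo_kib` and `fits_in_memory` are hypothetical, and the `sample` text is made-up `/proc/meminfo`-style data (64 GiB available), not the values from the machine in this scrap.

```python
def parse_meminfo_kib(text: str, key: str = "MemAvailable") -> int:
    """Extract one value (in KiB) from /proc/meminfo-style text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == key:
            return int(rest.split()[0])
    raise KeyError(key)


def fits_in_memory(required_bytes: int, available_kib: int) -> bool:
    """True if the requested allocation is no larger than available memory."""
    return required_bytes <= available_kib * 1024


# Illustrative sample only: 64 GiB reported as available.
sample = "MemTotal:       131072000 kB\nMemAvailable:   67108864 kB\n"

# Size from the error message: 35167920 elements x 15882 chars x 4 bytes.
required_bytes = 35_167_920 * 15_882 * 4
print(fits_in_memory(required_bytes, parse_meminfo_kib(sample)))  # False
```

On a real Linux machine the text could be read with `open("/proc/meminfo").read()` instead of the sample string. With about 2.03 TiB requested, the check fails on any machine with less than that, which matches the kernel crash observed here once overcommit was forced on.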

Error message

The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.

Reverting action1

# echo 0 > /proc/sys/vm/overcommit_memory

References
