👌

2023/07/27に公開

乱数の生成

``````%%time
from random import random

with open('input', 'w') as fout:
for _ in range(1000000000):
print((random() - 0.5) * 1000000, file=fout)
``````
``````CPU times: user 23min 45s, sys: 36.8 s, total: 24min 22s
Wall time: 24min 37s
``````

sortコマンド

``````%%time
!sort -n input > output_sort
``````
``````CPU times: user 11.2 s, sys: 1.37 s, total: 12.6 s
Wall time: 39min 30s
``````
``````!wc -l output_sort
``````
``````1000000000 output_sort
-499999.99974544795
-499999.99890903855
-499999.9982175315
-499999.9977990386
-499999.9960182383
-499999.9955448005
-499999.99452409195
-499999.9943637242
-499999.9931208382
-499999.9897155051
``````

ちゃんとソートできています。

std::sort

STLのsortを使ってソートしてみます。

``````%%writefile /content/std_sort_vector.cpp

#include <stdio.h>
#include <stdlib.h>
#include <chrono>
#include <vector>
#include <algorithm>

int main(int argc, char *argv[])
{
char buf[BUFSIZ];
std::vector<double> input;

auto start = std::chrono::system_clock::now();
while (fgets(buf, BUFSIZ, stdin) != NULL) {
input.push_back(atof(buf));
}
auto end = std::chrono::system_clock::now();
double elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "load: %f sec\n", elapsed / 1000);

start = std::chrono::system_clock::now();
std::sort(input.begin(), input.end());
end = std::chrono::system_clock::now();
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "sort: %f sec\n", elapsed / 1000);

start = std::chrono::system_clock::now();
for (size_t i = 0; i < input.size(); i++)
{
printf("%.12lf\n", input[i]);
}
end = std::chrono::system_clock::now();
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "out:  %f sec\n", elapsed / 1000);
}
``````
``````Writing /content/std_sort_vector.cpp
``````
``````!g++ -Ofast std_sort_vector.cpp
``````
``````%%time
!./a.out < input > output_std_vector
``````
``````load: 287.233000 sec
sort: 152.108000 sec
out:  1266.370000 sec
CPU times: user 7.85 s, sys: 876 ms, total: 8.73 s
Wall time: 28min 26s
``````
``````!wc -l output_std_vector
``````
``````1000000000 output_std_vector
-499999.999745447945
-499999.998909038550
-499999.998217531480
-499999.997799038596
-499999.996018238307
-499999.995544800477
-499999.994524091948
-499999.994363724196
-499999.993120838189
-499999.989715505100
``````

IOの時間が26分で同じとすると、sortコマンドのソートの時間は15分です。STLのsortは2分半ですから相当早い。速いとは聞いていましたが、浮動小数点数に特化しているのも効いている可能性があります。sortコマンドは高機能な分、余計な処理が挟まっているのかもしれません。

ska sort

それでは基数ソートを測定しましょう。

``````%cd /content
!git clone https://github.com/skarupke/ska_sort.git
``````
``````/content
Cloning into 'ska_sort'...
remote: Enumerating objects: 16, done.[K
remote: Total 16 (delta 0), reused 0 (delta 0), pack-reused 16[K
Receiving objects: 100% (16/16), 14.25 KiB | 7.13 MiB/s, done.
Resolving deltas: 100% (4/4), done.
``````
``````%%writefile /content/ska_sort.cpp
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <chrono>
#include "ska_sort/ska_sort.hpp"

int main(int argc, char *argv[])
{
char buf[BUFSIZ];
std::vector<double> input;
double f;

auto start = std::chrono::system_clock::now();
while (fgets(buf, BUFSIZ, stdin) != NULL) {
input.push_back(atof(buf));
}
auto end = std::chrono::system_clock::now();
double elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "load: %f sec\n", elapsed / 1000);

start = std::chrono::system_clock::now();
ska_sort(input.begin(), input.end());
end = std::chrono::system_clock::now();
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "sort: %f sec\n", elapsed / 1000);

start = std::chrono::system_clock::now();
for (size_t i = 0; i < input.size(); i++)
{
printf("%.12lf\n", input[i]);
}
end = std::chrono::system_clock::now();
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(end-start).count();
fprintf(stderr, "out:  %f sec\n", elapsed / 1000);
}
``````
``````Writing /content/ska_sort.cpp
``````
``````!g++ -Ofast ska_sort.cpp
``````
``````%%time
!./a.out < input > output_ska
``````
``````load: 279.663000 sec
sort: 59.707000 sec
out:  1266.158000 sec
CPU times: user 7.29 s, sys: 839 ms, total: 8.13 s
Wall time: 26min 46s
``````
``````!wc -l output_ska
``````
``````1000000000 output_ska
-499999.999745447945
-499999.998909038550
-499999.998217531480
-499999.997799038596
-499999.996018238307
-499999.995544800477
-499999.994524091948
-499999.994363724196
-499999.993120838189
-499999.989715505100
``````