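- Do something about `& 0xff` being hard-coded all over the place
- Implement the CPU instructions in a separate class
- Make a logo
- Error handling
- Use puts
- Add run commands
- Split each run command into its own file
- start, bench, stackprof
- APU constants
- Reference the audio environment variable (48000) when calculating the step count
- Pull 512 from audio
- Move joypad input handling into a separate class
- Enable YJIT at runtime
- Add screenshots to the README
- Review the PPU implementation
- Check the manual and other implementations
- The Puyo Puyo opening looks wrong
- Write tests
- Take screenshots without shadows
- Show the title and FPS
- Add serial communication
- Try the other tests too
- Use PyBoy as a reference
- Create test files and run them in CI
- Fix rendering bugs
- Add more MBC types
- Game Boy Color support
- Wasm support
- Refactor apu, ppu, joypad
- Tidy up the benchmarking setup
- Want to be able to turn rendering and audio on and off
- Also add it to the README
- Want it to be usable as a Ruby benchmark program
- Remove unneeded requires
- Speed up rendering
- Skip unnecessary rendering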
data ='', 'r') { }
p data[0x134..0x143].pack('C*').strip
=> "TOBU"
GitHub Copilot suggests this for me. Handy.
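A minimal sketch of the header read above, assuming the ROM has already been loaded as an array of bytes. The title field sits at 0x134..0x143 of the cartridge header and is padded with zero bytes. `rom_title` and the fake ROM are hypothetical names used only for this illustration.

```ruby
# rom: Array of Integers (0..255), for example File.binread(path).bytes
def rom_title(rom)
  # 0x134..0x143 is the 16-byte, zero-padded title field of the header.
  rom[0x134..0x143].pack('C*').delete("\x00").strip
end

# Fake 32 KiB ROM built only for this example.
fake_rom = Array.new(0x8000, 0)
'TOBU'.bytes.each_with_index { |b, i| fake_rom[0x134 + i] = b }
puts rom_title(fake_rom)
```

Deleting the zero bytes explicitly avoids relying on how `strip` treats NUL padding.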
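- The description of the half-carry flag for the DEC instruction reads "H - Set if no borrow from bit 4.", but shouldn't it be set when there is a borrow?
- 1
- 5: when setting f via pop af, the lower 4 bits were not being cleared to 0000
- 3
- 1: how the c flag is calculated for 0xe8 and 0xf8
- Passed with cflag = (@sp & 0xff) + (byte & 0xff) > 0xff
- 1: how the c flag is calculated for 0xe8 and 0xf8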
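The three flag issues above can be sketched as small helpers. This is an illustration under assumptions, not the emulator's actual code: the method names are hypothetical, and the H rule for 0xe8/0xf8 comes from general knowledge of the instruction set rather than from these notes. As far as I know, real hardware sets H on DEC when there is a borrow from bit 4, which is the case exactly when the low nibble is 0 before the decrement.

```ruby
# DEC r8: H is set when the low nibble borrows from bit 4,
# i.e. when the low nibble is 0 before the decrement.
def dec8_half_carry?(value)
  (value & 0x0f) == 0
end

# POP AF: the low 4 bits of F always read back as 0.
def pop_af_flags(byte)
  byte & 0xf0
end

# ADD SP,e8 (0xe8) and LD HL,SP+e8 (0xf8): both flags come from the
# unsigned addition of the low byte of SP and the raw operand byte.
def sp_offset_flags(sp, byte)
  {
    h: (sp & 0x0f) + (byte & 0x0f) > 0x0f, # assumption: carry out of bit 3
    c: (sp & 0xff) + (byte & 0xff) > 0xff  # formula noted above
  }
end

p dec8_half_carry?(0x10)
p pop_af_flags(0xff)
p sp_offset_flags(0x00ff, 0x01)
```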
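- When rendering happens
- → In an emulator, count cycles each time a CPU instruction executes, and refresh the screen once enough cycles have accumulated to reach VBlank
- When to write data to VRAM
- → Loop while watching the ly register until VBlank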
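The cycle-counting idea can be sketched roughly as follows. `SketchPpu` and `run_until_vblank` are hypothetical names, and the PPU here only tracks LY. The timing constants (456 cycles per scanline, VBlank starting at line 144, 154 lines per frame) are the standard DMG values.

```ruby
CYCLES_PER_LINE = 456    # cycles (dots) per scanline
VBLANK_START_LINE = 144  # first scanline of VBlank
LINES_PER_FRAME = 154

# Hypothetical, stripped-down PPU that only tracks LY.
class SketchPpu
  attr_reader :ly

  def initialize
    @ly = 0
    @cycles = 0
  end

  # Accumulates CPU cycles; returns true when LY has just entered VBlank.
  def step(cycles)
    @cycles += cycles
    return false if @cycles < CYCLES_PER_LINE

    @cycles -= CYCLES_PER_LINE
    @ly = (@ly + 1) % LINES_PER_FRAME
    @ly == VBLANK_START_LINE
  end
end

# Runs "instructions" until the PPU reports VBlank; returns how many ran.
def run_until_vblank(ppu, cycles_per_instruction = 4)
  steps = 0
  loop do
    steps += 1
    # A real loop would execute one CPU instruction here and use its cycle count.
    break if ppu.step(cycles_per_instruction)
  end
  steps
end

steps = run_until_vblank(SketchPpu.new)
puts "redraw after #{steps} instructions"
```

The point where `step` returns true is where the emulator would hand the frame buffer to the display.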
yjit: false
1: 36.740829 sec
2: 36.468515 sec
3: 36.177083 sec
FPS: 41.1385591742566
yjit: true
1: 32.305559 sec
2: 32.094778 sec
3: 31.889601 sec
FPS: 46.73385499531633
→ render_sprites is the bottleneck
Mode: cpu(1000)
Samples: 9081 (1.08% miss rate)
GC: 4 (0.04%)
3727 (41.0%) 1920 (21.1%) Rubyboy::Ppu#render_sprites
1800 (19.8%) 1800 (19.8%) Rubyboy::Operand#initialize
1448 (15.9%) 1448 (15.9%) Integer#zero?
3346 (36.8%) 1296 (14.3%) Enumerable#each_slice
919 (10.1%) 919 (10.1%) Integer#<<
424 (4.7%) 424 (4.7%) Integer#<=>
3552 (39.1%) 294 (3.2%) Array#each
162 (1.8%) 159 (1.8%) Rubyboy::Cpu#flags
147 (1.6%) 147 (1.6%) Array#size
104 (1.1%) 104 (1.1%) Integer#>>
2220 (24.4%) 71 (0.8%) Rubyboy::Ppu#render_bg
6259 (68.9%) 58 (0.6%) Rubyboy::Ppu#step
149 (1.6%) 55 (0.6%) Rubyboy::Ppu#get_color
44 (0.5%) 44 (0.5%) Rubyboy::Ppu#to_signed_byte
177 (1.9%) 38 (0.4%) Rubyboy::Timer#step
34 (0.4%) 34 (0.4%) Integer#-@
146 (1.6%) 29 (0.3%) Rubyboy::Cartridge::Mbc1#read_byte
29 (0.3%) 29 (0.3%) Rubyboy::Registers#write8
915 (10.1%) 24 (0.3%) Rubyboy::Ppu#get_pixel
17 (0.2%) 17 (0.2%) Rubyboy::Registers#read8
14 (0.2%) 14 (0.2%) Rubyboy::Cpu#increment_pc_by_byte
9054 (99.7%) 14 (0.2%) Rubyboy::Console#bench
434 (4.8%) 11 (0.1%) Range#===
398 (4.4%) 10 (0.1%) Rubyboy::Bus#read_byte
9 (0.1%) 9 (0.1%) Rubyboy::Ppu#handle_ly_eq_lyc
154 (1.7%) 8 (0.1%) Rubyboy::Ppu#render_window
1244 (13.7%) 7 (0.1%) Rubyboy::Ppu#get_tile_index
2597 (28.6%) 6 (0.1%) Rubyboy::Cpu#exec
110 (1.2%) 5 (0.1%) Rubyboy::Bus#write_byte
9 (0.1%) 5 (0.1%) Rubyboy::Ppu#write_byte
Rubyboy::Ppu#render_sprites (/Users/yamasaki/dev/gb-emulator/rubyboy/rubyboy/lib/rubyboy/ppu.rb:220)
samples: 1920 self (21.1%) / 3727 total (41.0%)
3727 ( 100.0%) Rubyboy::Ppu#step
1902 ( 51.0%) Enumerable#each_slice
46 ( 1.2%) Enumerator#with_index
35 ( 0.9%) Array#each
29 ( 0.8%) Integer#times
callees (1807 total):
3307 ( 183.0%) Enumerator#each
339 ( 18.8%) Enumerator#with_index
39 ( 2.2%) Enumerable#each_slice
36 ( 2.0%) Array#each
34 ( 1.9%) Integer#-@
29 ( 1.6%) Integer#times
20 ( 1.1%) Rubyboy::Ppu#get_pixel
9 ( 0.5%) Integer#zero?
5 ( 0.3%) Rubyboy::Ppu#get_color
1 ( 0.1%) Enumerable#sort_by
| 220 | def render_sprites
3 (0.0%) | 221 | return if @lcdc[LCDC[:sprite_enable]].zero?
| 222 |
2 (0.0%) | 223 | sprite_height = @lcdc[LCDC[:sprite_size]].zero? ? 8 : 16
| 224 | sprites = []
| 225 | cnt = 0
3346 (36.8%) | 226 | @oam.each_slice(4).each do |sprite_attr|
| 227 | sprite = {
| 228 | y: (sprite_attr[0] - 16) % 256,
| 229 | x: (sprite_attr[1] - 8) % 256,
| 230 | tile_index: sprite_attr[2],
| 231 | flags: sprite_attr[3]
| 232 | }
| 233 | next if sprite[:y] > @ly || sprite[:y] + sprite_height <= @ly
| 234 |
| 235 | sprites << sprite
| 236 | cnt += 1
15 (0.2%) / 15 (0.2%) | 237 | break if cnt == 10
1887 (20.8%) / 1887 (20.8%) | 238 | end
386 (4.3%) / 12 (0.1%) | 239 | sprites = sprites.sort_by.with_index { |sprite, i| [-sprite[:x], -i] }
| 240 |
36 (0.4%) | 241 | sprites.each do |sprite|
| 242 | flags = sprite[:flags]
4 (0.0%) | 243 | pallet = flags[SPRITE_FLAGS[:dmg_palette]].zero? ? @obp0 : @obp1
| 244 | tile_index = sprite[:tile_index]
| 245 | tile_index &= 0xfe if sprite_height == 16
| 246 | y = (@ly - sprite[:y]) % 256
2 (0.0%) / 2 (0.0%) | 247 | y = sprite_height - y - 1 if flags[SPRITE_FLAGS[:y_flip]] == 1
| 248 | tile_index = (tile_index + 1) % 256 if y >= 8
| 249 | y %= 8
| 250 |
29 (0.3%) | 251 | 8.times do |x|
2 (0.0%) / 2 (0.0%) | 252 | x_flipped = flags[SPRITE_FLAGS[:x_flip]] == 1 ? 7 - x : x
| 253 |
20 (0.2%) | 254 | pixel = get_pixel(tile_index, x_flipped, y)
| 255 | i = (sprite[:x] + x) % 256
| 256 |
| 257 | next if || i >= LCD_WIDTH
2 (0.0%) / 2 (0.0%) | 258 | next if flags[SPRITE_FLAGS[:priority]] == 1 && @bg_pixels[i] != 0
| 259 |
5 (0.1%) | 260 | @buffer[@ly * LCD_WIDTH + i] = get_color(pallet, pixel)
| 261 | end
FPS: 46.73385499531633 → 49.2233733053377
Changed to `== 0` (instead of `zero?`)
FPS: 49.2233733053377 → 49.36641822413328
FPS: 49.36641822413328 → 50.94130878614299
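One way to take pressure off `each_slice` and the per-entry Hash in the OAM scan is a plain index loop. The sketch below is an assumption about what such a rewrite could look like, not necessarily the change that produced the numbers above. `visible_sprites` is a hypothetical name.

```ruby
# Returns up to 10 [y, x, tile_index, flags] entries visible on scanline ly.
# oam is a flat Array of 160 bytes (40 sprites x 4 bytes).
def visible_sprites(oam, ly, sprite_height)
  sprites = []
  i = 0
  while i < oam.size && sprites.size < 10
    y = (oam[i] - 16) & 0xff
    # Only build an entry for sprites that actually cover this scanline.
    if y <= ly && ly < y + sprite_height
      sprites << [y, (oam[i + 1] - 8) & 0xff, oam[i + 2], oam[i + 3]]
    end
    i += 4
  end
  sprites
end

oam = Array.new(160, 0)
oam[0, 4] = [21, 18, 0x42, 0] # one sprite at y=5, x=10
p visible_sprites(oam, 5, 8)
```

The comparisons use `==`, `<=` and `<` directly, in line with the move away from `zero?` noted above.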
stackprof, second run
Mode: cpu(1000)
Samples: 5666 (1.73% miss rate)
GC: 7 (0.12%)
1334 (23.5%) 1334 (23.5%) Rubyboy::Ppu#to_signed_byte
1260 (22.2%) 1260 (22.2%) Integer#<<
662 (11.7%) 662 (11.7%) Integer#<=>
913 (16.1%) 471 (8.3%) Array#each
403 (7.1%) 403 (7.1%) Rubyboy::Registers#read8
410 (7.2%) 214 (3.8%) Enumerable#each_slice
3739 (66.0%) 190 (3.4%) Rubyboy::Ppu#step
180 (3.2%) 180 (3.2%) Rubyboy::Timer#step
961 (17.0%) 165 (2.9%) Rubyboy::Ppu#render_sprites
195 (3.4%) 163 (2.9%) Rubyboy::Cpu#flags
98 (1.7%) 98 (1.7%) Integer#>>
2415 (42.6%) 87 (1.5%) Rubyboy::Ppu#render_bg
57 (1.0%) 57 (1.0%) Array#size
138 (2.4%) 49 (0.9%) Rubyboy::Ppu#get_color
152 (2.7%) 32 (0.6%) Rubyboy::Cartridge::Mbc1#read_byte
31 (0.5%) 31 (0.5%) Rubyboy::Registers#write8
29 (0.5%) 29 (0.5%) Integer#-@
1020 (18.0%) 27 (0.5%) Rubyboy::Ppu#get_pixel
19 (0.3%) 19 (0.3%) Rubyboy::Interrupt#interrupts
869 (15.3%) 17 (0.3%) Rubyboy::Cpu#get_value
1349 (23.8%) 15 (0.3%) Rubyboy::Ppu#get_tile_index
5634 (99.4%) 13 (0.2%) Rubyboy::Console#bench
643 (11.3%) 11 (0.2%) Rubyboy::Bus#read_byte
569 (10.0%) 10 (0.2%) Rubyboy::Cpu#ld8
9 (0.2%) 9 (0.2%) Rubyboy::Cpu#increment_pc_by_byte
8 (0.1%) 8 (0.1%) Rubyboy::Ppu#handle_ly_eq_lyc
100 (1.8%) 7 (0.1%) Rubyboy::Bus#write_byte
666 (11.8%) 6 (0.1%) Range#===
164 (2.9%) 5 (0.1%) Rubyboy::Ppu#render_window
5 (0.1%) 5 (0.1%) FFI::FunctionType#initialize
Initialize tile_map_addr outside the loop
FPS: 50.94130878614299 → 56.6580741129914
Precompute outside the loop
FPS: 56.6580741129914 → 60.44140113483162
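A small illustration of hoisting scanline-invariant work out of the per-pixel loop. The method names and the exact expression are hypothetical. The sketch only assumes a 32x32 background tile map and a 160-pixel-wide LCD.

```ruby
LCD_WIDTH = 160

# Before: the row part is recomputed for every pixel of the scanline.
def tile_map_offsets_slow(ly, scy, scx)
  Array.new(LCD_WIDTH) do |i|
    y = (ly + scy) & 0xff
    x = (i + scx) & 0xff
    (y / 8) * 32 + (x / 8)
  end
end

# After: everything that depends only on the scanline is computed once.
def tile_map_offsets_fast(ly, scy, scx)
  row_offset = (((ly + scy) & 0xff) / 8) * 32
  Array.new(LCD_WIDTH) { |i| row_offset + (((i + scx) & 0xff) / 8) }
end

p tile_map_offsets_fast(0, 0, 0).first(10)
```

Both versions return the same offsets. The second one just does less work per pixel.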
TODO: stop using constants
Ruby v3.2 -> v3.3
FPS: 61.021 → 115.236
rubyboy % stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 16405 (4.57% miss rate)
GC: 5593 (34.09%)
3688 (22.5%) 3688 (22.5%) (sweeping)
2332 (14.2%) 2109 (12.9%) Enumerable#flat_map
2050 (12.5%) 2050 (12.5%) Integer#<=>
5593 (34.1%) 1679 (10.2%) (garbage collection)
1038 (6.3%) 1038 (6.3%) Rubyboy::Ppu#to_signed_byte
1004 (6.1%) 1004 (6.1%) Rubyboy::SDL.RenderClear
646 (3.9%) 646 (3.9%) Rubyboy::Ppu#get_pixel
437 (2.7%) 437 (2.7%) Integer#>>
701 (4.3%) 332 (2.0%) Rubyboy::Ppu#render_sprites
1354 (8.3%) 278 (1.7%) Rubyboy::Lcd#draw
3825 (23.3%) 257 (1.6%) Rubyboy::Ppu#step
1627 (9.9%) 255 (1.6%) Rubyboy::Ppu#render_bg
633 (3.9%) 247 (1.5%) Enumerable#each_slice
230 (1.4%) 230 (1.4%) Rubyboy::Registers#read8
226 (1.4%) 226 (1.4%) (marking)
2332 (14.2%) 223 (1.4%) Rubyboy::Console#buffer_to_pixel_data
2933 (17.9%) 194 (1.2%) Integer#times
1228 (7.5%) 185 (1.1%) Rubyboy::Ppu#render_window
178 (1.1%) 178 (1.1%) Rubyboy::Timer#step
524 (3.2%) 110 (0.7%) Rubyboy::Ppu#get_color
95 (0.6%) 95 (0.6%) Rubyboy::Registers#write8
116 (0.7%) 81 (0.5%) Rubyboy::Cpu#flags
80 (0.5%) 80 (0.5%) Rubyboy::Registers#read16
662 (4.0%) 80 (0.5%) Rubyboy::Cartridge::Mbc1#read_byte
1203 (7.3%) 69 (0.4%) Rubyboy::Cpu#ld8
62 (0.4%) 62 (0.4%) Array#size
57 (0.3%) 57 (0.3%) Rubyboy::Cpu#increment_pc_by_byte
56 (0.3%) 56 (0.3%) Rubyboy::SDL.UpdateTexture
44 (0.3%) 44 (0.3%) Rubyboy::Interrupt#interrupts
2090 (12.7%) 44 (0.3%) Range#===
rubyboy % stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 11758 (6.52% miss rate)
GC: 3103 (26.39%)
1857 (15.8%) 1857 (15.8%) Integer#<=>
1542 (13.1%) 1542 (13.1%) (sweeping)
3103 (26.4%) 1459 (12.4%) (garbage collection)
1087 (9.2%) 1087 (9.2%) Rubyboy::SDL.RenderClear
950 (8.1%) 950 (8.1%) Rubyboy::Ppu#to_signed_byte
1797 (15.3%) 646 (5.5%) Rubyboy::Ppu#render_bg
606 (5.2%) 606 (5.2%) Rubyboy::Ppu#get_pixel
1381 (11.7%) 467 (4.0%) Rubyboy::Ppu#render_window
712 (6.1%) 287 (2.4%) Rubyboy::Ppu#render_sprites
281 (2.4%) 281 (2.4%) Rubyboy::SDL.UpdateTexture
618 (5.3%) 261 (2.2%) Enumerable#each_slice
4152 (35.3%) 249 (2.1%) Rubyboy::Ppu#step
246 (2.1%) 246 (2.1%) Integer#>>
192 (1.6%) 192 (1.6%) Rubyboy::Registers#read8
3252 (27.7%) 178 (1.5%) Integer#times
162 (1.4%) 162 (1.4%) Rubyboy::Timer#step
102 (0.9%) 102 (0.9%) (marking)
321 (2.7%) 101 (0.9%) Rubyboy::Ppu#get_color
600 (5.1%) 100 (0.9%) Rubyboy::Cartridge::Mbc1#read_byte
88 (0.7%) 88 (0.7%) Rubyboy::Registers#write8
80 (0.7%) 80 (0.7%) Rubyboy::Registers#read16
78 (0.7%) 78 (0.7%) Array#size
1108 (9.4%) 73 (0.6%) Rubyboy::Cpu#ld8
104 (0.9%) 69 (0.6%) Rubyboy::Cpu#flags
64 (0.5%) 64 (0.5%) Rubyboy::SDL.GetKeyboardState
620 (5.3%) 56 (0.5%) Array#each
1904 (16.2%) 52 (0.4%) Range#===
1441 (12.3%) 52 (0.4%) Rubyboy::Lcd#draw
41 (0.3%) 41 (0.3%) Rubyboy::Cpu#increment_pc_by_byte
8648 (73.5%) 39 (0.3%) Rubyboy::Console#bench
rubyboy % heap-profiler tmp/report
Total allocated: 563.01 MB (4198804 objects)
Total retained: 10.13 kB (252 objects)
allocated memory by gem
563.01 MB rubyboy/lib
320.00 B heap-profiler-0.7.0
allocated memory by file
454.17 MB rubyboy/lib/rubyboy/cpu.rb
93.18 MB rubyboy/lib/rubyboy/ppu.rb
10.06 MB rubyboy/lib/rubyboy/apu.rb
4.35 MB rubyboy/lib/rubyboy/audio.rb
1.25 MB rubyboy/lib/rubyboy.rb
720.00 B rubyboy/lib/rubyboy/lcd.rb
416.00 B rubyboy/lib/rubyboy/apu_channels/channel2.rb
416.00 B rubyboy/lib/rubyboy/apu_channels/channel1.rb
320.00 B rubyboy/lib/rubyboy/bus.rb
320.00 B heap-profiler-0.7.0/lib/heap_profiler/reporter.rb
296.00 B rubyboy/lib/rubyboy/interrupt.rb
120.00 B rubyboy/lib/rubyboy/apu_channels/channel4.rb
80.00 B rubyboy/lib/rubyboy/registers.rb
40.00 B rubyboy/lib/rubyboy/cartridge/mbc1.rb
40.00 B rubyboy/lib/rubyboy/apu_channels/channel3.rb
allocated memory by location
77.83 MB rubyboy/lib/rubyboy/cpu.rb:600
65.28 MB rubyboy/lib/rubyboy/ppu.rb:248
38.96 MB rubyboy/lib/rubyboy/cpu.rb:283
35.15 MB rubyboy/lib/rubyboy/cpu.rb:174
19.13 MB rubyboy/lib/rubyboy/cpu.rb:85
18.67 MB rubyboy/lib/rubyboy/cpu.rb:272
18.47 MB rubyboy/lib/rubyboy/cpu.rb:87
16.77 MB rubyboy/lib/rubyboy/cpu.rb:234
15.01 MB rubyboy/lib/rubyboy/cpu.rb:292
14.61 MB rubyboy/lib/rubyboy/ppu.rb:239
13.74 MB rubyboy/lib/rubyboy/cpu.rb:231
13.67 MB rubyboy/lib/rubyboy/cpu.rb:79
11.77 MB rubyboy/lib/rubyboy/cpu.rb:172
11.38 MB rubyboy/lib/rubyboy/cpu.rb:166
11.35 MB rubyboy/lib/rubyboy/cpu.rb:167
10.51 MB rubyboy/lib/rubyboy/cpu.rb:96
9.17 MB rubyboy/lib/rubyboy/cpu.rb:114
8.86 MB rubyboy/lib/rubyboy/cpu.rb:86
8.55 MB rubyboy/lib/rubyboy/cpu.rb:219
8.08 MB rubyboy/lib/rubyboy/ppu.rb:244
7.46 MB rubyboy/lib/rubyboy/cpu.rb:280
6.67 MB rubyboy/lib/rubyboy/cpu.rb:227
6.22 MB rubyboy/lib/rubyboy/cpu.rb:173
6.18 MB rubyboy/lib/rubyboy/cpu.rb:63
5.74 MB rubyboy/lib/rubyboy/cpu.rb:256
5.55 MB rubyboy/lib/rubyboy/cpu.rb:113
5.51 MB rubyboy/lib/rubyboy/cpu.rb:178
5.32 MB rubyboy/lib/rubyboy/cpu.rb:61
5.31 MB rubyboy/lib/rubyboy/cpu.rb:123
5.21 MB rubyboy/lib/rubyboy/ppu.rb:236
5.15 MB rubyboy/lib/rubyboy/cpu.rb:294
4.76 MB rubyboy/lib/rubyboy/cpu.rb:229
4.35 MB rubyboy/lib/rubyboy/audio.rb:31
3.60 MB rubyboy/lib/rubyboy/cpu.rb:58
3.34 MB rubyboy/lib/rubyboy/cpu.rb:70
3.30 MB rubyboy/lib/rubyboy/cpu.rb:94
2.95 MB rubyboy/lib/rubyboy/cpu.rb:163
2.87 MB rubyboy/lib/rubyboy/cpu.rb:106
2.64 MB rubyboy/lib/rubyboy/cpu.rb:147
2.12 MB rubyboy/lib/rubyboy/cpu.rb:228
2.12 MB rubyboy/lib/rubyboy/cpu.rb:139
2.01 MB rubyboy/lib/rubyboy/cpu.rb:276
1.90 MB rubyboy/lib/rubyboy/cpu.rb:71
1.87 MB rubyboy/lib/rubyboy/apu.rb:51
1.87 MB rubyboy/lib/rubyboy/apu.rb:58
1.74 MB rubyboy/lib/rubyboy/cpu.rb:66
1.55 MB rubyboy/lib/rubyboy/cpu.rb:171
1.44 MB rubyboy/lib/rubyboy/apu.rb:53
1.44 MB rubyboy/lib/rubyboy/apu.rb:60
1.39 MB rubyboy/lib/rubyboy/apu.rb:52
allocated memory by class
462.20 MB Hash
49.79 MB Array
14.61 MB Enumerator
10.96 MB <memo> (IMEMO)
10.96 MB <ifunc> (IMEMO)
10.06 MB Float
4.35 MB FFI::MemoryPointer
55.88 kB FFI::Pointer
25.68 kB <throw_data> (IMEMO)
6.92 kB <callcache> (IMEMO)
2.96 kB <constcache> (IMEMO)
96.00 B <ment> (IMEMO)
allocated objects by gem
4198796 rubyboy/lib
8 heap-profiler-0.7.0
allocated objects by file
2839605 rubyboy/lib/rubyboy/cpu.rb
1105342 rubyboy/lib/rubyboy/ppu.rb
251462 rubyboy/lib/rubyboy/apu.rb
1294 rubyboy/lib/rubyboy.rb
1048 rubyboy/lib/rubyboy/audio.rb
18 rubyboy/lib/rubyboy/lcd.rb
8 rubyboy/lib/rubyboy/bus.rb
8 heap-profiler-0.7.0/lib/heap_profiler/reporter.rb
5 rubyboy/lib/rubyboy/apu_channels/channel2.rb
5 rubyboy/lib/rubyboy/apu_channels/channel1.rb
3 rubyboy/lib/rubyboy/apu_channels/channel4.rb
2 rubyboy/lib/rubyboy/registers.rb
2 rubyboy/lib/rubyboy/interrupt.rb
1 rubyboy/lib/rubyboy/cartridge/mbc1.rb
1 rubyboy/lib/rubyboy/apu_channels/channel3.rb
allocated objects by location
689584 rubyboy/lib/rubyboy/ppu.rb:248
486434 rubyboy/lib/rubyboy/cpu.rb:600
273889 rubyboy/lib/rubyboy/ppu.rb:239
243478 rubyboy/lib/rubyboy/cpu.rb:283
219714 rubyboy/lib/rubyboy/cpu.rb:174
119570 rubyboy/lib/rubyboy/cpu.rb:85
116703 rubyboy/lib/rubyboy/cpu.rb:272
115434 rubyboy/lib/rubyboy/cpu.rb:87
104839 rubyboy/lib/rubyboy/cpu.rb:234
93804 rubyboy/lib/rubyboy/cpu.rb:292
91296 rubyboy/lib/rubyboy/ppu.rb:236
85878 rubyboy/lib/rubyboy/cpu.rb:231
85438 rubyboy/lib/rubyboy/cpu.rb:79
73590 rubyboy/lib/rubyboy/cpu.rb:172
71146 rubyboy/lib/rubyboy/cpu.rb:166
70944 rubyboy/lib/rubyboy/cpu.rb:167
65709 rubyboy/lib/rubyboy/cpu.rb:96
57340 rubyboy/lib/rubyboy/cpu.rb:114
55390 rubyboy/lib/rubyboy/cpu.rb:86
53465 rubyboy/lib/rubyboy/cpu.rb:219
50512 rubyboy/lib/rubyboy/ppu.rb:244
46849 rubyboy/lib/rubyboy/apu.rb:51
46847 rubyboy/lib/rubyboy/apu.rb:58
46596 rubyboy/lib/rubyboy/cpu.rb:280
41691 rubyboy/lib/rubyboy/cpu.rb:227
38898 rubyboy/lib/rubyboy/cpu.rb:173
38615 rubyboy/lib/rubyboy/cpu.rb:63
36024 rubyboy/lib/rubyboy/apu.rb:53
36023 rubyboy/lib/rubyboy/apu.rb:60
35883 rubyboy/lib/rubyboy/cpu.rb:256
34737 rubyboy/lib/rubyboy/apu.rb:52
34736 rubyboy/lib/rubyboy/apu.rb:59
34695 rubyboy/lib/rubyboy/cpu.rb:113
34468 rubyboy/lib/rubyboy/cpu.rb:178
33268 rubyboy/lib/rubyboy/cpu.rb:61
33186 rubyboy/lib/rubyboy/cpu.rb:123
32219 rubyboy/lib/rubyboy/cpu.rb:294
29773 rubyboy/lib/rubyboy/cpu.rb:229
22490 rubyboy/lib/rubyboy/cpu.rb:58
20864 rubyboy/lib/rubyboy/cpu.rb:70
20604 rubyboy/lib/rubyboy/cpu.rb:94
18442 rubyboy/lib/rubyboy/cpu.rb:163
17946 rubyboy/lib/rubyboy/cpu.rb:106
16512 rubyboy/lib/rubyboy/cpu.rb:147
13241 rubyboy/lib/rubyboy/cpu.rb:228
13230 rubyboy/lib/rubyboy/cpu.rb:139
12579 rubyboy/lib/rubyboy/cpu.rb:276
11893 rubyboy/lib/rubyboy/cpu.rb:71
10850 rubyboy/lib/rubyboy/cpu.rb:66
9706 rubyboy/lib/rubyboy/cpu.rb:171
allocated objects by class
2888757 Hash
416967 Array
273888 <memo> (IMEMO)
273888 <ifunc> (IMEMO)
251442 Float
91296 Enumerator
1040 FFI::MemoryPointer
642 <throw_data> (IMEMO)
635 FFI::Pointer
173 <callcache> (IMEMO)
74 <constcache> (IMEMO)
2 <ment> (IMEMO)
retained memory by gem
9.81 kB rubyboy/lib
320.00 B heap-profiler-0.7.0
retained memory by file
3.92 kB rubyboy/lib/rubyboy/cpu.rb
2.20 kB rubyboy/lib/rubyboy/ppu.rb
960.00 B rubyboy/lib/rubyboy.rb
720.00 B rubyboy/lib/rubyboy/lcd.rb
720.00 B rubyboy/lib/rubyboy/apu.rb
328.00 B rubyboy/lib/rubyboy/audio.rb
320.00 B rubyboy/lib/rubyboy/bus.rb
320.00 B heap-profiler-0.7.0/lib/heap_profiler/reporter.rb
160.00 B rubyboy/lib/rubyboy/apu_channels/channel2.rb
160.00 B rubyboy/lib/rubyboy/apu_channels/channel1.rb
120.00 B rubyboy/lib/rubyboy/apu_channels/channel4.rb
80.00 B rubyboy/lib/rubyboy/registers.rb
40.00 B rubyboy/lib/rubyboy/interrupt.rb
40.00 B rubyboy/lib/rubyboy/cartridge/mbc1.rb
40.00 B rubyboy/lib/rubyboy/apu_channels/channel3.rb
retained memory by location
160.00 B rubyboy/lib/rubyboy.rb:79
160.00 B rubyboy/lib/rubyboy.rb:78
152.00 B rubyboy/lib/rubyboy/ppu.rb:248
120.00 B rubyboy/lib/rubyboy/lcd.rb:28
120.00 B rubyboy/lib/rubyboy/audio.rb:34
80.00 B rubyboy/lib/rubyboy/registers.rb:73
80.00 B rubyboy/lib/rubyboy/ppu.rb:209
80.00 B rubyboy/lib/rubyboy/ppu.rb:107
80.00 B rubyboy/lib/rubyboy/lcd.rb:45
80.00 B rubyboy/lib/rubyboy/lcd.rb:44
80.00 B rubyboy/lib/rubyboy/lcd.rb:35
80.00 B rubyboy/lib/rubyboy/lcd.rb:31
80.00 B rubyboy/lib/rubyboy/lcd.rb:30
80.00 B rubyboy/lib/rubyboy/lcd.rb:29
80.00 B rubyboy/lib/rubyboy/cpu.rb:23
80.00 B rubyboy/lib/rubyboy/cpu.rb:1248
80.00 B rubyboy/lib/rubyboy/audio.rb:27
80.00 B rubyboy/lib/rubyboy/apu_channels/channel2.rb:76
80.00 B rubyboy/lib/rubyboy/apu.rb:65
80.00 B rubyboy/lib/rubyboy/apu.rb:51
80.00 B rubyboy/lib/rubyboy/apu.rb:50
80.00 B rubyboy/lib/rubyboy.rb:75
80.00 B rubyboy/lib/rubyboy.rb:74
80.00 B rubyboy/lib/rubyboy.rb:43
80.00 B heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:58
80.00 B heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:53
80.00 B heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:52
48.00 B rubyboy/lib/rubyboy/ppu.rb:239
48.00 B rubyboy/lib/rubyboy/audio.rb:28
40.00 B rubyboy/lib/rubyboy/ppu.rb:306
40.00 B rubyboy/lib/rubyboy/ppu.rb:180
40.00 B rubyboy/lib/rubyboy/ppu.rb:179
40.00 B rubyboy/lib/rubyboy/lcd.rb:27
40.00 B rubyboy/lib/rubyboy/cpu.rb:92
40.00 B rubyboy/lib/rubyboy/cpu.rb:894
40.00 B rubyboy/lib/rubyboy/cpu.rb:76
40.00 B rubyboy/lib/rubyboy/cpu.rb:748
40.00 B rubyboy/lib/rubyboy/cpu.rb:69
40.00 B rubyboy/lib/rubyboy/cpu.rb:64
40.00 B rubyboy/lib/rubyboy/cpu.rb:275
40.00 B rubyboy/lib/rubyboy/cpu.rb:254
40.00 B rubyboy/lib/rubyboy/cpu.rb:219
40.00 B rubyboy/lib/rubyboy/cpu.rb:1027
40.00 B rubyboy/lib/rubyboy/bus.rb:87
40.00 B rubyboy/lib/rubyboy.rb:81
40.00 B rubyboy/lib/rubyboy.rb:80
40.00 B rubyboy/lib/rubyboy.rb:76
40.00 B rubyboy/lib/rubyboy.rb:45
40.00 B rubyboy/lib/rubyboy.rb:44
40.00 B rubyboy/lib/rubyboy.rb:39
retained memory by class
6.96 kB <callcache> (IMEMO)
3.00 kB <constcache> (IMEMO)
96.00 B <ment> (IMEMO)
72.00 B Thread::Mutex
retained objects by gem
244 rubyboy/lib
8 heap-profiler-0.7.0
retained objects by file
98 rubyboy/lib/rubyboy/cpu.rb
54 rubyboy/lib/rubyboy/ppu.rb
24 rubyboy/lib/rubyboy.rb
18 rubyboy/lib/rubyboy/lcd.rb
18 rubyboy/lib/rubyboy/apu.rb
8 rubyboy/lib/rubyboy/bus.rb
8 rubyboy/lib/rubyboy/audio.rb
8 heap-profiler-0.7.0/lib/heap_profiler/reporter.rb
4 rubyboy/lib/rubyboy/apu_channels/channel2.rb
4 rubyboy/lib/rubyboy/apu_channels/channel1.rb
3 rubyboy/lib/rubyboy/apu_channels/channel4.rb
2 rubyboy/lib/rubyboy/registers.rb
1 rubyboy/lib/rubyboy/interrupt.rb
1 rubyboy/lib/rubyboy/cartridge/mbc1.rb
1 rubyboy/lib/rubyboy/apu_channels/channel3.rb
retained objects by location
4 rubyboy/lib/rubyboy.rb:79
4 rubyboy/lib/rubyboy.rb:78
3 rubyboy/lib/rubyboy/ppu.rb:248
3 rubyboy/lib/rubyboy/lcd.rb:28
3 rubyboy/lib/rubyboy/audio.rb:34
2 rubyboy/lib/rubyboy/registers.rb:73
2 rubyboy/lib/rubyboy/ppu.rb:209
2 rubyboy/lib/rubyboy/ppu.rb:107
2 rubyboy/lib/rubyboy/lcd.rb:45
2 rubyboy/lib/rubyboy/lcd.rb:44
2 rubyboy/lib/rubyboy/lcd.rb:35
2 rubyboy/lib/rubyboy/lcd.rb:31
2 rubyboy/lib/rubyboy/lcd.rb:30
2 rubyboy/lib/rubyboy/lcd.rb:29
2 rubyboy/lib/rubyboy/cpu.rb:23
2 rubyboy/lib/rubyboy/cpu.rb:1248
2 rubyboy/lib/rubyboy/audio.rb:27
2 rubyboy/lib/rubyboy/apu_channels/channel2.rb:76
2 rubyboy/lib/rubyboy/apu.rb:65
2 rubyboy/lib/rubyboy/apu.rb:51
2 rubyboy/lib/rubyboy/apu.rb:50
2 rubyboy/lib/rubyboy.rb:75
2 rubyboy/lib/rubyboy.rb:74
2 rubyboy/lib/rubyboy.rb:43
2 heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:58
2 heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:53
2 heap-profiler-0.7.0/lib/heap_profiler/reporter.rb:52
1 rubyboy/lib/rubyboy/ppu.rb:306
1 rubyboy/lib/rubyboy/ppu.rb:193
1 rubyboy/lib/rubyboy/ppu.rb:180
1 rubyboy/lib/rubyboy/ppu.rb:179
1 rubyboy/lib/rubyboy/lcd.rb:37
1 rubyboy/lib/rubyboy/lcd.rb:36
1 rubyboy/lib/rubyboy/lcd.rb:27
1 rubyboy/lib/rubyboy/cpu.rb:92
1 rubyboy/lib/rubyboy/cpu.rb:894
1 rubyboy/lib/rubyboy/cpu.rb:76
1 rubyboy/lib/rubyboy/cpu.rb:748
1 rubyboy/lib/rubyboy/cpu.rb:69
1 rubyboy/lib/rubyboy/cpu.rb:275
1 rubyboy/lib/rubyboy/cpu.rb:254
1 rubyboy/lib/rubyboy/cpu.rb:219
1 rubyboy/lib/rubyboy/cpu.rb:1027
1 rubyboy/lib/rubyboy/bus.rb:87
1 rubyboy/lib/rubyboy.rb:81
1 rubyboy/lib/rubyboy.rb:80
1 rubyboy/lib/rubyboy.rb:76
1 rubyboy/lib/rubyboy.rb:45
1 rubyboy/lib/rubyboy.rb:44
1 rubyboy/lib/rubyboy.rb:39
retained objects by class
174 <callcache> (IMEMO)
75 <constcache> (IMEMO)
2 <ment> (IMEMO)
1 Thread::Mutex
Allocated String Report
Retained String Report
rubyboy % stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 12679 (6.31% miss rate)
GC: 2873 (22.66%)
2092 (16.5%) 2092 (16.5%) Integer#<=>
1480 (11.7%) 1480 (11.7%) (sweeping)
2873 (22.7%) 1320 (10.4%) (garbage collection)
1180 (9.3%) 1180 (9.3%) Rubyboy::Ppu#to_signed_byte
1153 (9.1%) 1153 (9.1%) Rubyboy::SDL.RenderClear
749 (5.9%) 749 (5.9%) Rubyboy::Ppu#get_pixel
2181 (17.2%) 739 (5.8%) Rubyboy::Ppu#render_bg
1691 (13.3%) 608 (4.8%) Rubyboy::Ppu#render_window
868 (6.8%) 378 (3.0%) Rubyboy::Ppu#render_sprites
300 (2.4%) 300 (2.4%) Integer#>>
770 (6.1%) 292 (2.3%) Enumerable#each_slice
5044 (39.8%) 290 (2.3%) Rubyboy::Ppu#step
221 (1.7%) 221 (1.7%) Rubyboy::Registers#read8
220 (1.7%) 220 (1.7%) Rubyboy::SDL.UpdateTexture
3939 (31.1%) 189 (1.5%) Integer#times
184 (1.5%) 184 (1.5%) Rubyboy::Timer#step
388 (3.1%) 118 (0.9%) Rubyboy::Ppu#get_color
109 (0.9%) 109 (0.9%) Rubyboy::Registers#write8
105 (0.8%) 105 (0.8%) Array#size
664 (5.2%) 74 (0.6%) Rubyboy::Cartridge::Mbc1#read_byte
73 (0.6%) 73 (0.6%) (marking)
749 (5.9%) 68 (0.5%) Array#each
100 (0.8%) 68 (0.5%) Rubyboy::Cpu#flags
66 (0.5%) 66 (0.5%) Rubyboy::Cpu#increment_pc_by_byte
1206 (9.5%) 64 (0.5%) Rubyboy::Cpu#ld8
61 (0.5%) 61 (0.5%) Rubyboy::Registers#read16
9800 (77.3%) 57 (0.4%) Rubyboy::Console#bench
2137 (16.9%) 48 (0.4%) Range#===
1434 (11.3%) 38 (0.3%) Rubyboy::Lcd#draw
35 (0.3%) 35 (0.3%) Rubyboy::SDL.GetKeyboardState
rubyboy % stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 8706 (8.09% miss rate)
GC: 890 (10.22%)
1135 (13.0%) 1135 (13.0%) Rubyboy::SDL.RenderClear
1034 (11.9%) 1034 (11.9%) Integer#<=>
2797 (32.1%) 1026 (11.8%) Rubyboy::Ppu#render_bg
939 (10.8%) 939 (10.8%) Rubyboy::Ppu#to_signed_byte
633 (7.3%) 633 (7.3%) Rubyboy::Ppu#get_pixel
455 (5.2%) 455 (5.2%) (sweeping)
1109 (12.7%) 454 (5.2%) Rubyboy::Ppu#render_sprites
890 (10.2%) 405 (4.7%) (garbage collection)
899 (10.3%) 387 (4.4%) Enumerable#each_slice
4575 (52.5%) 280 (3.2%) Rubyboy::Ppu#step
247 (2.8%) 247 (2.8%) Rubyboy::SDL.UpdateTexture
231 (2.7%) 231 (2.7%) Integer#>>
3311 (38.0%) 192 (2.2%) Integer#times
164 (1.9%) 164 (1.9%) Rubyboy::Timer#step
374 (4.3%) 139 (1.6%) Rubyboy::Ppu#render_window
116 (1.3%) 116 (1.3%) Rubyboy::Registers#read8
96 (1.1%) 96 (1.1%) Array#size
298 (3.4%) 87 (1.0%) Rubyboy::Ppu#get_color
455 (5.2%) 62 (0.7%) Rubyboy::Cartridge::Mbc1#read_byte
7807 (89.7%) 55 (0.6%) Rubyboy::Console#bench
974 (11.2%) 51 (0.6%) Array#each
48 (0.6%) 48 (0.6%) Rubyboy::Registers#write8
47 (0.5%) 47 (0.5%) Rubyboy::Registers#read16
52 (0.6%) 33 (0.4%) Rubyboy::Cpu#flags
30 (0.3%) 30 (0.3%) (marking)
1434 (16.5%) 30 (0.3%) Rubyboy::Lcd#draw
1036 (11.9%) 30 (0.3%) Range#===
25 (0.3%) 25 (0.3%) Rubyboy::Interrupt#interrupts
1547 (17.8%) 25 (0.3%) Rubyboy::Cpu#exec
21 (0.2%) 21 (0.2%) Rubyboy::SDL.GetKeyboardState
rubyboy % RUBYOPT=--yjit bundle exec rubyboy bench
Ruby: 3.3.0
YJIT: true
1: 29.089206 sec
FPS: 51.56551883884352
rubyboy % bundle exec stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 11807 (5.57% miss rate)
GC: 1554 (13.16%)
2237 (18.9%) 2237 (18.9%) Integer#<=>
1133 (9.6%) 1133 (9.6%) Rubyboy::Ppu#to_signed_byte
1108 (9.4%) 1108 (9.4%) Rubyboy::SDL.RenderClear
1037 (8.8%) 1037 (8.8%) (sweeping)
2236 (18.9%) 804 (6.8%) Rubyboy::Ppu#render_bg
770 (6.5%) 770 (6.5%) Rubyboy::Ppu#get_pixel
1652 (14.0%) 585 (5.0%) Rubyboy::Ppu#render_window
1554 (13.2%) 488 (4.1%) (garbage collection)
386 (3.3%) 386 (3.3%) Integer#>>
888 (7.5%) 364 (3.1%) Rubyboy::Ppu#render_sprites
787 (6.7%) 315 (2.7%) Enumerable#each_slice
5070 (42.9%) 275 (2.3%) Rubyboy::Ppu#step
236 (2.0%) 236 (2.0%) Rubyboy::SDL.UpdateTexture
207 (1.8%) 207 (1.8%) Rubyboy::Timer#step
199 (1.7%) 199 (1.7%) Rubyboy::Registers#a=
1007 (8.5%) 189 (1.6%) Rubyboy::Cpu#get_value
3985 (33.8%) 189 (1.6%) Integer#times
130 (1.1%) 130 (1.1%) Rubyboy::Cpu#flags
116 (1.0%) 116 (1.0%) Array#size
380 (3.2%) 96 (0.8%) Rubyboy::Ppu#get_color
719 (6.1%) 90 (0.8%) Rubyboy::Cartridge::Mbc1#read_byte
2302 (19.5%) 71 (0.6%) Range#===
761 (6.4%) 67 (0.6%) Array#each
61 (0.5%) 61 (0.5%) Rubyboy::Registers#hl
10253 (86.8%) 48 (0.4%) Rubyboy::Console#bench
48 (0.4%) 48 (0.4%) Rubyboy::Registers#b=
46 (0.4%) 46 (0.4%) Rubyboy::Cpu#increment_pc_by_byte
1410 (11.9%) 45 (0.4%) Rubyboy::Lcd#draw
1171 (9.9%) 38 (0.3%) Rubyboy::Ppu#get_tile_index
36 (0.3%) 36 (0.3%) Rubyboy::Registers#f=
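- Use heap-profiler to find where memory is being allocated and optimize those spots
- Stop building a new Hash on every call just to read the flags
- Read and write registers without using the `send` method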
when :a then @registers.a = value
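The two allocation fixes can be sketched as below. `SketchRegisters`, `SketchCpu`, `write_register`, `flag_z?` and `flag_c?` are hypothetical names and the register file is cut down to two registers. The only hardware facts assumed are that Z is bit 7 and C is bit 4 of F.

```ruby
# Hypothetical register file, reduced to two 8-bit registers.
class SketchRegisters
  attr_accessor :a, :f

  def initialize
    @a = 0
    @f = 0
  end
end

class SketchCpu
  def initialize
    @registers = SketchRegisters.new
  end

  # Explicit dispatch instead of @registers.send("#{name}=", value).
  def write_register(name, value)
    case name
    when :a then @registers.a = value
    when :f then @registers.f = value & 0xf0
    else raise ArgumentError, "unknown register: #{name}"
    end
  end

  # One predicate per flag instead of building a Hash of all flags per call.
  def flag_z?
    @registers.f[7] == 1
  end

  def flag_c?
    @registers.f[4] == 1
  end
end

cpu = SketchCpu.new
cpu.write_register(:f, 0x90)
p [cpu.flag_z?, cpu.flag_c?]
```

Comparing symbols in a `case` allocates nothing, whereas building a method name for `send` or a flags Hash creates garbage on every instruction.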
rubyboy % RUBYOPT=--yjit bundle exec rubyboy bench
Ruby: 3.3.0
YJIT: true
1: 26.798767 sec
FPS: 55.97272441676141
rubyboy % bundle exec stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 10430 (5.57% miss rate)
GC: 283 (2.71%)
2275 (21.8%) 2275 (21.8%) Integer#<=>
1267 (12.1%) 1267 (12.1%) Rubyboy::SDL.RenderClear
1186 (11.4%) 1186 (11.4%) Rubyboy::Ppu#to_signed_byte
2366 (22.7%) 864 (8.3%) Rubyboy::Ppu#render_bg
784 (7.5%) 784 (7.5%) Rubyboy::Ppu#get_pixel
1773 (17.0%) 641 (6.1%) Rubyboy::Ppu#render_window
992 (9.5%) 415 (4.0%) Rubyboy::Ppu#render_sprites
334 (3.2%) 334 (3.2%) Integer#>>
852 (8.2%) 319 (3.1%) Enumerable#each_slice
5453 (52.3%) 311 (3.0%) Rubyboy::Ppu#step
4199 (40.3%) 213 (2.0%) Integer#times
188 (1.8%) 188 (1.8%) Rubyboy::Timer#step
187 (1.8%) 187 (1.8%) (sweeping)
142 (1.4%) 142 (1.4%) Rubyboy::SDL.UpdateTexture
129 (1.2%) 129 (1.2%) Array#size
426 (4.1%) 114 (1.1%) Rubyboy::Ppu#get_color
851 (8.2%) 109 (1.0%) Array#each
981 (9.4%) 105 (1.0%) Rubyboy::Cpu#get_value
283 (2.7%) 85 (0.8%) (garbage collection)
708 (6.8%) 75 (0.7%) Rubyboy::Cartridge::Mbc1#read_byte
67 (0.6%) 67 (0.6%) Rubyboy::Cpu#increment_pc_by_byte
10147 (97.3%) 66 (0.6%) Rubyboy::Console#bench
2327 (22.3%) 53 (0.5%) Range#===
43 (0.4%) 43 (0.4%) Rubyboy::Registers#hl
37 (0.4%) 37 (0.4%) Rubyboy::Interrupt#interrupts
34 (0.3%) 34 (0.3%) Rubyboy::Registers#a=
2958 (28.4%) 33 (0.3%) Rubyboy::Cpu#exec
1216 (11.7%) 30 (0.3%) Rubyboy::Ppu#get_tile_index
2030 (19.5%) 28 (0.3%) Rubyboy::Bus#read_byte
1455 (14.0%) 26 (0.2%) Rubyboy::Lcd#draw
FPS: 51.56551883884352 -> 55.97272441676141
GC: 13.16% -> 2.71%
→ Cache the handler for each addr ahead of time so that reads and writes run quickly with no comparisons at all (reference:
def read_byte(addr)
case addr
when 0x0000..0x7fff
when 0x8000..0x9fff
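A rough sketch of the caching idea, assuming a bus with only two mapped regions. `SketchBus` and `build_reader` are hypothetical names. The real implementation may well use a different mechanism, since the later profiles show `set_methods` frames. The range checks run once per address at construction time, so `read_byte` itself does no `Range#===` comparisons.

```ruby
# Hypothetical bus with only ROM and VRAM, to show the dispatch idea.
class SketchBus
  def initialize(rom, vram)
    @rom = rom
    @vram = vram
    # One reader per address, built once up front.
    @readers = Array.new(0x10000) { |addr| build_reader(addr) }
  end

  # Plain array lookup plus a call; no case/when on the hot path.
  def read_byte(addr)
    @readers[addr].call
  end

  private

  def build_reader(addr)
    case addr
    when 0x0000..0x7fff then -> { @rom[addr] }
    when 0x8000..0x9fff then -> { @vram[addr - 0x8000] }
    else -> { 0xff } # unmapped in this sketch
    end
  end
end

rom = Array.new(0x8000) { |i| i & 0xff }
vram = Array.new(0x2000, 0)
bus = SketchBus.new(rom, vram)
p bus.read_byte(0x0134)
```

The trade-off is memory: 65536 small callables are kept alive in exchange for comparison-free reads.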
rubyboy % RUBYOPT=--yjit bundle exec rubyboy bench
Ruby: 3.3.0
YJIT: true
1: 21.75409 sec
FPS: 68.95255099156066
rubyboy % bundle exec stackprof stackprof-cpu-myapp.dump
Mode: cpu(1000)
Samples: 9505 (6.87% miss rate)
GC: 325 (3.42%)
1238 (13.0%) 1238 (13.0%) Rubyboy::Ppu#to_signed_byte
1208 (12.7%) 1208 (12.7%) Rubyboy::SDL.RenderClear
2558 (26.9%) 907 (9.5%) Rubyboy::Ppu#render_bg
865 (9.1%) 865 (9.1%) Rubyboy::Ppu#get_pixel
849 (8.9%) 849 (8.9%) Rubyboy::Cartridge::Mbc1#set_methods
1803 (19.0%) 663 (7.0%) Rubyboy::Ppu#render_window
1053 (11.1%) 460 (4.8%) Rubyboy::Ppu#render_sprites
5782 (60.8%) 346 (3.6%) Rubyboy::Ppu#step
906 (9.5%) 343 (3.6%) Enumerable#each_slice
313 (3.3%) 313 (3.3%) Integer#>>
4412 (46.4%) 245 (2.6%) Integer#times
237 (2.5%) 237 (2.5%) (sweeping)
197 (2.1%) 197 (2.1%) Rubyboy::Timer#step
193 (2.0%) 193 (2.0%) Rubyboy::SDL.UpdateTexture
1141 (12.0%) 162 (1.7%) Rubyboy::Bus#set_methods
433 (4.6%) 134 (1.4%) Rubyboy::Ppu#get_color
114 (1.2%) 114 (1.2%) Array#size
478 (5.0%) 109 (1.1%) Rubyboy::Cpu#get_value
918 (9.7%) 99 (1.0%) Array#each
75 (0.8%) 75 (0.8%) Rubyboy::Cpu#increment_pc_by_byte
9180 (96.6%) 68 (0.7%) Rubyboy::Console#bench
325 (3.4%) 65 (0.7%) (garbage collection)
49 (0.5%) 49 (0.5%) Integer#<=>
45 (0.5%) 45 (0.5%) Rubyboy::Interrupt#interrupts
36 (0.4%) 36 (0.4%) Rubyboy::Registers#hl
36 (0.4%) 36 (0.4%) Rubyboy::Registers#a=
1651 (17.4%) 35 (0.4%) Rubyboy::Cpu#exec
1042 (11.0%) 30 (0.3%) Rubyboy::Bus#read_byte
1267 (13.3%) 29 (0.3%) Rubyboy::Ppu#get_tile_index
1448 (15.2%) 26 (0.3%) Rubyboy::Lcd#draw
FPS: 55.97272441676141 -> 68.95255099156066
Integer#<=>: 21.8% -> 0.5%
rubyboy % RUBYOPT=--yjit bundle exec rubyboy bench
Ruby: 3.3.0
YJIT: true
1: 27.340768 sec
FPS: 54.86312601021302