Translated by AI
I Tried Ractor::Port (But...)
I wanted to see how Ractor::Port feels in practice, so I tried it after reading "Ractor::Port: The Story of Revamping Ractor's API", the article that introduced it. It didn't make anything faster, though, so I'm probably using it the wrong way. Corrections are welcome.
Target
I'll skip the details, but the target is Magika, a tool for file format identification. Running Magika involves two steps:
- Extract features from the target file
- Run inference using a machine learning model via ONNX Runtime
The former should be primarily IO-bound and the latter CPU-bound (I'm not using a GPU).
I'll try running each step in Ractors.
Environment
I ran this on an 8-core MacBook Pro.
% sysctl -n hw.physicalcpu_max
8
% ruby -v
ruby 4.0.0preview3 (2025-12-18 master cfa3e7cf75) +PRISM [arm64-darwin25]
Code
If we call the feature extraction part "extractor" and the file format identification via ONNX Runtime "session", and run four Ractors for each, the code might look something like this:
def multi_sessions_multi_extractors
  results = []
  m = 4 # number of session Ractors
  n = 4 # number of extractor Ractors

  # One port per session Ractor, used to announce that the Ractor is idle.
  session_addresses = m.times.collect {Ractor::Port.new}
  sessions = m.times.collect {|i|
    Ractor.new(SESSIONS[i], session_addresses[i]) {|session, address|
      while true
        # Announce readiness, then wait for [features, reply port] on the default port.
        address << Ractor.current
        features, result_address = Ractor.receive
        result = session.infer(features)
        result_address << result
      end
    }
  }

  # Same arrangement for the extractor Ractors.
  extractor_addresses = n.times.collect {Ractor::Port.new}
  extractors = n.times.collect {|i|
    Ractor.new(extractor_addresses[i]) {|address|
      while true
        address << Ractor.current
        path, port = Ractor.receive
        features = extract_features(path)
        port << features
      end
    }
  }

  # Replies from both stages come back to the main Ractor through this port.
  result_address = Ractor::Port.new
  PATHS.each do |path|
    # Pick whichever extractor announced itself as idle, then wait for its reply.
    extractor_address, extractor = Ractor.select(*extractor_addresses)
    extractor.send [path, result_address]
    features = result_address.receive
    session_address, session = Ractor.select(*session_addresses)
    session.send [features, result_address]
    # This destructuring assumes the reply is an array whose second element is the result.
    _, result = result_address.receive
    results << result
  end
  raise "Few results: #{results.length}" unless results.length == PATHS.length
end
PATHS is just a collection of 5,725 files within vendor/bundle.
Since that worked, I implemented and benchmarked the following patterns:
- Single session, single extractor
- Four sessions, one extractor
- One session, four extractors
- Four sessions, four extractors
- Four Ractors combining both session and extractor
                                       user     system      total        real
Single session, single extractor  84.934568   0.216099  85.150667 ( 21.411487)
Multi sessions, single extractor 181.183169   1.419365 182.602534 ( 24.306846)
Single session, multi extractors  83.441309   0.301563  83.742872 ( 21.150998)
Multi sessions, multi extractors  86.485860   0.410438  86.896298 ( 21.868833)
Multi sessions extractors        184.441108   1.257088 185.698196 ( 24.782769)
Hmm, it doesn't really change much...
PATHS.each do |path|
  extractor_address, extractor = Ractor.select(*extractor_addresses)
  extractor.send [path, result_address]
  features = result_address.receive
  session_address, session = Ractor.select(*session_addresses)
  session.send [features, result_address]
  _, result = result_address.receive
  results << result
end
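The loop above ends up being sequential after all: each iteration sends one path, waits for its features, forwards them, and waits for the result before moving on to the next file, so at most one Ractor is working at any moment. I couldn't figure out how to handle this gracefully, the kind of thing you'd solve with a Queue when using threads.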
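A minimal sketch of one way to avoid the send-then-wait pattern, assuming the Ractor::Port API of Ruby 4.0: the main Ractor enqueues every path first and only then collects results, while each extractor forwards its features directly to a paired session Ractor. The names fake_extract, fake_infer, and pipelined_infer, the :done sentinel, and the static round-robin pairing are all hypothetical stand-ins of mine, not the article's code, and nothing here has been run against Magika or ONNX Runtime, so it makes no claim about actual speedup.

```ruby
# Hypothetical stand-ins for extract_features and session.infer.
def fake_extract(path)
  path.bytesize
end

def fake_infer(features)
  features * 2
end

# Assumes Ruby 4.0's Ractor::Port. Returns results in the same order as paths.
def pipelined_infer(paths, workers: 4)
  # Created by the main Ractor, so only the main Ractor can receive from it.
  result_port = Ractor::Port.new

  # Stage 2: each session Ractor reads [index, features] from its default port.
  sessions = workers.times.map do
    Ractor.new(result_port) do |out|
      while (message = Ractor.receive) != :done
        index, features = message
        out << [index, fake_infer(features)]
      end
    end
  end

  # Stage 1: each extractor forwards straight to its paired session Ractor,
  # so the main Ractor never sits between the two stages.
  extractors = sessions.map do |session|
    Ractor.new(session) do |downstream|
      while (message = Ractor.receive) != :done
        index, path = message
        downstream.send([index, fake_extract(path)])
      end
      downstream.send(:done) # propagate shutdown to the paired session
    end
  end

  # Enqueue all the work up front; send does not wait for the receiver.
  paths.each_with_index do |path, index|
    extractors[index % workers].send([index, path])
  end

  # Results arrive in completion order, so the index restores input order.
  results = Array.new(paths.length)
  paths.length.times do
    index, result = result_port.receive
    results[index] = result
  end

  extractors.each { |extractor| extractor.send(:done) }
  (extractors + sessions).each(&:join)
  results
end
```

Calling pipelined_infer(PATHS) would be the equivalent of the loop above. Round-robin assignment is simpler than picking an idle Ractor with Ractor.select, but it does not balance load when some files take much longer than others.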
Also, when I looked at Activity Monitor, CPU usage was around 400% even with only one extractor and one session, so I don't really understand what's happening...
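The Activity Monitor figure can be cross-checked against the benchmark table, since user CPU time divided by real time approximates the average number of cores kept busy. A small sketch of that arithmetic, using only the numbers reported above (the constant TIMINGS and the helper busy_cores are hypothetical names of mine):

```ruby
# [user CPU seconds, real seconds], copied from the benchmark output.
TIMINGS = {
  "Single session, single extractor" => [84.934568, 21.411487],
  "Multi sessions, single extractor" => [181.183169, 24.306846],
  "Single session, multi extractors" => [83.441309, 21.150998],
  "Multi sessions, multi extractors" => [86.485860, 21.868833],
  "Multi sessions extractors"        => [184.441108, 24.782769],
}.freeze

# user / real approximates the average number of busy cores.
def busy_cores(timings)
  timings.transform_values { |user, real| (user / real).round(2) }
end

busy_cores(TIMINGS).each do |label, cores|
  puts format("%-34s %5.2f", label, cores)
end
```

The ratio comes out close to 4 for the three patterns with about 85 seconds of user time, which matches the roughly 400% seen in Activity Monitor, and above 7 for the two patterns whose user time doubled even though real time did not improve. The arithmetic only shows how much CPU was consumed, not which threads consumed it.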
I wonder whether using an extension library is also part of the problem.
Since I'm lost on all of this, I'd be glad if someone posted a blog article or something similar showing the proper way to use it.
I've placed the full code in a snippet.