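Helper tools for removing unused Ruby code
I have recently been removing unused code from a Rails project, and I wrote a couple of quick-and-dirty tools to make the work more efficient.
I made two tools. Both visualize code dependencies, but they serve different purposes: one gives a rough picture of the dependencies between files, and the other reports which files depend on a particular file.
For now I care mostly about them working on my machine, so depending on the target code they may fail with some error or other.
The code in this article is released under CC0.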
dependency_graph.rb
This script gives a rough visualization of the dependencies between files. Given a list of files, it outputs the dependencies among those files as a graph.
The implementation is below.
# dependency_graph.rb:
#   Generate a dependency graph of Ruby files.
#
# Usage
#   bin/rails r dependency_graph.rb [--pattern PAT] FILES...
#
# Example
#   bin/rails r dependency_graph.rb --pattern 'UnusedFeature' $(git grep UnusedFeature app/ | cut -d : -f 1 | sort -u)
require 'parser/current'
require 'optparse'
require 'pathname'
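
# Generic AST walker. Calling on_<type> for a node type that has no explicit
# handler falls through to method_missing, which visits the node's children.
class TreeVisitor
  def start(node)
    __send__ :"on_#{node.type}", node, nil
  end

  def method_missing(name, *args)
    if name.to_s.start_with?('on_')
      node = args[0]
      node.children.each do |child|
        __send__(:"on_#{child.type}", child, node) if child.is_a?(Parser::AST::Node)
      end
    else
      super
    end
  end
end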
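
# Collects the constants referenced in an AST, resolving each one against the
# constants that are actually loaded in the running process.
class DepCollector < TreeVisitor
  attr_reader :constants

  class UndeterminedConstant < StandardError; end

  def initialize
    @ns = [Object]
    @constants = []
  end

  # Track the enclosing namespace so that relative names can be resolved.
  def on_class(node, _)
    @ns.push eval_const(node.children[0])
    super
  ensure
    @ns.pop
  end

  def on_const(node, parent)
    unless ignored_const?(node, parent)
      @constants << eval_const(node)
    end
    super
  rescue UndeterminedConstant
    # The name cannot be determined statically (e.g. foo::Bar); skip it.
  end

  alias on_module on_class

  private

  # Ignore the scope part of a qualified name (A in A::B), constants directly
  # under a module node (normally its name), and the name of a class being
  # defined. A superclass still counts as a reference.
  def ignored_const?(node, parent)
    return false if parent.nil?

    case parent.type
    when :const, :module
      return true
    when :class
      return parent.children[0] == node
    end
  end

  # Resolve the constant by trying each enclosing namespace, innermost first.
  def eval_const(node)
    name = const_name(node)
    @ns.reverse_each do |ns|
      return ns.const_get(name)
    rescue NameError
    end
    raise NameError, "#{name} not found"
  end

  # Build the name as written, e.g. A::B or ::A.
  def const_name(node)
    return if node.nil?
    return '' if node.type == :cbase
    raise UndeterminedConstant if node.type != :const

    [*const_name(node.children[0]), node.children[1]].join('::')
  end
end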

puts <<~GRAPHVIZ
  digraph G {
  graph [
  layout = fdp;
  ]
GRAPHVIZ

# Assign a Graphviz node ID (n0, n1, ...) to each file path on first use.
nodes = Hash.new { |h, k| h[k] = "n#{h.size}" }
pat = nil
OptionParser.new do |opts|
  opts.on('--pattern PAT') { |p| pat = Regexp.new(p) }
end.parse!

ARGV.each do |file|
  node = Parser::CurrentRuby.parse(File.read(file))
  c = DepCollector.new
  c.start(node)
  # Keep named classes/modules that match the pattern, and map each one to
  # the file that defines it.
  deps = c.constants.select { |c| c.is_a?(Module) }.uniq.map { |c|
    next unless c.name
    next if pat && !pat.match?(c.name)
    loc = Object.const_source_location(c.name)
    next if loc.empty?
    [Pathname(loc.first).relative_path_from(Dir.pwd).to_s, c.name]
  }.compact
  puts "#{nodes[file]} [ label = \"#{file}\" ]"
  deps.each do |path, name|
    puts "#{nodes[path]} [ label = \"#{path}\" ]" unless nodes.key?(path)
    puts "#{nodes[file]} -> #{nodes[path]} [ label = \"#{name}\" ]"
  end
end
puts '}'
Implementation
The script reports constant references as dependencies. That is, if the constant A is defined in a.rb and b.rb contains a reference to A, then b.rb is considered to depend on a.rb.
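It looks only at constants and ignores things like method dependencies. You might wonder whether that is enough, but as long as you do not need precision, the output turns out to be quite useful. Packwerk does the same kind of thing, and I borrowed the idea from it.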
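Concretely, the script parses each given file with the parser gem, extracts the constant references, and collects the location where each referenced constant is defined.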
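The last step, finding where a referenced constant is defined, can be tried in isolation. The following is a minimal sketch, assuming Ruby 2.7 or later so that Object.const_source_location is available. The defining_file helper, the SketchA module, and the a.rb file name are made up for this illustration and are not part of the script.

```ruby
require 'tmpdir'

# Hypothetical helper: return the path of the file that defines the constant
# named const_name, or nil when Ruby cannot tell (the constant is unknown, or
# it is defined in C and has no Ruby source location).
def defining_file(const_name)
  path, _line = Object.const_source_location(const_name)
  path
end

Dir.mktmpdir do |dir|
  # a.rb and SketchA are made-up names used only for this illustration.
  a_rb = File.join(dir, 'a.rb')
  File.write(a_rb, "module SketchA\n  VALUE = 1\nend\n")
  require a_rb

  puts File.basename(defining_file('SketchA'))
end
```

Because the lookup asks the running process, it only works for constants that are already loaded or autoloadable, which is why the script has to run inside the application.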
The result is printed as Graphviz source.
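To make the output step concrete, here is a small sketch of turning dependency edges into Graphviz source. It reuses the node-numbering idea from the script (a Hash whose default block hands out n0, n1, and so on). The to_dot helper and the edge list are hypothetical, and the layout is simplified compared with what the script prints.

```ruby
# Hypothetical helper: convert [from_file, to_file, constant_name] triples
# into Graphviz (DOT) source.
def to_dot(edges)
  # Assign a short ID (n0, n1, ...) to each file the first time it is seen.
  ids = Hash.new { |h, k| h[k] = "n#{h.size}" }
  lines = ['digraph G {']
  edges.each do |from, to, const|
    [from, to].each do |file|
      # Declare each node once, with the file path as its label.
      lines << "  #{ids[file]} [ label = \"#{file}\" ]" unless ids.key?(file)
    end
    lines << "  #{ids[from]} -> #{ids[to]} [ label = \"#{const}\" ]"
  end
  lines << '}'
  lines.join("\n")
end

# Made-up edge: b.rb references the constant A, which a.rb defines.
puts to_dot([['b.rb', 'a.rb', 'A']])
```

Labeling each edge with the constant name keeps the reason for the dependency visible in the rendered graph.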
Usage
Run the script with the target application's code loaded. For a Rails application, running it with bin/rails r is enough.
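As arguments, it takes the list of files whose dependencies you want to see. If you pass a regular expression to the --pattern option, only constants whose names match that pattern are treated as dependencies.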
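The argument handling can be sketched on its own as follows. This is an illustration rather than the script itself: the parse_args helper, the file names, and the constant names are hypothetical, and it uses the non-destructive OptionParser#parse instead of parse! so that it can be fed a plain array.

```ruby
require 'optparse'

# Hypothetical helper: parse argv the way the script does and return the
# optional pattern together with the remaining file arguments.
def parse_args(argv)
  pat = nil
  files = OptionParser.new { |opts|
    opts.on('--pattern PAT') { |p| pat = Regexp.new(p) }
  }.parse(argv)
  [pat, files]
end

pat, files = parse_args(['--pattern', 'Collection', 'lib/a.rb', 'lib/b.rb'])

# Made-up constant names; without --pattern every name would be kept.
names = ['RBS::Collection::Config', 'RBS::Parser']
kept = names.select { |name| pat.nil? || pat.match?(name) }

puts files.inspect
puts kept.inspect
```

Filtering on the constant name, not the file path, is what lets a single pattern narrow the graph to one feature even when its files are scattered.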
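What are the dependencies good for? I use them as a guide for deciding where to start when deleting code.
For example, if a group of files depend heavily on one another and have few dependencies on anything outside the group, you can consider deleting the whole group at once. And if a file has nothing depending on it, it should be an easy place to start.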
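The second heuristic is easy to compute from an edge list. The sketch below is one hypothetical way to do it; the files_without_dependents helper and the edges are made up, and it only knows about files that appear in the graph, so a file may still be referenced from code that was never passed to the script.

```ruby
# Hypothetical helper: given [from_file, to_file] edges, return the files that
# no other file depends on. Self-references are ignored.
def files_without_dependents(edges)
  files = edges.flatten.uniq
  depended_on = edges.reject { |from, to| from == to }.map { |_from, to| to }
  files - depended_on
end

# Made-up edges: cli.rb uses config.rb and installer.rb,
# and installer.rb uses config.rb.
edges = [
  ['cli.rb', 'config.rb'],
  ['cli.rb', 'installer.rb'],
  ['installer.rb', 'config.rb'],
]
puts files_without_dependents(edges).inspect
```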
Example
As an example, I ran it against the RBS::Collection feature of the rbs gem. (This feature is not going to be removed; it was just a handy example.)
$ bundle exec ruby -Ilib -rfileutils -rrbs -rrbs/cli -rrbs/annotate dependency_graph.rb --pattern 'Collection' $(git grep Collection lib/ | cut -d : -f 1 | sort -u)
Output
digraph G {
graph [
layout = fdp;
]
n0 [ label = "lib/rbs/cli.rb" ]
n1 [ label = "lib/rbs/collection/config.rb" ]
n0 -> n1 [ label = "RBS::Collection::Config" ]
n2 [ label = "lib/rbs/collection/config/lockfile.rb" ]
n0 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n3 [ label = "lib/rbs/collection/installer.rb" ]
n0 -> n3 [ label = "RBS::Collection::Installer" ]
n4 [ label = "lib/rbs/collection/cleaner.rb" ]
n0 -> n4 [ label = "RBS::Collection::Cleaner" ]
n5 [ label = "lib/rbs/collection.rb" ]
n4 [ label = "lib/rbs/collection/cleaner.rb" ]
n4 -> n1 [ label = "RBS::Collection::Config" ]
n1 [ label = "lib/rbs/collection/config.rb" ]
n6 [ label = "lib/rbs/collection/config/lockfile_generator.rb" ]
n1 -> n6 [ label = "RBS::Collection::Config::LockfileGenerator" ]
n7 [ label = "lib/rbs/collection/sources/rubygems.rb" ]
n1 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n8 [ label = "lib/rbs/collection/sources/stdlib.rb" ]
n1 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n9 [ label = "lib/rbs/collection/sources/base.rb" ]
n1 -> n9 [ label = "RBS::Collection::Sources" ]
n2 [ label = "lib/rbs/collection/config/lockfile.rb" ]
n2 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n2 -> n9 [ label = "RBS::Collection::Sources" ]
n2 -> n1 [ label = "RBS::Collection::Config::CollectionNotAvailable" ]
n10 [ label = "lib/rbs/collection/sources/git.rb" ]
n2 -> n10 [ label = "RBS::Collection::Sources::Git" ]
n11 [ label = "lib/rbs/collection/sources/local.rb" ]
n2 -> n11 [ label = "RBS::Collection::Sources::Local" ]
n6 [ label = "lib/rbs/collection/config/lockfile_generator.rb" ]
n6 -> n1 [ label = "RBS::Collection::Config" ]
n6 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n6 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n6 -> n6 [ label = "RBS::Collection::Config::LockfileGenerator::GemfileLockMismatchError" ]
n6 -> n9 [ label = "RBS::Collection::Sources" ]
n3 [ label = "lib/rbs/collection/installer.rb" ]
n3 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n12 [ label = "lib/rbs/collection/sources.rb" ]
n12 -> n10 [ label = "RBS::Collection::Sources::Git" ]
n12 -> n11 [ label = "RBS::Collection::Sources::Local" ]
n12 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n12 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n9 [ label = "lib/rbs/collection/sources/base.rb" ]
n10 [ label = "lib/rbs/collection/sources/git.rb" ]
n10 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n10 -> n10 [ label = "RBS::Collection::Sources::Git::CommandError" ]
n11 [ label = "lib/rbs/collection/sources/local.rb" ]
n11 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n7 [ label = "lib/rbs/collection/sources/rubygems.rb" ]
n7 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n8 [ label = "lib/rbs/collection/sources/stdlib.rb" ]
n8 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n13 [ label = "lib/rbs/environment_loader.rb" ]
n13 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n13 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
}
Feeding this output to Graphviz produced an image like the following.
It should give you a rough feel for the structure.
file_dependency.rb
This script takes a single file and reports which files depend on it.
The implementation is below.
# file_dependency.rb:
#   Report all references to a given file.
#
# Usage:
#   ruby file_dependency.rb FILE
require 'parser/current'
require 'open3'

def sh!(*cmd)
  out, _err, _st = Open3.capture3(*cmd)
  out
end
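
# Generic AST walker. Calling on_<type> for a node type that has no explicit
# handler falls through to method_missing, which visits the node's children.
class TreeVisitor
  def start(node)
    __send__ :"on_#{node.type}", node, nil
  end

  def method_missing(name, *args)
    if name.to_s.start_with?('on_')
      node = args[0]
      node.children.each do |child|
        __send__(:"on_#{child.type}", child, node) if child.is_a?(Parser::AST::Node)
      end
    else
      super
    end
  end
end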
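
# Shared by MethodDef and ConstDef: prints a heading for the definition
# followed by every line that git grep finds for its bare name.
module Entry
  def describe
    <<~DESC
      ## #{full_name}
      #{references.map { |r| "* #{r}" }.join("\n")}
    DESC
  end

  def references
    color = $stdout.tty? ? ['--color'] : []
    out = sh! "git", "grep", "-nwF", *color, '--', name.to_s
    out.split("\n")
  end
end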

MethodDef = Struct.new(:name, :namespace, :singleton_p, keyword_init: true) do
  include Entry

  def full_name
    if namespace.empty?
      "##{name}"
    else
      "#{namespace.join('::')}#{singleton_p ? '.' : '#'}#{name}"
    end
  end
end

ConstDef = Struct.new(:name, :namespace, keyword_init: true) do
  include Entry

  def full_name
    [namespace, name].flatten.join('::')
  end
end

# Collects the method, constant, class, and module definitions in one file.
class MethodDefCollector < TreeVisitor
  def initialize
    @namespace = []
    @singleton_p = false
    @method_defs = []
    @const_defs = []
    @leaf = nil
  end

  attr_reader :method_defs, :const_defs

  def on_def(node, _)
    # Grepping for initialize would yield nothing but noise, so skip it.
    return if node.children[0] == :initialize

    d = MethodDef.new(name: node.children[0], namespace: @namespace, singleton_p: @singleton_p)
    @method_defs << d
  end

  def on_defs(node, _)
    d = MethodDef.new(name: node.children[1], namespace: @namespace, singleton_p: true)
    @method_defs << d
  end

  def on_casgn(node, _)
    @const_defs << ConstDef.new(name: node.children[1], namespace: @namespace)
  end

  def on_class(node, _)
    @namespace = [*@namespace, node.children[0].children[1]]
    @leaf = node
    super
    # Record the class/module itself only when no class or module is nested
    # inside it.
    if @leaf == node
      @const_defs << ConstDef.new(name: node.children[0].children[1], namespace: @namespace[0..-2])
    end
  ensure
    @namespace = @namespace[0..-2]
  end

  alias on_module on_class

  # Methods defined inside class << self are singleton methods.
  def on_sclass(node, _)
    before = @singleton_p
    @singleton_p = true
    super
  ensure
    @singleton_p = before
  end
end

def main
  path = ARGV[0]
  content = File.read(path)
  tree = Parser::CurrentRuby.parse(content)
  c = MethodDefCollector.new
  c.start tree
  puts [*c.const_defs.map(&:describe), *c.method_defs.map(&:describe)].join("\n\n")
end

main
Implementation
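It parses the given file with the parser gem, extracts the method, constant, class, and module definitions, and lists every place that references one of those definitions.
References are found with git grep. Since it is nothing more than git grep, some names produce a large number of false positives, and I sort those out by hand. The one exception is initialize: it would obviously yield nothing but false positives, so it is skipped.
Think of it as a slightly more convenient git grep.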
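The false positives come from matching on the bare name alone. The sketch below approximates what git grep -nwF does, but over in-memory strings so that it runs without a repository. The word_matches helper and the file contents are made up for this illustration, and the regular expression is only my approximation of the -w (whole word) and -F (literal string) options.

```ruby
# Hypothetical helper approximating `git grep -nwF NAME` over in-memory text:
# return "path:line:text" for every line that contains name as a whole word.
def word_matches(files, name)
  # -F: treat the name literally. -w: the match must not be preceded or
  # followed by a word character (letter, digit, or underscore).
  pattern = /(?<![A-Za-z0-9_])#{Regexp.escape(name)}(?![A-Za-z0-9_])/
  files.flat_map do |path, source|
    source.each_line.with_index(1).filter_map do |line, lineno|
      "#{path}:#{lineno}:#{line.chomp}" if pattern.match?(line)
    end
  end
end

# Made-up file contents: two unrelated files define a method with the same
# name, and a third defines a method whose name merely starts with it.
files = {
  'lockfile.rb' => "def to_lockfile\nend\n",
  'git.rb' => "def to_lockfile\nend\n",
  'other.rb' => "def to_lockfile_path\nend\n",
}
puts word_matches(files, 'to_lockfile')
```

The whole-word match filters out to_lockfile_path, but a search for the method in lockfile.rb still reports the unrelated definition in git.rb, because nothing in the search knows which class a method belongs to.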
Usage
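Run it with a single target file. It uses no runtime information, so there is no need to load the application code.
Running it shows where the methods and other definitions in the file appear to be used, which helps you judge whether the file is safe to delete.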
Example
I ran this one against RBS Collection as well.
$ ruby file_dependency.rb lib/rbs/collection/config/lockfile.rb
Result
## RBS::Collection::Config::Lockfile
* benchmark/utils.rb:30: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* core/array.rbs:115:# * Gem::RequestSet::Lockfile::Tokenizer#to_a
* lib/rbs/cli.rb:40: lock = Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/collection/config/lockfile.rb:6: class Lockfile
* lib/rbs/collection/config/lockfile.rb:48: lockfile = Lockfile.new(lockfile_path: lockfile_path, path: path, gemfile_lock_path: gemfile_lock_path)
* lib/rbs/collection/config/lockfile_generator.rb:42: @lockfile = Lockfile.new(
* lib/rbs/collection/config/lockfile_generator.rb:49: @existing_lockfile = Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* lib/rbs/collection/config/lockfile_generator.rb:100: # @type var locked: Lockfile::library?
* lib/rbs/collection/installer.rb:10: @lockfile = Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* sig/collection/config.rbs:26: def self.generate_lockfile: (config_path: Pathname, definition: Bundler::Definition, ?with_lockfile: boolish) -> [Config, Lockfile]
* sig/collection/config/lockfile.rbs:4: # Lockfile represents the `rbs_collection.lock.yaml`, that contains configurations and *resolved* gems with their sources
* sig/collection/config/lockfile.rbs:6: class Lockfile
* sig/collection/config/lockfile.rbs:59: def self.from_lockfile: (lockfile_path: Pathname, data: lockfile_data) -> Lockfile
* sig/collection/config/lockfile_generator.rbs:17: attr_reader lockfile: Lockfile
* sig/collection/config/lockfile_generator.rbs:18: attr_reader existing_lockfile: Lockfile?
* sig/collection/config/lockfile_generator.rbs:28: def self.generate: (config: Config, definition: Bundler::Definition, ?with_lockfile: boolish) -> Lockfile
* sig/collection/config/lockfile_generator.rbs:38: def validate_gemfile_lock_path!: (lock: Lockfile?, gemfile_lock_path: Pathname) -> void
* sig/collection/installer.rbs:4: attr_reader lockfile: Config::Lockfile
* sig/environment_loader.rbs:88: def add_collection: (Collection::Config::Lockfile lockfile) -> void
* test/rbs/environment_loader_test.rb:227: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:262: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:315: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* test/rbs/environment_loader_test.rb:344: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
## RBS::Collection::Config::Lockfile#fullpath
* lib/rbs/collection/config/lockfile.rb:18: def fullpath
* lib/rbs/collection/config/lockfile.rb:74: raise CollectionNotAvailable unless fullpath.exist?
* lib/rbs/collection/config/lockfile.rb:81: meta_path = fullpath.join(gem[:name], gem[:version], Sources::Git::METADATA_FILENAME)
* lib/rbs/collection/config/lockfile.rb:85: raise CollectionNotAvailable unless fullpath.join(gem[:name], gem[:version]).symlink?
* lib/rbs/collection/installer.rb:15: install_to = lockfile.fullpath
* lib/rbs/environment_loader.rb:83: repository.add(lockfile.fullpath)
* sig/collection/config/lockfile.rbs:51: def fullpath: () -> Pathname
* test/rbs/environment_loader_test.rb:239: assert repo.dirs.include? lock.fullpath
## RBS::Collection::Config::Lockfile#gemfile_lock_fullpath
* lib/rbs/collection/config/lockfile.rb:22: def gemfile_lock_fullpath
* lib/rbs/collection/config/lockfile_generator.rb:85: return unless lock.gemfile_lock_fullpath
* lib/rbs/collection/config/lockfile_generator.rb:86: unless File.realpath(lock.gemfile_lock_fullpath) == File.realpath(gemfile_lock_path)
* lib/rbs/collection/config/lockfile_generator.rb:87: raise GemfileLockMismatchError.new(expected: lock.gemfile_lock_fullpath, actual: gemfile_lock_path)
* sig/collection/config/lockfile.rbs:55: %a{pure} def gemfile_lock_fullpath: () -> Pathname?
## RBS::Collection::Config::Lockfile#to_lockfile
* lib/rbs/collection/config/lockfile.rb:28: def to_lockfile
* lib/rbs/collection/config/lockfile.rb:69: "source" => lib[:source].to_lockfile
* lib/rbs/collection/config/lockfile_generator.rb:80: lockfile.lockfile_path.write(YAML.dump(lockfile.to_lockfile))
* lib/rbs/collection/sources/git.rb:113: def to_lockfile
* lib/rbs/collection/sources/git.rb:211: "source" => to_lockfile
* lib/rbs/collection/sources/local.rb:72: def to_lockfile
* lib/rbs/collection/sources/rubygems.rb:36: def to_lockfile
* lib/rbs/collection/sources/stdlib.rb:38: def to_lockfile
* sig/collection/config/lockfile.rbs:57: def to_lockfile: () -> lockfile_data
* sig/collection/sources.rbs:11: def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:62: def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:149: def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:178: def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:204: def to_lockfile: () -> source_entry
* test/rbs/collection/config_test.rb:58: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:109: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:171: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:194: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:253: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:329: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:378: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:419: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:516: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:574: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:630: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:674: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:752: string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:846: string = YAML.dump(lockfile.to_lockfile)
## RBS::Collection::Config::Lockfile.from_lockfile
* benchmark/utils.rb:30: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/cli.rb:40: lock = Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/collection/config/lockfile.rb:42: def self.from_lockfile(lockfile_path:, data:)
* lib/rbs/collection/config/lockfile_generator.rb:49: @existing_lockfile = Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* lib/rbs/collection/installer.rb:10: @lockfile = Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* sig/collection/config/lockfile.rbs:59: def self.from_lockfile: (lockfile_path: Pathname, data: lockfile_data) -> Lockfile
* test/rbs/environment_loader_test.rb:227: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:262: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:315: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* test/rbs/environment_loader_test.rb:344: lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
## RBS::Collection::Config::Lockfile#library_data
* lib/rbs/collection/config/lockfile.rb:33: "gems" => gems.each_value.sort_by {|g| g[:name] }.map {|hash| library_data(hash) },
* lib/rbs/collection/config/lockfile.rb:65: def library_data(lib)
* lib/rbs/collection/config/lockfile.rb:83: raise CollectionNotAvailable unless library_data(gem) == YAML.load(meta_path.read)
* sig/collection/config/lockfile.rbs:11: "gems" => Array[library_data]?, # null if empty
* sig/collection/config/lockfile.rbs:15: type library_data = {
* sig/collection/config/lockfile.rbs:70: def library_data: (library) -> library_data
## RBS::Collection::Config::Lockfile#check_rbs_availability!
* lib/rbs/collection/config/lockfile.rb:73: def check_rbs_availability!
* lib/rbs/environment_loader.rb:81: lockfile.check_rbs_availability!
* sig/collection/config/lockfile.rbs:66: def check_rbs_availability!: () -> void
Wrapping up
That was an introduction to a couple of helper tools for deleting code.