🚮

Helper tools for deleting unused Ruby code

Published on 2024/09/12

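Lately I have been deleting unused code from a Rails project, and I wrote a couple of quick-and-dirty tools to speed up the work.

I made two tools. Both visualize code dependencies, but they serve different purposes: one gives a rough picture of the dependencies among files, and the other reports which files depend on one specific file.

I prioritized getting them to work on my machine for now, so depending on the target code they may fail with some error.

The code in this article is released under CC0.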

dependency_graph.rb

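This script gives a rough visualization of the dependencies among files. Given a list of files, it outputs the dependencies found in those files as a graph.

The implementation is below.

Source code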
# dependency_graph.rb:
#   Generate a dependency graph of Ruby files.
#
# Usage
#   bin/rails r dependency_graph.rb [--pattern PAT] FILES...
#
# Example
#   bin/rails r dependency_graph.rb --pattern 'UnusedFeature' $(git grep UnusedFeature app/ | cut -d : -f 1 | sort -u)

require 'parser/current'
require 'optparse'
require 'pathname'

class TreeVisitor
  def start(node)
    __send__ :"on_#{node.type}", node, nil
  end

  def method_missing(name, *args)
    if name.to_s.start_with?('on_')
      node = args[0]
      node.children.each do |child|
        __send__(:"on_#{child.type}", child, node) if child.is_a?(Parser::AST::Node)
      end
    else
      super
    end
  end
end

class DepCollector < TreeVisitor
  attr_reader :constants

  class UndeterminedConstant < StandardError; end

  def initialize
    @ns = [Object]
    @constants = []
  end

  def on_class(node, _)
    @ns.push eval_const(node.children[0])
    super
  ensure
    @ns.pop
  end

  def on_const(node, parent)
    unless ignored_const?(node, parent)
      @constants << eval_const(node)
    end

    super
  rescue UndeterminedConstant
  end

  alias on_module on_class

  private

  def ignored_const?(node, parent)
    return false if parent.nil?
    case parent.type
    when :const, :module
      return true
    when :class
      return parent.children[0] == node
    end
  end

  def eval_const(node)
    name = const_name(node)
    @ns.reverse_each do |ns|
      return ns.const_get(name)
    rescue NameError
    end
    raise NameError, "#{name} not found"
  end

  def const_name(node)
    return if node.nil?
    return '' if node.type == :cbase
    raise UndeterminedConstant if node.type != :const

    [*const_name(node.children[0]), node.children[1]].join('::')
  end
end

puts <<~GRAPHVIZ
  digraph G {
    graph [
        layout = fdp;
    ]
GRAPHVIZ

nodes = Hash.new { |h, k| h[k] = "n#{h.size}" }

pat = nil

OptionParser.new do |opts|
  opts.on('--pattern PAT') { |p| pat = Regexp.new(p) }
end.parse!

ARGV.each do |file|
  node = Parser::CurrentRuby.parse(File.read(file))
  c = DepCollector.new
  c.start(node)

  deps = c.constants.select { |c| c.is_a?(Module) }.uniq.map { |c|
    next unless c.name
    next if pat && !pat.match?(c.name)
    loc = Object.const_source_location(c.name)
    next if loc.nil? || loc.empty?

    [Pathname(loc.first).relative_path_from(Dir.pwd).to_s, c.name]
  }.compact

  puts "#{nodes[file]} [ label = \"#{file}\" ]"
  deps.each do |path, name|
    puts "#{nodes[path]} [ label = \"#{path}\" ]" unless nodes.key?(path)
    puts "#{nodes[file]} -> #{nodes[path]} [ label = \"#{name}\" ]"
  end
end

puts '}'

Implementation

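It reports constant references as dependencies. That is, if a.rb defines the constant A and b.rb contains a reference to A, then b.rb is considered to depend on a.rb.
It only looks at relationships between constants and ignores things like method-level dependencies. You might wonder whether that is good enough, but as long as you do not need strict accuracy the output is quite useful. Packwerk does something similar, and I borrowed the idea from it.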
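To make the idea concrete, here is a minimal sketch that lists the constant names appearing in a piece of source. It is not the article's implementation: it swaps in Ruby's bundled Ripper lexer for the parser gem, `constant_tokens` is a hypothetical helper name, and it returns individual name segments rather than resolved constant paths.

```ruby
require 'ripper'

# Hypothetical helper (not part of dependency_graph.rb): list the constant
# tokens that appear in Ruby source, in order of first appearance.
# Ripper.lex yields [[line, col], event, token, state] entries, and
# constant names are reported with the :on_const event.
def constant_tokens(source)
  Ripper.lex(source)
        .select { |_pos, event, _token, _state| event == :on_const }
        .map { |_pos, _event, token, _state| token }
        .uniq
end

# Stand-in for the contents of b.rb, which references the constant A.
b_rb = <<~RUBY
  class B
    def run
      A.new.call
    end
  end
RUBY

p constant_tokens(b_rb)
```

Here `A` shows up among the tokens of b.rb, so b.rb would count as depending on whichever file defines `A`. Note that this lexer-level approach also reports the class being defined (`B`), which the real script filters out.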

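Concretely, it parses each given file with the parser gem, extracts the constant references, and collects the location where each referenced constant is defined (resolved at runtime with const_get and Object.const_source_location).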
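The "find where a constant is defined" step can be tried in isolation. The sketch below is only an illustration: `defining_file` is a hypothetical helper, and the temporary file stands in for application code that has already been loaded.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of dependency_graph.rb): return the file that
# defines the named constant, or nil when the constant is unknown or has no
# Ruby source location (for example, classes implemented in C).
def defining_file(const_name)
  loc = Object.const_source_location(const_name)
  return nil if loc.nil? || loc.empty?

  loc.first
end

# Stand-in for "load the application": write a tiny file and load it.
Dir.mktmpdir do |dir|
  demo_path = File.join(dir, 'demo_a.rb')
  File.write(demo_path, "module DemoA\n  class Worker; end\nend\n")
  load demo_path

  puts File.basename(defining_file('DemoA::Worker'))
  p defining_file('String')             # defined in C, so no source location
  p defining_file('NoSuchConstantHere') # not defined at all
end
```

This is also why the script has to run with the target code loaded: a constant that has not been loaded has no location to look up.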

The result is printed as Graphviz (DOT) code.
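The node numbering relies on a Hash whose default block assigns the next `n<index>` id the first time a file is seen. A minimal sketch of that output step, assuming the edges have already been collected as `[from, to, label]` triples (`to_dot` is a hypothetical helper, not a function from the script):

```ruby
# Hypothetical helper: turn [from_file, to_file, constant_name] triples into
# Graphviz DOT text, declaring each file node once.
def to_dot(edges)
  # The first lookup of an unseen file assigns it the next id: n0, n1, ...
  ids = Hash.new { |h, k| h[k] = "n#{h.size}" }
  lines = ['digraph G {']
  edges.each do |from, to, label|
    [from, to].each do |file|
      next if ids.key?(file)

      lines << %(  #{ids[file]} [ label = "#{file}" ])
    end
    lines << %(  #{ids[from]} -> #{ids[to]} [ label = "#{label}" ])
  end
  lines << '}'
  lines.join("\n")
end

puts to_dot([['b.rb', 'a.rb', 'A']])
```

Unlike the script, this sketch never prints the same node declaration twice. Graphviz accepts repeated declarations, so the difference is cosmetic.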

Usage

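Run the script with the target application's code loaded. For Rails, running it with bin/rails r is enough.

As arguments it takes the list of files whose dependencies you want. If you also pass a regular expression to the --pattern option, only constants whose names match that pattern are treated as dependencies.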

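What is the dependency output for? It serves as a guide to where to start when deleting code.
For example, if a group of files depend heavily on each other and have few dependencies with anything outside the group, you can consider deleting the whole group together. And if some file is not depended on by anything, deleting it should be an easy place to start.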
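Finding the files that nothing depends on can be done mechanically from the generated DOT text. The following is a rough sketch under the assumption that the input has exactly the line shapes dependency_graph.rb prints. `files_without_dependents` is a hypothetical helper, and "no dependents" only means no dependents among the analyzed files.

```ruby
# Hypothetical helper (not part of the article's scripts): given DOT text in
# the shape dependency_graph.rb prints, return the labels of nodes that no
# other node points at. Self-references (n2 -> n2) are ignored.
def files_without_dependents(dot_text)
  labels = {}
  referenced = {}
  dot_text.each_line do |line|
    if (m = line.match(/\A\s*(n\d+) -> (n\d+) /))
      referenced[m[2]] = true unless m[1] == m[2]
    elsif (m = line.match(/\A\s*(n\d+) \[ label = "(.*)" \]/))
      labels[m[1]] = m[2]
    end
  end
  labels.reject { |id, _label| referenced.key?(id) }.values
end

# Made-up sample input for illustration.
SAMPLE_DOT = <<~DOT
  digraph G {
  n0 [ label = "lib/cli.rb" ]
  n1 [ label = "lib/config.rb" ]
  n0 -> n1 [ label = "Config" ]
  n1 -> n1 [ label = "Config::Error" ]
  n2 [ label = "lib/legacy.rb" ]
  }
DOT

p files_without_dependents(SAMPLE_DOT)
```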

Example

As an example, I ran it against the RBS::Collection feature of the rbs gem. (This feature is not going to be removed; it is just a handy example.)

$ bundle exec ruby -Ilib -rfileutils -rrbs -rrbs/cli -rrbs/annotate dependency_graph.rb --pattern 'Collection' $(git grep Collection lib/ | cut -d : -f 1 | sort -u)
Output
digraph G {
  graph [
      layout = fdp;
  ]
n0 [ label = "lib/rbs/cli.rb" ]
n1 [ label = "lib/rbs/collection/config.rb" ]
n0 -> n1 [ label = "RBS::Collection::Config" ]
n2 [ label = "lib/rbs/collection/config/lockfile.rb" ]
n0 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n3 [ label = "lib/rbs/collection/installer.rb" ]
n0 -> n3 [ label = "RBS::Collection::Installer" ]
n4 [ label = "lib/rbs/collection/cleaner.rb" ]
n0 -> n4 [ label = "RBS::Collection::Cleaner" ]
n5 [ label = "lib/rbs/collection.rb" ]
n4 [ label = "lib/rbs/collection/cleaner.rb" ]
n4 -> n1 [ label = "RBS::Collection::Config" ]
n1 [ label = "lib/rbs/collection/config.rb" ]
n6 [ label = "lib/rbs/collection/config/lockfile_generator.rb" ]
n1 -> n6 [ label = "RBS::Collection::Config::LockfileGenerator" ]
n7 [ label = "lib/rbs/collection/sources/rubygems.rb" ]
n1 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n8 [ label = "lib/rbs/collection/sources/stdlib.rb" ]
n1 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n9 [ label = "lib/rbs/collection/sources/base.rb" ]
n1 -> n9 [ label = "RBS::Collection::Sources" ]
n2 [ label = "lib/rbs/collection/config/lockfile.rb" ]
n2 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n2 -> n9 [ label = "RBS::Collection::Sources" ]
n2 -> n1 [ label = "RBS::Collection::Config::CollectionNotAvailable" ]
n10 [ label = "lib/rbs/collection/sources/git.rb" ]
n2 -> n10 [ label = "RBS::Collection::Sources::Git" ]
n11 [ label = "lib/rbs/collection/sources/local.rb" ]
n2 -> n11 [ label = "RBS::Collection::Sources::Local" ]
n6 [ label = "lib/rbs/collection/config/lockfile_generator.rb" ]
n6 -> n1 [ label = "RBS::Collection::Config" ]
n6 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n6 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n6 -> n6 [ label = "RBS::Collection::Config::LockfileGenerator::GemfileLockMismatchError" ]
n6 -> n9 [ label = "RBS::Collection::Sources" ]
n3 [ label = "lib/rbs/collection/installer.rb" ]
n3 -> n2 [ label = "RBS::Collection::Config::Lockfile" ]
n12 [ label = "lib/rbs/collection/sources.rb" ]
n12 -> n10 [ label = "RBS::Collection::Sources::Git" ]
n12 -> n11 [ label = "RBS::Collection::Sources::Local" ]
n12 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
n12 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n9 [ label = "lib/rbs/collection/sources/base.rb" ]
n10 [ label = "lib/rbs/collection/sources/git.rb" ]
n10 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n10 -> n10 [ label = "RBS::Collection::Sources::Git::CommandError" ]
n11 [ label = "lib/rbs/collection/sources/local.rb" ]
n11 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n7 [ label = "lib/rbs/collection/sources/rubygems.rb" ]
n7 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n8 [ label = "lib/rbs/collection/sources/stdlib.rb" ]
n8 -> n9 [ label = "RBS::Collection::Sources::Base" ]
n13 [ label = "lib/rbs/environment_loader.rb" ]
n13 -> n7 [ label = "RBS::Collection::Sources::Rubygems" ]
n13 -> n8 [ label = "RBS::Collection::Sources::Stdlib" ]
}

Feeding this output to Graphviz produced the following image.

(Figure: the rendered dependency graph)

It should give you a rough feel for the structure.

file_dependency.rb

This one targets a single file and reports which files depend on it.

The implementation is below.

Source code
# file_dependency.rb:
#   Report all references to a given file.
#
# Usage:
#   ruby file_dependency.rb FILE

require 'parser/current'
require 'open3'

def sh!(*cmd)
  out, _err, _st = Open3.capture3(*cmd)
  out
end

class TreeVisitor
  def start(node)
    __send__ :"on_#{node.type}", node, nil
  end

  def method_missing(name, *args)
    if name.to_s.start_with?('on_')
      node = args[0]
      node.children.each do |child|
        __send__(:"on_#{child.type}", child, node) if child.is_a?(Parser::AST::Node)
      end
    else
      super
    end
  end
end

module Entry
  def describe
    <<~DESC
      ## #{full_name}

      #{references.map { |r| "* #{r}" }.join("\n")}
    DESC
  end

  def references
    color = $stdout.tty? ? ['--color'] : []
    out = sh! "git", "grep", "-nwF", *color, '--', name.to_s
    out.split("\n")
  end
end

MethodDef = Struct.new(:name, :namespace, :singleton_p, keyword_init: true) do
  include Entry

  def full_name
    if namespace.empty?
      "##{name}"
    else
      "#{namespace.join('::')}#{singleton_p ? '.' : '#'}#{name}"
    end
  end
end

ConstDef = Struct.new(:name, :namespace, keyword_init: true) do
  include Entry

  def full_name
    [namespace, name].flatten.join('::')
  end
end

class MethodDefCollector < TreeVisitor
  def initialize
    @namespace = []
    @singleton_p = false
    @method_defs = []
    @const_defs = []
    @leaf = nil
  end

  attr_reader :method_defs, :const_defs

  def on_def(node, _)
    return if node.children[0] == :initialize

    d = MethodDef.new(name: node.children[0], namespace: @namespace, singleton_p: @singleton_p)
    @method_defs << d
  end

  def on_defs(node, _)
    d = MethodDef.new(name: node.children[1], namespace: @namespace, singleton_p: true)
    @method_defs << d
  end

  def on_casgn(node, _)
    @const_defs << ConstDef.new(name: node.children[1], namespace: @namespace)
  end

  def on_class(node, _)
    @namespace = [*@namespace, node.children[0].children[1]]
    @leaf = node
    super
    if @leaf == node
      @const_defs << ConstDef.new(name: node.children[0].children[1], namespace: @namespace[0..-2])
    end
  ensure
    @namespace = @namespace[0..-2]
  end

  alias on_module on_class

  def on_sclass(node, _)
    before = @singleton_p
    @singleton_p = true
    super
  ensure
    @singleton_p = before
  end
end

def main
  path = ARGV[0]
  content = File.read(path)
  tree = Parser::CurrentRuby.parse(content)

  c = MethodDefCollector.new
  c.start tree

  puts [*c.const_defs.map(&:describe), *c.method_defs.map(&:describe)].join("\n\n")
end

main

Implementation

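It parses the given file with the parser gem, extracts the method, constant, class, and module definitions, and lists the places that reference each extracted definition.

References are found with git grep. Because it is only git grep, some names produce a large number of false positives, which I handle manually. initialize is skipped because it clearly yields nothing but false positives.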
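One way to make the manual triage of false positives a little easier is to bucket the hits by top-level directory, so that hits under lib/ can be checked before those under sig/ or test/. This is only a sketch of one possible approach, not something the script does. `group_hits_by_top_dir` is a hypothetical helper and it assumes uncolored `git grep -n` lines of the form `path:line:content`.

```ruby
# Hypothetical helper (not part of file_dependency.rb): group uncolored
# `git grep -n` lines ("path:line:content") by their top-level directory.
# Files at the repository root are grouped under ".".
def group_hits_by_top_dir(lines)
  lines.group_by do |line|
    path = line.split(':', 3).first
    path.include?('/') ? path.split('/').first : '.'
  end
end

# Made-up hits in the same shape as the tool's output.
hits = [
  'lib/rbs/collection/installer.rb:15:        install_to = lockfile.fullpath',
  'lib/rbs/environment_loader.rb:83:      repository.add(lockfile.fullpath)',
  'sig/collection/config/lockfile.rbs:51:        def fullpath: () -> Pathname',
  'Rakefile:3:task :fullpath'
]

group_hits_by_top_dir(hits).each do |dir, dir_hits|
  puts "#{dir}: #{dir_hits.size}"
end
```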

Think of it as something that makes git grep a little more convenient.

Usage

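Run it with a single target file. It uses no runtime information, so there is no need to load the application code.

Running it shows where the methods and other definitions in the file appear to be used, which helps you judge whether the target file can be deleted.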

Example

I ran this one against RBS Collection as well.

$ ruby file_dependency.rb lib/rbs/collection/config/lockfile.rb
Result
## RBS::Collection::Config::Lockfile

* benchmark/utils.rb:30:    lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* core/array.rbs:115:# *   Gem::RequestSet::Lockfile::Tokenizer#to_a
* lib/rbs/cli.rb:40:            lock = Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/collection/config/lockfile.rb:6:      class Lockfile
* lib/rbs/collection/config/lockfile.rb:48:          lockfile = Lockfile.new(lockfile_path: lockfile_path, path: path, gemfile_lock_path: gemfile_lock_path)
* lib/rbs/collection/config/lockfile_generator.rb:42:          @lockfile = Lockfile.new(
* lib/rbs/collection/config/lockfile_generator.rb:49:            @existing_lockfile = Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* lib/rbs/collection/config/lockfile_generator.rb:100:            # @type var locked: Lockfile::library?
* lib/rbs/collection/installer.rb:10:        @lockfile = Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* sig/collection/config.rbs:26:      def self.generate_lockfile: (config_path: Pathname, definition: Bundler::Definition, ?with_lockfile: boolish) -> [Config, Lockfile]
* sig/collection/config/lockfile.rbs:4:      # Lockfile represents the `rbs_collection.lock.yaml`, that contains configurations and *resolved* gems with their sources
* sig/collection/config/lockfile.rbs:6:      class Lockfile
* sig/collection/config/lockfile.rbs:59:        def self.from_lockfile: (lockfile_path: Pathname, data: lockfile_data) -> Lockfile
* sig/collection/config/lockfile_generator.rbs:17:        attr_reader lockfile: Lockfile
* sig/collection/config/lockfile_generator.rbs:18:        attr_reader existing_lockfile: Lockfile?
* sig/collection/config/lockfile_generator.rbs:28:        def self.generate: (config: Config, definition: Bundler::Definition, ?with_lockfile: boolish) -> Lockfile
* sig/collection/config/lockfile_generator.rbs:38:        def validate_gemfile_lock_path!: (lock: Lockfile?, gemfile_lock_path: Pathname) -> void
* sig/collection/installer.rbs:4:      attr_reader lockfile: Config::Lockfile
* sig/environment_loader.rbs:88:    def add_collection: (Collection::Config::Lockfile lockfile) -> void
* test/rbs/environment_loader_test.rb:227:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:262:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:315:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* test/rbs/environment_loader_test.rb:344:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))


## RBS::Collection::Config::Lockfile#fullpath

* lib/rbs/collection/config/lockfile.rb:18:        def fullpath
* lib/rbs/collection/config/lockfile.rb:74:          raise CollectionNotAvailable unless fullpath.exist?
* lib/rbs/collection/config/lockfile.rb:81:              meta_path = fullpath.join(gem[:name], gem[:version], Sources::Git::METADATA_FILENAME)
* lib/rbs/collection/config/lockfile.rb:85:              raise CollectionNotAvailable unless fullpath.join(gem[:name], gem[:version]).symlink?
* lib/rbs/collection/installer.rb:15:        install_to = lockfile.fullpath
* lib/rbs/environment_loader.rb:83:      repository.add(lockfile.fullpath)
* sig/collection/config/lockfile.rbs:51:        def fullpath: () -> Pathname
* test/rbs/environment_loader_test.rb:239:      assert repo.dirs.include? lock.fullpath


## RBS::Collection::Config::Lockfile#gemfile_lock_fullpath

* lib/rbs/collection/config/lockfile.rb:22:        def gemfile_lock_fullpath
* lib/rbs/collection/config/lockfile_generator.rb:85:          return unless lock.gemfile_lock_fullpath
* lib/rbs/collection/config/lockfile_generator.rb:86:          unless File.realpath(lock.gemfile_lock_fullpath) == File.realpath(gemfile_lock_path)
* lib/rbs/collection/config/lockfile_generator.rb:87:            raise GemfileLockMismatchError.new(expected: lock.gemfile_lock_fullpath, actual: gemfile_lock_path)
* sig/collection/config/lockfile.rbs:55:        %a{pure} def gemfile_lock_fullpath: () -> Pathname?


## RBS::Collection::Config::Lockfile#to_lockfile

* lib/rbs/collection/config/lockfile.rb:28:        def to_lockfile
* lib/rbs/collection/config/lockfile.rb:69:            "source" => lib[:source].to_lockfile
* lib/rbs/collection/config/lockfile_generator.rb:80:          lockfile.lockfile_path.write(YAML.dump(lockfile.to_lockfile))
* lib/rbs/collection/sources/git.rb:113:        def to_lockfile
* lib/rbs/collection/sources/git.rb:211:            "source" => to_lockfile
* lib/rbs/collection/sources/local.rb:72:        def to_lockfile
* lib/rbs/collection/sources/rubygems.rb:36:        def to_lockfile
* lib/rbs/collection/sources/stdlib.rb:38:        def to_lockfile
* sig/collection/config/lockfile.rbs:57:        def to_lockfile: () -> lockfile_data
* sig/collection/sources.rbs:11:        def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:62:        def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:149:        def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:178:        def to_lockfile: () -> source_entry
* sig/collection/sources.rbs:204:        def to_lockfile: () -> source_entry
* test/rbs/collection/config_test.rb:58:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:109:        string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:171:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:194:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:253:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:329:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:378:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:419:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:516:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:574:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:630:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:674:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:752:      string = YAML.dump(lockfile.to_lockfile)
* test/rbs/collection/config_test.rb:846:      string = YAML.dump(lockfile.to_lockfile)


## RBS::Collection::Config::Lockfile.from_lockfile

* benchmark/utils.rb:30:    lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/cli.rb:40:            lock = Collection::Config::Lockfile.from_lockfile(lockfile_path: lock_path, data: YAML.load_file(lock_path.to_s))
* lib/rbs/collection/config/lockfile.rb:42:        def self.from_lockfile(lockfile_path:, data:)
* lib/rbs/collection/config/lockfile_generator.rb:49:            @existing_lockfile = Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* lib/rbs/collection/installer.rb:10:        @lockfile = Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* sig/collection/config/lockfile.rbs:59:        def self.from_lockfile: (lockfile_path: Pathname, data: lockfile_data) -> Lockfile
* test/rbs/environment_loader_test.rb:227:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:262:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path))
* test/rbs/environment_loader_test.rb:315:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))
* test/rbs/environment_loader_test.rb:344:      lock = RBS::Collection::Config::Lockfile.from_lockfile(lockfile_path: lockfile_path, data: YAML.load_file(lockfile_path.to_s))


## RBS::Collection::Config::Lockfile#library_data

* lib/rbs/collection/config/lockfile.rb:33:            "gems" => gems.each_value.sort_by {|g| g[:name] }.map {|hash| library_data(hash) },
* lib/rbs/collection/config/lockfile.rb:65:        def library_data(lib)
* lib/rbs/collection/config/lockfile.rb:83:              raise CollectionNotAvailable unless library_data(gem) == YAML.load(meta_path.read)
* sig/collection/config/lockfile.rbs:11:          "gems" => Array[library_data]?,         # null if empty
* sig/collection/config/lockfile.rbs:15:        type library_data = {
* sig/collection/config/lockfile.rbs:70:        def library_data: (library) -> library_data


## RBS::Collection::Config::Lockfile#check_rbs_availability!

* lib/rbs/collection/config/lockfile.rb:73:        def check_rbs_availability!
* lib/rbs/environment_loader.rb:81:      lockfile.check_rbs_availability!
* sig/collection/config/lockfile.rbs:66:        def check_rbs_availability!: () -> void

Conclusion

That was an introduction to my helper tools for deleting code.

Money Forward Developers
