
The One Where the DYLD Spec Quietly Changed on Me


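I was implementing a so-called BIGINT, and while digging into why it was slower than bc's arithmetic I arrived at the suspicion that the number of malloc calls mattered a lot. It occurred to me that I had never really faced malloc head-on, so I decided to reimplement it. I basically intended to run it on Linux, but surely you'd normally make it work on a Mac too, so I was testing in both environments, and that is where I got stuck. This is that story.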

DYLD_FORCE_FLAT_NAMESPACE is gone

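On the Mac, binaries are built with a two-level namespace, so preloading a dynamic library the way you would on Linux gets a bit complicated. That is no problem as long as names don't collide, but when you build a dynamic library in order to replace malloc, as I am doing here, you need to set the DYLD_FORCE_FLAT_NAMESPACE environment variable. Or so it has always been said. LLMs give the same answer.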

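But on my Mac, which runs the latest OS, it does absolutely nothing. Or rather, the standard malloc simply gets called as usual. I suspected every possible mistake on my side, such as a missing linker option, but that wasn't it.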

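The cause is that recent versions of macOS have quietly changed the set of dyld environment variables, and DYLD_FORCE_FLAT_NAMESPACE, which used to exist, is no longer there. On the machines I have at hand, it is already absent in Monterey. The latest Sequoia of course doesn't have it either, so setting it does no good whatsoever. Since it is just an environment variable, you get no warning or anything else, which is the painful part.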

Here are the man pages.

Sequoia 15.5

DYLD(1)                                                                 General Commands Manual                                                                DYLD(1)

NAME
       dyld - the dynamic linker

SYNOPSIS
       DYLD_FRAMEWORK_PATH
       DYLD_FALLBACK_FRAMEWORK_PATH
       DYLD_VERSIONED_FRAMEWORK_PATH
       DYLD_LIBRARY_PATH
       DYLD_FALLBACK_LIBRARY_PATH
       DYLD_VERSIONED_LIBRARY_PATH
       DYLD_IMAGE_SUFFIX
       DYLD_INSERT_LIBRARIES
       DYLD_PRINT_TO_FILE
       DYLD_PRINT_LIBRARIES
       DYLD_PRINT_LOADERS
       DYLD_PRINT_SEARCHING
       DYLD_PRINT_APIS
       DYLD_PRINT_BINDINGS
       DYLD_PRINT_INITIALIZERS
       DYLD_PRINT_SEGMENTS
       DYLD_PRINT_ENV
       DYLD_PRINT_LINKS_WITH
       DYLD_SHARED_REGION
       DYLD_SHARED_CACHE_DIR

DESCRIPTION
       The dynamic linker (dyld) checks the following environment variables during the launch of each process.
       Note: If System Integrity Protection is enabled, these environment variables are ignored when executing binaries protected by System Integrity Protection.

       DYLD_FRAMEWORK_PATH
              This is a colon separated list of directories that contain frameworks.  The dynamic linker searches these directories before it searches for the
              framework by its install name.  It allows you to test new versions of existing frameworks. (A framework is a library install name that ends in the form
              XXX.framework/Versions/A/XXX or XXX.framework/XXX, where XXX and A are any name.)

              For each framework that a program uses, the dynamic linker looks for the framework in each directory in DYLD_FRAMEWORK_PATH in turn. If it looks in all
              those directories and can't find the framework, it uses whatever it would have loaded if DYLD_FRAMEWORK_PATH had not been set.

              Use the -L option to otool(1) to discover the frameworks and shared libraries that the executable is linked against.

       DYLD_FALLBACK_FRAMEWORK_PATH
              This is a colon separated list of directories that contain frameworks.  If a framework is not found at its install path, dyld uses this as a list of
              directories to search for the framework.

              For new binaries (Fall 2023 or later) there is no default fallback.  For older binaries, there is a default fallback search path of:
              /Library/Frameworks:/System/Library/Frameworks

       DYLD_VERSIONED_FRAMEWORK_PATH
              This is a colon separated list of directories that contain potential override frameworks.  The dynamic linker searches these directories for frameworks.
              For each framework found dyld looks at its LC_ID_DYLIB and gets the current_version and install name.  Dyld then looks for the framework at the install
              name path.  Whichever has the larger current_version value will be used in the process whenever a framework with that install name is required.  This is
              similar to DYLD_FRAMEWORK_PATH except instead of always overriding, it only overrides if the supplied framework is newer.  Note: dyld does not check the
              framework's Info.plist to find its version.  Dyld only checks the -current_version number supplied when the framework was created.

       DYLD_LIBRARY_PATH
              This is a colon separated list of directories that contain libraries. The dynamic linker searches these directories before it searches the default
              locations for libraries. It allows you to test new versions of existing libraries.

              For each dylib that a program uses, the dynamic linker looks for its leaf name in each directory in DYLD_LIBRARY_PATH.

              Use the -L option to otool(1) to discover the frameworks and shared libraries that the executable is linked against.

       DYLD_FALLBACK_LIBRARY_PATH
              This is a colon separated list of directories that contain libraries.  If a dylib is not found at its install  path, dyld uses this as a list of
              directories to search for the dylib.

              For new binaries (Fall 2023 or later) there is no default.  For older binaries, there is a default fallback search path of: /usr/local/lib:/usr/lib.

       DYLD_VERSIONED_LIBRARY_PATH
              This is a colon separated list of directories that contain potential override libraries.  The dynamic linker searches these directories for dynamic
              libraries.  For each library found dyld looks at its LC_ID_DYLIB and gets the current_version and install name.  Dyld then looks for the library at the
              install name path.  Whichever has the larger current_version value will be used in the process whenever a dylib with that install name is required.
              This is similar to DYLD_LIBRARY_PATH except instead of always overriding, it only overrides is the supplied library is newer.

       DYLD_IMAGE_SUFFIX
              This is set to a string of a suffix to try to be used for all shared libraries used by the program.  For libraries ending in ".dylib" the suffix is
              applied just before the ".dylib".  For all other libraries the suffix is appended to the library name.  This is useful for using conventional "_profile"
              and "_debug" libraries and frameworks.

       DYLD_INSERT_LIBRARIES
              This is a colon separated list of additional dynamic libraries to load before the ones specified in the program. If instead, your goal is to substitute
              a library that would normally be loaded, use DYLD_LIBRARY_PATH or DYLD_FRAMEWORK_PATH instead.

       DYLD_PRINT_TO_FILE
              This is a path to a (writable) file. Normally, the dynamic linker writes all logging output (triggered by DYLD_PRINT_* settings) to file descriptor 2
              (which is usually stderr).  But this setting causes the dynamic linker to write logging output to the specified file.

       DYLD_PRINT_ENV
              If set, causes dyld to print a line of key=value for each environment variable in the process.

       DYLD_PRINT_LIBRARIES
              If set, causes dyld to print a line for each mach-o image loaded into a process.  This is useful to make sure that the use of DYLD_LIBRARY_PATH is
              getting what you want.

       DYLD_PRINT_LOADERS
              If set, causes dyld to print a line whether each image is tracked by a JustInTimeLoader or a PrebuiltLoader.  Additionally, it prints if a
              PrebuiltLoaderSet was used to launch the process or if a PrebuiltLoader was written to make the next launch faster.

       DYLD_PRINT_SEARCHING
              If set, causes dyld to print a line about each file system path checked when searching for an image to load.

       DYLD_PRINT_INITIALIZERS
              If set, causes dyld to print out a line when running each initializer in every image.  Initializers run by dyld include constructors for C++ statically
              allocated objects, functions marked with __attribute__((constructor)), and -init functions.

       DYLD_PRINT_APIS
              If set, causes dyld to print a line whenever a dyld API is called (e.g. dlopen()).

       DYLD_PRINT_SEGMENTS
              If set, causes dyld to print out a line containing the name and address range of each mach-o segment that dyld maps.  In addition it prints information
              about if the image was from the dyld shared cache.

       DYLD_PRINT_BINDINGS
              If set, causes dyld to print a line each time a symbolic name is bound.

       DYLD_PRINT_LINKS_WITH
              If set to the leaf name of a mach-o image, dyld prints why that image was loaded, including the chain of links from the main executable or dlopen()ed
              image to the request image name. The leaf name needs to be the actual leaf file/install name (e.g. "libz.1.dylib" and not one of the aliases such as
              "libz.dylib").  When reporting the chain of links the --> may contain a letter (-w-> is a weak link, -r-> is a re-export, -u-> is an upward link, -d->
              is a delay-init link).

       DYLD_SHARED_REGION
              This can be "use" (the default) or "private".  Setting it to "private" tells dyld to remove the shared region from the process address space and mmap()
              back in a private copy of the dyld shared cache in the shared region address range. This is only useful if the shared cache on disk has been updated and
              is different than the shared cache in use.

       DYLD_SHARED_CACHE_DIR
              This is a directory containing dyld shared cache files.  This variable can be used in conjunction with DYLD_SHARED_REGION=private to run a process with
              an alternate shared cache.

DYNAMIC LIBRARY LOADING
       Unlike many other operating systems, Darwin does not locate dependent dynamic libraries via their leaf file name.  Instead the full path to each dylib is used
       (e.g. /usr/lib/libSystem.B.dylib).  But there are times when a full path is not appropriate; for instance, may want your binaries to be installable in anywhere
       on the disk.  To support that, there are three @xxx/ variables that can be used as a path prefix.  At runtime dyld substitutes a dynamically generated path for
       the @xxx/ prefix.

       @executable_path/
              This variable is replaced with the path to the directory containing the main executable for the process.  This is useful for loading dylibs/frameworks
              embedded in a .app directory.  If the main executable file is at /some/path/My.app/Contents/MacOS/My and a framework dylib file is at
              /some/path/My.app/Contents/Frameworks/Foo.framework/Versions/A/Foo, then the framework load path could be encoded as
              @executable_path/../Frameworks/Foo.framework/Versions/A/Foo and the .app directory could be moved around in the file system and dyld will still be able
              to load the embedded framework.

       @loader_path/
              This variable is replaced with the path to the directory containing the mach-o binary which contains the load command using @loader_path. Thus, in every
              binary, @loader_path resolves to a different path, whereas @executable_path always resolves to the same path. @loader_path is useful as the load path
              for a framework/dylib embedded in a plug-in, if the final file system location of the plugin-in unknown (so absolute paths cannot be used) or if the
              plug-in is used by multiple applications (so @executable_path cannot be used). If the plug-in mach-o file is at
              /some/path/Myfilter.plugin/Contents/MacOS/Myfilter and a framework dylib file is at
              /some/path/Myfilter.plugin/Contents/Frameworks/Foo.framework/Versions/A/Foo, then the framework load path could be encoded as
              @loader_path/../Frameworks/Foo.framework/Versions/A/Foo and the Myfilter.plugin directory could be moved around in the file system and dyld will still
              be able to load the embedded framework.

       @rpath/
              Dyld maintains a current stack of paths called the run path list.  When @rpath is encountered it is substituted with each path in the run path list
              until a loadable dylib if found.  The run path stack is built from the LC_RPATH load commands in the depencency chain that lead to the current dylib
              load.  You can add an LC_RPATH load command to an image with the -rpath option to ld(1).  You can even add a LC_RPATH load command path that starts with
              @loader_path/, and it will push a path on the run path stack that relative to the image containing the LC_RPATH.  The use of @rpath is most useful when
              you have a complex directory structure of programs and dylibs which can be installed anywhere, but keep their relative positions.  This scenario could
              be implemented using @loader_path, but every client of a dylib could need a different load path because its relative position in the file system is
              different. The use of @rpath introduces a level of indirection that simplifies things.  You pick a location in your directory structure as an anchor
              point.  Each dylib then gets an install path that starts with @rpath and is the path to the dylib relative to the anchor point. Each main executable is
              linked with -rpath @loader_path/zzz, where zzz is the path from the executable to the anchor point.  At runtime dyld sets it run path to be the anchor
              point, then each dylib is found relative to the anchor point.

SEE ALSO
       dyld_info(1), ld(1), otool(1)

The versions before that would be Mojave, Catalina, and Big Sur, but I don't have a machine with any of those installed. I did have an even older MacBook Air running High Sierra 10.13.6, and on that one DYLD_FORCE_FLAT_NAMESPACE is there.

High Sierra 10.13.6


DYLD(1)                                                                                                                                                  DYLD(1)

NAME
       dyld - the dynamic linker

SYNOPSIS
       DYLD_FRAMEWORK_PATH
       DYLD_FALLBACK_FRAMEWORK_PATH
       DYLD_VERSIONED_FRAMEWORK_PATH
       DYLD_LIBRARY_PATH
       DYLD_FALLBACK_LIBRARY_PATH
       DYLD_VERSIONED_LIBRARY_PATH
       DYLD_PRINT_TO_FILE
       DYLD_SHARED_REGION
       DYLD_INSERT_LIBRARIES
       DYLD_FORCE_FLAT_NAMESPACE
       DYLD_IMAGE_SUFFIX
       DYLD_PRINT_OPTS
       DYLD_PRINT_ENV
       DYLD_PRINT_LIBRARIES
       DYLD_BIND_AT_LAUNCH
       DYLD_DISABLE_DOFS
       DYLD_PRINT_APIS
       DYLD_PRINT_BINDINGS
       DYLD_PRINT_INITIALIZERS
       DYLD_PRINT_REBASINGS
       DYLD_PRINT_SEGMENTS
       DYLD_PRINT_STATISTICS
       DYLD_PRINT_DOFS
       DYLD_PRINT_RPATHS
       DYLD_SHARED_CACHE_DIR
       DYLD_SHARED_CACHE_DONT_VALIDATE

DESCRIPTION
       The dynamic linker checks the following environment variables during the launch of each process.
       Note:  If  System  Integrity Protection is enabled, these environment variables are ignored when executing binaries protected by System Integrity Protec-
       tion.

       DYLD_FRAMEWORK_PATH
              This is a colon separated list of directories that contain frameworks.  The dynamic linker searches these directories before it searches  for  the
              framework by its install name.  It allows you to test new versions of existing frameworks. (A framework is a library install name that ends in the
              form XXX.framework/Versions/YYY/XXX or XXX.framework/XXX, where XXX and YYY are any name.)

              For each framework that a program uses, the dynamic linker looks for the framework in each directory in DYLD_FRAMEWORK_PATH in turn. If  it  looks
              in  all  the directories and can't find the framework, it searches the directories in DYLD_LIBRARY_PATH in turn. If it still can't find the frame-
              work, it then searches DYLD_FALLBACK_FRAMEWORK_PATH and DYLD_FALLBACK_LIBRARY_PATH in turn.

              Use the -L option to otool(1).  to discover the frameworks and shared libraries that the executable is linked against.

       DYLD_FALLBACK_FRAMEWORK_PATH
              This is a colon separated list of directories that contain frameworks.  It is used as the default location  for  frameworks  not  found  in  their
              install path.

              By default, it is set to /Library/Frameworks:/Network/Library/Frameworks:/System/Library/Frameworks

       DYLD_VERSIONED_FRAMEWORK_PATH
              This  is  a  colon  separated  list  of directories that contain potential override frameworks.  The dynamic linker searches these directories for
              frameworks.  For each framework found dyld looks at its LC_ID_DYLIB and gets the current_version and install name.  Dyld then looks for the frame-
              work  at the install name path.  Whichever has the larger current_version value will be used in the process whenever a framework with that install
              name is required.  This is similar to DYLD_FRAMEWORK_PATH except instead of always overriding, it only overrides  is  the  supplied  framework  is
              newer.  Note: dyld does not check the framework's Info.plist to find its version.  Dyld only checks the -currrent_version number supplied when the
              framework was created.

       DYLD_LIBRARY_PATH
              This is a colon separated list of directories that contain libraries. The dynamic linker searches these directories before it searches the default
              locations for libraries. It allows you to test new versions of existing libraries.

              For  each  library that a program uses, the dynamic linker looks for it in each directory in DYLD_LIBRARY_PATH in turn. If it still can't find the
              library, it then searches DYLD_FALLBACK_FRAMEWORK_PATH and DYLD_FALLBACK_LIBRARY_PATH in turn.

              Use the -L option to otool(1).  to discover the frameworks and shared libraries that the executable is linked against.

       DYLD_FALLBACK_LIBRARY_PATH
              This is a colon separated list of directories that contain libraries.  It is used as the default location for libraries not found in their install
              path.  By default, it is set to $(HOME)/lib:/usr/local/lib:/lib:/usr/lib.

       DYLD_VERSIONED_LIBRARY_PATH
              This  is  a  colon  separated  list  of  directories that contain potential override libraries.  The dynamic linker searches these directories for
              dynamic libraries.  For each library found dyld looks at its LC_ID_DYLIB and gets the current_version and install name.  Dyld then looks  for  the
              library  at  the install name path.  Whichever has the larger current_version value will be used in the process whenever a dylib with that install
              name is required.  This is similar to DYLD_LIBRARY_PATH except instead of always overriding, it only overrides is the supplied library is newer.

       DYLD_PRINT_TO_FILE
              This is a path to a (writable) file. Normally, the dynamic linker writes all logging output (triggered by DYLD_PRINT_* settings) to file  descrip-
              tor 2 (which is usually stderr).  But this setting causes the dynamic linker to write logging output to the specified file.

       DYLD_SHARED_REGION
              This  can  be "use" (the default), "avoid", or "private".  Setting it to "avoid" tells dyld to not use the shared cache.  All OS dylibs are loaded
              dynamically just like every other dylib.  Setting it to "private" tells dyld to remove the shared region from the process address space and mmap()
              back  in  a  private  copy  of  the dyld shared cache in the shared region address range. This is only useful if the shared cache on disk has been
              updated and is different than the shared cache in use.

       DYLD_INSERT_LIBRARIES
              This is a colon separated list of dynamic libraries to load before the ones specified in the program.  This lets you test new modules of  existing
              dynamic  shared  libraries  that  are used in flat-namespace images by loading a temporary dynamic shared library with just the new modules.  Note
              that this has no effect on images built a two-level namespace images using a dynamic shared library unless DYLD_FORCE_FLAT_NAMESPACE is also used.

       DYLD_FORCE_FLAT_NAMESPACE
              Force  all  images  in  the program to be linked as flat-namespace images and ignore any two-level namespace bindings.  This may cause programs to
              fail to execute with a multiply defined symbol error if two-level namespace images are used to allow the images to have multiply defined  symbols.

       DYLD_IMAGE_SUFFIX
              This  is  set to a string of a suffix to try to be used for all shared libraries used by the program.  For libraries ending in ".dylib" the suffix
              is applied just before the ".dylib".  For all other libraries the suffix is appended to the library name.  This is useful for  using  conventional
              "_profile" and "_debug" libraries and frameworks.

       DYLD_PRINT_OPTS
              When this is set, the dynamic linker writes to file descriptor 2 (normally standard error) the command line options.

       DYLD_PRINT_ENV
              When this is set, the dynamic linker writes to file descriptor 2 (normally standard error) the environment variables.

       DYLD_PRINT_LIBRARIES
              When  this  is  set, the dynamic linker writes to file descriptor 2 (normally standard error) the filenames of the libraries the program is using.
              This is useful to make sure that the use of DYLD_LIBRARY_PATH is getting what you want.

       DYLD_BIND_AT_LAUNCH
              When this is set, the dynamic linker binds all undefined symbols the program needs at launch time. This includes function  symbols  that  can  are
              normally lazily bound at the time of their first call.

       DYLD_PRINT_STATISTICS
              Right  before the process's main() is called, dyld prints out information about how dyld spent its time.  Useful for analyzing launch performance.

       DYLD_PRINT_STATISTICS_DETAILS
              Right before the process's main() is called, dyld prints out detailed information about how dyld spent its time.  Useful for analyzing launch per-
              formance.

       DYLD_DISABLE_DOFS
              Causes dyld not register dtrace static probes with the kernel.

       DYLD_PRINT_INITIALIZERS
              Causes  dyld to print out a line when running each initializers in every image.  Initializers run by dyld included constructors for C++ statically
              allocated objects, functions marked with __attribute__((constructor)), and -init functions.

       DYLD_PRINT_APIS
              Causes dyld to print a line whenever a dyld API is called (e.g. NSAddImage()).

       DYLD_PRINT_SEGMENTS
              Causes dyld to print out a line containing the name and address range of each mach-o segment that dyld maps.  In addition  it  prints  information
              about if the image was from the dyld shared cache.

       DYLD_PRINT_BINDINGS
              Causes dyld to print a line each time a symbolic name is bound.

       DYLD_PRINT_DOFS
              Causes dyld to print out information about dtrace static probes registered with the kernel.

       DYLD_PRINT_RPATHS
              Cause dyld  to print a line each time it expands an @rpath variable and whether that expansion was successful or not.

       DYLD_SHARED_CACHE_DIR
              This  is  a  directory  containing  dyld  shared  cache  files.   This  variable  can  be  used in conjunction with DYLD_SHARED_REGION=private and
              DYLD_SHARED_CACHE_DONT_VALIDATE to run a process with an alternate shared cache.

       DYLD_SHARED_CACHE_DONT_VALIDATE
              Causes dyld to not check that the inode and mod-time of files in the shared cache match the requested dylib on disk. Thus a program can be made to
              run with the dylib in the shared cache even though the real dylib has been updated on disk.

       DYNAMIC LIBRARY LOADING
              Unlike  many  other operating systems, Darwin does not locate dependent dynamic libraries via their leaf file name.  Instead the full path to each
              dylib is used (e.g. /usr/lib/libSystem.B.dylib).  But there are times when a full path is not appropriate; for instance, may want your binaries to
              be installable in anywhere on the disk.  To support that, there are three @xxx/ variables that can be used as a path prefix.  At runtime dyld sub-
              stitutes a dynamically generated path for the @xxx/ prefix.

       @executable_path/
              This variable is replaced with the path to the directory containing the main executable for the process.  This is useful for loading dylibs/frame-
              works  embedded  in  a  .app  directory.   If  the main executable file is at /some/path/My.app/Contents/MacOS/My and a framework dylib file is at
              /some/path/My.app/Contents/Frameworks/Foo.framework/Versions/A/Foo, then the framework load path could be  encoded  as  @executable_path/../Frame-
              works/Foo.framework/Versions/A/Foo  and the .app directory could be moved around in the file system and dyld will still be able to load the embed-
              ded framework.

       @loader_path/
              This variable is replaced with the path to the directory containing the mach-o binary which contains the load command using @loader_path. Thus, in
              every  binary, @loader_path resolves to a different path, whereas @executable_path always resolves to the same path. @loader_path is useful as the
              load path for a framework/dylib embedded in a plug-in, if the final file system location of the plugin-in unknown (so  absolute  paths  cannot  be
              used)  or if the plug-in is used by multiple applications (so @executable_path cannot be used). If the plug-in mach-o file is at /some/path/Myfil-
              ter.plugin/Contents/MacOS/Myfilter and a framework dylib file is  at  /some/path/Myfilter.plugin/Contents/Frameworks/Foo.framework/Versions/A/Foo,
              then  the  framework load path could be encoded as @loader_path/../Frameworks/Foo.framework/Versions/A/Foo and the Myfilter.plugin directory could
              be moved around in the file system and dyld will still be able to load the embedded framework.

       @rpath/
              Dyld maintains a current stack of paths called the run path list.  When @rpath is encountered it is substituted with each path  in  the  run  path
              list  until  a loadable dylib if found.  The run path stack is built from the LC_RPATH load commands in the depencency chain that lead to the cur-
              rent dylib load.  You can add an LC_RPATH load command to an image with the -rpath option to ld(1).  You can even add a LC_RPATH load command path
              that  starts  with  @loader_path/,  and  it will push a path on the run path stack that relative to the image containing the LC_RPATH.  The use of
              @rpath is most useful when you have a complex directory structure of programs and dylibs which can be installed anywhere, but keep their  relative
              positions.   This scenario could be implemented using @loader_path, but every client of a dylib could need a different load path because its rela-
              tive position in the file system is different. The use of @rpath introduces a level of indirection that simplies things.  You pick a  location  in
              your  directory  structure as an anchor point.  Each dylib then gets an install path that starts with @rpath and is the path to the dylib relative
              to the anchor point. Each main executable is linked with -rpath @loader_path/zzz, where zzz is the path from the executable to the  anchor  point.
              At runtime dyld sets it run path to be the anchor point, then each dylib is found relative to the anchor point.

SEE ALSO
       dyldinfo(1), ld(1), otool(1)

Apple Inc.                                                                June 1, 2017                                                                   DYLD(1)

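Has the mechanism that let you replace functions just by setting an environment variable, with the executable left untouched, been abolished? If a malicious third party could replace malloc and the like, that would be extremely bad for security (although binaries with Library Validation enabled should be fine), so my understanding is that Apple wanted to close the loophole. Even so, I can't find any document on Apple Developer that mentions this change. If one exists, someone please quietly let me know.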

Replacing malloc on the latest macOS

So what do you do if you still want to replace it?

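It is a compromise, but you can pass the -flat_namespace option when building the executable, which makes it ignore the two-level namespace and resolve names by symbol name alone.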

$ cat just1.c
#include <stdlib.h>
#include <stdio.h>

int main() {
  char *s = malloc(4);
  s[0] = '4';
  s[1] = '2';
  s[2] = '\0';
  printf("%s\n", s);
  free(s);
}
$ cc just1.c
$ nm -m a.out
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
                 (undefined) external _free (from libSystem)
0000000100000460 (__TEXT,__text) external _main
                 (undefined) external _malloc (from libSystem)
                 (undefined) external _printf (from libSystem)
$ cc just1.c -flat_namespace
$ nm -m a.out
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
                 (undefined) external _free
0000000100000460 (__TEXT,__text) external _main
                 (undefined) external _malloc
                 (undefined) external _printf

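The build without the option prints "from libSystem". libSystem is Darwin's standard C library. In other words, the two-level namespace pins malloc, free, and printf to the libSystem library, so even if you name your own malloc library in DYLD_INSERT_LIBRARIES to give it priority in name resolution, it is ignored.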

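The build with -flat_namespace, on the other hand, prints no library name. There are only the bare function symbols, so the binary calls whichever malloc and free match first, for example the ones supplied through DYLD_INSERT_LIBRARIES. In other words, malloc can be replaced.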
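Because an ignored environment variable produces no diagnostics at all, it helps if the replacement library announces itself. The following is a minimal sketch of that idea, not code from my library: `ft_announce_once` and `ft_banner_attempts` are hypothetical names, and the assumption is that the replacement malloc calls `ft_announce_once` on entry.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical counter: how many times the banner write was attempted. */
static size_t ft_banner_attempt_count;

/* Hypothetical helper; a replacement malloc would call it on entry. */
void ft_announce_once(void) {
    static const char msg[] = "ft_malloc: replacement allocator is active\n";
    static int announced;

    if (announced)
        return;
    announced = 1;

    /* Use write(2) rather than printf(3): stdio may itself allocate,
     * which would re-enter the allocator being replaced. */
    ssize_t written = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)written; /* best effort: a failed write is not an error here */
    ft_banner_attempt_count++;
}

/* Returns how many times the banner write was attempted (0 or 1). */
size_t ft_banner_attempts(void) {
    return ft_banner_attempt_count;
}
```

If the line shows up on stderr when a.out runs, the binary really is calling the inserted library; if it never shows up, libSystem's malloc is still in use. The sketch is not thread-safe, which a real allocator would have to address.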

What -flat_namespace actually does

-flat_namespace lets you break through the two-level namespace wall even on the latest Macs, but it actually differs in no small way from using the DYLD_FORCE_FLAT_NAMESPACE environment variable.

I ran an experiment on High Sierra, where both are available.

$ cc just1.c
$ DYLD_LIBRARY_PATH=.. DYLD_INSERT_LIBRARIES=../libft_malloc.so DYLD_FORCE_FLAT_NAMESPACE=1 ./a.out

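First, if you take a binary built without -flat_namespace, as above, and run it with DYLD_FORCE_FLAT_NAMESPACE set, "free() invalid pointer" is reported three times. This error comes from the pointer validation in my own malloc implementation, so in that sense the namespace is being ignored and the replacement is working as expected. Still, the source code contains only one malloc call, so this is not the behavior I wanted.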

The following post is helpful on this point.
https://zenn.dev/mfunyu/articles/malloc-trace-caller

In short, during the binary's startup processing a pointer obtained by calling malloc_zone_malloc rather than malloc gets freed, so my free is called with a pointer that my malloc never handed out.

Dealing with this means adding irregular handling: when the address is not one that my allocator manages, bounce it to libSystem's free.
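The bounce can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not my actual implementation: the allocator is a toy bump allocator over one static arena, and `ft_malloc`, `ft_free`, `ft_owns`, and `FT_ARENA_SIZE` are hypothetical names. A real interposing library would export these as malloc and free, and then could not reach the system free through a plain free() call; it would have to obtain it another way, for example with dlsym(RTLD_NEXT, "free").

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FT_ARENA_SIZE (64 * 1024) /* hypothetical fixed arena size */

static _Alignas(16) unsigned char ft_arena[FT_ARENA_SIZE];
static size_t ft_arena_used;

/* Returns 1 if ptr lies inside memory handed out by ft_malloc. */
int ft_owns(const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)ft_arena;
    return p >= base && p < base + ft_arena_used;
}

/* Toy bump allocator: carves 16-byte-aligned chunks out of the arena. */
void *ft_malloc(size_t size) {
    size_t rounded = (size + 15) & ~(size_t)15;
    if (rounded < size)          /* size + 15 overflowed */
        return NULL;
    if (rounded == 0)            /* give malloc(0) a unique address */
        rounded = 16;
    if (rounded > FT_ARENA_SIZE - ft_arena_used)
        return NULL;
    void *p = ft_arena + ft_arena_used;
    ft_arena_used += rounded;
    return p;
}

/* Releases our own pointers; bounces everything else to the system free. */
void ft_free(void *ptr) {
    if (ptr == NULL)
        return;
    if (ft_owns(ptr))
        return;  /* bump allocator: nothing to release individually */
    free(ptr);   /* not ours (e.g. allocated during startup): hand it back */
}
```

The point is only the ownership test in ft_free. Instead of reporting "invalid pointer" for an address it has never seen, the allocator forwards that address to the allocator that presumably produced it.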

With -flat_namespace, on the other hand, things are much simpler.

$ cc -flat_namespace just1.c
$ DYLD_INSERT_LIBRARIES=../libft_malloc.so ./a.out

This runs without any of the trouble described above (and DYLD_LIBRARY_PATH is not even required).

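Because the binary that calls malloc is itself built (linked) with a flat namespace, the binary prefers the library specified with DYLD_INSERT_LIBRARIES, while the startup code, independently of that, keeps using libSystem's malloc and free. Only the calls made by the binary's own code are replaced.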

How this differs from linking the library

Of course, building the ordinary way, by naming the library at link time, is the most straightforward way to satisfy the two-level namespace.

$ cc just1.c -L.. -lft_malloc
$ DYLD_LIBRARY_PATH=.. DYLD_INSERT_LIBRARIES=../libft_malloc.so ./a.out

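However, in this case the library cannot be found unless DYLD_INSERT_LIBRARIES or a similar setting is given, and the program fails with a system error.
With -flat_namespace, the standard malloc / free are called when no library is specified via DYLD_INSERT_LIBRARIES, and that library's malloc / free are called when one is.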

About two-level namespaces
https://developer.apple.com/library/archive/documentation/Porting/Conceptual/PortingUnix/compiling/compiling.html
