This repository has been archived by the owner on Nov 10, 2023. It is now read-only.

make allocators and sanitizers work for processes created with multiprocessing's spawn method in dev mode #2660

Open, wants to merge 1 commit into base: dev

Commits on Oct 12, 2021

  1. make allocators and sanitizers work for processes created with multiprocessing's spawn method in dev mode (facebook#2660)
    
    Summary:
    Pull Request resolved: facebook#2660
    
    **The first attempt (D30802446) overlooked the fact that the interpreter
    wrapper is not an executable (for `execv`) on Mac, and introduced some bugs due
    to the refactoring. This second attempt addresses those issues and limits the
    effect of the change to processes created by `multiprocessing`'s spawn method
    on Linux.**
    
    #### Problem
    Currently, the entrypoint for in-place Python binaries (i.e. built with dev
    mode) executes the following steps to load system native dependencies (e.g.
    sanitizers and allocators):
    - Backup `LD_PRELOAD` set by the caller
    - Append system native dependencies to `LD_PRELOAD`
    - Inject a prologue in user code which restores `LD_PRELOAD` set by the caller
    - `execv` the Python interpreter
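
The steps above can be sketched roughly as follows. This is only an illustration, not the actual entrypoint: the backup variable name `EXAMPLE_LD_PRELOAD_BACKUP` and the helper names are hypothetical, and `os.execve` is used in place of `execv` so the modified environment is passed explicitly.

```python
import os

# Hypothetical name for the variable that holds the caller's LD_PRELOAD.
BACKUP_VAR = "EXAMPLE_LD_PRELOAD_BACKUP"


def build_child_env(env, native_deps):
    """Return a copy of env with native_deps appended to LD_PRELOAD."""
    new_env = dict(env)
    original = env.get("LD_PRELOAD", "")
    # Step 1: back up the caller's value so the prologue can restore it.
    new_env[BACKUP_VAR] = original
    # Step 2: append the system native dependencies (colon-separated).
    entries = [p for p in original.split(":") if p] + list(native_deps)
    new_env["LD_PRELOAD"] = ":".join(entries)
    return new_env


def restore_ld_preload(env):
    """Step 3 (injected prologue): put back the caller's LD_PRELOAD."""
    original = env.pop(BACKUP_VAR, "")
    if original:
        env["LD_PRELOAD"] = original
    else:
        env.pop("LD_PRELOAD", None)


def run(interpreter, argv, native_deps):
    """Step 4: replace the current process with the Python interpreter."""
    env = build_child_env(os.environ, native_deps)
    os.execve(interpreter, [interpreter, *argv], env)
```

Because the prologue restores `LD_PRELOAD` inside the running program, anything that later starts a fresh interpreter inherits only the caller's original value.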
    
    The steps work as intended for single-process Python programs. However, when a
    Python program spawns child processes, the child processes do not load the
    system native dependencies, since they simply `execv` the vanilla Python
    interpreter. A few examples of why this is problematic:
    - The ASAN runtime library is a system native dependency. Without loading it, a
      child process that loads user native dependencies compiled with ASAN will
      crash during static initialization because it can't find `__asan_init`.
    - `jemalloc` is also a system native dependency.
    
    Many, if not most, ML use cases "ban" dev mode because of these problems. This
    is very unfortunate considering the developer efficiency dev mode provides. In
    addition, a large number of unit tests have to run in a more expensive build
    mode because of these problems.
    
    #### Solution
    Move the logic that loads system native dependencies out of the Python binary
    entrypoint and into an interpreter wrapper, and set the interpreter wrapper as
    `sys.executable` in the injected prologue:
    - The Python binary entrypoint now uses the interpreter wrapper, which has the
      same command line interface as the Python interpreter, to run the main
      module.
    - `multiprocessing`'s `spawn` method now uses the interpreter wrapper to create
      child processes, ensuring system native dependencies get loaded correctly.
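
A minimal sketch of what such a prologue might look like, under stated assumptions: `wrapper_path` and the backup variable name are hypothetical, and the explicit `multiprocessing.set_executable` call is an addition not described above. It is included because `multiprocessing` records the spawn executable when `multiprocessing.spawn` is first imported, so assigning `sys.executable` afterwards may not be enough on its own.

```python
import multiprocessing
import os
import sys

# Hypothetical name for the variable that holds the caller's LD_PRELOAD.
BACKUP_VAR = "EXAMPLE_LD_PRELOAD_BACKUP"


def prologue(wrapper_path, env=os.environ):
    """Restore the caller's LD_PRELOAD and route spawn through the wrapper."""
    # Restore LD_PRELOAD as set by the caller.
    original = env.pop(BACKUP_VAR, None)
    if original:
        env["LD_PRELOAD"] = original
    else:
        env.pop("LD_PRELOAD", None)

    # Make the interpreter wrapper the advertised executable.
    sys.executable = wrapper_path
    # Assumption: also tell multiprocessing explicitly, in case it already
    # cached the previous value of sys.executable.
    multiprocessing.set_executable(wrapper_path)
```

With this in place, a child created via the `spawn` start method is launched through the wrapper, which loads the system native dependencies again before handing off to the real interpreter.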
    
    #### Alternative Considered
    One alternative considered is to simply not remove the system native
    dependencies from `LD_PRELOAD`, so that they are present in the spawned
    processes. However, this can cause some linking issues, which were perhaps the
    reason `LD_PRELOAD` was restored in the first place.
    
    Reviewed By: fried, Reubend
    
    fbshipit-source-id: 9528c1856bf389ce033a8630cd718466754f3cef
    yifuwang authored and facebook-github-bot committed Oct 12, 2021
    7faa8a5