Run without sudo? #905

Open
hsane2001 opened this issue May 31, 2024 · 9 comments

Comments

@hsane2001

Can we allow gprofiler to run without sudo? There are many environments where sudo is not available, and it would be great to get app-level stacks even in a limited manner. Besides perf, which requires sudo by default (although it can be made to run without it), there are many places where the code itself relies on root access to the filesystem, for namespaces and for storing intermediate data.

@hsane2001 changed the title from "Run without sudo, even if with limited data?" to "Run without sudo?" on May 31, 2024
@Jongy
Contributor

Jongy commented Jun 3, 2024

We can definitely make gProfiler run without root. It will require some iterative work of encountering problems, fixing them (by making them handle the lack of permissions gracefully) and continuing.

Things I already have on my mind:

  1. Need to remove the actual check for "is root" in verify_preconditions.
  2. Many sites in gProfiler use run_in_ns and access /proc/pid/ns/ files, which might be inaccessible if you're not root. The use case for running gProfiler w/o root is profiling applications running in the same mount/pid namespace, so every run_in_ns interaction is optional and can be made so (for example, gProfiler can have a run_in_ns_wrapper that checks whether we're root and skips the privileged operation if we're not).
  3. perf itself can run w/o root. Modifying kernel.perf_event_paranoid is one thing; in addition, I think we'll need to run perf in per-process mode (and not -a mode). I don't know if Linux allows running perf in -a mode while underprivileged, but it'd make little sense to me. Running perf in -p mode, targeting PIDs of the same user, does make sense. If perf is desired and -a is blocked, we can make gProfiler run perf record -p when it is underprivileged (for example, based on PIDs passed via --pids to gProfiler).
  4. There might be additional issues, such as directories gProfiler tries to write to (we use /tmp by default but might fall back to /opt, which is root-only).
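
As an illustration of point 3, here is a minimal sketch of how the perf command line could be chosen. It assumes system-wide mode is reserved for root and that non-root runs always receive explicit PIDs. The function name `build_perf_record_args` and the default frequency are hypothetical and not taken from gProfiler; only the perf flags (`-F`, `-g`, `-a`, `-p`) are real.

```python
from typing import List, Optional, Sequence


def build_perf_record_args(
    is_root: bool,
    pids: Optional[Sequence[int]] = None,
    frequency: int = 11,  # arbitrary default for this sketch
) -> List[str]:
    """Return a `perf record` argv for the given privilege level (illustrative only)."""
    args = ["perf", "record", "-F", str(frequency), "-g"]
    if pids:
        # Per-process mode: profile only the listed PIDs.
        args += ["-p", ",".join(str(pid) for pid in pids)]
    elif is_root:
        # System-wide mode, which is what a root run would use.
        args.append("-a")
    else:
        # Assumption of this sketch: without root we refuse to guess and require PIDs.
        raise ValueError("non-root mode needs explicit PIDs (e.g. from --pids)")
    return args
```

Whether `-a` should also be allowed for an unprivileged user after the sysctl changes is a policy decision this sketch does not try to settle.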

If you're working on eliminating the root requirement, you can write your thoughts here about how to handle particular parts that are blocked due to lack of privileges, and I'll help address them. I'm also open to a Zoom discussion about it :)

@hsane2001
Author

There have been more requirements to make this happen. Some answers inline:

  1. The verify_preconditions check should be easy to resolve.
  2. The option of a "run_in_ns_wrapper that checks whether we're root and skips the privileged operation if we're not" sounds good.
  3. Yes, perf itself can run in user mode after applying the sysctl-related changes (one time). "perf -a" also works and is essential in cases where a privileged container monitors the rest of the system or other containers.
  4. The code does have other places where it accesses root-only filesystem locations; these may need to be changed to use a temporary user directory.

@pcarella1

pcarella1 commented Oct 22, 2024

@Jongy, I'm working with @hsane2001 on this issue, assuming the use case you mentioned where "running gProfiler w/o root is to profile applications running in the same mount/pid namespace". I've been bypassing the run_in_ns checks when not running as root, and redirecting some of the files to user-writable locations (such as with --log-file, --pid-file).

One issue I've encountered where sudo is currently required (where it probably doesn't have to be) is this function: mkdir_owned_root https://github.com/Granulate/gprofiler/blob/9da69dbcbed9a378e042c35e9b3af21060e2a77f/gprofiler/utils/fs.py#L79 which ensures that a directory is owned by root when it is created to ensure other users cannot alter these files (for example for temporary files in /tmp).

One potential approach, when running without sudo, is to redirect temporary files from /tmp to a per-user temp directory that can be deleted when gprofiler terminates.
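
A minimal sketch of that idea, using only the Python standard library. The helper name `make_private_temp_dir` and the `gprofiler_` prefix are hypothetical; this is not how gProfiler currently creates its directories.

```python
import atexit
import os
import shutil
import tempfile


def make_private_temp_dir(prefix: str = "gprofiler_") -> str:
    """Create a temp directory owned by the current user and remove it at interpreter exit."""
    # mkdtemp creates the directory readable, writable and searchable
    # only by the creating user (mode 0o700).
    path = tempfile.mkdtemp(prefix=prefix)
    # Best-effort cleanup on normal termination.
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path
```

Note that `atexit` handlers do not run if the process is killed by an unhandled signal, so a real implementation would probably also need cleanup in its signal handling path.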

Does this seem to be the right approach to you? Is there a better way you'd recommend for handling this function for "sudoless" execution?

@Jongy
Contributor

Jongy commented Oct 22, 2024

Hi @pcarella1 :)

As for run_in_ns, I highly suggest that we take the run_in_ns_wrapper approach which checks a global flag ("are we in no-root mode") and just runs the callback directly. This logic can go directly in run_in_ns impl in granulate-utils, actually.
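
A rough sketch of what such a wrapper could look like. The privileged implementation is injected as a parameter and the flag is passed explicitly, only to keep the example self-contained; the parameter names and the assumed shape of run_in_ns (namespace types, callback, target PID) are illustrative assumptions, not the actual granulate-utils signature.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_in_ns_wrapper(
    nstypes: List[str],
    callback: Callable[[], T],
    target_pid: int,
    privileged_run_in_ns: Callable[[List[str], Callable[[], T], int], T],
    no_root_mode: bool,
) -> T:
    """Run `callback` in the target's namespaces, or directly when in no-root mode."""
    if no_root_mode:
        # Without root we only profile processes in our own namespaces,
        # so there is nothing to enter: just run the callback in place.
        return callback()
    return privileged_run_in_ns(nstypes, callback, target_pid)
```

In the real code the flag would more likely be a module-level global set once at startup, as suggested above.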

As for mkdir_owned_root, its goal is as you mentioned. Since gProfiler (currently) runs as root, it writes all of its files as root to prevent any non-root user from messing with them. This required special care with the async-profiler DSO, which is written in different mount namespaces (to match the profiled process), and that's where mkdir_owned_root helps.
When we're in non-root mode, we target only processes of our UID:GID, and only processes in our mount namespace (I think it's okay to assume that non-root profiling does not intend to profile containers). In that case, the async-profiler DSO can be used directly from the gProfiler resources directory, which is created under the PyInstaller temporary directory (_MEIxxxx), with gProfiler's UID:GID. gProfiler can assign self._ap_dir_base to a subdirectory under its resources directory, and that'd be good enough from a security perspective.
I think that's the simplest approach, but if it requires too many changes to support the existing code path and the new one, then we can take your approach (change _find_rw_exec_dir to create a directory owned by UID:GID under one of POSSIBLE_AP_DIRS).
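
Whichever directory ends up being used, the non-root equivalent of the "owned by root" guarantee is that the directory belongs to us and nobody else can write to it. A small sketch of such a check follows; the helper name `is_private_to_current_user` is hypothetical, and this is an illustration of the idea rather than a drop-in replacement for mkdir_owned_root.

```python
import os
import stat


def is_private_to_current_user(path: str) -> bool:
    """True if `path` is a directory owned by our effective UID and not writable by group/others."""
    st = os.stat(path)
    return (
        stat.S_ISDIR(st.st_mode)
        and st.st_uid == os.geteuid()
        and not st.st_mode & (stat.S_IWGRP | stat.S_IWOTH)
    )
```

The check only looks at the directory itself; a stricter version would also inspect its parent directories.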

@pcarella1

pcarella1 commented Oct 29, 2024

Hi @Jongy, I agree, run_in_ns_wrapper makes the most sense. I will take that approach.

I've implemented changing self._ap_dir_base to resource_path as you suggested.

I have found a new issue in the pgrep_maps function, which runs the command "grep -lE '{match}' /proc/*/maps" to search the process memory maps for various libraries such as ruby, libjvm, etc. This returns an error because some of the processes under /proc are not readable by a non-root user. What do you suggest as a workaround for this?

https://github.com/Granulate/gprofiler/blob/9da69dbcbed9a378e042c35e9b3af21060e2a77f/gprofiler/utils/__init__.py#L354

Error is here:
[2024-10-25 20:47:34,592] DEBUG: gprofiler.utils: Running command (command=["grep -lE '^.+/libjvm\.so' /proc/*/maps"])
[2024-10-25 20:47:34,595] DEBUG: gprofiler.utils: Running command (command=["grep -lE '(^.+/ruby[^/]*$)' /proc/*/maps"])
[2024-10-25 20:47:34,639] DEBUG: gprofiler.utils: Command exited (command=["grep -lE '(^.+/ruby[^/]*$)' /proc/*/maps"], exit_code=2)
[2024-10-25 20:47:34,640] DEBUG: gprofiler.utils: Command exited (command=["grep -lE '(^.+/(lib)?python[^/]*$)|(^.+/site-packages/.+?$)|(^.+/dist-packages/.+?$)' /proc/*/maps"], exit_code=2)
[2024-10-25 20:47:34,640] ERROR: gprofiler.utils: Unexpected 'grep' error output (first 10 lines): [b'grep: /proc/178554/maps: Permission denied', b'grep: /proc/178557/maps: Permission denied', b'grep: /proc/180851/maps: Permission denied', b'grep: /proc/180854/maps: Permission denied', b'grep: /proc/180917/maps: Permission denied', b'grep: /proc/180920/maps: Permission denied', b'grep: /proc/180974/maps: Permission denied', b'grep: /proc/180977/maps: Permission denied', b'grep: /proc/1/maps: Permission denied', b'grep: /proc/203591/maps: Permission denied']
[2024-10-25 20:47:34,642] ERROR: gprofiler.utils: Unexpected 'grep' error output (first 10 lines): [b'grep: /proc/178554/maps: Permission denied', b'grep: /proc/178557/maps: Permission denied', b'grep: /proc/180851/maps: Permission denied', b'grep: /proc/180854/maps: Permission denied', b'grep: /proc/180917/maps: Permission denied', b'grep: /proc/180920/maps: Permission denied', b'grep: /proc/180974/maps: Permission denied', b'grep: /proc/180977/maps: Permission denied', b'grep: /proc/1/maps: Permission denied', b'grep: /proc/203591/maps: Permission denied']

@Jongy
Contributor

Jongy commented Nov 7, 2024

Hi @pcarella1, sorry for my late reply. I think that in this mode of operation, we can fix pgrep_maps to ignore the Permission denied errors, since they are expected by design. You can change pgrep_maps to accept another parameter (ignore_permission_errors: bool = False) that, when set, makes it ignore the lines with Permission denied, just like the lines with No such file or directory are ignored.
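
A sketch of just the stderr filtering part, assuming grep's usual "grep: <path>: <reason>" message format as seen in the log above. The helper name `filter_grep_stderr` is hypothetical and the real pgrep_maps is structured differently; only the ignore_permission_errors parameter comes from the suggestion above.

```python
from typing import List


def filter_grep_stderr(stderr: bytes, ignore_permission_errors: bool = False) -> List[bytes]:
    """Return the stderr lines that should still be treated as unexpected."""
    # Processes exiting mid-scan are always tolerated.
    ignored_suffixes = [b"No such file or directory"]
    if ignore_permission_errors:
        # In non-root mode, unreadable /proc/<pid>/maps files are expected.
        ignored_suffixes.append(b"Permission denied")
    return [
        line
        for line in stderr.splitlines()
        if line.strip() and not any(line.rstrip().endswith(s) for s in ignored_suffixes)
    ]
```

grep exits with status 2 whenever it hits an error on any file, so the caller would also have to tolerate that exit code when every remaining stderr line has been filtered out.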

@pcarella1

@Jongy It seems the main use for this function (pgrep_maps, as mentioned in my last message) is to find the process IDs for python, ruby, java, and dotnet, so it can select those profilers rather than perf if they are used. Is this correct?

I've tried reading these files using the grep command in pgrep_maps, and they all return permission errors. This would make it impossible for gprofiler to detect whether python, ruby, java, or dotnet are being used in order to select those profilers.

What is the correct solution for this? Should I try a command other than grepping /proc/*/maps (such as ps -aux | grep python) to get the PIDs for these? Or do these profilers require sudo anyway, so we should just state the limitation that sudoless mode can't be used with them?

(Amusingly, I found an issue on the py-spy repo where you advise on ptrace_scope value to run py-spy without root :), so it looks like it should be possible for python at least)

Please advise.

@Jongy
Contributor

Jongy commented Nov 21, 2024

> It seems the main use for this function (pgrep_maps, as mentioned in my last message) is to find the process IDs for python, ruby, java, and dotnet, so it can select those profilers rather than perf if they are used. Is this correct?

Correct.

Let me elaborate on my reply from 2 weeks ago. Access to /proc/pid/maps is governed by the same rules as ptrace (PTRACE_MODE_READ_FSCREDS; see "Ptrace access mode checking" in the ptrace(2) man page). Processes for which grep gets Permission denied are processes you will not be able to profile in any case! That's because the various profilers used by gProfiler require ptrace access too (and with stronger permissions, actually):

  • py-spy - needs ptrace
  • rbspy - needs ptrace
  • phpspy - needs ptrace
  • async-profiler - doesn't use ptrace, but the JVM employs a similar access check that effectively is equivalent to ptrace in default settings
  • ...

To run any profiler (gProfiler or not) that needs ptrace access, you can either run as root, which gives you CAP_SYS_PTRACE, or have ptrace access without that capability, which in standard ptrace settings means that you need to run with the same uid & gid as the profiled process(es). For all other processes, you'll get Permission denied.
Also note that if /proc/sys/kernel/yama/ptrace_scope is set to 2 (a typical security hardening), then for ptrace access you'll need to have CAP_SYS_PTRACE.
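
To make the uid/gid rule concrete, here is a rough pre-check that compares the caller's IDs with the Uid/Gid lines of a target's /proc/<pid>/status, and reads the Yama setting. It is a simplified heuristic under stated assumptions: it ignores capabilities, the target's dumpable flag and any LSM policy, so it can only approximate the kernel's decision. All function names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_status_ids(status_text: str) -> Dict[str, Tuple[int, ...]]:
    """Extract the Uid/Gid lines (real, effective, saved, filesystem) from /proc/<pid>/status text."""
    ids: Dict[str, Tuple[int, ...]] = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("Uid", "Gid"):
            ids[key] = tuple(int(v) for v in value.split())
    return ids


def same_credentials(status_text: str, uid: int, gid: int) -> bool:
    """Rough check: the target's real, effective and saved IDs all equal ours."""
    ids = parse_status_ids(status_text)
    return all(i == uid for i in ids["Uid"][:3]) and all(i == gid for i in ids["Gid"][:3])


def read_ptrace_scope(path: str = "/proc/sys/kernel/yama/ptrace_scope") -> Optional[int]:
    """Return the Yama ptrace_scope value, or None if it cannot be read (e.g. Yama not enabled)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

A caller would pass the contents of /proc/<pid>/status together with its own UID and GID, and treat a ptrace_scope of 2 as "needs CAP_SYS_PTRACE" regardless of the credential match.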

@pcarella1

pcarella1 commented Dec 5, 2024

@Jongy I submitted PR drafts for this, please review and respond if you get a chance. I mention some issues in the PR, but we can discuss here if that's a better place.
