Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pex-cli: add [pex-cli].args option to pass arguments to the PEX process globally #21202

Merged

Conversation

AlexTereshenkov
Copy link
Member

@AlexTereshenkov AlexTereshenkov commented Jul 24, 2024

A new option [pex-cli].global_args has been added to be able to pass arbitrary arguments to the pex tool as part of any Pants goal invocation. This should make it a lot easier to modify behavior of pex tool without needing to make changes in the Pants codebase.

Not having this ability to pass arbitrary arguments to pex makes it really hard to start taking advantage of new features that come with newer versions of pex. For instance, the new --exclude flag added recently would require making lots of changes in the codebase to be able to pass those extra arguments. This is because the pex invocations in the Pants codebase are numerous and it's really hard to make sure a particular argument is respected (by keeping the chain of calls correct making sure it does reach the final subprocess spawn). And if it's relevant for multiple goals, this becomes even harder.

We would still need to make changes to pass arguments to individual targets, see #20737 or #20939 - this makes sense as those arguments apply only to those targets. However, some options would need to apply for any pex invocation (e.g. ignoring all 3rd party dependencies).

I've looked into having environment variables support for all flags that PEX has (pex-tool/pex#2242) first (so that no changes are needed in Pants, one would just export a bunch of env vars as needed), but this is not going to happen any time soon, so doing it in the Pants codebase instead is the only path forward, I reckon.

Usage in practice:

PANTS_SOURCE=../pants pants --pex-cli-global-args="--some-flag=value" package cheeseshop/cli:cheeseshop-query
Pantsd has been turned off via Env.
23:00:24.73 [INFO] Initializing Nailgun pool for 16 processes...
23:00:27.85 [INFO] Starting: Resolving plugins: packaging==21.3
23:00:28.61 [INFO] Completed: Resolving plugins: packaging==21.3
...
    raise ExecutionError(

Exception message: 1 Exception encountered:

ProcessExecutionFailure: Process 'Resolving plugins: packaging==21.3' failed with exit code 2.
stdout:

stderr:
usage: pex [-o OUTPUT.PEX] [options] [-- arg1 arg2 ...]

pex builds a PEX (Python Executable) file based on the given specifications: sources, requirements, their dependencies and other options.
Command-line options can be provided in one or more files by prefixing the filenames with an @ symbol. These files must contain one argument per line.

pex: error: unrecognized arguments: --some-flag=value

Use `--keep-sandboxes=on_failure` to preserve the process chroot for inspection.
NoneType: None

With practical usage (could be applied to any target, but showing something that actually can be easily verified to proof the flag value is being used), see below.

With --pex-cli-args="--no-compile":

$ PANTS_SOURCE=../pants pants --pex-cli-global-args="--no-compile" package cheeseshop/cli:cheeseshop-query
$ unzip -l dist/cheeseshop.cli/cheeseshop-query.pex | grep ".pyc"        
   268816  01-01-1980 00:00   .deps/charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl/charset_normalizer/md__mypyc.cpython-39-x86_64-linux-gnu.so

With --pex-cli-args="--compile":

$ PANTS_SOURCE=../pants pants --pex-cli-global-args="--compile" package cheeseshop/cli:cheeseshop-query
$ unzip -l dist/cheeseshop.cli/cheeseshop-query.pex | grep ".pyc" | tail -n 5                   
      824  01-01-1980 00:00   cheeseshop/repository/parsing/exceptions.pyc
      839  01-01-1980 00:00   cheeseshop/repository/properties.pyc
     1901  01-01-1980 00:00   cheeseshop/repository/query.pyc
     1474  01-01-1980 00:00   cheeseshop/repository/repository.pyc
      236  01-01-1980 00:00   cheeseshop/version.pyc

@AlexTereshenkov AlexTereshenkov added category:new feature backend: Python Python backend-related issues labels Jul 24, 2024
@AlexTereshenkov AlexTereshenkov force-pushed the pants/add-pex-cli-args-option branch 2 times, most recently from 48e0e9d to 2bb9b3e Compare July 24, 2024 21:31
@AlexTereshenkov AlexTereshenkov marked this pull request as ready for review July 25, 2024 11:57
@AlexTereshenkov AlexTereshenkov requested review from kaos and huonw July 25, 2024 11:58
Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this.

Pants invokes pex in a lot of different ways, e.g. pex ... to build pex files, but also various pex3 subcommands like pex3 lock create ... for pants generate-lockfiles. Does this flag get passed to all of them? If so, I wonder if that isn't the best, e.g. some args might make sense for pex only. Thoughts?

I've looked into having environment variables support for all flags that PEX has (pex-tool/pex#2242) first (so that no changes are needed in Pants, one would just export a bunch of env vars as needed), but this is not going to happen any time soon, so doing it in the Pants codebase instead is the only path forward, I reckon.

Fully agreed, Pants should try to avoid getting in users way, by being as thin a wrapper around the underlying tool as possible, and this means exposing whatever args it needs.

@@ -281,3 +281,31 @@ def test_run_script_from_3rdparty_dist_issue_13747() -> None:
result = run_pants(args)
result.assert_success()
assert SAY in result.stdout.strip()


def test_pass_extra_pex_cli_subsystem_args() -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be adding a mocked test to pex_cli_util.py, rather than only testing the pants run ... version of this. In particular, this ends up being a very heavy-weight test, because it's having to do a full run_pants call (level 4 in https://www.pantsbuild.org/2.23/docs/writing-plugins/the-rules-api/testing-plugins).

What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a unit test to make sure that Process object created does have the --pex-cli-args flags passed which exercises the modified rule (where those arguments are added). I think that having just unit test is probably enough since this is what verifies that arguments are going to be passed. How pex is going to handle them is probably outside of the scope being tested, but I think having this integration (in addition to the most recently added unit test) would be beneficial, but I am not sure about that. Happy to go either way.

@AlexTereshenkov AlexTereshenkov requested a review from huonw July 26, 2024 08:41
@AlexTereshenkov
Copy link
Member Author

Pants invokes pex in a lot of different ways, e.g. pex ... to build pex files, but also various pex3 subcommands like pex3 lock create ... for pants generate-lockfiles. Does this flag get passed to all of them? If so, I wonder if that isn't the best, e.g. some args might make sense for pex only. Thoughts?

That's right, it would be passed to whatever pex invocation. I think most often it would be specific to a Pants call being made, i.e. when running generating-lockfiles, one wouldn't expect to pass --no-compile. Some of the options I guess could be set in the pants.toml, but I would personally always pass them in Shell commands when invoking Pants as this is when they will make sense.

@sureshjoshi sureshjoshi self-requested a review July 29, 2024 22:24
@sureshjoshi
Copy link
Member

Pants invokes pex in a lot of different ways, e.g. pex ... to build pex files, but also various pex3 subcommands like pex3 lock create ... for pants generate-lockfiles. Does this flag get passed to all of them? If so, I wonder if that isn't the best, e.g. some args might make sense for pex only. Thoughts?

That's right, it would be passed to whatever pex invocation. I think most often it would be specific to a Pants call being made, i.e. when running generating-lockfiles, one wouldn't expect to pass --no-compile. Some of the options I guess could be set in the pants.toml, but I would personally always pass them in Shell commands when invoking Pants as this is when they will make sense.

I think this is fine as a step 1, given that all of the subsystem-level args are "global" in that sense - but maybe we should document/flag harder that users will likely want to add these at the command line, rather than in pants.toml?

Or maybe even naming them thusly ("pex-cli-global-args") - though I think this is kinda silly, now that I've written it out.

I think the alternative would be something closer to docker with "build_args", "run_args", etc? Eventually, I'd assume that each of the pex-related targets would have some sort of pex_cli_args

@sureshjoshi
Copy link
Member

I think the alternative would be something closer to docker with "build_args", "run_args", etc? Eventually, I'd assume that each of the pex-related targets would have some sort of pex_cli_args

Alright, with some time away, I'm now wondering if this might be a better overall approach.

e.g. from pex --help

  • pex options
  • pex resolver options
  • pex output options
  • pex target environment options
  • pex entry point options
  • pex global options

@AlexTereshenkov
Copy link
Member Author

AlexTereshenkov commented Jul 30, 2024

Thank you, @sureshjoshi!

Or maybe even naming them thusly ("pex-cli-global-args") - though I think this is kinda silly, now that I've written it out.

The help string produced

help=(
lambda subsystem_cls: softwrap(
f"""
Arguments to pass directly to {tool_name or subsystem_cls.name},
e.g. `--{subsystem_cls.options_scope}-args='{example}'`.{extra_help}
"""
)
),
default=default, # type: ignore[arg-type]
)

should be clear enough that these arguments are going to be passed to all pex invocations, I reckon? If that's not obvious, we could add the extra_help to tell this explicitly.

I think the alternative would be something closer to docker with "build_args", "run_args", etc? Eventually, I'd assume that each of the pex-related targets would have some sort of pex_cli_args

This is what has been done for some of the targets in the PRs I linked in the description. However, as soon as there's some PEX feature Pants users may want to start using (by providing a custom PEX version in their pants.toml version), there's really no way to pass the arguments to PEX without extending Pants with a plugin or even more likely making changes to the core.

So this PR makes it possible to pass the arguments globally when it makes sense (e.g. --compile), but I think running the commands with individual targets would be a more common practice until we have something more fine-grained:

pants --pex-cli-args="--flag=value" package //project:my-pex-binary 

So I think having a way to pass arguments "globally" would cover the needs of everyone (albeit sometimes forcing to have multiple pants invocations to pass different flags). Then, as additional feature, individual targets may have pex_args or something as the target property user can set if they want to pass something extra.

Alright, with some time away, I'm now wondering if this might be a better overall approach.

e.g. from pex --help

* pex options

* pex resolver options

* pex output options

* pex target environment options

* pex entry point options

* pex global options

I am not sure I follow what you mean with having those subsections, would you be able to elaborate just a bit?

@AlexTereshenkov
Copy link
Member Author

@huonw @sureshjoshi I am sorry I am not sure if you have any feedback for me since the last interaction we had in this PR? No rush, just checking if I have missed something.

@AlexTereshenkov AlexTereshenkov self-assigned this Aug 12, 2024
@AlexTereshenkov AlexTereshenkov force-pushed the pants/add-pex-cli-args-option branch from 2589e5c to ee2b123 Compare August 13, 2024 08:33
Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some doc tweaks

Alexey and I talked on slack and came to the conclusion that if we renamed this it makes this good enough to go ahead: the args doesn't emphasise (to me) that this will apply across all the variety of invocations of pex that pants does. We brainstormed names like overall-args, universal-args, global-args and Alexey wisely chose the last one 😄 (my other two are bad)

@AlexTereshenkov AlexTereshenkov merged commit e7d9891 into pantsbuild:main Aug 13, 2024
25 checks passed
@AlexTereshenkov AlexTereshenkov deleted the pants/add-pex-cli-args-option branch August 13, 2024 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Python Python backend-related issues category:new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants