Multiple steampipe cli executions share environment information #4155

Open
jlm0x017 opened this issue Feb 28, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@jlm0x017

jlm0x017 commented Feb 28, 2024

Describe the bug
If I start a steampipe query session in one terminal window, its environment is used by steampipe query executions in other terminals.

Steampipe version (steampipe -v)
Steampipe v0.21.8

To reproduce
  1. Open two terminal windows.
  2. In the first, run AWS_PROFILE=env1 steampipe query and leave it open.
  3. In the second, run AWS_PROFILE=env2 steampipe query "select count(*) from aws_ec2_instance;". The instance count returned is for env1.

Kill the client in window 1 and repeat the query in window 2; it now returns the count for env2.

Expected behavior
I would expect simultaneous steampipe runs to use their respective environment variables

Additional context
The second window will successfully execute steampipe query "select count(*) from env1.aws_ec2_instance;", providing the count for env1. However, this masks the problem and is an inconvenient workaround. If I'm writing multiple joins, I'd rather not specify the environment on each table name; I'd like the local value of AWS_PROFILE to be honored instead.

@e-gineer
Contributor

This is not ideal, but is a side effect of a design decision we made for ease of use.

Internally there are two parts to the Steampipe CLI - a client and a server.

When you run steampipe service start you are starting the server alone, making it available for multiple steampipe clients and postgres tools to connect to. The configuration of permissions and connections is done at a server level.

When you run steampipe query we:

  1. Try to connect to an existing steampipe service that is running.
  2. If there is no service, start a "temporary service".
  3. Connect to the running service.
  4. Run our query.
  5. Disconnect our client.
  6. Stop the "temporary service" if we were the last client.

This approach makes steampipe query easy to use and allows you to run multiple clients in parallel. The negative effect is that the "temporary service" configuration is based on the first client to start it. This is what you are seeing.
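
The lifecycle above can be sketched as a toy model. This is not Steampipe's code: the names Host and TemporaryService are hypothetical, and the only thing modelled is that the service keeps the environment of whichever client started it.

```python
from dataclasses import dataclass


@dataclass
class TemporaryService:
    """Toy stand-in for the shared backend (hypothetical name)."""
    env: dict          # environment captured from the client that started it
    clients: int = 0   # number of currently connected clients


class Host:
    """Models one machine on which at most one service runs at a time."""

    def __init__(self):
        self.service = None

    def connect(self, client_env):
        # Steps 1-3: reuse a running service, otherwise start a temporary
        # one configured from this client's environment.
        if self.service is None:
            self.service = TemporaryService(env=dict(client_env))
        self.service.clients += 1
        return self.service

    def disconnect(self):
        # Steps 5-6: the last client to leave stops the temporary service.
        self.service.clients -= 1
        if self.service.clients == 0:
            self.service = None


host = Host()
first = host.connect({"AWS_PROFILE": "env1"})    # window 1, left open
second = host.connect({"AWS_PROFILE": "env2"})   # window 2 reuses the service
print(second.env["AWS_PROFILE"])                 # the service still holds env1

host.disconnect()
host.disconnect()                                # last client gone, service stops
third = host.connect({"AWS_PROFILE": "env2"})    # a fresh service is started
print(third.env["AWS_PROFILE"])                  # now env2
```

In this model the second client's AWS_PROFILE is never consulted, which matches the behaviour reported in the issue.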

If you are using multiple AWS profiles, I highly recommend setting them up as separate connections in Steampipe. You can then reuse the same service and query either of them just by changing the search path: https://steampipe.io/docs/guides/search-path

But, as a workaround if you really want environment variable based control, you can use the --install-dir and --port arguments to start a second steampipe service in parallel. You should definitely check out workspaces if you want to do this sort of thing - they make the configuration a lot easier and more flexible.

@jlm0x017
Author

This is perfectly understandable once you know how the product works. But for the user who doesn't know, or is concentrating on the work (rather than the tool), this behaviour is surprising and frustrating. As part of the core design, I recognize this will be hard to change, but if it cannot be changed easily, then consider ways to make the behaviour more visible.

Potential ideas:

  • Call out current settings at the top of the console when using steampipe query
  • Call out the reuse of existing servers (and their process IDs) when querying.
  • Create an easy ".info" command that shows the current profile and connection settings to support troubleshooting/discoverability of reuse
  • Have the steampipe session check that its environment matches the server instance's environment for these behaviour-controlling values
  • Enable time-out of backend servers

This last one especially resonates with my experience, as I troubleshot the inconsistent results for 2+ hours. The forgotten console was three hours old, and a 5m timeout on the backend server would have shut down the conflicting server before the behaviour even appeared.
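
As an illustration of the environment check suggested in the list above, here is a minimal sketch. It assumes the client could somehow learn the environment the service was started with; nothing in this thread says Steampipe exposes that. The function name env_mismatches and the WATCHED_VARS tuple are hypothetical.

```python
import os

# Illustrative list of behaviour-controlling variables.
# Only AWS_PROFILE is taken from this issue.
WATCHED_VARS = ("AWS_PROFILE",)


def env_mismatches(server_env, client_env=None, watched=WATCHED_VARS):
    """Return (name, server_value, client_value) for each watched variable
    whose value differs between the service and the client."""
    client_env = os.environ if client_env is None else client_env
    return [
        (name, server_env.get(name), client_env.get(name))
        for name in watched
        if server_env.get(name) != client_env.get(name)
    ]


# Example: the service was started from a shell with env1, this shell has env2.
for name, server_value, client_value in env_mismatches(
    {"AWS_PROFILE": "env1"}, {"AWS_PROFILE": "env2"}
):
    print(
        f"warning: reusing a service started with {name}={server_value!r}; "
        f"this shell has {name}={client_value!r}"
    )
```

A warning like this would not change which profile is used, but it would surface the reuse at the moment it matters.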

@shaicoleman

shaicoleman commented Apr 23, 2024

I think steampipe should listen on a random local socket on each execution by default for standalone queries.

This solves a few issues:

  • Allows concurrent execution
  • Each execution gets its own environment variables
  • Works even if port 9193 is in use
  • Does not require a listening port on localhost, which is more secure
  • Allows setting user permissions on the socket, which is more secure
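
To make the socket idea concrete, here is a generic sketch of a per-execution Unix domain socket with owner-only permissions. It assumes a POSIX system and shows only plain socket handling; it is not Steampipe code, and the function name open_private_socket is hypothetical.

```python
import os
import socket
import stat
import tempfile


def open_private_socket():
    """Create a listening Unix socket at a fresh, unpredictable path."""
    # mkdtemp creates a new directory accessible only by the current user.
    directory = tempfile.mkdtemp(prefix="sp-sketch-")
    path = os.path.join(directory, "query.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)        # no TCP port on localhost is involved
    os.chmod(path, 0o600)    # only the owner may connect
    server.listen(1)
    return server, path


server, path = open_private_socket()
print(path)
print(oct(stat.S_IMODE(os.stat(path).st_mode)))

# Clean up the socket file and its directory when the execution ends.
server.close()
os.unlink(path)
os.rmdir(os.path.dirname(path))
```

Because every execution gets its own path, two runs never collide on a port, and access is controlled through ordinary file permissions.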

@e-gineer
Contributor

e-gineer commented May 1, 2024

Unfortunately @shaicoleman, running two instances of steampipe at the same time (e.g. on different ports) means running two postgres instances simultaneously. This is only possible with two separate installation directories to store all of the postgres data files, configuration, etc. Doing this on demand would take time to set up and would create a lot of noise and extra storage on the machine.

Providing more feedback through the UI about reuse as @jlm0x017 suggests seems like the best option at this point I think?

3 participants