Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17690 Make solr zk tools read ZK_HOST environment #3240

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

janhoy
Copy link
Contributor

@janhoy janhoy commented Mar 5, 2025

@janhoy
Copy link
Contributor Author

janhoy commented Mar 5, 2025

Refactored a bit, to be able to use the --zk-host foo --> ZK_HOST=foo fallback other places that attempt to read only the option.

Activated the fix since the Crave tests don't run anyway to show the failing test. I'll run all the tests locally.

@janhoy
Copy link
Contributor Author

janhoy commented Mar 5, 2025

I ran crave locally since it does not work in gh-actions:

../crave run
Downloading update https://github.com/accupara/crave/releases/download/0.2-7064/crave-0.2-7064-darwin-amd64.bin
Selecting project Solr (id:39)
Picking up local changes (if any)...
 3 files changed, 25 insertions(+), 10 deletions(-)
No build commands specified
Running preconfigured build commands
./gradlew --console=plain test
To run custom build commands use
crave [options] run [options] -- "customCommand1; customCommand2; ..."
Waiting for build id:185108 to start...

Services in this job can be accessed using the following:

Service        Remote  Access URL
-----------  --------  ----------------------------
shellinabox      5898  shellinabox://localhost:5898
ssh                22  ssh://localhost:41528
vscode           5899  vscode://localhost:5899
Setting up workspace (this could take some time)...
Pulling container image accupara/openjdk:21...
Finished pulling container image accupara/openjdk@sha256:8f36c8fd41df83a9773ac84785b5b16d29fd4eb5c94c1cc49327785bddca183e took 1.088967291s
Switched to a new branch 'feature/SOLR-17690-zkToolZKHostFromEnv'
setting commitID to e668dceb6bec0b838d5f1e045483432abd6b6612
From /home/admin/.craved/GUnip8pjhjmR-561c264fd117b4e8a8d25c057caef7c71e43099d.patch.gz.b64
 * branch                    HEAD       -> FETCH_HEAD
Updating e668dceb6be..1580a6c19c6
Fast-forward
 .../src/java/org/apache/solr/cli/CLIUtils.java     | 25 ++++++++++++++++++----
 .../java/org/apache/solr/cli/RunExampleTool.java   |  4 ++--
 .../org/apache/solr/cli/ZkSubcommandsTest.java     |  6 ++----
 3 files changed, 25 insertions(+), 10 deletions(-)

Starting a Gradle Daemon, 1 incompatible and 1 stopped Daemons could not be reused, use --status for details

[--snip--]
> Task :solr:core:wipeTaskTemp
The slowest tests (exceeding 500 ms) during this run:
  149.58s BasicAuthIntegrationTest.testBasicAuth (:solr:core)
  70.52s PeerSyncReplicationTest.test (:solr:core)
  66.55s TestDistributedSearch.test (:solr:core)
  65.63s ReplicationFactorTest.test (:solr:core)
  64.00s BasicDistributedZkTest.test (:solr:core)
  57.93s TestCpuAllowedLimit.testDistribLimit (:solr:core)
  54.98s BasicDistributedZk2Test.test (:solr:core)
  51.94s SyncSliceTest.test (:solr:core)
  50.02s UnloadDistributedZkTest.test (:solr:core)
  45.63s SolrAndKafkaReindexTest.testFullCloudToCloud (:solr:cross-dc-manager)
The slowest suites (exceeding 1s) during this run:
  149.65s BasicAuthIntegrationTest (:solr:core)
  113.65s SchemaTest (:solr:solrj)
  111.93s TestRecovery (:solr:core)
  107.73s TestCpuAllowedLimit (:solr:core)
  96.31s MultiThreadedOCPTest (:solr:core)
  89.91s TestCollectionAPI (:solr:core)
  85.43s TestPullReplica (:solr:core)
  82.79s TestCoordinatorRole (:solr:core)
  77.73s StreamDecoratorTest (:solr:solrj-streaming)
  75.33s PeerSyncReplicationTest (:solr:core)

BUILD SUCCESSFUL in 6m 32s
146 actionable tasks: 142 executed, 4 up-to-date

------------------------------------------------------------------------
Build Successful

Total time: 7m22.5s
------------------------------------------------------------------------

@janhoy
Copy link
Contributor Author

janhoy commented Mar 5, 2025

Hmm, there's no 9.8.1 section in CHANGES.txt on main, and not on branch_9x, only on branch_9_8.
Is the correct way for this bugfix to skip CHANGES here and on 9x, and then add one when cherry-picking to the release branch? Then the 9.8.1 RM will reconsile changes to other branches during release?

Here is suggested changes text:

* SOLR-17690: Make solr CLI tools read ZK_HOST environment as documented. There was a regression 
  in 9.8.0 where this did not work, causing issues also for solr-operator v0.9 (janhoy)

@HoustonPutman
Copy link
Contributor

HoustonPutman commented Mar 5, 2025

Hmm, there's no 9.8.1 section in CHANGES.txt on main, and not on branch_9x, only on branch_9_8. Is the correct way for this bugfix to skip CHANGES here and on 9x, and then add one when cherry-picking to the release branch? Then the 9.8.1 RM will reconsile changes to other branches during release?

Here is suggested changes text:

* SOLR-17690: Make solr CLI tools read ZK_HOST environment as documented. There was a regression 
  in 9.8.0 where this did not work, causing issues also for solr-operator v0.9 (janhoy)

So add it to the changelog for 9.9, and then in branch_9_8, add it to 9.8.1. As a part of the release, I remove the duplicates from 9.9 on branch_9x and main.

Copy link
Contributor

@HoustonPutman HoustonPutman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's lots of other usages of cli.getOptionValue(CommonCLIOptions.ZK_HOST_OPTION) so I might go through and see what else needs to be switched to this method.

Otherwise I like how it works.

* @param cli the command line
* @return the ZooKeeper connection string or null if not configured
*/
public static String getZkHostFromCliOrEnv(CommandLine cli) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we look more closely at how CLIUtils. getDefaultSolrUrl works? It feels like we need to make the looking up of ZK and Solr URL more aligned. Use similiar names for the methods and similar patterns for the look ups.

Copy link
Contributor Author

@janhoy janhoy Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another look at the CommonCLIOptions. You are correct that the SOLR_URL_OPTION documents that it defaults to an URL defined by a set of sysProps. But not all tools do fallback in the same way, e.g. CreateTool looks up a live node in zk instead.

The ZK_HOST_OPTION main help text says it defaults to localhost:9983 but many tools uses getZkHost() which queries solr to find zkHost, but that is really not documented in help docs.

This PR brings tools in line with ZK_HOST_OPTION help docs

"Zookeeper connection string; unnecessary if ZK_HOST is defined in solr.in.sh;

But the last part

"otherwise, defaults to localhost:9983"

is still largely wrong for many tools, since we actually try to look it up via Solr APIs. We could change that help text to align with reality but I have not surveyed every single use, have not seen what refGuide says etc, so let it be for a followup PR.

Copy link
Contributor

@epugh epugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted an email to dev@ on why in general I think the use of environment variables in our scripts is a bad idea.

So this review isn't about if this is a good or bad idea, but only about how we are actually supporting the ZK_HOST variable.

Copy link
Contributor

@malliaridis malliaridis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the changes. I feel that the users may be overwhelmed with the amount of options we give them for defining a property (CLI option, env vars and file), which makes our code more complex as well, as we have to cover each one individually and have a prioritization behind the scenes, covered in both Java classes and scripts. But that's another story.

Perhaps the alternative, "simplified" way to address this is to pass EnvUtils.getProperty("zkHost") in
cli.getOptionValue(CommonCLIOptions.ZK_HOST_OPTION, EnvUtils.getProperty("zkHost"), so no new methods are needed. But both ways should be fine if they adress the issue and the method is a bit more flexible in these scenarios. :)

I was also not aware that the operator would be affected by changes in the CLI, would be nice to have those covered as well during our refactoring. 😅

@HoustonPutman
Copy link
Contributor

So I think there is good discussion to be had here about the use of EnvVars in the SolrCLI logic (which currently we do use for default Solr URLs). But in the meantime, I do believe that there is a fix for apache/solr-operator#762 that will support 9.8.0 and 9.8.1 even if this PR isn't included.

I think that's probably the way to go, so we can spend the time and come up with the best strategy for CLI EnvVars.

@janhoy
Copy link
Contributor Author

janhoy commented Mar 5, 2025

I feel that the users may be overwhelmed with the amount of options we give them for defining a property (CLI option, env vars and file), which makes our code more complex as well

Adding ZK_HOST=foo to solr.in.sh has the same effect as adding export ZK_HOST=foo to your shell. It comes very handy when e.g. running Solr as a container and you can configure a lot with just env instead of having to patch the CMD part. The CLI tools like zk cp behaves the same as when starting solr itself.

So you can set SOLR_URL or ZK_HOST variable and then run much shorter commands to work with your cluster.

Perhaps we could do a better job in SolrCLI of wrapping such common behavior in more generic utils like

CLIUtils#getOptionOrPropValue(CommandLine cli, Option option, String sysprop, String default)

e.g.

CLIUtils.getOptionOrPropValue(cli, CommonCLIOptions.ZK_HOST_OPTION, "zkHost", null)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants