Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REST API validate endpoint fails with unexpected error 500 #421

Closed
vdesabou opened this issue Dec 4, 2023 · 23 comments · Fixed by #426
Closed

REST API validate endpoint fails with unexpected error 500 #421

vdesabou opened this issue Dec 4, 2023 · 23 comments · Fixed by #426

Comments

@vdesabou
Copy link

vdesabou commented Dec 4, 2023

I noticed a difference of behaviour when REST API validate endpoint is used between 2.0.9 and 2.0.10+ versions

When missing mandatory is not set, validate in 2.0.9 returns expected results, i.e 200 OK with error_count set, etc..
With 2.0.10+ it now returns 500 error
The change of behaviour between 2.0.9 and 2.1.0 is caused by this change made in #365

My understanding is that validate is not supposed to return error 500 but rather 200 with error_count set.

@vdesabou vdesabou changed the title /validate fails with unexpected error 500 REST API validate endpoint fails with unexpected error 500 Dec 4, 2023
@rishabhbits038
Copy link

@VihasMakwana
Could you please prioritize this? This is a breaking change

@elekanne
Copy link

elekanne commented Mar 6, 2024

I do have a customer where they used this connector already, but due to this error they removed it from the available connectors and now application teams are waiting before they can start sending to Splunk. For this customer it is urgent as we have some Splunk use casus with this connector that has to be finished before 1st April.

@VihasMakwana
Copy link
Contributor

@elekanne @rishabhbits038 you can set splunk.validation.disable to true and the connector will behave like previous versions.

@vdesabou
Copy link
Author

vdesabou commented Mar 6, 2024

@VihasMakwana I've tested your suggestion but unfortunately it does not work for me

[2024-03-06 17:05:42,540] ERROR Uncaught exception in REST call to /connector-plugins/com.splunk.kafka.connect.SplunkSinkConnector/config/validate (org.apache.kafka.connect.runtime.rest.errors.ConnectExceptionMapper:64)
java.util.concurrent.ExecutionException: org.apache.kafka.common.config.ConfigException: Missing required configuration "splunk.hec.uri" which has no default value.
	at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:123)
	at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:115)
	at org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource.validateConfigs(ConnectorPluginsResource.java:109)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:134)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:177)
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:81)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234)
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:311)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:554)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234)
	at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:181)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
	at org.eclipse.jetty.server.Server.handle(Server.java:516)
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:137)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.common.config.ConfigException: Missing required configuration "splunk.hec.uri" which has no default value.
	at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:518)
	at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:508)
	at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:112)
	at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:132)
	at com.splunk.kafka.connect.SplunkSinkConnectorConfig.<init>(SplunkSinkConnectorConfig.java:274)
	at com.splunk.kafka.connect.SplunkSinkConnector.validateSplunkConfigurations(SplunkSinkConnector.java:150)
	at com.splunk.kafka.connect.SplunkSinkConnector.validate(SplunkSinkConnector.java:112)
	at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:592)
	at org.apache.kafka.connect.runtime.AbstractHerder.lambda$validateConnectorConfig$6(AbstractHerder.java:470)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	... 1 more

As you can see from stack trace, it does not even reach the if (connectorConfig.disableValidation) check:

	at com.splunk.kafka.connect.SplunkSinkConnectorConfig.<init>(SplunkSinkConnectorConfig.java:274)
	at com.splunk.kafka.connect.SplunkSinkConnector.validateSplunkConfigurations(SplunkSinkConnector.java:150)

I guess the check should be done earlier? Before calling validateSplunkConfigurations

What do you think ?

@VihasMakwana
Copy link
Contributor

VihasMakwana commented Mar 7, 2024

@vdesabou @ludovic-boutros @rishabhbits038 The proposed fix bypasses mandatory field validation, which will likely cause issues later during runtime.
Let's take an example here.

  • This error arises due to missing required param, not due to splunk validation failure.
  • Let's say we bypass this check anyhow and splunk.hec.uri is set to ""
  • In normal circumstances, we expect that there should be at least one splunk.hec.uri available so that we can post data to Splunk.
  • However, we've set splunk.hec.uri to "" in this example and what happens now? An exception arises as follows:
    • new HecException("No channels are available / registered with LoadBalancer")
  • Data is not sent to Splunk. You'd need to restart this connector instance after providing valid URI.

Please take a look at the required parameters here.
If we don't set URI or token or index and bypass the check, this would cause an issue in runtime and events will not be sent to Splunk.

Let me know what you think.

@vdesabou
Copy link
Author

vdesabou commented Mar 7, 2024

The problem is that validate endpoint is not supposed to return anything apart 200 with error_count being set, this was the case with earlier versions (here 2.0.9) and users want to get same behaviour as before:

CleanShot 2024-03-07 at 15 06 07

@rishabhbits038
Copy link

rishabhbits038 commented Mar 7, 2024

@VihasMakwana
There are 2 issues here:

  1. Like @vdesabou said, in case a mandatory field, the return code should've been 200 instead of 500 with the error count being set. I think this is what the root cause is. Connector's validate shouldn't return a RuntimeException
    If 1 is fixed, we shouldn't need to use splunk.validation.disable.
  2. With splunk.validation.disable, validations are still kicking in, for which change by @ludovic-boutros should help. Its a different issue that the error would be thrown later during runtime

@rishabhbits038
Copy link

rishabhbits038 commented Mar 7, 2024

The issue is that a unhandledConfig exception is being thrown.
The error message should have been set in the ConfigDef corresponding to hec.uri like this

It wasn't happening with an older version because we were returning return new Config(validations); inside the public Config validate(final Map<String, String> connectorConfigs) { method.

But with the new change, we are ignoring the result of validations = config.configValues(); and proceeding ahead with connectivity check even when the field is not present.

I would suggest we only proceed with validity checks if there are no validation errors from Config config = super.validate(connectorConfigs);.
When there are no configuration errors, we should proceed with validateSplunkConfigurations and handle the connectivity exception by updating the configDef fields URI_CONF and TOKEN_CONF with the appropriate error message.

@elekanne
Copy link

I know there was weekend in between, but this is a blocking issue for the customer that Vincent @vdesabou is also (indirectly) working form. So any update on this would be appreciated.

@VihasMakwana
Copy link
Contributor

@elekanne @vdesabou I've merged the PR.
I will make a patch release soon.

@ludovic-boutros
Copy link
Contributor

@VihasMakwana, thank you for the merged PR.
BTW, this PR just addressed the second issue explained by @rishabhbits038:
With splunk.validation.disable, validations are still kicking in, for which change by @ludovic-boutros should help. Its a different issue that the error would be thrown later during runtime.
The first one still needs additional work, as explained here.

@vdesabou
Copy link
Author

I've tested merged PR with "splunk.validation.disable": "true" and get back to previous behaviour:

curl  -X PUT -H "Content-Type: application/json" --data @/var/folders/kk/sv72bs_s7xn3kz_ggxfmm4yh0000gp/T/pg-XXXXXXXXXX.5ROI4KivWE/connector_new.json http://localhost:8083/connector-plugins/com.splunk.kafka.connect.SplunkSinkConnector/config/validate
{
  "name": "com.splunk.kafka.connect.SplunkSinkConnector",
  "error_count": 1,
  "groups": [
    "Common",
    "Transforms",
    "Predicates",
    "Error Handling"
  ],
  "configs": [
    {

...

    {
      "definition": {
        "name": "splunk.hec.uri",
        "type": "STRING",
        "required": true,
        "default_value": null,
        "importance": "HIGH",
        "documentation": "Splunk HEC URIs. Either a list of FQDNs or IPs of all Splunk indexers, separated with a \",\", or a load balancer. The connector will load balance to indexers using round robin. Splunk Connector will round robin to this list of indexers. https://hec1.splunk.com:8088,https://hec2.splunk.com:8088,https://hec3.splunk.com:8088",
        "group": null,
        "width": "NONE",
        "display_name": "splunk.hec.uri",
        "dependents": [],
        "order": -1
      },
      "value": {
        "name": "splunk.hec.uri",
        "value": null,
        "recommended_values": [],
        "errors": [
          "Missing required configuration \"splunk.hec.uri\" which has no default value."
        ],
        "visible": true
      }
    },
}

@elekanne
Copy link

@vdesabou Do we now have a temporary setup that works when validation.disable is set or do we need to wait for the the real issue to be solved. If so @VihasMakwana what is needed and when do you think we can get that?

@VihasMakwana
Copy link
Contributor

@elekanne I think @vdesabou can work with a workaround for now. I can't commit to a specific date but I will raise a PR shortly.

@vdesabou
Copy link
Author

I've compiled and tested with "splunk.validation.disable": "true" but the end user would still need to have an official release.

@VihasMakwana are you planning to do a release that includes the current merged PR ?

@VihasMakwana
Copy link
Contributor

@rishabhbits038 @vdesabou I've raised a PR #426 to resolve the remaining issue.
Please take a look

@VihasMakwana
Copy link
Contributor

@vdesabou

@VihasMakwana are you planning to do a release that includes the current merged PR ?

Yes.

@vdesabou
Copy link
Author

@vdesabou

@VihasMakwana are you planning to do a release that includes the current merged PR ?

Yes.

Any idea when the release will be available ?

@VihasMakwana
Copy link
Contributor

@vdesabou maybe by today itself. I merged the other PR as well.

@VihasMakwana
Copy link
Contributor

@vdesabou
Copy link
Author

Thanks a lot. We are planning to upload it to confluent hub as well

@andyfinlay
Copy link

Hello @VihasMakwana I am following up to ask if there will be a permanent solution for this issue without the use of "splunk.validation.disable": "true"?
I understand that the property "splunk.validation.disable": "true" will prevent the validation error being thrown and a 500 being wrongfully returned, but wondered if a future version (and sorry if I have missed one) will be released where that will be addressed.
Is it fixed in 2.2.2?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants