[Bug] In Triple protocol , parameter retries does not take effect #14139

Closed
4 tasks done
guipengfei opened this issue Apr 28, 2024 · 1 comment
Labels
component/need-triage, type/need-triage

Comments

@guipengfei

Pre-check

  • I am sure that all the content I provide is in English.

Search before asking

  • I had searched in the issues and found no similar issues.

Apache Dubbo Component

Java SDK (apache/dubbo)

Dubbo Version

Dubbo Java 3.1.11, OpenJDK 17

Steps to reproduce this issue

1. In the consumer code, retries is set to 4 and the tri protocol is used:

@DubboReference(retries = 4, url = "tri://127.0.0.1:21021")
 private IDemoService demoService;

2. When an interface call times out, the following error is reported and no retry takes place:

org.apache.dubbo.rpc.StatusRpcException: DEADLINE_EXCEEDED : Waiting server-side response timeout by scan timer. start time: 2024-04-28 18:06:52.836, end time: 2024-04-28 18:06:54.908, timeout: 2000 ms, service: com.xx.xx.xx.IDemoService, method: queryUser
	at org.apache.dubbo.rpc.TriRpcStatus.asException(TriRpcStatus.java:214)
	at org.apache.dubbo.rpc.protocol.tri.DeadlineFuture$TimeoutCheckTask.notifyTimeout(DeadlineFuture.java:183)
	at org.apache.dubbo.rpc.protocol.tri.DeadlineFuture$TimeoutCheckTask.lambda$run$0(DeadlineFuture.java:169)
	at org.apache.dubbo.common.threadpool.ThreadlessExecutor$RunnableWrapper.run(ThreadlessExecutor.java:184)
	at org.apache.dubbo.common.threadpool.ThreadlessExecutor.waitAndDrain(ThreadlessExecutor.java:103)
	at org.apache.dubbo.rpc.AsyncRpcResult.get(AsyncRpcResult.java:194)
	at org.apache.dubbo.rpc.protocol.AbstractInvoker.waitForResultIfSync(AbstractInvoker.java:266)
	at org.apache.dubbo.rpc.protocol.AbstractInvoker.invoke(AbstractInvoker.java:186)
	at org.apache.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:71)
	at com.cxmt.cnap.common.dubbo.core.filter.DubboTraceFilter.invoke(DubboTraceFilter.java:42)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CallbackRegistrationInvoker.invoke(FilterChainBuilder.java:194)
	at org.apache.dubbo.rpc.protocol.ReferenceCountInvokerWrapper.invoke(ReferenceCountInvokerWrapper.java:78)
	at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invokeWithContext(AbstractClusterInvoker.java:379)
	at org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:81)
	at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:341)
	at org.apache.dubbo.rpc.cluster.router.RouterSnapshotFilter.invoke(RouterSnapshotFilter.java:46)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:100)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.rpc.protocol.dubbo.filter.FutureFilter.invoke(FutureFilter.java:52)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.rpc.cluster.filter.support.ConsumerClassLoaderFilter.invoke(ConsumerClassLoaderFilter.java:40)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.rpc.cluster.filter.support.ConsumerContextFilter.invoke(ConsumerContextFilter.java:120)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CopyOfFilterChainNode.invoke(FilterChainBuilder.java:327)
	at org.apache.dubbo.rpc.cluster.filter.FilterChainBuilder$CallbackRegistrationInvoker.invoke(FilterChainBuilder.java:194)
	at org.apache.dubbo.rpc.cluster.support.wrapper.AbstractCluster$ClusterFilterInvoker.invoke(AbstractCluster.java:92)
	at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:103)
	at org.apache.dubbo.rpc.proxy.InvocationUtil.invoke(InvocationUtil.java:57)
	at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:75)
	at com.cxmt.cnap.kernel.permission.api.IUserServiceDubboProxy8.queryUser(IUserServiceDubboProxy8.java)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:208)
3. The relevant code in org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker:
public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
        List<Invoker<T>> copyInvokers = invokers;
        checkInvokers(copyInvokers, invocation);
        String methodName = RpcUtils.getMethodName(invocation);
        // The retry count is read correctly here: len = 5 (retries = 4 plus the first call)
        int len = calculateInvokeTimes(methodName);
        // retry loop.
        RpcException le = null; // last exception.
        List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyInvokers.size()); // invoked invokers.
        Set<String> providers = new HashSet<String>(len);
        for (int i = 0; i < len; i++) {
            //Reselect before retry to avoid a change of candidate `invokers`.
            //NOTE: if `invokers` changed, then `invoked` also lose accuracy.
            if (i > 0) {
                checkWhetherDestroyed();
                copyInvokers = list(invocation);
                // check again
                checkInvokers(copyInvokers, invocation);
            }
            Invoker<T> invoker = select(loadbalance, invocation, copyInvokers, invoked);
            invoked.add(invoker);
            RpcContext.getServiceContext().setInvokers((List) invoked);
            boolean success = false;
            try {
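                // When a call times out, the dubbo protocol throws an exception, which is caught
                // below, and the loop moves on to the next attempt. The tri protocol instead returns
                // a normal AsyncRpcResult object, which is returned below and ends the loop.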
                Result result = invokeWithContext(invoker, invocation);
                if (le != null && logger.isWarnEnabled()) {
                    logger.warn(CLUSTER_FAILED_MULTIPLE_RETRIES,"failed to retry do invoke","","Although retry the method " + methodName
                        + " in the service " + getInterface().getName()
                        + " was successful by the provider " + invoker.getUrl().getAddress()
                        + ", but there have been failed providers " + providers
                        + " (" + providers.size() + "/" + copyInvokers.size()
                        + ") from the registry " + directory.getUrl().getAddress()
                        + " on the consumer " + NetUtils.getLocalHost()
                        + " using the dubbo version " + Version.getVersion() + ". Last error is: "
                        + le.getMessage(),le);
                }
                success = true;
                // With the tri protocol the method returns here, so retries does not take effect
                return result;
            } catch (RpcException e) {
                if (e.isBiz()) { // biz exception.
                    throw e;
                }
                le = e;
            } catch (Throwable e) {
                le = new RpcException(e.getMessage(), e);
            } finally {
                if (!success) {
                    providers.add(invoker.getUrl().getAddress());
                }
            }
        }
        throw new RpcException(le.getCode(), "Failed to invoke the method "
                + methodName + " in the service " + getInterface().getName()
                + ". Tried " + len + " times of the providers " + providers
                + " (" + providers.size() + "/" + copyInvokers.size()
                + ") from the registry " + directory.getUrl().getAddress()
                + " on the consumer " + NetUtils.getLocalHost() + " using the dubbo version "
                + Version.getVersion() + ". Last error is: "
                + le.getMessage(), le.getCause() != null ? le.getCause() : le);
    }
4. As shown above, when an interface call times out under the tri protocol, invokeWithContext(invoker, invocation) in FailoverClusterInvoker returns a normal AsyncRpcResult object, so doInvoke returns immediately. Is this why the retry count does not take effect?

5. Question: is it a bug or a feature that the retry count does not take effect under the tri protocol? If it is a feature, the official documentation does not mention it.
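
The behaviour described in steps 3 and 4 can be illustrated with a minimal, self-contained sketch. This is a simplified model and not Dubbo's code: `FailoverRetrySketch`, `Result`, and `invokeWithRetries` are hypothetical names introduced here, and the only assumption carried over from the report is that the loop retries solely when the call throws.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Simplified model of a failover retry loop; not Dubbo's actual code.
class FailoverRetrySketch {

    /** Hypothetical stand-in for a result object that can carry an error instead of throwing it. */
    static final class Result {
        final Throwable error;
        Result(Throwable error) { this.error = error; }
        boolean hasException() { return error != null; }
    }

    /** Retries only when the call throws; any returned Result ends the loop. */
    static Result invokeWithRetries(Supplier<Result> call, int retries) {
        int len = retries + 1; // retries = 4 allows up to 5 attempts
        RuntimeException last = null;
        for (int i = 0; i < len; i++) {
            try {
                return call.get(); // returned even if the Result carries an error
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw new RuntimeException("Tried " + len + " times", last);
    }

    /** Counts attempts when every call throws a timeout-like exception. */
    static int attemptsWhenThrowing(int retries) {
        AtomicInteger attempts = new AtomicInteger();
        try {
            invokeWithRetries(() -> {
                attempts.incrementAndGet();
                throw new IllegalStateException("timeout");
            }, retries);
        } catch (RuntimeException expected) {
            // all attempts failed, as intended for this demo
        }
        return attempts.get();
    }

    /** Counts attempts when every call returns a Result that wraps the timeout. */
    static int attemptsWhenReturningFailedResult(int retries) {
        AtomicInteger attempts = new AtomicInteger();
        invokeWithRetries(() -> {
            attempts.incrementAndGet();
            return new Result(new IllegalStateException("timeout"));
        }, retries);
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println(attemptsWhenThrowing(4));              // prints 5
        System.out.println(attemptsWhenReturningFailedResult(4)); // prints 1
    }
}
```

In this model a timeout that surfaces as a thrown exception uses all five attempts, while a timeout that comes back wrapped in a returned result stops after the first one, which matches the symptom reported above.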

What you expected to happen

As with the dubbo protocol, the retry count set on the consumer side should take effect.

Anything else

No response

Are you willing to submit a pull request to fix on your own?

  • Yes I am willing to submit a pull request on my own!

Code of Conduct

guipengfei added the component/need-triage and type/need-triage labels on Apr 28, 2024
@walklown

walklown commented May 7, 2024

  1. DubboInvoker and TripleInvoker both return an AsyncRpcResult. Whether to wait synchronously is decided afterwards, in org.apache.dubbo.rpc.protocol.AbstractInvoker#waitForResultIfSync, so the return type is not the cause of the problem.
  2. The real cause:
    2.1. DubboInvoker uses a blocking model. 'asyncResult.get(timeout, TimeUnit.MILLISECONDS)' in waitForResultIfSync throws an exception when it times out, so retries work as expected.
    2.2. TripleInvoker actively stops the request when it times out (see DeadlineFuture). 'asyncResult.get(timeout, TimeUnit.MILLISECONDS)' in waitForResultIfSync therefore never times out: the request has already been stopped, with a failed status, before the wait expires, so no exception is thrown and no retry happens.
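
The difference between 2.1 and 2.2 can be modelled with plain java.util.concurrent. This is a rough sketch under stated assumptions, not Dubbo's implementation: `TimeoutModelSketch` and its methods are hypothetical, the deadline timer is simulated with `CompletableFuture.delayedExecutor`, it fires at half the wait budget only to make the ordering unambiguous, and the failed status is represented by a plain string.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Simplified model of the two timeout styles; not Dubbo's actual code.
class TimeoutModelSketch {

    /** Waits for the response and reports whether the wait returned a value or threw. */
    static String await(CompletableFuture<String> response, long timeoutMs) {
        try {
            return "returned: " + response.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "thrown: TimeoutException"; // a retry loop further up would see an exception
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Blocking model: nothing completes the future, so the timed get itself throws. */
    static String blockingModel(long timeoutMs) {
        return await(new CompletableFuture<>(), timeoutMs);
    }

    /** Deadline model: a timer completes the future with a failed status before the wait expires. */
    static String deadlineModel(long timeoutMs) {
        CompletableFuture<String> response = new CompletableFuture<>();
        // Assumption for illustration: the timer fires at half the wait budget.
        CompletableFuture.delayedExecutor(timeoutMs / 2, TimeUnit.MILLISECONDS)
                .execute(() -> response.complete("DEADLINE_EXCEEDED"));
        return await(response, timeoutMs);
    }

    public static void main(String[] args) {
        System.out.println(blockingModel(200));
        System.out.println(deadlineModel(2000));
    }
}
```

In the blocking model the caller observes an exception, which a retry loop can catch. In the deadline model the wait returns normally with a failed status, so nothing is thrown unless the status is checked and converted afterwards.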

This issue has been fixed in 3.2.x; see revision 0f7a62a. Upgrading to 3.2.x is recommended. Hope this helps.


@AlbumenJ AlbumenJ closed this as completed May 8, 2024
3 participants