-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[client] Add retry when get one available tablet serverNode fails #426
base: main
Are you sure you want to change the base?
Conversation
f116150
to
c2287b3
Compare
AdminReadOnlyGateway gateway = | ||
GatewayClientProxy.createGatewayProxy( | ||
() -> | ||
getOneAvailableTabletServerNode( | ||
metadataUpdater.getCluster()), | ||
rpcClient, | ||
AdminReadOnlyGateway.class); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary change. Pls revert this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will revert this line.
@@ -52,6 +55,9 @@ | |||
|
|||
/** Utils for metadata for client. */ | |||
public class MetadataUtils { | |||
private static final Logger LOG = LoggerFactory.getLogger(MetadataUtils.class); | |||
|
|||
private static final int MAX_RETRY_TIMES = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this retry time be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will make it as system configuration
@gkatzioura also added a retry mechanism here UPDATE: There is a retry method, but its only available in the test utils |
@polyzos Good idea, I will try to make a common util method of retry logic. WDYT? @SteNicholas please left some comments, thank you |
List<ServerNode> aliveTabletServers = null; | ||
for (int retryTimes = 0; retryTimes <= MAX_RETRY_TIMES; retryTimes++) { | ||
aliveTabletServers = cluster.getAliveTabletServerList(); | ||
if (aliveTabletServers.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When aliveTabletServers
is empty in cluster
, the retry doesn't work since the cluster wil stil be empty for aliveTabletServers, we need send metadata request to update
cluster`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,I will adjust it,thank you.
@zhaozijun109, @polyzos, IMO, the retry mechanism maybe different from different behavior, therefore it may not need to introduce a common util method of retry logic. |
@SteNicholas Thank for your advice, I will re-adjust based on all the comments above |
@SteNicholas Indeed, but there might be re-use of some implementations.. my thoughts were that it might be best to have them in a centralized place, to avoid introducing multiple retry implementations across the codebase, because I already saw 3 different cases popping up in different places. Whatever you think works best |
c2287b3
to
fc5a64f
Compare
@luoyuxia @SteNicholas Could you please help me review again when you have free time? Thank you. |
@@ -876,6 +876,13 @@ public class ConfigOptions { | |||
"Enable metrics for client. When metrics is enabled, the client " | |||
+ "will collect metrics and report by the JMX metrics reporter."); | |||
|
|||
public static final ConfigOption<Integer> CLIENT_GET_TABLET_SERVER_NODE_MAX_RETRY_TIMES = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this option in configuration.md
.
@zhaozijun109, could you create new issue for this pull request? BTW, could you also add some description for this pull request? |
Purpose
Linked issue: close #xxx
Tests
API and Format
Documentation