Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validations support 2 #66

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Validations support 2 #66

wants to merge 2 commits into from

Conversation

chatman
Copy link
Collaborator

@chatman chatman commented Apr 3, 2023

Had to abandon the previous PR (#44) due to various merge conflicts, and re-open it here. Please follow the discussions there.

@chatman chatman requested a review from patsonluk April 3, 2023 17:07
@chatman chatman mentioned this pull request Apr 3, 2023
@chatman
Copy link
Collaborator Author

chatman commented Apr 3, 2023

In this PR, validations work only when detailed-stats is turned off. It can be worked on in a separate PR.

@patsonluk can you please review and also verify that it doesn't break detailed-stats functionality?

@@ -75,10 +119,11 @@ public static void runQueryBenchmarks(List<QueryBenchmark> queryBenchmarks, Stri
String baseUrl = queryNodes.get(benchmark.queryNode-1).getBaseUrl();
log.info("Query base URL " + baseUrl);
for (int threads = benchmark.minThreads; threads <= benchmark.maxThreads; threads++) {
ControlledExecutor.ExecutionListener<String, NamedList<Object>> listener = benchmark.detailedStats ? new DetailedQueryStatsListener() : new ErrorListener();
ControlledExecutor.ExecutionListener<String, SolrBenchQueryResponse> listener = benchmark.detailedStats ? new DetailedQueryStatsListener() : new ValidationListener();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should allow both validation and detailed stats later on :)

try {
validationResults = runQueryValidations(benchmark.validations, ((ValidationListener) listener).queryResponses, benchmark, collection, baseUrl);
} catch (Exception e) {
log.error("Problem during validations", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just let the exception throws as is? The method signature can actually be throws Exception now. There was a change that wraps call these tasks into a Callable, which it can throws exception (instead of Runnable that does not allow checked exception).

If exception is thrown, then the executor should properly handle the correct task dependency (do not proceed with dependant tasks) and error reporting (print to console with the root cause)

this.queryRequest = queryRequest;
this.client = client;
this.collection = collection;
this.queryString = queryString;
Copy link
Contributor

@patsonluk patsonluk Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this different from the queryRequest.toString() ? getType() and getQueryString() might return very similar result. getType() might not be the best naming that i used 😓 but it was actually used to identify stats for different "types" which each type is just the query.

If query string is more accurate, perhaps getType() should just return the query string?

I named it type probably wanted to make it flexible (so the implementation can determine what is the "key"/"type") , but it might be a bit too generic that's confusing to others

import org.slf4j.LoggerFactory;

// nocommit: javadocs
public class SolrBenchQueryResponse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to have this class to avoid double reading the stream that triggers errors 💪🏼


InputStream responseStream = (InputStream) response.get("stream");
try {
responseString = getResponseStreamAsString(responseStream); // should only call this once, as this reads the stream!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make reading the stream lazy? My concern is that this always read the response stream, which seems unnecessary if the caller does not care about detailed stats nor needing any validations (which i assume is a common case for general benchmarking?)

} catch (IOException e) {
logger.warn("Failed to read the response stream for " + typeKey);
}
if (sbq.responseString != null) {
Copy link
Contributor

@patsonluk patsonluk Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably remove this check or only check if sbq is null. As sbq.reponseString could be null (if the response is erroneous) and in such case, we do want to increment the errorCount instead of skipping all logic :)

@Override
public void onExecutionComplete(String typeKey, NamedList<Object> result, long duration) {
public void onExecutionComplete(String typeKey, SolrBenchQueryResponse result, long duration) {
queryResponses.add(result);
Copy link
Contributor

@patsonluk patsonluk Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns on memory usage as this appears to store all responses until the current query benchmark task run finishes.

I don't think it's a blocker right now, as it should be okay if the response size is small or if the total amount queries executed are low.

Perhaps in the future we can consider doing the validation at onExecutionComplete itself? I assume the overall computation is the same - validate each result here vs iterate all responses and validate at the end. It might increase the total duration per task, but per query duration (which is what matters more?) should be the same as far as we disregard the time spent in validation :) ?

Another minor benefit of doing it as here is we can fail fast if we want.

Similarly, for the run that generates validation (instead of performing validation), we can as well just keep the doc count here instead of the whole response?

Anyway, not a blocker I think 😊

@patsonluk
Copy link
Contributor

In this PR, validations work only when detailed-stats is turned off. It can be worked on in a separate PR.

@patsonluk can you please review and also verify that it doesn't break detailed-stats functionality?

Sorry about the conflict 😓 ! I have briefly reviewed and it's mostly LGTM except a few comments. I will test tomorrow to make sure the detailed stats are still there w/o issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants