-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making the content syncs more resilient #214
base: master
Are you sure you want to change the base?
Conversation
mzgubin
commented
Aug 29, 2018
•
edited
Loading
edited
- Changes that improve resiliency like catching some of the errors encountered to keep the jobs running. It would be great if that was perhaps put into a configuration whether to fail-fast or just log those errors as warnings and go on. I don't know enough about the best approach for something like that would be.
- Converted the servlet on the source being pulled from to handle post requests instead of get. That way the requests won't be limited by the GET URL length. This can be useful if there are a ton of exclusions as we've encountered.
- Also I made a change to how the exclusion filters are matching the paths. Currently the paths will be searched to find the exclusion string in any part of the path, instead I've made it so that the string is matched against just the startsWith instead. Not sure if that's how it's intended to be used, but perhaps that can also be configured somehow in the exclusion filter section itself.
- Added more debug logging
* Continuing on error when writing nodes on the client (catching and logging the ConstraintViolationException and the InvalidItemStateException)
…Logging the errors. Going to see if this helps with giving more information when something goes wrong on the server side like reading too large of a file.
…s in org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore on line 676 of oak-blob-plugins-1.8.2.jar
… post method so that we are not limited by URL length in a GET call. This way we can pass a large exclusion list.
…re-logging Fixes/continue on error and more logging
…rt of the path to see what it starts with. We see if it matches or if it starts with that path (plus a forward slash to denote that it's a parent path)
Fixing the ExclusionPathNodeIterator to not just look at the first pa…
# Conflicts: # gradle.properties # src/main/groovy/com/twcable/grabbit/server/batch/steps/jcrnodes/JcrNodesWriter.groovy # src/main/groovy/com/twcable/grabbit/server/services/ExcludePathNodeIterator.groovy
gradle.properties
Outdated
|
||
# Please keep alphabetical | ||
apache_httpclient_version = 4.5.4 | ||
apache_httpcore_version = 4.4.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are using the OkHttpClient already, it would be better to continue to use that than having another way to create connections using the Apache HttpClient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok will switch to that one
@@ -41,6 +43,7 @@ servlet_api_version = 2.5 | |||
slf4j_version = 1.7.6 | |||
sling_api_version = 2.9.0 | |||
sling_base_version = 2.2.2 | |||
sling_commons_json_version = 2.0.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to continue using Groovy's built-in JSONBuilder for consistency as used elsewhere in the codebase rather than adding this new dependency!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll switch it
"path={}\nafter={}\nexcludePathList={}\n\n\n", path, | ||
afterDateString, StringUtils.join(excludePathsList, ",") | ||
|
||
serverService.getContentForRootPath(serverUsername, path, excludePathsList ?: null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the need to have a POST here instead of GET due to GET's limitations. However, since this is an internal call (from Grabbit client to Grabbit server), I don't think we need to keep both GET and POST.
I would thus suggest removing doGet()
and also updating the GrabbitContentPullServletSpec
accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok will do
@@ -0,0 +1,344 @@ | |||
package com.twcable.grabbit.server.services.impl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mzgubin I did not see info about this in your PR description. Could you please let us know the need to re-implement this instead of using the TreeTraverser from Jackrabbit commons as it was before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the reason was that it allowed me to log what the error was when it was traversing. By default it was set to ErrorHandler.IGNORE so the errors were just being swallowed up.
log.debug "Sending NodeProto : ${node.name}" | ||
node.writeDelimitedTo(servletOutputStream) | ||
try { | ||
//log.info "\n\n### NodeProtos ###\n\n${StringUtils.join(nodeProtos.toString(), "\n")}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can get rid of all the commented out code (here and elsewhere in the PR). Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem will do
Sorry have been bogged down but will definitely get to this in the next week |
- Removing the commons sling json dependencies and using the the google groovy builder for json - Removing commented out code - Removing the GET method from the GrabbitContentPullServlet