New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add experimental event ruler #10615
Add experimental event ruler #10615
Conversation
cd43cbd
to
a67c3c6
Compare
58a4bd4
to
017bae5
Compare
LocalStack Community integration with Pro 2 files ± 0 2 suites ±0 1h 30m 58s ⏱️ -12s Results for commit dc265aa. ± Comparison against base commit 4b72d6e. This pull request removes 35 and adds 67 tests. Note that renamed tests count towards both.
This pull request removes 1 skipped test and adds 1 skipped test. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
# Determine which implementation to use for the event rule / event filtering engine used by multiple services: | ||
# EventBridge, EventBridge Pipes, Lambda Event Source Mapping, SNS | ||
# Options: provider (default) | java | ||
EVENT_RULE_ENGINE = os.environ.get("EVENT_RULE_ENGINE", "").strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ CI run with JPype enabled using java
here in 017bae5
# Workaround to unblock LocalStack shutdown. By default, JPype wait until all daemon threads are terminated, | ||
# which blocks the LocalStack shutdown upon any leaked thread (know LS issue). | ||
# See https://github.com/MPh-py/MPh/issues/15#issuecomment-778486669 | ||
jpype_config.destroy_jvm = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's unfortunately needed. Could this cause any side-effects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The container shutdown will probably not work, as we have an orphan process which did not get terminated by its parent on shutdown. The supervisor will probably wait forever, until docker kills the container.
pyproject.toml
Outdated
@@ -67,6 +67,8 @@ base-runtime = [ | |||
"Werkzeug>=3.0.0", | |||
"xmltodict>=0.13.0", | |||
"rolo>=0.4", | |||
# allow Python programs full access to Java class libraries. Used for opt-in event ruler. | |||
"JPype1>=1.5.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
431KB without any extra dependencies (see their pyproject.toml), just packaging
and typing_extensions
, which we have as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we move it to runtime
instead? base-runtime
is the base one used by the S3 image, which has no dependency basically. Or was there a specific reason to put into base-runtime
?
For a bit of context, basically, base-runtime
is the minimal set of dependencies needed to run a service in LocalStack which would be entirely written in Python with no external dependencies (like S3).
I believe this change, which is for a provider with external dependencies and only opt-in, would fit better in runtime
.
This would also have the side-effect of not having to change the S3 image Dockerfile, which does not need this and g++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense @bentsku 💯 Thank you for this valuable input 👀 👏
I fixed everything (including the follow-up S3 tests because the test deps include the runtime deps 😬 ; that's a bit of an ugly hack though)
I didn't understand the full implications of base-runtime
vs runtime
dependencies.
With the fixed solution, the S3 image does not get larger 👌
# {"id": {"S": "id_value_1"}}, | ||
# {"id": {"S": "id_value_2"}, "presentKey": {"S": "presentValue"}}, | ||
# {"dynamodb": {"NewImage": {"presentKey": [{"exists": False}]}}}, | ||
# 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This false negative took me forever to spot. A combination of bad test design and inappropriate configuration led to wrong assumptions!
It was configured to 1 but both events match against AWS. However, the second event never gets waited. Negative testing for the last event is unreliable!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this test be reintroduced at a later point?
function_name = f"lambda_func-{short_uid()}" | ||
table_name = f"test-table-{short_uid()}" | ||
max_retries = 50 | ||
|
||
create_lambda_function( | ||
handler_file=TEST_LAMBDA_PYTHON_ECHO, | ||
func_name=function_name, | ||
runtime=Runtime.python3_9, | ||
runtime=Runtime.python3_12, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @Morijarti I'm bumping the runtime here together with significant test changes, so can can simply take over my changes upon rebasing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thank you
@@ -136,7 +136,7 @@ def test_no_match_partial(self): | |||
filters = [ | |||
{ | |||
"Filters": [ | |||
{"Pattern": json.dumps({"partitionKey": "2", "data": {"City": ["Seattle"]}})} | |||
{"Pattern": json.dumps({"partitionKey": ["2"], "data": {"City": ["Seattle"]}})} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another example why non-validated unit-tests are tricky to get right 😢
Dockerfile.s3
Outdated
apt-get install -y gcc | ||
apt-get install -y gcc && \ | ||
# Workaround to fix JPype1 compile error on ARM Linux "gcc: fatal error: cannot execute ‘cc1plus’" | ||
apt-get install -y g++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wished this was not necessary, especially not needed for S3 😿
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this would be a bit faster if it would be in a single apt-get install
call? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe this should be added somewhere in the contribution guide? Because it seems Mac users will struggle to run make install
without g++
(and Java when they use this feature) on their machines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for a single apt-get call, updated ✅
The g++
dependency might be specific to ARM Linux because it worked out of the box for me on an ARM MacBook.
I agree that the Java installation and setting JAVA_HOME
is a bit cumbersome in host mode 😿
@alexrashed Where is the right place to document service-specific requirements for host mode? (we have Notion, the dev guide on the website, getting started guide)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be part of the contribution guidelines (because it affects all open source contributors), which is currently located in our docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need it for S3? Because we put it into base-runtime
? Could we just move the dependency into runtime
instead? I'm not sure I entirely follow here.
(I probably should add the S3 docker file into the code owners list for S3 as well, I totally missed this PR 😅)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need it for S3. Thanks for suggesting the switch to runtime
deps 👍
Fixed as commented in #10615 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this a pretty sophisticated change which drastically improve parity with the event pattern matching! 💯
I hope the new dependency doesn't cause too many issues with #10235 (but as you stated, they seem to work on compatibility above 3.10).
Besides that I only had some thoughts concerning the internal developer experience (Mac users need to be aware that they have to install g++
) and some smaller nitpicks ;)
Besides that, I'll leave the final signoff to the service owners :)
0aca7a1
to
ec960ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! 🚀 Thanks a lot for addressing all the comments! From my side all these changes are looking good! As mentioned in the initial comment, I'll leave the final sign-off to the code owners 🥳
# Workaround to unblock LocalStack shutdown. By default, JPype wait until all daemon threads are terminated, | ||
# which blocks the LocalStack shutdown upon any leaked thread (know LS issue). | ||
# See https://github.com/MPh-py/MPh/issues/15#issuecomment-778486669 | ||
jpype_config.destroy_jvm = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The container shutdown will probably not work, as we have an orphan process which did not get terminated by its parent on shutdown. The supervisor will probably wait forever, until docker kills the container.
if config.EVENT_RULE_ENGINE == "java": | ||
import jpype | ||
import jpype.imports | ||
from jpype import config as jpype_config | ||
from jpype import java | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to do this on module level on import? including the startup? Could it be done on first match as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can go full dynamic import upon first event matching:
- it could hide potential issues
- JVM startup delay is probably not an issue though.
Do you think a lazy import is better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved using a dynamic import now (as discussed in sync).
# {"id": {"S": "id_value_1"}}, | ||
# {"id": {"S": "id_value_2"}, "presentKey": {"S": "presentValue"}}, | ||
# {"dynamodb": {"NewImage": {"presentKey": [{"exists": False}]}}}, | ||
# 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this test be reintroduced at a later point?
localstack/services/lambda_/event_source_listeners/dynamodb_event_source_listener.py
Show resolved
Hide resolved
Trying to reply to @dfangl comment: #10615 (comment)
This a problem How should we test this? Any suggestion on how to fix this? |
efaf60c
to
9a54628
Compare
This should be resolved after successfully testing the shutdown behavior together with @dfangl (in sync today) The CI is looking good here 🟢 . In ext, we have some Kinesis analytics test still failing (e.g., |
Using The failing Kinesisanalytics tests are likely caused by an interaction with our Kinesisanalytics implementation using https://github.com/localstack/PySiddhi, an unmaintained project that uses some kind of custom Python=>Java bridge. This C-based proxy uses JNI (like JPype) and could potentially cause an unintended interaction. Failing ext test run:
|
Without using the new opt-in Java-based event-ruler, all tests are green 🟢 , both in community and ext 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now, thank you for addressing my comments and avoiding start of the jvm on module import.
Looking forward for more unification of all the event matching implementations :)
Could this also be supported when running localstack in a docker container? And I assume this will resolve the current issues with what appears to be unsupported syntax such as: There are current matching rules that don't appear to be working for me, I'm currently running localstack 3.3 in docker. |
Thanks for the comment @andy-g
Yes, it will work in Docker.
All builds are 🟢 , so I expect this to be merged anytime soon 🚀 |
Brilliant, thanks @joe4dev, I'll keep an eye on this PR then, and thanks so much for the quick response. |
Very nice PR, I expect to use this heavily in our new native event bridge implementation. Although tests with this java pattern matching take a bit longer to execute for me compared to our own implementation - it is much more complete. I also expect it to perform much better once a certain event number is put through the system at the same time. This will come in handy if we can get the rest of the provider's performance also up that level of speed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added 2 comments: #10615 (comment) and #10615 (comment), which I believe should be addressed before merging if possible, as this PR would then have no impact on the S3 image or base runtime. There might be something I missed, feel free to correct me if it's really needed to go into base-runtime
. Thanks!
This reverts commit 4e7dbc5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot for removing the changes for the S3 image, sorry it has been a bit complicated with the tests dependencies and had to go further down the path of quick hacks in the make target 😬 but very grateful you've addressed it. Thank you so much!
And sorry again for missing this and coming in late requesting changes. 😞
Great work, can't wait for us to have a unified event ruler! 🚀
@joe4dev I see this was merged, and I see that new docker images have been built, am I right in assuming these changes should be in latest? I'm running:
Using this rule my lambda works (as it has previously): {
"detail-type": ["PaymentReceived","PaymentCompleted"],
"source": ["producer-1"]
} ...but changing my rule to either of these result in the lambda not being invoked (I was hoping this PR would change that): {
"detail-type": ["PaymentReceived"],
"detail.payload.paymentDetails": {
"url": [{"exists": true}],
"type": ["Multi"]
},
"source": ["producer-1"]
}
These are working fine on AWS itself. Am I missing something here? |
Sorry, after a closer look I realised I needed to set the |
Yes, the changes are released for @andy-g Awesome to hear everything works as expected using |
@andy-g I can confirm the |
Motivation
We have several parity gaps and bugs in multiple different implementations for event pattern matching (see PR #10599). We can offer high AWS-parity quickly using the Amazon EventBridge Event Bus event matching now runs on Open Source Event Ruler v1.5.0 released
event-ruler
Java library from AWS described in this blog post.Changes
EVENT_RULE_ENGINE=java
flag using the Java event-ruler library from AWS through JPype.MavenPackageInstaller
utility(could probably move tolocalstack/packages/core.py
DONE)Testing
EVENT_RULE_ENGINE=java
JAVA_HOME
points to a valid JVM installation if you are testing in host modeSKIP_LABELS
for the test test_test_event_patterntest_test_event_pattern
LocalStack Shutdown
@dfangl and @joe4dev tested the LocalStack shutdown as follows:
make docker-build
EVENT_RULE_ENGINE=java DEBUG=1 localstack start
docker stop $CONTAINER_ID
should stop the container immediately and not after a 10s delay=> worked without delay
Limitations
ERROR: Could not build wheels for jpype1
due togcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
. Supposedly, JPype only supports up to Python 3.10 🤔, which would be a blocker 🔴 (they recently merged a fix for Python 3.12 though). We are using Python 3.11 in the image and jpype1 builds fine on my ARM MacBook using Python 3.11.7.apt-get install -y g++
.docker-run-tests-s3-only
to keep it out of the S3 image.Kinesisanalytics
Kinesisanalytics seems incompatible with
EVENT_RULE_ENGINE=java
indicated by the following failing tests:TestKinesisAnalyticsProvider.test_tag_list_tag_untag_resource
TestKinesisAnalyticsProvider.test_input_processing_configuration
TestKinesisAnalyticsProvider.test_application_output
TestKinesisAnalytics.test_run_siddhi_query
⇒ all these tests keep failing in the AMD64 pipeline. They are not ARM compatible.
⇒ They fail with the assertion:
AssertionError: assert 'STARTING' in ['READY', 'RUNNING']
⇒ The siddhi query test also fails with a name error
NameError: name 't_siddhi_api_core' is not defined
in the Siddhi loaderPySiddhi/SiddhiLoader.py
LS uses some badly outdated fork of the unmaintained (5y; LS 3y last commit) PySiddhi project: https://github.com/localstack/PySiddhi. They are using some kind of custom Python⇒Java bridge based on JNI, which could cause interference.
ℹ️ We decided to ignore this limitation for now. If JPype proves successful, it might be a better alternative than PySiddhi ...
TODO
EVENT_RULE_ENGINE
back from"java"
(used for CI testing) to""
Follow Up
tests.aws.services.lambda_.test_lambda_integration_dynamodbstreams.TestDynamoDBEventSourceMapping.test_dynamodb_event_filter
numeric_filter
fails due to missinganything-but
supportexists_false_filter
fails due to broken implementation