Question : Controllers standby test #5416

Open
p-jonghyun opened this issue May 22, 2023 · 6 comments
Comments

p-jonghyun commented May 22, 2023

behavior of "Controllers hot standby"
it should "use controller1 if controller0 goes down" in withAssetCleaner(wskprops) { (wp, assetHelper) =>
if (amountOfControllers >= 2) {

The 'use controller1 if controller0 goes down' test in ShootComponentsTests.scala performs the following steps:

  1. restart controller0 container
  2. /ping controller0 until it’s down
  3. /ping controller1 to check it’s still up
  4. Repeat 96 times through nginx: invoke the action (POST), then get the action (GET)

Isn't there a case where nginx forwards the first POST request to controller0 before it is ready to take requests (that is, the container is up but the backend inside it is not)?

Such behavior would result in a "connection reset by peer" error, which is not in the Swagger spec, so the test would fail.
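
To make the suspected race concrete, here is a minimal sketch in plain Scala. It is a toy model, not OpenWhisk or nginx code: the state names, the outcome names, and the naive round-robin without health checks or retries are all assumptions made for illustration. Real nginx behavior depends on its upstream and retry configuration.

```scala
// Hypothetical model of the race; none of these names come from OpenWhisk or nginx.
sealed trait ControllerState
case object Down extends ControllerState     // container stopped, /ping fails
case object Starting extends ControllerState // container up, backend not serving yet
case object Ready extends ControllerState    // backend serving requests

sealed trait Outcome
case object Ok extends Outcome
case object ConnectionRefused extends Outcome
case object ConnectionReset extends Outcome

// What a client is assumed to observe when one request reaches a controller
// in the given state (the Starting case is the scenario described above).
def forward(state: ControllerState): Outcome = state match {
  case Ready    => Ok
  case Starting => ConnectionReset
  case Down     => ConnectionRefused
}

// Naive round-robin over the upstreams, with no health check and no retry.
def roundRobin(states: Vector[ControllerState], requests: Int): Vector[Outcome] =
  Vector.tabulate(requests)(i => forward(states(i % states.size)))

// controller0 is restarting, controller1 is ready: the first request is reset.
val outcomes = roundRobin(Vector(Starting, Ready), 4)
```

In this model the first POST lands on the restarting controller0, which is exactly the case the question is about.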

Member

style95 commented May 22, 2023

I believe the test case does not guarantee that no request is sent to the failed controller.
It tolerates a limited number of unsuccessful requests:

unsuccessfulInvokes should be <= 5
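
A minimal sketch of that kind of failure budget, in plain Scala without ScalaTest. The helper `countUnsuccessful` and the simulated failure pattern are hypothetical; only the threshold of 5 and the 96 iterations come from this thread.

```scala
// Hypothetical helper: runs `total` invocations and counts the unsuccessful ones.
// `invoke` returns true when invocation number `i` succeeded.
def countUnsuccessful(total: Int)(invoke: Int => Boolean): Int =
  (0 until total).count(i => !invoke(i))

// Threshold quoted from the test above.
val maxUnsuccessful = 5

// Simulated run (an assumption): the first three of 96 invocations fail during failover.
val unsuccessfulInvokes = countUnsuccessful(96)(i => i >= 3)

// Plain-Scala equivalent of `unsuccessfulInvokes should be <= 5`.
assert(unsuccessfulInvokes <= maxUnsuccessful)
```

The point of the budget is that a few failed invokes around the failover are expected, while a larger number indicates that the standby controller did not take over.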

@p-jonghyun
Author

@style95
Thanks for the response!

As far as I understand, this test case allows unsuccessful invokes only if their responses match the Swagger spec.

val validationErrors = validateRequestAndResponse(request, response)
if (validationErrors.nonEmpty) {
  fail(
    s"HTTP request or response did not match the Swagger spec.\nRequest: $request\n" +
      s"Response: $response\nValidation Error: $validationErrors")
}
response
}

A "Connection reset by peer" error does not match the spec, so it would make the test fail.
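
The distinction I mean can be sketched like this. It is a simplified illustration, not the code in the test: `Response`, `classify`, the `Verdict` names, and the set of documented status codes are all hypothetical stand-ins.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in; the real test uses its own request/response types.
final case class Response(status: Int)

// Assumed set of status codes the spec documents (illustrative values only).
val documentedStatuses = Set(200, 202, 502, 503)

sealed trait Verdict
case object Successful extends Verdict       // documented 2xx response
case object ToleratedFailure extends Verdict // documented non-2xx: counts toward the budget
case object HardFailure extends Verdict      // undocumented status, or no response at all

def classify(result: Try[Response]): Verdict = result match {
  case Success(Response(s)) if documentedStatuses(s) && s / 100 == 2 => Successful
  case Success(Response(s)) if documentedStatuses(s)                 => ToleratedFailure
  case Success(_)                                                    => HardFailure
  // A transport-level error never produces a response that could be validated.
  case Failure(_)                                                    => HardFailure
}

val reset = classify(Failure(new java.io.IOException("Connection reset by peer")))
```

Under this reading, a documented error status is counted against the budget of unsuccessful invokes, while a reset connection fails the test outright.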

Member

style95 commented May 22, 2023

Do you see any test failures?
In my recent build, it was successful.
https://github.com/apache/openwhisk/actions/runs/5029161559/jobs/9020540100#step:4:8830

I am curious under which conditions it failed.

Author

p-jonghyun commented May 22, 2023

It looks like the OW system test build does not use multiple controllers:

[controllers]
controller0 ansible_host=172.17.0.1 ansible_connection=local
;{% if mode is defined and 'HA' in mode %}
;controller1 ansible_host=172.17.0.1 ansible_connection=local
;{% endif %}

Therefore, the body of the test case should be skipped in that build, because the amountOfControllers >= 2 check fails:

it should "use controller1 if controller0 goes down" in withAssetCleaner(wskprops) { (wp, assetHelper) =>
  if (amountOfControllers >= 2) {
    val actionName = "shootcontroller"
    assetHelper.withCleaner(wsk.action, actionName) { (action, _) =>
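
To illustrate why the guard is false with that inventory, here is a small sketch that counts the active hosts in the [controllers] group. The helper `activeControllers` is hypothetical, and how the real test derives amountOfControllers is not shown in this thread; the sketch only assumes that lines starting with `;` or `#` are comments in an Ansible INI inventory.

```scala
// Hypothetical helper: counts non-comment host lines in the [controllers] group.
def activeControllers(inventory: String): Int =
  inventory.linesIterator
    .map(_.trim)
    .toList
    .dropWhile(_ != "[controllers]")
    .drop(1)                       // skip the group header itself
    .takeWhile(!_.startsWith("[")) // stop at the next group, if any
    .count(l => l.nonEmpty && !l.startsWith(";") && !l.startsWith("#"))

// The inventory fragment quoted above.
val inventory =
  """[controllers]
    |controller0 ansible_host=172.17.0.1 ansible_connection=local
    |;{% if mode is defined and 'HA' in mode %}
    |;controller1 ansible_host=172.17.0.1 ansible_connection=local
    |;{% endif %}
    |""".stripMargin

val amountOfControllers = activeControllers(inventory)

// With controller1 commented out, the guard is false and the test body never runs.
val testBodyRuns = amountOfControllers >= 2
```

Because the test still reports success when its body is skipped, a passing build says nothing about the failover path.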

Member

style95 commented May 22, 2023

Indeed.
We are setting up the environment with the HA mode.

$ANSIBLE_CMD setup.yml -e mode=HA

But it seems controller1 is always disabled.
According to the PR, it looks like this is not intended.

I think we need to enable controller1 in the CI environment and fix the system test if required.
@p-jonghyun Thank you for reporting this.

Member

style95 commented May 31, 2023

It seems the test passed after enabling the second controller.
https://github.com/apache/openwhisk/actions/runs/5132045151/jobs/9233975717#step:4:8941
