This repository has been archived by the owner on Sep 21, 2022. It is now read-only.

Switchboard automatic process failure-restart #227

Open
1kaushik1 opened this issue Jun 24, 2019 · 1 comment

@1kaushik1

Thank you for submitting an issue.

Problem you are trying to solve

Our UAA service was unstable for about a minute because of a JDBC connection error.
We traced the downtime to a restart of the MySQL proxy's switchboard process: the downtime occurred at the same time the process restarted.
Based on the logs below, can you tell us what could have caused the failure and restart? It has happened twice in 10 days.

What is the current broken behavior?
The switchboard process fails and is restarted automatically.

Expected:

The process should not fail

Deployment Context:

Please provide relevant details about your deployment. That might include:

  • Are you using cf-mysql-release with a service broker?
    yes
  • How many nodes and proxies are deployed?
    3 mysql nodes and 2 proxy nodes
  • Other relevant deployment topology info
  • Release version
    CF/287, mysql version - cf-mysql/32
  • Other bosh releases deployed on the same vms
    No

Reference:

Attach screenshot(s) or logs if relevant

After the restart, we found the following log entries:

{"timestamp":"1561118814.904110193","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.lock.lost-lock","log_level":2,"data":{"error":"Unexpected response code: 500","key":"v1/locks/mysql_lock","session":"1","value":""}}
{"timestamp":"1561118814.904178619","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.lock.done","log_level":1,"data":{"key":"v1/locks/mysql_lock","session":"1","value":""}}
{"timestamp":"1561118814.904283285","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.poll-until-signaled.deregistering-service","log_level":1,"data":{"service":"mysql","session":"2.1","update-interval":"1.5s"}}
{"timestamp":"1561118814.920963049","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.poll-until-signaled.finished","log_level":1,"data":{"service":"mysql","session":"2.1","update-interval":"1.5s"}}
{"timestamp":"1561118814.921007633","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.registration-runner.finished","log_level":1,"data":{"service":"mysql","session":"2"}}
{"timestamp":"1561118814.921078205","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Received signal","log_level":1,"data":{"signal":2}}
{"timestamp":"1561118814.921156883","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Received signal","log_level":1,"data":{"signal":2}}
{"timestamp":"1561118814.921186209","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Proxy runner has exited","log_level":1,"data":{}}
{"timestamp":"1561118814.921303272","source":"/var/vcap/packages/switchboard/bin/switchboard","message":"/var/vcap/packages/switchboard/bin/switchboard.Switchboard exited unexpectedly","log_level":3,"data":{"error":"Exit trace for group:\nlock exited with error: lock lost\nregistration exited with nil\nhealth exited with nil\nmonitor exited with nil\napi exited with nil\nbridge exited with nil\n","proxyConfig":{"Port":3306,"Backends":[{"Host":"10.3.5.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-0"},{"Host":"10.3.6.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-1"},{"Host":"10.3.7.19","Port":3306,"StatusPort":9200,"StatusEndpoint":"galera_status","Name":"backend-2"}],"HealthcheckTimeoutMillis":5000},"trace":"goroutine 1 [running]:\ngithub.com/cloudfoundry-incubator/switchboard/vendor/code.cloudfoundry.org/lager.(*logger).Fatal(0xc420115da0, 0x8bf522, 0x1f, 0xac1ce0, 0xc4203c7860, 0xc42047c300, 0x1, 0x1)\n\t/var/vcap/packages/switchboard/src/github.com/cloudfoundry-incubator/switchboard/vendor/code.cloudfoundry.org/lager/logger.go:131 +0xc7\nmain.main()\n\t/var/vcap/packages/switchboard/src/github.com/cloudfoundry-incubator/switchboard/main.go:151 +0xd13\n"}}
panic: Exit trace for group:
lock exited with error: lock lost
registration exited with nil
health exited with nil
monitor exited with nil
api exited with nil
bridge exited with nil
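
For anyone triaging similar output, here is a minimal sketch for reading these entries. It assumes the lines follow lager's JSON format, where `log_level` is 0 for debug, 1 for info, 2 for error and 3 for fatal, and where `timestamp` is a string of epoch seconds. The helper names `parse_line` and `failed_members` are hypothetical and are not part of switchboard or cf-mysql-release. The sample input is the first log line and the exit trace from this issue.

```python
import json
from datetime import datetime, timezone

# lager's numeric log levels (assumed mapping: 0=debug, 1=info, 2=error, 3=fatal)
LEVELS = {0: "debug", 1: "info", 2: "error", 3: "fatal"}


def parse_line(line):
    """Summarize one lager JSON log line (hypothetical helper)."""
    entry = json.loads(line)
    # The timestamp is a string of epoch seconds with a fractional part.
    when = datetime.fromtimestamp(float(entry["timestamp"]), tz=timezone.utc)
    # The message repeats the binary path as a prefix; keep only the event name.
    event = entry["message"]
    prefix = entry["source"] + "."
    if event.startswith(prefix):
        event = event[len(prefix):]
    return {
        "time": when,
        "level": LEVELS.get(entry["log_level"], "unknown"),
        "event": event,
        "data": entry.get("data", {}),
    }


def failed_members(exit_trace):
    """Return the group members that exited with an error instead of nil."""
    failed = []
    for row in exit_trace.splitlines():
        name, sep, result = row.partition(" exited with ")
        if sep and result != "nil":
            failed.append(name)
    return failed


# First log line from this issue, copied verbatim.
sample = (
    '{"timestamp":"1561118814.904110193",'
    '"source":"/var/vcap/packages/switchboard/bin/switchboard",'
    '"message":"/var/vcap/packages/switchboard/bin/switchboard.lock.lost-lock",'
    '"log_level":2,'
    '"data":{"error":"Unexpected response code: 500",'
    '"key":"v1/locks/mysql_lock","session":"1","value":""}}'
)

# Exit trace from the panic output in this issue.
trace = (
    "Exit trace for group:\n"
    "lock exited with error: lock lost\n"
    "registration exited with nil\n"
    "health exited with nil\n"
    "monitor exited with nil\n"
    "api exited with nil\n"
    "bridge exited with nil\n"
)

summary = parse_line(sample)
print(summary["time"].isoformat(), summary["level"], summary["event"])
print("members that exited with an error:", failed_members(trace))
```

Read this way, the first error-level entry is `lock.lost-lock`, reporting `Unexpected response code: 500` for the key `v1/locks/mysql_lock`, and `lock` is the only group member that exited with an error. This only organizes what the logs already say; it does not explain why the lock request returned a 500.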

@cf-gitbot
Collaborator

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/166877449

The labels on this github issue will be updated when the story is started.
