Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a health check on the process that allows to test reachability #384

Open
Ongy opened this issue Dec 10, 2024 · 4 comments
Open

Add a health check on the process that allows to test reachability #384

Ongy opened this issue Dec 10, 2024 · 4 comments

Comments

@Ongy
Copy link
Contributor

Ongy commented Dec 10, 2024

Describe the feature

There should be an additional endpoint provided by the process (e.g. /healthz) that can be probed to detect general functioning.

Describe the reason for such feature

This allows us to employ better monitoring, for alerting when the server side goes offline for some reason. Additionally it allows to plug into load balancing services that are aware of the health of their backends to provide dynamic failover.

Describe alternatives you've considered

None.

@erebe
Copy link
Owner

erebe commented Dec 11, 2024

Hello,
It can be possible, but for now there is no such metric to tell if the server is correctly running, beside that the listening port is open, which can be tested without an http healthcheck endpoint.

There is no limit of nb of connection or whatsoever for now.
Have you seen a case where the server was not responding ?

@Ongy
Copy link
Contributor Author

Ongy commented Dec 11, 2024

The main reason I'd like a dedicated endpoint is to avoid side effects of the probing.

I think both the client side and the server side return some non-200 HTTP Status on a GET (instead of upgrade/connect) and log error messages.
When we tried t use the kubernetes TCP-probe on the client, it sent the 0.0.0.0 connection requests to the server that I mentioned in a PR previously.

We've not seen it not react at all, but the cloud native tooling is better at probing HTTP endpoints than just probing ports, and we've run into a situation where it failed due to our own mistakes.

@erebe
Copy link
Owner

erebe commented Dec 12, 2024

I understand, at some point I thought of adding a Prometheus /metrics endpoint to get some metrics.
I would rather put this /healthz and /metrics on a separate port, because otherwise it would force people to use a reverse proxy. Starting the server without a reverse proxy would expose those 2 path to the wild.

That something I can plan to add in the future, but for the moment I don't have much free time on hand I can't commit on anything

@Ongy
Copy link
Contributor Author

Ongy commented Dec 13, 2024

separate port is fine, though I'd like to have it in the same main execution logic, so in case anything's stuck it's stuck as well.

WRT. time: This is half a request, half coordination if I want to start something. so there's a decent chance I'll provide at least /healthz, maybe /metrics if I can think of good ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants