storageusers pod in CrashLoopBackOff mode after upgrade #855
Comments
How did you upgrade? Do you use a chart that has …
Yes! I pulled the latest from this repository. So the Chart.yaml shows 7.0.0. And …
Are you using the built-in NATS or an external one?
I use NATS like one of the examples here in this repository. My helmfile.yaml may answer your question. It's deployed to its own namespace, so kind of external, but used exclusively by OCIS.
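The actual helmfile.yaml is not quoted in this thread; the following is only an illustrative sketch of an external NATS deployment in its own namespace, assuming the upstream nats Helm chart (1.x values layout) with clustering and JetStream file storage enabled. The replica count and PVC size are placeholders:

# helmfile.yaml (illustrative sketch, not the file from this issue)
repositories:
  - name: nats
    url: https://nats-io.github.io/k8s/helm/charts/

releases:
  - name: nats
    namespace: ocis-nats
    chart: nats/nats
    values:
      - config:
          cluster:
            enabled: true
            replicas: 3
          jetstream:
            enabled: true
            fileStore:
              pvc:
                size: 10Gi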
The configuration actually looks fine. Could you please execute the following command to ensure that the relevant pods are on the new version?
Also it would be interesting if the output is similar (I have replicas set to 2, so we see the same output twice.):
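The exact command is not preserved above. One generic way to verify that every pod has been rolled to the upgraded image (assuming the oCIS release is deployed to an ocis namespace) is:

> kubectl -n ocis get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'

Each output line lists a pod name and its container image, which should all point at the new oCIS version.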
Yes. It looks like all of them are updated.
And the second output is similar to yours:
I also checked the connection to NATS from inside the cluster:

> kubectl run curlpod --image=curlimages/curl -ti -- sh
If you don't see a command prompt, try pressing enter.
~ $ curl -v nats.ocis-nats.svc.cluster.local:4222
* Host nats.ocis-nats.svc.cluster.local:4222 was resolved.
* IPv6: (none)
* IPv4: 10.96.123.122
* Trying 10.96.123.122:4222...
* Connected to nats.ocis-nats.svc.cluster.local (10.96.123.122) port 4222
* using HTTP/1.x
> GET / HTTP/1.1
> Host: nats.ocis-nats.svc.cluster.local:4222
> User-Agent: curl/8.11.1
> Accept: */*
>
* Received HTTP/0.9 when not allowed
* closing connection #0
curl: (1) Received HTTP/0.9 when not allowed
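The HTTP/0.9 error itself is expected here: port 4222 speaks the NATS client protocol rather than HTTP, so curl cannot parse the INFO banner the server sends on connect. If the chart exposes the NATS monitoring port (8222 by default), a cleaner in-cluster health check is the /healthz endpoint. The service and port below are assumptions and may differ in your setup:

~ $ curl -s nats.ocis-nats.svc.cluster.local:8222/healthz

A healthy server answers with {"status":"ok"}.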
I suppose NATS is a kv-store. But is it dependent on persistence? Can I delete the NATS PVCs and recreate NATS from scratch? Maybe the kv-store can become corrupted in some way after updating to 7.0.0. Although the logs look fine to me:

> kubectl -n ocis-nats logs -l app.kubernetes.io/component=nats
Defaulted container "nats" out of: nats, reloader
Defaulted container "nats" out of: nats, reloader
Defaulted container "nats" out of: nats, reloader
[7] 2025/02/03 11:27:26.073835 [WRN] Catchup for stream '$OCIS > KV_service-registry' resetting first sequence: 386508 on catchup request
[7] 2025/02/03 11:27:26.132553 [INF] JetStream cluster new stream leader for '$OCIS > KV_eventhistory'
[7] 2025/02/03 11:27:26.355371 [INF] JetStream cluster new stream leader for '$OCIS > KV_ids-storage-users'
[7] 2025/02/03 11:27:26.550106 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > userlog'
[7] 2025/02/03 11:27:27.066558 [INF] JetStream cluster new metadata leader: nats-2/nats
[7] 2025/02/03 11:27:27.641533 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > search'
[7] 2025/02/03 11:27:27.689822 [INF] JetStream cluster new stream leader for '$OCIS > KV_postprocessing'
[7] 2025/02/03 11:27:28.489216 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > frontend'
[7] 2025/02/03 11:27:29.294879 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > activitylog'
[7] 2025/02/03 11:27:35.960745 [INF] JetStream cluster new consumer leader for '$OCIS > KV_service-registry > atTLH0de'
[7] 2025/02/03 11:27:19.999821 [WRN] RAFT [yrzKKRBu - C-R3F-VcxU0MuI] Detected another leader with higher term, will stepdown
[7] 2025/02/03 11:27:20.005924 [WRN] RAFT [yrzKKRBu - S-R3F-GQ0lBcwu] Detected another leader with higher term, will stepdown
[7] 2025/02/03 11:27:20.009586 [WRN] RAFT [yrzKKRBu - S-R3M-tPuEdTd1] Detected another leader with higher term, will stepdown
[7] 2025/02/03 11:27:20.029709 [WRN] RAFT [yrzKKRBu - S-R3M-eSXnkVG4] Detected another leader with higher term, will stepdown
[7] 2025/02/03 11:27:20.153287 [INF] 10.233.205.70:45004 - rid:300 - Route connection created
[7] 2025/02/03 11:27:20.154363 [INF] 10.233.205.70:45004 - rid:300 - Router connection closed: Duplicate Route
[7] 2025/02/03 11:27:24.855415 [INF] JetStream cluster new stream leader for '$OCIS > KV_activitylog'
[7] 2025/02/03 11:27:24.858442 [INF] JetStream cluster new stream leader for '$OCIS > KV_settings-cache'
[7] 2025/02/03 11:27:26.100417 [INF] JetStream cluster new stream leader for '$OCIS > KV_ocis-pkg'
[7] 2025/02/03 11:27:36.483472 [INF] JetStream cluster new consumer leader for '$OCIS > KV_service-registry > beIVHnry'
[7] 2025/02/03 11:27:26.074897 [INF] Catchup for stream '$OCIS > KV_service-registry' complete
[7] 2025/02/03 11:27:26.119206 [INF] JetStream cluster new stream leader for '$OCIS > KV_userlog'
[7] 2025/02/03 11:27:26.165808 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > postprocessing'
[7] 2025/02/03 11:27:26.188788 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > graph'
[7] 2025/02/03 11:27:26.958469 [INF] JetStream cluster new stream leader for '$OCIS > KV_cache-roles'
[7] 2025/02/03 11:27:27.061159 [INF] Self is new JetStream cluster metadata leader
[7] 2025/02/03 11:27:28.034532 [INF] JetStream cluster new stream leader for '$OCIS > main-queue'
[7] 2025/02/03 11:27:28.238499 [INF] JetStream cluster new consumer leader for '$OCIS > main-queue > jsoncs3sharemanager'
[7] 2025/02/03 11:27:29.882287 [INF] JetStream cluster new stream leader for '$OCIS > KV_storage-system'
[7] 2025/02/03 11:27:35.968347 [INF] JetStream cluster new consumer leader for '$OCIS > KV_service-registry > 0x9QNsBr'
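Regarding the question about wiping the PVCs: before deleting anything, the JetStream state can be inspected directly with the nats CLI, for example from a temporary nats-box pod. This is only a sketch assuming the cluster service shown above and anonymous access; add credentials or TLS flags if your deployment requires them, and note that the oCIS buckets live in the $OCIS account, so the listing may be empty when connecting as a different account:

> kubectl run nats-box -ti --rm --image=natsio/nats-box -- sh
> nats --server nats://nats.ocis-nats.svc.cluster.local:4222 stream ls
> nats --server nats://nats.ocis-nats.svc.cluster.local:4222 kv ls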
After upgrading from 5.0.3 to 7.0.0, the storageusers pod remains in CrashLoopBackOff mode. The logs complain about needing nats, but nats is running and healthy. Does anyone have any idea about that?
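For anyone hitting the same symptom: capturing the log of the previous (crashed) container and the NATS-related environment of the failing workload is usually the quickest way to see which endpoint the service cannot reach. A sketch, assuming the release lives in an ocis namespace and the deployment is named storageusers:

> kubectl -n ocis logs deploy/storageusers --previous
> kubectl -n ocis get deploy storageusers -o jsonpath='{.spec.template.spec.containers[0].env}'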