Nexus Server Instance not works with OC 4.11 #230

Open
marcozanghi opened this issue Jan 5, 2023 · 6 comments
Labels
bug 🐛 Something isn't working

Comments

@marcozanghi

Describe the bug
The Nexus Operator instantiates a Nexus server, but the Nexus process inside it is not up and running.
Events console: Liveness and Readiness probe failed: connection refused

To Reproduce
Steps to reproduce the behavior:

  1. Install the operator
  2. Configure SCC resource as described in readme file

Expected behavior
Up and running pod.

Environment
OpenShift v4.11.20

Additional context
A manual deployment gives the same result.

@marcozanghi marcozanghi added the bug 🐛 Something isn't working label Jan 5, 2023
@ricardozanini
Member

Hi! Thanks for opening this issue. Do you happen to have the logs from the Nexus server deployed by the operator, or was the server not deployed at all?

@marcozanghi
Author

No logs are written. The container is up and running but not ready. From a terminal in the container I can see that all the folders are configured correctly (nexus is the owner). When I started the server manually inside the container, I could see the Java process running, but no logs were produced.

image

@ricardozanini
Member

Thanks! Unfortunately, I have little time to go through it, though. I'll see if I can take a look next week.

@marcozanghi
Author

marcozanghi commented Jan 25, 2023

Hi @ricardozanini, I'm trying to work through this to find a fix. I tried deploying the Nexus image after rewriting all the YAML files for the OC platform, and also the Dockerfile, from scratch. In that case, the container appears to run correctly in the pod, but the Nexus server does not start and no logs are written when I mount /nexus-data from a persistent volume claim (as happens with this operator). When I use an emptyDir{} volume instead, everything works.

If you have any ideas or further tests that I can run myself, that would be great. Thanks.

@titou10titou10

titou10titou10 commented Jan 25, 2023

FYI, I'm running the Nexus operator on OKD v4.12 (upgraded since v4.9...) and it works without problems.
I don't think the permissions you showed earlier are correct. OCP/OKD uses a random UID, and the group should be "root".
In the hope that it helps, here is some data about my running Nexus (currently v3.43.0-01 OSS):

sh-4.4$ id
uid=1000700000 gid=0(root) groups=0(root),1000700000

sh-4.4$ ls -la /nexus-data/
total 74
drwxrwxrwx.  18 root       root  21 Jan 22 20:15 .
dr-xr-xr-x.   1 root       root  39 Jan 22 20:15 ..
drwxr-xr-x.   2 1000700000 root 167 Jan 25 05:00 backups  --> not from Nexus
drwxr-xr-x.   5 1000700000 root   5 Jun 29  2022 blobs
drwxr-xr-x. 320 1000700000 root 321 Jan 22 20:16 cache
drwxr-xr-x.   6 1000700000 root   8 Aug 20 16:22 db
drwxr-xr-x.   3 1000700000 root   4 Jun 29  2022 elasticsearch
drwxr-xr-x.   3 1000700000 root   4 Jun 29  2022 etc
drwxr-xr-x.   2 1000700000 root   2 Jun 29  2022 generated-bundles
drwxr-xr-x.   2 1000700000 root   3 Jun 29  2022 instances
drwxr-xr-x.   3 1000700000 root   3 Jun 29  2022 javaprefs
drwxr-xr-x.   2 1000700000 root   2 Jun 29  2022 kar
-rw-r--r--.   1 1000700000 root   1 Jan 22 20:15 karaf.pid
drwxr-xr-x.   3 1000700000 root   3 Jun 29  2022 keystores
-rw-r--r--.   1 1000700000 root  25 Jan 22 20:15 lock
drwxr-xr-x.   4 1000700000 root  57 Jan 25 00:00 log
drwxr-xr-x.   2 1000700000 root   2 Jun 29  2022 orient
-rw-r--r--.   1 1000700000 root   5 Jan 22 20:15 port
drwxr-xr-x.   2 1000700000 root   2 Jun 29  2022 restore-from-backup
drwxr-xr-x.  24 1000700000 root  56 Jan 22 20:16 tmp
drwxr-xr-x.   2 1000700000 root   2 Aug 20 16:22 upgrades
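
The ownership and mode bits in a listing like this only matter insofar as the UID the container actually runs as can write to /nexus-data. Below is a minimal sketch of a write check that could be run from a terminal inside the pod. The function name `check_data_dir` and the `.write-test` file name are made up for illustration; they are not part of Nexus or the operator.

```shell
# Hypothetical helper: report the effective UID/GID and whether it can
# create a file in the data directory (default: /nexus-data).
check_data_dir() {
  dir=${1:-/nexus-data}
  echo "running as uid=$(id -u) gid=$(id -g)"
  # -w checks the permission bits; touch confirms an actual write succeeds.
  if [ -w "$dir" ] && touch "$dir/.write-test" 2>/dev/null; then
    rm -f "$dir/.write-test"
    echo "writable: $dir"
  else
    echo "NOT writable: $dir"
    return 1
  fi
}

# Check the default location; "|| true" keeps a failed check from aborting the shell.
check_data_dir || true
```

If the check fails only when the directory is backed by a persistent volume claim, that points toward the volume's ownership or mode rather than the image, though this thread does not confirm that as the cause.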

Part of my "Nexus" instance:

apiVersion: apps.m88i.io/v1alpha1
kind: Nexus
metadata:
  name: nexus3
  namespace: nexus
spec:
  image: registry.connect.redhat.com/sonatype/nexus-repository-manager
  replicas: 1
  serviceAccountName: nexus3
  useRedHatImage: true
{...}

As you can see, I'm using the RedHat image, so no need for SCC...

There is currently one potential problem with Nexus on OKD/OCP. The Nexus image uses Java 8, which is not "officially" compatible with cgroups v2, used by the FCOS 36/37 OS under OKD v4.11/4.12, but it doesn't seem to affect Nexus itself in my case.
I'm not sure whether your OCP v4.11 with CoreOS (I guess) uses cgroups v1 or cgroups v2, but I'm quite sure it has no effect.
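
To settle the cgroups v1 versus v2 question, the filesystem type mounted at /sys/fs/cgroup can be inspected, for example from a debug shell on the node. This is a minimal sketch assuming GNU `stat` is available; the function name `cgroup_version` is made up for illustration.

```shell
# Hypothetical helper: report which cgroup hierarchy is mounted at a path
# (default: /sys/fs/cgroup), based on the filesystem type.
cgroup_version() {
  fstype=$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)
  case "$fstype" in
    cgroup2fs) echo "v2" ;;   # unified hierarchy
    tmpfs)     echo "v1" ;;   # legacy layout: tmpfs holding per-controller mounts
    *)         echo "unknown (${fstype:-not mounted})" ;;
  esac
}

cgroup_version
```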

@marcozanghi
Author

marcozanghi commented Jan 25, 2023

Thanks for your feedback. I have already tried without modifying any permissions and with the original RedHat image. I have also tried with registry.connect.redhat.com/sonatype/nexus-repository-manager, and the result is the following error:
id: cannot find name for user ID 1000670000
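
A message like this usually means the UID the container runs as has no entry in the user database, which is expected when the platform assigns a random UID. Whether it is related to the startup failure is not established in this thread. A minimal sketch for confirming the missing entry from inside the container, assuming `getent` is present in the image; the function name `has_passwd_entry` is made up for illustration.

```shell
# Hypothetical helper: succeed if the given UID (default: the current one)
# resolves to an entry in the passwd database.
has_passwd_entry() {
  getent passwd "${1:-$(id -u)}" >/dev/null 2>&1
}

if has_passwd_entry; then
  echo "uid $(id -u) has a passwd entry"
else
  echo "uid $(id -u) has no passwd entry"
fi
```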

I have checked the nexus user as the following:

image

I can't upgrade the cluster at the moment because of the new Kubernetes version and the many API deprecations that come with OC 4.12.
Are you using a PV?
