pkidestroy on a failed install does not properly remove instance unit and subsequent pkispawn fails #5134

@taherrin

Description

Summary:

pkispawn fails due to a missing wants directory - /etc/systemd/system/pki-tomcatd.target.wants - if the pkidestroy that ran before it did not properly clean up the instance unit in that directory (i.e. /etc/systemd/system/pki-tomcatd.target.wants/[email protected])

This happens when pkispawn fails partway through the script, for example with "Invalid/obsolete admin certificate", which can easily occur because pkidestroy does not clean up the admin directory. Normally, the subsequent pkidestroy removes the instance unit (i.e. /etc/systemd/system/pki-tomcatd.target.wants/[email protected]) before disabling the systemd service at the end of pkidestroy (systemctl disable [email protected]), which prevents the wants directory from being removed recursively.

Instead, when pkidestroy cleans up the half-installed instance, it attempts to remove a nonexistent instance unit - /etc/systemd/system/pki-tomcatd.target.wants/[email protected] - and the actual instance unit remains. So when the service is disabled, the directory /etc/systemd/system/pki-tomcatd.target.wants is removed as well, even though pkispawn needs it.
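
The sequence above can be reproduced in isolation. The following is a minimal sketch, not PKI or systemd code: it works in a temporary directory, the file names unit.service and instance.service are hypothetical stand-ins, and the "disable" step only mimics the behavior described in this report (the leftover link is removed, then the empty wants directory).

```python
import os
import tempfile

# Hypothetical stand-ins for the real paths; everything lives in a temp dir.
root = tempfile.mkdtemp()
unit_file = os.path.join(root, "unit.service")        # stands in for the unit under /lib/systemd/system
wants_dir = os.path.join(root, "pki-tomcatd.target.wants")
link = os.path.join(wants_dir, "instance.service")    # stands in for the instance unit link

open(unit_file, "w").close()
os.makedirs(wants_dir)
os.symlink(unit_file, link)

# Failure path from the report: the link is still present when the service
# is disabled, so the link goes away and then the now-empty directory too.
os.unlink(link)
if not os.listdir(wants_dir):
    os.rmdir(wants_dir)

# The next spawn attempt tries to create the link again.
try:
    os.symlink(unit_file, link)
    outcome = "linked"
except FileNotFoundError:
    outcome = "FileNotFoundError"

print(outcome)
```

If the link had been removed before the disable step, nothing would be left to clean up and the directory would survive, which is the normal pkidestroy path described above.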

pkidestroy --debug on a successful fully-installed CA: uninstall_full.log
pkidestroy --debug on a failed half-installed CA: uninstall_half.log

Build:

OS: Fedora release 42 (Adams)
dogtag-pki-11.7.0-0.1.alpha1.20250623132230UTC.4e475eff.fc42.x86_64
COPR: @pki/master

Steps to reproduce:

  1. Install DS & CA instances

  2. Destroy CA instance:

# pkidestroy -s CA -i topology-02 --remove-logs --remove-conf --force
  3. Re-install the CA instance (it will fail because the admin directory containing the p12 file was not removed by the previous pkidestroy):
# pkispawn -s CA -f ca.cfg --debug

. . . output omitted . . .

INFO: Verifying admin cert in /opt/topology-02-CA/ca_admin_cert.p12
DEBUG: Command: pki -d /var/lib/pki/topology-02-CA/conf/alias -f /var/lib/pki/topology-02-CA/conf/password.conf nss-cert-verify --debug
FINE: Initializing NSS
FINE: Logging into internal token
FINE: Using internal token
FINE: PKITrustManager: getAcceptedIssuers():
FINE: PKITrustManager:  - CN=CA Signing Certificate,OU=topology-02-CA,O=topology-02_Foobarmaster.org
FINE: PKITrustManager: checkCert(CN=PKI Administrator,[email protected],OU=topology-02-CA,O=topology-02_Foobarmaster.org):
FINE: PKITrustManager: cert AKI: null
FINE: PKITrustManager: SKI of CN=CA Signing Certificate,OU=topology-02-CA,O=topology-02_Foobarmaster.org: null
FINE: PKITrustManager: cert not issued by CN=CA Signing Certificate,OU=topology-02-CA,O=topology-02_Foobarmaster.org: Signature does not match
ERROR: Invalid certificate: Unable to validate certificate signature: CN=PKI Administrator,[email protected],OU=topology-02-CA,O=topology-02_Foobarmaster.org
 
Installation failed: Invalid/obsolete admin certificate in /opt/topology-02-CA/ca_admin_cert.p12
  4. Observe that the PKI instance is still listed by the instance-find command:
# pki-server instance-find
-----------------
1 entries matched
-----------------
  Instance ID: topology-02-CA
  Active: False
  5. Try to remove the failed instance:
# pkidestroy -s CA -i topology-02-CA --force --remove-logs --remove-conf

Uninstalling CA from /var/lib/pki/topology-02-CA.
SSLSocketException: Unable to connect: (-5961) TCP connection reset by peer.
ERROR: Unable to remove CA from security domain
ERROR: To remove manually:
ERROR: $ pki -U https://pki1.example.com:20443 -n  securitydomain-host-del "CA pki1.example.com 20443"
ERROR: Command '['pki', '-d', '/var/lib/pki/topology-02-CA/conf/alias', '-f', '/var/lib/pki/topology-02-CA/conf/password.conf', '-n', 'subsystemCert cert-topology-02-CA', '-U', 'https://pki1.example.com:20443', '--ignore-banner', 'securitydomain-leave', '--type', 'CA', '--hostname', 'pki1.example.com', '--secure-port', '20443', 'CA pki1.example.com 20443']' returned non-zero exit status 255.
WARNING: Link not found: /etc/systemd/system/pki-tomcatd.target.wants/[email protected]
WARNING: Directory not found: /var/lib/pki/topology-02-CA/webapps
Removed /etc/systemd/system/pki-tomcatd.target.wants/[email protected].
 
Uninstallation complete.

# pki-server instance-find
-----------------
0 entries matched
-----------------
  6. Install the CA again; it will fail. The target wants directory /etc/systemd/system/pki-tomcatd.target.wants is missing even though the symlink target /lib/systemd/system/[email protected] exists:
# pkispawn -s CA -f /tmp/test_dir/ca.cfg --debug
Loading deployment configuration from /tmp/test_dir/ca.cfg.

. . . output omitted . . .

Installation failed: [Errno 2] No such file or directory: '/lib/systemd/system/[email protected]' -> '/etc/systemd/system/pki-tomcatd.target.wants/[email protected]'
 
# pki-server instance-find
-----------------
1 entries matched
-----------------
  Instance ID: topology-02-CA
  Active: False
  7. According to pki-server instance-find the instance exists, but it cannot be removed:
# pkidestroy -s CA -i topology-02-CA --force --remove-logs --remove-conf
ERROR: No CA subsystem in topology-02-CA instance

# pkidestroy -s CA -i topology-02-CA --force --remove-logs --remove-conf
ERROR: No CA subsystem in topology-02-CA instance

Expected Result:

If a PKI instance fails to install, pkidestroy should clean it up properly. The valid instance unit should be removed by the pkidestroy script so that the wants directory is not removed.

Actual Result:

pkidestroy does not remove the actual instance unit, so systemctl disable <instance systemd service> removes the wants directory. The next pkispawn attempt fails with the error below:

. . . output omitted . . .

INFO: Creating /etc/sysconfig/pki/tomcat/topology-02-CA/topology-02-CA
DEBUG: Command: cp /usr/share/pki/setup/pkidaemon_registry /etc/sysconfig/pki/tomcat/topology-02-CA/topology-02-CA
DEBUG: Command: systemctl daemon-reload
INFO: Linking /etc/systemd/system/pki-tomcatd.target.wants/[email protected] to /lib/systemd/system/[email protected]
DEBUG: Command: ln -s /lib/systemd/system/[email protected] /etc/systemd/system/pki-tomcatd.target.wants/[email protected]
ERROR: FileNotFoundError: [Errno 2] No such file or directory: '/lib/systemd/system/[email protected]' -> '/etc/systemd/system/pki-tomcatd.target.wants/[email protected]'
  File "/usr/lib/python3.6/site-packages/pki/server/pkispawn.py", line 594, in main
    deployer.spawn()
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/__init__.py", line 5952, in spawn
    scriptlet.spawn(self)
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/scriptlets/instance_layout.py", line 237, in spawn
    exist_ok=True)
  File "/usr/lib/python3.6/site-packages/pki/server/__init__.py", line 707, in symlink
    exist_ok=exist_ok)
  File "/usr/lib/python3.6/site-packages/pki/util.py", line 134, in symlink
    os.symlink(source, dest)
 
 
Installation failed: [Errno 2] No such file or directory: '/lib/systemd/system/[email protected]' -> '/etc/systemd/system/pki-tomcatd.target.wants/[email protected]'
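
The traceback ends in os.symlink, which raises FileNotFoundError when the destination's parent directory is missing. One possible mitigation is to create the parent directory before linking. This is only a sketch: symlink_with_parent is a hypothetical helper, not the existing pki.util.symlink, and the demo paths are invented and live in a temporary directory.

```python
import os
import tempfile


def symlink_with_parent(source, dest, exist_ok=False):
    """Hypothetical helper: create dest -> source, creating dest's parent first."""
    parent = os.path.dirname(dest)
    if parent:
        # Recreate the wants directory if an earlier cleanup removed it.
        os.makedirs(parent, exist_ok=True)
    try:
        os.symlink(source, dest)
    except FileExistsError:
        if not exist_ok:
            raise


# Demo with invented paths in a temp dir; the parent of demo_dest does not exist yet.
demo_root = tempfile.mkdtemp()
demo_source = os.path.join(demo_root, "unit.service")
demo_dest = os.path.join(demo_root, "missing.wants", "instance.service")
open(demo_source, "w").close()

symlink_with_parent(demo_source, demo_dest, exist_ok=True)
symlink_with_parent(demo_source, demo_dest, exist_ok=True)  # second call is a no-op
print(os.path.islink(demo_dest))
```

Whether pkispawn should recreate the directory or pkidestroy should avoid removing it is a design decision for the maintainers; the sketch only shows that the failing call is sensitive to the missing parent.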

Additional Info:

The quickest workaround (instead of re-installing the packages) is to re-enable the service:

# systemctl enable [email protected]

pkispawn should then succeed. Additionally, the instance unit can be removed and the service disabled again to replicate what a proper pkidestroy should do:

# rm -rf /etc/systemd/system/pki-tomcatd.target.wants/[email protected]
# systemctl disable [email protected]
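
Before applying the workaround, it may help to confirm which state the system is in. The following is a small diagnostic sketch; wants_dir_state is a hypothetical helper, and the default path is the wants directory named in this report.

```python
import os


def wants_dir_state(wants_dir="/etc/systemd/system/pki-tomcatd.target.wants"):
    """Return 'missing', 'empty', or a sorted list of entries in wants_dir."""
    if not os.path.isdir(wants_dir):
        return "missing"
    entries = sorted(os.listdir(wants_dir))
    return entries if entries else "empty"


if __name__ == "__main__":
    # 'missing' corresponds to the state in which pkispawn fails as reported;
    # a list of entries shows which instance unit links are still present.
    print(wants_dir_state())
```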
