Update mount points #2092

Merged
JimMadge merged 22 commits into develop from mount_points on Sep 20, 2024

Conversation

JimMadge
Member

@JimMadge JimMadge commented Aug 6, 2024

✅ Checklist

  • You have given your pull request a meaningful title (e.g. Enable foobar integration rather than 515 foobar).
  • You are targeting the appropriate branch. If you're not certain which one this is, it should be develop.
  • Your branch is up-to-date with the target branch (it probably was when you started, but it may have changed since then).

🚦 Depends on

⤴️ Summary

🌂 Related issues

Closes #2027

🔬 Tests

  • Tested in a clean deployment.

@JimMadge JimMadge requested a review from a team as a code owner August 6, 2024 08:57

github-actions bot commented Aug 6, 2024

Coverage report

This PR does not seem to contain any modification to coverable code.

@craddm
Contributor

craddm commented Aug 6, 2024

This is what I'm seeing in the serial console after trying to deploy a new VM in an existing SRE:

[  165.116160] cloud-init[1938]: >=== Mounting all external volumes... ===<
[  165.121676] cloud-init[1938]:   /etc/fstab  # CLOUD_IMG: This file was created/modified by the Cloud Image build process
[  165.127771] cloud-init[1938]:   /etc/fstab  UUID=03496b49-bced-4cc6-8502-efd2ae48dbb9        /        ext4   discard,errors=remount-ro       0 1
[  165.133341] cloud-init[1938]:   /etc/fstab  UUID=F229-A1AA   /boot/efi       vfat    umask=0077      0 1
[  165.139420] cloud-init[1938]:   /etc/fstab  shgresremoodesiredstatec.blob.core.windows.net:/shgresremoodesiredstatec/desiredstate    /var/local/ansible      nfs     ro,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig  0       2
[  165.145367] cloud-init[1938]:   /etc/fstab  shgresremoosensitivedata.blob.core.windows.net:/shgresremoosensitivedata/ingress /mnt/input      nfs     ro,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig  0       2
[  165.151276] cloud-init[1938]:   /etc/fstab  shgresremoosensitivedata.blob.core.windows.net:/shgresremoosensitivedata/egress  /mnt/output     nfs     rw,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig  0       2
[  165.156950] cloud-init[1938]:   /etc/fstab  shmgreensremoocouserdata.file.core.windows.net:/shmgreensremoocouserdata/shared  /mnt/shared     nfs     _netdev,sec=sys,nconnect=4,comment=cloudconfig  0       2
[  165.163467] cloud-init[1938]:   /etc/fstab  shmgreensremoocouserdata.file.core.windows.net:/shmgreensremoocouserdata/home    /home   nfs     _netdev,sec=sys,nconnect=4,comment=cloudconfig  0       2
[  165.169982] cloud-init[1938]:   /etc/fstab  /dev/disk/cloud/azure_resource-part1     /mnt    auto    defaults,nofail,x-systemd.requires=cloud-init.service,_netdev,comment=cloudconfig       0       2
[  165.216682] cloud-init[1938]: mount.nfs: trying 10.0.1.28 prog 100003 vers 3 prot TCP port 2048
[  165.255024] cloud-init[1938]: mount.nfs: trying 10.0.1.28 prog 100005 vers 3 prot TCP port 2048
[  165.269955] cloud-init[1938]: mount.nfs: timeout set for Tue Aug  6 10:15:40 2024
[  165.278388] cloud-init[1938]: mount.nfs: trying text-based options 'sec=sys,vers=3,nolock,proto=tcp,addr=10.0.1.28'
[  165.283072] cloud-init[1938]: mount.nfs: prog 100003, trying vers=3, prot=6
[  165.289101] cloud-init[1938]: mount.nfs: prog 100005, trying vers=3, prot=6
[  165.298038] cloud-init[1938]: mount.nfs: mount point /mnt/input does not exist
[  165.306742] cloud-init[1938]: mount.nfs: mount point /mnt/output does not exist
[  165.318891] cloud-init[1938]: mount.nfs: mount point /mnt/shared does not exist
[  165.352792] cloud-init[1938]: mount.nfs: timeout set for Tue Aug  6 10:15:41 2024
[  165.362954] cloud-init[1938]: mount.nfs: trying text-based options 'sec=sys,nconnect=4,vers=4.2,addr=10.0.1.36,clientaddr=10.0.2.4'
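The log above shows `mount.nfs` rejecting `/mnt/input`, `/mnt/output` and `/mnt/shared` because the directories do not exist when the mount is attempted. A minimal sketch of a pre-flight check for that condition follows, assuming Python is available on the VM. The helper names (`parse_fstab`, `missing_mount_points`) and the `root` parameter are hypothetical and not part of the project; the sample entries are copied from the log, with whitespace collapsed.

```python
import os
import tempfile
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str


def parse_fstab(text: str) -> list[FstabEntry]:
    """Parse fstab text, skipping blank lines, comments and malformed lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # too few fields to be a valid entry; ignored in this sketch
        entries.append(FstabEntry(*fields[:4]))
    return entries


def missing_mount_points(entries: list[FstabEntry], root: str = "/") -> list[str]:
    """Return mount points of NFS entries that are not directories under root."""
    missing = []
    for entry in entries:
        if not entry.fs_type.startswith("nfs"):
            continue
        path = os.path.join(root, entry.mount_point.lstrip("/"))
        if not os.path.isdir(path):
            missing.append(entry.mount_point)
    return missing


# Entries taken from the serial console output above.
SAMPLE_FSTAB = """\
# CLOUD_IMG: This file was created/modified by the Cloud Image build process
UUID=03496b49-bced-4cc6-8502-efd2ae48dbb9 / ext4 discard,errors=remount-ro 0 1
shgresremoodesiredstatec.blob.core.windows.net:/shgresremoodesiredstatec/desiredstate /var/local/ansible nfs ro,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig 0 2
shgresremoosensitivedata.blob.core.windows.net:/shgresremoosensitivedata/ingress /mnt/input nfs ro,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig 0 2
shgresremoosensitivedata.blob.core.windows.net:/shgresremoosensitivedata/egress /mnt/output nfs rw,_netdev,sec=sys,vers=3,nolock,proto=tcp,comment=cloudconfig 0 2
shmgreensremoocouserdata.file.core.windows.net:/shmgreensremoocouserdata/shared /mnt/shared nfs _netdev,sec=sys,nconnect=4,comment=cloudconfig 0 2
shmgreensremoocouserdata.file.core.windows.net:/shmgreensremoocouserdata/home /home nfs _netdev,sec=sys,nconnect=4,comment=cloudconfig 0 2
"""

# Demonstrate against a sandbox directory rather than the real filesystem:
# only /home and /var/local/ansible are created, mirroring the failing VM.
with tempfile.TemporaryDirectory() as sandbox:
    os.makedirs(os.path.join(sandbox, "home"))
    os.makedirs(os.path.join(sandbox, "var", "local", "ansible"))
    missing = missing_mount_points(parse_fstab(SAMPLE_FSTAB), root=sandbox)

print(missing)  # ['/mnt/input', '/mnt/output', '/mnt/shared']
```

On a real machine the check would be run with the default `root="/"` before `mount -a`, and any reported directories created first.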

@jemrobinson
Member

jemrobinson commented Aug 6, 2024

The symlinks aren't being uploaded as they're actually local symlinks on the deployment system:

LOCAL

$ ls -alh /Users/jrobinson/Developer/data-safe-haven/code/dsh-upstream/data_safe_haven/resources/workspace/ansible/files/etc/skel/input
lrwxr-xr-x  1 jrobinson  staff    10B  6 Aug 13:40 /Users/jrobinson/Developer/data-safe-haven/code/dsh-upstream/data_safe_haven/resources/workspace/ansible/files/etc/skel/input -> /mnt/input

WORKSPACE

$ ls -alh /var/local/ansible/files/etc/skel
total 2.0K
dr-xr-xr-x 2 root root    0 Aug  6 10:50 .
dr-xr-xr-x 2 root root    0 Aug  6 10:51 ..
-r-xr-xr-x 1 root root 1.3K Aug  6 10:50 bashrc
-r-xr-xr-x 1 root root   14 Aug  6 10:50 xsession

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

OK, we can create them another way.

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

@jemrobinson Changed in db95c0f

@jemrobinson
Member

jemrobinson commented Aug 6, 2024

This also fails although I'm not sure why:
EDIT: I think we need /etc instead of etc in the symlinks.

TASK [Create skeleton symlinks] ************************************************
failed: [localhost] (item={'path': 'etc/skel/input', 'src': '/mnt/input'}) => {"ansible_loop_var": "item", "changed": false, "item": {"path": "etc/skel/input", "src": "/mnt/input"}, "msg": "Error while linking: [Errno 2] No such file or directory: b'/mnt/input' -> b'etc/skel/input'", "path": "etc/skel/input"}
failed: [localhost] (item={'path': 'etc/skel/output', 'src': '/mnt/output'}) => {"ansible_loop_var": "item", "changed": false, "item": {"path": "etc/skel/output", "src": "/mnt/output"}, "msg": "Error while linking: [Errno 2] No such file or directory: b'/mnt/output' -> b'etc/skel/output'", "path": "etc/skel/output"}
failed: [localhost] (item={'path': 'etc/skel/shared', 'src': '/mnt/shared'}) => {"ansible_loop_var": "item", "changed": false, "item": {"path": "etc/skel/shared", "src": "/mnt/shared"}, "msg": "Error while linking: [Errno 2] No such file or directory: b'/mnt/shared' -> b'etc/skel/shared'", "path": "etc/skel/shared"}

The failure is surprising because the link targets do exist:

$ ls -alh /mnt/
total 8.5K
drwxr-xr-x  5 root root 4.0K Aug  6 12:45 .
drwxr-xr-x 19 root root 4.0K Aug  6 12:56 ..
drwxr-xr-x  2 root root    0 Aug  6 10:58 input
drwxrwxrwx  2 root root    0 Aug  6 10:58 output
drwxrwxrwx  2 root root   64 Aug  6 10:58 shared
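The error message names both paths (`b'/mnt/input' -> b'etc/skel/input'`), so "No such file or directory" can refer to the parent of the link path and not to the target. A relative link path is resolved against the working directory of the process that creates it. The sketch below reproduces that behaviour with plain `os.symlink` in a temporary sandbox; the `make_link` helper and the directory layout are hypothetical, and it does not show how Ansible chooses its working directory.

```python
import os
import tempfile


def make_link(src: str, dest: str, cwd: str) -> str:
    """Try to create the symlink dest -> src with cwd as the working directory.

    Returns "ok" on success, otherwise the name of the exception raised.
    """
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        os.symlink(src, dest)
        return "ok"
    except OSError as exc:
        return type(exc).__name__
    finally:
        os.chdir(previous)


with tempfile.TemporaryDirectory() as sandbox:
    # Stand-in for the real /etc/skel, kept inside the sandbox so the
    # sketch never touches the real filesystem.
    skel = os.path.join(sandbox, "etc", "skel")
    os.makedirs(skel)
    workdir = os.path.join(sandbox, "somewhere", "else")
    os.makedirs(workdir)

    # Relative link path: resolved against workdir, where etc/skel is absent.
    relative_result = make_link("/mnt/input", "etc/skel/input", cwd=workdir)

    # Absolute link path: independent of the working directory.
    absolute_result = make_link("/mnt/input", os.path.join(skel, "input"), cwd=workdir)

print(relative_result)  # FileNotFoundError
print(absolute_result)  # ok
```

The second call succeeds even if the target is absent, because creating a symlink does not require its target to exist.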

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

@jemrobinson Paths should be correct in bb47298

Member

@jemrobinson jemrobinson left a comment

LDAP users are not working with these changes:

$ getent passwd
...
saned:x:122:133::/var/lib/saned:/usr/sbin/nologin
colord:x:123:134:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
pulse:x:124:135:PulseAudio daemon,,,:/run/pulse:/usr/sbin/nologin
$ tail -f /var/log/auth.log 
Aug  6 13:24:35 shm-pink-sre-fuschia-vm-workspace-02 xrdp-sesman[23082]: pam_unix(xrdp-sesman:account): could not identify user (from getpwnam(james.robinson))

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

@jemrobinson Interesting, that looks very similar to the problem I had earlier?

I'm not sure why that would be, do you have any ideas?

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

@jemrobinson @craddm Let's bump this one until after the pen test?

@jemrobinson
Member

It's a missing setting in /etc/nsswitch.conf. Any idea why this might have stopped being set?

WORKING

$ cat /etc/nsswitch.conf
passwd:         files systemd ldap
group:          files systemd ldap
shadow:         files ldap
gshadow:        files

hosts:          files mdns4_minimal [NOTFOUND=return] dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

NON-WORKING

$ cat /etc/nsswitch.conf
passwd:         files systemd
group:          files systemd
shadow:         files
gshadow:        files

hosts:          files mdns4_minimal [NOTFOUND=return] dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

i.e. the ldap source is missing from the passwd, group and shadow lines
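The comparison between the two files can be automated. The sketch below, with hypothetical helper names (`parse_nsswitch`, `databases_missing`), parses `nsswitch.conf` text into a mapping of database to sources and reports which of the account databases lack `ldap`. The sample text is the first four lines of each file quoted above.

```python
def parse_nsswitch(text: str) -> dict[str, list[str]]:
    """Map each database name to its ordered list of sources and actions."""
    sources = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        database, _, rest = line.partition(":")
        sources[database.strip()] = rest.split()
    return sources


def databases_missing(text, source="ldap", databases=("passwd", "group", "shadow")):
    """Return the databases from the given tuple whose line lacks the source."""
    parsed = parse_nsswitch(text)
    return [db for db in databases if source not in parsed.get(db, [])]


WORKING = """\
passwd:         files systemd ldap
group:          files systemd ldap
shadow:         files ldap
gshadow:        files
"""

NON_WORKING = """\
passwd:         files systemd
group:          files systemd
shadow:         files
gshadow:        files
"""

print(databases_missing(WORKING))      # []
print(databases_missing(NON_WORKING))  # ['passwd', 'group', 'shadow']
```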

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

That is the next task

- name: Add ldap to /etc/nsswitch.conf
  ansible.builtin.replace:
    path: /etc/nsswitch.conf
    regexp: '^(passwd|group|shadow)(:.*)(?<!ldap)$'
    replace: '\1\2 ldap'

Did that task run?
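To see what that task would do once it runs, the same pattern can be exercised directly with Python's `re` module. This is a sketch under the assumption that the pattern is applied in multiline mode, as the `ansible.builtin.replace` documentation describes; the `add_ldap` function is hypothetical and the input is the non-working file quoted earlier in the thread.

```python
import re

# Same pattern as the Ansible task. The negative lookbehind (?<!ldap) before $
# skips lines that already end in "ldap", so repeated runs change nothing.
PATTERN = re.compile(r"^(passwd|group|shadow)(:.*)(?<!ldap)$", re.MULTILINE)


def add_ldap(text: str) -> str:
    """Append ' ldap' to the passwd, group and shadow lines that lack it."""
    return PATTERN.sub(r"\1\2 ldap", text)


BEFORE = (
    "passwd:         files systemd\n"
    "group:          files systemd\n"
    "shadow:         files\n"
    "gshadow:        files\n"
)

once = add_ldap(BEFORE)
twice = add_ldap(once)

print(once)
print(once == twice)  # True
```

The `gshadow` line is left alone because the pattern is anchored at the start of the line, and applying the substitution a second time is a no-op.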

@jemrobinson
Member

> Did that task run?

No, because the symlink task failed first with "An exception occurred during task execution".

@JimMadge
Member Author

JimMadge commented Aug 6, 2024

In that case, it might just require running the playbook again (assuming the fatal error is fixed).

@jemrobinson
Member

jemrobinson commented Aug 6, 2024

/mnt/shared still failing:

ok: [localhost] => (item={'path': '/etc/skel/input', 'src': '/mnt/input'})
ok: [localhost] => (item={'path': '/etc/skel/output', 'src': '/mnt/output'})
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: PermissionError: [Errno 1] Operation not permitted: b'/mnt/shared'
failed: [localhost] (item={'path': '/etc/skel/shared', 'src': '/mnt/shared'}) => {"ansible_loop_var": "item", "changed": false, "item": {"path": "/etc/skel/shared", "src": "/mnt/shared"}, "module_stderr": "Traceback (most recent call last):\n  File \"/root/.ansible/tmp/ansible-tmp-1722952769.3476353-37900-21432236129960/AnsiballZ_file.py\", line 102, in <module>\n    _ansiballz_main()\n  File \"/root/.ansible/tmp/ansible-tmp-1722952769.3476353-37900-21432236129960/AnsiballZ_file.py\", line 94, in _ansiballz_main\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n  File \"/root/.ansible/tmp/ansible-tmp-1722952769.3476353-37900-21432236129960/AnsiballZ_file.py\", line 40, in invoke_module\n    runpy.run_module(mod_name='ansible.modules.file', init_globals=None, run_name='__main__', alter_sys=True)\n  File \"/usr/lib/python3.10/runpy.py\", line 224, in run_module\n    return _run_module_code(code, init_globals, run_name, mod_spec)\n  File \"/usr/lib/python3.10/runpy.py\", line 96, in _run_module_code\n    _run_code(code, mod_globals, init_globals,\n  File \"/usr/lib/python3.10/runpy.py\", line 86, in _run_code\n    exec(code, run_globals)\n  File \"/tmp/ansible_ansible.builtin.file_payload_apoyxk31/ansible_ansible.builtin.file_payload.zip/ansible/modules/file.py\", line 928, in <module>\n  File \"/tmp/ansible_ansible.builtin.file_payload_apoyxk31/ansible_ansible.builtin.file_payload.zip/ansible/modules/file.py\", line 916, in main\n  File \"/tmp/ansible_ansible.builtin.file_payload_apoyxk31/ansible_ansible.builtin.file_payload.zip/ansible/modules/file.py\", line 771, in ensure_symlink\n  File \"/tmp/ansible_ansible.builtin.file_payload_apoyxk31/ansible_ansible.builtin.file_payload.zip/ansible/module_utils/basic.py\", line 1422, in set_fs_attributes_if_different\n  File \"/tmp/ansible_ansible.builtin.file_payload_apoyxk31/ansible_ansible.builtin.file_payload.zip/ansible/module_utils/basic.py\", line 1186, in set_mode_if_different\nPermissionError: [Errno 1] 
Operation not permitted: b'/mnt/shared'\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

Can we do this with a script in /etc/profile instead?
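The traceback above ends in `set_mode_if_different`, and the error is reported against `b'/mnt/shared'` (the link target), not against `/etc/skel/shared`. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a permission change was attempted through the link onto the mounted share. The sketch below only demonstrates the general POSIX behaviour behind that reading: `os.chmod` on a symlink path acts on the target by default. The sandbox paths are stand-ins, not the real mount.

```python
import os
import stat
import tempfile


def mode_of(path: str) -> int:
    """Return the permission bits of path, following symlinks."""
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as sandbox:
    # Stand-in for the mounted share and for the skeleton symlink.
    target = os.path.join(sandbox, "shared")
    os.mkdir(target)
    os.chmod(target, 0o755)
    link = os.path.join(sandbox, "link")
    os.symlink(target, link)

    mode_before = mode_of(target)
    os.chmod(link, 0o700)  # follows the link: changes the target directory
    mode_after = mode_of(target)

print(oct(mode_before), oct(mode_after))  # 0o755 0o700
```

If the target lives on a share where the client may not change permissions, the equivalent call would fail with a permission error like the one in the traceback.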

@jemrobinson
Member

BUG:

Output folder is not writeable

$ touch /mnt/output/test.txt
touch: cannot touch 'test.txt': Permission denied
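Earlier in the thread `ls` reports `drwxrwxrwx` for `/mnt/output` and the fstab entry mounts it `rw`, yet the write is refused. For network mounts the server has the final say, so mode bits alone are not a reliable test. A minimal sketch of a probe that attempts a real write follows; the `probe_write` name is hypothetical.

```python
import os
import tempfile


def probe_write(directory: str) -> tuple[bool, str]:
    """Attempt to create and remove a temporary file in directory.

    Returns (True, "ok") on success, or (False, "<ExceptionName>: <reason>").
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=directory)
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc.strerror}"
    os.close(fd)
    os.remove(path)
    return True, "ok"


# Example against a directory that is known to be writeable.
with tempfile.TemporaryDirectory() as sandbox:
    writable_result = probe_write(sandbox)
    absent_result = probe_write(os.path.join(sandbox, "does-not-exist"))

print(writable_result)  # (True, 'ok')
```

Run as an ordinary user against `/mnt/output`, a probe like this would report the same refusal as `touch` while leaving nothing behind on success.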

Member

@jemrobinson jemrobinson left a comment

Output folder is not writeable for some reason. I can't see why.

@jemrobinson
Member

jemrobinson commented Aug 6, 2024

If we merge #2103 then we probably don't need the symlinks. If we can drop the symlinks entirely and fix the output folder issue, then we could consider merging this.

@jemrobinson jemrobinson changed the title Update mount points [WIP] Update mount points Aug 12, 2024
@jemrobinson
Member

@JimMadge worth coming back to this one after RSECon?

@JimMadge JimMadge requested a review from jemrobinson September 2, 2024 10:00
@JimMadge JimMadge changed the title [WIP] Update mount points Update mount points Sep 2, 2024
Member

@jemrobinson jemrobinson left a comment

Does this work? Given the previous issues, I'd like to see confirmation that the directories are correctly set up in a from-scratch deploy.

@JimMadge
Member Author

All mount points are as in #2027 (comment)

Deployment runs without error and the system is functional.

This was referenced Sep 17, 2024
Member

@jemrobinson jemrobinson left a comment

@JimMadge this looks fine, but can you confirm that it works (and all folders mount correctly) in a from-scratch deployment? I'm not sure why it wasn't working before, so I'd like to be sure.

@JimMadge
Member Author

> @JimMadge this looks fine, but can you confirm that it works (and all folders mount correctly) in a from-scratch deployment? I'm not sure why it wasn't working before, so I'd like to be sure.

Yes, everything was working in a fresh deployment.

Member

@jemrobinson jemrobinson left a comment

LGTM then :)

@JimMadge JimMadge merged commit 9117fa5 into develop Sep 20, 2024
10 checks passed
@JimMadge JimMadge deleted the mount_points branch September 20, 2024 09:12
Development

Successfully merging this pull request may close these issues.

Decide where volumes should be mounted
3 participants