
NooBaa nsfs service showing Error No disk candidates found at generate entropy due to unsupported /dev/nvme0 devices #8598

Open
ramya-c3 opened this issue Dec 17, 2024 · 18 comments · Fixed by #8600 · May be fixed by #8777

@ramya-c3

Environment info

  • NooBaa Version: 5.17.0
  • Platform: Standalone

Actual behavior

[nsfs/465779] [LOG] CONSOLE:: generate_entropy: error
Error: No disk candidates found
    at generate_entropy (/usr/local/noobaa-core/src/util/nb_native.js:138:27)
    at async init_rand_seed (/usr/local/noobaa-core/src/util/nb_native.js:63:5)
Dec 12 19:00:24 stor107.ete14.res.ibm.com node[465779]: Dec-12 19:00:24.636 [nsfs/465779] [LOG] CONSOLE:: generate_entropy: error Error: No disk candidates found

Expected behavior

1. generate_entropy should be able to handle NVMe devices, so that the NooBaa nsfs service starts without error.

Steps to reproduce

The problem can be mitigated by adding /dev/nvme0 to the disk list at line 122 in /usr/local/noobaa-core/src/util/nb_native.js:

for (disk of ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda', '/dev/nvme0']) {

and restarting the NooBaa nsfs service.
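
For reference, here is a small standalone Node.js check (illustrative only, not part of NooBaa) that prints which of the candidate devices exist on a given node:

const fs = require('fs');

// print which of the candidate block devices are present on this node
for (const disk of ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda', '/dev/nvme0', '/dev/nvme0n1']) {
    console.log(disk, fs.existsSync(disk) ? 'present' : 'missing');
}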

More information - Screenshots / Logs / Other output

@ramya-c3 ramya-c3 added the NS-FS label Dec 17, 2024
@rkomandu
Collaborator

@romayalon, this is similar to #7982

@rkomandu
Collaborator

@ramya-c3, could you work with Romy and, if needed, check the fix? We tried manually patching 5.15.6 on the ECE cluster yesterday with respect to the NVMe disks.

@romayalon, please let Ramya know if the upstream fix needs to be tested. However, as mentioned over Slack, we don't have an ECE cluster with NVMe drives at hand.

@ramya-c3
Author

@romayalon This issue is not resolved; /dev/nvme0n1 should be added instead of nvme0 alone. Is there a way we can have a wildcard character, so that any number of characters after nvme is matched?

@rkomandu
Collaborator

IMO, we need to understand the role of this code; can @romayalon explain? @ramya-c3, if possible, can you post the lsblk output from the ECE cluster for her reference?

@romayalon
Contributor

@rkomandu @ramya-c3 this code was written 8 years ago, and its purpose is to make sure we generate highly random encryption keys. Attaching here a link to a detailed GitHub issue that Guy opened explaining the purpose. @guymguym might be able to share more details.
@ramya-c3 Can you run the new code on a machine with '/dev/nvme0' to check my change?
Also, please check which disks are possible on an ECE cluster, so we don't keep iterating on this.

@shirady
Contributor

shirady commented Dec 22, 2024

@guymguym Do you know if this part of the code must be hard-coded?
I thought that for future cases, maybe we can add it as a property in the config.json?

@romayalon romayalon reopened this Dec 23, 2024
@romayalon
Contributor

@rkomandu @ramya-c3 I'm reverting /dev/nvme0 for now; please check which disks are possible on the ECE cluster so we can make this change properly in the next PR.

@romayalon
Contributor

@rkomandu @ramya-c3 any news about it?

@ramya-c3
Author

@romayalon After discussion with the internal team, we learned that the NVMe drive paths have a namespace attached, and it will be different each time, so we need regex support along with an nvme* suffix.

@shirady
Contributor

shirady commented Jan 21, 2025

@romayalon After discussion with the internal team, we learned that the NVMe drive paths have a namespace attached, and it will be different each time, so we need regex support along with an nvme* suffix.

@ramya-c3 would you attach a couple of examples?
I didn't understand the "namespace attached" part regarding the NVMe drives.

Anyway, as it is written today, we go over an array (currently with 4 items, as you can see) and iterate over it.

for (disk of ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda']) {

A regex would not fit in this context; could you pass it as an env (environment variable)?
My thought is to read the env, check that it matches the regex you want, and insert it into the array.
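
For illustration, a minimal sketch of that idea; the env name ENTROPY_DISKS and the validation pattern below are hypothetical, not existing NooBaa settings:

// Hypothetical env var, e.g. ENTROPY_DISKS="/dev/nvme0n1,/dev/nvme1n1"
const extra_disks = (process.env.ENTROPY_DISKS || '')
    .split(',')
    .map(name => name.trim())
    .filter(name => /^\/dev\/[a-z0-9]+$/i.test(name)); // accept only plain /dev/<name> entries
const disk_candidates = ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda', ...extra_disks];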

By the way, in this flow it comes from the function generate_entropy, which is called from init_rand_seed, and that function is called only under this condition:

if (process.env.DISABLE_INIT_RANDOM_SEED !== 'true') {
init_rand_seed();
}
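
(A side note that follows directly from the condition above: exporting DISABLE_INIT_RANDOM_SEED=true before starting the service skips init_rand_seed(), and with it generate_entropy(), entirely.)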

@shirady shirady self-assigned this Jan 21, 2025
@ramya-c3
Author

No, it is not possible; it will be automatically picked up by the system.

@shirady
Contributor

shirady commented Jan 22, 2025

Hi @ramya-c3,
As I tried to explain regarding the code limitation in the comment above - when you said:

No, it is not possible; it will be automatically picked up by the system.

Could you describe how it is picked and where it is saved?
Is this disk picked before the noobaa service is started?
Is it saved somewhere so the admin can read it and pass it as an env (or config) to noobaa before it starts?

@ramya-c3
Author

The lsblk command will provide the output; you can take a look at it.

lsblk

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 25G 0 disk
├─sda1 8:1 0 1023M 0 part /boot
└─sda2 8:2 0 19G 0 part /
sdb 8:16 0 10G 0 disk
└─sdb1 8:17 0 10G 0 part
sdc 8:32 0 10G 0 disk
└─sdc1 8:33 0 10G 0 part
sdd 8:48 0 10G 0 disk
└─sdd1 8:49 0 10G 0 part
sde 8:64 0 10G 0 disk
└─sde1 8:65 0 10G 0 part
sdf 8:80 0 10G 0 disk
└─sdf1 8:81 0 10G 0 part
sdg 8:96 0 10G 0 disk
└─sdg1 8:97 0 10G 0 part
sdh 8:112 0 10G 0 disk
└─sdh1 8:113 0 10G 0 part
sdi 8:128 0 10G 0 disk
└─sdi1 8:129 0 10G 0 part
sr0 11:0 1 1024M 0 rom

@shirady
Contributor

shirady commented Jan 27, 2025

Hi @ramya-c3,
Thank you for referencing the command. From the example output you attached, I didn't understand what exactly should be added to the array.

I tried running the command in a Linux terminal (I don't have a Linux machine, so I used this link):

$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
nvme0n1     259:0    0  3.5T  0 disk 
nvme1n1     259:1    0  3.5T  0 disk 
|-nvme1n1p1 259:2    0    4G  0 part [SWAP]
|-nvme1n1p2 259:3    0    1G  0 part 
|-nvme1n1p3 259:4    0    2T  0 part /etc/hosts
|                                    /etc/hostname
|                                    /etc/resolv.conf
|                                    /usr/sbin/docker-init
|-nvme1n1p4 259:5    0  1.5T  0 part /home/cg/root
`-nvme1n1p5 259:6    0    1M  0 part

So, if I understand correctly from what I ran, I should add the following disks:

$ lsblk | grep disk
nvme0n1     259:0    0  3.5T  0 disk 
nvme1n1     259:1    0  3.5T  0 disk 

@ramya-c3
Author

/dev/nvme1n1,/dev/nvme0n1,/dev/nvme0n1p1,/dev/nvme0n1p2,/dev/nvme0,/dev/nvme1

@shirady
Contributor

shirady commented Jan 27, 2025

Hi,
I’m looking at the list you attached @ramya-c3

  • /dev/nvme1n1
  • /dev/nvme0n1
  • /dev/nvme0n1p1
  • /dev/nvme0n1p2
  • /dev/nvme0
  • /dev/nvme1

Do you want to add the whole list?
If I understand correctly, the disk names are /dev/nvme0 and /dev/nvme1, but the “n1” represents the namespace inside it and the “p” is for the partition (reference).
I’m trying to understand whether adding the disk names (/dev/nvme0 and /dev/nvme1) will be enough, or whether you need the full list?

(copied from Slack thread).
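
For reference, the Linux NVMe naming convention can be expressed with a small regex (an illustrative sketch based on the standard nvme<controller>n<namespace>p<partition> scheme, not NooBaa code):

const NVME_RE = /^\/dev\/nvme(\d+)(?:n(\d+))?(?:p(\d+))?$/;
// '/dev/nvme0'     -> controller device only
// '/dev/nvme0n1'   -> namespace 1 on controller 0 (the block device lsblk reports as type "disk")
// '/dev/nvme0n1p2' -> partition 2 inside that namespace
for (const path of ['/dev/nvme0', '/dev/nvme0n1', '/dev/nvme0n1p2']) {
    console.log(path, NVME_RE.exec(path));
}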

@shirady
Contributor

shirady commented Feb 2, 2025

Hi @ramya-c3,
According to the Slack thread, the plan is:

  1. To add 2 items to the list - this was done in PR NC | Related to Issue 8598 | generate_entropy() - Add Disks (Temporary and Partial Fix) #8734.
  2. To find a long-term solution that would also be valid for more than 1 NVMe namespace.

I don't have a Linux machine (nor a GPFS cluster), but I created and tested this script in a Docker container. I wanted to see if we could collaborate on testing it.

Attached is a script that uses the lsblk command and adds the disk names to the array we already have:

const util = require('util');
const child_process = require('child_process');
const async_exec = util.promisify(child_process.exec);

async function main() {
    const array_of_disk_names = ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda', '/dev/nvme0n1', '/dev/nvme1n1'];
    console.log('SDSD original array_of_disk_names', array_of_disk_names);
    const additional_array_of_disk_names = await get_disk_names();
    console.log('SDSD additional_array_of_disk_names', additional_array_of_disk_names);
    if (additional_array_of_disk_names.length > 0) {
        const set_disk_names = new Set(array_of_disk_names); // to go over the original disk names only once
        for (const disk_name of additional_array_of_disk_names) {
            // add only the names of disks that we didn't have in array_of_disk_names
            if (!set_disk_names.has(disk_name)) {
                array_of_disk_names.push(disk_name);
            }
        }
    }
    console.log('SDSD combined array_of_disk_names with additional_array_of_disk_names', array_of_disk_names);
}

/**
 * get_disk_names will return the disk names using the lsblk command
 * @returns {Promise<string[]>} array_of_disk_names
 */
async function get_disk_names() {
    try {
        const res = await async_exec(`lsblk --json`);
        const res_json = JSON.parse(res.stdout);
        const disks = res_json.blockdevices.filter(block_device => block_device.type === 'disk');
        const array_of_disk_names = disks.map(disk => '/dev/' + disk.name); // take the disk name and add the '/dev/' prefix to match the original array
        return array_of_disk_names;
    } catch (err) {
        console.log('get_disk_names, got an error:', err);
        return [];
    }
}

exports.main = main;
if (require.main === module) main();

The output:

SDSD original array_of_disk_names [
  '/dev/sda',
  '/dev/vda',
  '/dev/xvda',
  '/dev/dasda',
  '/dev/nvme0n1',
  '/dev/nvme1n1'
]
SDSD additional_array_of_disk_names [ '/dev/vda', '/dev/vdb' ]
SDSD combined array_of_disk_names with additional_array_of_disk_names [
  '/dev/sda',
  '/dev/vda',
  '/dev/xvda',
  '/dev/dasda',
  '/dev/nvme0n1',
  '/dev/nvme1n1',
  '/dev/vdb'
]

You can see in the output that we saw 2 disks '/dev/vda' (was already in the list) and '/dev/vdb' (was added to the array).

If that's fine, I would add the function get_disk_names() to the code, but we need to see it on a GPFS machine.
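
If it helps, here is a minimal sketch (under the assumption that the surrounding generate_entropy() internals stay as they are) of how the combined list could feed the existing candidate loop:

async function build_disk_candidates() {
    const disk_candidates = ['/dev/sda', '/dev/vda', '/dev/xvda', '/dev/dasda', '/dev/nvme0n1', '/dev/nvme1n1'];
    for (const name of await get_disk_names()) {
        if (!disk_candidates.includes(name)) disk_candidates.push(name);
    }
    return disk_candidates;
}
// generate_entropy() would then iterate the returned list instead of the hard-coded array:
//     for (const disk of await build_disk_candidates()) { ... existing per-disk entropy logic ... }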

@shirady
Contributor

shirady commented Feb 6, 2025

@ramya-c3,
I ran the script above on a GPFS cluster (I created a file script_test.js in /usr/local/noobaa-core) and the output is:

node ./script_test.js

SDSD original array_of_disk_names [
  '/dev/sda',
  '/dev/vda',
  '/dev/xvda',
  '/dev/dasda',
  '/dev/nvme0n1',
  '/dev/nvme1n1'
]
SDSD additional_array_of_disk_names [
  '/dev/sda', '/dev/sdb',
  '/dev/sdc', '/dev/sdd',
  '/dev/sde', '/dev/sdf',
  '/dev/sdg', '/dev/sdh',
  '/dev/sdi'
]
SDSD combined array_of_disk_names with additional_array_of_disk_names [
  '/dev/sda',     '/dev/vda',
  '/dev/xvda',    '/dev/dasda',
  '/dev/nvme0n1', '/dev/nvme1n1',
  '/dev/sdb',     '/dev/sdc',
  '/dev/sdd',     '/dev/sde',
  '/dev/sdf',     '/dev/sdg',
  '/dev/sdh',     '/dev/sdi'
]

Is that what you wanted?
I'm asking because in your example you wanted a kind of regex for NVMe disks with namespaces, and looking at this list I see additional disks, none of which is an NVMe device with namespaces.
