iDRAC: Freezes upon boot or doesn't recognize inputs? #294

Open
lee-costa opened this issue Oct 14, 2024 · 30 comments

Comments

@lee-costa

I am booting ShredOS via the virtual console by mapping the ISO on my Dell R230 server. After it boots into the blue screen, it won't respond to any input, such as SELECT (spacebar). I am not sure whether it is freezing or just not recognizing input.

Any ideas what else to try?

@PartialVolume
Owner

I don't understand what you mean by 'mapping the ISO'?

@lee-costa
Author

I am running a headless server and have mounted the ISO using a virtual console. I can boot just fine but no inputs are recognized.

@PartialVolume
Owner

This sounds like the same problem. Have you tried the various settings mentioned in the comments? https://www.dell.com/community/en/conversations/systems-management-general/idrac-keyboard-not-working/647f877bf4ccf8a8de6b5f10

@PartialVolume PartialVolume changed the title Freezes upon boot or doesn't recognize inputs? iDRAC: Freezes upon boot or doesn't recognize inputs? Oct 15, 2024
@PartialVolume
Owner

How are you booting, USB or PXE?

@Firminator

Neither. The iDRAC (KVM/virtual console) has a feature that lets you mount/map an ISO; it then changes the BIOS boot order to boot the mounted ISO. I have never mounted ShredOS through the iDRAC, but with other ISOs I have never seen keyboard input fail. You might have to update the BIOS and iDRAC firmware and then try again. You might also have to switch the server from UEFI to BIOS boot mode and disable Secure Boot. These are just wild guesses though. Depending on the RAID controller in that server, you may also have to set the RAID controller to HBA mode, otherwise the individual physical drives can't be detected by any OS.
@ggruber has a bunch of Dell servers, if I remember correctly, and he helped test ShredOS. Maybe he can chime in and try to reproduce this.

@fthobe
Contributor

fthobe commented Jan 9, 2025

Hey @lee-costa ,

I can't confirm any issues on iDRAC 8. Can you send me a download of your ISO? I would like to give it a try.

I have tried plenty of Rx30 models (pretty much all but the R230), but I would be curious to give it a try, as you are experiencing an issue I never had.

@Firminator

@lee-costa Are you on a Mac? You might want to try changing the iDRAC web interface from the default Java to HTML5 to see if this alleviates the issue of not having keyboard input. I recall a colleague having a similar issue.

@fthobe
Contributor

fthobe commented Jan 10, 2025

> @lee-costa Are you on a Mac? You might want to try changing the iDRAC web interface from the default Java to HTML5 to see if this alleviates the issue of not having keyboard input. I recall a colleague having a similar issue.

Actually, what he said, plus always go with HTML5; the Java applet and ActiveX plug-ins are deprecated.

@lee-costa
Author

I am on a Mac and use HTML5 instead of Java. I actually gave up after trying multiple times and no longer have the ISO.

@fthobe
Contributor

fthobe commented Jan 11, 2025

It's weird though; it has always worked for me. Did you have any crude hardware in the system, or could you send an inventory export from iDRAC?

@Firminator

Yeah, it's the other way around then. HTML5 wasn't working on my co-worker's Mac and he had no keyboard input. It's been a while.
But whatever... it seems lee-costa moved on :)

@fthobe
Contributor

fthobe commented Jan 20, 2025

I can confirm that iDRAC 8 worked flawlessly for me on an R230 this weekend.
@PartialVolume can be closed till it comes up again I guess.

@wimb0

wimb0 commented Jan 30, 2025

I am having kind of the same issue with iDRAC 9 on a Dell R640.
I am booting via a mapped ISO in the HTML5 remote console.
It boots, gets to this screen, and then the screen blanks and I cannot do anything.

Image

This happens on 4 different R640s. An R630 with iDRAC 8 works fine with the same ISO.

@fthobe
Contributor

fthobe commented Jan 30, 2025

Can you reset the BIOS to factory defaults, check that the reset didn't reactivate Secure Boot, and try again?
Also try with USB.

@wimb0

wimb0 commented Jan 30, 2025

BIOS reset, boot mode changed from UEFI to BIOS (otherwise the ISO will not boot):

Image

Secure boot is off:
Image

Same issue: I get the "booting the kernel" screen and then this:

Image

Server is still powered on, or else it would display "System is powered off"

edit:
I'll try the nomodeset ISO, see if that makes a difference.

@wimb0

wimb0 commented Jan 30, 2025

Seems like the nomodeset ISO works:

Image

Image

It was waiting for a long time at "Waiting for all USB devices to be initialised", but the keyboard works, and it detects the USB drives.

@Firminator

That's good news. Mental note: nomodeset for Dell servers :)

Would you mind testing whether two SD cards (or two IDSDMs) show up in nwipe once you disable the SD card mirror in the BIOS, so that both can be wiped separately without being caught in Dell's proprietary SD card RAID magic?

@PartialVolume
Owner

> It was waiting for a long time at "Waiting for all USB devices to be initialised", but the keyboard works, and it detects the USB drives.

Can you post dmesg.txt and transfer.log? They may provide a clue as to what was causing that delay.

So it looks like there is a problem with the DRM driver for your graphics hardware, or possibly with the iDRAC software.
Can you also post the output of lspci -k, which will tell us what graphics hardware you have?

To determine whether it's iDRAC or the video card that's the problem, does vanilla work with a normal monitor and not iDRAC?
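
A minimal sketch of how such a delay could be located in a saved dmesg.txt, assuming the log carries the default `[seconds.microseconds]` timestamp prefix. The function name `find_gaps`, the 5-second threshold, and the sample lines are hypothetical and only illustrate the idea; they are not taken from the server discussed here.

```python
import re

# Matches the default dmesg timestamp prefix, e.g. "[   12.345678] usb 1-1: ..."
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def find_gaps(dmesg_text, threshold=5.0):
    """Return (gap_seconds, previous_message, next_message) for every pause
    between consecutive kernel messages that lasts at least `threshold` seconds."""
    gaps = []
    previous = None  # (timestamp, message) of the last timestamped line seen
    for line in dmesg_text.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines that carry no timestamp
        current = (float(match.group(1)), match.group(2))
        if previous is not None and current[0] - previous[0] >= threshold:
            gaps.append((round(current[0] - previous[0], 3), previous[1], current[1]))
        previous = current
    return gaps


# Made-up sample lines (assumption), only to show the expected input shape.
SAMPLE = """\
[    1.100000] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[    1.300000] hub 1-1:1.0: USB hub found
[   31.500000] usb 1-1.2: new full-speed USB device number 3 using xhci_hcd
"""

for gap, before, after in find_gaps(SAMPLE):
    print(f"{gap:.1f}s pause between {before!r} and {after!r}")
```

For a real log, the text could be read with `open("dmesg.txt").read()` and passed to `find_gaps`. The messages on either side of a long pause are usually the first place to look.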

@PartialVolume
Owner

PartialVolume commented Jan 31, 2025

I'm just wondering if this server is using the Intel Xe series embedded graphics. There is a DRM Xe driver that is classed as experimental in buildroot, so it is not normally installed in ShredOS, but maybe that's what is needed here. I can build a vanilla .iso with this Xe driver so you can test it, if you would like.

However, I'd like to see the output of lspci -k to confirm what video hardware you have before building the test .iso.

@wimb0

wimb0 commented Jan 31, 2025

> That's good news. Mental note: nomodeset for Dell servers :)
>
> Would you mind testing whether two SD cards (or two IDSDMs) show up in nwipe once you disable the SD card mirror in the BIOS, so that both can be wiped separately without being caught in Dell's proprietary SD card RAID magic?

For now I only see the one IDSDM "raid" disk.
If I disable Internal SD redundancy I only see the primary SD card, not both.

Image

> > It was waiting for a long time at "Waiting for all USB devices to be initialised", but the keyboard works, and it detects the USB drives.
>
> Can you post dmesg.txt and transfer.log? They may provide a clue as to what was causing that delay.
>
> So it looks like there is a problem with the DRM driver for your graphics hardware, or possibly with the iDRAC software. Can you also post the output of lspci -k, which will tell us what graphics hardware you have?
>
> To determine whether it's iDRAC or the video card that's the problem, does vanilla work with a normal monitor and not iDRAC?

transfer.log:
Image

Graphics card according to lspci:

Image

I have no way to test with a monitor, as the server is in a datacenter which I don't have access to.

dmesg does not list long wait times on usb devices.

@PartialVolume
Owner

PartialVolume commented Jan 31, 2025

I need to see the unpiped output of lspci -k; piping it through grep -i vga removes the driver details, if they are there. The Matrox G200 driver is included, so it should say something like mgag200.
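
As a rough illustration of what to look for in the full output, here is a small sketch that picks the display controllers out of saved `lspci -k` text and reports the driver bound to each. The helper name `display_drivers` and the sample device lines are assumptions made up for this example; only the `Kernel driver in use:` label is the standard `lspci -k` wording.

```python
def display_drivers(lspci_text):
    """Map each VGA/display controller line in `lspci -k` output to the
    kernel driver bound to it (None when no driver is in use)."""
    drivers = {}
    current = None  # header line of the display device being read, if any
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device entry.
            is_display = ("VGA compatible controller" in line
                          or "Display controller" in line)
            current = line.strip() if is_display else None
            if current:
                drivers[current] = None
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Illustrative sample (assumption), not output from the server in this thread.
SAMPLE = (
    "00:1f.2 SATA controller: Example SATA controller\n"
    "\tKernel driver in use: ahci\n"
    "03:00.0 VGA compatible controller: Example Matrox G200 controller\n"
    "\tKernel driver in use: mgag200\n"
    "\tKernel modules: mgag200\n"
)

for device, driver in display_drivers(SAMPLE).items():
    print(f"{device} -> {driver or 'no driver loaded'}")
```

If the driver comes back as `None` on the vanilla image, the mgag200 module was not bound to the card, which would point at the driver rather than at iDRAC.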

@PartialVolume
Owner

Of course, this lspci -k won't be useful if you are using the nomodeset .iso as the driver won't get loaded anyway.

You really need to boot the vanilla version and telnet into it to see the output of lspci -k. We would then know whether the Matrox driver is being loaded.

I really need to buy one of these Dell servers on eBay to figure out where the problem is.

@wimb0

wimb0 commented Jan 31, 2025

Unfortunately I wiped the server already and it is decommissioned/removed from the datacenter.
I will try again when I get to wipe another R640 or R740, without nomodeset.
Need to find a way to get telnet access, as those servers do not have network anymore and telnet is not allowed in our environment.

@PartialVolume
Owner

Ok, thanks, at some point I'll add ssh, although I'm not sure how that will work with a .iso as you need somewhere to store your keys.

It would work with a USB boot as you could store them on the USB stick in /etc/ssh/

Anyway, glad the nomodeset version worked.

@Firminator

"The iDRAC virtual console leverages the onboard Matrox G200 graphics controller" is what Dell's documentation states, at least in the context of determining the maximum resolution. The display should be over KVM, so maybe we are missing some kind of KVM driver in buildroot?

@PartialVolume
Owner

PartialVolume commented Feb 1, 2025

> "The iDRAC virtual console leverages the onboard Matrox G200 graphics controller" is what Dell's documentation states, at least in the context of determining the maximum resolution. The display should be over KVM, so maybe we are missing some kind of KVM driver in buildroot?

I just found the Dell article linked below about iDRAC, graphics resolution, and servers without monitors attached. @wimb0 said there was no monitor attached to this server, so the resolution that the OS sets seems to be important. But if there is no monitor, what is it setting? In nomodeset mode ShredOS operates at a low resolution, maybe 800x600 or 1024x768. With the DRM drivers it would normally set the highest resolution supported by both the monitor and the card, but as there is no monitor I wonder what resolution is being set. iDRAC seems to have some resolution limitations, so this could be related.

If access to the server were available, I would plug in a 4:3 monitor with a max resolution of 1280x1024 and see whether iDRAC then worked remotely with the vanilla version.

I wonder if it's possible to force a specific resolution with the DRM drivers when no monitor is attached?

https://www.dell.com/support/kbdoc/en-uk/000134669/virtual-console-video-resolution-1920x1200-is-not-available-using-idrac-9

Although, having re-read that article, it doesn't explain the blank screen unless the DRM driver is setting a resolution above iDRAC's maximum of 1920x1200 at 60Hz.
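
On forcing a resolution without a monitor: the kernel's `video=` command-line option accepts a connector name, a mode, and an `e` suffix that forces the output to be treated as enabled even when nothing is detected. The sketch below only assembles such a string. The connector name `VGA-1` is an assumption (the real name would have to be read from `/sys/class/drm/` on the server), the helper `video_param` is hypothetical, and whether this actually cures the blank iDRAC screen is untested.

```python
def video_param(connector, width, height, refresh=None, force_enable=True):
    """Build a kernel command-line `video=` option for a DRM connector."""
    mode = f"{width}x{height}"
    if refresh is not None:
        mode += f"@{refresh}"
    if force_enable:
        mode += "e"  # 'e' asks DRM to enable the output even if no display is detected
    return f"video={connector}:{mode}"


# "VGA-1" is an assumed connector name, not one confirmed for these servers.
print(video_param("VGA-1", 1024, 768, refresh=60))
# prints: video=VGA-1:1024x768@60e
```

The resulting string would be appended to the kernel command line in the boot loader entry, in the same place where nomodeset is set for the nomodeset image.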

@PartialVolume
Owner

https://github.com/dvdhrm/docs/blob/master/drm-howto/modeset.c

Just posting this link as a reminder for myself to read it fully as it contains a lot of interesting comments about the DRM process of detecting the video graphics, connectors, setting the modes etc

@Firminator

> If access to the server were available, I would plug in a 4:3 monitor with a max resolution of 1280x1024 and see whether iDRAC then worked remotely with the vanilla version.

Or use a "VGA Display Emulator dongle" like they mentioned in the Dell KB.

> When a monitor is not connected to either VGA port on the server, the operating system [ShredOS] installed dictates the available resolutions for the virtual console.
>
> Maximum virtual console resolutions based on host OS without physical monitor:
>
> Windows: 1600x1200 (1600x1200, 1280x1024, 1152x864, 1024x768, 800x600)
> Linux: 1024x768 (1024x768, 800x600, 848x480, 640x480)
>
> Dell Technologies iDRAC Engineering continues to look into methods to improve upon this Operating System/ driver limitation when a physical VGA monitor is absent. If a higher resolution through the virtual console is required when a physical KVM or monitor is not present, a VGA Display Emulator dongle can be leveraged to mimic an external monitor connected with a resolution up to 1920x1080.

That sounds to me like a missing driver if there is no output at all.

I wonder if this issue is limited to iDRACs or if iLOs or even OpenBMC are affected as well. The iDRAC uses port 5900 which is traditionally related to VNC. Is there maybe a VNC driver??

When I ls /dev it shows that there is a /kvm device here on my non-server device. I wonder if we have /dev/kvm in ShredOS.

Wait I'm now finding https://unix.stackexchange.com/questions/249727/dev-kvm-is-missing-from-system-supporting-kvm-virtualization with some hints how to enable KVM in the kernel although that article is from 2015:

> Added support in kernel via setting CONFIG_KVM and CONFIG_KVM_INTEL modules as built-in modules in genkernel --menuconfig all;

@PartialVolume
Owner

PartialVolume commented Feb 1, 2025

> CONFIG_KVM and CONFIG_KVM_INTEL

Both are currently disabled in ShredOS, but I could add them if required. I thought iDRAC operated at a hardware/BIOS level and didn't require any specific software in the OS to work.
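
For reference, `CONFIG_KVM` and `CONFIG_KVM_INTEL` enable the kernel's virtual machine hypervisor (Kernel-based Virtual Machine), which is unrelated to keyboard/video/mouse redirection, so they are unlikely to matter for the virtual console. The option that does matter for the Matrox hardware is `CONFIG_DRM_MGAG200`. A minimal sketch for checking which options a kernel was built with, given the text of its `.config`; the helper name `option_states` and the sample config are assumptions for illustration and do not reflect the actual ShredOS configuration.

```python
def option_states(config_text, options):
    """Report each option as its .config value ('y', 'm', ...) or 'not set'."""
    states = {name: "not set" for name in options}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # comments include the "# CONFIG_X is not set" form
        name, value = line.split("=", 1)
        if name in states:
            states[name] = value
    return states


# Made-up sample (assumption), not the real ShredOS kernel configuration.
SAMPLE = """\
CONFIG_DRM=y
CONFIG_DRM_MGAG200=y
# CONFIG_KVM is not set
"""

wanted = ["CONFIG_DRM_MGAG200", "CONFIG_KVM", "CONFIG_KVM_INTEL"]
for name, state in option_states(SAMPLE, wanted).items():
    print(f"{name}: {state}")
```

On a running system the config text is available from `/proc/config.gz` only if the kernel was built with `CONFIG_IKCONFIG_PROC`; otherwise the `.config` from the build tree would have to be used.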

However, what's this about the iDRAC Enterprise vs Express card? It sounds like you need the Enterprise card for full video display?

A Dell iDRAC Enterprise card offers more advanced server management features compared to an iDRAC Express card, including a dedicated network interface for iDRAC access, a full virtual console for remote server control,

Virtual Console:
Enterprise cards provide a full virtual console allowing administrators to remotely control the server as if
sitting directly in front of it, while Express cards may have limited virtual console functionality or none at all.

A Dell iDRAC Enterprise card offers more advanced server management features compared to an iDRAC
Express card, including a dedicated network interface for iDRAC access, a full virtual console for remote
server control, and often more robust monitoring capabilities, while the Express card provides basic remote
management functionality with limitations like sharing a network port with the server itself; essentially, the
Enterprise card is designed for complex IT environments requiring extensive remote management, while
the Express card is suitable for basic server monitoring and management. 

Key Differences:

Dedicated NIC:
An iDRAC Enterprise card typically has a dedicated network port for iDRAC access, whereas an Express card
usually shares a network port with the server. 

>> Virtual Console: <<
Enterprise cards provide a full virtual console allowing administrators to remotely control the server as if
sitting directly in front of it, while Express cards may have limited virtual console functionality or none at all. 

Feature Set:
Enterprise cards generally offer a wider range of advanced management features like lifecycle management,
power capping, and detailed monitoring options, while Express cards focus on basic remote access and monitoring. 

@PartialVolume
Owner

https://www.dell.com/support/kbdoc/en-uk/000178016/support-for-integrated-dell-remote-access-controller-9-idrac9?lang=en#iDRAC-Licenses

Looks like you need an Enterprise or Datacenter license for the full HTML5 or VNC virtual console. The Express license doesn't appear to support virtual console. I guess the nomodeset version of ShredOS, which uses a simple text-only framebuffer, works with the Express license, but the DRM version of ShredOS needs the full virtual console that only the Enterprise and Datacenter licenses provide.

So I'm wondering which license is in use? It would be nice if somebody could prove this theory, so I can update the README.md for DRAC and iDRAC servers.

5 participants