Looks like my Ampere Altra Dev Kit stopped working - stuck at boot


  • I have an Ampere Altra Dev Kit that I had been using as a Debian GNU/Linux server for nearly three years. Yesterday, it apparently crashed (I couldn't connect to it anymore, but could still connect to the BMC). So I thought that if I need to reboot it anyway, I might as well install OpenBMC now, which I did. I guess that worked.

    But I can't get it to boot. So I've now connected a VGA screen. At start, it is stuck at the "Tianocore/EDK2 firmware version 2.04.100.10 Build 20230418 ATF 2.06 Press ESCAPE for boot option." line. I've tried a USB keyboard on all USB ports by now, but pressing Esc apparently has no effect (though I can switch NumLock on and off, so I guess the keyboard is connected somehow). I tried resetting the CMOS via JP30, but that had no visible effect.

    I've tried a dozen times, with the USB keyboard at different USB ports, or no keyboard at all. Rebooting via front-panel button or via OpenBMC. Same result.

    Any ideas? Should I try the serial port (I guess that needs a null modem cable)?

     

    P.S.: To rule out a hardware issue not directly in the Ampere Altra Dev Kit, I've tried: a different power supply, removing any pair of RAM sticks, and removing all cards from the PCIe slots. Nothing made a difference.

    P.P.S.: The keyboard is apparently working somehow. While "Esc" apparently doesn't do anything, I can reboot using "Alt+Ctrl+Del".

    P.P.P.S.: The hang apparently only happens after a reboot, or when powering on via the BMC. If I cut off power (via the switch on the power supply), leave it off for an hour, then switch it on, I do not get the hang, and can get into the BIOS menu via "Esc". Time to try to update the firmware, to see if that helps.



  • Having upgraded the EDKII firmware to 2.09, I can now reliably access the BIOS. I see there are further firmware updates, but for 2.10 there are multiple versions, v2.10.100.02 and v2.10.100.02_BMC_disabled, and for now I don't know which one I'd need.

    GRUB is now gone from the boot menu, but I can still start it manually from the UEFI shell. Linux still hangs on boot, though (but that problem was there since the crash, before the firmware upgrade, so I need to debug that either way).


  • Apparently this was a known issue when installing OpenBMC on older EDKII firmware: someone else ran into the same problem, as documented in the thread at https://community.amperecomputing.com/t/openbmc-now-available-for-adlink-ampere-altra-systems/628

    Also, the Linux boot issue was just differing Event counts on the RAID member devices after the crash, which resulted in no RAID arrays coming up. I've now fixed the RAID arrays, and my server is working fine again.
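
For anyone who wants to check for the same Event count mismatch, here is a minimal sketch, not the procedure used above. It assumes Linux software RAID (md) and text in the shape printed by `mdadm --examine` for 1.x metadata: a `/dev/...:` header line per device followed by an `Events : N` line. The function names and the sample text are hypothetical and only for illustration. The script only reports which members lag behind; it does not repair anything.

```python
import re


def parse_event_counts(examine_output: str) -> dict[str, int]:
    """Map each device path to its Events value in `mdadm --examine`-style text."""
    counts: dict[str, int] = {}
    device = None
    for line in examine_output.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            # A new per-device section starts here.
            device = header.group(1)
            continue
        events = re.match(r"^\s*Events\s*:\s*(\d+)\s*$", line)
        if events and device is not None:
            counts[device] = int(events.group(1))
    return counts


def stale_members(counts: dict[str, int]) -> dict[str, int]:
    """Return devices whose Events value is below the newest, with the gap."""
    if not counts:
        return {}
    newest = max(counts.values())
    return {dev: newest - n for dev, n in counts.items() if n < newest}


# Made-up sample text (not real output from the server discussed here).
SAMPLE = """\
/dev/sda1:
          Magic : a92b4efc
         Events : 48213
/dev/sdb1:
          Magic : a92b4efc
         Events : 48207
"""

if __name__ == "__main__":
    counts = parse_event_counts(SAMPLE)
    for dev, gap in stale_members(counts).items():
        print(f"{dev} is behind by {gap} events")
```

To use it on a real system, the text would come from running `mdadm --examine` on the member devices (as root) and passing the captured output to `parse_event_counts`.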

