CPU frequency capped to 2.3 GHz instead of 2.6 GHz


  • Hello,

    we've ordered an 80-core 2.6 GHz dev kit. While testing the frequency, I noticed 2.3 GHz. I looked at the various cpufreq settings for the CPPC driver, tried to enable "boost" (not supported here), etc., and still found no way to go beyond 2.3 GHz. I noticed that cpufreq would accept any frequency between 1.0 and 2.6 GHz, so I tried all of them in 2 MHz increments and measured the effective one. I observed 50 MHz steps. The day after, I found the machine running at 2.6 GHz without knowing why! I ran the same tests and this time the frequency was growing in 81.25 MHz steps, and was no longer capped at 2.3. In the first case I noticed 27 frequency bins from 1000 to 2300 MHz and in the second case I noticed 20 frequency bins from 1056 to 2600 MHz. That was a bit puzzling, so I took this opportunity to take a few extra measurements and to reboot. The machine rebooted at 2.3 again and I only managed to make it reach 2.6 the day after, again!
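
    As a sanity check on those bin counts, here is a minimal sketch of the arithmetic, assuming evenly spaced bins. The helper name count_bins is made up for illustration; the endpoints and step sizes are the ones quoted above.

```shell
# count_bins LO HI STEP: number of evenly spaced frequency bins from LO to HI
# (all in MHz), endpoints included. Hypothetical helper, not part of any tool.
count_bins() {
    awk -v lo="$1" -v hi="$2" -v step="$3" \
        'BEGIN { printf "%d\n", int((hi - lo) / step + 0.5) + 1 }'
}

count_bins 1000 2300 50      # capped case, 50 MHz steps
count_bins 1056 2600 81.25   # uncapped case, 81.25 MHz steps
```

    Both calls give the observed counts (27 and 20), so the two sets of bins are at least internally consistent with their step sizes.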

    I managed to make it boot at 2.6 by disabling both CPPC and LPI in the BIOS ACPI settings (just blacklisting cppc_cpufreq has no effect). However this time, after I use it for a while, it goes back to 2.3 and stays there.
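
    For anyone wanting to compare notes, here is a small sketch that dumps what the cpufreq policy reports. The file names are the standard Linux cpufreq sysfs attributes (frequencies in kHz); the show_policy helper is hypothetical, and which attributes exist depends on the driver. With CPPC disabled in the BIOS there may be no policy directory at all.

```shell
# show_policy [DIR]: print the main cpufreq attributes of one policy directory.
# Hypothetical helper; defaults to policy0 as used elsewhere in this post.
show_policy() {
    dir=${1:-/sys/devices/system/cpu/cpufreq/policy0}
    for attr in scaling_driver scaling_governor cpuinfo_min_freq \
                cpuinfo_max_freq scaling_max_freq scaling_cur_freq; do
        if [ -r "$dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        else
            printf '%s: (not available)\n' "$attr"
        fi
    done
}

show_policy
```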

    So there's really something wrong here. Also I noticed that when it's running at 2.3 GHz, it's not perfectly smooth, there is a little bit of noise in the measurement, as if the machine was throttling a little bit, or was enabling a very wide spread spectrum.

    For me it's particularly annoying because we've ordered this machine to optimize software for various multi-core scenarios, which involves regularly rebooting to change the NUMA config (1, 2 or 4 nodes). Seeing it not use its full performance at boot, or suddenly drop in capacity after some time, is quite problematic for performance testing. It's also a nuisance as I'm building on the machine, and build times increase by 13% for no reason.
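
    For reference, the 13% figure is simply the clock ratio, assuming the build is entirely bound by CPU frequency (a simplification, since memory and I/O do not scale with the core clock). The slowdown_pct helper below is only illustrative.

```shell
# slowdown_pct FAST SLOW: extra run time (in %) when running at SLOW instead of
# FAST MHz, for a purely frequency-bound workload. Hypothetical helper.
slowdown_pct() {
    awk -v fast="$1" -v slow="$2" \
        'BEGIN { printf "%.1f%%\n", (fast / slow - 1) * 100 }'
}

slowdown_pct 2600 2300   # prints 13.0%
```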

    BTW I installed Ubuntu 22.04 with kernel 5.15.0-76. But given that it's observable even without cppc_cpufreq, I'm pretty sure now that the kernel is irrelevant.

    I'd like to know if this matches anything others have observed, if there's a known workaround etc. I verified that the CPU wasn't particularly hot, and stuff like this. In case that helps, I noted the following info from the BIOS:

       Board                      AVA Developer Platform
       SCP FW Version             2.06
       SCP FW Build               20220308
       MMC FW Version             02.04
       MCU FW Version             NA
       CPU                        Ampere(R) Altra(R) Processor
       CPU Clock                  2600MHz
       PCP Clock                  1450MHz
       L1I CACHE                  64KB
       L1D CACHE                  64KB
       L2 CACHE                   1MB
       SOC Clock                  2000MHz
       Sys Clock                  400MHz
       AHB Clock                  200MHz

    It really feels odd that when it's slow the frequency steps are not that sharp anymore. I'm attaching a graph I've made below.

    [Figure: Frequency measurements after boot and after stabilization]

    The script to produce this is ultra-simple:

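    # Sweep the policy0 frequency cap upward from 1.0 GHz in 2 MHz steps (sysfs values are in kHz).
    f=1000000
    while [ "$f" -lt 2700000 ]; do
        # Set the cap, then print the requested MHz next to the frequency measured on CPU 0.
        echo "$f" | sudo tee /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq >/dev/null
        echo "${f%???}" $(taskset -c 0 ~/mhz/mhz -c -i 1 0 $((f/2)))
        f=$((f+2000))
    done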
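
    To turn that output into a list of bins, something along these lines might work. It is a rough sketch that assumes each output line is "requested_MHz measured_MHz" and that two readings belong to different bins when they differ by more than a tolerance (10 MHz by default). The find_bins name and the sample readings are made up.

```shell
# find_bins [TOL]: read "requested measured" pairs on stdin and print each new
# frequency plateau, the step from the previous one, and the total bin count.
find_bins() {
    awk -v tol="${1:-10}" '
        NF >= 2 {
            m = $2 + 0
            d = m - last
            if (n == 0 || d > tol || d < -tol) {
                if (n == 0) printf "%g\n", m
                else        printf "%g step=%g\n", m, d
                last = m
                n++
            }
        }
        END { printf "bins=%d\n", n }'
}

# Example with a few made-up readings (not real measurements):
printf '%s\n' '1000 1000' '1002 1001' '1050 1050' '1052 1050' '1100 1100' | find_bins
```

    In practice the sweep output would be saved to a file first and fed to find_bins on stdin. The tolerance needs to sit between the measurement noise and the smallest expected step.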

    Oh, and the utility used to measure the frequency is "mhz":

    git clone https://github.com/wtarreau/mhz
    cd mhz
    make
    ./mhz -c 5 
    2299.361
    2299.333
    2299.448
    2299.361
    2299.333

    Thanks for any idea!

    Willy



  • I'm working on updating the firmware and hoping I can tweak the ACPI tables to allow more performance. For example I think it should be possible to enable the "Max Performance" setting, while at the moment the highest in GNOME is "Balanced".


  • Hello Rebecca,

    any news about your firmware update? Today I found my machine at 1.0 GHz, which invalidated all my tests. That's particularly annoying :-(

    Thanks!


  • I finally "fixed" it by upgrading to firmware 2.09.100.00, but it brought its own new bag of problems (the system now sits silent for 7 minutes before the EFI prompt, plus a few similar anomalies). But the frequency seems to be OK now.


  • @Willy Tarreau Sorry, I didn't see your reply until now.

    I've been busy on other work unfortunately (including trying to get OpenBMC running), but I've made some improvements. However, since I'm not set up to validate the firmware, I'm a bit reluctant to provide downloads: without a DB40-HPC debug card from ADLINK, it would be easy to brick the system with a bad upgrade.


  • @Willy Tarreau You might want to upgrade to the firmware that ADLINK just released, since at least on my system it boots _much_ faster since SSIF is now working.


  • @Rebecca Cran Hi Rebecca!  Much appreciated, thank you. I'm willing to give it a try.

    Unfortunately the machine is refusing to boot again for an unknown reason ("synchronous exception at xxx"). I can't express how much I hate UEFI for the instability that it has brought to the computer world :-(  I'll see if I manage to regain control of it and flash it. It's currently a huge pain to wait 7 minutes before seeing a prompt that lasts a few seconds before hanging.


  • @Rebecca Cran I'm confused: the image from November I found in the download section for AADK is a 2.04.100, while I'm on 2.09.100. Maybe there's a different download link that I didn't find? Thanks!


  • @Willy Tarreau responding to myself, it's indicated in this thread that the firmware images have been removed (!):  https://www.ipi.wiki/community/forum/topic/119680/edk2-github-repo-disappeared

    Not cool :-(


  • @Willy Tarreau Yeah, it's very unfortunate.

    I didn't think 2.09.100 was released yet. I've been running 2.04.100.10 and 2.04.100.11. Maybe a build of 2.09.100 was mistakenly created?


  • Well, after reverting to EDK2 2.04.100.11, finally the system is back to 2.3 GHz... Grrrr... I think I will flash 2.09 on the carrier board so that I can boot on it using JP27.

