
Configure benchmark machine for maximal stability #338

Closed
@lrytz

Disable hyper-threading

NUMA

The machine only has a single NUMA node, so we don't need to worry about it.

http://stackoverflow.com/questions/11126093/how-do-i-know-if-my-server-has-numa

scala@scalabench:~$ sudo dmesg | grep -i numa
[    0.000000] No NUMA configuration found
scala@scalabench:~$ numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
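
A minimal sketch for scripting this check, assuming the "available: N nodes" line format shown in the output above. The function name numa_node_count is made up for illustration.

```shell
# Print the number of NUMA nodes found in `numactl --hardware` output.
# Reads the output from stdin so it can also be tried on a saved copy.
numa_node_count() {
  awk '/^available:/ { print $2; exit }'
}

# Usage:
#   numactl --hardware | numa_node_count
```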

Use cpu sets

Install cset: sudo apt-get install cpuset. (On NUMA machines, cset also handles sets of memory nodes, but we only have one.)

  • cset set creates and manipulates CPU sets
  • cset proc manages processes in sets
  • cset shield is a convenience wrapper that is simpler to use and isolates a process

Shielding

  • cset shield shows the current status
  • cset shield -c 1-3
    • creates 3 sets: "root" with all CPUs, "user" with CPUs 1-3 (the "shield"), and "system" with the other CPUs.
    • userspace processes in root are moved to system
  • cset shield -k on moves kernel threads (those that can be moved) from root to system (some kernel threads are specific to a CPU and not moved)
  • cset shield -v -s / -u shows shielded / unshielded processes
  • cset shield -e cmd -- -cmdArg executes cmd -cmdArg in the shield
  • cset shield -r resets the shield
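
The commands above could be chained roughly as follows. This is only a sketch: run and run_shielded are hypothetical helper names, the script is assumed to run with root privileges, and DRY_RUN=1 merely prints the commands so the sequence can be checked without touching the machine.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Shield the given CPUs, move movable kernel threads out of the shield,
# run the command inside the shield, then reset the shield again.
run_shielded() {
  local cpus="$1" cmd="$2"
  shift 2
  run cset shield -c "$cpus"
  run cset shield -k on
  run cset shield -e "$cmd" -- "$@"
  local status=$?
  run cset shield -r
  return "$status"
}

# Usage (prints the four cset commands instead of running them):
#   DRY_RUN=1 run_shielded 1-3 scalac Test.scala
```

Resetting the shield even when the command fails keeps the machine in a known state for the next job.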

Use isolated CPUs

NOTE: Using isolated CPUs for running the JVM is not a good idea. The kernel doesn't do any load balancing across isolated CPUs. https://groups.google.com/forum/#!topic/mechanical-sympathy/Tkcd2I6kG-s, https://www.novell.com/support/kb/doc.php?id=7009596. Use cset instead of isolcpus and taskset.

lscpu --all --extended lists all CPUs, including logical cores when hyper-threading is enabled. The CORE column shows the physical core.

Kernel parameter isolcpus=2,3 removes CPUs 2 and 3 from the kernel's scheduler.

  • In /etc/default/grub, for example GRUB_CMDLINE_LINUX_DEFAULT="quiet isolcpus=2,3"
  • sudo update-grub

Verify

  • cat /proc/cmdline
  • cat /sys/devices/system/cpu/isolated
  • taskset -cp 1 -- shows the affinity list of process 1
  • ps -eww --forest -o pid,ppid,psr,user,stime,args -- there should be nothing on isolated cores.

Use taskset -c 2,3 <cmd> to run cmd (and child processes) only on CPUs 2 and 3.
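
For scripted checks it can help to expand the kernel's CPU list notation (as printed by /sys/devices/system/cpu/isolated) into individual CPU numbers. A small sketch, with expand_cpu_list as a made-up name:

```shell
# Expand a kernel CPU list such as "0,2-3" into "0 2 3".
expand_cpu_list() {
  local list="$1" part lo hi out=""
  local IFS=','
  for part in $list; do
    case "$part" in
      *-*)
        # A range like "2-3": emit every CPU from lo to hi.
        lo="${part%-*}"
        hi="${part#*-}"
        while [ "$lo" -le "$hi" ]; do
          out="$out $lo"
          lo=$((lo + 1))
        done
        ;;
      *)
        out="$out $part"
        ;;
    esac
  done
  echo "${out# }"
}

# Usage:
#   expand_cpu_list "$(cat /sys/devices/system/cpu/isolated)"
```

The result can then be compared against the psr column of the ps command above.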

Questions

  • Running on fewer cores probably impacts performance as the JVM runs compilation and GC concurrently.
  • When using taskset -c 2,3, does the JVM still think the system has 4 cores? Would that be a problem?
$ taskset -c 0,1 ~/scala/scala-2.11.8/bin/scala -e 'println(Runtime.getRuntime().availableProcessors())'
2
$ taskset -c 1 ~/scala/scala-2.11.8/bin/scala -e 'println(Runtime.getRuntime().availableProcessors())'
2

Tickless / NOHZ

Disable scheduling-clock interrupts on the CPUs used for benchmarking by adding the nohz_full=2,3 kernel parameter. The tick is only omitted while there is a single runnable task (thread) on the CPU.

Verify

  • cat /sys/devices/system/cpu/nohz_full
  • dmesg | grep dyntick should show the CPUs
  • sudo perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 stress -t 1 -c 1 should show 1 tick (see redhat reference)
    • On my test system (after building a kernel with CONFIG_NO_HZ_FULL), I got numbers between 20 and 90 ticks on the otherwise idle CPU 1. Running on CPU 0, I get ~390 ticks.
    • watch -n 1 -d grep LOC /proc/interrupts shows 1 tick per second on CPU 1 when idle
    • Running anything (for example stress -t 1 -c 1) on CPU 1 causes more ticks
    • Running the scala REPL on CPU 1 causes more ticks whenever the REPL is not idle
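
To get a number instead of watching the screen, the LOC row of /proc/interrupts can be sampled twice. A rough sketch, assuming the usual layout (a "LOC:" label followed by one column per CPU); loc_ticks and ticks_during are made-up names.

```shell
# Print the local timer interrupt (LOC) count for one CPU.
# Reads /proc/interrupts-formatted text from stdin.
loc_ticks() {
  awk -v cpu="$1" '$1 == "LOC:" { print $(cpu + 2); exit }'
}

# Print how many timer ticks a CPU received during an interval
# (in seconds, default 1).
ticks_during() {
  local cpu="$1" secs="${2:-1}" before after
  before="$(loc_ticks "$cpu" < /proc/interrupts)"
  sleep "$secs"
  after="$(loc_ticks "$cpu" < /proc/interrupts)"
  echo $((after - before))
}

# Usage:
#   ticks_during 1 5    # ticks on CPU 1 over 5 seconds
```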

NOTE: disabling interrupts has some effect on CPU frequency, see https://fosdem.org/2017/schedule/event/python_stable_benchmark/ (24:45). Make sure to use a fixed CPU frequency. I don't have the full picture yet, but it's something like this: the intel_pstate driver is no longer notified and does not update the CPU frequency.

(Some more advanced stuff in http://www.breakage.org/2013/11, pin some regular tasks to specific CPUs, writeback/cpumask, writeback/numa).

References

rcu_nocbs

RCU (read-copy-update) is a thread synchronization mechanism. RCU callbacks may prevent a CPU from entering adaptive-tick mode (tickless with 0/1 tasks). https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt

The rcu_nocbs=2,3 kernel param offloads RCU callback processing from CPUs 2 and 3 to kernel threads, which can run on other CPUs.

Interrupt handlers

Avoid running interrupt handlers on certain CPUs

  • /proc/irq/default_smp_affinity is the default bit mask of CPUs permitted for an interrupt handler
  • /proc/irq/N/ contains smp_affinity (bit mask of allowed CPUs) and smp_affinity_list (list of CPUs able to execute the interrupt handler)
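
The smp_affinity files take a hexadecimal bit mask, with bit n standing for CPU n. A small sketch for computing such a mask from CPU numbers (cpu_mask is a made-up name):

```shell
# Convert a list of CPU numbers into the hex bit mask used by smp_affinity.
cpu_mask() {
  local mask=0 cpu
  for cpu in "$@"; do
    mask=$((mask | (1 << cpu)))
  done
  printf '%x\n' "$mask"
}

# Usage (N is a placeholder for the IRQ number):
#   cpu_mask 0 | sudo tee /proc/irq/N/smp_affinity
```

Some interrupts cannot be moved, in which case the write fails. A running irqbalance service might also change the masks again, so it probably needs to be stopped or configured accordingly.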

Verify

  • cat /proc/interrupts

There's an irqbalance service (systemctl status irqbalance)

CPU Frequency

Disable Turbo Boost

  • In BIOS
  • Or write 1 to /sys/devices/system/cpu/intel_pstate/no_turbo -- if using intel_pstate
    • With intel_pstate=disable, still need to find out how to disable Turbo Boost in the system

There seem to be two linux tools

Intel CPUs can run in different P-states (voltage-frequency pairs) while executing a process. C-states are idle / power-saving states. The intel_pstate driver handles P-state selection.

The intel_pstate=disable kernel argument disables the intel_pstate driver and uses acpi-cpufreq instead (see redhat reference).

  • sudo apt-get install linux-cpupower (in jessie backports only!)
  • cpupower frequency-info and cpupower idle-info to show the active drivers.

CPU Info

  • lscpu
  • cat /proc/cpuinfo (| grep MHz)
  • cpupower frequency-info
  • watch -n 1 grep \"cpu MHz\" /proc/cpuinfo

CPUfreq Governors

  • List available governors: cpupower frequency-info --governors (Examples: performance, powersave, ...). We should use performance, which keeps the maximal frequency. NOTE: the intel_pstate driver still does dynamic scaling in this mode.
  • Check active governors: cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
  • Set governor: cpupower -c 1-3 frequency-set --governor [governor] (on CPUs 1 to 3)
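
A sketch of a pre-benchmark check that all CPUs use the expected governor. The name check_governor is made up, and the sysfs root is a parameter only so that the function can be tried against a copy of the directory tree.

```shell
# Print every scaling_governor file whose value differs from the expected
# governor. Exit status is 0 if all CPUs match, 1 otherwise.
check_governor() {
  local want="${1:-performance}" root="${2:-/sys/devices/system/cpu}"
  local f gov bad=0
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    [ -r "$f" ] || continue
    gov="$(cat "$f")"
    if [ "$gov" != "$want" ]; then
      echo "$f: $gov"
      bad=1
    fi
  done
  return "$bad"
}

# Usage:
#   check_governor performance || echo "governor not set on all CPUs"
```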

Set a specific frequency:

The intel_pstate driver has /sys/devices/system/cpu/intel_pstate/min_perf_pct and max_perf_pct, maybe these can be used if we stick with that driver?

References

Disable git gc

https://stackoverflow.com/questions/28092485/how-to-prevent-garbage-collection-in-git

  • $ git config --global gc.auto 0

Disable hpet

Suggested by Dmitry, I haven't found any other references.

hpet is a hardware timer with a frequency of at least 10 MHz (higher than older timer circuits).

  • Current source: cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  • Available sources: cat /sys/devices/system/clocksource/clocksource0/available_clocksource

Change using a kernel parameter clocksource=acpi_pm

Explanation of clock sources: https://access.redhat.com/solutions/18627

Ramdisk

tmpfs vs ramfs

Added to /etc/fstab

  • tmpfs /mnt/ramdisk tmpfs defaults,size=16g 0 0

Disable "transparent hugepages"

There are some recommendations out there to disable "transparent hugepages", mostly for database servers

Disable khungtaskd

khungtaskd detects hung tasks. Disabling it is probably not useful, as it only runs every 120 seconds.

Cron jobs

https://help.ubuntu.com/community/CronHowto

  • User crontabs: crontab -e to edit, crontab -l to show
  • Show all user crontabs: for user in $(cut -f1 -d: /etc/passwd); do sudo crontab -u $user -l; done. Or make sure that the /var/spool/cron/crontabs directory is empty.
  • System crontab: /etc/crontab - should not be edited by hand
  • /etc/cron.d contains files with system crontab entries
  • /etc/cron.hourly / .daily / .monthly / .weekly contain scripts executed from /etc/crontab (or by anacron, if installed)

Disable / enable cron

  • systemctl stop cron
  • systemctl start cron

Disable / enable at

  • systemctl stop atd
  • systemctl start atd

Run under perf stat

Suggestion by Dmitry: discard benchmark runs with too many cpu-migrations or context-switches. This would require keeping track of expected values.

  • sudo perf stat -x, scalac Test.scala (machine-readable output)
  • -prof perfnorm in jmh
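
A sketch of how the machine-readable output could be checked against expected values. It assumes the perf stat -x, field layout with the counter value in the first field and the event name in the third, and that the output was written to a file with perf stat's -o option. The names perf_counter and within_limit, the file name stats.csv and the limits are hypothetical.

```shell
# Print the counter value of one event from `perf stat -x,` output on stdin.
perf_counter() {
  awk -F, -v ev="$1" '$3 == ev { print $1; exit }'
}

# Succeed only if the event was counted and its value is at most the limit.
within_limit() {
  local event="$1" limit="$2" file="$3" value
  value="$(perf_counter "$event" < "$file")"
  [ -n "$value" ] && [ "$value" -le "$limit" ]
}

# Usage:
#   sudo perf stat -x, -o stats.csv scalac Test.scala
#   within_limit cpu-migrations 10 stats.csv || echo "discard this run"
```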

Build custom kernel

Ah well, probably have to figure out some more details how to do this correctly.

apt-get install linux-source-4.9
tar xaf /usr/src/linux-source-4.9.tar.xz

apt-get install build-essential fakeroot libncurses5-dev

cd linux-source-4.9
cp /boot/config-4.9.0-0.bpo.2-amd64 .config
make menuconfig
  - General setup->Timers subsystem->Timer tick handling -> Full dynticks system (tickless)
  - Up one level -> Full dynticks system on all CPUs by default (except CPU 0)
  - General setup->Local Version, enter a simple string
nano .config
  - comment out CONFIG_SYSTEM_TRUSTED_KEYS
    https://unix.stackexchange.com/questions/293642/attempting-to-compile-any-kernel-yields-a-certification-error

make deb-pkg

cd ..
sudo dpkg -i linux-image-4.9.18_4.9.18-1_amd64.deb

Scripting all of that

It seems that python3's "perf" package (since renamed to pyperf) will do most configurations:

pip3 install perf
python3 -m perf system show
python3 -m perf system tune
python3 -m perf system reset

Important: check all settings before starting a benchmark.

Check load

Find a way to ensure that the benchmark machine is idle before starting a job.
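
One possible approach is to look at the 1-minute load average in /proc/loadavg. A minimal sketch: the names machine_idle and wait_until_idle are made up, and the threshold of 0.1 and the 10 second polling interval are arbitrary assumptions that would need tuning.

```shell
# Succeed if the 1-minute load average is below the threshold.
# The file is a parameter only so that the check can be tried on a copy.
machine_idle() {
  local max="${1:-0.1}" file="${2:-/proc/loadavg}"
  awk -v max="$max" '{ exit !($1 < max) }' "$file"
}

# Block until the machine is idle, polling every 10 seconds.
wait_until_idle() {
  until machine_idle "$@"; do
    sleep 10
  done
}

# Usage:
#   wait_until_idle 0.1 && echo "starting benchmark job"
```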

Machine Specs

NX236-S2HD (http://www.nixsys.com/nx236-s2hd.html)
