About zram, /tmp, swap & thread-counts

Post by colebaas »

At the end of this post is a slightly refined bash script - adapted from https://github.com/otzy007/enable-zRam-in-Slackware -
which creates min(#CpuThreads, 4) + 1 <= 5 /dev/zramX devices in total, and
1) creates /tmp on /dev/zram0
2) allocates the swap space equally across the remaining [1-4] zram devices.
As always, YMMV, so feel free to play with the amount of space allocated.

The reasoning behind this allocation follows:

We tested whether a /dev/zram-mounted /tmp still offers any benefit given today's fast IO devices (SSDs).
A number of popular applications, mainly server+client (desktop/notebook) designs, did benefit significantly from a RAM-mounted /tmp, especially those using socket-based communication.
Examples: i3, zim, iPython, Jupyter notebooks, etc.
Loading and initialization (socket creation) times improved anywhere from 5x to 20x, and stability increased as well. (For obvious reasons.)
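
The socket figures above are workload-specific, but a crude file-throughput check is easy to reproduce at home. Just a sketch - the file name and the 256MB size are arbitrary test values, and conv=fdatasync forces the data onto the backing device:

Code:

# run once with /tmp on disk, once with /tmp on zram, and compare
# (note: zeroes compress trivially on zram, so treat the zram
#  number as an upper bound rather than a realistic workload)
dd if=/dev/zero of=/tmp/zramtest bs=1M count=256 conv=fdatasync
rm -f /tmp/zramtest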

Why use swap? - I have plenty of RAM.
Why only 4 devices at most? - my CPU has more than 4 threads.

After testing a number of zram configurations (all Slackware, vanilla kernels) and analyzing the memory page tables, we found significant differences with respect to optimum page allocation.

Take a look at a standard memory allocation table:
  • $ cat /proc/buddyinfo
    Node 0, zone DMA 0 2 2 0 3 2 0 0 1 1 3
    Node 0, zone DMA32 1398 1409 1329 1161 1406 968 665 426 255 2 240
    Node 0, zone Normal 4522 8592 9149 9858 12029 11806 7777 4691 3285 5 3672
    [Each column counts free blocks of one size: the far left is single 4K pages, and every step to the right doubles the block size, up to 4M at the far right. I.e. far left=smallest -> far right=largest.]
On a freshly booted system, for example, one would notice mostly 0s in the left columns - i.e. very few, if any, small, fragmented memory chunks; the free memory still sits in the large blocks on the right.
As usage progresses, applications' needs vary with respect to the memory chunks the kernel must provide, and these are not always a convenient power-of-two size.
Example: if an application requests a 14K chunk, the kernel rounds up and hands out a 16K block - as long as it still has free chunks of the size closest to the requested amount. When no such chunks are left, it has to split the next larger block (e.g. 32K) in two, hand out one half, and release the other half back into the memory pool as a smaller free block.
Over time, a lot of smaller chunks get released back into the pool and thus accumulate on the left, and memory becomes fragmented.
Without swap space, the kernel has no choice but to fulfill concurrent memory requests from whatever is available - mostly the remaining, smaller memory fragments.
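
A small helper makes the column layout explicit. Just a sketch, assuming the usual 4K x86 page size:

Code:

#!/bin/bash
# print /proc/buddyinfo with explicit block sizes:
# column i holds the count of free order-i blocks = 2^i pages of 4K each
while read -r _ node _ zone counts; do
	echo "Node ${node%,}, zone $zone:"
	order=0
	for count in $counts; do
		printf "  order %2d (%5d K blocks): %6d free\n" \
			"$order" "$(( (1 << order) * 4 ))" "$count"
		order=$(( order + 1 ))
	done
done < /proc/buddyinfo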

In a number of common usage scenarios the page table rapidly becomes fragmented, and most of the remaining free memory ends up in the leftmost (small-block) columns of the table.
Providing even a small swap space allows the kernel to swap out the less-used, fragmented memory segments and reassemble them into larger chunks - essentially defragmenting the memory pool on the fly. (How efficient this process is depends on a number of factors, not least the speed differences between the CPU and the front-side bus, etc.)
In one of the practical tests - a fresh install on a USB stick (both USB 2.0 & 3.0), Slackware | Salix (both 32 & 64 bit, vanilla kernel) - overall performance increased markedly.
Furthermore, without any swap allocation, in heavy memory-use scenarios (NOT heavy disk IO!) we had little difficulty causing the system to freeze and/or kernel panic, even on machines with 32, 64 or 128 GB of installed RAM. (RAM speeds varied from 1600 to 3200, ECC and non-ECC.)
Under identical scenarios but with a proportionately small swap->zram allocation - amounts tested were 128M, 256M, 512M, 1GB, 2GB and 4GB of swap space - the system in every case throttled past even the farthest point at which, without swap, it would freeze.
The above example does not imply that running a system on a USB stick is the only use case for zram swap; it merely highlights one of the most notable differences among the setups we tested.
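
If you want to see how much RAM the zram swap actually costs under your own workload, newer kernels expose per-device stats. A sketch, assuming a kernel that provides /sys/block/zramN/mm_stat (older kernels split the same numbers across separate sysfs files):

Code:

# first two mm_stat fields: bytes stored (uncompressed) vs. bytes used
for dev in /sys/block/zram[1-9]; do
	[ -r "$dev/mm_stat" ] || continue
	read -r orig compr _ < "$dev/mm_stat"
	[ "$compr" -gt 0 ] || continue	# skip empty devices
	echo "${dev##*/}: $orig bytes stored in $compr bytes of RAM"
done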

Testing various zram device configurations against CPU thread counts from 2 to 72, on commonly deployed Intel i7 and Xeon configurations, we found that while one-to-one thread->zram mappings were optimal at 1, 2 and 4 threads, increasing the number of zram devices past 4 (to match higher thread counts) incurred tangible performance penalties. (Without claiming to have found the exact cause, zram instability with more than 4 devices seemed to play a role.)
Consequently the script caps the number of zram swap devices at 4, even if the CPU thread count exceeds that; a quick way to check what the cap works out to on a given machine is sketched right below.
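
Just a sketch - nproc (coreutils) reports the online thread count, and the arithmetic mirrors the NumDev line in the script:

Code:

threads=$(nproc)
echo "threads=$threads -> zram swap devices: $(( threads > 4 ? 4 : threads ))"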

Code:

#!/bin/bash
# Script to start zRam (Virtual Swap Compressed in RAM) + /tmp in RAM
# adapted from https://github.com/otzy007/enable-zRam-in-Slackware
#
#  place this script in /etc/rc.d/rc.zram
#  start it from rc.S -> right before /tmp gets set up
#
# Size of swap & /tmp space, each in MB
# default=1024MB, each
###

tempSize=1024	# /tmp size; in MB
swapSize=1024	# Total swap space; in MB

## do not alter below ##
thrdCnt=$(grep -c ^processor /proc/cpuinfo)
NumDev=$(( thrdCnt > 4 ? 4 : thrdCnt ))		# cap the swap devices at 4
diskSize=$(( swapSize / NumDev ))		# MB per swap device

setup_swap() {	## initialize zram1..zramN as swap
	for i in $(seq $NumDev); do
		echo lzo > /sys/block/zram$i/comp_algorithm
		echo $(( diskSize * 1024 * 1024 )) > /sys/block/zram$i/disksize
		mkswap --label zram$i /dev/zram$i
		swapon --priority 100 /dev/zram$i
	done
}

start() {	## zram0 carries /tmp; the rest carry the swap space
	modprobe zram num_devices=$(( NumDev + 1 ))
	echo lzo > /sys/block/zram0/comp_algorithm
	echo $(( tempSize * 1024 * 1024 )) > /sys/block/zram0/disksize
	mkfs.ext4 -q /dev/zram0
	mount /dev/zram0 /tmp
	chmod 1777 /tmp		## a fresh fs is 0755; /tmp needs the sticky bit
	## now set up the swap devices
	setup_swap
}

stop() {
	for i in $(seq $NumDev); do
		swapoff /dev/zram$i
	done
##	rmmod zram	## do NOT unload; /tmp is still mounted!!
}

case "$1" in
  start)
	start
  ;;

  stop)
	stop
  ;;

  restart)
	stop	## swapoff first; an in-use device refuses the reset below
	for i in $(seq $NumDev); do
		echo 1 > /sys/block/zram$i/reset
	done
	setup_swap	## reset clears disksize & algorithm; re-initialize
  ;;

  *)
	echo "Usage: $0 (start|stop|restart)"
esac
Note: Do NOT run this script inside a Virtual Machine or a container!
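
For completeness, the rc.S hook could look something like this - the exact placement varies between Slackware versions, so treat it as a sketch; the important part is that it runs before anything writes to /tmp:

Code:

if [ -x /etc/rc.d/rc.zram ]; then
	/etc/rc.d/rc.zram start
fi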