Hacking Ubuntu Touch, Part 8: System and process monitoring tools (part 2)
NOTE: This is a continuation of the series and relies on having Developer mode enabled.
NOTE: Thanks to Colin King from Canonical for all the help, and to Thomas Voss from Canonical for shedding some light on the GPS issue mentioned in the eventstat section below.
In the first part of this article we had a look at the “usual” suspects for system and process monitoring, such as `vmstat`. In this second part we will look at some special tools developed to find runaway processes and other power hogs on Ubuntu phones.
You can use `dpkg -x` to extract the packages to a temporary directory.
`cpustat` displays the CPU utilization of all currently running tasks. It takes two optional arguments: the delay between iterations (1 second by default), and the number of iterations to run (infinite by default).
The man page contains quite a list of command line options which modify the output. I found the following very useful:
One option excludes `cpustat` itself from the output. This should probably be on by default?
`-D` shows the distribution of CPU utilisation by task and by CPU once the command terminates.
`-l` shows full command information.
`-r` writes to a CSV file for later analysis.
`-S` adds timestamps to the output.
`-t` ignores processes which used less CPU than a given threshold in the last interval.
`-x` shows the average CPU load over the last 1, 5 and 10 minutes, the average CPU frequency over all CPUs and the number of online CPUs.
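To get a feeling for where such utilisation numbers come from, here is a rough sketch that derives system-wide CPU utilisation from two samples of `/proc/stat`. This is only an illustration of the underlying accounting, not how `cpustat` itself is implemented:

```python
import time

def cpu_ticks():
    # First line of /proc/stat: "cpu user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = [int(x) for x in f.readline().split()[1:]]
    idle = fields[3] + fields[4]  # idle + iowait
    return sum(fields), idle

def cpu_utilisation(delay=1.0):
    # Sample twice and compute the busy share of elapsed ticks.
    total1, idle1 = cpu_ticks()
    time.sleep(delay)
    total2, idle2 = cpu_ticks()
    busy = (total2 - total1) - (idle2 - idle1)
    # Clamp: the iowait counter can jitter slightly on some kernels.
    return max(0.0, min(100.0, 100.0 * busy / max(total2 - total1, 1)))

print(f"{cpu_utilisation():.1f}% CPU used over the last second")
```

The same delta-between-samples approach is what all the tools in this article do internally, just with more detailed kernel accounting files.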
`eventstat` periodically reads the content of the kernel event statistics file, `/proc/timer_stats`, and filters/beautifies it. The output gives a good indication about which processes and threads use kernel timers to be woken up at specific points in time, and how often that happens. An excessive number of timer events will most likely keep the system unnecessarily busy and keep it from going to sleep, thus increasing power consumption.
The syntax is mostly the same as `cpustat`’s: it takes a delay and a count as optional arguments, and the default is to update the output every second until Ctrl+C is pressed.
Now for the output:
The first column is the number of events per task per second, the second is the thread ID (the `LWP` column in `ps`), the third is the task name, column four is the kernel function that was used to initialize the timer, and column five the kernel function called on expiry.
In the example you can see that one task (thread ID 2976) is woken ten times per second, even though I had the Location detection completely disabled on my bq Aquaris E4.5. I talked about this with Thomas Voss at Canonical, and apparently the Android Hardware Abstraction Layer for the GPS in the Aquaris E4.5 “ticks” at a constant rate of 10 Hertz, even if the GPS is deactivated completely. Luckily this doesn’t prevent the CPU from going into deep sleep, so the increase in power consumption is negligible.
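For illustration, a `/proc/timer_stats` line of the kind `eventstat` consumes can be picked apart like this. The sample line and the 10-second sampling window are made up; the general format is `<count>[D], <pid> <comm> <init function> (<expiry function>)`:

```python
import re

# A made-up line in /proc/timer_stats format:
SAMPLE = "  100,  2976 gps-thread      hrtimer_start_range_ns (hrtimer_wakeup)"

LINE_RE = re.compile(
    r"\s*(?P<count>\d+)(?P<deferrable>D?),\s+(?P<pid>\d+)\s+(?P<comm>\S+)\s+"
    r"(?P<init>\S+)\s+\((?P<expiry>\S+)\)"
)

m = LINE_RE.match(SAMPLE)
events, pid = int(m.group("count")), int(m.group("pid"))
# Assuming a 10-second sampling window, convert to events per second:
rate = events / 10.0
print(pid, m.group("comm"), rate)
```

Dividing the raw event count by the length of the sampling window is exactly how the per-second figure in column one comes about.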
`eventstat` again has a lot of command line options, the following ones seem the most interesting to me:
`-r` again writes the output to a CSV file for later analysis.
`-l` switches the process name field to “long” format.
`-k` shows only kernel threads, `-u` only user-space processes.
`-w` adds a timestamp to the output.
Most file system activity is invisible to the user, but it can have a serious impact on system performance and power consumption. This is where `fnotifystat` comes into play: it uses the kernel’s fanotify interface to dump all filesystem activity in a given period of time.
`fnotifystat` again takes two optional arguments, the delay and the count. The defaults are a delay of one second between iterations and an infinite number of iterations, but unless you specify the `-f` option, output will be suppressed if there was no activity in the last iteration (e.g. when the device is sleeping).
The output displays how many open/close/read/write operations a process requested on a specific file on average over the sampling period. Keep this in mind if your delay is not set to one second!
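A quick arithmetic sketch of why the delay matters, with hypothetical numbers and assuming the displayed value is a per-second average over the sampling period:

```python
# Hypothetical: 50 read events on one file observed during one 5-second interval.
events_in_interval = 50
delay_seconds = 5

# The per-second average that would be displayed:
per_second = events_in_interval / delay_seconds
print(per_second)  # 10.0 reads/second shown, although 50 reads actually happened
```

So a burst of activity looks five times smaller with a 5-second delay than with the default 1-second delay.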
Out of the long list of command line options I ended up using the following more often:
`-f` forces output after every iteration even if there was no activity.
`-D` switches to “device mode” and displays the minor/major device number instead of the filename.
Another option works like `-D`, but includes file system inodes.
`-x` can be used to whitelist/blacklist paths, instead of monitoring all file systems.
`-m` merges multiple events on the same file by the same process into a single line, instead of writing to the output every time the kernel sends a notification.
`-p` limits monitoring to the specified process names/IDs.
There doesn’t currently seem to be an option to write the output to a CSV file.
Two of the most important system calls of any UNIX-like system are `fork`, used to create child processes, and `exit`, used to terminate them. On Linux `fork` has been mostly replaced by `clone`. While the implementation of these system calls is usually highly optimized, a high number of spawned and killed child processes and threads can have negative effects on system performance and power consumption.
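To get a feeling for the lifecycle such a monitoring tool observes, here is a tiny sketch that spawns a child, waits for it, and measures its lifetime. The 0.1-second “workload” is made up:

```python
import os
import time

# Spawn a short-lived child and measure how long it was alive,
# similar in spirit to the lifetime a process monitor would report.
start = time.monotonic()
pid = os.fork()
if pid == 0:
    # Child: pretend to do a little work, then exit cleanly.
    time.sleep(0.1)
    os._exit(0)

# Parent: wait for the child and compute its lifetime.
_, status = os.waitpid(pid, 0)
lifetime = time.monotonic() - start
print(f"child {pid} lived for {lifetime:.3f}s, exit status {os.WEXITSTATUS(status)}")
```

Under the hood Python’s `os.fork` ends up in the kernel’s `clone` path on Linux, which is exactly the ambiguity discussed below.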
`forkstat` monitors the system and displays every call to `fork` and `exit`. It also tells you how long a process/thread was alive until it exited.
Because of an ambiguity in the interface to the kernel, earlier versions of `forkstat` erroneously displayed every call to `clone` as a call to `fork` instead. Version 0.01.11 fixed this.
The most interesting command line options are:
`-e` can be used to trace only a given list of events.
`-S` shows cumulative event statistics when the command terminates.
Back when I explained the size fields (such as `SZ`) shown by `ps`, we realised that it’s not that easy to find out how much memory a process actually consumes in the end. Every process has its own address space, and in that address space there are several different regions: machine code, stack, private data, shared data, libraries, memory mapped files and devices, etc.
Stack and private data regions are easy: they can be accounted to the process in full because they differ between processes. But what about code, shared libraries and shared data? The kernel is clever: it realizes that it only has to load these into memory once if they are used by multiple processes at the same time, and then just has to pretend to every process that it has got its own copy. Memory mapped files and devices are not even real memory, just makeshift regions emulated by the kernel.
So if you sum up the size of all memory regions in all process address spaces, the number is usually much larger than the real memory consumption. If only there was a tool that handled all those shared regions correctly.
`smemstat` to the rescue! Instead of just displaying mostly useless values like `RSS`, it tries to paint a more realistic picture by dividing the output into `USS`, `PSS`, `RSS` and `Swap`.
`USS` is the unique set size of a process. It counts every bit of memory in the process space that only exists once for this process in the system.
`smemstat` is quite good at calculating this: for example, if there is only one instance of the `bash` process running, it counts the whole code segment towards this single process. If the same binary is running multiple times, `smemstat` sees that the kernel keeps only a single copy of the code segment and of every shared library in memory, so it doesn’t count these regions against the unique set size.
`PSS` is the proportional set size of a process. `smemstat` starts with the unique set size of the process and then adds a proportional piece of every shared area. Let’s say there are three instances of `bash` running, and the code segment of the binary is exactly 900 kilobytes in size. Also, `bash` needs the shared library `libc.so.6`, which is, let’s say, exactly 1000 kilobytes in size, but is also used by two other processes in the system. `smemstat` will then account (900 / 3) = 300 kilobytes of the code segment and (1000 / 5) = 200 kilobytes for the shared library towards every running instance of `bash`.
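The arithmetic above can be checked in a couple of lines; the kilobyte sizes and sharing counts are the made-up numbers from the example:

```python
# Made-up sizes from the example above, in kilobytes:
code_segment_kb = 900   # shared by 3 bash instances
libc_kb = 1000          # shared by 5 processes in total (3 bash + 2 others)

# Proportional share accounted to each bash instance:
pss_share_kb = code_segment_kb / 3 + libc_kb / 5
print(pss_share_kb)  # 500.0
```

So each of the three `bash` instances gets 300 + 200 = 500 kilobytes of shared memory added on top of its unique set size.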
RSS is, as already described, the total size of all areas in a process’ address space which are currently loaded in, which means they have been accessed at least once and are not swapped out.
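On a Linux system you can compute all three values for the current process directly from `/proc/self/smaps`. This is a rough sketch of the same per-mapping accounting `smemstat` performs, not its actual implementation; it relies only on the standard `Rss:`, `Pss:`, `Private_Clean:` and `Private_Dirty:` fields (all reported in kB):

```python
# Sum this process' own USS, PSS and RSS from /proc/self/smaps.
def smaps_totals(pid="self"):
    totals = {"Rss": 0, "Pss": 0, "Private_Clean": 0, "Private_Dirty": 0}
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in totals:
                totals[key] += int(rest.split()[0])  # value in kB
    # USS = memory private to this process; PSS and RSS come straight from the kernel.
    uss = totals["Private_Clean"] + totals["Private_Dirty"]
    return uss, totals["Pss"], totals["Rss"]

uss, pss, rss = smaps_totals()
print(f"USS={uss} kB  PSS={pss} kB  RSS={rss} kB")
```

By construction `USS <= PSS <= RSS`: private pages count fully everywhere, while shared pages count fractionally in `PSS` and fully in `RSS`.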
Let’s look at the difference between the following two outputs to get a feeling for how this all works. I made sure that nothing was swapped out to make things a bit easier; especially on the bq Aquaris E4.5, swapping is the norm though, so you will often have to combine `Swap` and `RSS` in your mind.
One `bash` instance running:
Three `bash` instances running:
If just one `bash` instance is running, most of it counts towards `USS`, and `PSS` isn’t much higher than `USS`, because `bash` only loads a small number of libraries which are used by nearly every other process in the system as well. If you divide a ~100 kilobyte library among 100 processes (this actually happens, e.g. with `ld-2.21.so`), there’s not much to account to a single process.
When we run three `bash` instances it gets more interesting. Note that `RSS` is always about the same value, because a freshly started `bash` in this example has loaded and accessed areas worth ~1780 kilobytes. But now just about everything that is not unique to each `bash` process is accounted towards `PSS`, not against `USS`! What remains for `USS` is mostly heap and stack. `PSS` is much lower than before, because the code segment can be divided by three. Shared libraries are negligible in this example.
With those three values a developer can now actually see what’s really going on. When optimizing for low memory consumption, `USS` should obviously be kept as low as possible, for example. A high `Swap` value might be okay if `PSS` is much lower, because in that case there might be a lot of regions which are shared with many other processes, lowering the “impact” of every single process.
You will obviously only get to see your own processes if you don’t run `smemstat` as root. The most important command line switches are:
`-k`, `-m` and `-g` report memory in kilobytes, megabytes or gigabytes, respectively.
`-o` dumps the data to a JSON formatted file.
`-p` monitors a list of processes.
If you know better and/or something has changed, please do find me on Freenode IRC or on Launchpad.net and get in contact!