Memory observability tools

Tool         Description
vmstat       Virtual and physical memory statistics
PSI          Memory pressure stall information
swapon       Swap device usage
sar          Historical statistics
slabtop      Kernel slab allocator statistics
numastat     NUMA statistics
ps           Process status
top          Monitor per-process memory usage
pmap         Process address space statistics
perf         Memory PMC and tracepoint analysis
drsnoop      Direct reclaim tracing
wss          Working set size estimation
bpftrace     Tracing programs for memory analysis
pmcarch      CPU cycle usage including LLC misses
tlbstat      Summarizes TLB cycles
free         Cache capacity statistics
cachestat    Page cache statistics
oomkill      Shows extra info on OOM kill events
memleak      Shows possible memory leak code paths
mmapsnoop    Traces mmap(2) calls system-wide
brkstack     Shows brk() calls with user stack traces
shmsnoop     Traces shared memory calls with details
faults       Shows page faults, by user stack trace
ffaults      Shows page faults, by filename
vmscan       Measures VM scanner shrink and reclaim times
swapin       Shows swap-ins by process
hfaults      Shows huge page faults, by process

CPU concepts

  1. Clock Rate
    The clock is a digital signal that drives all processor logic. Each CPU instruction may take one or more cycles of the clock (called CPU cycles) to execute. CPUs execute at a particular clock rate; for example, a 4 GHz CPU performs 4 billion clock cycles per second.
  2. Instructions
    CPUs execute instructions chosen from their instruction set. An instruction includes the following steps, each processed by a component of the CPU called a functional unit:
    Instruction fetch
    Instruction decode
    Execute
    Memory access
    Register write-back

    Of these steps, memory access is typically the slowest.
  3. Instruction Pipeline
    The instruction pipeline is a CPU architecture that can execute multiple instructions in parallel by executing different components of different instructions at the same time.
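  4. Branch Prediction and Out-of-Order Execution
    Branch prediction keeps the pipeline full by guessing the outcome of a branch and fetching the predicted instructions before the branch is resolved. Modern processors can also perform out-of-order execution of the pipeline, where later instructions can be completed while earlier instructions are stalled, improving instruction throughput.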
  5. Instruction Width
    Multiple functional units of the same type can be included, so that even more instructions can make forward progress with each clock cycle. This CPU architecture is called superscalar and is typically used with pipelining to achieve a high instruction throughput.
  6. Instruction Size
    x86, which is classified as a complex instruction set computer (CISC), allows up to 15-byte instructions. ARM, which is a reduced instruction set computer (RISC), has 4-byte instructions with 4-byte alignment.
  7. SMT
    Simultaneous multithreading makes use of a superscalar architecture and hardware multithreading support (by the processor) to improve parallelism. It allows a CPU core to run more than one thread, effectively scheduling between them during instructions, for example while one thread's instruction is stalled on memory access.
  8. IPC, CPI
    Instructions per cycle (IPC) is an important high-level metric for describing how a CPU is spending its clock cycles and for understanding the nature of CPU utilization. This metric may also be expressed as cycles per instruction (CPI), the inverse of IPC.
  9. Utilization
    CPU utilization is measured by the time a CPU instance is busy performing work during an interval, expressed as a percentage. It can be measured as the time a CPU is not running the kernel idle thread but is instead running user-level application threads or other kernel threads, or processing interrupts.
  10. User Time/Kernel Time
    The CPU time spent executing user-level software is called user time, and kernel-level software is kernel time. Kernel time includes time during system calls, kernel threads, and interrupts. When measured across the entire system, the user time/kernel time ratio indicates the type of workload performed.
  11. Saturation
    A CPU at 100% utilization is saturated, and threads will encounter scheduler latency as they wait to run on-CPU, decreasing overall performance. This latency is the time spent waiting on the CPU run queue or other structure used to manage threads.
  12. Preemption
    Preemption allows a higher-priority thread to preempt the currently running thread and begin its own execution instead. This eliminates the run-queue latency for higher-priority work, improving its performance.
  13. Priority Inversion
    Priority inversion occurs when a lower-priority thread holds a resource and blocks a higher-priority thread from running. This reduces the performance of the higher-priority work, as it is blocked waiting.
  14. Multiprocess, Multithreading
    Most processors provide multiple CPUs of some form. For an application to make use of them, it needs separate threads of execution so that it can run in parallel.
  15. Word Size
    Processors are designed around a maximum word size—32-bit or 64-bit—which is the integer size and register size. 
  16. Compiler Optimization
    Compilers are also frequently updated to take advantage of the latest CPU instruction sets and to implement other optimizations. Sometimes application performance can be significantly improved simply by using a newer compiler.
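
Item 8 defines IPC and CPI as inverses of each other. A minimal sketch of the arithmetic, assuming you already have a cycle count and an instruction count (for example from perf stat); the function names ipc and cpi are made up for this note:

```shell
# ipc CYCLES INSTRUCTIONS  -> instructions per cycle, two decimals
ipc() {
    awk -v c="$1" -v i="$2" 'BEGIN { if (c == 0) exit 1; printf "%.2f\n", i / c }'
}

# cpi CYCLES INSTRUCTIONS  -> cycles per instruction, two decimals
cpi() {
    awk -v c="$1" -v i="$2" 'BEGIN { if (i == 0) exit 1; printf "%.2f\n", c / i }'
}
```

With the 4 GHz CPU from item 1 busy for one second (4 billion cycles) and a hypothetical 2 billion instructions retired, ipc 4000000000 2000000000 gives 0.50 and cpi gives 2.00.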

Enable sse on virtual machine(kvm)

Kernel-based Virtual Machine (KVM) has become the de facto hypervisor on GNU/Linux systems. It performs well because it uses the CPU virtualization extensions (Intel VT-x or AMD-V). KVM doesn’t emulate hardware itself but uses QEMU for this.

Nested Virtual guest

Nested virtualization makes it possible to run a hypervisor inside a KVM virtual machine.

Verify

To verify whether nested virtualization is enabled on your system, check /sys/module/kvm_intel/parameters/nested on Intel systems or /sys/module/kvm_amd/parameters/nested on AMD systems.

[staf@ak ~]$ cat /sys/module/kvm_intel/parameters/nested
N
[staf@ak ~]$ 
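
The check above can be wrapped so it works on either vendor. This is only a sketch: the function name nested_status and its optional directory argument are made up here (the argument exists so the function can be tried against a scratch directory), and the value printed may be Y/N or 1/0 depending on the kernel.

```shell
# nested_status [MODULE_DIR]
# Prints "<module>: <value>" for whichever KVM vendor module is loaded.
nested_status() {
    base="${1:-/sys/module}"
    for mod in kvm_intel kvm_amd; do
        f="$base/$mod/parameters/nested"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$mod" "$(cat "$f")"
            return 0
        fi
    done
    echo "no KVM vendor module loaded" >&2
    return 1
}
```

Running nested_status with no argument reads the real /sys/module tree.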

Enable

Shut down all virtual machines

Make sure that there are no virtual machines running.

[root@ak ~]# virsh 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # list
 Id    Name                           State
----------------------------------------------------

virsh # 

Unload KVM

Unload the KVM kernel module.

[root@ak ~]# modprobe -r kvm_intel
[root@ak ~]# 

Load KVM and activate nested

Reload the KVM module with the nested feature enabled.

[root@ak ~]# modprobe kvm_intel nested=1
[root@ak ~]# 

Verify

[root@ak ~]# cat /sys/module/kvm_intel/parameters/nested
Y
[root@ak ~]# 

To enable the nested feature permanently, create /etc/modprobe.d/kvm_intel.conf

[root@ak ~]# vi /etc/modprobe.d/kvm_intel.conf

and enable the nested option.

options kvm_intel nested=1

Enabling nested virtualization in the virtual machine

When you log on to a virtual machine and check the CPU for virtualization extensions, the flags aren’t available.

[staf@centos7 ~]$ cat /proc/cpuinfo | grep  -i -E "vmx|svm"
[staf@centos7 ~]$ 

To enable nested virtualization in a virtual machine you can either:

  • start virsh, edit the virtual machine, and change the CPU line to <cpu mode='host-model' check='partial'/>
  • open virt-manager and select Copy host CPU configuration in the CPU configuration

[root@ak ~]# virsh 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # list
 Id    Name                           State
----------------------------------------------------
 1     centos7.0                      running

virsh # edit centos7.0 

Change the cpu settings

  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

Shut down the virtual machine

virsh # shutdown centos7.0 
Domain centos7.0 is being shutdown

virsh # 

Start the virtual machine

virsh # start centos7.0  
Domain centos7.0 started

While saving the virsh domain XML you might get an error such as:

Extra element cpu in interleave

Press i to ignore it, then start the domain.

Log on to the virtual machine and verify the CPU flags:

[staf@centos7 ~]$ cat /proc/cpuinfo | grep -i vmx
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt ibpb ibrs arat spec_ctrl
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt ibpb ibrs arat spec_ctrl
[staf@centos7 ~]$ cat /proc/cpuinfo | grep  -i "vmx|svm"
[staf@centos7 ~]$ cat /proc/cpuinfo | grep  -i -E "vmx|svm"
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt ibpb ibrs arat spec_ctrl
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt ibpb ibrs arat spec_ctrl
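
/proc/cpuinfo prints one flags line per logical CPU, which is why the long line repeats. A small sketch that reduces this to just the virtualization flag; the function name virt_flag and its optional file argument are made up for this note:

```shell
# virt_flag [CPUINFO_FILE]
# Prints vmx and/or svm once each if present; prints nothing otherwise.
virt_flag() {
    file="${1:-/proc/cpuinfo}"
    grep -o -w -E 'vmx|svm' "$file" | sort -u
}
```

Inside the guest, empty output means the extensions are still not exposed.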

dpkg: error processing package linux-image-generic (--configure): dependency problems - leaving unconfigured

I had this issue just now. In my case I purged the errant package using dpkg, then updated and forced the reinstall:

sudo dpkg --purge linux-image-3.13.0-35-generic
sudo apt-get update
sudo apt-get -f install

Writing a Simple Linux Kernel Module

Prerequisites

Before we get started, we need to make sure we have the correct tools for the job. Most importantly, you’ll need a Linux machine. I know that comes as a complete surprise! While any Linux distribution will do, I am using Ubuntu 16.04 LTS in this example, so if you’re using a different distribution you may need to slightly adjust your installation commands.

Secondly, you’ll need either a separate physical machine or a virtual machine. I prefer to do my work in a virtual machine, but this is entirely up to you. I don’t suggest using your primary machine because data loss can occur when you make a mistake. I say when, not if, because you undoubtedly will lock up your machine at least a few times during the process. Your latest code changes may still be in the write buffer when the kernel panics, so it’s possible that your source files can become corrupted. Testing in a virtual machine eliminates this risk.

And finally, you’ll need to know at least some C. The C++ runtime is far too large for the kernel, so writing bare metal C is essential. For interaction with hardware, knowing some assembly might be helpful.

Installing the Development Environment

On Ubuntu, we need to run:

sudo apt-get install build-essential linux-headers-`uname -r`

This will install the essential development tools and the kernel headers necessary for this example.

The examples below assume you are running as a regular user and not root, but that you have sudo privileges. Sudo is mandatory for loading kernel modules, but we want to work outside of root whenever possible.

Getting Started

Let’s start writing some code. Let’s prepare our environment:

mkdir -p ~/src/lkm_example
cd ~/src/lkm_example

Fire up your favorite editor (in my case, this is vim) and create the file lkm_example.c with the following contents:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Robert W. Oliver II");
MODULE_DESCRIPTION("A simple example Linux module.");
MODULE_VERSION("0.01");

/* Called when the module is loaded; returning 0 signals success. */
static int __init lkm_example_init(void) {
 printk(KERN_INFO "Hello, World!\n");
 return 0;
}

/* Called when the module is unloaded. */
static void __exit lkm_example_exit(void) {
 printk(KERN_INFO "Goodbye, World!\n");
}

module_init(lkm_example_init);
module_exit(lkm_example_exit);

Now that we’ve constructed the simplest possible module, let’s examine the important parts in detail:

· The “includes” cover the required header files necessary for Linux kernel development.

· MODULE_LICENSE can be set to a variety of values depending on the license of the module. To see a full list, run:
grep "MODULE_LICENSE" -B 27 /usr/src/linux-headers-`uname -r`/include/linux/module.h

· We define both the init (loading) and exit (unloading) functions as static. The init function returns an int; the exit function returns void.

· Note the use of printk instead of printf. Also, printk doesn’t take the same parameters as printf. For example, KERN_INFO, which declares the logging priority for this line, is written without a comma after it. It is a string literal that the compiler joins onto the format string, so the priority travels inside the message rather than as a separate argument.

· At the end of the file, we call module_init and module_exit to tell the kernel which functions are our loading and unloading functions. This gives us the freedom to name the functions whatever we like.

We can’t compile this file yet, though. We need a Makefile. This basic example will work for now. Note that make is very picky about spaces and tabs, so ensure you use tab instead of space where appropriate.

obj-m += lkm_example.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

If we run “make”, it should compile your module successfully. The resulting file is “lkm_example.ko”. If you receive any errors, check that the quotation marks in the example source file are plain ASCII quotes and were not pasted accidentally as typographic (curly) quotes.

Now we can insert the module to test it. To do this, run:

sudo insmod lkm_example.ko

If all goes well, you won’t see a thing. The printk function doesn’t output to the console but rather the kernel log. To see that, we’ll need to run:

sudo dmesg

You should see the “Hello, World!” line prefixed by a timestamp. This means our kernel module loaded and successfully printed to the kernel log. We can also check to see if the module is still loaded:

lsmod | grep "lkm_example"

To remove the module, run:

sudo rmmod lkm_example

If you run dmesg again, you’ll see “Goodbye, World!” in the logs. You can also use lsmod again to confirm it was unloaded.

As you can see, this testing workflow is a bit tedious, so to automate this we can add:

test:
	sudo dmesg -C
	sudo insmod lkm_example.ko
	sudo rmmod lkm_example.ko
	dmesg

at the end of our Makefile and now run:

make test

to test our module and see the output of the kernel log without having to run separate commands.

Now we have a fully functional, yet completely trivial, kernel module!
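
One caveat on the lsmod | grep check used earlier: grep matches substrings, so a module with a longer but similar name would also match. A sketch of an exact-name check against /proc/modules, whose first column is the module name; the function name module_loaded and its optional file argument are made up for this note:

```shell
# module_loaded NAME [MODULES_FILE]
# Succeeds only if a module with exactly this name is listed.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example: module_loaded lkm_example && echo "still loaded".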

Missing separator in Makefile?

The following Makefile is not working and I am not sure what’s going on.

CC = gcc
CFLAGS = -Wall -g

demo:
    ${CC} ${CFLAGS} demo.c -o demo
lib:
    ${CC} ${CFLAGS} lib.c -o lib
clean:
    rm -f lib demo

Demo has the main function and lib has a set of methods used in demo.

I added the -c flag to lib. However when I run make, I get:

Makefile:5: *** missing separator.  Stop.

Solution:

Given your update with the error, check what you have on the line before those ${CC} commands. Many make programs require a real tab character before the commands and editors that put in eight spaces (for example) will break them. That’s more often than not the cause of the “Missing separator” errors.

You can see that with the following transcript. In the file, there are four spaces before the $(xyzzy):

xyzzy=echo
all:
    $(xyzzy) hello

So, when I make it, I get the same error as you:

pax> make
makefile:3: *** missing separator.  Stop.

But, when I edit it and turn those four spaces into a tab, it works fine:

pax> make
echo hello
hello
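
To find the offending lines without eyeballing whitespace, something like the following can help. It is a rough heuristic only: it reports every line that starts with spaces, which can also flag harmless indented lines that are not recipes. The function name find_space_recipes is made up for this note.

```shell
# find_space_recipes [MAKEFILE]
# Prints "line-number:line" for lines indented with spaces instead of a tab.
find_space_recipes() {
    grep -n -E '^ +[^ #]' "${1:-Makefile}"
}
```

On systems with GNU coreutils, cat -A Makefile is another way to look: real tabs show up as ^I.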

You also have a problem with the way you’re trying to combine the source files together.

Without a -c flag to gcc, it will try to create a separate executable from each of those commands, almost certainly leading to linker errors. You’re going to need something like (simple):

CC = gcc
CFLAGS = -Wall -g

# Just compile/link all files in one hit.
demo: demo.c lib.c
	${CC} ${CFLAGS} -o demo demo.c lib.c

clean:
	rm -f demo

or (slightly more complex):

CC = gcc
CFLAGS1 = -Wall -g -c
CFLAGS2 = -g

# Link the two object files together.

demo: demo.o lib.o
	${CC} ${CFLAGS2} -o demo demo.o lib.o

# Compile each source file to an object.

demo.o: demo.c
	${CC} ${CFLAGS1} -o demo.o demo.c

lib.o: lib.c
	${CC} ${CFLAGS1} -o lib.o lib.c

clean:
	rm -f demo demo.o lib.o

The problem with the first solution is that it unnecessarily compiles both source files even when only one is out of date. The second solution is a little more intelligent.

kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s!

If you think that hypervisor overcommitment is the reason, you can raise the watchdog threshold:

echo [time] > /proc/sys/kernel/watchdog_thresh

where [time] is a number of seconds and cannot be more than 60.
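
A sketch of the same write with the upper limit checked first. The function name set_watchdog_thresh and its optional target-file argument are made up for this note; writing to the real file needs root.

```shell
# set_watchdog_thresh SECONDS [TARGET_FILE]
# Rejects non-numeric values and values above 60 before writing.
set_watchdog_thresh() {
    secs="$1"
    target="${2:-/proc/sys/kernel/watchdog_thresh}"
    case "$secs" in
        ''|*[!0-9]*)
            echo "threshold must be a whole number of seconds" >&2
            return 2
            ;;
    esac
    if [ "$secs" -gt 60 ]; then
        echo "threshold cannot be more than 60" >&2
        return 2
    fi
    echo "$secs" > "$target"
}
```

As root, set_watchdog_thresh 30 would apply the value until the next reboot.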

How to boot with an old kernel version in RHEL 7?

  • By default, the key for the GRUB_DEFAULT directive in the /etc/default/grub file is the word saved. This instructs GRUB 2 to load the kernel specified by the saved_entry directive in the GRUB 2 environment file, located at /boot/grub2/grubenv. One can set another GRUB record to be the default, using the grub2-set-default command, which will update the GRUB 2 environment file.
  • By default, the saved_entry value is set to the name of latest installed kernel of package type kernel. This is defined in /etc/sysconfig/kernel by the UPDATEDEFAULT and DEFAULTKERNEL directives. The file can be viewed by the root user as follows:
    $ cat /etc/sysconfig/kernel
    # UPDATEDEFAULT specifies if new-kernel-pkg should make
    # new kernels the default
    UPDATEDEFAULT=yes
    
    # DEFAULTKERNEL specifies the default kernel package type
    DEFAULTKERNEL=kernel
    
  • To force a system to always use a particular menu entry, use the menu entry name as the key to the GRUB_DEFAULT directive in the /etc/default/grub file. To list the available menu entries, run the following command as root:
    ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
    
    Eg: 
    ~]#  awk -F\' '$1=="menuentry " {print $2}' /etc/grub2-efi.cfg 
    Red Hat Enterprise Linux Server (3.10.0-693.el7.x86_64) 7.3 (Maipo)           <<==== Entry 0
    Red Hat Enterprise Linux Server (3.10.0-514.el7.x86_64) 7.3 (Maipo)           <<==== Entry  1
    Red Hat Enterprise Linux Server (0-rescue-d3c598b9d2204138bd2e1001316a5cc6) 7.3 (Maipo)
    
  • GRUB 2 supports using a numeric value as the key for the saved_entry directive to change the default order in which the kernel or operating systems are loaded. To specify which kernel or operating system should be loaded first, pass its number to the grub2-set-default command. For example:
    ~]# grub2-set-default 1
    
  • Check the file below to see which kernel will be loaded at next boot, and cross-check the numeric value against the menu entries listed from /etc/grub2.cfg (or /etc/grub2-efi.cfg on UEFI systems).
    ~]# cat /boot/grub2/grubenv |grep saved
    
    Eg:
    ~]# cat /boot/grub2/grubenv |grep saved
    saved_entry=1
    
  • Changes to /etc/default/grub require rebuilding the grub.cfg file. Rebuild it by running the grub2-mkconfig -o command as follows:
    • On BIOS-based machines: ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
    • On UEFI-based machines: ~]# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
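
The entry numbers in the example above were added by hand. A sketch that prints them automatically, reusing the same awk field split on single quotes; the function name list_menuentries is made up for this note, and it assumes a flat menu with no submenus, as in the RHEL 7 example:

```shell
# list_menuentries [GRUB_CFG]
# Prints each top-level menuentry title prefixed by its index, starting at 0.
list_menuentries() {
    awk -F\' '$1 == "menuentry " { printf "%d  %s\n", n++, $2 }' "${1:-/etc/grub2.cfg}"
}
```

The printed index is the number to pass to grub2-set-default.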

kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s!

  • Check /etc/grub.conf and /boot/grub/grub.conf in RHEL 6 and below, or /etc/sysconfig/grub in RHEL 7, to see whether console output is redirected to a serial console, e.g. using console=ttyS1 or console=ttyS1,9600. In both of these cases the output is restricted to 9600 baud, limiting the output and possibly causing issues.
  • A fix might be to not log to the serial console, or to explicitly configure a higher baud rate, e.g. using console=ttyS1,115200. Please note that in some situations even 115200 baud might be a limiting factor.

Otherwise, investigate further root cause conditions

  • Determine if the system was under extremely high load at the time the soft lockups were seen in the logs. If the sysstat package was already installed, it will have recorded load average every 10 minutes using a cron job.
  • The load average can then be found by searching for ldavg in /var/log/sa/sar<day>, where day is the day of the month on which the soft lockups were seen. If the load average is significantly higher than the number of logical CPU cores on the system, the soft lockups probably occurred because of extremely high workloads.
    In this case it would be best to determine what processes caused the load to go so high and make changes so that the processes don’t cause the issue again.
  • Since it is also possible that defects in the kernel caused the soft lockups, the full logs need to be investigated around the time of the soft lockups to see if the issue is a bug or is fixed by errata. It can help to look in the changelog of the latest kernel available on Red Hat Network and see if any soft lockup issues were fixed since the version of the installed kernel.
  • Another approach is to rule out a known issue that has already been fixed by running the system with the latest kernel and seeing whether the soft lockups happen again. Red Hat support may be required to conclusively determine if the issue is a bug.
  • Also verify with a hardware vendor that the issue is not hardware related. One way to verify that the issue is not a known and solved hardware problem is to update the firmware or BIOS to the latest available from the hardware vendor.
  • On virtual systems, soft lockups can indicate that the underlying hypervisor is overcommitted. Please see this article addressing this issue: VMware virtual machine guest suffers multiple soft lockups at the same time
  • If all of the above have been verified to not be the cause it could be a case where soft lockups do not indicate a problem; for example on systems with very large numbers of CPU cores.
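
The load-versus-cores comparison from the list above can be reduced to one number. A minimal sketch, with a made-up function name load_per_cpu; values well above 1 mean more runnable work than logical CPUs:

```shell
# load_per_cpu LOAD_AVERAGE CPU_COUNT  -> load per logical CPU, two decimals
load_per_cpu() {
    awk -v l="$1" -v n="$2" 'BEGIN { if (n == 0) exit 1; printf "%.2f\n", l / n }'
}
```

For the current moment rather than a recorded one, load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" uses the 1-minute load average. For the time of the lockup, take the ldavg value from the sar file instead.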

If this is encountered in RHEL 5, then increase the threshold at which the messages appear using the following procedures:

  • Run following command and check whether “soft lockup” errors are still encountered on the system:
    # sysctl -w kernel.softlockup_thresh=30
  • Make this parameter persistent across reboots by adding the following line to the /etc/sysctl.conf file:
     kernel.softlockup_thresh=30

In RHEL 6 and above, the threshold is named “watchdog_thresh” and can be set no higher than 60. To make this change, set the tunable kernel.watchdog_thresh in /etc/sysctl.conf.
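
A sketch of setting that key without leaving duplicate lines behind. The function name persist_sysctl and its optional file argument are made up for this note; it rewrites the file, so try it on a copy first. After editing the real file as root, sysctl -p loads the new value.

```shell
# persist_sysctl KEY VALUE [CONF_FILE]
# Removes any existing assignment of KEY, then appends "KEY = VALUE".
persist_sysctl() {
    key="$1"
    value="$2"
    conf="${3:-/etc/sysctl.conf}"
    tmp="$(mktemp)" || return 1
    # Escape the dots so they match literally in the pattern.
    pattern="$(printf '%s' "$key" | sed 's/\./\\./g')"
    grep -v -E "^[[:space:]]*${pattern}[[:space:]]*=" "$conf" > "$tmp" 2>/dev/null || true
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

For example: persist_sysctl kernel.watchdog_thresh 30.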

Additional Notes:

  • The softlockup_thresh kernel parameter was introduced in Red Hat Enterprise Linux 5.2 in kernel-2.6.18-92.el5 thus it is not possible to modify this on older versions.

Root Cause

  • Soft lockups are situations in which the kernel’s scheduler subsystem has not been given a chance to perform its job for more than the limit set by the watchdog threshold, in seconds; they can be caused by defects in the kernel, by hardware issues or by extremely high workloads.
  • If lockups are encountered on a virtual system, it is important to ensure that the hypervisor is not overcommitted.
  • Hardware issues related to newly installed memory might cause soft lockups.
  • Misconfigurations might also cause the issue, such as redirecting console output to a serial device and limiting it to e.g. 9600 baud.
  • On systems with a very large number of CPU cores, soft lockups might not indicate a problem.

Trying to modify a kernel

I’m a noob in linux/android, yet I have to modify a kernel.

For one specific reason I’m using this guide (it’s somewhat understandable when translated to English using Google).

The problem is that I’m stuck at the part where you have to “enter the following command to view the address of these two functions”. The only addresses I get when entering those commands are 00000000, which doesn’t seem quite right.

I don’t really understand why that is happening. It may be because the guy who wrote the guide is using adb to get the addresses, while I’m trying to get them using a terminal in Android. I can’t quite use adb, because I’m running the MEmu emulator and that’s where I need the addresses from.

Solution:

The address is not being shown because you are not running the command under the root user.
This issue has been explained in this answer.

In your case, you need to obtain root privileges using either the sudo -s or su command. Once you are root, your shell prompt should end with a #. On my OnePlus, the prompt looks like this when I am root: A0001:/ #

If it does not work, make sure that the file /proc/sys/kernel/kptr_restrict contains a 0. You can check this by running cat /proc/sys/kernel/kptr_restrict.

To set its value to 0, you should execute the command echo 0 > /proc/sys/kernel/kptr_restrict with administrative rights.
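
For reference, a sketch that turns the three documented kptr_restrict values into a short explanation. The function name explain_kptr_restrict is made up for this note, and the wording is a summary, not the kernel's own text.

```shell
# explain_kptr_restrict VALUE
explain_kptr_restrict() {
    case "$1" in
        0) echo "no kptr_restrict hiding (other settings may still apply)" ;;
        1) echo "hidden unless the reader has CAP_SYSLOG (typically root)" ;;
        2) echo "hidden for everyone, including root" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}
```

Usage: explain_kptr_restrict "$(cat /proc/sys/kernel/kptr_restrict)".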

Hope it helps!