The Linux Kernel

An API to Hardware

Processes

We know that the Linux kernel allocates resources to processes; otherwise, processes would never be able to run. So what exactly is a process?

Basically, a process is the program itself plus the collection of things needed to run it.

% cat /proc/1620/status 
Name:	firefox
State:	S (sleeping)
Tgid:	1620
Ngid:	0
Pid:	1620
PPid:	1
TracerPid:	0
Uid:	1024	1024	1024	1024
Gid:	100	100	100	100
FDSize:	256
Groups:	10 52 100 142 421 
NStgid:	1620
NSpid:	1620
NSpgid:	1053
NSsid:	1053
VmPeak:	 2629516 kB
VmSize:	 2618700 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	 1090712 kB
VmRSS:	  916448 kB
VmData:	 1691772 kB
VmStk:	     328 kB
VmExe:	     140 kB
VmLib:	  196464 kB
VmPTE:	    3616 kB
VmPMD:	      24 kB
VmSwap:	       0 kB
Threads:	94
SigQ:	0/63912
SigPnd:	0000000000000000
ShdPnd:	0000000000000000
SigBlk:	0000000000000000
SigIgn:	0000000001001000
SigCgt:	0000002f820044af
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	0000003fffffffff
Seccomp:	0
Cpus_allowed:	ff
Cpus_allowed_list:	0-7
Mems_allowed:	00000000,00000001
Mems_allowed_list:	0
voluntary_ctxt_switches:	1646967
nonvoluntary_ctxt_switches:	18380
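
The status file shown above is plain text, so a program can read it like any other file. Here is a minimal Python sketch, assuming a Linux system with /proc mounted. The helper name read_status is made up for illustration.

```python
import os


def read_status(pid="self"):
    """Return the fields of /proc/<pid>/status as a dict of strings."""
    fields = {}
    with open(f"/proc/{pid}/status") as status_file:
        for line in status_file:
            # Each line looks like "Key:<tab>value".
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


if __name__ == "__main__":
    status = read_status()  # "self" means the calling process
    for key in ("Name", "State", "Pid", "PPid", "Threads", "VmRSS"):
        print(f"{key}: {status.get(key)}")
```

Passing a numeric PID instead of "self" reads another process's status, subject to the usual permissions.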

"A process is an active program and related resources"

What are these related resources?

  • memory addresses
  • registers
  • file descriptors

What about CPU time? How do processes actually get time to run?

The Linux scheduler determines priority based on two main factors:

  1. I/O-bound processes get higher priority than CPU-bound processes

  2. The nice value set by the user
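
The nice value is the factor a user can change directly. Here is a small Python sketch, assuming a Unix-like system, of a process reading its own nice value and then asking for lower priority. The function names are hypothetical. Unprivileged processes can normally only raise their nice value, not lower it.

```python
import os


def current_niceness():
    """Nice value of the calling process (-20 is most favored, 19 least)."""
    return os.getpriority(os.PRIO_PROCESS, 0)


def be_nicer(increment=1):
    """Raise our own nice value (ask for lower priority); return the new value."""
    return os.nice(increment)


if __name__ == "__main__":
    before = current_niceness()
    after = be_nicer(1)
    print(f"nice value: {before} -> {after}")
```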

How are new processes created?

  1. An existing process calls fork and duplicates itself.
  2. The new process calls exec in order to load an executable program into memory.
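
The two steps above can be sketched in Python, whose os module exposes fork and exec fairly directly. This is an illustration under the assumption of a Unix-like system, not how any particular shell or service manager does it. The name fork_and_exec is made up.

```python
import os
import sys


def fork_and_exec(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()  # step 1: the existing process duplicates itself
    if pid == 0:
        # Child: step 2, replace this copy with a new executable program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; exit without running parent code.
            os._exit(127)
    # Parent: wait for the child so it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    code = fork_and_exec([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with", code)
```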

[Figure: process state diagram. Labels: Existing process forks; Ready to run; Kernel scheduler dispatches process; Running; Preempted by scheduler; Uninterruptible sleep; Waiting for interrupt; Process exits.]

A quick aside on Linux threads

In most other operating systems, processes also have another important related resource: threads. Linux does not differentiate a process from a thread; instead, the kernel shares resources between threads. Multithreading in Linux is simply having several processes share resources.
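
One way to see this from user space: every thread in a program reports the same PID, but each has its own kernel task ID. This matches the Tgid and Threads fields in the status output earlier. Here is a rough Python sketch, assuming Linux and Python 3.8 or newer. The function name thread_ids is made up.

```python
import os
import threading


def thread_ids(count=3):
    """Start `count` threads; return the (pid, tid) pair seen inside each."""
    seen = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all have reported,
    # so the kernel cannot hand the same task ID out twice.
    barrier = threading.Barrier(count)

    def worker():
        with lock:
            seen.append((os.getpid(), threading.get_native_id()))
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen


if __name__ == "__main__":
    for pid, tid in thread_ids():
        print(f"pid={pid} tid={tid}")
```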

System Calls

When a process needs to talk to the hardware, it does so via system calls to the kernel. System calls are usually made through the system libraries, but programs can also send system calls directly to the kernel.
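
To make the two routes concrete, here is a hedged Python sketch that asks for the process ID twice: once through the usual library wrapper (os.getpid) and once by system call number through libc's generic syscall() function. The call numbers are assumptions that hold only for the two architectures listed. On any other machine the sketch returns None rather than guessing.

```python
import ctypes
import os
import platform

# getpid's system call number differs per architecture.
# Only these two are listed; this table is an assumption of the sketch.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}


def getpid_via_syscall():
    """Invoke getpid by number, bypassing the named library wrapper."""
    number = SYS_GETPID.get(platform.machine())
    if number is None:
        return None  # unknown architecture: skip rather than guess
    libc = ctypes.CDLL(None, use_errno=True)  # symbols of the running program
    libc.syscall.argtypes = [ctypes.c_long]
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(number)


if __name__ == "__main__":
    print("library wrapper:", os.getpid())
    print("by call number: ", getpid_via_syscall())
```

Both routes end up in the same kernel code. The library wrapper just saves the program from knowing the numbers.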

[Figure: the stack from top to bottom: Applications, System Libraries, System Calls, Kernel, Hardware. Side labels mark User Level and Kernel Level.]

What are system calls for?

What do system calls look like?

18053 open("/etc/nginx/vhosts.d/localhost.conf", O_RDONLY) = 5
18053 fstat(5, {st_mode=S_IFREG|0644, st_size=253, ...}) = 0
18053 pread(5, "server {\n    server_name 127.0.0"..., 253, 0) = 253
18053 getuid()                          = 0
18053 open("/etc/nginx/includes/restrictions.conf", O_RDONLY) = 6
18053 fstat(6, {st_mode=S_IFREG|0644, st_size=810, ...}) = 0
18053 pread(6, "# Global restrictions configurat"..., 810, 0) = 810
18053 close(6)                          = 0
18053 getuid()                          = 0
18053 close(5)                          = 0
18053 open("/etc/nginx/vhosts.d/www.hsdnd.com.conf", O_RDONLY) = 5
18053 fstat(5, {st_mode=S_IFREG|0644, st_size=340, ...}) = 0
18053 pread(5, "server {\n    server_name hsdnd.c"..., 340, 0) = 340
18053 getuid()                          = 0
18053 open("/etc/nginx/includes/wordpress.conf", O_RDONLY) = 6
18053 fstat(6, {st_mode=S_IFREG|0644, st_size=1204, ...}) = 0
18053 pread(6, "# WordPress single blog rules.\n#"..., 1204, 0) = 1204
18053 open("/etc/nginx/includes/fastcgi.conf", O_RDONLY) = 7
18053 fstat(7, {st_mode=S_IFREG|0644, st_size=958, ...}) = 0
18053 pread(7, "fastcgi_param  SCRIPT_FILENAME  "..., 958, 0) = 958
18053 close(7)                          = 0
18053 close(6)                          = 0
18053 open("/etc/nginx/includes/restrictions.conf", O_RDONLY) = 6
18053 fstat(6, {st_mode=S_IFREG|0644, st_size=810, ...}) = 0
18053 pread(6, "# Global restrictions configurat"..., 810, 0) = 810
18053 brk(0)                            = 0x936000
18053 brk(0x959000)                     = 0x959000
18053 close(6)                          = 0
18053 getuid()                          = 0
18053 close(5)                          = 0
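
The trace above repeats one pattern per file: open, fstat to learn the size, pread to read the contents, then close. Python's os module has thin wrappers for the same calls, so the pattern can be sketched as follows. The function name and the sample file contents are made up for illustration. On newer systems a tracer may show openat instead of open.

```python
import os
import tempfile


def read_config(path):
    """Read a whole file using the open / fstat / pread / close pattern."""
    fd = os.open(path, os.O_RDONLY)   # returns a file descriptor
    try:
        size = os.fstat(fd).st_size   # how many bytes are there?
        return os.pread(fd, size, 0)  # read `size` bytes from offset 0
    finally:
        os.close(fd)                  # always give the descriptor back


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(suffix=".conf") as tmp:
        tmp.write(b"server {\n    server_name example.test;\n}\n")
        tmp.flush()
        print(read_config(tmp.name).decode())
```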

Example

A customer requests a new cloud server because they accidentally borked their existing server.

The Application

  • We act as the application.
  • They give us specific data about the data center, flavor, OS, etc.
  • We take their barely coherent thoughts and translate them into button clicks in the control panel.

The System Library

The control panel is the system library; it receives our button clicks and turns them into API calls to OpenStack. It does not matter whether we, the applications, go through Reach or use curl to send API commands directly. We use Reach and Encore because they are much easier for us to interface with for menial and repetitive tasks, such as creating a new server.

The System Calls

The API is analogous to the system calls. These are what give the final instructions to the Linux kernel, or in our example, OpenStack itself. The Linux kernel contains the device drivers, so it knows how to talk to the physical hardware. Ultimately, the Linux kernel is what actually does the heavy lifting.

Now we have our resources...

Now that our OS has the instructions, it can allocate those resources and pass them back up the stack. Processes either get the resources they wanted or get a kind error message. Similarly, OpenStack gives Reach the server IP address, root password, etc., which we then see and kindly pass on to the customer.
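
On the Linux side, "resource or error message" looks roughly like this: a request for a file either comes back with a file descriptor or with an error code. A minimal Python sketch follows. The function name and the paths in the usage lines are hypothetical.

```python
import errno
import os


def request_file(path):
    """Ask the kernel for a file descriptor.

    Returns (fd, None) on success or (None, error_name) on failure.
    """
    try:
        return os.open(path, os.O_RDONLY), None
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. ENOENT.
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    for path in (os.devnull, "/no/such/file/for-this-demo"):
        fd, error = request_file(path)
        print(path, "->", f"fd {fd}" if error is None else error)
        if fd is not None:
            os.close(fd)
```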

Resources

Memory

Types of Memory

We know that servers have at least two kinds of memory:

  • Physical RAM (Main Memory)
  • Swap space

What kind of memory do applications get?

  • When a program asks for memory, it does not care what kind of memory it gets back, so long as it gets a usable memory address.
  • Similarly, when a program wants to write to memory, it does not want to be bothered with where the memory is stored, the page size, etc.
  • The kernel handles all of this through a memory abstraction layer called virtual memory.
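
A small Python sketch of what "a usable memory address" looks like from the program's side. The address it prints is a virtual address. Nothing in the program says whether the page behind it sits in RAM or in swap. The helper name allocate is made up, and the page arithmetic is only there to show what the kernel hides.

```python
import ctypes
import mmap


def allocate(size):
    """Allocate `size` bytes; return the buffer and facts about its address."""
    buffer = ctypes.create_string_buffer(size)
    address = ctypes.addressof(buffer)  # a virtual address, not a RAM location
    page_size = mmap.PAGESIZE
    info = {
        "address": hex(address),
        "page_number": address // page_size,
        "offset_in_page": address % page_size,
    }
    return buffer, info


if __name__ == "__main__":
    buffer, info = allocate(1000)
    buffer[:5] = b"hello"  # writing works without knowing any of the above
    print(info, buffer.raw[:5])
```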

Virtual Memory

  • The kernel gives each process a private slice of the available virtual memory.
  • Multiple processes can share memory slices, but completely independent processes do not have to worry about overrunning each other's memory slice.
  • This prevents processes from accessing memory that belongs to other users, privileged users, etc.
  • Virtual memory also allows the kernel to seamlessly move memory pages from main memory to disk without the processes being any the wiser.
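
The private and shared cases can be seen side by side after a fork. In this Python sketch (Linux assumed, names made up), the child overwrites both an ordinary buffer and an anonymous shared mapping. The parent sees the change only in the shared one.

```python
import mmap
import os


def private_vs_shared():
    """Return (private, shared) bytes as the parent sees them after a fork."""
    private = bytearray(b"parent")         # ordinary, private memory
    shared = mmap.mmap(-1, mmap.PAGESIZE)  # anonymous mapping, shared by default
    shared[:6] = b"parent"

    pid = os.fork()
    if pid == 0:
        # Child: write to both, then exit immediately.
        private[:] = b"child!"
        shared[:6] = b"child!"
        os._exit(0)

    os.waitpid(pid, 0)
    result = (bytes(private), bytes(shared[:6]))
    shared.close()
    return result


if __name__ == "__main__":
    private, shared = private_vs_shared()
    print("private:", private)  # still the parent's copy
    print("shared: ", shared)   # shows the child's write
```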

Filesystems

VFS

Similar to virtual memory, the kernel provides a virtual filesystem to abstract away the underlying complexities such as:

  • different filesystem types
  • where filesystems are mounted
  • the hardware underneath the filesystem
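
One consequence is that the same open and read calls work no matter which filesystem sits behind a path. The Python sketch below reads /proc/self/mounts, itself a file on the proc filesystem, with an ordinary open() and lists which filesystem type backs each mount point. It assumes Linux with /proc mounted. The function name is hypothetical.

```python
def mounted_filesystems(mounts_path="/proc/self/mounts"):
    """Return (mount point, filesystem type) pairs for the calling process."""
    mounts = []
    with open(mounts_path) as mounts_file:
        for line in mounts_file:
            # Fields: device, mount point, filesystem type, options, ...
            _device, mount_point, fs_type, *_rest = line.split()
            mounts.append((mount_point, fs_type))
    return mounts


if __name__ == "__main__":
    for mount_point, fs_type in mounted_filesystems():
        print(f"{fs_type:12} {mount_point}")
```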

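[Figure: System Calls pass through the VFS to the individual filesystems: EXT4, XFS, BTRFS; SYSFS, PROC, DEV; NFS, GLUSTERFS, SAMBA. Backing layers shown: Physical Device, Virtual Memory, Network. Also labeled: Caching.]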