{"id":10943,"date":"2020-09-11T05:00:00","date_gmt":"2020-09-11T03:00:00","guid":{"rendered":"https:\/\/blog.mi.hdm-stuttgart.de\/?p=10943"},"modified":"2020-09-13T11:55:21","modified_gmt":"2020-09-13T09:55:21","slug":"behind-the-scenes-of-modern-operating-systems-security-through-isolation-part-1","status":"publish","type":"post","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2020\/09\/11\/behind-the-scenes-of-modern-operating-systems-security-through-isolation-part-1\/","title":{"rendered":"Behind the scenes of modern operating systems \u2014 Security through isolation (Part 1)"},"content":{"rendered":"<p>In recent years, since the Internet has become available to almost anyone, application and runtime security have become more important than ever. Be it an (unknown) application you download and run from the Internet or some server application you expose <em>to<\/em> the Internet, it&#8217;s almost certainly a bad idea to run apps without any security restrictions applied:<br \/>\n<!--more--><br \/>\nUnknown (untrusted) applications from the Internet could well include some malware trying to steal data from you. Server applications can even be attacked remotely by triggering security vulnerabilities such as buffer overflows, file inclusion bugs and the like.<br \/>\nWhile naive solutions such as using multiple devices (or using a dedicated device for each process you would normally run on your PC) are impractical and cumbersome to use, other techniques such as application sandboxing \u2014 which we will explore later on \u2014 exist.<\/p>\n<p>The problem here is that applications (by default) can access data and use operating system functionality they should not be able to use. The trojan horse you just downloaded can easily access your photos and other kinds of sensitive data stored on your PC, or inject itself into your browser and steal money while you&#8217;re performing bank transactions there. 
The server application you expose to the Internet can be made to run <em>arbitrary code<\/em> once a buffer overflow has been triggered in it, and can then cause pretty much the same harm as any other kind of malware.<\/p>\n<p style=\"font-size: 0.7em\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/09\/xkcd-1957.png\" alt=\"xkcd's 2018 CVE List\"><br \/>\n(Source: <a href=\"https:\/\/xkcd.com\/1957\/\">https:\/\/xkcd.com\/1957\/<\/a>)<\/p>\n<p>This is the first of two posts on the security of modern operating systems. This part lays out and explains the groundwork of security in the Linux Kernel.<\/p>\n<p><a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2020\/09\/11\/behind-the-scenes-of-modern-operating-systems-security-through-isolation-part-2\/\">Part two<\/a> shows how the security mechanisms introduced in this post can be combined to create containerization platforms such as Docker and OS-level application isolation techniques such as LXC. It also introduces other kinds of isolation techniques such as virtualization.<\/p>\n<h2>Sandboxing<\/h2>\n<p>The idea behind application sandboxing is simple: A sandboxed application (which in fact is a process) is <em>isolated<\/em> from all other processes running on a PC. It can only cause harm to things inside its sandbox, and this sandbox should only include the bare minimum of data and functionality required to run the application.<\/p>\n<p>Probably <em>the<\/em> sandbox in most common use (and even active on your PC right now) is inside your browser! 
Modern browsers such as Firefox isolate all tabs from one another and only allow communication between them through specially crafted communication interfaces.<\/p>\n<p><img decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/09\/sandboxing.png\" alt=\"Sandboxing\"><\/p>\n<p>Sandboxing \u2014 if implemented correctly \u2014 is a great solution to the problem from our introduction in that it allows a single PC to run multiple processes while confining the damage a compromised process can do to its own sandbox.<!-- \u2014 compared to separating the processes to each run on a different machine.--><br \/>\nBut how is sandboxing actually achieved in modern operating systems? Let&#8217;s find out!<\/p>\n<h2>Security in the Linux Kernel<\/h2>\n<p>Most servers on the Internet use Linux as their operating system kernel, and modern containerization platforms such as Docker currently only run on Linux. In addition, Linux is <a href=\"https:\/\/github.com\/torvalds\/linux\">open source<\/a> and has the most active development community and companies behind it (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Usage_share_of_operating_systems#Public_servers_on_the_Internet\">source 1<\/a>, <a href=\"https:\/\/w3techs.com\/technologies\/details\/os-unix\">source 2<\/a>). That&#8217;s why we want to take a deeper look into its inner workings and explore the security mechanisms provided by Linux.<\/p>\n<h3>chroot \u2014 change root<\/h3>\n<p>One of the features for sandboxing applications (which by the way is supported by almost all UNIX descendants, including BSD and System V) has been in Linux since its inception \u2014 <em>chroot<\/em>.<br \/>\n<em>chroot<\/em> is a system call that changes the apparent root directory of a process; <em>chroot<\/em>-jailed processes see a different file system view than other processes.<\/p>\n<p>Let&#8217;s take a look at the following file system view to better illustrate how a <em>chroot<\/em> actually works. 
Suppose you have the following files on your drive:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">\/\n\u251c\u2500\u2500 bin\n\u2502   \u251c\u2500\u2500 sh\n\u2502   \u2514\u2500\u2500 \u2026\n\u251c\u2500\u2500 dev\n\u2502   \u251c\u2500\u2500 null\n\u2502   \u2514\u2500\u2500 random\n\u251c\u2500\u2500 home\n\u2502   \u251c\u2500\u2500 artur\n\u2502   \u2502   \u2514\u2500\u2500 arturs_file\n\u2502   \u2514\u2500\u2500 leon\n\u2502       \u2514\u2500\u2500 leons_file\n\u2514\u2500\u2500 root\n<\/code><\/pre>\n<p>In order to create a <em>chroot<\/em> jail for a process with a different file system view, such a &#8220;view&#8221; must exist. This simply means all necessary files required to launch an operating system such as Debian must exist in some sub-folder on the file system. This might look as follows (see the <code class=\"\" data-line=\"\">chroot<\/code> folder):<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">\/\n\u251c\u2500\u2500 bin\n\u2502   \u251c\u2500\u2500 sh\n\u2502   \u2514\u2500\u2500 \u2026\n\n\u251c\u2500\u2500 chroot # &lt;-- This is where our chroot-ed process will &quot;live&quot;\n\u2502   \u251c\u2500\u2500 bin\n\u2502   \u2502   \u251c\u2500\u2500 sh\n\u2502   \u2502   \u2514\u2500\u2500 \u2026\n\u2502   \u251c\u2500\u2500 dev\n\u2502   \u251c\u2500\u2500 home\n\u2502   \u2502   \u2514\u2500\u2500 peter\n\u2502   \u2502     \u2514\u2500\u2500 peters_file\n\u2502   \u2514\u2500\u2500 root\n\n\u251c\u2500\u2500 home\n\u2502   \u251c\u2500\u2500 artur\n\u2502   \u2502   \u2514\u2500\u2500 arturs_file\n\u2502   \u2514\u2500\u2500 leon\n\u2502       \u2514\u2500\u2500 leons_file\n\u2514\u2500\u2500 root\n<\/code><\/pre>\n<p>Let&#8217;s spawn a new <code class=\"\" data-line=\"\">\/bin\/sh<\/code> shell and <em>chroot<\/em> it to <code class=\"\" data-line=\"\">\/chroot<\/code>:<\/p>\n<pre style=\"background: #f9f9f9;padding: 
13px;font-size: 0.7em\"><code class=\"\" data-line=\"\"># These three commands are required to mount all the\n# necessary virtual file systems to the chroot jail.\n$ sudo mount -o bind \/dev \/chroot\/dev\n$ sudo mount -o bind \/proc \/chroot\/proc\n$ sudo mount -o bind \/sys \/chroot\/sys\n\n# This command spawns a new \/bin\/sh shell and chroot&#039;s\n# it to \/chroot\n$ sudo chroot \/chroot \/bin\/sh\n\n# From now on we\u2019re using the new shell inside the chroot jail\n\n# Quick verification\u2026\n$ tree \/home\n.\n\u2514\u2500\u2500 peter\n    \u2514\u2500\u2500 peters_file\n\n# Success! The shell we&#039;re using right now can&#039;t see\n# &quot;artur&quot;&#039;s and &quot;leon&quot;&#039;s home directories!\n<\/code><\/pre>\n<p>A <em>chroot<\/em>-jailed process can create new files only inside its jail:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\"># From the chroot-jailed process\u2026\n$ touch \/testfile\n<\/code><\/pre>\n<p>Our file system now looks as follows:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">\/\n\u251c\u2500\u2500 bin\n\u2502   \u251c\u2500\u2500 sh\n\u2502   \u2514\u2500\u2500 \u2026\n\n\u251c\u2500\u2500 chroot\n\u2502   \u251c\u2500\u2500 bin\n\u2502   \u2502   \u251c\u2500\u2500 sh\n\u2502   \u2502   \u2514\u2500\u2500 \u2026\n\u2502   \u251c\u2500\u2500 dev\n\u2502   \u251c\u2500\u2500 home\n\u2502   \u2502   \u2514\u2500\u2500 peter\n\u2502   \u2502       \u2514\u2500\u2500 peters_file\n\u2502   \u251c\u2500\u2500 root\n\u2502   \u2514\u2500\u2500 testfile # &lt;----- The new file\n\n\u251c\u2500\u2500 home\n\u2502   \u251c\u2500\u2500 artur\n\u2502   \u2502   \u2514\u2500\u2500 arturs_file\n\u2502   \u2514\u2500\u2500 leon\n\u2502       \u2514\u2500\u2500 leons_file\n\u2514\u2500\u2500 root\n<\/code><\/pre>\n<p><em>chroot<\/em>-capable directories can either be created by hand or automatically by tools like <code class=\"\" 
data-line=\"\">debootstrap<\/code>. <code class=\"\" data-line=\"\">debootstrap<\/code> can be used to easily set up a Debian system:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ sudo debootstrap stable \/chroot https:\/\/deb.debian.org\/debian\/\nI: Target architecture can be executed\nI: Retrieving InRelease\nI: Checking Release signature\nI: Valid Release signature\nI: Retrieving Packages\nI: Validating Packages\nI: Resolving dependencies of required packages...\nI: Resolving dependencies of base packages...\nI: Checking component main on https:\/\/deb.debian.org\/debian...\nI: Retrieving libacl1 2.2.53-4\nI: Validating libacl1 2.2.53-4\nI: Retrieving adduser 3.118\nI: Validating adduser 3.118\nI: Retrieving libapparmor1 2.13.2-10\nI: Validating libapparmor1 2.13.2-10\nI: Retrieving apt 1.8.2\nI: Validating apt 1.8.2\n# \u2026\nI: Configuring systemd...\nI: Configuring ca-certificates...\nI: Base system installed successfully.\n\n$ ls -la \/chroot\/\ndrwxr-xr-x 17 root root 4096 Jun 21 08:31 .\ndrwxr-xr-x  1 root root 4096 Jun 21 08:30 ..\nlrwxrwxrwx  1 root root    7 Jun 21 08:30 bin -&gt; usr\/bin\ndrwxr-xr-x  2 root root 4096 May  2 16:39 boot\ndrwxr-xr-x  4 root root 4096 Jun 21 08:30 dev\ndrwxr-xr-x 48 root root 4096 Jun 21 08:31 etc\ndrwxr-xr-x  2 root root 4096 May  2 16:39 home\nlrwxrwxrwx  1 root root    7 Jun 21 08:30 lib -&gt; usr\/lib\nlrwxrwxrwx  1 root root    9 Jun 21 08:30 lib32 -&gt; usr\/lib32\nlrwxrwxrwx  1 root root    9 Jun 21 08:30 lib64 -&gt; usr\/lib64\nlrwxrwxrwx  1 root root   10 Jun 21 08:30 libx32 -&gt; usr\/libx32\ndrwxr-xr-x  2 root root 4096 Jun 21 08:30 media\ndrwxr-xr-x  2 root root 4096 Jun 21 08:30 mnt\ndrwxr-xr-x  2 root root 4096 Jun 21 08:30 opt\ndrwxr-xr-x  2 root root 4096 May  2 16:39 proc\ndrwx------  2 root root 4096 Jun 21 08:30 root\ndrwxr-xr-x  3 root root 4096 Jun 21 08:30 run\nlrwxrwxrwx  1 root root    8 Jun 21 08:30 sbin -&gt; usr\/sbin\ndrwxr-xr-x  2 
root root 4096 Jun 21 08:30 srv\ndrwxr-xr-x  2 root root 4096 May  2 16:39 sys\ndrwxrwxrwt  2 root root 4096 Jun 21 08:31 tmp\ndrwxr-xr-x 13 root root 4096 Jun 21 08:30 usr\ndrwxr-xr-x 11 root root 4096 Jun 21 08:30 var\n<\/code><\/pre>\n<p><em>chroot<\/em> forms the basis of every application sandbox in that it enables different processes to see different files. However, <em>chroot<\/em>s by themselves do <strong>not<\/strong> provide any security against malicious attacks. This is because the <code class=\"\" data-line=\"\">root<\/code> user <em>inside<\/em> the jail has the same user ID, and therefore the same privileges, as the <code class=\"\" data-line=\"\">root<\/code> user <em>outside<\/em> of it. <a href=\"https:\/\/web.archive.org\/web\/20160127150916\/http:\/\/www.bpfh.net\/simes\/computing\/chroot-break.html\">Escaping<\/a> a <em>chroot<\/em> jail without any additional restrictions is quite easy!<\/p>\n<p>If implemented correctly, a <em>chroot<\/em>ed process cannot see files outside of its <em>chroot<\/em>, yet a malicious or misbehaving application can still harm the system by, for example, exhausting the system&#8217;s hardware resources or accessing network interfaces when it shouldn&#8217;t. Additional security mechanisms are required to prevent that!<\/p>\n<h3>namespaces<\/h3>\n<p>Support for <em>namespaces<\/em> was added to the Linux Kernel back in 2002. <em>namespaces<\/em> affect which <strong>system resources<\/strong> a process can <em>see<\/em> and <em>interact<\/em> with. 
As of Kernel v5.8 (the most recent release at the time of writing), this includes the following 8 namespace types:<\/p>\n<ul>\n<li>Interprocess Communication (<code class=\"\" data-line=\"\">IPC<\/code>)\n<ul>\n<li>Controls which processes can communicate with each other via <em>IPC<\/em> (e.g. using shared memory, SHM)<\/li>\n<\/ul>\n<\/li>\n<li>Network (<code class=\"\" data-line=\"\">net<\/code>)\n<ul>\n<li>Controls which <em>network interfaces<\/em> a namespace uses<\/li>\n<\/ul>\n<\/li>\n<li>Mount (<code class=\"\" data-line=\"\">mnt<\/code>)\n<ul>\n<li>Controls which <em>mounts<\/em> are available to a namespace<\/li>\n<\/ul>\n<\/li>\n<li>Process ID (<code class=\"\" data-line=\"\">PID<\/code>)\n<ul>\n<li>Controls which <em>processes<\/em> a namespace can interact with<\/li>\n<\/ul>\n<\/li>\n<li>UNIX Time-Sharing (<code class=\"\" data-line=\"\">UTS<\/code>)\n<ul>\n<li>Controls which <em>system hostname<\/em> a namespace uses<\/li>\n<\/ul>\n<\/li>\n<li>User ID (<code class=\"\" data-line=\"\">user<\/code>)\n<ul>\n<li>Controls which <em>users<\/em> a namespace can interact with<\/li>\n<\/ul>\n<\/li>\n<li>Time (<code class=\"\" data-line=\"\">time<\/code>)\n<ul>\n<li>Controls which <em>system time<\/em> (monotonic and boot-time clocks) a namespace uses<\/li>\n<\/ul>\n<\/li>\n<li>Control group (<code class=\"\" data-line=\"\">cgroup<\/code>)\n<ul>\n<li>Controls which part of the <em>control group hierarchy<\/em> a namespace can see<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Every process on a Linux system is part of exactly one <em>namespace<\/em> of each type and can use all system resources which are available to that <em>namespace<\/em>.<br \/>\nConsider the following graphical illustration where we have two <code class=\"\" data-line=\"\">net<\/code> namespaces. 
One of them, aptly named <code class=\"\" data-line=\"\">No_Network<\/code>, has access to no network interface, while the other, named <code class=\"\" data-line=\"\">With_Network<\/code>, has access to the <code class=\"\" data-line=\"\">lo0<\/code> and <code class=\"\" data-line=\"\">eth0<\/code> network interfaces:<\/p>\n<p><img decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/09\/namespaces-net.png\" alt=\"Network namespaces\"><\/p>\n<p>Processes <code class=\"\" data-line=\"\">A<\/code> and <code class=\"\" data-line=\"\">B<\/code> use the <code class=\"\" data-line=\"\">No_Network<\/code> namespace, so they cannot perform any network-related activities.<br \/>\nProcesses <code class=\"\" data-line=\"\">C<\/code> and <code class=\"\" data-line=\"\">D<\/code>, on the other hand, are in the <code class=\"\" data-line=\"\">With_Network<\/code> namespace, so they <em>do<\/em> have access to the <code class=\"\" data-line=\"\">lo0<\/code> and <code class=\"\" data-line=\"\">eth0<\/code> network interfaces.<\/p>\n<p>Internally, in the Kernel, each <em>namespace<\/em> is identified by a namespace ID. This ID can be shown for each process by listing the process&#8217;s <code class=\"\" data-line=\"\">ns<\/code> directory in the <code class=\"\" data-line=\"\">proc<\/code> file system:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ ls -l \/proc\/$$\/ns\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 cgroup           -&gt; &#039;cgroup:[1111111111]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 ipc              -&gt; &#039;ipc:   [2222222222]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 mnt              -&gt; &#039;mnt:   [3333333333]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 net              -&gt; &#039;net:   [4444444444]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 pid              -&gt; &#039;pid:   [5555555555]&#039;\nlrwxrwxrwx. 
1 root root 0 Jun 21 11:43 pid_for_children -&gt; &#039;pid:   [5555555555]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 user             -&gt; &#039;user:  [6666666666]&#039;\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 uts              -&gt; &#039;uts:   [7777777777]&#039;\n\n#                                                                       ^\n#                                                                  Namespace ID\n<\/code><\/pre>\n<p>Let&#8217;s create our own <code class=\"\" data-line=\"\">UTS<\/code> namespace to use a different <em>system hostname<\/em> for a new process \u2014 and only for this process!<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\"># The hostname of our system is &quot;debian&quot;\n$ hostname\ndebian\n\n# View the &quot;uts&quot; namespace ID of our current shell\n# ($$ refers to the current shell&#039;s process ID)\n$ ls -l \/proc\/$$\/ns | grep uts\nlrwxrwxrwx. 1 root root 0 Jun 21 11:43 uts              -&gt; &#039;uts:   [7777777777]&#039;\n\n# Using the &quot;unshare&quot; command, spawn a new &quot;bash&quot; shell\n# in its own and newly created &quot;uts&quot; namespace:\n$ unshare --fork --uts chroot \/ bash\n\n# From now on we\u2019re inside the new namespace\n\n# View the &quot;uts&quot; namespace ID of our freshly spawned bash.\n# Note: it is a different ID!\n$ ls -l \/proc\/$$\/ns | grep uts\nlrwxrwxrwx. 
1 root root 0 Jun 21 11:44 uts              -&gt; &#039;uts:   [8888888888]&#039;\n\n# Set the hostname inside the new bash:\n$ hostname HELLO-UTS\n\n# Verify that it was set correctly\n$ hostname\nHELLO-UTS\n\n# Exit the bash we just spawned\n$ exit\n\n# From now on we\u2019re back in the old namespace.\n# Verify that we got back our old hostname:\n$ hostname\ndebian\n\n# \u2026 nice!\n<\/code><\/pre>\n<h4>cgroup \u2014 Control groups<\/h4>\n<p>Control groups (abbreviated <em>cgroup<\/em>s) are closely related to namespaces, but where namespaces control what a process can <em>see<\/em>, <em>cgroup<\/em>s control how much of a resource it may <em>use<\/em>. They are powerful enough to deserve their own section! Support for <em>cgroup<\/em>s was added to the Linux Kernel in 2008.<br \/>\n<em>cgroup<\/em>s make it possible to specify which <em>hardware resources<\/em> a process can use, and how much of them. This includes the following hardware resource types:<br \/>\n&#8211; CPU<br \/>\n&#8211; RAM<br \/>\n&#8211; Storage I\/O<br \/>\n&#8211; Network I\/O<br \/>\n&#8211; etc.<\/p>\n<p>Additionally, certain <em>cgroup<\/em>s can get a higher priority than others (comparable to what the <code class=\"\" data-line=\"\">nice<\/code> CLI app does for individual processes). <em>cgroup<\/em>s also make it possible to measure resource usage, which can be used for billing of shared computation resources, e.g. 
as done by VPS providers.<\/p>\n<p style=\"font-size: 0.7em\"><img decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/09\/cgroup.png\" alt=\"cgroup\">(Source: <a href=\"https:\/\/mairin.wordpress.com\/2011\/05\/13\/ideas-for-a-cgroups-ui\/\">https:\/\/mairin.wordpress.com\/2011\/05\/13\/ideas-for-a-cgroups-ui\/<\/a>)<\/p>\n<p>Let&#8217;s see them in action and create our own <em>cgroup<\/em>!<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\"># Note: the paths below use the cgroup v1 memory controller layout.\n\n# Each cgroup needs a name, we call ours &quot;DEMO&quot;\n$ CGROUPNAME=DEMO\n\n# To create a new cgroup, simply issue the following command\n$ sudo mkdir \/sys\/fs\/cgroup\/memory\/$CGROUPNAME\n\n# We limit the RAM usage of our cgroup to 1 KB,\n# so we write 1000 (bytes) to the following file\n$ echo 1000 | sudo tee \/sys\/fs\/cgroup\/memory\/$CGROUPNAME\/memory.limit_in_bytes\n\n# Add our own process ID to the new cgroup\n# ($$ refers to the current shell&#039;s process ID)\n$ echo $$ | sudo tee \/sys\/fs\/cgroup\/memory\/$CGROUPNAME\/cgroup.procs\n\n# Try to launch &quot;ls&quot;\n$ ls -a\nKilled\n\n# Whoops, our shell was killed due to the memory constraints\n# we just defined! Let\u2019s grant some more memory!\n\n# Increase the RAM limit of our cgroup to 10 MB (10000000 bytes).\n$ echo 10000000 | sudo tee \/sys\/fs\/cgroup\/memory\/$CGROUPNAME\/memory.limit_in_bytes\n\n# We need to add our shell&#039;s process ID to the cgroup\n# once again as our previous shell was terminated!\n$ echo $$ | sudo tee \/sys\/fs\/cgroup\/memory\/$CGROUPNAME\/cgroup.procs\n\n# Try to launch &quot;ls&quot; again, this time with a 10 MB RAM limit.\n$ ls -a\n. .. .bash_history .bashrc .profile\n\n# \u2026 nice!\n<\/code><\/pre>\n<h3>seccomp \u2014 Secure computing mode<\/h3>\n<p>The <em>secure computing mode<\/em> (introduced to Linux in 2005) allows processes to make a one-way transition into a &#8220;secure&#8221; mode. 
During this transition, the process relinquishes the right to use certain system calls and from then on is no longer able to use them (the process is killed by the Kernel if it still tries to). This makes it possible to follow the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Principle_of_least_privilege\">principle of least privilege<\/a>, where an application can only do what it needs to do to complete its intended task.<\/p>\n<p>Applications written in memory-unsafe programming languages such as C are often vulnerable to buffer overflows where an attacker can execute arbitrary code in the context of the application and can therefore run system calls which are not even used by the application under normal conditions.<br \/>\nseccomp is the only security feature we will explore which the application <em>itself<\/em> has to implement; it&#8217;s a feature which protects the application from <em>itself<\/em>.<\/p>\n<p>In its strictest form, the <em>strict mode<\/em>, <em>seccomp<\/em> only allows the following 4 system calls to be made:<\/p>\n<ul>\n<li><code class=\"\" data-line=\"\">exit<\/code><\/li>\n<li><code class=\"\" data-line=\"\">sigreturn<\/code><\/li>\n<li><code class=\"\" data-line=\"\">read<\/code><\/li>\n<li><code class=\"\" data-line=\"\">write<\/code><\/li>\n<\/ul>\n<p>Processes using the strict seccomp mode need to open all required file descriptors <em>before<\/em> performing the one-way transition to their locked-down version.<br \/>\nIf the strict mode is too strict, for example if a call to the <code class=\"\" data-line=\"\">open<\/code> system call is still required after the transition, an <em>allow list<\/em> of valid system calls can be specified instead (seccomp&#8217;s filter mode).<\/p>\n<p>In C, seccomp can be used as follows:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"language-c\" data-line=\"\">#include &lt;stdio.h&gt;\n#include &lt;fcntl.h&gt;\n#include &lt;unistd.h&gt;\n#include 
&lt;sys\/prctl.h&gt;\n#include &lt;linux\/seccomp.h&gt;\n\nint main(int argc, char* argv[]) {\n    char fn1[] = &quot;file1&quot;, fn2[] = &quot;file2&quot;;\n    int fd1, fd2;\n\n    \/\/ Open a file _before_ enabling seccomp (still allowed)\n    printf(&quot;Opening &#039;%s&#039;\u2026\\n&quot;, fn1);\n    fd1 = open(fn1, O_RDONLY);\n    printf(&quot;\u2026 done!\\n&quot;);\n\n    \/\/ Enter strict seccomp mode\n    prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);\n\n    \/\/ Try to open another file _after_ enabling strict seccomp mode\n    \/\/ -&gt; open() system call is prohibited, binary killed by Kernel\n    printf(&quot;Opening &#039;%s&#039;\u2026\\n&quot;, fn2);\n    fd2 = open(fn2, O_RDONLY);\n    printf(&quot;\u2026 done!\\n&quot;);\n\n    close(fd1);\n    close(fd2);\n    return 0;\n}\n<\/code><\/pre>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ .\/seccomp\nOpening &#039;file1&#039;\u2026\n\u2026 done!\nOpening &#039;file2&#039;\u2026\nKilled\n<\/code><\/pre>\n<h3>Capabilities<\/h3>\n<p>Capabilities determine, in more or less fine-grained detail, which privileged operations (traditionally reserved for the <code class=\"\" data-line=\"\">root<\/code> user) a process is allowed to perform. 
As of Kernel v5.8, the list of capabilities includes, among others:<\/p>\n<ul>\n<li><code class=\"\" data-line=\"\">CAP_NET_BIND_SERVICE<\/code> \u2014 Allow binding to TCP\/UDP ports below 1024<\/li>\n<li><code class=\"\" data-line=\"\">CAP_CHOWN<\/code> \u2014 Allow the use of the <code class=\"\" data-line=\"\">chown()<\/code> system call to change file owner and group<\/li>\n<li><code class=\"\" data-line=\"\">CAP_SYS_CHROOT<\/code> \u2014 Allow the use of the <code class=\"\" data-line=\"\">chroot()<\/code> system call<\/li>\n<li><code class=\"\" data-line=\"\">CAP_SYS_PTRACE<\/code> \u2014 Allow the use of <code class=\"\" data-line=\"\">ptrace()<\/code> on any process<\/li>\n<li><code class=\"\" data-line=\"\">CAP_NET_BROADCAST<\/code> \u2014 Allow broadcasting and listening to multicast<\/li>\n<li><code class=\"\" data-line=\"\">CAP_NET_RAW<\/code> \u2014 Allow the use of RAW sockets<\/li>\n<li><code class=\"\" data-line=\"\">CAP_SYS_BOOT<\/code> \u2014 Allow the use of the <code class=\"\" data-line=\"\">reboot()<\/code> system call to reboot the system<\/li>\n<li>\u2026<\/li>\n<\/ul>\n<p>Capabilities in action:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\"># By default, binding to a port below 1024 requires root permissions\n# (or, strictly speaking, the &quot;CAP_NET_BIND_SERVICE&quot; capability which\n# the &quot;root&quot; user has)\n$ netcat -l -p 81\nCan&#039;t grab 0.0.0.0:81 with bind : Permission denied\n\n# Use &quot;getcap&quot; to list all capabilities attached to a binary\n$ sudo getcap $(readlink -f $(which netcat))\n(empty) # i.e. 
&quot;no additional capabilities&quot;\n\n# Use &quot;setcap&quot; to add the &quot;CAP_NET_BIND_SERVICE&quot; capability to &quot;netcat&quot;\n$ sudo setcap cap_net_bind_service=+ep $(readlink -f $(which netcat))\n\n# The &quot;netcat&quot; binary now has the &quot;CAP_NET_BIND_SERVICE&quot; capability\n$ sudo getcap $(readlink -f $(which netcat))\n\/usr\/bin\/nc.traditional = cap_net_bind_service+ep\n\n# \u2026 this means it is allowed to bind to ports below 1024 even when\n# launched without root permissions!\n$ netcat -l -p 81 &amp;\n\n$ sudo netstat -tulpen | grep 81\ntcp    0    0 0.0.0.0:81    0.0.0.0:*    LISTEN    1002    39008    2284\/netcat\n\n# \u2026 yay! &quot;netcat&quot; was able to bind to port 81 without the need to\n# run it with root privileges by a root user or the use of setuid!\n<\/code><\/pre>\n<h3>LSMs \u2014 Linux Security Modules<\/h3>\n<p>In order to better understand the concept of Linux Security Modules (<em>LSM<\/em>s), we first need to talk about Linux Kernel Modules.<\/p>\n<p>Linux basically is a large application written in C which, after being compiled, can no longer be extended with new functionality. That&#8217;s where Linux Kernel Modules (<em>LKM<\/em>s) come into play. 
Through special interfaces provided by Linux, LKMs make it possible to extend the Linux Kernel at <em>runtime<\/em>.<\/p>\n<p>A simple Linux Kernel Module might look as follows:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"language-c\" data-line=\"\">#include &lt;linux\/init.h&gt;\n#include &lt;linux\/kernel.h&gt;\n#include &lt;linux\/module.h&gt;\n\n\/\/ Provide information about our Kernel module\nMODULE_LICENSE(&quot;AGPL&quot;);\nMODULE_AUTHOR(&quot;HdM Stuttgart&quot;);\nMODULE_DESCRIPTION(&quot;Hello World Linux module.&quot;);\nMODULE_VERSION(&quot;1.0.0&quot;);\n\n\/\/ lkm_init is called when the module is _loaded_\nstatic int __init lkm_init(void) {\n    printk(KERN_INFO &quot;Hello, World!\\n&quot;);\n    return 0;\n}\n\n\/\/ lkm_exit is called when the module is _unloaded_\nstatic void __exit lkm_exit(void) {\n    printk(KERN_INFO &quot;Goodbye, World!\\n&quot;);\n}\n\n\/\/ Linux provides hooks for all kinds of internal\n\/\/ events. Amongst so many others, this includes:\n\/\/ - Call function when a Kernel module is loaded\n\/\/ - Call function when a Kernel module is unloaded\n\/\/ - Call function before a file is opened\n\/\/ - Call function before a file is written to\n\/\/ - etc.\n\nmodule_init(lkm_init);\nmodule_exit(lkm_exit);\n<\/code><\/pre>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">MODULE_NAME := lkm_helloworld\n\nobj-m += $(MODULE_NAME).o\n\n.PHONY: all\nall: build\n\n.PHONY: build\nbuild:\n\tmake -C \/lib\/modules\/$(shell uname -r)\/build M=&quot;$(PWD)&quot; modules\n\n.PHONY: load\nload:\n\tsudo insmod $(MODULE_NAME).ko\n\n.PHONY: unload\nunload:\n\tsudo rmmod $(MODULE_NAME).ko\n<\/code><\/pre>\n<p>Running <code class=\"\" data-line=\"\">make load<\/code> and <code class=\"\" data-line=\"\">make unload<\/code> adds the following output to the Kernel ring buffer (visible via the <code class=\"\" data-line=\"\">dmesg<\/code> command):<\/p>\n<p><img 
decoding=\"async\" src=\"\/wp-content\/uploads\/2020\/09\/lkm.png\" alt=\"Linux Kernel Module\"><\/p>\n<p>That&#8217;s it for our short excursion into Linux Kernel Modules (LKMs).<\/p>\n<p>Linux Security Modules (LSMs) are very similar to LKMs, although they can&#8217;t be loaded and unloaded at runtime (what&#8217;s the point of a <em>Security<\/em> module after all if it can simply be unloaded by a malicious program?).<\/p>\n<p>LSMs extend the Linux Kernel with additional security features whose use is optional. LSMs can be enabled and disabled through the bootloader configuration (for GRUB this is as easy as adding <code class=\"\" data-line=\"\">apparmor=1 security=apparmor<\/code> to the <code class=\"\" data-line=\"\">GRUB_CMDLINE_LINUX_DEFAULT<\/code> config option to enable the <em>AppArmor<\/em> LSM which we will get to know later on).<\/p>\n<p>Several LSMs are already included in the Linux Kernel source tree, some of them even enabled by default:<\/p>\n<ul>\n<li>Yama\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/Yama.html\">Yama<\/a> restricts the usage of <code class=\"\" data-line=\"\">ptrace()<\/code><\/li>\n<\/ul>\n<\/li>\n<li>LoadPin\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/LoadPin.html\">LoadPin<\/a> ensures all kernel-loaded files (modules, firmware, etc.) 
originate from the same file system and not some external one<\/li>\n<\/ul>\n<\/li>\n<li>SafeSetID\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/SafeSetID.html\">SafeSetID<\/a> restricts UID\/GID transitions of processes via a system-wide whitelist.<\/li>\n<li>Example: Using SafeSetID, one can specify the following:\n<ul>\n<li>&#8220;User1 may start a process as User2&#8221;<\/li>\n<li>&#8220;User1 may NOT start a process as User3&#8221;<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>SELinux\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/SELinux.html\">SELinux<\/a> implements fine-grained Mandatory Access Control (MAC). To achieve MAC, it labels objects and uses a policy to decide <em>who<\/em> is allowed to do <em>what<\/em> (e.g. <em>User1<\/em> is allowed to modify the <em>system timezone<\/em>)<\/li>\n<li>Initially developed by the NSA in 2000 in the form of Kernel patches. It only later became an LSM and even part of the Linux source tree<\/li>\n<li>Enabled by default on Android since v4.3<\/li>\n<li>Comes with a GUI which is more or less user-friendly<\/li>\n<\/ul>\n<\/li>\n<li>Smack\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/Smack.html\">Smack<\/a> is similar to SELinux but much easier to use<\/li>\n<\/ul>\n<\/li>\n<li>AppArmor\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/apparmor.html\">AppArmor<\/a> implements a MAC for confining applications. 
Unlike SELinux, it uses no object labeling; instead, the security policy is applied to pathnames<\/li>\n<li>Enabled by default in Debian 10 (Buster)<\/li>\n<li>Explained in more detail <a href=\"#apparmor\">further below<\/a><\/li>\n<\/ul>\n<\/li>\n<li>TOMOYO\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/tomoyo.html\">TOMOYO<\/a> is similar to AppArmor but \u201cdomains\u201d (trees of process invocation) are targeted instead of pathnames<\/li>\n<li>Example: Using TOMOYO, one can specify the following:\n<ul>\n<li>The call chain <code class=\"\" data-line=\"\">boot -&gt; init -&gt; sh -&gt; ping<\/code> is allowed<\/li>\n<li>The call chain <code class=\"\" data-line=\"\">boot -&gt; init -&gt; sh -&gt; bash -&gt; ping<\/code> is <strong>not<\/strong> allowed<\/li>\n<li>Therefore, <code class=\"\" data-line=\"\">bash<\/code> is not allowed to launch the <code class=\"\" data-line=\"\">ping<\/code> binary<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h4>AppArmor<\/h4>\n<p>AppArmor is one of the better-known Linux Security Modules and has been part of Linux since 2009. It is similar to SELinux in that it implements Mandatory Access Control, but it identifies objects (files) based on their <em>path<\/em> instead of their <em>inode<\/em>, which makes configuration profiles easier to create.<\/p>\n<p>AppArmor comes with three modes of operation:<\/p>\n<ul>\n<li><em>audit<\/em> mode \u2014 Verification mode\n<ul>\n<li>log all actions<\/li>\n<\/ul>\n<\/li>\n<li><em>complain<\/em> mode \u2014 Learning mode\n<ul>\n<li>log but do not block restricted actions<\/li>\n<\/ul>\n<\/li>\n<li><em>enforce<\/em> mode \u2014 Enforcement mode\n<ul>\n<li>log and block restricted actions<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>AppArmor relies on configuration profiles to limit the actions certain binaries are allowed to perform. 
Such a configuration profile might look as follows:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ cat \/etc\/apparmor.d\/bin.ping\n\n# Include some global settings\n#include &lt;tunables\/global&gt;\n\n# Our profile named &quot;ping&quot; is valid for the following binaries:\n# - \/bin\/ping\n# - \/bin\/iputils-ping\n# - \/usr\/bin\/ping\n# - \/usr\/bin\/iputils-ping\nprofile ping \/{usr\/,}bin\/{,iputils-}ping flags=(complain) {\n  #include &lt;abstractions\/base&gt;\n  #include &lt;abstractions\/consoles&gt;\n  #include &lt;abstractions\/nameservice&gt;\n\n  # Allow &quot;ping&quot; binaries to use RAW IPv4 and IPv6 network sockets\n  capability net_raw,\n  capability setuid,\n  network inet raw,\n  network inet6 raw,\n\n  # Allow &quot;ping&quot; binaries to read the following paths\n  # (this is their own path; binaries must be able to read themselves)\n  \/{,usr\/}bin\/{,iputils-}ping mixr,\n  # The following file must apparently be read by &quot;ping&quot; as well\n  \/etc\/modules.conf r,\n}\n<\/code><\/pre>\n<p>By default, AppArmor ships with a variety of profiles for all kinds of applications and makes it easy to create profiles for other applications as well. 
Let&#8217;s walk through the process.<\/p>\n<p>Say we have a small app (<code class=\"\" data-line=\"\">apparmor-demo<\/code>) which tries to open the file <code class=\"\" data-line=\"\">\/bin\/ping<\/code> for reading and prints either <code class=\"\" data-line=\"\">success<\/code> or <code class=\"\" data-line=\"\">fail<\/code> depending on whether it succeeded:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"language-c\" data-line=\"\">#include &lt;stdio.h&gt;\n#include &lt;fcntl.h&gt;\n#include &lt;unistd.h&gt;\n\nint main(int argc, char* argv[]) {\n    char fn1[] = &quot;\/bin\/ping&quot;;\n    int fd1;\n\n    printf(&quot;Opening &#039;%s&#039;\u2026\\n&quot;, fn1);\n    \/\/ open() returns -1 on failure; 0 is a valid file descriptor\n    if ((fd1 = open(fn1, O_RDONLY)) &gt;= 0) {\n        printf(&quot;\u2026 success!\\n&quot;);\n        \/\/ Only close the descriptor if it was actually opened\n        close(fd1);\n    } else {\n        printf(&quot;\u2026 fail!\\n&quot;);\n    }\n\n    return 0;\n}\n<\/code><\/pre>\n<p>Let&#8217;s try to launch the app:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ .\/apparmor-demo\nOpening &#039;\/bin\/ping&#039;\u2026\n\u2026 success!\n<\/code><\/pre>\n<p>Obviously this succeeds.<\/p>\n<p>Now we want to create an AppArmor profile for our binary to explicitly grant read-access to that file.<\/p>\n<p>First, create a new empty profile:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ aa-easyprof $PWD\/apparmor-demo &gt; \/etc\/apparmor.d\/usr.bin.apparmor-demo\n\n# This empty profile was created by the above command\n$ cat \/etc\/apparmor.d\/usr.bin.apparmor-demo\n#include &lt;tunables\/global&gt;\n&quot;\/root\/apparmor-demo&quot; {\n  #include &lt;abstractions\/base&gt;\n}\n<\/code><\/pre>\n<p>Enable the new profile using <code class=\"\" data-line=\"\">apparmor_parser<\/code>:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ apparmor_parser -r 
\/etc\/apparmor.d\/usr.bin.apparmor-demo\n\n$ aa-status\napparmor module is loaded.\n22 profiles are loaded.\n6 profiles are in enforce mode.\n   \/root\/apparmor-demo\n   \u2026\n\n# Our new profile is in enforce mode!\n<\/code><\/pre>\n<p>And try to launch our binary again:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ .\/apparmor-demo\nOpening &#039;\/bin\/ping&#039;\u2026\n\u2026 fail!\n<\/code><\/pre>\n<p>This time it failed to read <code class=\"\" data-line=\"\">\/bin\/ping<\/code> because AppArmor works with an <em>allow list<\/em> instead of a <em>disallow list<\/em>: every allowed action must be specified explicitly, and anything the profile does <em>not<\/em> mention is denied.<\/p>\n<p>As we&#8217;re too lazy to edit the profile ourselves, let&#8217;s make use of some of AppArmor&#8217;s handy utilities which allow us to update profiles interactively.<\/p>\n<p>We first need to set our profile to <code class=\"\" data-line=\"\">complain<\/code> mode so all actions are allowed and logged:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ aa-complain apparmor-demo\n<\/code><\/pre>\n<p>Running our app succeeds again and produces the expected log entries:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ .\/apparmor-demo\nOpening &#039;\/bin\/ping&#039;\u2026\n\u2026 success!\n\n$ tail -n1 \/var\/log\/syslog\nJun 21 22:06:06 buster kernel: [24034.481637] audit: type=1400 audit(1592777166.717:47): apparmor=&quot;ALLOWED&quot; operation=&quot;open&quot; profile=&quot;\/root\/apparmor-demo&quot; name=&quot;\/usr\/bin\/ping&quot; pid=2985 comm=&quot;apparmor-demo&quot; requested_mask=&quot;r&quot; denied_mask=&quot;r&quot; fsuid=0 ouid=0\n<\/code><\/pre>\n<p><code class=\"\" data-line=\"\">aa-logprof<\/code> can be used to read those log messages and create an 
entry in the profile to either allow or deny the action:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ aa-logprof\nReading log entries from \/var\/log\/syslog.\nUpdating AppArmor profiles in \/etc\/apparmor.d.\nComplain-mode changes:\n\nProfile: \/root\/apparmor-demo\nPath: \/usr\/bin\/ping\nNew Mode: owner r\nSeverity: unknown\n\n[1 - owner \/usr\/bin\/ping r,]\n(A)llow \/ [(D)eny] \/ (I)gnore \/ (G)lob \/ Glob with (E)xtension \/ \u2026\n&gt; A # We allow the action\n\nAdding owner \/usr\/bin\/ping r, to profile.\nEnforce-mode changes:\n= Changed Local Profiles =\n\nThe following local profiles were changed. Would you like to save them?\n\n[1 - \/root\/apparmor-demo]\n(S)ave Changes \/ Save Selec(t)ed Profile \/ [(V)iew Changes] \/ View Changes b\/w (C)lean profiles \/ Abo(r)t\n&gt; S # We save our changes to the profile\n\nWriting updated profile for \/root\/apparmor-demo.\n<\/code><\/pre>\n<p>And indeed, our profile was updated!<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ cat \/etc\/apparmor.d\/usr.bin.apparmor-demo\n#include &lt;tunables\/global&gt;\n&quot;\/root\/apparmor-demo&quot; {\n  #include &lt;abstractions\/base&gt;\n\n  owner \/{usr\/,}bin\/ping r,\n}\n<\/code><\/pre>\n<p>Now that we explicitly allow read access to <code class=\"\" data-line=\"\">\/bin\/ping<\/code>, let&#8217;s put our application in enforce mode again:<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ aa-enforce apparmor-demo\n<\/code><\/pre>\n<p>Our app now works even with an active AppArmor profile!<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"\" data-line=\"\">$ .\/apparmor-demo\nOpening &#039;\/bin\/ping&#039;\u2026\n\u2026 success!\n\n# Nice!\n<\/code><\/pre>\n<p>LSMs such as AppArmor might seem incredibly complicated to implement, but they are actually quite 
simple:<\/p>\n<p>In its <code class=\"\" data-line=\"\">init<\/code> function, <a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/security\/apparmor\/lsm.c#L1172-L1260\">AppArmor registers<\/a> handlers for all kinds of Kernel hooks such as <a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/security\/apparmor\/lsm.c#L1194\"><code class=\"\" data-line=\"\">file_open<\/code><\/a>, which fires before the Linux Kernel opens a file.<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"language-c\" data-line=\"\">\/\/ \u2026\nLSM_HOOK_INIT(file_open, apparmor_file_open),\n\/\/ \u2026\n<\/code><\/pre>\n<p><span style=\"font-size: 0.7em\">(Source: <a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/security\/apparmor\/lsm.c#L1194\">linux-1b50440210\/security\/apparmor\/lsm.c#L1194<\/a>)<\/span><\/p>\n<p>AppArmor&#8217;s <a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/security\/apparmor\/lsm.c#L402-L434\"><code class=\"\" data-line=\"\">apparmor_file_open<\/code><\/a> function then decides, based on the relevant AppArmor profile, whether the call should be aborted (if the action is not allowed) or whether the Kernel may continue to open the file.<\/p>\n<pre style=\"background: #f9f9f9;padding: 13px;font-size: 0.7em\"><code class=\"language-c\" data-line=\"\">\/\/ \u2026\nerror = aa_path_perm(OP_OPEN, label, &amp;file-&gt;f_path, 0,\n                     aa_map_file_to_perms(file), &amp;cond);\n\/\/ \u2026\n<\/code><\/pre>\n<h2>Where to go from here?<\/h2>\n<p>This was our introduction to some of Linux&#8217;s security features. And here is the thing: these are pretty much all the security features needed to make containers (as used by Docker et al.) possible and <em>secure<\/em> enough to even run untrusted code. 
You should now be able to implement the <a href=\"https:\/\/github.com\/p8952\/bocker\/blob\/000633061c6cfcb99c8d9eef5aa483a44318f3e6\/bocker#L61-L90\">core of Docker<\/a> yourself!<\/p>\n<p>All of these security features are useful, yet enabling them requires <em>manual work<\/em> by either the application developer or an (experienced) computer user. For server applications set up by a system administrator, this knowledge probably exists, but for the larger part of the computer user base, the home user, it doesn\u2019t. Security features which are available but not enabled <em>by default<\/em> are pretty useless.<br \/>\nThis brings us to <a href=\"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2020\/09\/11\/behind-the-scenes-of-modern-operating-systems-security-through-isolation-part-2\/\">part two<\/a> of this blog post, which explains more about containers, virtualization and security solutions for the end-user.<\/p>\n<h2>Sources<\/h2>\n<p>All links were last accessed on 2020-09-08.<\/p>\n<ul>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/userspace-api\/seccomp_filter.html\">https:\/\/www.kernel.org\/doc\/html\/latest\/userspace-api\/seccomp_filter.html<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/Documentation\/admin-guide\/cgroup-v2.rst\">https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/Documentation\/admin-guide\/cgroup-v2.rst<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/include\/uapi\/linux\/capability.h\">https:\/\/github.com\/torvalds\/linux\/blob\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/include\/uapi\/linux\/capability.h<\/a><\/li>\n<li><a 
href=\"https:\/\/github.com\/torvalds\/linux\/tree\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/Documentation\/admin-guide\/LSM\">https:\/\/github.com\/torvalds\/linux\/tree\/1b5044021070efa3259f3e9548dc35d1eb6aa844\/Documentation\/admin-guide\/LSM<\/a><\/li>\n<li><a href=\"https:\/\/man7.org\/linux\/man-pages\/man7\/capabilities.7.html\">https:\/\/man7.org\/linux\/man-pages\/man7\/capabilities.7.html<\/a><\/li>\n<li><a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/seccomp.2.html\">https:\/\/man7.org\/linux\/man-pages\/man2\/seccomp.2.html<\/a><\/li>\n<li><a href=\"https:\/\/man7.org\/linux\/man-pages\/man2\/syscalls.2.html\">https:\/\/man7.org\/linux\/man-pages\/man2\/syscalls.2.html<\/a><\/li>\n<li><a href=\"https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/index.html\">https:\/\/www.kernel.org\/doc\/html\/latest\/admin-guide\/LSM\/index.html<\/a><\/li>\n<li><a href=\"https:\/\/www.linux.com\/training-tutorials\/overview-linux-kernel-security-features\/\">https:\/\/www.linux.com\/training-tutorials\/overview-linux-kernel-security-features\/<\/a><\/li>\n<li><a href=\"https:\/\/ajxchapman.github.io\/linux\/2016\/08\/31\/seccomp-and-seccomp-bpf.html\">https:\/\/ajxchapman.github.io\/linux\/2016\/08\/31\/seccomp-and-seccomp-bpf.html<\/a><\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Cgroups\">https:\/\/en.wikipedia.org\/wiki\/Cgroups<\/a><\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Security-Enhanced_Linux\">https:\/\/en.wikipedia.org\/wiki\/Security-Enhanced_Linux<\/a><\/li>\n<li><a href=\"https:\/\/gitlab.com\/apparmor\/apparmor\/-\/wikis\/AppArmor_Failures\">https:\/\/gitlab.com\/apparmor\/apparmor\/-\/wikis\/AppArmor_Failures<\/a><\/li>\n<li><a href=\"https:\/\/debian-handbook.info\/browse\/en-US\/stable\/sect.apparmor.html\">https:\/\/debian-handbook.info\/browse\/en-US\/stable\/sect.apparmor.html<\/a><\/li>\n<li><a href=\"https:\/\/wiki.ubuntuusers.de\/AppArmor\/\">https:\/\/wiki.ubuntuusers.de\/AppArmor\/<\/a><\/li>\n<li><a 
href=\"https:\/\/wiki.archlinux.org\/index.php\/Chroot\">https:\/\/wiki.archlinux.org\/index.php\/Chroot<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/moby\/moby\/blob\/master\/profiles\/seccomp\/default.json\">https:\/\/github.com\/moby\/moby\/blob\/master\/profiles\/seccomp\/default.json<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In recent years, since the Internet has become available to almost anyone, application and runtime security is important more than ever. Be it an (unknown) application you download and run from the Internet or some server application you expose to the Internet, it&#8217;s almost certainly a bad idea to run apps without any security restrictions [&hellip;]<\/p>\n","protected":false},"author":961,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1,26],"tags":[61,3,4,374,27],"ppma_author":[808],"class_list":["post-10943","post","type-post","status-publish","format-standard","hentry","category-allgemein","category-secure-systems","tag-containers","tag-docker","tag-linux","tag-malware","tag-security"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":6859,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2019\/08\/02\/mobile-security-how-secure-are-our-daily-used-devices\/","url_meta":{"origin":10943,"position":0},"title":"Mobile Security &#8211; How secure are our daily used devices?","author":"Johannes Mauthe","date":"2. August 2019","format":false,"excerpt":"Nowadays, the usage of mobile devices has become a part of our everyday life. A lot of sensitive and personal data is stored on these devices, which makes them more attractive targets for attackers. 
Also, many companies offer the possibility to work remotely, which results in storing confidential business information\u2026","rel":"","context":"In &quot;Allgemein&quot;","block_context":{"text":"Allgemein","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/allgemein\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2019\/08\/Bildschirmfoto-2019-08-01-um-13.46.01-1.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2019\/08\/Bildschirmfoto-2019-08-01-um-13.46.01-1.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2019\/08\/Bildschirmfoto-2019-08-01-um-13.46.01-1.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":3084,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2017\/09\/05\/cloud-security-part-2-the-vulnerabilities-and-threats-of-the-cloud-current-scientific-work-on-cloud-security-conclusion-and-outlook\/","url_meta":{"origin":10943,"position":1},"title":"Cloud Security \u2013 Part 2: The vulnerabilities and threats of the cloud, current scientific work on cloud security, conclusion and outlook","author":"Andreas Fliehr","date":"5. September 2017","format":false,"excerpt":"The second of two blog posts about cloud security. 
This post covers the vulnerabilities and threats of the cloud, the current scientific work on cloud security and a conclusion and an outlook.","rel":"","context":"In &quot;Cloud Technologies&quot;","block_context":{"text":"Cloud Technologies","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/scalable-systems\/cloud-technologies\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2017\/09\/Structure-of-Nexen.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2017\/09\/Structure-of-Nexen.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2017\/09\/Structure-of-Nexen.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":28714,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2026\/02\/26\/taking-control-of-dns-over-https\/","url_meta":{"origin":10943,"position":2},"title":"Taking Control of DNS over HTTPS","author":"rh080","date":"26. February 2026","format":false,"excerpt":"For decades, enterprise security relied on a simple truth: if you control Port 53, you can see where your users are going. Every DNS query left the network in plaintext, straightforward to log, filter, and block. 
DNS over HTTPS (DoH), standardized in RFC 8484 [2], broke that model by wrapping\u2026","rel":"","context":"In &quot;Allgemein&quot;","block_context":{"text":"Allgemein","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/allgemein\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2026\/02\/dns-traditional.png?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":23067,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2022\/03\/15\/security-strategies-and-best-practices-for-microservices-architecture\/","url_meta":{"origin":10943,"position":3},"title":"Security Strategies and Best Practices for Microservices Architecture","author":"Larissa Schmauss","date":"15. March 2022","format":false,"excerpt":"Microservices architectures seem to be the new trend in the approach to application development. 
However, one should always keep in mind that microservices architectures are always closely associated with a specific environment:\u00a0Companies want to develop faster and faster, but resources are also becoming more limited, so they now want to\u2026","rel":"","context":"In &quot;Scalable Systems&quot;","block_context":{"text":"Scalable Systems","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/scalable-systems\/"},"img":{"alt_text":"","src":"https:\/\/lh6.googleusercontent.com\/LbFspPRY1BxRBdAVjQwWXeJ6UOoxl6JWsRYrxboF5ObXlNNgy3uZikcGkc3cgzI0mr_ZlbWPxvdp0FoJC1k-odh7mRc2lCPXaMSq8TudjfoZ7e5HKstaMHmLpH319jCym6vQRo1a","width":350,"height":200,"srcset":"https:\/\/lh6.googleusercontent.com\/LbFspPRY1BxRBdAVjQwWXeJ6UOoxl6JWsRYrxboF5ObXlNNgy3uZikcGkc3cgzI0mr_ZlbWPxvdp0FoJC1k-odh7mRc2lCPXaMSq8TudjfoZ7e5HKstaMHmLpH319jCym6vQRo1a 1x, https:\/\/lh6.googleusercontent.com\/LbFspPRY1BxRBdAVjQwWXeJ6UOoxl6JWsRYrxboF5ObXlNNgy3uZikcGkc3cgzI0mr_ZlbWPxvdp0FoJC1k-odh7mRc2lCPXaMSq8TudjfoZ7e5HKstaMHmLpH319jCym6vQRo1a 1.5x"},"classes":[]},{"id":27440,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2025\/02\/27\/how-i-improved-the-network-security-of-my-live-chat-application-architecture-on-aws\/","url_meta":{"origin":10943,"position":4},"title":"How I Improved the Network Security of My Live Chat Application Architecture on AWS","author":"Jannik Scheider","date":"27. February 2025","format":false,"excerpt":"In an increasingly connected world, the need for robust security measures for cloud infrastructures is constantly growing. Applications that are accessible over the internet must be secured in a way that prevents unnecessary exposure of sensitive backend components. 
A fully public Virtual Private Cloud (VPC) architecture may be sufficient for\u2026","rel":"","context":"In &quot;Cloud Technologies&quot;","block_context":{"text":"Cloud Technologies","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/scalable-systems\/cloud-technologies\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3864,"url":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/2018\/08\/07\/server-less-computing-vs-security\/","url_meta":{"origin":10943,"position":5},"title":"Server \u201cless\u201d Computing vs. Security","author":"Merve Uzun","date":"7. August 2018","format":false,"excerpt":"Summary about Serverless Computing with Security aspects.","rel":"","context":"In &quot;Allgemein&quot;","block_context":{"text":"Allgemein","link":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/category\/allgemein\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.mi.hdm-stuttgart.de\/wp-content\/uploads\/2018\/08\/Funktionsweise-300x98.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]}],"jetpack_sharing_enabled":true,"authors":[{"term_id":808,"user_id":961,"is_guest":0,"slug":"lk163","display_name":"Leon 
Klingele","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/1f0b9e6e47bd4b8d164510c4e7cdcdd346a8dc16f447bac78cbc44ce876d4d72?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/posts\/10943","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/users\/961"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/comments?post=10943"}],"version-history":[{"count":94,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/posts\/10943\/revisions"}],"predecessor-version":[{"id":11176,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/posts\/10943\/revisions\/11176"}],"wp:attachment":[{"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/media?parent=10943"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/categories?post=10943"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/tags?post=10943"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/blog.mi.hdm-stuttgart.de\/index.php\/wp-json\/wp\/v2\/ppma_author?post=10943"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}