WIP: describe stack handling with pictures
This commit is contained in: parent 28dd1fe2c2, commit b4f23fdd2f
8 changed files with 815 additions and 366 deletions
.gitignore (vendored)
@@ -25,3 +25,4 @@
 /src/docs/thesis.pdf
 /docs/
+_minted*/

BIN src/docs/gfx/call-procedure-memory-content.png (new binary file, 124 KiB; not shown)
@@ -4,7 +4,7 @@
name = {Rust},
long = {the Rust programming language},
description = {
    Statically typed programming language that uses a new concept of variable ownership and reference tracking. Largely explained in \cref{context::rust}.
},
first = {\glsentrylong{Rust}}
}

@@ -309,9 +309,9 @@
\newglossaryentry{CWE-119}{
    name = CWE-119,
    long = {CWE-119: \glsentrydesc{CWE-119}},
    description = {Improper Restriction of Operations within the Bounds of a Memory Buffer},
    short = {buffer error},
    first = {\glsentrylong{CWE-119}\cite{MITRE-CWE-119}}
}

\newglossaryentry{C}{

@@ -5,9 +5,9 @@
This thesis studies the feasibility of using compile-time code analysis, as found in \gls{Rust}'s \gls{compiler}, for ensuring memory-safety within an \gls{OS} kernel.
This study could be applied to all \glspl{app}, but the focus is on the implementation of \glspl{OS}, the \gls{app} that is responsible for managing the system's resources and for providing abstractions for all other \glspl{app}.
For this purpose the \gls{OS} is the only \gls{app} that requires unrestricted access to these resources, with the responsibility of managing them safely according to the rules that are either hard-coded or set up by the \gls{sysadmin}.

The increasing number of vulnerabilities based on memory-safety issues in \glspl{app}, as presented in \cref{context::common-mem-safety-mistakes::cwe::statistics}, is a major motivator for working on this topic.

\section{Motivational Hypothesis}
% Primary Research Questions

@@ -19,12 +19,12 @@
%TODO: mention electrolyte, formal verification for Rust

According to my best-effort literature research in Q1/2017, the hypothesis that \textit{Rust's static code analysis can guarantee memory safety in the \gls{OS}} has not been studied explicitly.
This is to my surprise, because as explained in \cref{context::introduction::memory-safety}, memory-safety in \gls{OS} development is critical, and \gls{Rust} offers attractive features that might bring improvements, which is covered in \cref{context::rust}.
The hypothesis cannot be trivially confirmed or refuted, which drives the research efforts for my final thesis project.

Besides this specific hypothesis, many implementations of \glspl{OS} in \gls{Rust} have appeared in public.
Their purposes range from proof-of-concept and educational work like \gls{imezzos} and \gls{blogos}, to implementations that aim to be production grade software like \gls{redoxos} and \gls{tockos} \cite{Levy2015a}.
These implementations are subject to evaluation in \cref{rnd::existing-os-in-rust}.

The final results will be of qualitative nature, captured by analyzing existing and self-developed \gls{Rust} implementations of popular memory management techniques.
In addition to the sole analysis of \gls{Rust} implementations, comparisons will be made, discerning the level of memory-safety guarantees gained over implementations in \gls{C} with similar intent.

@@ -103,22 +103,32 @@
The goal of this thesis is to find out if the \gls{Rust} \gls{compiler} is able to mitigate this specific problem.

\chapter{OS Development Concepts}
This chapter explains concepts used in \gls{OS} development today, and is a direct preparation for the upcoming \cref{context::common-mem-safety-mistakes}, which explains specific weaknesses that result from memory-management mistakes made in the attempt to implement these concepts.
Since the \gls{OS} manages the system's hardware directly, some of the implementation and design choices depend on the underlying hardware architecture.
For a full understanding, the hardware implications are also outlined in this document.
To bound the extent of this and the following chapters, the explanations are limited to one contemporary architecture, \gls{amd64}, and further narrowed down by focusing on the operation in 64-Bit Long Mode\cite[p.~18]{AMD64Vol2}.

\section{Resource Management by Virtualization}
Resource management in \gls{OS} development is different from that in generic \gls{app} development.
The \gls{OS} - typically the lowest software layer - must know the very details of the system's hardware and perform raw access to it.

\subsection{Layers}
The \gls{OS} creates a virtualization\footnote{The term \textit{virtualization} in \gls{OS} jargon can be understood as abstraction.} layer on top of architecture-specific code and abstracts it in the form of an internal \gls{api}.
This layer abstracts at least the \gls{CPU} and memory\cite{Arpaci-Dusseau2015}.
Higher-level, complex management algorithms can then be implemented hardware-independently on top of this \gls{api}, making them reusable across different architectures.
The \gls{OS} then provides an \gls{api} through which \glspl{app} can request access to these virtualized resources.
This allows \gls{app} developers to develop and run different programs easily and presumably safely on the \gls{OS}, agnostic of the architecture.
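To make the layering tangible, the following sketch shows how such an internal \gls{api} could be expressed in \gls{Rust}; the trait, its methods and the error type are purely hypothetical names and not taken from any existing \gls{OS}:
\begin{minted}{rust}
/// Hypothetical architecture abstraction layer; the names are illustrative only.
pub trait ArchMemory {
    /// Map one virtual page to one physical frame.
    fn map_page(&mut self, virt: usize, phys: usize) -> Result<(), MapError>;
    /// Remove an existing mapping.
    fn unmap_page(&mut self, virt: usize) -> Result<(), MapError>;
}

#[derive(Debug)]
pub enum MapError {
    AlreadyMapped,
    NotMapped,
}

/// Higher-level, architecture-independent code is written against the trait,
/// not against a concrete architecture.
pub fn identity_map_first_mib<A: ArchMemory>(arch: &mut A) -> Result<(), MapError> {
    for page in 0..256 {
        let addr = page * 4096; // 4 KiB pages
        arch.map_page(addr, addr)?;
    }
    Ok(())
}
\end{minted}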

\subsection{Task Models}
TODO shortly describe these and give a reference model
\begin{itemize}
    \item Task
    \item Program
    \item Procedure
    \item Process
    \item Thread
\end{itemize}

\subsection{Resource Specifics}
Virtualization has different technical implications for different resource types, depending on their nature and available count.
To give an example, the \gls{CPU} is not explicitly requested, because any instruction of the program implicitly requires the \gls{CPU} to execute it.

@@ -129,20 +139,20 @@
Activating the 64-Bit long mode on \gls{amd64} makes the system rely primarily on paging memory management, thus the technique of memory segmentation can be neglected in this context.
This section provides information about hardware-supported memory paging and protection techniques.

To improve the efficiency and safety of memory-management, developers of hardware and software have been collaborating to offload virtual memory address lookup and caching from the \gls{OS} software to the hardware, namely the \gls{CPU}'s \gls{MMU}.
A hardware implementation of the lookup algorithm is fast, and allows rudimentary memory-permission runtime-checks to protect pages by leveraging the \gls{CPU}'s security rings\cite[p.~117,~p.~145]{AMD64Vol2}.

\subsection{Virtual Address Translation and Paging}
Paging with virtual addresses is one method of virtualizing memory and in this way transparently sharing the system's memory among running tasks and the \gls{OS} itself, presumably in a safe way.
Even when using a language that supports direct memory addressing, \gls{app} developers don't have to consider paging and address translation in the logic of their programs, because all addresses in their program are virtual and are translated at runtime by the \gls{MMU}.
The translation itself is performed by the \gls{MMU} according to a map called the page table, which is a structure maintained by the \gls{OS} in main memory.
This structure can be stored anywhere in memory, and its address is handed to the \gls{MMU} via a specific \gls{CPU} register, \textit{CR3} on \gls{amd64}.
The \gls{OS} can maintain multiple page-table structures, and can create different virtual address spaces by changing the \gls{MMU}'s page-table pointer - the \textit{CR3} register.
As mentioned above, the hardware provides caches for repeated lookups of the same virtual addresses.
Controlling the validity of these cache entries is the \gls{OS}'s responsibility.
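As an illustration, a single page-table entry can be modelled as a 64-Bit value whose low bits carry permission flags; the bit positions below follow the \gls{amd64} manual, while the type and method names are hypothetical:
\begin{minted}{rust}
/// Illustrative sketch of an amd64 page-table entry
/// (bit 0: present, bit 1: writable, bits 12-51: frame address).
#[derive(Clone, Copy)]
pub struct PageTableEntry(u64);

impl PageTableEntry {
    const PRESENT: u64 = 1 << 0;
    const WRITABLE: u64 = 1 << 1;
    const ADDR_MASK: u64 = 0x000F_FFFF_FFFF_F000;

    pub fn new(frame_addr: u64, writable: bool) -> Self {
        let mut bits = (frame_addr & Self::ADDR_MASK) | Self::PRESENT;
        if writable {
            bits |= Self::WRITABLE;
        }
        PageTableEntry(bits)
    }

    pub fn is_present(&self) -> bool {
        self.0 & Self::PRESENT != 0
    }

    pub fn frame_addr(&self) -> u64 {
        self.0 & Self::ADDR_MASK
    }
}
\end{minted}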

To avoid the need for storing a translation mapping for every possible address, mappings are grouped into fixed-size pieces, the \textit{page}s.
This works by encoding the offset within the page in the virtual address, together with the page's index in the page table.
The offset size depends on the chosen page-size, and can be calculated with the following formula, given the page-size in bytes as $p$:
\begin{equation}
\textrm{offset\_bits}(p) = \log_2(p), \qquad p = 2^n,\; n \in \mathbb{N}
@@ -153,126 +163,142 @@
If an instruction uses a virtual address that indexes a page which is not present in memory, the \gls{CPU} will generate a page-fault exception to give control back to the \gls{OS}.
The \gls{OS} must then react accordingly, e.g. by finding free physical memory and mapping it to the page by modifying the page's page-table entry.

\subsubsection{Swapping}
The finite primary memory can only hold a finite number of virtual pages, and the \gls{OS} is responsible for having the required pages present.
Besides the pages that contain the page-table itself, the pages that aren't required by the current instruction might be moved to secondary memory.
Swapping pages in and out of primary memory is risky as it requires transferring large amounts of raw memory content, but these safety analyses exceed the scope of this thesis.

\section{Multi-Level Paging}
\label{context::introduction::hw-supported-mm::multilevel-paging}
If only one page-table per virtual address space was used that consists of $2^{52}$ page-table entries, which must at minimum store the physical address, it would require $\frac{52 \cdot 2^{52}\,\textrm{Bit}}{8 \cdot 1024^{4}\,\textrm{Bit/TiB}} = 26624$ TiB of memory for each virtual address space.
Even if only a handful of additional pages were allocated and mapped, the \gls{OS} would still have to allocate this huge page-table.
This vast consumption of main memory is impractical and impossible for average systems, which rarely surpass 100 GiB of main memory.

Therefore most systems use a hierarchy of page tables.
Using a hierarchical translation structure allows saving significant amounts of memory, as not every page-table of every level in the address space has to be allocated and present in main memory.
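To make the address composition concrete, the following \gls{Rust} sketch splits a 64-Bit virtual address into the four 9-Bit table indices and the 12-Bit offset used with 4 KiB pages; the function name and the example address are illustrative assumptions:
\begin{minted}{rust}
/// Split a 64-Bit virtual address into its 4-KiB-page translation components.
/// Bit layout (amd64, 4 KiB pages): 63-48 sign extend, 47-39 PML4 index,
/// 38-30 PDP index, 29-21 PD index, 20-12 PT index, 11-0 page offset.
fn split_virtual_address(vaddr: u64) -> (usize, usize, usize, usize, usize) {
    let pml4 = ((vaddr >> 39) & 0x1FF) as usize;
    let pdp = ((vaddr >> 30) & 0x1FF) as usize;
    let pd = ((vaddr >> 21) & 0x1FF) as usize;
    let pt = ((vaddr >> 12) & 0x1FF) as usize;
    let offset = (vaddr & 0xFFF) as usize;
    (pml4, pdp, pd, pt, offset)
}

fn main() {
    // 0xFFFF_8000_0010_2ABC is an arbitrary canonical example address.
    let (pml4, pdp, pd, pt, offset) = split_virtual_address(0xFFFF_8000_0010_2ABC);
    println!("PML4={} PDP={} PD={} PT={} offset={:#x}", pml4, pdp, pd, pt, offset);
}
\end{minted}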
% TODO picture this
\begin{figure}
    \centering
    % TODO: replace with the final n-level paging graphic:
    % \includegraphics[width=\textwidth]{gfx/TODO-nlevel-paging}
    \begin{tikzpicture}
        \def\x{9ex}
        % memory cells
        \path[draw,font=\small]
            % cells
            (0,0)
            rectangle ++(\x, 1)
            rectangle ++(\x,-1)
            rectangle ++(\x, 1)
            rectangle ++(\x,-1)
            rectangle ++(\x, 1)
            % cell text
            (0.5*\x,0.5)
            node(text-a){idx$_n$} ++(\x,0)
            node(text-b){idx$_{n-1}$} ++(\x,0)
            node(text-c){...} ++(\x,0)
            node(text-d){idx$_1$} ++(\x,0)
            node(text-e){offset$_{page}$}++(\x,0)
            % bit numbers
            (0,1)
            node[anchor=south]{63}
            (5*\x,1)
            node[anchor=south]{0}
        ;
        % braces
        \foreach \y in {1,...,5} {
            \pgfmathparse{\y-1}
            \draw[decorate,decoration={brace,mirror}]
                ($(\pgfmathresult*\x,-1ex)!0.1!(\y*\x,-1ex)$) -- node[shape=coordinate](brace-\y){}
                ($(\pgfmathresult*\x,-1ex)!0.9!(\y*\x,-1ex)$);
        }

        \draw
            % cells
            (0*\x,-1*\x)
            rectangle ++( \x, 0.5*-1)
            rectangle ++(-\x,-1)
            rectangle ++( \x, 0.5*-1)
            (1*\x,-3*\x)
            rectangle ++( \x, 0.5*-1)
            rectangle ++(-\x,-1)
            rectangle ++( \x, 0.5*-1)
            (2*\x,-5*\x)
            rectangle ++( \x, 0.5*-1)
            rectangle ++(-\x,-1)
            rectangle ++( \x, 0.5*-1)
        ;
    \end{tikzpicture}
    \caption{Hierarchical Virtual Paging}
    \label{fig:paging-hierarchy-abstract}
\end{figure}

The details of how this is implemented on \gls{amd64} can be found in \cnameref{rnd::sysprog-conventions::paging-amd64}.

\subsection{Premised Trust In Hardware}
The algorithms that are implemented in hardware can't be verified and need to be trusted to work exactly as the manual describes them.
% TODO: remove this chapter or write something interesting

\subsection{The \textit{Stack} And \textit{Heap} Concept}
\label{context::introduction::hw-supported-mm::stackheap}
In \gls{proglang} and \gls{OS} design and literature, the terms \gls{stack} and \gls{heap} are ubiquitous and assumed to be known.
To avoid ambiguities in the first place, this document refers to \gls{heap} as the memory zone, not the data structure.

From the perspective of developing \glspl{app} and studying \gls{OS} course content, there is still a certain vagueness in the understanding of these concepts.
After researching their origin it is clear that they are mere concepts that might be implemented and used differently in the various \glspl{OS} and \glspl{proglang}.
The hardware manuals \citetitle{AMD64Vol1} and \citetitle{AMD64Vol2} refer to the \gls{stack} but have no mention of the \gls{heap}.
This part focuses on the basic concepts, already limiting the scope to the \gls{amd64} architecture, the \glspl{proglang} \gls{C} and \gls{Rust}, and their usage on either bare-metal or \gls{LX}.
A detailed continuation is found in \cnameref{rnd::mm-conventions}.

\subsubsection{Stack: Hardware-Backed Abstract Type}
\label{context::introduction::hw-supported-mm::stackheap::stack}
In summary, the \gls{stack} is a memory model for structuring contiguous memory.
It grows by adding new data entries on top of each other.
According to the \gls{stack} analogy, only the topmost element can be accessed and removed, thus it behaves like a Last-In-First-Out data structure.
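As a purely illustrative sketch of this Last-In-First-Out behaviour, detached from any particular calling convention or architecture, a \gls{Rust} vector can model the push and pop operations:
\begin{minted}{rust}
fn main() {
    // A Vec models the LIFO behaviour of a stack:
    // push adds on top, pop removes the topmost entry.
    let mut stack: Vec<u64> = Vec::new();
    stack.push(1);
    stack.push(2);
    stack.push(3);

    // Entries come back in reverse order of insertion: 3, 2, 1.
    while let Some(top) = stack.pop() {
        println!("popped {}", top);
    }
}
\end{minted}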

The hardware manuals \citetitle{AMD64Vol1} and \citetitle{AMD64Vol2} have no mention of the word \textit{heap}, but use \textit{stack} hundreds of times, indicating that the \gls{stack} is implemented in hardware to some extent.
The \gls{amd64} manuals conjunctionally describe how the \gls{stack} is used and influenced by various instructions on this architecture.
Here it grows from numerically higher to numerically lower addresses, where the numerically highest address is called the stack bottom, and the current numerically lowest address is the stack top.

The \gls{stack} is allocated per procedure and typically stores only procedure-local data, which is simply forgotten once the procedure has completed.
To achieve memory-safety with regards to \gls{stack} management inside the \gls{OS}, each procedure must only access its own particular \gls{stack}.
Additionally, \glspl{stack} must be prevented from growing into memory regions that might belong to other procedures.
This needs to be considered by \gls{OS} developers when implementing memory-management for multitasking \glspl{OS}, as further investigated in \cref{rnd::existing-os-dev-wity-rust,rnd::imezzos-preemptive-multitasking}.
In 64-Bit long mode \gls{amd64} doesn't consider the stack to be sized, so it is up to the \gls{OS} developer to ensure that it doesn't grow into other foreign memory regions.
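A minimal, purely illustrative \gls{Rust} program that provokes such unbounded \gls{stack} growth is shown below; on a hosted system it is typically terminated once the \gls{stack} grows past its guard region, while on bare metal the \gls{OS} developer has to provide the protection themselves:
\begin{minted}{rust}
// Each call places another large local array on the stack frame,
// so the stack grows until it hits a limit or foreign memory.
fn grow(depth: u64) -> u64 {
    let buffer = [0u8; 4096]; // roughly one page of local data per frame
    // Use the buffer so the compiler does not optimize it away.
    depth + u64::from(buffer[(depth % 4096) as usize]) + grow(depth + 1)
}

fn main() {
    println!("{}", grow(0));
}
\end{minted}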

\subsubsection{Heap: Organized Chaos}
\label{context::introduction::hw-supported-mm::stackheap::heap}
\Gls{heap} is an ambiguous term that names a data structure in more theoretical computer science and a memory zone in system programming.
In this document \gls{heap} refers to the latter.

\subsection{Combining Stack And Heap}
% TODO: figure that shows stack and heap?

\section{Preemptive Multitasking}
The previously explained virtualization is the foundation for the \gls{OS} to perform preemptive multitasking inconspicuously towards the \glspl{app}.
This means that when a task is preempted and continued later, it observes no side-effects other than an elapse of time.
Preemptive multitasking need not be considered during the development of a single-threaded \gls{app}.
Multi-threading and

@@ -283,13 +309,13 @@
In contrast, main memory resources are only limited by their capacity and can otherwise be shared by several programs simultaneously, so that tasks that are not executed by the \gls{CPU} can still have data stored in memory.

The \gls{OS} must ensure that switching tasks is done properly for all resources to prevent interference and unintended behavior.
To ensure memory safety in this scenario, all data in the memory must be protected from unintended access, according to the definition of memory safety in \cref{context::introduction::memory-safety::def}.

\subsection{Context Switching}
When the \gls{OS} preempts a task, it needs to store and preserve the current task's context.
The context consists of all volatile resources that can possibly be overwritten by another task.
This is at minimum a set of \gls{CPU} registers, depending on the specific architecture.
A description for \gls{amd64} is given in \cref{tab:task-minimum-context-registers}.

The \gls{OS} stores the preempted context in a well-known and protected memory location, so that it can be restored when this task is resumed.
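As an illustration only, such a saved context could be modelled in \gls{Rust} as a plain data structure; the field selection is a hypothetical minimum and not the register set used by any particular \gls{OS}:
\begin{minted}{rust}
/// Hypothetical minimal task context for amd64; real operating systems save
/// additional state (e.g. segment selectors, FPU/SSE state).
#[repr(C)]
#[derive(Clone, Copy, Default)]
struct TaskContext {
    // General purpose registers.
    rax: u64, rbx: u64, rcx: u64, rdx: u64,
    rsi: u64, rdi: u64, rbp: u64, rsp: u64,
    r8: u64, r9: u64, r10: u64, r11: u64,
    r12: u64, r13: u64, r14: u64, r15: u64,
    // Instruction pointer and flags of the preempted task.
    rip: u64,
    rflags: u64,
}
\end{minted}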
@@ -317,35 +343,17 @@
In preemptive multitasking, context switches are not considered voluntary, but are rather forced.
This works by using the \gls{CPU}'s interrupt mechanism which has the ability to jump to an \gls{OS} function in the event of an interrupt.
Interrupts for this use-case are usually triggered by programmed timers, occurring continuously and regularly.
The interrupt mechanism itself is part of the \gls{CPU} and must be used by the \gls{OS} as is.

On \gls{amd64}, the \gls{CPU}'s interrupt mechanism does not switch the full context described previously, but only handles the registers that are necessary to successfully jump to the interrupt function: RFLAGS, RSP, RIP\footnote{Segment registers are neglected}.

In this scenario, the context is stored on the \gls{stack} of the function that is interrupted.
\cref{fig:amd64-long-mode-interrupt-stac} pictures the \gls{stack} layout on interrupt entry.
In order to leverage an interrupt for a context switch, the interrupt function needs to replace these values on the \gls{stack} with values for the new context.
CS (Code-Segment) and SS (Stack-Segment) have no effect in \gls{amd64} 64-Bit mode\cite[p.~20]{AMD64Vol1} and can remain unchanged.
The \gls{OS} developer needs to know the exact address where on the \gls{stack} this data structure has been pushed by the \gls{CPU}, and must then manipulate these addresses directly.
This type of manipulation is inherently dangerous and can not be easily checked by the \gls{compiler}.
The function that handles the interrupt must then use the instruction \textit{iretq}\cite[p.~252]{AMD64Vol2}, to make the \gls{CPU} restore the partial context from the \gls{stack} and continue at the function pointed to by the RIP.
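The values pushed by the \gls{CPU} on interrupt entry form a fixed layout; the following \gls{Rust} sketch models it, with the ordering taken from the cited manual and the type and function names being illustrative assumptions:
\begin{minted}{rust}
/// Layout of the values pushed by the CPU onto the stack on interrupt entry
/// in 64-Bit long mode, ordered from the stack top (lowest address) upwards.
#[repr(C)]
struct InterruptStackFrame {
    rip: u64,    // return instruction pointer of the interrupted task
    cs: u64,     // code-segment selector (unchanged in 64-Bit mode)
    rflags: u64, // saved flags register
    rsp: u64,    // stack pointer of the interrupted task
    ss: u64,     // stack-segment selector (unchanged in 64-Bit mode)
}

/// Hypothetical context-switch step: an interrupt handler could overwrite the
/// saved RIP and RSP so that iretq resumes a different task.
unsafe fn switch_to(frame: *mut InterruptStackFrame, next_rip: u64, next_rsp: u64) {
    unsafe {
        (*frame).rip = next_rip;
        (*frame).rsp = next_rsp;
    }
}
\end{minted}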

The interrupt mechanism itself is part of the \gls{CPU}, which is why the lowest level of the task-switching mechanism in the \gls{OS} is hardware dependent.
Safety could be increased if the \gls{compiler}, or in a more general sense the \gls{proglang}, could assist with architecture-specific code.

\begin{figure}
    \centering
    \includegraphics[width=0.8\textwidth]{gfx/amd64-long-mode-stack-after-interrupt.png}
    \caption{Long-Mode Stack After Interrupt\cite[p.~252]{AMD64Vol2}}
    \label{fig:amd64-long-mode-interrupt-stac}
\end{figure}

For a full context-switch, the other registers that are part of the context need to be handled by the \gls{OS}'s interrupt function.
This function is critical for safety and, like any other function in the \gls{OS}, is written by the \gls{OS} developer.

More details on this mechanism are given in \cnameref{rnd::sysprog-conventions::ir-driven-preemptive-cs-amd64}.

\chapter{Common Memory-Safety Mistakes}
\label{context::common-mem-safety-mistakes}
Building upon \cref{context::introduction}, which describes the basic mechanics of memory usage and how mistakes come to existence, this chapter presents and explains common software vulnerabilities that are related to memory-safety.
The relevant vulnerability classes are explained alongside exemplary manifestations in \gls{C}/\gls{C++}.
In \cref{rnd::porting-c-vulns}, these are ported and compared to functionally equivalent versions written in \gls{Rust}.

\section{\glsentrylong{CWE}}
\label{context::common-mem-safety-mistakes::cwe}

@@ -365,10 +373,10 @@
The relevant weaknesses for this thesis are children of the umbrella weakness \citetitle{MITRE-CWE-633}.

% TODO test the autocite command with footnotes
\subsection{\glsentrylong{CWE-119}}
\label{context::common-mem-safety-mistakes::cwe::119}
One of its child weaknesses, \gls{CWE-119}, is particularly interesting.
Manifestations of this weakness mean that a direct violation of the memory-safety defined in \cref{context::introduction::memory-safety::def} must have occurred, which "can cause read or write operations to be performed on memory locations that may be associated with other variables, data structures, or internal program data.
As a result, an attacker may be able to execute arbitrary code, alter the intended control flow, read sensitive information, or cause the system to crash"\cite{MITRE-CWE-119}.
This can happen in certain languages, which "allow direct addressing of memory locations and do not automatically ensure that these locations are valid for the memory buffer that is being referenced.
\gls{C}, \gls{C++}, \gls{asm} and languages without memory management support"\autocite{MITRE-CWE-119}.

@@ -377,20 +385,20 @@
There are languages that provide memory management support and still allow direct memory addressing, which is interesting for \gls{OS} development.
\gls{Rust} is one of these languages, although it requires the developer to explicitly acknowledge all direct memory access operations with the \textit{unsafe} keyword.
More information on \gls{Rust} follows in \cref{context::rust}.
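A minimal sketch of this distinction is given below: safe \gls{Rust} indexing is bounds-checked at runtime, while the equivalent \textit{unsafe} access reintroduces exactly the out-of-bounds behaviour that \gls{CWE-119} describes. The snippet is illustrative and not one of the ported vulnerabilities discussed later.
\begin{minted}{rust}
fn main() {
    let buffer = [1u8, 2, 3, 4];
    let index = 10; // deliberately out of bounds

    // Safe indexing: the bounds check turns the mistake into a defined outcome.
    match buffer.get(index) {
        Some(value) => println!("read {}", value),
        None => println!("index {} rejected by bounds check", index),
    }

    // Unsafe indexing: the developer explicitly opts out of the check, and an
    // out-of-bounds index reads unrelated memory (undefined behaviour).
    let value = unsafe { *buffer.get_unchecked(index) };
    println!("read {} past the end of the buffer", value);
}
\end{minted}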

\subsection{Statistics}
\label{context::common-mem-safety-mistakes::cwe::statistics}
One of the main reasons for me to work on this topic is the increasing number of vulnerabilities based on memory-safety issues.

This section is intended to express the weakness's severity in real-world software based on available statistics.
The only data available is based on publicly available sources, thus its completeness is questionable, because many organizations might choose not to disclose their vulnerabilities, either to protect their reputation or for security reasons as explained in \cref{context::introduction::memory-safety-violation-in-sw}.
The data and visualizations are supplied by the \gls{NVD}, which collects the data based on the umbrella weakness CWE-635\footnote{http://cwe.mitre.org/data/definitions/635.html} that was specifically created for the \gls{NVD}.
The numbers of these selected weaknesses are detailed in the following figures; the rest is grouped as \textit{other}.

\cref{fig:vulnerability-ratio-history} and \cref{fig:vulnerability-counts-history} display a decade of data on vulnerabilities grouped by their \gls{CWE} category.
The category called \textit{buffer\footnote{A bounded chunk of memory used by programs to store and exchange data} errors} represents \autocite{MITRE-CWE-119}.
In \cref{fig:vulnerability-ratio-history} it has the color light blue, 2nd from the bottom in the legend, and in \cref{fig:vulnerability-counts-history} it has the color blue, 2nd from the top in the legend.

\begin{figure}
    \centering

@@ -425,7 +433,7 @@
\label{tab:vulnerability-buffer-error-by-history}
\end{table}

In \cref{tab:vulnerability-buffer-error-by-history}, the column \textit{relative count} represents \cref{fig:vulnerability-ratio-history}, and the column \textit{absolute count} represents \cref{fig:vulnerability-counts-history}.
With 16.34 percent of all vulnerabilities known by 2016, and an average of 12.92 percent over ten years, \autocite{MITRE-CWE-119} makes up a significant part of real-world weaknesses.

\subsection{Vulnerable APIs in Linux and C/C++}
@@ -455,7 +463,7 @@
\subsection{The Stack Clash}
A recent and high-impact vulnerability named \textit{Stack Clash}\footnote{https://blog.qualys.com/securitylabs/2017/06/19/the-stack-clash} is briefly described as \textit{"a vulnerability in the memory management of several operating systems. It affects Linux, OpenBSD, NetBSD, FreeBSD and Solaris, on i386 and amd64. It can be exploited by attackers to corrupt memory and execute arbitrary code."}
The \gls{LX} specific vulnerability is listed as CVE-2017-1000364\footnote{http://www.cvedetails.com/cve/CVE-2017-1000364/}, where \textit{"an issue was discovered in the size of the stack guard page on Linux, specifically a 4k stack guard page is not sufficiently large and can be "jumped" over (the stack guard page is bypassed)"}.
It is assigned to the \autocite{MITRE-CWE-119} explained in \cref{context::common-mem-safety-mistakes::cwe::119}.

% TODO explain that this CWE-119 vulnerability is also "Execute Code"
% TODO: more references and deeper explanation of what happens: see introduction in https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
@@ -493,13 +501,13 @@
\end{lstlisting}


\chapter{Safe OS Development}
\label{context::introduciton::safe-os-dev}
This chapter gives a brief summary of relevant concepts of \gls{OS} development on common hardware platforms, focusing on memory management and its risks.

In order to protect the memory of each executed program according to \cref{context::introduction::memory-safety::def}, the \gls{OS} must be designed, developed, and tested carefully.

\section{Detecting Memory-Safety Violations ASAP}
\label{context::safe-os-dev::detecting-safety-violations-asap}
It is a given that individuals cannot be prevented from typing erroneous code into their code editors.

@@ -516,11 +524,11 @@
This suggests that since the \gls{OS} is lower in the hierarchy of system components at runtime, testing of the \gls{OS} must happen regardless of specific \glspl{app}, and at development time.
This especially concerns testing the \gls{OS}'s internal states which can not be directly mutated via the \gls{api} exposed to the \glspl{app}.
To explain this from the \gls{app} perspective, testing the \gls{OS}'s runtime states is not plausible, because the \gls{app} can not freely mutate the system's state.
Even if it could, testing all possible permutations of system state in every possible \gls{app} would be highly redundant and nonetheless leaves the risk of untested edge cases that happen only under specific system circumstances, possibly influenced by other components on the system as described in the beginning of \cref{context::introduction::memory-safety}.
The \gls{app} developer is forced to trust the underlying \gls{OS}.
This puts high importance on the safety of the \gls{OS} design and implementation.

\subsection{The Effects Of Programming Languages on Memory-Safety}
There are dozens of \glspl{proglang} used by humans to write \glspl{app}, but only a few are used to write \glspl{OS}.

\subsubsection{Abstraction: Safety vs. Functionality}
@@ -536,7 +544,7 @@
\section{Safety In Language Compilers And Static Analyzers}
\label{context::introduction::language-compilers-analyzers}

In \cref{context::introduction::memory-safety}, specifically in TODO "reference detection", it was explained that programming languages have a direct impact on memory-safety.
This section gives an example of how severe this impact is and explains the requirements on an \gls{OS} language.

@@ -551,9 +559,9 @@
Safety checks that are performed at runtime introduce a high degree of overhead, which makes them a nonviable option in the domain of \gls{OS} development, where many code paths must be very fast to ensure the operation of high speed I/O devices\cite{Balasubramanian2017} or tasks with \gls{realtime} requirements. (TODO: explain realtime requirements)
This has been forcing \gls{OS} developers to prioritize performance over safety. (TODO: reference)

Details about the challenge of writing code that does memory management safely, and related vulnerabilities, are given further along in \cref{context::common-mem-safety-mistakes}.

\section{Choice of Programming Language}
Criteria for the choice of programming language are much different from those for other types of \glspl{app}.

This is a list of what is required for implementing an \gls{OS}:
@@ -568,7 +576,7 @@

\chapter{Memory-Safety Analysis Techniques}
As per the previous \cref{context::common-mem-safety-mistakes} there is general awareness of the problems, and there has been ongoing effort to develop and improve techniques that assist the programmer in detecting and avoiding such mistakes first- or secondhand.

\section{Static vs. Dynamic Analysis}
% TODO: explain first-/secondhand -> static/dynamic -> compile-time/runtime -> offline/online

@@ -10,8 +10,113 @@
\section{The Necessary Evils of \textit{unsafe}}

\chapter{Result Evaluation}
% TODO: repeat that rust *can* be used to increase safety in the OS, but it doesn't guarantee it per-se

\paragraph{Premised Trust In Hardware}
Memory management mechanisms are partially implemented in the target system's hardware, which can't be verified at the time of development.

\chapter{Summary}

\chapter{Final Conclusion}
Safety - or security for this matter - is not something that can be achieved absolutely.
It grows successively and gives the \gls{OS} developers and the end-users a \emph{feeling} of safety, until another vulnerability is found and disclosed.

% TODO: repeat that rust *can* be used to increase safety in the OS,
% TODO: how?
% but it doesn't guarantee it per-se

\chapter{Scratchpad}

\begin{figure}[h!]
    \centering
    \begin{subfigure}[T]{0.50\textwidth}
        \tikzmarkcountprep{callee}
        \begin{compactminted}[
            escapeinside=??,linenos,autogobble,highlightlines={}
        ]{nasm}
            mov rax,QWORD PTR [rbp-0x48]?\tikzmarkcount?
            add rsp,0x50?\tikzmarkcount?
            pop rbp?\tikzmarkcount?
            ret?\tikzmarkcount?
        \end{compactminted}
        \tikzmarkdrawcircles
        \caption{Subfig A}
    \end{subfigure}
    \begin{subfigure}[T]{0.45\textwidth}
        \foreach \x/\xtext in {
            1/{
                this is going to be a really long sentence with line wraps
            },
            2/{
                second
            }
        } {\tikzmarkcircle{\x}\xtext\\}
        \caption{Subfig B}
    \end{subfigure}
    \caption{Whadup}
    \label{Whadup}
\end{figure}

\begin{listing}
    \tikzmarkcountprep{example1}
    \begin{minted}[
        label=example1,labelposition=all,escapeinside=??,linenos,autogobble,highlightlines={}
    ]{nasm}
        mov rax,QWORD PTR [rbp-0x48]?\tikzmarkcount? ?\tikzmark{brace1upper}?
        add rsp,0x50?\tikzmarkcount?
        pop rbp?\tikzmarkcount?
        ret?\tikzmarkcount? ?\tikzmark{brace1lower}?
    \end{minted}
    \begin{minted}[
        escapeinside=??,linenos,autogobble,highlightlines={}
    ]{nasm}
        mov rax,QWORD PTR [rbp-0x48]?\tikzmarkcount?
        add rsp,0x50 ?\tikzmarkcount?
        pop rbp ?\tikzmarkcount?
        ret ?\tikzmarkcount?
    \end{minted}
    \begin{tikzpicture}[remember picture,overlay]
        \draw[thick,decorate,decoration={brace,raise=1ex}]
            (pic cs:brace1upper)+(0,1.5ex) -- node[shape=coordinate][right=1.5ex] (a) {} (pic cs:brace1lower);
        \fill (a)+(2ex,0) circle[opacity=1,radius=1.1ex] node[white,font=\small]{a};
    \end{tikzpicture}
    \tikzmarkdrawcircles
    \caption{Minted Listing A}
    %
    \foreach \x/\xtext in {
        1/{
            this is going to be a really long sentence with line wraps
        \\}
        ,2/{
            second
        \\}
        ,5/{},6/{
            hi
        \\}
        ,a/{
            hi
        \\}
    } {\tikzmarkcircle{\x}\xtext}
    %
\end{listing}
\FloatBarrier

\begin{listing}
    \tikzset{/minted/basename=example}
    \begin{minted}[label=caller,labelposition=topline,escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{nasm}
        mov rcx,QWORD PTR [rbp-0x40] ; copy 1st arg to rcx
        mov rsi,QWORD PTR [rbp-0x38] ; copy 2nd arg to rsi
        mov rdx,QWORD PTR [rbp-0x30] ; copy 3rd arg to rdx
        mov QWORD PTR [rbp-0x60],rdi ; save rdi to make it available
        mov rdi,rcx ; copy 1st arg to rdi
        mov QWORD PTR [rbp-0x68],rax ; save rax to make it available
        call 7490?\tikzmark{exampleprecallfrom}? <_ZN14stack_handling3sum17h8f12d2383e075691E> ; push '756e' onto the stack and jump to the first instruction of sum
        mov QWORD PTR [rbp-0x28],rax ; save return value
    \end{minted}
    \caption{Function Call with Three Arguments}
    \begin{tikzpicture}[remember picture,overlay]
        \draw[red,thick] (pic cs:exampleprecallfrom) ellipse (0.7cm and 12pt) node { \textbf{1} };
        \fill[blue] (pic cs:example1) circle (0.1cm);
        \fill[yellow] (pic cs:example2) circle (0.1cm);
    \end{tikzpicture}
\end{listing}

@@ -2,12 +2,284 @@
\chapter{Topic Refinement}
% TODO: is this chapter required?

\chapter{System Programming Conventions}
\label{rnd::sysprog-conventions}

\section{Stack Frame Handling on AMD64}
\label{rnd::sysprog-conventions::stackframe-amd64}
The usage of the \gls{stack} is tightly coupled with control flow instructions in conjunction with two registers, the Stack-Frame Base Pointer (RBP) and the Stack Pointer (RSP).
The instructions that use these registers and explicitly or implicitly work with the stack\cite[p.~83]{AMD64Vol1} can be grouped into the following categories.
Together they can be used to perform \gls{stack} based procedure calls, as demonstrated in the following \cref{context::introduction::hw-supported-mm::procedure-call-example}.

\paragraph{Direct Stack Data Management} with PUSH and POP.

PUSH takes a value operand which is to be pushed onto the stack.
The address in RSP moves towards numerically lower addresses with every PUSH instruction, which stores a new data entry on top.
The order of operations is to first change the RSP and then copy the value to its new address.

POP takes a storage reference operand - a \gls{CPU} register or memory address.
It works in the opposite direction to PUSH.
First, it consumes the top-most data entry and stores it at the operand location, then it moves the RSP address towards the numerically higher RBP address.

When RBP and RSP point to the same address, the stack is considered empty.

\paragraph{Procedure Calls} with CALL and RET. \\
These instructions control the instruction flow by calling another procedure\footnote{loosely synonymous with function}.

The CALL instruction takes the address of the instruction that is to be called.
Before jumping to the instruction at the given address, it PUSHes the current RIP (instruction pointer) register onto the \gls{stack}.

RET takes no operand, but instead POPs the \gls{stack}'s top entry.
The consumed value is used as a jump address.

As PUSH and POP use the RSP register, the called procedure is responsible for finishing with the RSP at the same position as when it was entered.
For example, PUSHing some value onto the stack right before the end of the procedure would cause RET to jump to that value as if it were a return address, instead of returning to the caller.

\paragraph{Called Procedure Setup} \emph{not} with ENTER and LEAVE.

When a procedure is called the stack is set up with the following four components
\cite[p.~48]{AMD64Vol1}:

\begin{enumerate}
    \item{%
        Parameters passed to the called procedure (created by the calling procedure). \\
        \textit{Only if parameters don't fit the \gls{CPU} registers}
    }
    \item{%
        Return address (created by the CALL instruction). \\
        \textit{Always used by CALL}
    }
    \item{%
        Array of stack-frame pointers (pointers to stack frames of procedures with smaller nesting-level depth) which are used to access the local variables of such procedures. \\
        \textit{Depends on support and implementation of nested functions in the \gls{compiler}}
    }
    \item{%
        Local variables used by the called procedure. \\
        \textit{This includes the variables passed via \gls{CPU} registers}
    }
\end{enumerate}
As noted above, item 1 is only necessary when there aren't enough \gls{CPU} registers to pass the parameters, and item 3 only when the \gls{compiler} supports and uses nested procedures.

The \gls{amd64} manual also lists ENTER and LEAVE as instructions to \textit{"provide support for procedure calls, and are mainly used in high-level languages."}\cite[p.~48]{AMD64Vol1}.
The latter claim could not be verified by inspecting binaries produced by the \gls{C} and \gls{Rust} \glspl{compiler}.

Instead, these \glspl{compiler} generate a sequence of PUSH, MOV and SUB instructions to set up and manage the \gls{stack}.
These instructions are placed before and after the procedure's logic, taking care of the technicalities of \gls{stack} management.
These instruction groups within the called procedure are called prologue and epilogue.

\subsection{Full Procedure Call Example}
\label{context::introduction::hw-supported-mm::procedure-call-example}
This section combines the separate categories into one complete example that shows how the \gls{stack} is used by various \gls{CPU} instructions to perform procedure calls.
The following code samples are extracted from a disassembled binary which was originally created using \gls{Rust}.
The Assembler that is shown uses Intel mnemonics, which generally operate from right to left.
For example, \mint{nasm}{mov a, b} copies b to a.

\cref{code::context::examples::func-callee} shows the \gls{Rust} source code of the function \textit{sum}.
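The referenced listing is not part of this commit; as a placeholder, a \gls{Rust} function with the shape suggested by the disassembled symbol \textit{stack\_handling::sum} (three integer arguments, one return value) could look like the following sketch. The exact signature is an assumption and not the final listing.
\begin{minted}{rust}
// Hypothetical stand-in for the `sum` procedure referenced above: three
// arguments are passed in registers, the result is returned in RAX.
fn sum(a: u64, b: u64, c: u64) -> u64 {
    a + b + c
}

fn main() {
    // The caller copies the arguments into the argument registers and
    // executes CALL, which pushes the return address onto the stack.
    let total = sum(1, 2, 3);
    println!("{}", total);
}
\end{minted}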
|
||||
|
||||
\section{4-Level Paging Hierarchy on \glsentrytext{amd64}}
|
||||
\label{rnd::sysprog-conventions::paging-amd64}
|
||||
On \gls{amd64} "a four-level page-translation data structure is provided to allow long-mode operating systems to translate a 64-Bit virtual-address space into a 52-Bit physical-address space."\cite[p.~18]{AMD64Vol2}.
|
||||
This allows the system to only hold the \textit{PML4} table, the which is currently referenced by the \textit{Page Map Base Register (CR3)}, available in main memory.
|
||||
|
||||
\cref{fig:virtual-addr-transl} shows the 64-Bit virtual address composition on \gls{amd64}, which uses four-levels of page tables.
|
||||
Counterintuitively the page-tables are not called level-\textit{n}-page-table, but the levels received distinct names in \citetitle{AMD64Vol2}.
|
||||
The most-significant Bits labelled as \textit{Sign Extend} are not used for addressing purposes, but must adhere the canonical address form and simply repeat the value of the most-significant implemented Bit \cite[p.~130]{AMD64Vol2}.
|
||||
The least significant Bits represent the offset within the physical page.
|
||||
The four groups in between are used to index the page-table at their respective level.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/Virtual-to-Physical-Address-Translation-Long-Mode.png}
|
||||
\caption{Virtual to Physical Address in Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:virtual-addr-transl}
|
||||
\end{figure}
|
||||
\subsubsection{Translation Scheme for 4 KiB and 2 MiB Pages}
|
||||
The \gls{amd64} architecture allows configuring the page-size, two of which will be introduced in this section.
|
||||
\cref{tab:page-transl-vaddr-composition} displays the virtual address composition for the 4 KiB and 2 MiB page-size modes on \gls{amd64}.
|
||||
The order from top to bottom in the table corresponds to most significant to least significant (left to right) in the virtual address.
|
||||
The \textit{sign extension} Bits cannot be used for actual information but act as a reservation for future architectural changes.
|
||||
|
||||
\begin{table}
|
||||
\begin{tabular}{l | c | c}
|
||||
Description & Bits in 4 KiB Pages & Bits in 2 MiB Pages \\
|
||||
\hline
|
||||
Sign Extend & 16 & 16 \\
|
||||
Page-Map-Level-4 Offset & 9 & 9 \\
|
||||
Page-Directory-Pointer Offset & 9 & 9 \\
|
||||
Page-Directory Offset & 9 & 9 \\
|
||||
Page-Table Offset & 9 & - \\
|
||||
Physical Page Offset & 12 & 21 \\
|
||||
\end{tabular}
|
||||
\caption{Paging on \gls{amd64}: Virtual Address Composition for 4 KiB and 2 MiB Page Sizes}
|
||||
\label{tab:page-transl-vaddr-composition}
|
||||
\end{table}
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-4kb-page-translation-long-mode}
|
||||
\caption{4-Kbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:4kb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
\cref{fig:4kb-page-transl} shows the detailed virtual address composition for 4 KiB pages, using four levels of page-tables.
|
||||
It uses four sets of 9-Bit indices in the virtual address, one per hierarchy level, followed by the 12-Bit page-internal offset.
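To make the bit layout concrete, the following \gls{Rust} sketch decomposes a canonical virtual address under the 4 KiB scheme; the function and variable names are purely illustrative and not taken from any existing code base.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Split a canonical 64-Bit virtual address into its 4 KiB paging components.
fn decompose_4kib(vaddr: u64) -> (usize, usize, usize, usize, usize) {
    let pml4   = ((vaddr >> 39) & 0x1ff) as usize; // Page-Map-Level-4 index, Bits 47..39
    let pdp    = ((vaddr >> 30) & 0x1ff) as usize; // Page-Directory-Pointer index, Bits 38..30
    let pd     = ((vaddr >> 21) & 0x1ff) as usize; // Page-Directory index, Bits 29..21
    let pt     = ((vaddr >> 12) & 0x1ff) as usize; // Page-Table index, Bits 20..12
    let offset = (vaddr & 0xfff) as usize;         // page-internal offset, Bits 11..0
    (pml4, pdp, pd, pt, offset)
}

fn main() {
    let vaddr: u64 = 0xffff_8000_dead_beef;
    // Canonical form: Bits 63..48 must repeat Bit 47.
    let canonical = (vaddr >> 48) == if (vaddr >> 47) & 1 == 1 { 0xffff } else { 0 };
    assert!(canonical);
    println!("{:?}", decompose_4kib(vaddr));
}
\end{minted}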
|
||||
|
||||
An alternative approach is displayed in \cref{fig:2mb-page-transl}, using 2 MiB sized pages.
|
||||
It uses three sets of 9-Bit indices for the page-tables, and a 21-Bit page-internal offset.
|
||||
Increasing the page-size speeds up translation and reduces the memory required for page-tables, but decreases the allocation granularity.
|
||||
In this specific example the hierarchy is reduced by one level of page-tables.
|
||||
This reduces the overall amount of storage required for the page-tables and lets the lookup algorithm finish faster.
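For comparison, a correspondingly minimal sketch for 2 MiB pages drops the Page-Table level and widens the page-internal offset to 21 Bits; again, all names are illustrative assumptions.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Split a canonical 64-Bit virtual address into its 2 MiB paging components.
fn decompose_2mib(vaddr: u64) -> (usize, usize, usize, usize) {
    let pml4   = ((vaddr >> 39) & 0x1ff) as usize; // Page-Map-Level-4 index, Bits 47..39
    let pdp    = ((vaddr >> 30) & 0x1ff) as usize; // Page-Directory-Pointer index, Bits 38..30
    let pd     = ((vaddr >> 21) & 0x1ff) as usize; // Page-Directory index, Bits 29..21
    let offset = (vaddr & 0x1f_ffff) as usize;     // 21-Bit page-internal offset, Bits 20..0
    (pml4, pdp, pd, offset)
}

fn main() {
    println!("{:?}", decompose_2mib(0xffff_8000_0020_0000));
}
\end{minted}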
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-2mb-page-translation-long-mode}
|
||||
\caption{2-Mbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:2mb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
The other supported page sizes, 4 MiB and 1 GiB, as well as intermixing page sizes across the different levels, don't add new insight into the mechanism and are not detailed here.
|
||||
|
||||
% \subsubsection{Top-Level Page Table Self-Reference}
|
||||
% \subsubsection{Caching Lookups}
|
||||
% \subsubsection{Full Example}
|
||||
% * http://taptipalit.blogspot.de/2013/10/theory-recursive-mapping-page.html
|
||||
% * https://www.coresecurity.com/blog/getting-physical-extreme-abuse-of-intel-based-paging-systems-part-2-windows
|
||||
|
||||
\begin{listing}[htb]
|
||||
\tikzset{/minted/basename=callee-c}
|
||||
\begin{minted}[autogobble,linenos,breaklines=true]{rust}
|
||||
\end{minted}
|
||||
\caption{The called function in \gls{Rust}}
|
||||
\label{code::context::examples::func-callee-c}
|
||||
\end{listing}
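For orientation, a minimal \gls{Rust} function with the assumed shape of \textit{sum} (three 64-Bit integer parameters arriving in RDI, RSI and RDX, one integer result returned in RAX) could look like the following sketch; the concrete signature and body are assumptions.

\begin{minted}[autogobble,breaklines=true]{rust}
// Sketch only: the exact types and body of the original function are assumptions.
#[inline(never)]
pub fn sum(a: u64, b: u64, c: u64) -> u64 {
    a + b + c
}
\end{minted}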
|
||||
|
||||
\cref{code::context::examples::func-call} shows a snippet of the calling function.
|
||||
It places the arguments in the registers according to the System V x86\_64 calling convention. %TODO REFERENCE
|
||||
The caller doesn't alter the stack-frame pointer (RBP) or the stack pointer (RSP) registers before the call; hence, the called function must restore them if it alters them.
|
||||
|
||||
\begin{listing}
|
||||
\begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{rust}
|
||||
\end{minted}
|
||||
\caption{Procedure Call Example: Caller Rust}
|
||||
\label{code::context::examples::func-call}
|
||||
\end{listing}
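A correspondingly minimal calling function, again only a sketch of the behaviour described above with placeholder argument values, could look as follows:

\begin{minted}[autogobble,breaklines=true]{rust}
// Sketch only: `sum` as assumed above, argument values are placeholders.
fn sum(a: u64, b: u64, c: u64) -> u64 {
    a + b + c
}

fn main() {
    // The three arguments are placed in RDI, RSI and RDX according to the
    // System V x86_64 calling convention before the CALL instruction;
    // the result arrives in RAX.
    let result = sum(1, 2, 3);
    println!("sum = {}", result);
}
\end{minted}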
|
||||
|
||||
\begin{listing}
|
||||
\begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{nasm}
|
||||
\end{minted}
|
||||
\caption{Procedure Call Example: Caller Assembly}
|
||||
\label{code::context::examples::func-call-asm}
|
||||
\end{listing}
|
||||
|
||||
% \balloon{comment}{
|
||||
|
||||
% RDI, RSI, RDX, RCX, R8, R9, XMM0–7
|
||||
|
||||
\begin{table}[ht!]
|
||||
\tikzmark{precallto}
|
||||
\centering
|
||||
\begin{tabular}{ r | >{\columncolor{YellowGreen}}c | l }
|
||||
\multicolumn{1}{r}{RBP offset} & \multicolumn{1}{c}{Content} & \\
|
||||
$\uparrow$ & \cellcolor{white} & \\
|
||||
& \cellcolor{white} \dots \textit{beyond current stack} \dots & \\
|
||||
\hhline{~-~}
|
||||
0 & \textit{Saved RBP (previous stack-frame pointer)} & $\leftarrow$ RBP \\
|
||||
\hhline{~-~}
|
||||
\vdots & \dots~~\textit{local variables}~~\dots & \\
|
||||
\hhline{~-~}
|
||||
-0x30 & 3rd arg & \\
|
||||
\hhline{~-~}
|
||||
-0x38 & 2nd arg & \\
|
||||
\hhline{~-~}
|
||||
-0x40 & 1st arg & \\
|
||||
\hhline{~-~}
|
||||
\vdots & \dots~~\textit{local variables}~~\dots & \\
|
||||
\hhline{~-~}
|
||||
-0x60 & rdi & \\
|
||||
\hhline{~-~}
|
||||
& \dots~~\textit{local variables}~~\dots & \\
|
||||
\hhline{~-~}
|
||||
$RBP-RSP$ & \textit{unknown} & $\leftarrow$ RSP \\
|
||||
\hhline{~-~}
|
||||
& \cellcolor{white} & \\
|
||||
$\downarrow$ & \cellcolor{white} & \\
|
||||
\end{tabular}
|
||||
\end{table}
|
||||
|
||||
|
||||
|
||||
\cref{code::context::examples::func-prologue} shows \textit{sum}'s prologue.
|
||||
The corresponding epilogue is displayed in \cref{code::context::examples::func-epilogue}.
|
||||
The comments explain the code line by line; please read them to understand exactly what happens at each instruction.
|
||||
|
||||
\begin{listing}[ht!]
|
||||
\begin{minted}[escapeinside=??,linenos=false,breaklines=true]{nasm}
|
||||
$7490: push ?\tikzmark{prologuestart}? rbp ; save the stack-frame pointer on the stack
|
||||
$7491: mov rbp,rsp ; set the stack-frame base pointer from the stack pointer
|
||||
$7494: sub rsp,0x50 ; allocate 0x50 Bytes for arguments and local variables
|
||||
$7498: mov QWORD PTR [rbp-0x30],rdi ; copy 1st arg onto stack
|
||||
$749c: mov QWORD PTR [rbp-0x28],rsi ; copy 2nd arg onto stack
|
||||
$74a0: mov QWORD PTR [rbp-0x20],rdx ; copy 3rd arg onto stack
|
||||
\end{minted}
|
||||
\caption{Function Prologue with three Arguments}
|
||||
\label{code::context::examples::func-prologue}
|
||||
\end{listing}
|
||||
|
||||
\begin{tikzpicture}[remember picture]
|
||||
\draw[overlay,red,thick,dashed] (pic cs:precallto) circle [radius=7pt] node { \textbf{1} };
|
||||
\draw[overlay,red,thick,dashed] (pic cs:prologuestart) circle [radius=7pt] node { \textbf{1} };
|
||||
\end{tikzpicture}
|
||||
|
||||
\begin{listing}[ht!]
|
||||
\begin{minted}[linenos=true,breaklines=true]{nasm}
|
||||
$74ee: mov rax,QWORD PTR [rbp-0x48] ; store return value in RAX
|
||||
$74f2: add rsp,0x50 ; set stack pointer to where stack-frame pointer was stored
|
||||
$74f6: pop rbp ; restore the stack-frame pointer
|
||||
$74f7: ret ; return to the caller, following the address on the stack
|
||||
\end{minted}
|
||||
\caption{Function Epilogue}
|
||||
\label{code::context::examples::func-epilogue}
|
||||
\end{listing}
|
||||
|
||||
\cref{fig:proc-call-example-mem} displays the memory layout throughout the individual steps of the procedure call.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=0.95\textwidth,]{gfx/call-procedure-memory-content.png}
|
||||
\caption{Memory Layout Throughout The Procedure Call Steps}
|
||||
\label{fig:proc-call-example-mem}
|
||||
\end{figure}
|
||||
|
||||
\section{Interrupt Driven Preemptive Context Switches on \glsentrytext{amd64}}
|
||||
\label{rnd::sysprog-conventions::ir-driven-preemptive-cs-amd64}
|
||||
On \gls{amd64}, the \gls{CPU}'s interrupt mechanism does not switch the full context described previously, but only handles the registers that are necessary to successfully jump to the interrupt function: RFLAGS, RSP and RIP\footnote{Segment registers are neglected}.
|
||||
|
||||
In this scenario, the context is stored on the \gls{stack} of the function that is interrupted.
|
||||
\Cref{fig:amd64-long-mode-interrupt-stac} pictures the \gls{stack} layout on interrupt entry.
|
||||
In order to leverage an interrupt for a context switch, the interrupt function needs to replace these values on the \gls{stack} with values for the new context.
|
||||
CS (Code-Segment) and SS (Stack-Segment) have no effect in \gls{amd64} 64-Bit mode\cite[p.~20]{AMD64Vol1} and can remain unchanged.
|
||||
The \gls{OS} developer needs to know the exact address on the \gls{stack} where this data structure has been pushed by the \gls{CPU}, and must then manipulate these addresses directly.
|
||||
This type of manipulation is inherently dangerous and cannot easily be checked by the \gls{compiler}.
|
||||
The function that handles the interrupt must then use the \textit{iretq} instruction\cite[p.~252]{AMD64Vol2} to make the \gls{CPU} restore the partial context from the \gls{stack} and continue at the function pointed to by the restored RIP.
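As an illustration of this manipulation, the following \gls{Rust} sketch models the pictured frame and swaps in the saved RIP and RSP of another context; the type and function names are assumptions and not part of any existing code base.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Long-mode interrupt stack frame as pushed by the CPU (without error code).
/// Field names and this API are assumptions made for illustration only.
#[repr(C)]
struct InterruptStackFrame {
    rip: u64,
    cs: u64,
    rflags: u64,
    rsp: u64,
    ss: u64,
}

/// Replace instruction and stack pointer on the interrupted frame so that
/// a subsequent `iretq` resumes the new context instead of the old one.
unsafe fn switch_context(frame: *mut InterruptStackFrame, new_rip: u64, new_rsp: u64) {
    (*frame).rip = new_rip;
    (*frame).rsp = new_rsp;
    // CS, SS and RFLAGS are left unchanged.
}

fn main() {
    // Stand-in values; on real hardware `frame` would point into the stack
    // of the interrupted function.
    let mut frame = InterruptStackFrame {
        rip: 0x1000, cs: 0x8, rflags: 0x202, rsp: 0x7fff_0000, ss: 0x10,
    };
    unsafe { switch_context(&mut frame, 0x2000, 0x7ffe_0000) };
    println!("resume at RIP={:#x}, RSP={:#x}", frame.rip, frame.rsp);
}
\end{minted}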
|
||||
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=0.8\textwidth]{gfx/amd64-long-mode-stack-after-interrupt.png}
|
||||
\caption{Long-Mode Stack After Interrupt\cite[p.~252]{AMD64Vol2}}
|
||||
\label{fig:amd64-long-mode-interrupt-stac}
|
||||
\end{figure}
|
||||
|
||||
For a full context-switch, the other registers that are part of the context need to be handled by the \gls{OS}'s interrupt function.
|
||||
|
||||
|
||||
\chapter{Research Questions}
|
||||
|
||||
Setting up and maintaining the paging-structure, as well as allocating physical memory for the virtual pages, is a complex task within the \gls{OS}.
|
||||
Developing this part of the \gls{OS} is error-prone and not well supported by mainstream \glspl{proglang}.
|
||||
|
||||
|
||||
\section{Definition Of Additional Analysis Rules To Extend Safety Checks}
|
||||
% TODO: How can Business Logical
|
||||
% Examples:
|
||||
% TLB needs to be reset on Task Change
|
||||
|
@ -36,18 +308,19 @@ Developing this part of the \gls{OS} is error-prone, and is not well-supported b
|
|||
|
||||
\chapter{Porting \glsentrytext{C} Vulnerabilities}
|
||||
\label{rnd::porting-c-vulns}
|
||||
|
||||
In this chapter, the weakness manifestations from \cref{context::common-mem-safety-mistakes::manifestations} are rewritten in \gls{Rust} to learn to what level they are mitigated just by porting them.
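As a minimal illustration of the effect (this is not one of the ported manifestations themselves), an out-of-bounds access in \gls{Rust} is stopped by a bounds check at runtime instead of silently reading adjacent memory:

\begin{minted}[autogobble,breaklines=true]{rust}
fn main() {
    let buf = vec![0u8; 4];
    // Compute the index at runtime so the bounds check also happens at runtime.
    let idx = std::env::args().count() + 9; // at least 10
    // In C this would read past the end of the buffer; here the indexing
    // operation is bounds-checked and panics instead of corrupting memory.
    let value = buf[idx];
    println!("{}", value);
}
\end{minted}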
|
||||
|
||||
\chapter{\glsentrytext{LX} Modules Written In \glsentrytext{Rust}}
|
||||
|
||||
\chapter{Existing \glsentrytext{OS}-Development Projects Based On Rust}
|
||||
\label{rnd::existing-os-in-rust}
|
||||
\label{rnd::existing-os-dev-wity-rust}
|
||||
|
||||
\section{Libraries}
|
||||
|
||||
\subsection{Libfringe}
|
||||
% TODO: https://github.com/edef1c/libfringe
|
||||
|
||||
|
||||
\section{Systems}
|
||||
\subsection{intermezzOS}
|
||||
\subsection{Blog OS}
|
||||
|
@ -55,6 +328,7 @@ In this chapter, the examples from \autoref{TODO} ported to \gls{Rust} for evalu
|
|||
\subsection{Tock}
|
||||
|
||||
\chapter{\glsentrytext{imezzos}: Adding Preemptive \glsentrytext{OS}-Level Multitasking}
|
||||
\label{rnd::imezzos-preemptive-multitasking}
|
||||
|
||||
\section{Timed Interrupts For Scheduling and Dispatching}
|
||||
|
||||
|
|
|
@ -3,192 +3,6 @@ Any changes to this file will be lost if it is regenerated by Mendeley.
|
|||
|
||||
BibTeX export options can be customized via Options -> BibTeX in Mendeley Desktop
|
||||
|
||||
@misc{MITRE-CWE-119,
|
||||
author = {MITRE},
|
||||
booktitle = {2.11},
|
||||
title = {{CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer}},
|
||||
url = {http://cwe.mitre.org/data/definitions/119.html},
|
||||
urldate = {2017-08-31},
|
||||
year = {2017}
|
||||
}
|
||||
@article{Backus1962,
|
||||
abstract = {The report gives a defining description of the programming language Scheme. Scheme is a statically scoped and properly tail-recursive dialect of the Lisp programming language invented by Guy Lewis Steele, Jr. and Gerald Jay Sussman. It was designed to have an exceptionally clear and simple semantics and few different ways to form expressions. A wide variety of programming paradigms, including imperative, functional, and message passing styles, find convenient expression in Scheme.},
|
||||
author = {Backus, J. W. and Bauer, F. L. and Green, J. and Katz, C. and McCarthy, J. and Naur, P. and Perlis, A. J. and Rutishauser, H. and Samelson, K. and Vauquois, B. and Wegstein, J. H. and van Wijngaarden, A. and Woodger, M. and van der Poel, W. L.},
|
||||
doi = {10.1007/BF01386340},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/Revised report on the algorithmic language Algol 60.pdf:pdf},
|
||||
isbn = {9780521193993},
|
||||
issn = {0029599X},
|
||||
journal = {Numerische Mathematik},
|
||||
number = {1},
|
||||
pages = {420--453},
|
||||
title = {{Revised report on the algorithmic language Algol 60}},
|
||||
volume = {4},
|
||||
year = {1962}
|
||||
}
|
||||
@article{Lattner2005,
|
||||
abstract = {The LLVM Compiler Infrastructure (http://llvm.cs. uiuc.edu) is a$\backslash$nrobust system that is well suited for a wide variety of research$\backslash$nand development work. This brief paper introduces the LLVM system$\backslash$nand provides pointers to more extensive documentation, complementing$\backslash$nthe tutorial presented at LCPC.},
|
||||
archivePrefix = {arXiv},
|
||||
|
@ -207,6 +21,33 @@ title = {{The LLVM Compiler Framework and Infrastructure Tutorial}},
|
|||
url = {http://dx.doi.org/10.1007/11532378{\_}2},
|
||||
year = {2005}
|
||||
}
|
||||
@article{Levy2015a,
|
||||
abstract = {Rust, a new systems programming language, provides compile-time memory safety checks to help eliminate runtime bugs that manifest from improper memory management. This feature is advantageous for operating system development, and especially for embedded OS development, where recovery and debugging are particularly challenging. However, embedded platforms are highly event-based, and Rust's memory safety mechanisms largely presume threads. In our experience developing an operating system for embedded systems in Rust, we have found that Rust's ownership model prevents otherwise safe resource sharing common in the embedded domain, conflicts with the reality of hardware resources, and hinders using closures for programming asynchronously. We describe these experiences and how they relate to memory safety as well as illustrate our workarounds that preserve the safety guarantees to the largest extent possible. In addition, we draw from our experience to propose a new language extension to Rust that would enable it to provide better memory safety tools for event-driven platforms.},
|
||||
author = {Levy, Amit and Andersen, Michael P. and Campbell, Bradford and Culler, David and Dutta, Prabal and Ghena, Branden and Levis, Philip and Pannuto, Pat},
|
||||
doi = {10.1145/2818302.2818306},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/tock-plos2015.pdf:pdf},
|
||||
isbn = {9781450339421},
|
||||
journal = {PLOS: Workshop on Programming Languages and Operating Systems},
|
||||
keywords = {embedded operating systems,linear types,ownership,rust},
|
||||
pages = {21--26},
|
||||
title = {{Ownership is Theft: Experiences Building an Embedded OS in Rust}},
|
||||
url = {http://dl.acm.org/citation.cfm?id=2818302.2818306},
|
||||
year = {2015}
|
||||
}
|
||||
@article{Corporation2011a,
|
||||
abstract = {The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volumes 2A {\&} 2B, describe the instruction set of the processor and the opcode struc- ture. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volumes 3A {\&} 3B, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operating- system and BIOS designers. In addition, the Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volume 3B, addresses the programming environment for classes of software that host operating systems.},
|
||||
author = {Corporation, Intel},
|
||||
doi = {10.1109/MAHC.2010.22},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/64-ia-32-architectures-software-developer-vol-1-manual.pdf:pdf},
|
||||
isbn = {253665-057US},
|
||||
issn = {15222594},
|
||||
journal = {System},
|
||||
keywords = {253665,64,ia 32 architecture},
|
||||
number = {253665},
|
||||
title = {{Intel {\textregistered} 64 and IA-32 Architectures Software Developer ' s Manual Volume 1}},
|
||||
volume = {1},
|
||||
year = {2011}
|
||||
}
|
||||
@misc{IEEEspectrum-proglangs,
|
||||
author = {IEEE},
|
||||
title = {{Interactive: The Top Programming Languages 2017}},
|
||||
|
@ -214,6 +55,64 @@ url = {https://spectrum.ieee.org/static/interactive-the-top-programming-language
|
|||
urldate = {2017-09-08},
|
||||
year = {2017}
|
||||
}
|
||||
@article{Xu2015,
|
||||
abstract = {Since vulnerabilities in Linux kernel are on the increase, attackers have turned their interests into related exploitation techniques. However, compared with numerous researches on exploiting use-after-free vulnerabilities in the user applications, few efforts studied how to exploit use-after-free vulnerabilities in Linux kernel due to the difficulties that mainly come from the uncertainty of the kernel memory layout. Without specific information leakage, attackers could only conduct a blind memory overwriting strategy trying to corrupt the critical part of the kernel, for which the success rate is negligible. In this work, we present a novel memory collision strategy to exploit the use-after-free vulnerabilities in Linux kernel reliably. The insight of our exploit strategy is that a probabilistic memory collision can be constructed according to the widely deployed kernel memory reuse mechanisms, which significantly increases the success rate of the attack. Based on this insight, we present two practical memory collision attacks: An object-based attack that leverages the memory recycling mechanism of the kernel allocator to achieve freed vulnerable object covering, and a physmap-based attack that takes advantage of the overlap between the physmap and the SLAB caches to achieve a more flexible memory manipulation. Our proposed attacks are universal for various Linux kernels of different architectures and could successfully exploit systems with use-after-free vulnerabilities in kernel. Particularly, we achieve privilege escalation on various popular Android devices (kernel version{\textgreater}=4.3) including those with 64-bit processors by exploiting the CVE-2015-3636 use-after-free vulnerability in Linux kernel. To our knowledge, this is the first generic kernel exploit for the latest version of Android. Finally, to defend this kind of memory collision, we propose two corresponding mitigation schemes.},
|
||||
author = {Xu, Wen and Li, Juanru and Shu, Junliang and Yang, Wenbo and Xie, Tianyi and Zhang, Yuanyuan and Gu, Dawu},
|
||||
doi = {10.1145/2810103.2813637},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/From Collision To Exploitation$\backslash$: Unleashing Use-After-Free Vulnerabilities in Linux Kernel.pdf:pdf},
|
||||
isbn = {978-1-4503-3832-5},
|
||||
issn = {15437221},
|
||||
journal = {Ccs},
|
||||
keywords = {linux kernel exploit,memory collision,user-after-free vulnerability},
|
||||
pages = {414--425},
|
||||
title = {{From Collision To Exploitation: Unleashing Use-After-Free Vulnerabilities in Linux Kernel}},
|
||||
url = {http://dl.acm.org/citation.cfm?doid=2810103.2813637},
|
||||
year = {2015}
|
||||
}
|
||||
@article{Merity2016,
|
||||
abstract = {Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus.},
|
||||
archivePrefix = {arXiv},
|
||||
arxivId = {1609.07843},
|
||||
author = {Merity, Stephen and Xiong, Caiming and Bradbury, James and Socher, Richard},
|
||||
eprint = {1609.07843},
|
||||
journal = {Arxiv},
|
||||
title = {{Pointer Sentinel Mixture Models}},
|
||||
url = {http://arxiv.org/abs/1609.07843},
|
||||
year = {2016}
|
||||
}
|
||||
@inproceedings{Ma2013,
|
||||
abstract = {—Aiming at the problem of higher memory consumption and lower execution efficiency during the dynamic detecting to C/C++ programs memory vulnerabilities, this paper presents a dynamic detection method called ISC. The ISC improves the Safe-C using pointer analysis technology. Firstly, the ISC defines a simple and efficient fat pointer representation instead of the safe pointer in the Safe-C. Furthermore, the ISC uses the unification-based analysis algorithm with one level flow static pointer. This identification reduces the number of pointers that need to be converted to fat pointers. Then in the process of program running, the ISC detects memory vulnerabilities through constantly inspecting the attributes of fat pointers. Experimental results indicate that the ISC could detect memory vulnerabilities such as buffer overflows and dangling pointers. Comparing with the Safe-C, the ISC dramatically reduces the memory consumption and lightly improves the execution efficiency.},
|
||||
author = {Ma, Rui and Chen, Lingkui and Hu, Changzhen and Xue, Jingfeng and Zhao, Xiaolin},
|
||||
booktitle = {Proceedings - 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing, DASC 2013},
|
||||
doi = {10.1109/DASC.2013.37},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/A Dynamic Detection Method to C-C++ Programs Memory Vulnerabilities Based on Pointer Analysis.pdf:pdf},
|
||||
isbn = {9781479933815},
|
||||
keywords = {dynamic detecting,fat pointer,improved Safe-C,memory vulnerability,pointer analysis},
|
||||
pages = {52--57},
|
||||
title = {{A dynamic detection method to C/C++ programs memory vulnerabilities based on pointer analysis}},
|
||||
year = {2013}
|
||||
}
|
||||
@article{Corporation2011,
|
||||
abstract = {The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volume 1, describes the basic architecture and programming environment of Intel 64 and IA-32 processors. The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volumes 2A {\&} 2B, describe the instruction set of the processor and the opcode struc- ture. These volumes apply to application programmers and to programmers who write operating systems or executives. The Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volumes 3A {\&} 3B, describe the operating-system support environment of Intel 64 and IA-32 processors. These volumes target operating- system and BIOS designers. In addition, the Intel{\{}$\backslash$textregistered{\}} 64 and IA-32 Architectures Software Developer's Manual, Volume 3B, addresses the programming environment for classes of software that host operating systems.},
|
||||
author = {Corporation, Intel},
|
||||
doi = {10.1109/MAHC.2010.22},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf:pdf},
|
||||
isbn = {253665-057US},
|
||||
issn = {15222594},
|
||||
journal = {System},
|
||||
keywords = {253665,IA-32 architecture,Intel 64},
|
||||
number = {253665},
|
||||
title = {{Intel {\textregistered} 64 and IA-32 Architectures Software Developer ' s Manual Volume 3}},
|
||||
volume = {3},
|
||||
year = {2011}
|
||||
}
|
||||
@article{Balasubramanian2017,
|
||||
abstract = {Rust is a new system programming language that offers a practical and safe alternative to C. Rust is unique in that it enforces safety without runtime overhead, most importantly, without the overhead of garbage collection. While zero-cost safety is remarkable on its own, we argue that the super-powers of Rust go beyond safety. In particular, Rust's linear type system enables capabilities that cannot be implemented efficiently in traditional languages, both safe and unsafe, and that dramatically improve security and reliability of system software. We show three examples of such capabilities: zero-copy software fault isolation, efficient static information flow analysis, and automatic checkpointing. While these capabilities have been in the spotlight of systems research for a long time, their practical use is hindered by high cost and complexity. We argue that with the adoption of Rust these mechanisms will become commoditized.},
|
||||
author = {Balasubramanian, Abhiram and Baranowski, Marek S and Burtsev, Anton and Irvine, Uc and Rakamari, Zvonimir and Ryzhyk, Leonid and Research, Vmware},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/DRAFT$\backslash$: System Programming in Rust$\backslash$: Beyond Safety.pdf:pdf},
|
||||
title = {{DRAFT: System Programming in Rust: Beyond Safety}},
|
||||
year = {2017}
|
||||
}
|
||||
@article{Chisnall2015,
|
||||
abstract = {We propose a new memory-safe interpretation of the C ab-stract machine that provides stronger protection to benefit security and debugging. Despite ambiguities in the specifi-cation intended to provide implementation flexibility, con-temporary implementations of C have converged on a mem-ory model similar to the PDP-11, the original target for C. This model lacks support for memory safety despite well-documented impacts on security and reliability. Attempts to change this model are often hampered by as-sumptions embedded in a large body of existing C code, dat-ing back to the memory model exposed by the original C compiler for the PDP-11. Our experience with attempting to implement a memory-safe variant of C on the CHERI ex-perimental microprocessor led us to identify a number of problematic idioms. We describe these as well as their in-teraction with existing memory safety schemes and the as-sumptions that they make beyond the requirements of the C specification. Finally, we refine the CHERI ISA and abstract model for C, by combining elements of the CHERI capabil-ity model and fat pointers, and present a softcore CPU that implements a C abstract machine that can run legacy C code with strong memory protection guarantees.},
|
||||
author = {Chisnall, David and Rothwell, Colin and Watson, Robert N M and Woodruff, Jonathan and Vadera, Munraj and Moore, Simon W and Roe, Michael and Davis, Brooks and Neumann, Peter G},
|
||||
|
@ -242,17 +141,97 @@ title = {{Memory safety without runtime checks or garbage collection}},
|
|||
volume = {38},
|
||||
year = {2003}
|
||||
}
|
||||
@misc{MITRE-CWE,
|
||||
author = {MITRE},
|
||||
title = {{CWE - Common Weakness Enumeration}},
|
||||
url = {http://cwe.mitre.org},
|
||||
urldate = {2017-08-31},
|
||||
year = {2017}
|
||||
}
|
||||
@article{Affairs2015,
|
||||
author = {Beingessner, Alexis},
|
||||
file = {:home/steveej/src/steveej/msc-thesis/docs/You can't spell trust without Rust.pdf:pdf},
|
||||
note = {Master's thesis, Carleton University},
title = {{You Can't Spell Trust Without Rust}},
|
||||
year = {2015}
|
||||
}
|
||||
@inproceedings{Kuznetsov2014,
|
||||
abstract = {Systems code is often written in low-level languages like C/C++, which offer many benefits but also dele- gate memory management to programmers. This invites memory safety bugs that attackers can exploit to divert control flow and compromise the system. Deployed de- fense mechanisms (e.g., ASLR, DEP) are incomplete, and stronger defense mechanisms (e.g., CFI) often have high overhead and limited guarantees [19, 15, 9]. We introduce code-pointer integrity (CPI), a new de- sign point that guarantees the integrity of all code point- ers in a program (e.g., function pointers, saved return ad- dresses) and thereby prevents all control-flow hijack at- tacks, including return-oriented programming. We also introduce code-pointer separation (CPS), a relaxation of CPI with better performance properties. CPI and CPS offer substantially better security-to-overhead ratios than the state of the art, they are practical (we protect a complete FreeBSD system and over 100 packages like apache and postgresql), effective (prevent all attacks in the RIPE benchmark), and efficient: on SPEC CPU2006, CPS averages 1.2{\%} overhead for C and 1.9{\%} for C/C++, while CPI's overhead is 2.9{\%} for C and 8.4{\%} for C/C++. A prototype implementation of CPI and CPS can be obtained from http://levee.epfl.ch. 1},
|
||||
author = {Kuznetsov, Volodymyr and Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias},
|
||||
booktitle = {Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation},
|
||||
isbn = {9781931971164},
|
||||
pages = {147--163},
|
||||
title = {{Code-pointer integrity}},
|
||||
url = {https://www.usenix.org/conference/osdi14/technical-sessions/presentation/kuznetsov{\%}5Cnhttps://www.usenix.org/system/files/conference/osdi14/osdi14-paper-kuznetsov.pdf?utm{\_}source=dlvr.it{\&}utm{\_}medium=tumblr},
|
||||
year = {2014}
|
||||
}
|
||||
@article{Caballero2012,
|
||||
abstract = {Use-after-free vulnerabilities are rapidly growing in popularity, especially for exploiting web browsers. Use-after-free (and double-free) vulnerabilities are caused by a program operating on a dangling pointer. In this work we propose early detection, a novel runtime approach for finding and diagnosing use-after-free and double-free vulnerabilities. While previous work focuses on the creation of the vulnerability (i.e., the use of a dangling pointer), early detection shifts the focus to the creation of the dangling pointer(s) at the root of the vulnerability. Early detection increases the effectiveness of testing by identifying unsafe dangling pointers in executions where they are created but not used. It also accelerates vulnerability analysis and minimizes the risk of incomplete fixes, by automatically collecting information about all dangling pointers involved in the vulnerability. We implement our early detection technique in a tool called Undangle. We evaluate Undangle for vulnerability analysis on 8 real-world vulnerabilities. The analysis uncovers that two separate vulnerabilities in Firefox had a common root cause and that their patches did not completely fix the underlying bug. We also evaluate Undangle for testing on the Firefox web browser identifying a potential vulnerability.},
|
||||
author = {Caballero, Juan and Grieco, Gustavo and Marron, Mark and Nappa, Antonio},
|
||||
doi = {10.1145/2338965.2336769},
|
||||
isbn = {9781450314541},
|
||||
issn = {1450314546},
|
||||
journal = {ISSTA},
|
||||
keywords = {automated testing,binary analysis,debugging,dynamic analysis},
|
||||
pages = {133},
|
||||
title = {{Undangle: early detection of dangling pointers in use-after-free and double-free vulnerabilities}},
|
||||
url = {http://dl.acm.org/citation.cfm?doid=2338965.2336769},
|
||||
year = {2012}
|
||||
}
|
||||
@book{AMD64Vol2,
|
||||
author = {AMD},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/AMD64 Architecture Programmer's Manual Volume 2$\backslash$: System Programming.pdf:pdf},
|
||||
keywords = {24593,AMD64 Architecture Programmer's Manual Volume 2: S},
|
||||
number = {24592},
|
||||
title = {{AMD64 Architecture Programmer's Manual Volume 2: System Programming}},
|
||||
volume = {1},
|
||||
year = {2012}
|
||||
}
|
||||
@article{Nilsson2017,
|
||||
author = {Nilsson, Fredrik},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/A Rust-based Runtime for the Internet of Things.pdf:pdf},
|
||||
title = {{A Rust-based Runtime for the Internet of Things}},
|
||||
year = {2017}
|
||||
}
|
||||
@article{Szekeres2013,
|
||||
abstract = {Memory corruption bugs in software written in low-level languages like C or C++ are one of the oldest problems in computer security. The lack of safety in these languages allows attackers to alter the program's behavior or take full control over it by hijacking its control flow. This problem has existed for more than 30 years and a vast number of potential solutions have been proposed, yet memory corruption attacks continue to pose a serious threat. Real world exploits show that all currently deployed protections can be defeated. This paper sheds light on the primary reasons for this by describing attacks that succeed on today's systems. We systematize the current knowledge about various protection techniques by setting up a general model for memory corrup- tion attacks. Using this model we show what policies can stop which attacks. The model identifies weaknesses of currently deployed techniques, as well as other proposed protections enforcing stricter policies. We analyze the reasons why protection mechanisms imple- menting stricter polices are not deployed. To achieve wide adoption, protection mechanisms must support a multitude of features and must satisfy a host of requirements. Especially important is performance, as experience shows that only solutions whose overhead is in reasonable bounds get deployed. A comparison of different enforceable policies helps de- signers of new protection mechanisms in finding the balance between effectiveness (security) and efficiency.We identify some open research problems, and provide suggestions on improving the adoption of newer techniques.},
|
||||
author = {Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias and Wei, Tao and Song, Dawn},
|
||||
doi = {10.1109/SP.2013.13},
|
||||
file = {:home/steveej/src/github/steveej/msc-thesis/docs/SoK$\backslash$: Eternal War in Memory.pdf:pdf},
|
||||
isbn = {9780769549774},
|
||||
issn = {10816011},
|
||||
journal = {Proceedings - IEEE Symposium on Security and Privacy},
|
||||
pages = {48--62},
|
||||
title = {{SoK: Eternal war in memory}},
|
||||
year = {2013}
|
||||
}
|
||||

@book{AMD64Vol1,
author = {AMD},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/AMD64 Architecture Programmer's Manual Volume 1$\backslash$: Application Programming.pdf:pdf},
keywords = {AMD64,SIMD,extended media instructions,legacy m},
number = {26568},
title = {{AMD64 Architecture Programmer's Manual Volume 1: Application Programming}},
volume = {4},
year = {2012}
}

@article{Getreu2016,
annote = {- runtime checks are expensive

- critical with energy restriction on the target device},
author = {Getreu, Jens},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/Embedded System Security with Rust - Case Study of Heartbleed.pdf:pdf},
pages = {1--24},
title = {{Embedded System Security with Rust}},
year = {2016}
}

@article{Reed2015,
abstract = {Rust is a new systems language that uses some advanced type system features, specifically affine types and regions, to statically guarantee memory safety and eliminate the need for a garbage collector. While each individual addition to the type system is well understood in isolation and known to be sound, the combined system is not known to be sound. Furthermore, Rust uses a novel checking scheme for its regions, known as the Borrow Checker, that is not known to be correct. Since Rust's goal is to be a safer alternative to C/C++, we should ensure that this safety scheme actually works. We present a formal semantics that captures the key features relevant to memory safety, unique pointers and borrowed references, specifies how they guarantee memory safety, and describes the operation of the Borrow Checker. We use this model to prove the soundness of some core operations and justify the conjecture that the model, as a whole, is sound. Additionally, our model provides a syntactic version of the Borrow Checker, which may be more understandable than the non-syntactic version in Rust.},
author = {Reed, Eric},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/Patina$\backslash$: A Formalization of the Rust Programming Language.pdf:pdf},
number = {February},
pages = {1--37},
title = {{Patina: A Formalization of the Rust Programming Language}},
year = {2015}
}

@misc{MITRE-CWE-633,
author = {MITRE},
title = {{CWE-633: Weaknesses that Affect Memory}},

@ -260,6 +239,11 @@ url = {http://cwe.mitre.org/data/definitions/633.html},
urldate = {2017-08-31},
year = {2017}
}

@misc{Endler,
author = {Endler, Matthias},
title = {{A curated list of static analysis tools, linters and code quality checkers for various programming languages}},
url = {https://github.com/mre/awesome-static-analysis}
}

@article{Arpaci-Dusseau2015,
abstract = {A book covering the fundamentals of operating systems, including virtualization of the CPU and memory, threads and concurrency, and file and storage systems. Written by professors active in the field for 20 years, this text has been developed in the classrooms of the University of Wisconsin-Madison, and has been used in the instruction of thousands of students.},
author = {Arpaci-Dusseau, Remzi and Arpaci-Dusseau, Andrea},

@ -271,9 +255,11 @@ title = {{Operating Systems: Three Easy Pieces}},
volume = {Electronic},
year = {2015}
}

@article{Affairs2015,
author = {Beingessner, Alexis},
file = {:home/steveej/src/steveej/msc-thesis/docs/You can't spell trust without Rust.pdf:pdf},
note = {Master's in Computer Science, Carleton University},
title = {{You Can't Spell Trust Without Rust}},
year = {2015}
}

@misc{MITRE-CWE-119,
author = {MITRE},
booktitle = {2.11},
title = {{CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer}},
url = {http://cwe.mitre.org/data/definitions/119.html},
urldate = {2017-08-31},
year = {2017}
}

@ -11,7 +11,10 @@
\geometry{a4paper, top=25mm, left=30mm, right=35mm, bottom=35mm, headsep=10mm, footskip=12mm}

\usepackage{multirow,tabularx,tabu}
\usepackage{ctable,multirow,spreadtab}
\usepackage{spreadtab}
\usepackage{colortbl}
\usepackage[dvipsnames]{xcolor}
\usepackage{hhline}

\usepackage[backend=biber,style=numeric,citestyle=numeric,url=true]{biblatex}
\addbibresource{thesis.bib}
@ -19,23 +22,37 @@
%\usepackage[hyphens]{url}
\Urlmuskip = 0mu plus 1mu

\hyphenpenalty=500
\tolerance=10000
%\hyphenpenalty=1
\pretolerance=5000
\tolerance=5000
%\exhyphenpenalty=1

\usepackage[numberedsection,toc,numberline,nopostdot]{glossaries}
\makenoidxglossaries

\usepackage{listings}
\providecommand*{\listingautorefname}{Listing}
\usepackage{minted}
\usepackage{graphicx}
\usepackage{placeins}
\usepackage{tikz}
\usetikzlibrary{tikzmark,mindmap}
\usetikzlibrary{chains,shapes.arrows, arrows, positioning,decorations.pathreplacing,bending}
\usetikzlibrary{calc}
\usepackage{smartdiagram}
\usepackage{color}

\usepackage{caption}
\usepackage{subcaption}

\tikzset{/minted/basename/.initial=minted}
\appto\theFancyVerbLine{\tikzmark{\pgfkeysvalueof{/minted/basename}\arabic{FancyVerbLine}}}
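% Note on the two lines above (reading of intent, not original documentation):
% every line number that minted/fancyvrb prints additionally emits
% \tikzmark{<basename><n>} (e.g. \tikzmark{minted3}), so TikZ overlays can
% later attach decorations to individual listing lines.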

\usepackage[parfill]{parskip}

\usepackage{amsmath}

\newcommand{\iitemA}{\setlength\itemindent{0pt}\item}
\newcommand{\iitemB}{\setlength\itemindent{25pt}\item}
\newcommand{\iitemC}{\setlength\itemindent{50pt}\item}
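% Usage sketch (illustrative, not part of the original preamble): the \iitem*
% helpers set the indentation of individual items inside one list, e.g.
%   \begin{itemize}
%     \iitemA top-level point
%     \iitemB indented sub-point
%     \iitemC further indented sub-point
%   \end{itemize}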
\usepackage{etoolbox}

\newcommand{\topic}{Guarantees On In-Kernel Memory-Safety Using Rust's Static Code Analysis}

@ -103,9 +120,13 @@
\titleformat{\chapter}[hang]{\normalfont\Large\bfseries}{\thechapter}{0.5cm}{}

\usepackage{hyperref}
\usepackage{cleveref}

\makeatletter

\newcommand{\cnameref}[1]{\cref{#1} \textit{(\nameref{#1})}}
\newcommand{\Cnameref}[1]{\Cref{#1} \textit{(\nameref{#1})}}
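% Usage sketch (illustrative; the label name is hypothetical): given
% \label{sec:stack}, \cnameref{sec:stack} expands to
% "\cref{sec:stack} \textit{(\nameref{sec:stack})}", i.e. the section number
% followed by its title in italic parentheses; \Cnameref{sec:stack} is the
% capitalized variant for the start of a sentence.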

%\renewcommand\paragraph{\startsection{paragraph}{4}{\z}%
% {-3.25ex\plus -1ex \minus -.2ex}%
% {0.0001pt \plus 0.2ex}%
@ -114,6 +135,60 @@
{-3.25ex\plus -1ex \minus -.2ex}%
{0.0001pt \plus 0.2ex}%
{\normalfont\normalsize\bfseries}}

\providecommand{\iitemA}{\setlength\itemindent{0pt}\item}
\providecommand{\iitemB}{\setlength\itemindent{25pt}\item}
\providecommand{\iitemC}{\setlength\itemindent{50pt}\item}

\let\Partmark\partmark
\def\partmark#1{\def\Partname{#1}\Partmark{#1}}
\let\Chaptermark\chaptermark
\def\chaptermark#1{\def\Chaptername{#1}\Chaptermark{#1}}
\let\Sectionmark\sectionmark
\def\sectionmark#1{\def\Sectionname{#1}\Sectionmark{#1}}
\let\Subsectionmark\subsectionmark
\def\subsectionmark#1{\def\Subsectionname{#1}\Subsectionmark{#1}}
\let\Subsubsectionmark\subsubsectionmark
\def\subsubsectionmark#1{\def\Subsubsectionname{#1}\Subsubsectionmark{#1}}
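% Usage sketch (illustrative; assumes a header package such as fancyhdr is set
% up elsewhere in the preamble): the \...mark wrappers above store the current
% heading texts in plain macros, so a running header can reuse them, e.g.
%   \fancyhead[L]{\Chaptername\ -- \Sectionname}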

\newenvironment{compactminted}{%
\VerbatimEnvironment
\let\FV@ListVSpace\relax
\begin{minted}}%
{\end{minted}}
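% Usage sketch (illustrative): compactminted behaves like a plain minted
% environment but with fancyvrb's extra list spacing suppressed
% (\FV@ListVSpace is relaxed), e.g.
%   \begin{compactminted}{rust}
%   fn main() { println!("hello"); }
%   \end{compactminted}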

\tikzset{west above/.code=\tikz@lib@place@handle@{#1}{south west}{0}{1}{north west}{1}}
\tikzset{west below/.code=\tikz@lib@place@handle@{#1}{north west}{0}{-1}{south west}{1}}
\tikzset{east above/.code=\tikz@lib@place@handle@{#1}{south east}{0}{1}{north east}{1}}
\tikzset{east below/.code=\tikz@lib@place@handle@{#1}{north east}{0}{-1}{south east}{1}}
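% Usage sketch (illustrative; inferred from how the positioning library's own
% above/below keys call \tikz@lib@place@handle@): these keys align corner
% anchors instead of edge midpoints, e.g.
%   \node (a) {A};
%   \node[west above=5mm of a] (b) {B};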

% Tikzmark code helpers {
\newcommand{\tikzmarkprefix}{\pgfkeysvalueof{/tikz/tikzmark prefix}}
\newcommand{\tikzmarkcountprep}[1]{%
  \tikzset{tikzmark prefix=#1}%
  %\tikzset{tikzmark prefixes/#1/counter/.initial=0}%
  \newcounter{Tikzcounter#1}%
}
\newcommand{\tikzmarkcount}[1][\tikzmarkprefix]{%
  \stepcounter{Tikzcounter#1}%
  \tikzmark{\arabic{Tikzcounter#1}}%
}
\newcommand{\tikzmarkgetcount}[1][\tikzmarkprefix]{%
  \expandafter\arabic\expandafter{Tikzcounter#1}%
}
\newcommand{\tikzmarkcircle}[1]{%
  \tikz[baseline=-0.77ex]\fill circle[fill=black,radius=1.1ex] node[font=\small,color=white]{#1};
}
\newcommand{\tikzmarkdrawcircles}{%
  \begin{tikzpicture}[remember picture,overlay]
    \foreach \x in {1,...,\expandafter\arabic\expandafter{Tikzcounter\expandafter\tikzmarkprefix}}
      \fill (pic cs:\x)+(1.3ex,0.5ex) circle[fill=black,radius=1.1ex,anchor=west] node[font=\small,color=white]{$\x$};
  \end{tikzpicture}%
}

% }
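% Usage sketch (illustrative; the prefix name "stack" is hypothetical): the
% helpers above keep a per-prefix counter so numbered marks can be dropped into
% text or listings and circled afterwards in an overlay (needs the usual two
% passes for TikZ's remember picture), e.g.
%   \tikzmarkcountprep{stack}%
%   first step\tikzmarkcount{} ... second step\tikzmarkcount{} ...
%   \tikzmarkdrawcircles
%   and \tikzmarkcircle{1} typesets the matching inline circled number.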

\makeatother
\include{glossary}