
% // vim: set ft=tex:
\chapter{Refined Research Questions}

\section{Software Tests}
% TODO: describe that tests are mostly semantics as opposed to static checks being mostly syntactical and technical
% TODO: Are they necessary in addition to static checks to cover the well-known use-cases and edge-cases.
% TODO: example?

\section{Definition Of Additional Analysis Rules To Extend Safety Checks}
% TODO: How can Business Logical
% Examples: 
% TLB needs to be reset on Task Change
% Registers need to be 

\subsection{Paging}
Setting up and maintaining the paging structure, as well as allocating physical memory for the virtual pages, is a complex task in the \gls{os}.
Developing this part of the \gls{os} is error-prone and not well supported by mainstream \glspl{proglang}.

\section{Software Fault Isolation}
% TODO: content from \cite{Balasubramanian2017}

% TODO Which language items help with managing memory?
% TODO How generic can the memory allocators be written?

% TODO Guarantees to be statically checked:
% TODO * Control access to duplicates in page tables
% TODO * Tasks can't access unallocated (physical) memory
% TODO * Tasks can't access other tasks memory


\chapter{System Programming Conventions}
\label{rnd::sysprog-conventions}

\section{Stack Frame Handling on AMD64}
\label{rnd::sysprog-conventions::stackframe-amd64}
The usage of the \gls{stack} is tightly coupled with control flow instructions in conjunction with two registers, the Stack-Frame Base Pointer (RBP) and the Stack Pointer (RSP).
The instructions that use these registers and explicitly or implicitly work with the stack\cite[p.~83]{AMD64Vol1} can be grouped into the following categories.
Together they can be used to perform \gls{stack} based procedure calls, as demonstrated in the following \cref{context::introduction::hw-supported-mm::procedure-call-example}.

\paragraph{Direct Stack Data Management} with PUSH and POP.

PUSH takes a value operand, which is pushed onto the stack.
The address in RSP moves towards numerically lower addresses with every PUSH instruction, which stores a new data entry on top.
The \gls{cpu} first changes RSP and then copies the value to the new address.

POP takes a storage reference operand, either a \gls{cpu} register or a memory address.
It works in the opposite direction to PUSH:
it first consumes the top-most data entry and stores it at the operand location, and then moves the RSP address towards the numerically higher RBP address.

When RBP and RSP point to the same address, the stack is considered empty.
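
The PUSH and POP semantics described above can be sketched as a small software model.
The following \gls{Rust} code is a minimal illustration and not an emulation of the real instructions: the type \textit{Stack} and its methods are hypothetical names, the memory is a plain vector of 64-Bit slots, and RSP and RBP are modelled as byte addresses into that vector.

\begin{minted}[breaklines=true]{rust}
/// Hypothetical model of a downward-growing stack of 64-bit entries.
/// `rsp` and `rbp` are byte addresses, one slot of `mem` covers 8 bytes.
pub struct Stack {
    mem: Vec<u64>,
    pub rbp: u64, // stack-frame base pointer
    pub rsp: u64, // stack pointer
}

impl Stack {
    /// Creates an empty stack with room for `slots` entries.
    pub fn new(slots: usize) -> Self {
        let top = (slots as u64) * 8;
        Stack { mem: vec![0; slots], rbp: top, rsp: top }
    }

    /// PUSH: first move RSP to the lower address, then store the value there.
    pub fn push(&mut self, value: u64) {
        assert!(self.rsp >= 8, "stack overflow");
        self.rsp -= 8;
        self.mem[(self.rsp / 8) as usize] = value;
    }

    /// POP: first read the top-most entry, then move RSP to the higher address.
    pub fn pop(&mut self) -> u64 {
        assert!(self.rsp < self.rbp, "pop on empty stack");
        let value = self.mem[(self.rsp / 8) as usize];
        self.rsp += 8;
        value
    }

    /// The stack is considered empty when RBP and RSP hold the same address.
    pub fn is_empty(&self) -> bool {
        self.rsp == self.rbp
    }
}
\end{minted}

In this model, pushing two values onto a stack created with four slots lowers \textit{rsp} from 32 to 16, and popping them returns the values in reverse order and leaves the stack empty again.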

\paragraph{Procedure Calls} with CALL and RET. \\
These instructions control the instruction flow by calling a procedure\footnote{loosely synonymous with function} and returning from it.

The CALL instruction takes the address of the instruction that is to be called.
Before jumping to the instruction at the given address, it PUSHes the current value of the RIP (instruction pointer) register, which at this point holds the address of the instruction following the CALL, onto the \gls{stack}.

RET takes no operand, but instead POPs the \gls{stack}'s top entry.
The consumed value is used as a jump address.

As PUSH and POP use the RSP register, the called procedure is responsible for finishing with RSP at the same position as when it was entered.
For example, PUSHing a value onto the stack without POPping it before the end of the function would cause RET to interpret that value as the return address and jump to it instead of returning to the caller.
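
The interplay of CALL, RET and an unbalanced PUSH can be illustrated with a small model.
This is a sketch under simplifying assumptions: the type \textit{Machine}, its methods and all addresses are hypothetical, and the return address is passed to \textit{call} explicitly instead of being derived from the length of the CALL instruction.

\begin{minted}[breaklines=true]{rust}
/// Hypothetical model of the control-flow state of a CPU: only RIP, RSP
/// and a downward-growing stack of 64-bit slots are represented.
pub struct Machine {
    stack: Vec<u64>,
    pub rsp: u64, // byte address of the top-most stack entry
    pub rip: u64, // address of the next instruction
}

impl Machine {
    /// Creates a machine with an empty stack of `slots` entries.
    pub fn new(slots: usize, entry: u64) -> Self {
        Machine { stack: vec![0; slots], rsp: (slots as u64) * 8, rip: entry }
    }

    /// PUSH: decrement RSP, then store the value at the new address.
    pub fn push(&mut self, value: u64) {
        self.rsp -= 8;
        self.stack[(self.rsp / 8) as usize] = value;
    }

    /// POP: load the top-most entry, then increment RSP.
    pub fn pop(&mut self) -> u64 {
        let value = self.stack[(self.rsp / 8) as usize];
        self.rsp += 8;
        value
    }

    /// CALL: push the return address, then jump to `target`.
    /// `return_address` stands for the address of the instruction after the CALL.
    pub fn call(&mut self, target: u64, return_address: u64) {
        self.push(return_address);
        self.rip = target;
    }

    /// RET: pop the top-most entry and use it as the jump address.
    pub fn ret(&mut self) {
        self.rip = self.pop();
    }
}
\end{minted}

With a balanced stack, \textit{ret} continues at the return address that \textit{call} pushed.
If the callee pushes a value and does not pop it before \textit{ret}, the model jumps to that value instead, which mirrors the failure described above.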

\paragraph{Called Procedure Setup} \emph{not} with ENTER and LEAVE.

When a procedure is called, the stack is set up with the \gls{sf}, which consists of the four components listed in \cref{lst:amd64-stack-frame-components}\cite[p.~48]{AMD64Vol1}.

\begin{listing}[h]
\begin{enumerate}
    \item{%
        Parameters passed to the called procedure (created by the calling procedure). \\
        \textit{Only if parameters don't fit the \gls{cpu} registers}
    }
    \item{%
            Return address (created by the CALL instruction). \\
        \textit{Always used by CALL}
    }
    \item{%
        Array of stack-frame pointers (pointers to stack frames of procedures with smaller nesting-level depth) which are used to access the local variables of such procedures. \\
        \textit{Depends on support and implementation of nested functions in the \gls{compiler}}
    }
    \item{%
        Local variables used by the called procedure. \\
        \textit{This includes the variables passed via \gls{cpu} registers}
    }
\end{enumerate}
\caption{\glsentrytext{amd64} Stack-Frame Components}
\label{lst:amd64-stack-frame-components}
\end{listing}
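Item 1 is only necessary when there aren't enough \gls{cpu} registers to pass the parameters.
Item 3 is only necessary when the \gls{compiler} supports nested functions that access the local variables of their enclosing procedures.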

The \gls{amd64} manual also lists ENTER and LEAVE as instructions to \textit{"provide support for procedure calls, and are mainly used in high-level languages."}\cite[p.~48]{AMD64Vol1}.
The latter claim could not be verified by inspecting binaries produced by the \gls{C} and \gls{Rust} \glspl{compiler}.

Instead, these \glspl{compiler} generate a sequence of PUSH, MOV and SUB instructions to set up the \gls{stack}.
There are instructions before and after the procedure's logic, taking care of the technicalities of \gls{stack} management.
These instruction groups within the called procedure are called prologue and epilogue.

\subsection{Full Procedure Call Example}
\label{context::introduction::hw-supported-mm::procedure-call-example}
This section combines the separate categories into one complete example that shows how the \gls{stack} is used by various \gls{cpu} instructions to perform procedure calls.
The following code samples are extracted from a disassembled binary which was originally created using \gls{Rust}.
The assembly code shown uses Intel syntax, in which the destination operand is written first, so data generally moves from right to left.
For example, \mintinline{nasm}{mov a, b} copies b to a.

\cref{code::context::examples::func-callee-rust} shows the \gls{Rust} source code of the function \textit{sum}.


% \subsubsection{Top-Level Page Table Self-Reference}
% \subsubsection{Caching Lookups}
% \subsubsection{Full Example}
% * http://taptipalit.blogspot.de/2013/10/theory-recursive-mapping-page.html
% * https://www.coresecurity.com/blog/getting-physical-extreme-abuse-of-intel-based-paging-systems-part-2-windows

\begin{listing}[htb]
    \tikzset{/minted/basename=callee-c}
    \begin{minted}[autogobble,linenos,breaklines=true]{rust}
    TODO
    \end{minted}
    \caption{The called function in \gls{Rust}}
    \label{code::context::examples::func-callee-rust}
\end{listing}

\cref{code::context::examples::func-call-asm} shows a snippet of the calling function.
It stores the arguments within the registers according to the System V x86\_64 calling convention. %TODO REFERENCE
The caller doesn't alter the stack-frame pointer (RBP) or the stack pointer (RSP) registers before the call, hence the called function must restore these if it alters them.

\begin{listing}
    \begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{rust}
    TODO
    \end{minted}
    \caption{Procedure Call Example: Caller Rust}
    \label{code::context::examples::func-call-rust}
\end{listing}

\begin{listing}
    \begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{nasm}
    TODO
    \end{minted}
    \caption{Procedure Call Example: Caller Assembly}
    \label{code::context::examples::func-call-asm}
\end{listing}

% \balloon{comment}{

% RDI, RSI, RDX, RCX, R8, R9, XMM0–7

\begin{table}[ht!]
    \tikzmark{precallto}
    \centering
    \begin{tabular}{ r | >{\columncolor{YellowGreen}}c | l }
        \multicolumn{1}{r}{RBP offset} & \multicolumn{1}{c}{Content} & \\
        $\uparrow$ & \cellcolor{white} & \\
        & \cellcolor{white} \dots \textit{beyond current stack} \dots & \\
        \hhline{~-~} 
        0 & \textit{Previous RBP} & $\leftarrow$ RBP \\
        \hhline{~-~} 
        \vdots & \dots~~\textit{local variables}~~\dots & \\
        \hhline{~-~} 
        -0x30 & 3rd arg & \\
        \hhline{~|-|~}
        -0x38 & 2nd arg & \\
        \hhline{~-~}
        -0x40 & 1st arg & \\
        \hhline{~-~} 
        \vdots & \dots~~\textit{local variables}~~\dots & \\
        \hhline{~-~} 
        -0x60 & rdi & \\
        \hhline{~-~} 
        & \dots~~\textit{local variables}~~\dots & \\
        \hhline{~-~} 
        $RBP-RSP$ & \textit{unknown} & $\leftarrow$ RSP \\
        \hhline{~-~} 
        & \cellcolor{white} & \\
        $\downarrow$ & \cellcolor{white} & \\
    \end{tabular}
\end{table}


\cref{code::context::examples::func-prologue} shows \textit{sum}'s prologue.
The corresponding epilogue is displayed in \cref{code::context::examples::func-epilogue}. 
The comments explain each instruction line by line.

\begin{listing}[ht!]
\begin{minted}[escapeinside=??,linenos=false,breaklines=true]{nasm}
$7490: push ?\tikzmark{prologuestart}?  rbp       ; save the stack-frame pointer on the stack
$7491: mov    rbp,rsp   ; set the stack-frame base pointer from the stack pointer
$7494: sub    rsp,0x50  ; allocate 0x50 Bytes for arguments and local variables
$7498: mov    QWORD PTR [rbp-0x30],rdi ; copy 1st arg onto stack 
$749c: mov    QWORD PTR [rbp-0x28],rsi ; copy 2nd arg onto stack 
$74a0: mov    QWORD PTR [rbp-0x20],rdx ; copy 3rd arg onto stack  
\end{minted}
\caption{Function Prologue with three Arguments}
\label{code::context::examples::func-prologue}
\end{listing}

\begin{tikzpicture}[remember picture]
    \draw[overlay,red,thick,dashed] (pic cs:precallto) circle [radius=7pt] node { \textbf{1} };
    \draw[overlay,red,thick,dashed] (pic cs:prologuestart) circle [radius=7pt] node { \textbf{1} };
\end{tikzpicture}

\begin{listing}[ht!]
\begin{minted}[linenos=true,breaklines=true]{nasm}
$74ee: mov    rax,QWORD PTR [rbp-0x48] ; store return value in RAX
$74f2: add    rsp,0x50                 ; set stack pointer to where stack-frame pointer was stored
$74f6: pop    rbp                      ; restore the stack-frame pointer
$74f7: ret                             ; return to the caller, following the address on the stack
\end{minted}
\caption{Function Epilogue}
\label{code::context::examples::func-epilogue}
\end{listing}
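
The effect of the prologue and epilogue on RSP and RBP can be replayed in a small model.
The following \gls{Rust} code is a sketch and not an emulation: the type \textit{Regs}, the helper functions and the addresses used with them are hypothetical, and the body of the procedure is omitted.
It illustrates that a matching prologue and epilogue leave both registers exactly as they were before the CALL.

\begin{minted}[breaklines=true]{rust}
/// Stack-related registers (hypothetical model, not a CPU emulation).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Regs {
    pub rsp: u64,
    pub rbp: u64,
}

/// Stores `value` in the 8-byte slot at byte address `addr`.
fn store(stack: &mut [u64], addr: u64, value: u64) {
    stack[(addr / 8) as usize] = value;
}

/// Loads the 8-byte slot at byte address `addr`.
fn load(stack: &[u64], addr: u64) -> u64 {
    stack[(addr / 8) as usize]
}

/// Replays CALL, prologue, epilogue and RET for a procedure that reserves
/// `frame_bytes` on the stack. Returns the registers after RET and the
/// address that RET jumps to.
pub fn call_and_return(
    mut r: Regs,
    stack: &mut [u64],
    frame_bytes: u64,
    return_address: u64,
) -> (Regs, u64) {
    // call: push the return address
    r.rsp -= 8;
    store(stack, r.rsp, return_address);
    // push rbp
    r.rsp -= 8;
    store(stack, r.rsp, r.rbp);
    // mov rbp, rsp
    r.rbp = r.rsp;
    // sub rsp, frame_bytes
    r.rsp -= frame_bytes;
    // ... the body of the procedure would run here ...
    // add rsp, frame_bytes
    r.rsp += frame_bytes;
    // pop rbp
    r.rbp = load(stack, r.rsp);
    r.rsp += 8;
    // ret: pop the return address
    let target = load(stack, r.rsp);
    r.rsp += 8;
    (r, target)
}
\end{minted}

Calling \textit{call\_and\_return} with a frame size of 0x50 Bytes, as in the prologue above, returns the same register values that were passed in, together with the return address that was pushed by the CALL.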

\cref{fig:proc-call-example-mem} displays the memory layout throughout the steps of the procedure call.

\begin{figure}
\centering
\includegraphics[width=0.95\textwidth,]{gfx/call-procedure-memory-content.png}
\caption{Memory Layout Throughout The Procedure Call Steps}
\label{fig:proc-call-example-mem}
\end{figure}
\FloatBarrier

\section{4-Level Paging Hierarchy on \glsentrytext{amd64}}
\label{rnd::sysprog-conventions::paging-amd64}
On \gls{amd64} "a four-level page-translation data structure is provided to allow long-mode operating systems to translate a 64-Bit virtual-address space into a 52-Bit physical-address space"\cite[p.~18]{AMD64Vol2}.
This allows the system to keep only the \textit{PML4} table, which is currently referenced by the \textit{Page Map Base Register (CR3)}, available in main memory.

\cref{fig:virtual-addr-transl} shows the 64-Bit virtual address composition on \gls{amd64}, which uses four levels of page tables.
Counterintuitively, the page tables are not called level-\textit{n} page table; instead, each level received a distinct name in \citetitle{AMD64Vol2}.
The most-significant Bits labelled as \textit{Sign Extend} are not used for addressing purposes, but must adhere to the canonical address form and simply repeat the value of the most-significant implemented Bit \cite[p.~130]{AMD64Vol2}.
The least significant Bits represent the offset within the physical page.
The four groups in between are used to index the page-table at their respective level.

\begin{figure}
\centering
\includegraphics[width=\textwidth]{gfx/Virtual-to-Physical-Address-Translation-Long-Mode.png}
\caption{Virtual to Physical Address in Long Mode\cite{AMD64Vol2}}
\label{fig:virtual-addr-transl}
\end{figure}
\subsubsection{Translation Scheme 4 KiB and 2 MiB Pages}
The \gls{amd64} architecture allows configuring the page size; two of the supported sizes are introduced in this section.
\cref{tab:page-transl-vaddr-composition} displays the virtual address composition for the 4 KiB and 2 MiB page-size modes on \gls{amd64}.
The rows of the table, from top to bottom, correspond to the fields of the virtual address from most significant to least significant, i.e. from left to right.
The \textit{sign extension} Bits cannot be used for actual information but act as a reservation for future architectural changes.

\begin{table}
    \begin{tabular}{l | c | c}
        Description & Bits in 4 KiB Pages & Bits in 2 MiB Pages \\
        \hline
        Sign Extend & 16 & 16 \\
        Page-Map-Level-4 Offset & 9 & 9 \\
        Page-Directory-Pointer Offset & 9 & 9 \\
        Page-Directory Offset & 9 & 9 \\
        Page-Table Offset & 9 & - \\
        Physical Page Offset & 12 & 21 \\
    \end{tabular}
    \caption{Paging on \gls{amd64}: Virtual Address Composition for 4 KiB and 2 MiB Page Sizes}
    \label{tab:page-transl-vaddr-composition}
\end{table}

\begin{figure}
\centering
\includegraphics[width=\textwidth]{gfx/amd64-4kb-page-translation-long-mode}
\caption{4-Kbyte Page Translation—Long Mode\cite{AMD64Vol2}}
\label{fig:4kb-page-transl}
\end{figure}

\cref{fig:4kb-page-transl} shows the detailed virtual address composition for 4 KiB pages, using four levels of page-tables.
It uses four sets of 9-Bit indices in the virtual address, one per hierarchy level, followed by the 12-Bit page-internal offset.
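
The decomposition for 4 KiB pages can be written down as a few shift and mask operations.
The following \gls{Rust} code is a minimal sketch: the names \textit{VirtAddrParts}, \textit{is\_canonical} and \textit{split\_4k} are hypothetical, and the code assumes 48 implemented address Bits, i.e. 16 sign-extension Bits, four 9-Bit indices and a 12-Bit offset.

\begin{minted}[breaklines=true]{rust}
/// Fields of a 64-bit virtual address for 4 KiB pages (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct VirtAddrParts {
    pub pml4: u16,   // bits 47..=39, index into the PML4 table
    pub pdp: u16,    // bits 38..=30, index into the page-directory-pointer table
    pub pd: u16,     // bits 29..=21, index into the page directory
    pub pt: u16,     // bits 20..=12, index into the page table
    pub offset: u16, // bits 11..=0, offset within the physical page
}

/// Returns true if bits 63..=48 repeat bit 47 (canonical address form).
pub fn is_canonical(vaddr: u64) -> bool {
    let upper = vaddr >> 47; // bit 47 followed by the 16 sign-extension bits
    upper == 0 || upper == 0x1_FFFF
}

/// Splits a canonical virtual address into its table indices and offset.
/// Returns `None` for non-canonical addresses.
pub fn split_4k(vaddr: u64) -> Option<VirtAddrParts> {
    if !is_canonical(vaddr) {
        return None;
    }
    Some(VirtAddrParts {
        pml4: ((vaddr >> 39) & 0x1FF) as u16,
        pdp: ((vaddr >> 30) & 0x1FF) as u16,
        pd: ((vaddr >> 21) & 0x1FF) as u16,
        pt: ((vaddr >> 12) & 0x1FF) as u16,
        offset: (vaddr & 0xFFF) as u16,
    })
}
\end{minted}

For the example address 0xFFFF800000201ABC, this sketch yields the PML4 index 256, the page-directory-pointer index 0, the page-directory index 1, the page-table index 1 and the offset 0xABC.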

An alternative approach is displayed in \cref{fig:2mb-page-transl}, using 2 MiB sized pages.
It uses three sets of 9-Bit indices for the page-tables, and a 21-Bit page-internal offset.
Increasing the page size speeds up address translation and reduces the memory used by the page tables, at the cost of a coarser allocation granularity.
In this specific example the hierarchy is reduced by one level of page-tables.
This reduces the overall amount of storage required for the page-tables and causes the lookup algorithm to finish faster.
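
The storage saving can be estimated with a short calculation.
The following \gls{Rust} sketch counts the tables needed to map a contiguous region; the function name \textit{tables\_needed} is hypothetical, and the calculation assumes that the region starts at virtual address 0, that every table holds 512 entries of 8 Bytes each, and that no table is shared with other mappings.

\begin{minted}[breaklines=true]{rust}
pub const ENTRIES_PER_TABLE: u64 = 512; // 9 index bits per level
pub const TABLE_BYTES: u64 = 4096; // 512 entries of 8 bytes each

/// Number of tables needed to map `bytes` of contiguous virtual memory
/// starting at address 0, using `levels` table levels and pages of
/// `page_bytes` bytes.
pub fn tables_needed(bytes: u64, page_bytes: u64, levels: u32) -> u64 {
    // Number of pages that have to be mapped, rounded up.
    let mut units = (bytes + page_bytes - 1) / page_bytes;
    let mut tables = 0;
    for _ in 0..levels {
        // Each table at this level covers up to 512 units of the level below.
        units = (units + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
        tables += units;
    }
    tables
}
\end{minted}

Under these assumptions, mapping 1 GiB with 4 KiB pages and four levels needs 515 tables (512 page tables plus one table on each higher level), or 2060 KiB, whereas 2 MiB pages and three levels need 3 tables, or 12 KiB.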

\begin{figure}
\centering
\includegraphics[width=\textwidth]{gfx/amd64-2mb-page-translation-long-mode}
\caption{2-Mbyte Page Translation—Long Mode\cite{AMD64Vol2}}
\label{fig:2mb-page-transl}
\end{figure}

The other supported page sizes, 4 MiB and 1 GiB, as well as intermixing page sizes through the different levels don't add new insight into the mechanism and don't need to be detailed here.


\section{Interrupt Driven Preemptive Context Switches on \glsentrytext{amd64}}
\label{rnd::sysprog-conventions::ir-driven-preemptive-cs-amd64}
On \gls{amd64}, the \gls{cpu}'s interrupt mechanism does not switch the full context described previously, but only handles the registers that are necessary to successfully jump to the interrupt function: RFLAGS, RSP and RIP\footnote{Segment registers are neglected}.

\subsection{Interrupts}
% TODO https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf p. 2848

\subsection{Context Content}
The registers that form the minimum context on \gls{amd64} are described in \cref{tab:task-minimum-context-registers}.

\begin{table}
    \begin{tabularx}{\textwidth}{| c | X | X |}
    \hline
        \textbf{descriptive name} & 
        \textbf{register names on amd64} &
        \textbf{description} \\ 
    \hline
    the instruction pointer register & RIP & address of the next instruction to be fetched \\
    \hline
    the stack pointer register & RSP & address of current position in stack \\
    \hline
    the flags register & RFLAGS & various attributes, e.g. the interrupt flag \\
    \hline
    all general-purpose registers & RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8–R15  & arbitrary data \\
    \hline
    \end{tabularx}
    \caption{Minimum Context Registers on amd64\cite[p.~28]{AMD64Vol2}}
    \label{tab:task-minimum-context-registers}
\end{table}

\subsection{Storing The Context On The Stack}
In this scenario, the context is stored on the \gls{stack} of the function that is interrupted.
\Cref{fig:amd64-long-mode-interrupt-stac} pictures the \gls{stack} layout on interrupt entry.
In order to leverage an interrupt for a context switch, the interrupt function needs to replace these values on the \gls{stack} with values for the new context.
CS (Code-Segment) and SS (Stack-Segment) have no effect in \gls{amd64} 64-Bit mode\cite[p.~20]{AMD64Vol1} and can remain unchanged.
The \gls{os} developer needs to know the exact address on the \gls{stack} at which the \gls{cpu} has pushed this data structure, and must then manipulate the values at these addresses directly.
This type of manipulation is inherently dangerous and cannot be easily checked by the \gls{compiler}.
The function that handles the interrupt must then use the instruction \textit{iretq}\cite[p.~252]{AMD64Vol2} to make the \gls{cpu} restore the partial context from the \gls{stack} and continue execution at the instruction pointed to by the restored RIP.
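
The replacement of the pushed values can be sketched in \gls{Rust} on plain memory.
The following code is an illustration only, not an interrupt handler: the names \textit{InterruptStackFrame}, \textit{TaskState}, \textit{frame\_from\_slots} and \textit{switch\_frame} are hypothetical, the frame is assumed to carry no error code, and the five \gls{stack} slots are passed in as an array instead of being located through RSP.

\begin{minted}[breaklines=true]{rust}
/// Values the CPU pushes on interrupt entry in long mode (without error
/// code), ordered from the lowest to the highest stack address.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct InterruptStackFrame {
    pub rip: u64,
    pub cs: u64,
    pub rflags: u64,
    pub rsp: u64,
    pub ss: u64,
}

/// Hypothetical saved state of a task that is currently not running.
pub struct TaskState {
    pub rip: u64,
    pub rflags: u64,
    pub rsp: u64,
}

/// Reinterprets five consecutive 64-bit stack slots as an interrupt frame.
/// This cast is the part the compiler cannot check; it is sound here only
/// because both types have the same size, alignment and field layout.
pub fn frame_from_slots(slots: &mut [u64; 5]) -> &mut InterruptStackFrame {
    unsafe { &mut *(slots.as_mut_ptr() as *mut InterruptStackFrame) }
}

/// Replaces the interrupted task's values with those of `next` and returns
/// the values of the interrupted task. CS and SS remain unchanged.
pub fn switch_frame(frame: &mut InterruptStackFrame, next: &TaskState) -> TaskState {
    let previous = TaskState {
        rip: frame.rip,
        rflags: frame.rflags,
        rsp: frame.rsp,
    };
    frame.rip = next.rip;
    frame.rflags = next.rflags;
    frame.rsp = next.rsp;
    previous
}
\end{minted}

In a real interrupt function the frame would be obtained from the address the \gls{cpu} pushed it to, which requires a raw pointer instead of the checked array reference used in this sketch.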


\begin{figure}
\centering
\includegraphics[width=0.8\textwidth]{gfx/amd64-long-mode-stack-after-interrupt.png}
\caption{Long-Mode Stack After Interrupt\cite[p.~252]{AMD64Vol2}}
\label{fig:amd64-long-mode-interrupt-stac}
\end{figure}

For a full context-switch, the other registers that are part of the context need to be handled by the \gls{os}'s interrupt function.

\chapter{Porting \glsentrytext{C} Vulnerabilities}
\label{rnd::porting-c-vulns}
In this chapter, the weakness manifestations from \cref{context::common-mem-safety-mistakes::manifestations} are rewritten in \gls{Rust} to determine to what extent they are mitigated merely by porting them.

\chapter{\glsentrytext{LX} Modules Written In \glsentrytext{Rust}}

\chapter{Existing \glsentrytext{os}-Development Projects Based On Rust}
\label{rnd::existing-os-dev-with-rust}

\section{Libraries}

\subsection{Libfringe}
% TODO: https://github.com/edef1c/libfringe


\section{Systems}
\subsection{intermezzOS}
\subsection{Blog OS}
\subsection{Redox}
\subsection{Tock}
%TODO: mention paper's by tockos team
 
\chapter{\glsentrytext{imezzos}: Adding Preemptive \glsentrytext{os}-Level Multitasking}
\label{rnd::imezzos-preemptive-multitasking}

\section{Timed Interrupts For Scheduling and Dispatching}

\section{Simple Stack Allocation Scheme}

\section{Risk Of Stack-Overflow}
% TODO: The compiler doesn't check for stack overflows.
% TODO: Describe possible implementation. 
%    Parameters:
%        Stack limit for each function: user defined constant,
%        Stack size for each function: calculated,
%        Call-Tree: calculated,