% // vim: set ft=tex:
\chapter{An Introduction To Rust}
\label{rnd::rust}
As described by the maintainers, \gls{Rust} is a ``systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.''\footnote{\url{www.rust-lang.org}}
During early development it had a runtime-dependent garbage collector which has since been dropped from the language, making it a viable candidate for \gls{os} development.
It meets all requirements listed in \cref{context::os-dev-lang-choice::requirements}, which has enabled many developers to create \gls{os}-related projects.
These are the subject of \cnameref{rnd::existing-os-dev-with-rust::systems::blog-os::mm}.

This chapter introduces \gls{Rust} from a specific perspective: that of a beginner in \gls{Rust} with only academic \gls{os} development experience.
The particular interest, in line with the topic of this study, is which language aspects help with memory safety in \gls{os} development.
In addition to existing functionality, potential future capabilities are taken into account, as well as the ability to extend the language for this specific use case.
The introduction found here is a summary of features that have been encountered throughout this study.
For a more general introduction to the language, the suggestion is to study at least the introduction in \citetitle{Beingessner2015}\cite{Beingessner2015}, or simply to visit the official Rust website, which by now offers complete and beginner-friendly documentation.\footnote{\url{https://doc.rust-lang.org}}
Note also that this study relies heavily on features that are only available in the nightly version of \gls{Rust}, which is a necessity for \gls{os} development.
These features will be highlighted throughout the various chapters.

\section{Compiler Architecture}
Detailed information about Rust's compiler architecture is scattered across many Rust-related development websites, including the Rust forums and GitHub, as well as blog posts by Rust's developers.

\Cref{fig:rust-compiler-architecture} shows \gls{Rust}'s chain of compilation.
\begin{figure}[ht!]
    \centering
    \includegraphics[width=0.7\textwidth]{gfx/rust-compiler-flow.png}
    \caption{Rust's current and future compiler architecture}
    \label{fig:rust-compiler-architecture}
\end{figure}

In one of these blog posts\footnote{\url{https://blog.rust-lang.org/2016/04/19/MIR.html}}, one of the maintainers describes the compiler architecture, including planned changes and their improvements.

Rust, in particular its compiler \code{rustc}, parses and ``desugars'' the source code.
``Desugaring'' is the expansion of syntax that exists merely to be simple and comfortable to use, which is called syntactic sugar in \gls{proglang} slang.
This step also handles the expansion of Rust's hygienic macros, which work differently from C macros, and on success produces the High-level Intermediate Representation (HIR), which is comparable to an Abstract Syntax Tree.

The HIR is then type-checked and borrow-checked; these checks are Rust's most distinctive safety features.
On success, Rust's compiler delivers an Intermediate Representation to \gls{llvm}, which is then optimized and compiled into machine-specific code, i.e. instructions for the physical \gls{cpu}.

Note that the new \emph{MIR} layer has not been fully completed as of today, but it has been activated in the compiler since October 2015\footnote{\url{https://github.com/rust-lang/rust/pull/28748}}.
Its development is based on the assumption that ``Rust’s rich type system should provide fertile ground for going beyond \gls{llvm}’s optimizations.''
Among other improvements, it allows Rust to perform optimizations before monomorphizing the code for \gls{llvm}, a step which breaks down (loses) all Rust-specific type system information, e.g. trait implementations, into a flat model.

\subsection{Zero-Cost Abstraction}
The optimizations on the HIR and MIR are the origin of the term zero-cost abstractions.
By analyzing control flow and evaluating constant expressions, code paths can be eliminated and the resulting \gls{llvm} IR can be reduced extensively.
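
As a minimal sketch of what this means in practice, the following two functions compute the same sum, once with iterator adapters and once with a hand-written loop.
The function names and the \code{TRACE} constant are hypothetical and chosen for illustration only; the expectation, which has not been verified in this study, is that an optimizing build reduces both to similar machine code and removes the branch guarded by the constant.

\begin{minted}[breaklines]{rust}
/// Hypothetical compile-time switch; the branch it guards is dead code.
const TRACE: bool = false;

/// High-level version built from iterator adapters and a closure.
fn sum_of_squares_iter(values: &[u32]) -> u32 {
    values.iter().map(|v| v * v).sum()
}

/// Hand-written loop that the abstraction above is expected to match.
fn sum_of_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..values.len() {
        if TRACE {
            println!("adding element {}", i);
        }
        total += values[i] * values[i];
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4];
    println!("{}", sum_of_squares_iter(&values));
    println!("{}", sum_of_squares_loop(&values));
}
\end{minted}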

The MIR will also allow type checking to be more effective, as the MIR is simpler than the HIR.
An example of a potential optimization is given in the borrow checker explanation along with the \cnameref{rnd::rust::feat::own-borrow::mir-improvement}.

% TODO: Tokens

\subsection{Syntax}
In \gls{os} development, macros are often preferred over functions because they are processed at compile time and induce no runtime overhead.
In \gls{Rust}, they are deeply integrated into the language syntax, and are not trivially usable.
In order to extend Rust for specific use-cases in \gls{os} development with macros or other means, the syntax and its extension mechanisms need to be understood.
The following information has been extracted from \emph{The Little Book of Rust Macros}.\footnote{\url{https://danielkeep.github.io/tlborm/book}}

\paragraph{Token Trees}
Rust is a C-like language and doesn't look too much different for simple code.
It starts with tokens, which are similar to other languages.


\subsection{Extensions}
\subsubsection{Definition Of Additional Analysis Rules To Extend Safety Checks}
% TODO: Business Logic Checks
% Examples: 
% TLB needs to be reset on Task Change
% ISR-Stack-Frame needs to be updated on context-switch

% TODO How generic can the memory allocators be written?

% TODO Guarantees to be statically checked:
% TODO * Control access to duplicates in page tables
% TODO * Tasks can't access unallocated (physical) memory
% TODO * Tasks can't access other tasks memory


\subsection{Macro Rules}
A rather simple macro is presented in \cpnameref{rnd::imezzos-preemptive-multitasking::timer-interrupt-scheduling::macro}.

\subsubsection{Macro Recursion Limit}
Macro recursion can be limited via the attribute:

\mintinline{rust}{#![recursion_limit="10"]}
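
A minimal sketch of where this limit matters: the hypothetical \code{count!} macro below recurses once per token, so the number of tokens it can process is bounded by the recursion limit.
The macro name and the chosen limit are illustrative assumptions and are not taken from the systems studied here.

\begin{minted}[breaklines]{rust}
// Crate-level attribute; it has to appear at the top of the crate root.
#![recursion_limit = "64"]

/// Counts its tokens by expanding itself once per token.
macro_rules! count {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count!($($tail)*) };
}

fn main() {
    // Five tokens need six nested expansions, well below the limit.
    let n = count!(a b c d e);
    println!("{}", n);
}
\end{minted}

Passing more tokens than the limit permits is expected to abort compilation with a recursion limit error instead of expanding indefinitely.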

\subsection{Compiler Plugins}
The Rust Unstable Book\footnote{\url{https://doc.rust-lang.org/unstable-book/language-features/plugin.html}} has a section on compiler plugins, which are user-provided libraries that extend the compiler's behavior with new syntax extensions, lint checks, etc.

\subsection{Cargo}
\glsentrydesc{cargo}.

\subsubsection{Tweaking LLVM Compiler Options}
\label{rnd::rust::cargo::tweak-llvm}
Using \gls{cargo}, arguments can be passed all the way down to the \gls{llvm} \gls{compiler} by creating the \code{$PROJECT_DIR/.cargo/config} file.
The following is an example which has been used to experiment with stack protection in \cref{rnd::weakness-mitig-prev::stack-protection::stack-clash::user-space}.

\begin{minted}{toml}
[build]
rustflags = [
    "-C", "llvm-args=-safe-stack-layout -enable-stackovf-sanitizer -asan-stack -warn-stack-size=1000",
]
\end{minted}

To enable this configuration, \code{cargo rustc} needs to be invoked for this project so that the configured rustc options are respected.
The config file also shows other stack-related options that were enabled for experimentation purposes.
A full list of supported options can be retrieved with the following C++ program:

\begin{minted}{cpp}
// Call with `--help-list-hidden` as argument to get a full list
#include "llvm/Support/CommandLine.h"
using namespace llvm;
int main(int argc, char** argv) {
  cl::ParseCommandLineOptions(argc, argv, "");
  return 0;
}
\end{minted}
This program is needed because it uses the same \gls{api} as \gls{Rust} to invoke \gls{llvm}, and should therefore give accurate results about which options are supported by \gls{Rust}.
Standalone tools of \gls{llvm} might not expose the same functionality as the \gls{api} used here.

\section{Investigated Language Features}
The following sentence is placed here according to the Don't-Repeat-Yourself principle as it would have otherwise been in almost every subsection:
Developers unfamiliar with this concept are likely to take a while to get used to it, but safety-gains are well worth the effort.

\subsection{Memory Management}
- TODO: Static Variables on Stack, handled by compiler

- TODO: Heap requires implemented allocator

- TODO: BSYS SS17 GITHUB IO Rust Memory Layout - 4
- TODO: How can memory be dynamically allocated and still safety checked?

\subsubsection{Custom Allocators}
- TODO: mention ralloc by redox
- TODO: simple allocator by Blog OS
- TODO: Who owns global 'static variables?

\subsection{Ownership And Borrows}
\citeauthor{Beingessner2015} explores the ownership model in relation to some of the weaknesses explained in \cref{context::weaknesses-mem-safety}.
The ownership model is described as ``a system for expressing where and when data lives, and where and when data can be mutated.''

\paragraph{Effectiveness}
The ownership model was found to effectively eliminate vulnerabilities of the following weakness types:
\begin{itemize}
    \item use-after-free
    \item indexing out of bounds
    \item iterator invalidation
    \item data races
\end{itemize}

\paragraph{Not Fully Effective Against Memory Leaks}
It was found that the problem of memory leaks cannot be sufficiently solved by ownership, due to lack of proper linear typing.
It was described that leaked memory is not a direct memory-safety violation because the \gls{os} cleans up leaked memory after the \gls{process}'s termination.

Note: the leaking \gls{app} prevents the leaked memory from being used by other \glspl{app} until it terminates.
In the \gls{os} itself, however, leaks must not happen at all, as there is no underlying instance that could reclaim the leaked memory; it is lost until the system reboots.
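
The following minimal sketch illustrates such a leak in entirely safe code, using a reference cycle between two reference-counted nodes.
The \code{Node} type, the drop counter and the function name are hypothetical and only serve the demonstration.

\begin{minted}[breaklines]{rust}
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many nodes have actually been destroyed.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Creates two nodes, optionally links them into a cycle, lets them go
/// out of scope and returns how many of them were destroyed.
fn build_and_release(cyclic: bool) -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    {
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
        if cyclic {
            // `a` now keeps `b` alive and vice versa.
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    }
    DROPPED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("destroyed without cycle: {}", build_and_release(false));
    println!("destroyed with cycle:    {}", build_and_release(true));
}
\end{minted}

Without the cycle both nodes are destroyed at the end of the scope; with the cycle neither reference count reaches zero, so the memory stays allocated although no \code{unsafe} code is involved.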

\subsubsection{Potential MIR improvements}
\label{rnd::rust::feat::own-borrow::mir-improvement}
An example of a potential change concerns \emph{vector patterns}, taken from the MIR RFC.\footnote{\url{https://github.com/rust-lang/rfcs/blob/master/text/1211-mir.md}}

The following match shows a vector pattern borrow in a match expression.
While this is legal today --
\begin{minted}{rust}
let mut vec = [1, 2];
match vec {
    [ref mut p, ref mut q] => { ... }
}
\end{minted}
--  one would intuitively expect it to be the same as:
\begin{minted}{rust}
let p = &mut vec[0];
let q = &mut vec[1]; // rejected while `p` is still in use
\end{minted}

In the latter case, the borrow checker would complain.
This is because it does not consider the two constant indices to borrow different items from the vector, but considers the whole vector to be borrowed by the first statement, causing an error for the second borrow attempt of the vector.
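
As a hedged illustration of the two accepted ways to obtain both mutable references at once, the sketch below swaps the elements of a two-element array, first with the pattern from above and then with the standard library's \code{split_at_mut}.
The function names are hypothetical; \code{split_at_mut} is not mentioned in the RFC and is shown here only as a commonly used workaround.

\begin{minted}[breaklines]{rust}
/// Borrows both elements disjointly through an array pattern.
fn swap_with_pattern(pair: &mut [i32; 2]) {
    let [ref mut p, ref mut q] = *pair;
    std::mem::swap(p, q);
}

/// Splits the array into two non-overlapping slices before borrowing.
fn swap_with_split(pair: &mut [i32; 2]) {
    let (left, right) = pair.split_at_mut(1);
    std::mem::swap(&mut left[0], &mut right[0]);
}

fn main() {
    let mut vec = [1, 2];
    swap_with_pattern(&mut vec);
    println!("{:?}", vec);
    swap_with_split(&mut vec);
    println!("{:?}", vec);
}
\end{minted}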

\subsection{Static Analyser}
- TODO: How does the Rust's static analysis work, theoretically and practically
- TODO: mention electrolyte, formal verification for Rust
- TODO: How does static typing help with preventing programming errors

- TODO: explain lints

\subsection{Inline Assembly}
Inline assembly is explained by example in \cref{rnd::imezzos-preemptive-multitasking::timer-interrupt-scheduling}.

A more formal tutorial, which is recommended here, is available as a web article.\footnote{\url{http://embed.rs/articles/2016/arm-inline-assembly-rust/}}

\subsection{Lifetimes}
- TODO: Where are global 'static variables allocated?

\subsection{Type Safety}
- TODO: demonstrate casts

- TODO: demonstrate raw pointers:
% https://rustbyexample.com/flow_control/match/destructuring/destructure_pointers.html

- TODO: discuss the equivalents of void*?

\subsection{Single Field Structs}
Structs with a single field can be used to wrap a type under a different type name, making it distinguishable for the type system.
This is different from a type alias, which wouldn't prevent the example situation given below.
This extended example\footnote{\url{https://aturon.github.io/features/types/newtype.html}} shows one way of preventing the mix-up of common length units.
Both new types wrap \code{f64} but are not interchangeable.

%\begin{figure}[ht!]
\begin{minted}[linenos,breaklines]{rust}
struct Miles(pub f64);
struct Kilometers(pub f64);

impl Miles { 
    fn as_kilometers(&self) -> Kilometers { Kilometers { 0: self.0 * 1.6 } } 
}
impl Kilometers { 
    fn as_miles(&self) -> Miles { Miles { 0: self.0 / 1.6 } } 
}

struct Route { distance: Miles }

impl Route {
    fn are_we_there_yet(&self, distance_travelled: Miles) -> bool { 
        self.distance.0 <= distance_travelled.0
    }
}

fn main() {
    let distance = Miles { 0: 100.0 };
    let route_miles = Route { distance };
    let travelled = Kilometers { 0: 100.0 };
    let arrived = route_miles.are_we_there_yet( travelled );
    println!("Are we there yet? {}", arrived);
}
\end{minted}
%\caption{}
%\label{code::}
%\end{figure}

The compiler rightfully rejects the code with the following error, and even gives a suggestion to use the \code{.as_miles()} method.

\begin{minted}[breaklines]{md}
error[E0308]: mismatched types
  --> src/main.rs:33:49
   |
33 |     let arrived = route_miles.are_we_there_yet( travelled );
   |                                                 ^^^^^^^^^ expected struct `Miles`, found struct `Kilometers`
   |
   = note: expected type `Miles`
              found type `Kilometers`
   = help: here are some functions which might fulfill your needs:
           - .as_miles()
\end{minted}
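
For comparison, the following sketch shows a variant that the compiler accepts because the distance is converted explicitly before it is passed on.
It is a condensed version of the example above, not the original author's code; the conversion factor of 1.6 is the same simplification as before.

\begin{minted}[breaklines]{rust}
struct Miles(pub f64);
struct Kilometers(pub f64);

impl Kilometers {
    /// Explicit, named conversion between the two unit types.
    fn as_miles(&self) -> Miles {
        Miles(self.0 / 1.6)
    }
}

struct Route {
    distance: Miles,
}

impl Route {
    fn are_we_there_yet(&self, distance_travelled: Miles) -> bool {
        self.distance.0 <= distance_travelled.0
    }
}

fn main() {
    let route = Route { distance: Miles(100.0) };
    let travelled = Kilometers(100.0);
    // The unit mismatch has to be resolved visibly at the call site.
    let arrived = route.are_we_there_yet(travelled.as_miles());
    println!("Are we there yet? {}", arrived);
}
\end{minted}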

\subsection{Empty Types}
\label{rnd::rust::type-safety::empty-types}
Empty types are abstract types that cannot be instantiated.

\subsubsection{Unreachable Code Paths}
They can be used to statically prevent certain code paths, declaring them impossible.
The simplest example is a function that is defined to never return:

\begin{minted}[linenos,breaklines]{rust}
enum CanNeverExist {}
fn never_returns() -> CanNeverExist {
    loop {}
}
\end{minted}

If line 3 were commented out, the compiler would regard it as an error:

\begin{minted}[breaklines]{md}
error[E0308]: mismatched types
 --> src/main.rs:2:37
  |
2 |   fn never_returns() -> CanNeverExist {
  |  _____________________________________^
3 | |     // loop {}
4 | | }
  | |_^ expected enum `CanNeverExist`, found ()
  |
  = note: expected type `CanNeverExist`
             found type `()`
\end{minted}

If no value is explicitly given at the end of the function, the compiler implies \code{()}, which is \emph{something}, unlike the empty enum, which is \emph{nothing} and cannot actually be instantiated and returned.
\code{loop{}}, among others, evaluates to \emph{nothing} as it will never stop and return, which is why the compiler was satisfied with it.
Trying to construct and return an instance of \code{CanNeverExist} yields the following:

\begin{minted}{md}
error[E0574]: expected struct, variant or union type, found enum `CanNeverExist`
 --> src/main.rs:3:5
  |
3 |     CanNeverExist {}
  |     ^^^^^^^^^^^^^ not a struct, variant or union type
\end{minted}

This demonstrates that the empty enum cannot be instantiated, and is merely a symbolic type.
Rust includes the \code{!} type for this purpose, and the function could have been written as \mintinline{rust}{fn never_returns() -> ! { loop{} }}.
This pattern can be used in \gls{os} development for the \gls{os}'s function that runs the main loop, and is not supposed to return.
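
A further minimal sketch of declaring a code path impossible: if the error type of a \code{Result} is an empty enum, the error branch can be closed with an empty \code{match}, and the compiler accepts this as exhaustive.
The function name \code{unwrap_infallible} is hypothetical.

\begin{minted}[breaklines]{rust}
enum CanNeverExist {}

/// Extracts the value without any panic path, because the error
/// variant can never be constructed.
fn unwrap_infallible(result: Result<u32, CanNeverExist>) -> u32 {
    match result {
        Ok(value) => value,
        // An empty enum has no variants, so there is nothing to match.
        Err(impossible) => match impossible {},
    }
}

fn main() {
    let parsed: Result<u32, CanNeverExist> = Ok(7);
    println!("{}", unwrap_infallible(parsed));
}
\end{minted}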

\subsubsection{In Combination With Traits And PhantomData}
Empty enums can be used for more advanced use-cases in combination with traits, as shown in \cref{rnd::existing-os-dev-with-rust::systems::blog-os::mm}, where the lowest level of the page hierarchy is prevented from calling the \code{next_table} method.
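
The idea can be sketched in a reduced form as follows.
This is a simplified two-level model written for this section, not the code of the referenced system; all trait and type names, as well as the \code{DEPTH} constant, are assumptions made for illustration.

\begin{minted}[breaklines]{rust}
use std::marker::PhantomData;

/// Marker for every page table level.
trait TableLevel {
    const DEPTH: u8;
}

/// Marker for levels that have a lower level beneath them.
trait HierarchicalLevel: TableLevel {
    type NextLevel: TableLevel;
}

// Empty enums: they only exist as type-level tags.
enum Level2 {}
enum Level1 {}

impl TableLevel for Level2 { const DEPTH: u8 = 2; }
impl TableLevel for Level1 { const DEPTH: u8 = 1; }
impl HierarchicalLevel for Level2 { type NextLevel = Level1; }

struct Table<L: TableLevel> {
    level: PhantomData<L>,
}

impl<L: TableLevel> Table<L> {
    fn new() -> Self {
        Table { level: PhantomData }
    }
    fn depth(&self) -> u8 {
        L::DEPTH
    }
}

/// Only available for levels that implement `HierarchicalLevel`.
impl<L: HierarchicalLevel> Table<L> {
    fn next_table(&self) -> Table<L::NextLevel> {
        Table::new()
    }
}

fn main() {
    let l2: Table<Level2> = Table::new();
    let l1 = l2.next_table();
    println!("{} -> {}", l2.depth(), l1.depth());
    // l1.next_table(); // does not compile: Level1 has no next level
}
\end{minted}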

\subsection{Inner- and Outer Mutability}
Some types in \gls{Rust} provide interior mutability, so that their \emph{value} can be mutated even though they have not been declared using \code{mut}.

An example of this is found with the \code{spin::Mutex} type. %used in \cpnameref{}.

Other examples which are not covered in this study include \code{Rc}, \code{Arc}, \code{RefCell}.
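
A minimal sketch of interior mutability using only the standard library: \code{std::sync::Mutex} stands in for \code{spin::Mutex} here, on the assumption that both expose a comparable locking interface, and \code{Cell} shows the single-threaded case.
The type and function names are hypothetical.

\begin{minted}[breaklines]{rust}
use std::cell::Cell;
use std::sync::Mutex;

/// Counter whose field can change through a shared reference.
struct HitCounter {
    hits: Cell<u32>,
}

impl HitCounter {
    /// Takes `&self`, not `&mut self`, yet updates the value.
    fn record(&self) -> u32 {
        self.hits.set(self.hits.get() + 1);
        self.hits.get()
    }
}

/// Increments the protected value through a shared reference.
fn bump(shared: &Mutex<u32>) -> u32 {
    let mut guard = shared.lock().unwrap();
    *guard += 1;
    *guard
}

fn main() {
    let counter = HitCounter { hits: Cell::new(0) }; // no `mut`
    counter.record();
    println!("hits: {}", counter.record());

    let shared = Mutex::new(0); // no `mut`
    bump(&shared);
    println!("shared: {}", bump(&shared));
}
\end{minted}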

\section{Limitations}
* TODO: deadlock example
* TODO: cyclic reference memory leak example

\chapter{Weakness Mitigation And Prevention}
\label{rnd::weakness-mitig-prev}
The term \textit{mitigation} used by the \gls{CWE} literally expresses that the suggested measures are not fully preventive.
This chapter practically explores the weaknesses and their mitigation suggestions presented in \cref{context::weaknesses-mem-safety,context::weakness-mitigation}.
As this study is looking for weakness \emph{prevention}, which might be achieved through static analysis, mitigation and prevention are explored side by side in this chapter.
The results are summarized in \cref{enc}.

\section{Porting \glsentrytext{C} Vulnerabilities}
\label{rnd::weakness-mitig-prev::porting-c-vulns}

In this chapter, the weakness manifestations given in \cref{context::weaknesses-mem-safety::manifestations} are rewritten in \gls{Rust} to examine if these are mitigated just by porting them.
This is done incrementally by first porting the vulnerability to unsafe Rust, followed by a rewrite to drop all unsafe code but keeping the intended functionality.

- TODO official CWE-119 examples

\section{Stack Protection}
\label{rnd::weakness-mitig-prev::stack-protection}
The goal of this section is to learn about \gls{Rust}'s stack protection mechanisms in comparison to C.

\subsection{Return Address Manipulation Experiments}
\label{rnd::weakness-mitig-prev::stack-protection::ret-addr-experiments}
Return address manipulation is a dangerous stack manipulation as it changes control flow of the program without explicit function calls.
First a \gls{C} example demonstrates the issue, then a \gls{Rust} port is attempted.

\subsubsection{Example in C}
\label{rnd::weakness-mitig-prev::stack-protection::ret-addr-experiments::c}

\begin{figure}[ht!]
\begin{minted}[linenos,breaklines]{c}
#include <stdint.h>
#include <stdio.h>

static void simple_printer(void) { fprintf(stderr, "I wonder who called me?"); }
void modifier(void) {
  uint64_t *p;
  *(&p + 1) = (uint64_t *)simple_printer;
  *(&p + 2) = (uint64_t *)simple_printer;
}
int main(void) {
  modifier();
  fprintf(stderr, "main exiting");
  return 0;
}
\end{minted}
\caption{Stack-Frame Modification in C}
\label{code::context::examples::sf-modification-simple-c}
\end{figure}

\Cref{code::context::examples::sf-modification-simple-c} is a small example program in \gls{C} that manipulates the return address stored on the \gls{stack}.
This is done with simple pointer arithmetic, which \gls{C} compilers accept without complaint.
It (ab)uses the address of the first local variable to create pointers into the \gls{sf} below it on the \gls{stack}.
Since the first variable is at the beginning of the \gls{sf} of the called function, it can be used to guess the position of the return address on the \gls{stack}.
Depending on the \gls{compiler} settings, the return address is stored either one or two stack entries in front of the first local variable for a function with no arguments.
In a brute-force manner, the program simply overwrites both entries with the address of \code{simple_printer}.
Because the original return address has been overwritten with a different function address, the \code{ret} instruction jumps to that function instead.

The output of running this program is:
\begin{minted}{md}
I wonder who called me?Segmentation fault
\end{minted}

\Cref{code::context::examples::sf-modification-simple-c-asm} shows the Assembly code of the \code{modifier()} function from two different compilation runs.
One version makes use of the RBP register as the \gls{sf} Base-Pointer, and the other relies solely on the Stack-Pointer (RSP) for referencing \gls{sf} variables.
The RBP register is pushed onto the \gls{stack} in the function-prologue and restored in the function-epilogue, which takes up one \gls{stack} entry.

\begin{figure}[ht!]
\begin{subfigure}[T]{0.49\textwidth}
\centering
\begin{minted}[linenos,breaklines]{objdump}
<modifier>:
push   rbp
mov    rbp,rsp
movabs rax,0x400690
mov    QWORD PTR [rbp+0x0],rax
mov    QWORD PTR [rbp+0x8],rax
pop    rbp
ret    
nop    DWORD PTR [rax+rax*1+0x0]
\end{minted}
\caption{Compiled with \code{-fno-omit-frame-pointer}}
\end{subfigure}
\begin{subfigure}[T]{0.49\textwidth}
\centering
\begin{minted}[linenos,breaklines]{objdump}
<modifier>:
movabs rax,0x400690
mov    QWORD PTR [rsp],rax
mov    QWORD PTR [rsp+0x8],rax
ret    
\end{minted}
\subcaption{Compiled with \code{-fomit-frame-pointer}}
\end{subfigure}
\caption{Stack-Frame Modification in C: Assembly}
\label{code::context::examples::sf-modification-simple-c-asm}
\end{figure}

\Cref{TODO-callstack-manipulation} is an attempt to visualize what happens in memory and with the \gls{stack} and the \gls{cpu}'s RIP (64-bit instruction pointer) register.

\begin{figure}
    \includegraphics[width=\textwidth]{gfx/TODO-callstack-manipulation}
    \caption{TODO-callstack-manipulation}
    \label{TODO-callstack-manipulation}
\end{figure}
\FloatBarrier

\paragraph{Compiler Hardening - Placing A Canary Value}
The manipulation can be mitigated in \gls{C} by using the \code{-fstack-protector-all} option with \gls{clang}.

\begin{figure}[ht!]
\begin{minted}[linenos,breaklines,highlightlines={3,8-9,13}]{nasm}
<modifier>:
sub    rsp,0x18
mov    rax,QWORD PTR fs:0x28
mov    QWORD PTR [rsp+0x10],rax
mov    QWORD PTR [rsp+0x10],0x400770
mov    QWORD PTR [rsp+0x18],0x400770
mov    rax,QWORD PTR fs:0x28
mov    rcx,QWORD PTR [rsp+0x10]
cmp    rax,rcx
jne    400760 <modifier+0x40>
add    rsp,0x18
ret    
call   4005a0 <__stack_chk_fail@plt>
data16 nop WORD PTR cs:[rax+rax*1+0x0]
\end{minted}
\caption{Stack Frame Modification C/ASM - clang stack protection}
\label{code::examples::sf-modification-clang-protection}
\end{figure}

The highlighted lines in \cref{code::examples::sf-modification-clang-protection} show the instructions that are part of the protection mechanism.
On \gls{LX}/\gls{amd64}, the compiler inserts checks into the function prologue and epilogue that make use of the \gls{cpu}'s FS register, which can only be modified by the \gls{os}.

First, a value is written onto the \gls{stack}, and it is later checked for equality; this value is called a \textit{canary value}.
Inequality indicates a write operation to the stack frame, in which case execution jumps to the error handler.
This causes the program to quit with the message: \mint{md}{*** stack smashing detected ***: ./stack_handling terminated}

The following issues can be identified about this detection:
\begin{enumerate}
    \item It's not effective in all cases.
        If line 5 is omitted, which overwrites the canary value, the check doesn't detect any changes but the return address is manipulated nonetheless by line 6.
    \item Checks happen at runtime.
        This study is searching for compile-time checks.
\end{enumerate}

\subsubsection{Porting to Rust} 
\label{rnd::weakness-mitig-prev::stack-protection::ret-addr-experiments::rust}
\Cref{code::examples::sf-modification-simple-rust} shows the complete code for the return address modification attempt in Rust.

\begin{figure}[ht!]
\begin{minted}[linenos,breaklines]{rust}
#![feature(naked_functions)]

#[inline(never)]
fn modifier() {
    let v: usize = 0;
    let v_addr = (&v as *const usize) as usize;
    unsafe {
        *((v_addr + 1 * 8) as *mut usize) = simple_printer as usize;
    }
}

#[naked]
fn simple_printer() {
    println!("I wonder who called me?");
}

fn main() {
    modifier();
    println!("main exiting")
}
\end{minted}
\caption{Stack-Frame Modification \emph{attempt} in Rust}
\label{code::examples::sf-modification-simple-rust}
\end{figure}
\FloatBarrier
The output of running this program is \textit{I wonder who called me?Segmentation fault}, exactly the same as with the C version.

The \code{unsafe} keyword is required here for writing to the calculated raw pointer.
Removing it will cause the compilation to error as follows:
\begin{minted}[breaklines]{md}
error[E0133]: dereference of raw pointer requires unsafe function or block
  --> src/main.rs:96:5
   |
96 |     *((v_addr + 1 * 8) as *mut usize) = simple_printer as usize;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereference of raw pointer
\end{minted}
Without unsafe, \gls{Rust} doesn't compile the program and stack manipulation in this manner is not possible.
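
To make the boundary explicit, the following minimal sketch separates the steps that safe code may perform from the single step that requires \code{unsafe}.
The function names are hypothetical, and unlike the attack above, the only pointer that is dereferenced is derived from a valid reference.

\begin{minted}[breaklines]{rust}
/// Reads a value through a raw pointer derived from a valid reference.
fn read_via_raw(value: &usize) -> usize {
    // Creating a raw pointer is allowed in safe code.
    let ptr = value as *const usize;
    // Dereferencing it is not: the `unsafe` block marks the spot where
    // the programmer, not the compiler, vouches for the pointer.
    unsafe { *ptr }
}

/// Computes the address one machine word above `value` without
/// accessing it; plain integer arithmetic needs no `unsafe`.
fn address_above(value: &usize) -> usize {
    let addr = value as *const usize as usize;
    addr + std::mem::size_of::<usize>()
}

fn main() {
    let v: usize = 42;
    println!("value read back: {}", read_via_raw(&v));
    println!("next word at:    {:#x}", address_above(&v));
}
\end{minted}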

In addition, two annotations had to be added to the code.
The function \code{simple_printer()} requires \code{\#[naked]}, which prevents the compiler from generating prologues and epilogues for it; these would have made assumptions about the stack that the constructed attack didn't satisfy.
The function \code{modifier()} requires \code{\#[inline(never)]}, which prevents the compiler from copying the function's instructions into the caller, so that there is no actual return made.

\subsection{Stack Clash}
\label{rnd::weakness-mitig-prev::stack-protection::stack-clash}
This subsection investigates the vulnerability described in \cpnameref{context::weaknesses-mem-safety::manifestations::stack-clash} in detail, from userspace and \gls{os} perspectives.
Current \gls{C} and \gls{Rust} compiler options need to be explored to find mitigation and prevention methods for the issue.
The primary focus is on \gls{Rust}'s static analyzer, while the secondary focus lies on \gls{llvm}, as it is currently the backend used in \gls{Rust}.

\subsubsection{Inside a hypothetical OS on AMD64}
\label{rnd::weakness-mitig::stack-protection::rust-stack-clash::in-os}
Despite its name, this section is about using code in the \gls{os} to solve the stack clash that occurs in userspace.
As described in \cref{context::os-dev-concepts::hw-supported-mm::multilevel-paging-concept,context::os-dev-concepts::hw-supported-mm::multilevel-paging-amd64}, the \gls{os} works with the \gls{mmu} to implement paging.
The \gls{os} gains control only when a page-fault is triggered, either due to an unmapped \gls{vaddr} or a page protection violation.
The latter is also caused by accessing the guard page behind the \gls{stack}.
The \gls{os} proposal mentioned in \cref{context::weaknesses-mem-safety::manifestations::stack-clash::proposals} suggests enlarging this guard page to a bigger guard area.

\paragraph{Problematic Deferred Page Mapping}
The reason for this mechanism is that some \glspl{os}, including \gls{LX}, perform deferred mapping of pages for the \gls{stack}, i.e. they map the \glspl{vaddr} only when they are accessed by the userspace \gls{app}.
The \gls{stack} can grow by accessing unmapped \glspl{vaddr} until it reaches the guard area.
The issue here is that a dynamically sized variable, e.g. a string, could instantly grow large enough to skip the guard area.
If the address at the end of this string were mapped, e.g. to the heap, the \gls{os} would not even notice that this happened, as the memory access is transparent to the \gls{os}.

If the \gls{os} forced the \glspl{app} to explicitly request memory instead of mapping on-access, preventing such large growth would be simple.
Discussing the trade-offs of this design decision is beyond the scope of this study.
It is also not obvious that \gls{Rust} -- or any compiler for that matter -- could solve this specific problem in the \gls{os}, so there is no more investigation to be done.

\paragraph{Increasing the Guard Area Size}
The \gls{os} can reserve a sufficiently large area of guard pages behind the process's stack, which are protected so that the process can't access them without causing a page-fault exception.
A sufficiently large stack-based buffer might still allow jumping over this area\cite{TheStackClash}, so an \gls{os}-only solution is not possible.

The patch\footnote{\url{https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1be7107fbe18eed3e319a6c3e83c78254b693acb}} for the specific vulnerability in \gls{LX} was examined.
The patch doesn't indicate any flaws that could have been prevented by \gls{Rust}'s static analyzer in the first place.

\subsubsection{In Userspace}
\label{rnd::weakness-mitig-prev::stack-protection::stack-clash::user-space}
The userspace proposal mentioned in \cref{context::weaknesses-mem-safety::manifestations::stack-clash::proposals} suggests recompiling all \glspl{app} with \code{-fstack-check} enabled in \gls{GCC}, which introduces a certain type of runtime check.
The search for compile-time checks is documented after explaining this suggestion.

\paragraph{Runtime Checks}
More specifically, it causes the \gls{compiler} to
``generate code to verify that you do not go beyond the boundary of the stack. ... Note that this switch does not actually cause checking to be done; the operating system or the language runtime must do that. The switch causes generation of code to ensure that they see the stack being extended.''\cite[p.~349]{GCC540}.
The note unveils that this mechanism relies on the guard-page to be available.

\gls{Rust} has a similar feature, which it calls stack probing and which is turned on by default.
The implementation displayed in \cref{code::context::examples::rust-stackprobe-asm} is extracted from the binary file\footnote{produced with rustc 1.21.0-nightly (aac223f4f 2017-07-30)} compiled from the code shown in \cref{code::examples::huge-stack-rust}.

\begin{figure}[ht!]
\begin{subfigure}[T]{0.39\textwidth}
\centering
\begin{minted}[linenos=false,breaklines]{nasm}
<huge_stack>:
movabs rax,0x100000078
call   3f4e0 <__rust_probestack>
sub    rsp,rax
...
\end{minted}
\caption{Function Prologue}
\end{subfigure}
\begin{subfigure}[T]{0.6\textwidth}
\centering
\begin{minted}[highlightlines={2-7,9},linenos=false,breaklines]{nasm}
<__rust_probestack>:
$3f4e0: mov    r11,rax
$3f4e3: sub    rsp,0x1000
$3f4ea: test   QWORD PTR [rsp+0x8],rsp
$3f4ef: sub    r11,0x1000
$3f4f6: cmp    r11,0x1000
$3f4fd: ja     3f4e3 <__rust_probestack+0x3>
$3f4ff: sub    rsp,r11
$3f502: test   QWORD PTR [rsp+0x8],rsp
$3f507: add    rsp,rax
$3f50a: ret    
$3f50b: nop    DWORD PTR [rax+rax*1+0x0]
\end{minted}
\subcaption{Probestack Implementation}
\end{subfigure}
\caption{Rust's Stack Probe Function Assembly}
\label{code::context::examples::rust-stackprobe-asm}
\end{figure}

On the left, it shows \code{huge_stack}'s function prologue, which has a call to the \code{__rust_probestack} implementation.
It passes the estimated stack size to the probestack function via the \code{rax} register.

On the right side is the probestack implementation.
It is a loop (first highlighted section) which iterates from the stack pointer down to the estimated stack end address ($rsp - r11$) in steps of 0x1000 bytes.
Not coincidentally, this size (4 KiB) is the default page size on \gls{amd64}, which means the loop iterates over every page within the estimated stack.
It executes \mintinline{nasm}{test} on each calculated page address, which acts as a non-modifying access.
This is enough to trigger a page fault in the \gls{mmu} and thus notify the \gls{os} about the stack growth.
The \gls{os} can then check whether the guard page was accessed or whether the stack is permitted to grow this far.

As this code was extracted from a binary, the estimated stack size must have been calculated at compile-time.
This is fortunate and motivates the further question of whether this check could be performed entirely at compile-time.
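
The behaviour of the probe loop can be modelled with a few lines of \gls{Rust}.
This is a simplified model written for this section and not the actual \code{__rust_probestack} source: it returns the offsets below the original stack pointer that get touched, ignores the constant displacement of 8 bytes used by the \mintinline{nasm}{test} instructions, and assumes that the frame is at least one page large.

\begin{minted}[breaklines]{rust}
/// Default page size on AMD64, matching the 0x1000 steps in the listing.
const PAGE_SIZE: u64 = 0x1000;

/// Returns the offsets (in bytes below the original stack pointer)
/// that the modelled probe loop touches for a frame of `frame_size`.
fn probe_offsets(frame_size: u64) -> Vec<u64> {
    assert!(frame_size >= PAGE_SIZE, "model assumes at least one page");
    let mut remaining = frame_size; // plays the role of r11
    let mut offset = 0; // distance travelled by rsp
    let mut touched = Vec::new();
    loop {
        offset += PAGE_SIZE; // sub rsp, 0x1000
        touched.push(offset); // test [rsp+0x8], rsp
        remaining -= PAGE_SIZE; // sub r11, 0x1000
        if remaining <= PAGE_SIZE {
            break; // `ja` not taken
        }
    }
    offset += remaining; // sub rsp, r11
    touched.push(offset); // final test
    touched
}

fn main() {
    for offset in probe_offsets(0x2800) {
        println!("touch rsp - {:#x}", offset);
    }
}
\end{minted}

For a frame of 0x2800 bytes the model touches the offsets 0x1000, 0x2000 and 0x2800, i.e. every page the frame spans plus its far end.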

\paragraph{Compile-Time Prevention}
\label{rnd::weakness-mitig-prev::stack-protection::stack-clash::user-space::compile-time}
%The compile-time prevention of the stack clash depends on the ability to predict the stack size and its boundaries accurately.
%This investigation justifies a separate chapter, please see \cref{rnd::stack-size-estimation}.
%
%\chapter{Compile-Time Stack-Size Estimation}
%\label{rnd::stack-size-estimation}
By estimating the stack size at compile-time, the stack clash (covered in \cref{rnd::weakness-mitig-prev::stack-protection::stack-clash}) and other undesired stack scenarios could be predicted without running into them.
In theory, this analysis requires a prediction of the worst-case stack growth for each procedure based on source code information.
This maximum stack growth must then be compared to the stack size limit, as well as to the distance and the size of the guard area; it must be less than or equal to all given limits.
This could effectively prevent the stack from overflowing and from touching or leaping over the guard area.

The following simplified inequalities must hold:
\begin{equation}
    \mathit{sum~of~all~procedure~stacks} \leq \mathit{max~stack~size}
    \label{equ:size-all-stack-procedures}
\end{equation}
\begin{equation}
    \mathit{each~procedure~stack} \leq \mathit{max~procedure~stack~size}
    \label{equ:size-stack-procedure}
\end{equation}

The calculation of the above values requires the following variable sizes to be known at the time of calculation:
\begin{listing}[ht!]
\begin{enumerate}
    \item{Prologue space allocation, depends on local variables and arguments}
    \item{Stack limit: the maximum stack size is not equal on all \glspl{os}, and can even change per process}
    \item{The page size and guard area size: not equal on all \glspl{os}} 
    \item{Dynamically sized stack variables have no static upper boundary}
    \item{Cyclic procedure calls cause endless stack growth, including recursion}
\end{enumerate}
\caption{Variables Required for Stack Overflow Prediction}
\label{lst:variables-stack-overflow}
\end{listing}

The maximum stack size and the guard area size must be supplied to the compiler.
Dynamically sized stack variables and cyclic procedure calls are more difficult to solve.
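
As an illustration, the two conditions from \cref{equ:size-all-stack-procedures} and \cref{equ:size-stack-procedure} can be checked for a single call chain once the frame sizes and the limits are known.
The following is a minimal sketch under that assumption; \code{StackLimits} and \code{chain_fits} are hypothetical names, and the limits are example values that would have to be supplied for the target system.

\begin{minted}[breaklines]{rust}
/// Limits that would have to be supplied to the compiler (assumed values).
struct StackLimits {
    max_stack_size: usize,
    max_frame_size: usize,
}

/// Checks both conditions for one call chain, given the frame size of
/// every procedure on that chain.
fn chain_fits(frame_sizes: &[usize], limits: &StackLimits) -> bool {
    // Each procedure frame must stay within the per-procedure limit.
    let each_frame_ok = frame_sizes
        .iter()
        .all(|&size| size <= limits.max_frame_size);
    // Checked addition: a sum that overflows usize exceeds any limit.
    let total = frame_sizes
        .iter()
        .try_fold(0usize, |acc, &size| acc.checked_add(size));
    let total_ok = matches!(total, Some(sum) if sum <= limits.max_stack_size);
    each_frame_ok && total_ok
}

fn main() {
    let limits = StackLimits {
        max_stack_size: 8 * 1024 * 1024,
        max_frame_size: 4096,
    };
    println!("{}", chain_fits(&[128, 512, 4096], &limits));
    println!("{}", chain_fits(&[128, 8192], &limits));
}
\end{minted}

Such a check only covers the statically known part; cyclic calls make the set of call chains unbounded and need separate treatment.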


\subparagraph{Rust's State}
\gls{llvm} has an option called \code{-warn-stack-size=<uint>}, which has been enabled for this investigation.
How this option can be configured is explained in \cref{rnd::rust::cargo::tweak-llvm}.
Various combinations of the following configuration options have been tried:

\begin{minted}[breaklines]{markdown}
-asan-stack              - Handle stack memory
-safe-stack-layout       - enable safe stack layout
-warn-stack-size=<uint>  - Warn for stack size bigger than the given number
\end{minted}
The first two options are not expected to have an effect on the static analysis; they were included to see whether they have an additional effect on runtime overflow detection.

\begin{figure}[ht!]
\begin{minted}[breaklines,highlightlines={3,4}]{rust}
#[inline(never)]
fn huge_stack() {
    const slice_length: usize = 0x100_000_000;
    let slice: [u64; slice_length] = [0xdeadbeef; slice_length];
    let slice_start_addr = &slice[0] as *const u64;
    let slice_end_addr = &slice[slice_length - 1] as *const u64;
    println!("{:?} - {:?} = {:?}",
             slice_start_addr,
             slice_end_addr,
             (slice_end_addr as usize - slice_start_addr as usize) / std::mem::size_of::<usize>());
}

fn main() {
    huge_stack();
    println!("main exiting")
}
\end{minted}
\caption{Program that allocates a huge slice on the stack}
\label{code::examples::huge-stack-rust}
\end{figure}
\FloatBarrier

The highlighted lines in \cref{code::examples::huge-stack-rust} construct an array on the stack with the size of $8 \cdot 0x100000000 = 0x800000000$ bytes, i.e. 34,359,738,368 bytes (32~GiB), which exceeds the entire address space of any 32-Bit system and should definitely be enough to trigger the configured stack warning.

Unexpectedly, this program compiled without a warning.
It was expected that the \gls{compiler} would detect this huge statically allocated stack array, compare it to the configured maximum allowed size, and report the violation.
At runtime it crashes with this message:

\begin{minted}{md}
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted
\end{minted}
The various options had no effect on the runtime output.

Note that \code{main} in this message names the thread, not the function: the overflow is attributed to the main thread, although \code{huge_stack} is the function that allocates too much space on the \gls{stack}.

\paragraph{Available Size Information}
Taking a look at the function prologue shows that an estimation of the stack size is in fact calculated, passed to \gls{Rust}'s probestack implementation, and then subtracted from the stack pointer (RSP) to reserve this space on the \gls{sf}.

\begin{minted}{nasm}
huge_stack:
    movabs rax,0x800000078
    call   3e120 <__rust_probestack>
    sub    rsp,rax
\end{minted}

Out of the five variables required (\cpnameref{lst:variables-stack-overflow}), this covers the first and simplest one: prologue-allocated space.
A source code investigation of \gls{rustc} and \gls{llvm} has yielded the information that the function prologue is emitted by \gls{llvm}, and the Rust compiler has no knowledge about the \gls{sf} size.

\paragraph{Cyclic Procedure Calls}
Cyclic procedure calls are currently undetected, and the following code compiles fine:
\begin{minted}{rust}
fn a(i: usize) { b(i+3); }
fn b(i: usize) { a(i+5); }
fn main() { a(0); }
\end{minted}
Naturally this program causes a stack overflow at runtime, as it grows its stack with every function call and eventually hits the \gls{os} guard page or the maximum allowed stack size, depending on which is more restrictive.
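
Detecting such cycles is a graph problem once the call graph is known.
The following sketch shows one possible approach, a depth-first search that reports a cycle when it reaches a function that is still being visited.
It is not what \gls{rustc} or \gls{llvm} implement; the call graph is entered by hand, and \code{CallGraph}, \code{visit} and \code{has_cyclic_calls} are hypothetical names.

\begin{minted}[breaklines]{rust}
use std::collections::HashMap;

/// Call graph: maps each function name to the functions it calls.
type CallGraph = HashMap<&'static str, Vec<&'static str>>;

/// Visiting state of a function during the depth-first search.
enum Mark {
    InProgress,
    Done,
}

fn visit(
    graph: &CallGraph,
    node: &'static str,
    marks: &mut HashMap<&'static str, Mark>,
) -> bool {
    match marks.get(node) {
        // Reached a function that is still on the current call chain.
        Some(Mark::InProgress) => return true,
        Some(Mark::Done) => return false,
        None => {}
    }
    marks.insert(node, Mark::InProgress);
    if let Some(callees) = graph.get(node) {
        for &callee in callees {
            if visit(graph, callee, marks) {
                return true;
            }
        }
    }
    marks.insert(node, Mark::Done);
    false
}

/// Returns true if a call chain starting at `entry` contains a cycle.
fn has_cyclic_calls(graph: &CallGraph, entry: &'static str) -> bool {
    let mut marks = HashMap::new();
    visit(graph, entry, &mut marks)
}

fn main() {
    // Call graph of the example above: main -> a -> b -> a
    let mut graph = CallGraph::new();
    graph.insert("main", vec!["a"]);
    graph.insert("a", vec!["b"]);
    graph.insert("b", vec!["a"]);
    println!("cyclic: {}", has_cyclic_calls(&graph, "main"));
}
\end{minted}

A detected cycle does not prove that the program overflows its stack, since the recursion may be bounded by a condition; it only marks the call chains for which no static upper bound can be derived this way.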

\subparagraph{Unconditional Recursion Detection}
Unconditional recursion is a special case of cyclic procedure calls and is detected in Rust.
The following is a minimal example of such a situation:

\begin{minted}{rust}
#![deny(unconditional_recursion)]

fn a() { a(); }
fn main() { a(); }
\end{minted}

By default, the compiler merely warns upon detection, but with the \code{#![deny(unconditional_recursion)]} attribute at the top of the source file, as in the example above, it aborts compilation with an error instead:

\begin{minted}[breaklines]{md}
error: function cannot return without recurring
   --> src/main.rs:123:1
    |
123 | fn a() { a(); }
    | ^^^^^^^^^^^^^^^
    |
note: lint level defined here
   --> src/main.rs:2:9
    |
2   | #![deny(unconditional_recursion)]
    |         ^^^^^^^^^^^^^^^^^^^^^^^
note: recursive call site
   --> src/main.rs:123:10
    |
123 | fn a() { a(); }
    |          ^^^
    = help: a `loop` may express intention better if this is on purpose
\end{minted}
The error is very explicit about the finding, including the fact that the denial of unconditional recursion is user intended.

% TODO: https://gcc.gnu.org/onlinedocs/gnat_ugn/Static-Stack-Usage-Analysis.html

\paragraph{State Summary and Suggestions}
Not all required information is available at compile-time.
\Cref{lst:result-variables-stack-overflow} is an extended version of the earlier determined list in \cref{lst:variables-stack-overflow}.
It includes the previous findings and suggestions on how this information could be retrieved.

\begin{table}
    \begin{tabularx}{\textwidth}{@{}lX@{}}
    \toprule
    Information & Information Availability \\
    \midrule
    Prologue space allocation & Available in \gls{llvm}. \\
    Stack limit & Not available. Suggestions: heuristics in the compiler, or provided by the user. This must match the target system, not necessarily the build system. \\
    Page size & See above. \\
    Guard area & See above. \\
    Recursive procedure calls & Available for unconditional recursion. \\
    Cyclic procedure calls & Not available. \\
    \bottomrule
    \end{tabularx}
\caption{Result: Variables Required for Stack Overflow Prediction}
\label{lst:result-variables-stack-overflow}
\end{table}
\FloatBarrier

Dynamically sized stack variables have been omitted from the table since they are irrelevant here.
On-stack variable-length arrays and variadic arguments are not supported by \gls{Rust}, and there is no indication of other use-cases.

\chapter{\glsentrytext{LX} Modules Written In \glsentrytext{Rust}}
The numerous \gls{LX} vulnerabilities are a great motivator for using \gls{Rust} for \gls{LX} drivers.
This chapter presents the attempt to use \gls{Rust} for a simple buffer that is presented to userspace as a character device.

- TODO: explain the difficulty to use the Kernel's C Macros, which are required to expose a character device

\chapter{Existing \glsentrytext{os}-Development Projects Based On Rust}
\label{rnd::existing-os-dev-with-rust}
This chapter presents research papers and existing projects that are related to this study.
In addition to presenting their content, the authors' tangible influence on the Rust language is determined.

\section{Research Papers}
\label{rnd::existing-os-dev-with-rust::papers}
As Rust is a relatively young language, the selection of research papers relevant for this study is limited.
This is likely due to the fact that Rust was not stabilized until May 15, 2015\footnote{\url{May 15, 2015}}, and relied on a runtime garbage collector for a long part of its pre-stable existence.

\subsection{
    \citetitle{Levy2015a}
%    \cite{Levy2015a}
}
\citeauthor{Levy2015a} have been using Rust to develop a new embedded system \gls{os} for microcontrollers called Tock.
They find Rust's ownership model restrictive, because it prevents resource sharing that would be safe in the event-based scenarios typical of embedded systems.
They made suggestions to extend the language with Execution Contexts, which would "allow programs to mutably borrow values multiple times as long as those borrows are never shared between threads. Execution contexts allow the compiler to distinguish such sharing from actual errors using only local analysis."

On their website the authors recently made the following statement:
"After feedback from the Rust developers and the community, we were able to overcome those challenges without modifications to the language. We also learned that we understated how disruptive some of the changes we proposed would be to the language and do not believe they are worthwhile. This has been discussed extensively now in the Rust community. You should read this paper critically, not as conclusive scientific findings, but as the perspectives of the authors during a particular point in the development of Tock."\cite[/papers]{TockOS}

\subsection{
    \citetitle{Beingessner2015}
%    \cite{Beingessner2015}
}
Covered in \cref{rnd::rust}.

\subsection{
    \citetitle{Reed2015}
%    \cite{Reed2015}
}
\subsection{
    \citetitle{Getreu2016}
    \cite{Getreu2016}
}
\subsection{
    \citetitle{Balasubramanian2017}
%    \cite{Balasubramanian2017}
}
\subsubsection{Software Fault Isolation}
- TODO: content from \cite{Balasubramanian2017}
\subsection{
    \citetitle{Nilsson2017}
%    \cite{Nilsson2017}
}


\section{Libraries}

\subsection{Libfringe}
% TODO: https://github.com/edef1c/libfringe


\section{Systems}
Most of the presented systems target the \gls{amd64} architecture; Tock OS, which is targeted towards an ARM variant, is the only exception.
The interesting parts of each \gls{os} are their origin, intentions, their current state, the level of memory-safety, and what design or language features made this level possible.

\subsection{Blog OS}
\label{rnd::existing-os-dev-with-rust::systems::blog-os}
Blog OS is a hobby project about writing an OS in \gls{Rust}.
It is well documented by the author through insightful blog posts\footnote{\url{https://os.phil-opp.com/}}.

\subsubsection{General State}
Blog OS has a working memory allocator which allows it to use Rust's heap-based features.
Exception handlers are stubbed and there is no notion of tasks yet.
The focus lies on a Rust-idiomatic implementation of the \gls{os} features.

\subsubsection{Paging With Type Safety}
\label{rnd::existing-os-dev-with-rust::systems::blog-os::mm}
Blog OS uses Rust's type system to model the hierarchical page tables (\cref{context::os-dev-concepts::hw-supported-mm::multilevel-paging-amd64}) in code in a safe way.
This is explained in one of the author's blog posts\footnote{\url{https://os.phil-opp.com/page-tables/}}, and demonstrates how Rust can help to prevent mistakes.

Please note that the example has been rewritten for a 2-level page table hierarchy simply to save space in this document.
The methodology is the same for all levels above 1, so it is sufficient to have only one level above for demonstration.
The code example includes comments which are relevant for the understanding.

Starting with the result is the fastest way to explain this.
The highlighted line in the following code is expected to fail compilation, as the lowest level of the page table hierarchy is not followed by another one.
\begin{minted}[breaklines,highlightlines=6]{rust}
pub const P2: *mut Table<Level2> = 0xffffffff_fffff000 as *mut _;

fn test() {
    let p2 = unsafe { &*P2 };
    p2.next_table(42)
      .and_then(|p1| p1.next_table(0xcafebabe));
}
\end{minted}

The \code{P2} pointer is a static memory location, to which the page table has hypothetically been written by the \gls{os}.
Its value does not matter for testing purposes, because this test is expected to fail compilation and therefore never runs.
The following error occurs on compilation:

\begin{minted}[breaklines]{md}
error: no method named `next_table` found for type
  `&memory::paging::table::Table<memory::paging::table::Level1>`
  in the current scope
\end{minted}

This is achieved by defining the types accordingly:

\begin{minted}[breaklines,highlightlines={}]{rust}
// Empty enums provide a distinct type for each level
pub enum Level2 {}
pub enum Level1 {}

// Marker trait implemented by every level, including the lowest one
pub trait TableLevel {}

// Trait for all levels above 1; the associated type states which level follows
pub trait HierarchicalLevel: TableLevel {
    type NextLevel: TableLevel;
}

// All levels above 1 are hierarchical and statically define what level comes next, e.g.
// ...                     Level4 { type NextLevel = Level3; }
// ...                     Level3 { type NextLevel = Level2; }
impl TableLevel for Level2 {}
impl HierarchicalLevel for Level2 { type NextLevel = Level1; }
impl TableLevel for Level1 {}

// PhantomData marks the type parameter L as used; an unused parameter would not compile
use core::marker::PhantomData;
pub struct Table<L: TableLevel> {
    entries: [Entry; ENTRY_COUNT],
    level: PhantomData<L>,
}

// Unified next_table method for all hierarchical levels; Level1 does not get it
impl<L> Table<L> where L: HierarchicalLevel
{
    pub fn next_table(&self, index: usize) -> Option<&Table<L::NextLevel>> {...}
}
\end{minted}

\subsubsection{Influences on Rust}
\label{rnd::existing-os-dev-with-rust::systems::blog-os::influence}
The author filed a pull-request\footnote{\url{https://github.com/rust-lang/rust/pull/39832}} against Rust that enabled the x86-interrupt calling convention for \gls{Rust}, which is supported by the underlying \gls{llvm}.
The change was accepted by the maintainers.

The pull-request describes the motivations for the change, two of which are interface safety and increased performance.
In detail:
\begin{enumerate}
    \item Safer interfaces: We can write a \code{set_handler} function that takes a \code{extern "x86-interrupt" fn(&ExceptionStackFrame)} and the compiler ensures that we always use the right function type for all handler functions. This isn't possible with the \code{#[naked]} attribute.
    \item Higher performance: A naked wrapper function always saves all registers before calling the Rust function. This isn't needed for a compiler supported calling convention, since the compiler knows which registers are clobbered by the interrupt handler.
\end{enumerate}

Argument 1 is a way to prevent mistakes made by the \gls{os} developer when working on the interrupt handlers, thus increasing safety.
This is similar to the type safety explained in \cref{rnd::existing-os-dev-with-rust::systems::blog-os::mm}.
It could be strengthened even more, as one could define a type for each specific interrupt handler and entry which are forced to match by the \gls{compiler}'s type checks.
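
A minimal sketch of this idea follows.
It uses \code{extern "C"} as a stand-in so that it compiles on stable Rust, whereas real handlers would use \code{extern "x86-interrupt"}; the \code{Entry} and \code{Idt} types, the field names and the simplified \code{ExceptionStackFrame} are assumptions made for illustration and are not taken from Blog OS.

\begin{minted}[breaklines]{rust}
/// Simplified stand-in for the stack frame pushed by the CPU.
pub struct ExceptionStackFrame {
    pub instruction_pointer: u64,
    pub stack_pointer: u64,
}

/// Handler type for exceptions without an error code.
/// Real handlers would use `extern "x86-interrupt"` (nightly only).
pub type HandlerFunc = extern "C" fn(&ExceptionStackFrame);
/// Handler type for exceptions that also push an error code.
pub type HandlerFuncWithErrCode = extern "C" fn(&ExceptionStackFrame, u64);

/// A table entry is typed by the handler signature it accepts.
pub struct Entry<F> {
    handler: Option<F>,
}

impl<F> Entry<F> {
    pub fn missing() -> Self {
        Entry { handler: None }
    }
    pub fn set_handler(&mut self, handler: F) {
        self.handler = Some(handler);
    }
    pub fn is_present(&self) -> bool {
        self.handler.is_some()
    }
}

/// Each named entry fixes the handler type the compiler will accept.
pub struct Idt {
    pub breakpoint: Entry<HandlerFunc>,
    pub page_fault: Entry<HandlerFuncWithErrCode>,
}

extern "C" fn on_breakpoint(_frame: &ExceptionStackFrame) {}
extern "C" fn on_page_fault(_frame: &ExceptionStackFrame, _error_code: u64) {}

fn main() {
    let mut idt = Idt {
        breakpoint: Entry::missing(),
        page_fault: Entry::missing(),
    };
    idt.breakpoint.set_handler(on_breakpoint);
    idt.page_fault.set_handler(on_page_fault);
    // idt.page_fault.set_handler(on_breakpoint); // rejected: mismatched types
    println!("{} {}", idt.breakpoint.is_present(), idt.page_fault.is_present());
}
\end{minted}

With this layout, registering a handler without an error code parameter for the page-fault entry is a compile-time type error rather than a runtime stack corruption.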

Argument 2 explains that context switches to and from interrupt handlers can be sped up, as the \gls{compiler} can now examine the interrupt handler and only store and restore those \gls{cpu} registers that are actually used by the function.

\subsection{Redox OS}
The Redox OS has a "hybrid kernel that supports X86\_64 systems and provides Unix-like syscalls for primarily Rust applications"\footnote{\url{https://doc.redox-os.org/kernel/kernel/}}.
It is entirely written in Rust (with necessary inline ASM) and supports\footnote{\url{https://www.redox-os.org/}} the Rust standard library.

\paragraph{General State}
Its state has far surpassed being a hobby project, featuring multitasking on multiple CPUs, user- and kernel-space threads, a file system, rudimentary networking support and graphical output.

The userland of Redox OS provides a package manager, a graphical desktop environment, and due to its \gls{microkernel} aspect also the device drivers.

\paragraph{Page Management Inspired by Blog OS}
A comment in the Redox kernel\footnote{\url{https://github.com/redox-os/kernel/blob/b364d052f20f1aa8bf4c756a0a1ea9caa6a8f381/src/arch/x86_64/paging/mod.rs\#L2}} explicitly states that it includes code taken from the Blog OS paging implementation (\cref{rnd::existing-os-dev-with-rust::systems::blog-os::mm}).

\subsubsection{Stack Clash Invulnerable}
The page-fault handler in Redox OS is as simple as\footnote{\url{https://github.com/redox-os/kernel/blob/b364d052f20f1aa8bf4c756a0a1ea9caa6a8f381/src/arch/x86_64/interrupt/exception.rs\#L81}}

\begin{minted}[autogobble,breaklines,highlightlines=6]{rust}
interrupt_error!(page, stack, {
    let cr2: usize;
    asm!("mov rax, cr2" : "={rax}"(cr2) : : : "intel", "volatile");
    println!("Page fault: {:>02X}:{:>016X} at {:>02X}:{:>016X}", stack.code, cr2, stack.cs, stack.rip);
    stack_trace();
    ksignal(SIGSEGV);
});
\end{minted}
On Redox OS \emph{every} page-fault unconditionally sends the \textit{SIGSEGV} signal (line highlighted) to the process that caused the page-fault.
It does not use deferred page mapping described in \cref{rnd::weakness-mitig::stack-protection::rust-stack-clash::in-os}, and is therefore not vulnerable to the stack clash.
This is based on a design decision and has little to do with \gls{Rust}.

\subsubsection{Influences on Rust}
The main author of Redox OS has become an active contributor to the Rust language, likely with the main motivation of making Rust more suitable for \gls{os} development.

The biggest achievement from the perspective of this study is the successful integration into Rust's libstd, which happened incrementally and therefore cannot be attributed to a single reference.
This allows programmers to use Rust with all its features to develop programs for Redox OS.

\subsection{Tock OS}
Tock OS is "an embedded operating system designed for running multiple concurrent, mutually distrustful applications on low-memory and low-power microcontrollers."\cite{TockOS}

\subsubsection{Task Model}
\subsubsection{Memory Management}

\subsection{intermezzOS}
"intermezzOS is a teaching operating system, specifically focused on introducing systems programming concepts to experienced developers from other areas of programming."\footnote{\url{https://intermezzos.github.io/}}

The project consists of two source code repositories and an accompanying book.
It has been inspired by the Blog OS, the author of which is also a contributor to intermezzOS.

The "bare-bones" repository contains only rudimentary machinery, from which the book walks the developer step-by-step to a successful boot of the kernel within a virtual machine emulator.

The "kernel" repository contains more advanced development and even surpasses the book's latest chapters.
This code base has been chosen as the foundation for the \gls{os} developments for this study.
Starting with this code base, preemptive multitasking is implemented, with the goal to learn as much as possible about the language's memory-safety aspects.
This development is documented in \cref{rnd::imezzos-preemptive-multitasking}.

\subsection{Others}
This section gives an overview of the many projects I have stumbled upon that I think are worth mentioning.
All these projects are undertakings to write \glspl{os} in Rust, and interested readers might want to take a look around.
While it is fortunate to see that Rust has gained popularity among programmers interested in \gls{os} development, the effort of investigating each project cannot be spent in the course of this work.

\paragraph{Tifflin}
Experimental Kernel (and eventually Operating System).
\url{https://github.com/thepowersgang/rust_os}

\paragraph{Rust Bare-Bones Kernel}
This is designed to be a Rust equivalent of the \url{OSDev.org} Bare\_Bones article, presenting the bare minimum you need to get started.
\url{https://github.com/thepowersgang/rust-barebones-kernel}

\paragraph{Bare Metal Rust: Building kernels in Rust}
A blog series that advances Blog OS (\cref{rnd::existing-os-dev-with-rust::systems::blog-os}).
\url{http://www.randomhacks.net/bare-metal-rust/}

\paragraph{The Stupid Operating System}
SOS is a simple, tiny toy OS implemented in Rust.
\url{https://github.com/hawkw/sos-kernel/}

\chapter{\glsentrytext{imezzos}: Adding Preemptive \glsentrytext{os}-Level Multitasking}
\label{rnd::imezzos-preemptive-multitasking}
Development on intermezzOS -- or any other \gls{os} -- requires features that are only available in Rust's nightly version.
This version is under very active development, and at the time I started development on intermezzOS, the project was not compatible with the then-current version.
Debugging a non-working system is hard for someone who has never seen it in a working state, and the initial learning curve for the required tools, in addition to learning a new language, was very steep.
This chapter assumes basic knowledge on how binaries are compiled and linked by a chain of tools.

\paragraph{On Code Length in this Chapter}
I am aware that the code takes up much space, but I have decided to keep it as is, as this study is about \gls{os} \emph{development}, which is the combination of theoretical knowledge and practical implementation.
To keep this document self-contained and allow a comfortable reading experience, the findings and code are tightly coupled.

The following change to the target specification shows that \gls{GCC} is still required in the development process, at least for this specific \gls{os}.
However, it is only used for linking and not for actual compilation.

\begin{minted}[breaklines]{diff}
--- a/x86_64-unknown-intermezzos-gnu.json
+++ b/x86_64-unknown-intermezzos-gnu.json
@@ -3,7 +3,8 @@
 	"cpu": "x86-64",
 	"data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
 	"executables": true,
-        "linker-flavor": "gcc",
+	"linker": "gcc",
+	"linker-flavor": "gcc",
 	"llvm-target": "x86_64-unknown-none-gnu",
 	"no-compiler-rt": true,
 	"os": "intermezzos",
\end{minted}

The change itself is not very interesting, but the file per se is.
It is used to teach \gls{rustc} about the target system, so that it can produce compatible code.

\paragraph{Toolchain}
The following tools form the toolchain required to work on intermezzOS:
\begin{itemize}
    \item rustc -- \glsentrylong{rustc}
    \item cargo -- \glsentrylong{cargo}
    \item xargo -- \glsentrylong{xargo}
    \item nasm -- Assembler
    \item ld -- Linker
    \item qemu-system-x86\_64
    \item grub2-mkrescue -- GRUB2 bootloader image builder
    \item xorriso -- ISO file writer
    \item gdb -- Debugger
    \item make -- Because Makefile is (still) being used.
\end{itemize}

\paragraph{Build Process}
The build process gives an impression of what is required to build an \gls{os} executable with \gls{Rust}.
\begin{enumerate}
    \item \code{make} manages the inter-dependencies of the build process.
    \item \code{nasm} assembles the assembly code that bootstraps the system from the multiboot stage in 32-Bit mode.
    \item \code{rustc} compiles the Rust code, which contains the 64-Bit code.
    \item \code{ld} is used to link these two together and form a multiboot-compliant kernel binary.
    \item \code{grub2-mkrescue} is used to generate a multiboot-compliant bootloader. It will load the kernel binary.
    \item \code{xorriso} combines the kernel binary and bootloader into a bootable ISO.
    \item \code{qemu-system-x86_64} can be used to boot the ISO.
\end{enumerate}

\section{Development State}
The intended goal of preemptive multitasking has been reached.
Tasks are represented by plain \code{fn()} instances.
The tasks and the task table are statically defined in the \gls{os} source code.
Task switches are driven by the Programmable Interval Timer, for which a driver has been implemented.
The task scheduler works in a round-robin fashion and detects stack overflows.
A task that overflows its stack is no longer scheduled.
The stack size is statically defined and is allocated globally by the compiler.
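
The scheduling policy described above can be sketched as follows.
This is not the intermezzOS implementation: \code{Scheduler}, \code{TaskState}, \code{mark_overflowed} and \code{pick_next} are hypothetical names, the overflow detection itself is left out, and the sketch only shows a statically sized task table with round-robin selection that skips overflowed tasks.

\begin{minted}[breaklines]{rust}
/// A task is a plain function, as described in the text.
type Task = fn();

#[derive(Clone, Copy, PartialEq, Debug)]
enum TaskState {
    Ready,
    StackOverflowed,
}

/// Task table with a size fixed at compile time, no dynamic allocation.
struct Scheduler<const N: usize> {
    tasks: [Task; N],
    states: [TaskState; N],
    current: usize,
}

impl<const N: usize> Scheduler<N> {
    fn new(tasks: [Task; N]) -> Self {
        Scheduler { tasks, states: [TaskState::Ready; N], current: 0 }
    }

    /// Marks a task whose stack overflowed so it is never picked again.
    fn mark_overflowed(&mut self, index: usize) {
        self.states[index] = TaskState::StackOverflowed;
    }

    /// Round-robin: picks the next ready task after the current one.
    fn pick_next(&mut self) -> Option<usize> {
        for offset in 1..=N {
            let candidate = (self.current + offset) % N;
            if self.states[candidate] == TaskState::Ready {
                self.current = candidate;
                return Some(candidate);
            }
        }
        None
    }
}

fn task0() {}
fn task1() {}
fn task2() {}

fn main() {
    let mut scheduler = Scheduler::new([task0 as Task, task1, task2]);
    scheduler.mark_overflowed(1);
    for _ in 0..4 {
        if let Some(index) = scheduler.pick_next() {
            (scheduler.tasks[index])();
            println!("dispatched task {}", index);
        }
    }
}
\end{minted}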

The implementation uses no dynamic memory allocations, thus there was no experience gathered with managing dynamic memory within the \gls{os}.
The global state references might be accessed by any defined task, e.g. allowing TODO


\section{System Clock Driver}
This section will walk through the creation of a simple clock driver.

The first usage of traits was the definition and implementation of the \code{Clock} trait for the \code{Pit} type.
The trait defines the properties of a driver that implements a Clock, and the Pit is the hardware specific implementation for this trait.

\subsection{Trait and Pit Implementation}
\paragraph{Trait Definition}
The trait defines a clean interface for any system clock.
The highlighted lines show the \code{unsafe} functions within this trait.
This is used to force the caller to use \code{unsafe}, which must only be done with care and never from a regular task.

\begin{minted}[breaklines,highlightlines={4,11}]{rust}
/// The Clock trait is for each clock type.
pub trait Clock {
    /// Start the clock
    unsafe fn start(&self);

    /// Receive the frequency the clock is set for
    fn frequency(&self) -> SimpleResult<Frequency>;

    /// Update the internal clock counter by one.
    /// The time of one tick is `1/self.frequency()`s.
    unsafe fn tick(&self);

    /// Receive the current tick counter
    fn ticks(&self) -> SimpleResult<(u64, Duration)>;

    /// Returns the uptime as `Duration`.
    /// This assumes that **all** fired clock interrupts have successfully called `self.tick()`.
    fn uptime(&self) -> SimpleResult<Duration>;
}
\end{minted}

\paragraph{Implementing Clock for Pit}
This code lives in the same file as the Clock trait, but it shows how code can be structured by modules.
The \code{use} statements are required for using anything defined outside of the module, even for constants of the parent module, as can be seen.
This is a clean way of handling hardware-specific constants.

\begin{minted}[breaklines,highlightlines={5}]{rust}
/// This module implements a system clock using the Programmable Interval Timer
/// ...
pub mod pit {
    ...
    use super::consts::NSEC_MULTIPLIER;
    ...

    /// Constants definitions for the pit module
    pub mod consts {
        pub const BASE_FREQUENCY: u32 = 1193182;
        ...
        pub const CHANNEL_IO_PORTS: [u16; 3] =
            [CHANNEL0_IO_PORT, CHANNEL1_IO_PORT, CHANNEL2_IO_PORT];
    }

    ...

    /// Type for the Programmable Interval Timer
    pub struct Pit {
        pub frequency: Frequency,
        divisor: u16,
        pub resolution: u64,
        channel: u8,
        ticks_atomic: AtomicUsize,
    }
    ...
\end{minted}
Some of the fields of \code{Pit} are made \code{pub}lic; notice that the tick counter is not one of them.

\subsection{Global CLOCK State}
\paragraph{Initialization}
The state of the clock is held globally, though it must be initialized with non-static code.
This is possible in Rust through lazy initialization, which works by defining a static reference and thereby implements the singleton pattern.

The dereference method for this reference is automatically generated to call the initialization code on first dereference.
The compiler is aware of this and reserves memory at compile time.
This functionality is implemented by an external crate called \emph{lazy\_static}.

\begin{minted}[breaklines]{rust}
/// Initialization of these references happens on first deref
lazy_static! {
    static ref CONTEXT: intermezzos::kernel::Context = intermezzos::kernel::Context::new();
    static ref CLOCK: clock::pit::Pit = clock::pit::new(0, (0x71ae) as u16);
    ...
}

...

/// Task0 starts the clock, enables interrupts and goes to sleep.
fn task0() {
    // This will trigger clock::pit::new(...) and then call .start()
    unsafe { CLOCK.start() };
    kprintln!(CONTEXT,
              "System clock set up. Frequency: {} / Resolution: {}ns",
              // pub fields can be accessed
              CLOCK.frequency,
              CLOCK.resolution);
    ...
}
\end{minted}

As can be seen in the code snippet, the \gls{os}'s \code{CONTEXT} reference is stored next to \code{CLOCK}.
\code{task0} can then simply access the public fields.
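
The same initialize-on-first-use behaviour can be tried out on a hosted system without external crates.
The following sketch swaps in \code{std::sync::OnceLock} from the standard library instead of \emph{lazy\_static}, so it is not usable as-is in a kernel without the standard library.
The reduced \code{Pit} struct, the \code{clock} accessor, the \code{INIT_CALLS} counter and the way the frequency is derived from the divisor are assumptions for illustration.

\begin{minted}[breaklines]{rust}
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Reduced stand-in for the clock driver state.
struct Pit {
    frequency: u32,
    divisor: u16,
}

/// Base frequency of the timer in Hz, as in the `consts` module above.
const BASE_FREQUENCY: u32 = 1193182;

/// Counts how often the initializer ran (for demonstration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Global state: memory is reserved statically, the value is set later.
static CLOCK: OnceLock<Pit> = OnceLock::new();

/// Returns the global clock and runs the initializer on first use only.
fn clock() -> &'static Pit {
    CLOCK.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        let divisor: u16 = 0x71ae;
        Pit { frequency: BASE_FREQUENCY / u32::from(divisor), divisor }
    })
}

fn main() {
    println!("frequency: {} Hz, divisor: {:#x}", clock().frequency, clock().divisor);
    println!("initializer ran {} time(s)", INIT_CALLS.load(Ordering::SeqCst));
}
\end{minted}

Every call to \code{clock} after the first one returns the already initialized value, which is the singleton behaviour described above.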

\paragraph{Starting The Clock}
The driver uses an atomic integer type that is part of the core library, which is well suited as a clock counter.
It doesn't require a lock even when shared between multiple tasks.

\begin{minted}[breaklines]{rust}
pub mod pit {
    use x86::shared::io::outb;
    use core::sync::atomic::{AtomicUsize, Ordering, ATOMIC_USIZE_INIT};
    ...
}
...
impl Clock for Pit {
    unsafe fn start(&self) {
        let lobyte = (self.divisor & 0xFF) as u8;
        let hibyte = ((self.divisor >> 8) & 0xFF) as u8;
        unsafe {
            outb(consts::COMMAND_PORT, gen_command(self.channel));
            outb(consts::CHANNEL_IO_PORTS[self.channel as usize], lobyte);
            outb(consts::CHANNEL_IO_PORTS[self.channel as usize], hibyte);
        };
    }
    ...
    unsafe fn tick(&self) {
        self.ticks_atomic.fetch_add(1, Ordering::SeqCst);
    }
    ...
}
\end{minted}
The \code{start} method is the first occurrence of \code{unsafe}, which is required to perform raw I/O port access using \code{outb}.
\code{tick} is extremely simple: it uses \code{fetch_add} to atomically add one, requesting a specific ordering: \textit{SeqCst: Like AcqRel with the additional guarantee that all threads see all sequentially consistent operations in the same order}.\footnote{\url{https://doc.rust-lang.org/core/sync/atomic/enum.Ordering.html}}
This method is called in the \gls{os} timer interrupt handler.
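
The lock-free property can be demonstrated on a hosted system, where threads take the role of the \gls{os} tasks.
The following sketch is an assumption-laden stand-in for the driver: \code{TickCounter} and \code{count_ticks} are hypothetical names, and \code{std::thread::scope} is used only to create concurrent callers.

\begin{minted}[breaklines]{rust}
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Minimal counter modelled after the `ticks_atomic` field of `Pit`.
struct TickCounter {
    ticks: AtomicUsize,
}

impl TickCounter {
    const fn new() -> Self {
        TickCounter { ticks: AtomicUsize::new(0) }
    }

    /// Atomically adds one; `&self` suffices, no lock is needed.
    fn tick(&self) {
        self.ticks.fetch_add(1, Ordering::SeqCst);
    }

    fn ticks(&self) -> usize {
        self.ticks.load(Ordering::SeqCst)
    }
}

/// Lets several threads tick the same counter and returns the total.
fn count_ticks(threads: usize, ticks_per_thread: usize) -> usize {
    let counter = TickCounter::new();
    thread::scope(|scope| {
        for _ in 0..threads {
            scope.spawn(|| {
                for _ in 0..ticks_per_thread {
                    counter.tick();
                }
            });
        }
    });
    counter.ticks()
}

fn main() {
    println!("total ticks: {}", count_ticks(4, 1000));
}
\end{minted}

No increment is lost although the counter is shared through a plain shared reference, which is why the driver can expose \code{tick} on \code{&self}.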


% TODO: Is the static analysis of hardware specific assembly code possible and useful at all?
% TODO: LLVM knows about the target and can potentially give hints about hardware specific instructions

\section{Timer Interrupt For Scheduling and Dispatching}
\label{rnd::imezzos-preemptive-multitasking::timer-interrupt-scheduling}
The timer interrupt will trigger according to the frequency that was set for the \code{Pit} clock driver previously explained.

\subsection{Macro For Interrupt-Handler Setup}
\label{rnd::imezzos-preemptive-multitasking::timer-interrupt-scheduling::macro}
The handler definition is assisted by a macro rule that already existed in the codebase but was significantly changed.
It showcases macro and also inline assembly functionality.

\subsubsection{Macro Semantics}
The macro matches one pattern with five metavariables.
This means that it cannot be invoked with more or fewer arguments.
Additionally, they have different fragment types, which make them match only certain token trees.
The passed \code{$name} will be the name of the defined function.
\code{$esf:ident: $esfty:ty} at usage, looks like a normal variable definition with a name and a type.
It is used as the parameter for the interrupt handler.
The \code{$body} is the function body of the interrupt handler, defined by the macro caller.
The last important aspect of the macro semantics is that the last line within the emitted code does not have a semicolon, which means the macro expression will evaluate to this value, namely an instance of \code{IdtEntry}.
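
These semantics can be reproduced with a reduced macro that runs on stable Rust.
In the following sketch \code{make_entry}, \code{Frame} and \code{Entry} are hypothetical stand-ins for \code{make_idt_entry}, the exception stack frame type and \code{IdtEntry}; the interrupt calling convention and the inline assembly are left out on purpose.

\begin{minted}[breaklines]{rust}
/// Stand-in for the stack frame argument of a handler.
pub struct Frame {
    pub hits: u32,
}

/// Stand-in for `IdtEntry`.
pub struct Entry {
    pub handler: fn(&mut Frame),
    pub interrupt_gate: bool,
}

macro_rules! make_entry {
    ($name:ident, $arg:ident: $argty:ty, $gate:expr, $body:expr) => {{
        // The function is named and typed by the macro caller.
        fn $name($arg: $argty) {
            $body
        }
        // No trailing semicolon: the block evaluates to this value.
        Entry { handler: $name, interrupt_gate: $gate }
    }};
}

fn main() {
    let entry = make_entry!(isr32, frame: &mut Frame, true, {
        frame.hits += 1;
    });
    let mut frame = Frame { hits: 0 };
    (entry.handler)(&mut frame);
    println!("hits = {}, gate = {}", frame.hits, entry.interrupt_gate);
}
\end{minted}

The body can refer to \code{frame} because the parameter name and the body both originate from the call site, so macro hygiene treats them as the same identifier.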

\begin{figure}[ht!]
\begin{minted}[breaklines,linenos,highlightlines={3,6,12,17,21}]{rust}
#[macro_export]
macro_rules! make_idt_entry {
    ($name:ident, $esf:ident: $esfty:ty, $ir_gate:expr, $body:expr) => {{
        ... 

        extern "x86-interrupt" fn $name($esf: $esfty) {
            unsafe {
                asm!(""
                    : // output operands
                    : // input operands
                    : // clobbers
                    "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15", "rbp"
                    : // options
                    "intel", "volatile"
                );
            }
            $body
        };
        ...
        let handler = VAddr::from_usize($name as usize);
        IdtEntry::new(handler, 0x8, PrivilegeLevel::Ring0, $ir_gate)
    }};
}
\end{minted}
\caption{intermezzOS: Macro for defining Interrupt Handlers}
\label{code::imezzos::ir-handler-macro}
\end{figure}
\FloatBarrier
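
The macro semantics described above can be reproduced outside the kernel.
The following is a minimal sketch and not the intermezzOS code: \code{Entry}, \code{make_entry} and \code{demo} are hypothetical names, an ordinary function replaces the interrupt handler, and the handler is stored as a function pointer instead of an address so that it can be called for demonstration.

\begin{minted}[breaklines]{rust}
// Hypothetical stand-in for the kernel's IdtEntry: it only stores what the
// macro hands over, so that the result can be inspected in user space.
struct Entry<F> {
    handler: F,
    gate: bool,
}

// Same pattern shape as make_idt_entry: one match arm, five metavariables.
macro_rules! make_entry {
    ($name:ident, $arg:ident: $argty:ty, $gate:expr, $body:expr) => {{
        // The caller chooses the function name, the parameter and the body.
        fn $name($arg: $argty) {
            $body
        }
        // No trailing semicolon: the macro invocation evaluates to this value.
        Entry { handler: $name as fn($argty), gate: $gate }
    }};
}

// Builds an entry, calls its handler once and reports the observed state.
fn demo() -> (u64, bool) {
    let entry = make_entry!(isr_demo, ticks: &mut u64, true, {
        *ticks += 1;
    });
    let mut ticks = 41;
    (entry.handler)(&mut ticks);
    (ticks, entry.gate)
}

fn main() {
    let (ticks, gate) = demo();
    println!("ticks = {}, gate = {}", ticks, gate);
}
\end{minted}

Invoking \code{make_entry} with four or six arguments, or with an expression where the identifier is expected, is rejected at compile time because no pattern matches.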

\subsubsection{OS Semantics}
It is worth explaining how the macro semantics are used to model \gls{os} semantics in \cref{code::imezzos::ir-handler-macro}.

\paragraph{Interrupt ABI Function Type}
For each defined handler, the macro lets the caller set the argument type of the handler function.
The \emph{value} will be passed at runtime by the \gls{cpu} for each interrupt, and not every interrupt uses the same layout for this argument.
This type must match the exception \gls{sf} layout introduced in \cpnameref{fig:amd64-long-mode-interrupt-stac}, which can be either with or without the error field.
This decision is made by the macro caller, as the interrupt type is not known within the macro, and can only be known by the developer later.

Thanks to the pull-request described in \cpnameref{rnd::existing-os-dev-with-rust::systems::blog-os::influence}, the \code{extern "x86-interrupt"} calling convention can be used to define the interrupt handlers.
It enables the proper handling of the first argument and, in combination with the \emph{clobber} registers shown in \cref{code::imezzos::ir-handler-macro} lines 11--12, makes the compiler generate a function prologue and epilogue that automatically \code{PUSH} all named registers onto the stack and \code{POP} them again.
As a result, the inline assembly string provided by the programmer is empty, which alleviates the necessity of hand-written \code{unsafe} assembly code.
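
The choice between the two argument layouts can be illustrated with plain types.
The following is a sketch under stated assumptions: the field order follows the order in which the \gls{cpu} pushes the values in long mode, \code{ExceptionStackFrameWithError} and the two type aliases are hypothetical names, and ordinary function types stand in for the \code{"x86-interrupt"} ABI, which requires a nightly compiler.

\begin{minted}[breaklines]{rust}
use std::mem::size_of;

// Assumed field order: the CPU pushes SS, RSP, RFLAGS, CS and RIP, so RIP
// ends up at the lowest address and comes first in a repr(C) struct.
#[repr(C)]
#[derive(Debug, Default)]
struct ExceptionStackFrame {
    instruction_pointer: u64,
    code_segment: u64,
    cpu_flags: u64,
    stack_pointer: u64,
    stack_segment: u64,
}

// Hypothetical variant for exceptions that push an error code below the frame.
#[repr(C)]
#[derive(Debug, Default)]
struct ExceptionStackFrameWithError {
    error_code: u64,
    frame: ExceptionStackFrame,
}

// Plain function types stand in for the "x86-interrupt" ABI in this sketch.
type Handler = fn(&mut ExceptionStackFrame);
type HandlerWithError = fn(&mut ExceptionStackFrameWithError);

fn timer(frame: &mut ExceptionStackFrame) {
    frame.cpu_flags |= 0x200; // set the interrupt flag bit
}

fn fault(frame: &mut ExceptionStackFrameWithError) {
    frame.error_code = 0;
}

// Returns both layout sizes and the flags after one simulated timer call.
fn demo() -> (usize, usize, u64) {
    let handler: Handler = timer;
    let _with_error: HandlerWithError = fault; // `timer` would not type-check here
    let mut frame = ExceptionStackFrame::default();
    handler(&mut frame);
    (
        size_of::<ExceptionStackFrame>(),
        size_of::<ExceptionStackFrameWithError>(),
        frame.cpu_flags,
    )
}

fn main() {
    println!("{:?}", demo());
}
\end{minted}

Because the two handler types differ, registering a handler with the wrong layout is a type error rather than a runtime fault.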

\paragraph{Inline Assembly}
Further, the inline assembly is interesting.

\begin{minted}[breaklines]{rust}
    let timer = make_idt_entry!(isr32, esf: &mut ExceptionStackFrame, true, {
        ... 
        unsafe { CLOCK.tick() };
        ...
\end{minted}
TODO


\section{Tasks and Stacks}
\label{rnd::imezzos-preemptive-multitasking::tasks-stacks}
The implementation of the tasks has been kept straightforward, using static variables.

\subsection{Declaration and Instantiation}
\label{rnd::imezzos-preemptive-multitasking::tasks-stacks::dni}
\Cref{code::imezzos::stack-and-tasks-1} defines a \code{Stack} by its top and bottom addresses, which are computed as offsets from a constant base address.
Each subsequent stack increases the multiplier by 10, which leaves unused space between the stacks.

\begin{listing}[ht!]
\begin{minted}[breaklines,highlightlines={4-7}]{rust}
const STACKS_TOP: usize = 0x1_000_000; // 16 MiB
const STACK_SIZE: usize = 0x_002_000;  // 8 KiB
use tasks::stack::Stack;
const TASK0_STACK: Stack = Stack {
    top: STACKS_TOP - 10 * STACK_SIZE,
    bottom: STACKS_TOP - (10 + 1) * STACK_SIZE,
};
\end{minted}
\caption{intermezzOS: Stack and Task Definition - 1}
\label{code::imezzos::stack-and-tasks-1}
\end{listing}
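
The address arithmetic can be checked with a small standalone program.
This is a sketch that assumes the second task uses the multiplier 20, following the rule stated above; \code{STACK_SPACING}, \code{stack_for_task} and \code{gap_in_stacks} are hypothetical names that do not appear in the intermezzOS sources, and \code{Stack} is redefined locally.

\begin{minted}[breaklines]{rust}
const STACKS_TOP: usize = 0x1_000_000;
const STACK_SIZE: usize = 0x_002_000;
// Hypothetical name for the multiplier step between two stacks.
const STACK_SPACING: usize = 10;

// Local redefinition of the Stack type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stack {
    top: usize,
    bottom: usize,
}

// Hypothetical helper: computes the stack of the task with the given index.
const fn stack_for_task(index: usize) -> Stack {
    let multiplier = STACK_SPACING * (index + 1);
    Stack {
        top: STACKS_TOP - multiplier * STACK_SIZE,
        bottom: STACKS_TOP - (multiplier + 1) * STACK_SIZE,
    }
}

const TASK0_STACK: Stack = stack_for_task(0);
const TASK1_STACK: Stack = stack_for_task(1);

// Unused space between two neighbouring stacks, in units of STACK_SIZE.
fn gap_in_stacks() -> usize {
    (TASK0_STACK.bottom - TASK1_STACK.top) / STACK_SIZE
}

fn main() {
    println!("task 0: {:#x}..{:#x}", TASK0_STACK.bottom, TASK0_STACK.top);
    println!("task 1: {:#x}..{:#x}", TASK1_STACK.bottom, TASK1_STACK.top);
    println!("gap: {} stack sizes", gap_in_stacks());
}
\end{minted}

Since the functions are \code{const}, the addresses are still computed at compile time, as in the original constant definition.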

\Cref{code::imezzos::stack-and-tasks-2} defines a \code{TaskEntry} as one element of an array of such entries.
The highlighted lines are unique to each task.
In the given order, they represent their first instruction, their initial top of stack, and their initial set of \gls{cpu} registers.
Except for the instruction pointer, these fields have their own types and cannot easily be mixed up.

\begin{listing}[ht!]
\begin{minted}[breaklines,highlightlines={7,9,12}]{rust}
    let tasklist = [
        tasks::TaskEntry {
                name: "Task 0",
                esf: interrupts::ExceptionStackFrame{
                    code_segment: 0x8,
                    stack_segment: 0x10,
                    instruction_pointer: task0 as usize,
                    cpu_flags: 0x200202,
                    stack_pointer: TASK0_STACK.top,
                },
                stack: TASK0_STACK,
                registers: tasks::TaskRegisters::empty(),
                blocked: false,
                },
        ...
        tasks::TaskEntry {
                ...
                },
    ];
\end{minted}
\caption{intermezzOS: Stack and Task Definition - 2}
\label{code::imezzos::stack-and-tasks-2}
\end{listing}

\Cref{code::imezzos::stack-and-tasks-3} wraps this array in a \code{Mutex}, which is returned by the initializer expression and stored as a \code{lazy_static} reference, as explained in the previous section.
The \code{Mutex} type is interesting, as it provides \emph{interior mutability}.
This explains how the tasklist can be mutated at runtime, even though it is not declared as \code{mut}.

\begin{listing}[ht!]
\begin{minted}[breaklines,highlightlines={7}]{rust}
lazy_static! {
    ...
    static ref TSI: Mutex<tasks::TaskStateInformation> = {
        let tasklist = [
            ...
        ];
        Mutex::new(tasks::TaskStateInformation::new(tasklist))
    };
    ...
};
\end{minted}
\caption{intermezzOS: Stack and Task Definition - 3}
\label{code::imezzos::stack-and-tasks-3}
\end{listing}
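
Interior mutability can be demonstrated without the kernel.
The following sketch swaps in \code{std::sync::Mutex} for the kernel's mutex and omits \code{lazy_static}, because the standard library constructor can be evaluated at compile time; \code{TaskState}, \code{block_task} and \code{is_blocked} are hypothetical names.

\begin{minted}[breaklines]{rust}
use std::sync::Mutex;

// Hypothetical, reduced stand-in for TaskStateInformation.
struct TaskState {
    blocked: [bool; 2],
}

// Note the absence of `mut`: the static itself is immutable.
static TSI: Mutex<TaskState> = Mutex::new(TaskState {
    blocked: [false, false],
});

// Marks a task as blocked. The write goes through the guard returned by
// lock(), which hands out exclusive access while it is alive.
fn block_task(index: usize) {
    let mut tsi = TSI.lock().unwrap();
    tsi.blocked[index] = true;
}

// Reads the flag back through a second, independent lock.
fn is_blocked(index: usize) -> bool {
    TSI.lock().unwrap().blocked[index]
}

fn main() {
    block_task(0);
    println!("task 0 blocked: {}", is_blocked(0));
}
\end{minted}

No \code{unsafe} is needed for the mutation, which would not be the case for a \code{static mut} variable.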

\subsection{Preemptive Task Switches}

\subsection{Task Definitions}

\section{Safety}
\subsection{Protecting Static Resources}
 TODO
\begin{minted}{text}
error: usage of an `unsafe` block
   --> src/main.rs:499:5
    |
499 |     unsafe { CLOCK.start() };
    |     ^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: lint level defined here
   --> src/main.rs:497:8
    |
497 | #[deny(unsafe_code)]
    |        ^^^^^^^^^^^
\end{minted}
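
The \code{unsafe_code} lint behind this diagnostic can be scoped to individual items.
The following is a minimal sketch of one possible arrangement and not the structure of the intermezzOS sources: the module names \code{scheduler} and \code{hw} and their functions are hypothetical.

\begin{minted}[breaklines]{rust}
// Everything inside this module is checked by the lint.
#[deny(unsafe_code)]
mod scheduler {
    // Pure bookkeeping: picks the next task in round-robin order.
    pub fn next_task(current: usize, count: usize) -> usize {
        (current + 1) % count
    }
    // Adding an `unsafe { ... }` block here fails to compile with
    // "usage of an `unsafe` block".
}

// Hypothetical hardware-facing module: the lint is not enabled here, so this
// is the only place where `unsafe` is accepted.
mod hw {
    pub fn read_ticks(counter: &u64) -> u64 {
        // SAFETY: the pointer is derived from a valid reference.
        unsafe { std::ptr::read_volatile(counter) }
    }
}

fn main() {
    let ticks = 7u64;
    println!("ticks = {}", hw::read_ticks(&ticks));
    println!("next task = {}", scheduler::next_task(1, 2));
}
\end{minted}

With this split, a review of the unsafe operations only has to cover the module that is exempt from the lint.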

\subsection{Risk Of Stack-Overflow}
TODO
- TODO: reference stack protection
Give a practical example what this could look like with an extension attribute

\chapter{Result Summary}
- TODO 
TODO