The usage of the \gls{stack} is tightly coupled with control flow instructions in conjunction with two registers, the Stack-Frame Base Pointer (RBP) and the Stack Pointer (RSP).
The instructions that use these registers and explicitly or implicitly work with the stack\cite[p.~83]{AMD64Vol1} can be grouped into the following categories.
Together they can be used to perform \gls{stack} based procedure calls, as demonstrated in the following \cref{context::introduction::hw-supported-mm::procedure-call-example}.
\paragraph{Direct Stack Data Management} with PUSH and POP.
PUSH takes a value operand, which is pushed onto the stack.
The address in RSP moves towards numerically lower addresses with every PUSH instruction, which stores a new data entry on top.
The order is to first change the RSP and then copy the value to its new address.
POP works in the reverse order: it first consumes the top-most data entry and stores it at the operand location, then moves the RSP address towards the numerically higher RBP address.
When RBP and RSP point to the same address, the stack is considered empty.
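As a worked sketch of this pointer arithmetic, assume a 64-Bit operand (8 bytes, the default operand size of PUSH and POP in 64-Bit mode) and write $M[a]$ for the memory at address $a$.
The symbols $M$, $v$ and $d$ as well as the concrete addresses are illustrative assumptions and are not taken from the manual.
\[
\begin{array}{lll}
\mathrm{PUSH}\ v: & \mathrm{RSP} \leftarrow \mathrm{RSP} - 8, & M[\mathrm{RSP}] \leftarrow v \\
\mathrm{POP}\ d: & d \leftarrow M[\mathrm{RSP}], & \mathrm{RSP} \leftarrow \mathrm{RSP} + 8
\end{array}
\]
Starting from $\mathrm{RSP} = \mathtt{0x7FF0}$, a PUSH leaves $\mathrm{RSP} = \mathtt{0x7FE8}$, and the matching POP restores $\mathrm{RSP} = \mathtt{0x7FF0}$.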
\paragraph{Procedure Calls} with CALL and RET. \\
These instructions control the instruction flow by calling another instruction procedure\footnote{loosely synonymous with function}.
The CALL instruction takes the address of the instruction that is to be called.
Before jumping to the instruction at the given address, it PUSHes the current value of the RIP (instruction pointer) register, which already points to the instruction following the CALL, onto the \gls{stack}.
RET takes no operand, but instead POPs the \gls{stack}'s top entry.
The consumed value is used as a jump address.
As PUSH and POP use the RSP register, the called procedure is responsible for finishing with the RSP at the same position as when it was entered.
For example, PUSHing some value onto the stack before the end of the function would cause the RET to jump to that address instead of returning to the caller.
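The effect of both instructions on RSP and RIP can be sketched in register-transfer notation, assuming a near CALL to a target address $t$ and a RET without operand.
Here $M[a]$ denotes the 8-byte memory location at address $a$ and $\mathrm{RIP}_{\mathrm{next}}$ the address of the instruction following the CALL; these symbols are illustrative assumptions.
\[
\begin{array}{llll}
\mathrm{CALL}\ t: & \mathrm{RSP} \leftarrow \mathrm{RSP} - 8, & M[\mathrm{RSP}] \leftarrow \mathrm{RIP}_{\mathrm{next}}, & \mathrm{RIP} \leftarrow t \\
\mathrm{RET}: & \mathrm{RIP} \leftarrow M[\mathrm{RSP}], & \mathrm{RSP} \leftarrow \mathrm{RSP} + 8 &
\end{array}
\]
If the called procedure executes one unmatched PUSH of a value $v$ before RET, then $M[\mathrm{RSP}] = v$ when RET executes, so RIP receives $v$ instead of $\mathrm{RIP}_{\mathrm{next}}$.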
\paragraph{Called Procedure Setup} \emph{not} with ENTER and LEAVE.
Return address (created by the CALL instruction). \\
\textit{Always used by CALL}
}
\item{%
Array of stack-frame pointers (pointers to stack frames of procedures with smaller nesting-level depth) which are used to access the local variables of such procedures. \\
\textit{Depends on support and implementation of nested functions in the \gls{compiler}}
The \gls{amd64} manual also lists ENTER and LEAVE as instructions to \textit{"provide support for procedure calls, and are mainly used in high-level languages."}\cite[p.~48]{AMD64Vol1}.
The latter claim could not be verified by inspecting binaries produced by the \gls{C} and \gls{Rust} \glspl{compiler}.
Instead, these \glspl{compiler} generate a sequence of PUSH, MOV and SUB instructions to set up the \gls{stack}.
There are instructions before and after the procedure's logic, taking care of the technicalities of \gls{stack} management.
These instruction groups within the called procedure are called prologue and epilogue.
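One common shape of such a prologue and epilogue is sketched below.
It assumes that the procedure keeps a frame pointer in RBP and reserves $N$ bytes for local variables; $N$ is a hypothetical, procedure-specific constant, and a \gls{compiler} may omit the frame pointer or emit a different sequence.
\[
\begin{array}{lll}
\mbox{prologue} & \mathtt{PUSH\ RBP} & \mbox{save the caller's frame base} \\
 & \mathtt{MOV\ RBP,\ RSP} & \mbox{start the new frame at the current top} \\
 & \mathtt{SUB\ RSP,\ } N & \mbox{reserve space for local variables} \\
\mbox{epilogue} & \mathtt{MOV\ RSP,\ RBP} & \mbox{release the local variables} \\
 & \mathtt{POP\ RBP} & \mbox{restore the caller's frame base} \\
 & \mathtt{RET} & \mbox{return to the caller}
\end{array}
\]
After the epilogue's POP, RSP points at the return address again, which is the precondition for RET.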
This section combines the separate categories into one complete example that shows how the \gls{stack} is used by various \gls{cpu} instructions to perform procedure calls.
It stores the arguments within the registers according to the System V X86\_64 calling convention. %TODO REFERENCE
The caller doesn't alter the stack-frame pointer (RBP) or the stack pointer (RSP) registers before the call, hence the called function must restore them if it alters them.
\section{4-Level Paging Hierarchy on \glsentrytext{amd64}}
\label{rnd::sysprog-conventions::paging-amd64}
On \gls{amd64} "a four-level page-translation data structure is provided to allow long-mode operating systems to translate a 64-Bit virtual-address space into a 52-Bit physical-address space."\cite[p.~18]{AMD64Vol2}.
This allows the system to hold only the \textit{PML4} table, which is currently referenced by the \textit{Page Map Base Register (CR3)}, available in main memory.
\cref{fig:virtual-addr-transl} shows the 64-Bit virtual address composition on \gls{amd64}, which uses four levels of page tables.
Counterintuitively, the page-tables are not called level-\textit{n}-page-table, but the levels received distinct names in \citetitle{AMD64Vol2}.
The most-significant Bits labelled as \textit{Sign Extend} are not used for addressing purposes, but must adhere to the canonical address form and simply repeat the value of the most-significant implemented Bit\cite[p.~130]{AMD64Vol2}.
The least significant Bits represent the offset within the physical page.
The four groups in between are used to index the page-table at their respective level.
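For the 4 KiB page size, this split of a virtual address $V$ can be sketched as bit-field arithmetic.
The level names PML4, PDP, PD and PT follow the manual's naming of the four levels; the symbols $V$ and $i$ and the example address are illustrative assumptions.
\[
\begin{array}{rcll}
i_{\mathrm{PML4}} & = & \lfloor V / 2^{39} \rfloor \bmod 2^{9} & \mbox{(Bits 47--39)} \\
i_{\mathrm{PDP}} & = & \lfloor V / 2^{30} \rfloor \bmod 2^{9} & \mbox{(Bits 38--30)} \\
i_{\mathrm{PD}} & = & \lfloor V / 2^{21} \rfloor \bmod 2^{9} & \mbox{(Bits 29--21)} \\
i_{\mathrm{PT}} & = & \lfloor V / 2^{12} \rfloor \bmod 2^{9} & \mbox{(Bits 20--12)} \\
\mathit{offset} & = & V \bmod 2^{12} & \mbox{(Bits 11--0)}
\end{array}
\]
For example, $V = \mathtt{0xFFFF800000201ABC}$ is canonical because Bits 63--48 repeat Bit 47, which is 1.
It decomposes into $i_{\mathrm{PML4}} = 256$, $i_{\mathrm{PDP}} = 0$, $i_{\mathrm{PD}} = 1$, $i_{\mathrm{PT}} = 1$ and $\mathit{offset} = \mathtt{0xABC}$.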
The other supported page sizes, 2 MiB and 1 GiB, as well as intermixing page sizes through the different levels don't add new insight into the mechanism and don't need to be detailed here.
On \gls{amd64}, the \gls{cpu}'s interrupt mechanism does not switch the full context described previously, but only handles the registers that are necessary to successfully jump to the interrupt function: RFLAGS, RSP, RIP\footnote{Segment registers are neglected}.
A description for \gls{amd64} is given in \cref{tab:task-minimum-context-registers}.
\begin{table}
\begin{tabularx}{\textwidth}{| c | X | X |}
\hline
\textbf{descriptive name}&
\textbf{register names on amd64}&
\textbf{description}\\
\hline
the instruction pointer register & RIP & address of the next instruction to be fetched \\
\hline
the stack pointer register & RSP & address of current position in stack \\
\hline
the flags register & RFLAGS & various attributes, e.g. the interrupt flag \\
\hline
all general-purpose registers & RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8–R15 & arbitrary data \\
\hline
\end{tabularx}
\caption{Minimum Context Registers on amd64\cite[p.~28]{AMD64Vol2}}
\label{tab:task-minimum-context-registers}
\end{table}
\subsection{Storing The Context On The Stack}
In this scenario, the context is stored on the \gls{stack} of the function that is interrupted.
\Cref{fig:amd64-long-mode-interrupt-stac} pictures the \gls{stack} layout on interrupt entry.
In order to leverage an interrupt for a context switch, the interrupt function needs to replace these values on the \gls{stack} with values for the new context.
CS (Code-Segment) and SS (Stack-Segment) have no effect in \gls{amd64} 64-Bit mode\cite[p.~20]{AMD64Vol1} and can remain unchanged.
The \gls{os} developer needs to know the exact address on the \gls{stack} at which the \gls{cpu} has pushed this data structure, and must then manipulate the values at these addresses directly.
This type of manipulation is inherently dangerous and cannot be easily checked by the \gls{compiler}.
The function that handles the interrupt must then use the instruction \textit{iretq}\cite[p.~252]{AMD64Vol2} to make the \gls{cpu} restore the partial context from the \gls{stack} and continue with the function pointed to by the RIP.
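As a sketch of the addresses involved: in 64-Bit mode the \gls{cpu} pushes five 8-byte values on interrupt entry.
The offsets below are relative to the value of RSP at the first instruction of the interrupt function.
They assume an interrupt without an error code (some exceptions push an additional error code below RIP, which shifts every offset by 8) and an interrupt function that has not yet moved RSP itself.
\begin{center}
\begin{tabular}{| l | l |}
\hline
\textbf{address} & \textbf{content} \\
\hline
RSP + 32 & SS of the interrupted function \\
\hline
RSP + 24 & RSP of the interrupted function \\
\hline
RSP + 16 & RFLAGS of the interrupted function \\
\hline
RSP + 8 & CS of the interrupted function \\
\hline
RSP + 0 & RIP at which the interrupted function continues \\
\hline
\end{tabular}
\end{center}
Under these assumptions, a context switch replaces the entries at RSP + 0, RSP + 16 and RSP + 24 before \textit{iretq} is executed.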
In this chapter, the weakness manifestations from \cref{context::common-mem-safety-mistakes::manifestations} are rewritten in \gls{Rust} to learn to what level they are mitigated just by porting them.