context/rnd: paging/stack/heap/virtualization
This commit is contained in:
parent
12b71b3744
commit
83c5540a42
8 changed files with 972 additions and 382 deletions
|
@ -1,6 +1,32 @@
|
|||
% // vim: set ft=tex:
|
||||
\chapter{Topic Refinement}
|
||||
% TODO: is this chapter required?
|
||||
\chapter{Refined Research Questions}
|
||||
|
||||
\section{Software Tests}
|
||||
% TODO: describe that tests mostly check semantics, whereas static checks are mostly syntactical and technical
|
||||
% TODO: Are they necessary in addition to static checks to cover the well-known use-cases and edge-cases?
|
||||
% TODO: example?
|
||||
|
||||
\section{Definition Of Additional Analysis Rules To Extend Safety Checks}
|
||||
% TODO: How can Business Logical
|
||||
% Examples:
|
||||
% TLB needs to be reset on Task Change
|
||||
% Registers need to be
|
||||
|
||||
\subsection{Paging}
|
||||
Setting up and maintaining the paging structure, as well as allocating physical memory for the virtual pages, is a complex task in the \gls{os}.
|
||||
Developing this part of the \gls{os} is error-prone and not well supported by mainstream \glspl{proglang}.
|
||||
|
||||
\section{Software Fault Isolation}
|
||||
% TODO: content from \cite{Balasubramanian2017}
|
||||
|
||||
% TODO Which language items help with managing memory?
|
||||
% TODO How generic can the memory allocators be written?
|
||||
|
||||
% TODO Guarantees to be statically checked:
|
||||
% TODO * Control access to duplicates in page tables
|
||||
% TODO * Tasks can't access unallocated (physical) memory
|
||||
% TODO * Tasks can't access other tasks memory
|
||||
|
||||
|
||||
\chapter{System Programming Conventions}
|
||||
\label{rnd::sysprog-conventions}
|
||||
|
@ -17,7 +43,7 @@ PUSH takes value operand which is to be pushed onto the stack.
|
|||
The address in RSP moves towards numerically lower addresses with every PUSH instruction, which stores a new data entry on top.
|
||||
PUSH first changes RSP and then copies the value to the address that RSP now holds.
|
||||
|
||||
POP takes a storage reference operand - \gls{CPU} register or memory address.
|
||||
POP takes a storage reference operand - \gls{cpu} register or memory address.
|
||||
It works in the opposite direction to PUSH.
|
||||
It first copies the top-most data entry to the operand location and then moves the address in RSP towards numerically higher addresses, in the direction of RBP.
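
The order of the two steps in PUSH and POP can be illustrated with a small sketch.
It is a toy model and not real stack access: the type \texttt{Stack}, its fields and its methods are hypothetical names introduced only for illustration, and every entry is assumed to be 8 bytes wide.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Toy model of a downward-growing stack (not real hardware access).
struct Stack {
    mem: Vec<u64>, // one slot per 8-byte stack entry
    rsp: usize,    // index of the top-most entry; mem.len() when empty
}

impl Stack {
    fn new(slots: usize) -> Self {
        Stack { mem: vec![0; slots], rsp: slots }
    }

    /// PUSH: first move RSP to the lower address, then store the value there.
    fn push(&mut self, value: u64) {
        assert!(self.rsp > 0, "stack overflow");
        self.rsp -= 1;
        self.mem[self.rsp] = value;
    }

    /// POP: first read the top-most entry, then move RSP to the higher address.
    fn pop(&mut self) -> u64 {
        assert!(self.rsp < self.mem.len(), "stack underflow");
        let value = self.mem[self.rsp];
        self.rsp += 1;
        value
    }
}
\end{minted}

Pushing 1 and then 2 and popping twice yields 2 first and 1 second, after which RSP is back at its initial position.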
|
||||
|
||||
|
@ -37,13 +63,14 @@ For example, PUSHing some value onto the stack before the end of the function wo
|
|||
|
||||
\paragraph{Called Procedure Setup} \emph{not} with ENTER and LEAVE.
|
||||
|
||||
When a procedure is called the stack is set up with the following four components
|
||||
When a procedure is called, the stack is set up with the \gls{sf}, the four components listed in \cref{lst:amd64-stack-frame-components}.
|
||||
\cite[p.~48]{AMD64Vol1}:
|
||||
|
||||
\begin{listing}[h]
|
||||
\begin{enumerate}
|
||||
\item{%
|
||||
Parameters passed to the called procedure (created by the calling procedure). \\
|
||||
\textit{Only if parameters don't fit the \gls{CPU} registers}
|
||||
\textit{Only if parameters don't fit the \gls{cpu} registers}
|
||||
}
|
||||
\item{%
|
||||
Return address (created by the CALL instruction). \\
|
||||
|
@ -55,10 +82,13 @@ For example, PUSHing some value onto the stack before the end of the function wo
|
|||
}
|
||||
\item{%
|
||||
Local variables used by the called procedure. \\
|
||||
\textit{This includes the variables passed via \gls{CPU} registers}
|
||||
\textit{This includes the variables passed via \gls{cpu} registers}
|
||||
}
|
||||
\end{enumerate}
|
||||
only necessary when there aren't enough \gls{CPU} to pass the parameters.
|
||||
\caption{\glsentrytext{amd64} Stack-Frame Components}
|
||||
\label{lst:amd64-stack-frame-components}
|
||||
\end{listing}
|
||||
Item 1 is only necessary when there aren't enough \gls{cpu} registers to pass the parameters.
|
||||
Item 3 is only necessary when
|
||||
|
||||
The \gls{amd64} manual also lists ENTER and LEAVE as instructions to \textit{"provide support for procedure calls, and are mainly used in high-level languages."}\cite[p.~48]{AMD64Vol1}.
|
||||
|
@ -70,75 +100,13 @@ These instruction groups within the called procedure are called prologue and epi
|
|||
|
||||
\subsection{Full Procedure Call Example}
|
||||
\label{context::introduction::hw-supported-mm::procedure-call-example}
|
||||
This section combines the separate categories into one complete example that shows how the \gls{stack} is used by various \gls{CPU} instructions to perform procedure calls.
|
||||
This section combines the separate categories into one complete example that shows how the \gls{stack} is used by various \gls{cpu} instructions to perform procedure calls.
|
||||
The following code samples are extracted from a disassembled binary which was originally created using \gls{Rust}.
|
||||
The assembly that is shown uses Intel syntax, in which the destination operand is written first and the source operand second, so that data generally moves from right to left.
|
||||
For example, \mintinline{nasm}{mov a, b} copies b to a.
|
||||
|
||||
\cref{code::context::examples::func-callee} shows the \gls{Rust} source code of the function \textit{sum}.
|
||||
\cref{code::context::examples::func-callee-rust} shows the \gls{Rust} source code of the function \textit{sum}.
|
||||
|
||||
\section{4-Level Paging Hierarchy on \glsentrytext{amd64}}
|
||||
\label{rnd::sysprog-conventions::paging-amd64}
|
||||
On \gls{amd64} "a four-level page-translation data structure is provided to allow long-mode operating systems to translate a 64-Bit virtual-address space into a 52-Bit physical-address space."\cite[p.~18]{AMD64Vol2}.
|
||||
This allows the system to keep only the \textit{PML4} table, which is currently referenced by the \textit{Page Map Base Register (CR3)}, available in main memory.
|
||||
|
||||
\cref{fig:virtual-addr-transl} shows the 64-Bit virtual address composition on \gls{amd64}, which uses four-levels of page tables.
|
||||
Counterintuitively the page-tables are not called level-\textit{n}-page-table, but the levels received distinct names in \citetitle{AMD64Vol2}.
|
||||
The most-significant Bits labelled as \textit{Sign Extend} are not used for addressing purposes, but must adhere to the canonical address form, i.e. they repeat the value of the most-significant implemented Bit~\cite[p.~130]{AMD64Vol2}.
|
||||
The least significant Bits represent the offset within the physical page.
|
||||
The four groups in between are used to index the page-table at their respective level.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/Virtual-to-Physical-Address-Translation-Long-Mode.png}
|
||||
\caption{Virtual to Physical Address in Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:virtual-addr-transl}
|
||||
\end{figure}
|
||||
\subsubsection{Translation Scheme 4 KiB and 2 MiB Pages}
|
||||
The \gls{amd64} architecture allows configuring the page-size, two of which will be introduced in this section.
|
||||
\cref{tab:page-transl-vaddr-composition} displays the virtual address composition for the 4KiB and 2MiB page-size modes on \gls{amd64}.
|
||||
The direction from top to bottom in the table corresponds to most significant to least significant - left to right - in the virtual address.
|
||||
The \textit{sign extension} Bits cannot be used for actual information but act as a reservation for future architectural changes.
|
||||
|
||||
\begin{table}
|
||||
\begin{tabular}{l | c | c}
|
||||
Description & Bits in 4 KiB Pages & Bits in 2 MiB Pages \\
|
||||
\hline
|
||||
Sign Extend & 16 & 16 \\
Page-Map-Level-4 Offset & 9 & 9 \\
Page-Directory-Pointer Offset & 9 & 9 \\
Page-Directory Offset & 9 & 9 \\
Page-Table Offset & 9 & - \\
Physical Page Offset & 12 & 21 \\
|
||||
\end{tabular}
|
||||
\caption{Paging on \gls{amd64}: Virtual Address Composition 4KiB/2MiB pagesizes}
|
||||
\label{tab:page-transl-vaddr-composition}
|
||||
\end{table}
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-4kb-page-translation-long-mode}
|
||||
\caption{4-Kbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:4kb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
\cref{fig:4kb-page-transl} shows the detailed virtual address composition for 4 KiB pages, using four levels of page-tables.
|
||||
It uses four sets of 9-Bit indices in the virtual address, one per hierarchy level, followed by the 12-Bit page-internal offset.
|
||||
|
||||
An alternative approach is displayed in \cref{fig:2mb-page-transl}, using 2 MiB sized pages.
|
||||
It uses three sets of 9-Bit indices for the page-tables, and a 21-Bit page-internal offset.
|
||||
Increasing the page-size improves speed and memory-usage and decreases the granularity.
|
||||
In this specific example the hierarchy is reduced by one level of page-tables.
|
||||
This reduces the amount of storage required for the page-tables in overall and causes the lookup algorithm to finish faster.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-2mb-page-translation-long-mode}
|
||||
\caption{2-Mbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:2mb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
The other supported page sizes, 4 MiB and 1 GiB, as well as intermixing page sizes through the different levels don't add new insight into the mechanism and don't need to be detailed here.
|
||||
|
||||
% \subsubsection{Top-Level Page Table Self-Reference}
|
||||
% \subsubsection{Caching Lookups}
|
||||
|
@ -149,27 +117,30 @@ The other supported page sizes, 4 MiB and 1 GiB, as well as intermixing page siz
|
|||
\begin{listing}[htb]
|
||||
\tikzset{/minted/basename=callee-c}
|
||||
\begin{minted}[autogobble,linenos,breaklines=true]{rust}
|
||||
TODO
|
||||
\end{minted}
|
||||
\caption{The called function in \gls{Rust}}
|
||||
\label{code::context::examples::func-callee-c}
|
||||
\label{code::context::examples::func-callee-rust}
|
||||
\end{listing}
|
||||
|
||||
\cref{code::context::examples::func-call} shows a snippet snippet of the calling function.
|
||||
\cref{code::context::examples::func-call-asm} shows a snippet of the calling function.
|
||||
It stores the arguments in the registers according to the System V x86-64 calling convention. %TODO REFERENCE
|
||||
The caller doesn't alter the stack-frame pointer (RBP) or the stack pointer (RSP) registers before the call, hence the called function must restore them if it alters them.
|
||||
|
||||
\begin{listing}
|
||||
\begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{rust}
|
||||
TODO
|
||||
\end{minted}
|
||||
\caption{Procedure Call Example: Caller Rust}
|
||||
\label{code::context::examples::func-call}
|
||||
\label{code::context::examples::func-call-asm}
|
||||
\end{listing}
|
||||
|
||||
\begin{listing}
|
||||
\begin{minted}[escapeinside=??,highlightlines={},autogobble,linenos,breaklines=true]{nasm}
|
||||
\end{minted}
|
||||
TODO
|
||||
\caption{Procedure Call Example: Caller Assembly}
|
||||
\label{code::context::examples::func-call}
|
||||
\label{code::context::examples::func-call-rust}
|
||||
\end{listing}
|
||||
|
||||
% \balloon{comment}{
|
||||
|
@ -250,18 +221,110 @@ $74f7: ret ; return to the caller, following the add
|
|||
\caption{Memory Layout Throughout The Procedure Call Steps}
|
||||
\label{fig:proc-call-example-mem}
|
||||
\end{figure}
|
||||
\FloatBarrier
|
||||
|
||||
\section{4-Level Paging Hierarchy on \glsentrytext{amd64}}
|
||||
\label{rnd::sysprog-conventions::paging-amd64}
|
||||
On \gls{amd64}, ``a four-level page-translation data structure is provided to allow long-mode operating systems to translate a 64-Bit virtual-address space into a 52-Bit physical-address space''~\cite[p.~18]{AMD64Vol2}.
|
||||
This allows the system to keep only the \textit{PML4} table, which is currently referenced by the \textit{Page Map Base Register (CR3)}, available in main memory.
|
||||
|
||||
\cref{fig:virtual-addr-transl} shows the 64-Bit virtual address composition on \gls{amd64}, which uses four levels of page tables.
|
||||
Counterintuitively, the page tables are not called level-\textit{n} page tables; instead, each level has a distinct name in \citetitle{AMD64Vol2}.
|
||||
The most-significant Bits labelled as \textit{Sign Extend} are not used for addressing purposes, but must adhere to the canonical address form, i.e. they repeat the value of the most-significant implemented Bit~\cite[p.~130]{AMD64Vol2}.
|
||||
The least significant Bits represent the offset within the physical page.
|
||||
The four groups in between are used to index the page-table at their respective level.
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/Virtual-to-Physical-Address-Translation-Long-Mode.png}
|
||||
\caption{Virtual to Physical Address in Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:virtual-addr-transl}
|
||||
\end{figure}
|
||||
\subsubsection{Translation Scheme 4 KiB and 2 MiB Pages}
|
||||
The \gls{amd64} architecture supports several page sizes, two of which are introduced in this section.
|
||||
\cref{tab:page-transl-vaddr-composition} displays the virtual address composition for the 4 KiB and 2 MiB page-size modes on \gls{amd64}.
|
||||
Reading the table from top to bottom corresponds to reading the virtual address from the most significant to the least significant Bit, i.e. from left to right.
|
||||
The \textit{sign extension} Bits cannot be used for actual information but act as a reservation for future architectural changes.
|
||||
|
||||
\begin{table}
|
||||
\begin{tabular}{l | c | c}
|
||||
Description & Bits in 4 KiB Pages & Bits in 2 MiB Pages \\
|
||||
\hline
|
||||
Sign Extend & 16 & 16 \\
Page-Map-Level-4 Offset & 9 & 9 \\
Page-Directory-Pointer Offset & 9 & 9 \\
Page-Directory Offset & 9 & 9 \\
Page-Table Offset & 9 & - \\
Physical Page Offset & 12 & 21 \\
|
||||
\end{tabular}
|
||||
\caption{Paging on \glsentrytext{amd64}: Virtual Address Composition for 4 KiB and 2 MiB Page Sizes}
|
||||
\label{tab:page-transl-vaddr-composition}
|
||||
\end{table}
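
As a consistency check, the field widths of both layouts add up to the full 64 Bits of a virtual address, and the width of the page offset determines the page size:
\[
  16 + 4 \cdot 9 + 12 = 64, \qquad 2^{12}\,\mathrm{B} = 4\,\mathrm{KiB}
\]
\[
  16 + 3 \cdot 9 + 21 = 64, \qquad 2^{21}\,\mathrm{B} = 2\,\mathrm{MiB}
\]
Here 16 is the number of sign-extension Bits, 9 is the width of each table index, and 12 or 21 is the width of the page offset.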
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-4kb-page-translation-long-mode}
|
||||
\caption{4-Kbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:4kb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
\cref{fig:4kb-page-transl} shows the detailed virtual address composition for 4 KiB pages, using four levels of page-tables.
|
||||
It uses four sets of 9-Bit indices in the virtual address, one per hierarchy level, followed by the 12-Bit page-internal offset.
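
The decomposition for 4 KiB pages can be sketched in \gls{Rust} as follows.
This is only an illustration under the assumption of 48 implemented address Bits (16 sign-extension Bits, four 9-Bit indices and a 12-Bit offset); the names \texttt{VirtAddrParts}, \texttt{is\_canonical} and \texttt{split\_4k} are hypothetical and do not refer to an existing library.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Index fields of a virtual address when 4 KiB pages are used.
#[derive(Debug, PartialEq)]
struct VirtAddrParts {
    pml4: u16,   // index into the Page-Map-Level-4 table
    pdp: u16,    // index into the Page-Directory-Pointer table
    pd: u16,     // index into the Page-Directory table
    pt: u16,     // index into the Page-Table
    offset: u16, // offset within the 4 KiB physical page
}

/// True if bits 63..48 repeat bit 47 (canonical address form).
fn is_canonical(vaddr: u64) -> bool {
    let upper = vaddr >> 47; // bit 47 plus the 16 sign-extension bits
    upper == 0 || upper == 0x1_FFFF
}

/// Splits a canonical virtual address; returns None otherwise.
fn split_4k(vaddr: u64) -> Option<VirtAddrParts> {
    if !is_canonical(vaddr) {
        return None;
    }
    // Each table index is 9 bits wide, hence the mask 0x1FF.
    let idx = |shift: u32| ((vaddr >> shift) & 0x1FF) as u16;
    Some(VirtAddrParts {
        pml4: idx(39),
        pdp: idx(30),
        pd: idx(21),
        pt: idx(12),
        offset: (vaddr & 0xFFF) as u16, // lowest 12 bits
    })
}
\end{minted}

For 2 MiB pages, the same approach applies with the \texttt{pt} field dropped and the offset widened to the lowest 21 Bits.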
|
||||
|
||||
An alternative approach is displayed in \cref{fig:2mb-page-transl}, using 2 MiB sized pages.
|
||||
It uses three sets of 9-Bit indices for the page-tables, and a 21-Bit page-internal offset.
|
||||
Increasing the page size speeds up address translation and reduces the memory needed for the page tables, at the cost of a coarser allocation granularity.
|
||||
In this specific example the hierarchy is reduced by one level of page-tables.
|
||||
This reduces the overall amount of storage required for the page tables and causes the lookup algorithm to finish faster.
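
The saving can be quantified under the assumption, not stated above, that every table holds $2^9 = 512$ entries of 8 bytes each:
\[
  512 \cdot 8\,\mathrm{B} = 4096\,\mathrm{B} = 4\,\mathrm{KiB}
\]
Mapping a contiguous, aligned 2 MiB region with 4 KiB pages therefore requires one additional page table of 4 KiB with 512 entries ($512 \cdot 4\,\mathrm{KiB} = 2\,\mathrm{MiB}$), whereas a single 2 MiB page needs only one page-directory entry, and the translation reads three tables instead of four.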
|
||||
|
||||
\begin{figure}
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{gfx/amd64-2mb-page-translation-long-mode}
|
||||
\caption{2-Mbyte Page Translation—Long Mode\cite{AMD64Vol2}}
|
||||
\label{fig:2mb-page-transl}
|
||||
\end{figure}
|
||||
|
||||
The other supported page sizes, 4 MiB and 1 GiB, as well as intermixing page sizes through the different levels, do not add new insight into the mechanism and are not detailed here.
|
||||
|
||||
|
||||
\section{Interrupt Driven Preemptive Context Switches on \glsentrytext{amd64}}
|
||||
\label{rnd::sysprog-conventions::ir-driven-preemptive-cs-amd64}
|
||||
On \gls{amd64}, the \gls{CPU}'s interrupt mechanism does not switch the full context described previously, but only handles the registers that are necessary to successfully jump to the interrupt function: RFLAGS, RSP, RBP, RIP\footnote{Segment registers are neglected}.
|
||||
On \gls{amd64}, the \gls{cpu}'s interrupt mechanism does not switch the full context described previously; it only saves the registers that are necessary to jump to the interrupt function and to return from it: RIP, RFLAGS and RSP\footnote{The segment registers CS and SS are saved as well but are neglected here.}.
|
||||
|
||||
\subsection{Interrupts}
|
||||
% TODO https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf p. 2848
|
||||
|
||||
\subsection{Context Content}
|
||||
The registers that make up the minimum context on \gls{amd64} are described in \cref{tab:task-minimum-context-registers}.
|
||||
|
||||
\begin{table}
|
||||
\begin{tabularx}{\textwidth}{| c | X | X |}
|
||||
\hline
|
||||
\textbf{descriptive name} &
|
||||
\textbf{register names on amd64} &
|
||||
\textbf{description} \\
|
||||
\hline
|
||||
the instruction pointer register & RIP & address of the next instruction to be fetched \\
|
||||
\hline
|
||||
the stack pointer register & RSP & address of current position in stack \\
|
||||
\hline
|
||||
the flags register & RFLAGS & various attributes, e.g. the interrupt flag \\
|
||||
\hline
|
||||
all general-purpose registers & RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8–R15 & arbitrary data \\
|
||||
\hline
|
||||
\end{tabularx}
|
||||
\caption{Minimum Context Registers on amd64\cite[p.~28]{AMD64Vol2}}
|
||||
\label{tab:task-minimum-context-registers}
|
||||
\end{table}
|
||||
|
||||
\subsection{Storing The Context On The Stack}
|
||||
In this scenario, the context is stored on the \gls{stack} of the function that is interrupted.
|
||||
\Cref{fig:amd64-long-mode-interrupt-stac} pictures the \gls{stack} layout on interrupt entry.
|
||||
In order to leverage an interrupt for a context switch, the interrupt function needs to replace these values on the \gls{stack} with values for the new context.
|
||||
CS (Code-Segment) and SS (Stack-Segment) have no effect in \gls{amd64} 64-Bit mode\cite[p.~20]{AMD64Vol1} and can remain unchanged.
|
||||
The \gls{OS} developer needs to know the exact address where on the \gls{stack} this data structure has been pushed by the \gls{CPU}, and must then manipulate these addresses directly.
|
||||
The \gls{os} developer needs to know the exact address on the \gls{stack} at which the \gls{cpu} has pushed this data structure, and must then manipulate the values at these addresses directly.
|
||||
This type of manipulation is inherently dangerous and cannot easily be checked by the \gls{compiler}.
|
||||
The function that handles the interrupt must then use the instruction \textit{iretq}\cite[p.~252]{AMD64Vol2}, to make the \gls{CPU} restore the partial context from the \gls{stack} and continue to function pointed to by the RIP.
|
||||
The function that handles the interrupt must then use the instruction \textit{iretq}\cite[p.~252]{AMD64Vol2} to make the \gls{cpu} restore the partial context from the \gls{stack} and continue with the function that RIP points to.
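
The replacement of the saved values can be sketched as follows.
This is a simplified illustration and not a complete interrupt handler: the names \texttt{InterruptFrame} and \texttt{switch\_frame} are hypothetical, the layout assumes an interrupt that pushes no error code, and the sketch neither locates the frame on the real \gls{stack} nor executes \textit{iretq}.

\begin{minted}[autogobble,breaklines=true]{rust}
/// Values pushed by the CPU on interrupt entry in 64-bit mode,
/// listed from the lowest to the highest stack address.
/// Assumption: the interrupt does not push an error code.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct InterruptFrame {
    rip: u64,
    cs: u64,
    rflags: u64,
    rsp: u64,
    ss: u64,
}

/// Hypothetical helper: overwrites the saved frame so that a later
/// `iretq` resumes `next`, and returns the frame of the interrupted
/// task so that it can be resumed at a later point.
fn switch_frame(saved: &mut InterruptFrame, next: &InterruptFrame) -> InterruptFrame {
    let previous = *saved;
    saved.rip = next.rip;
    saved.rflags = next.rflags;
    saved.rsp = next.rsp;
    // CS and SS remain unchanged.
    previous
}
\end{minted}

In a real handler, the mutable reference would have to be derived from the address at which the \gls{cpu} pushed the frame, which is exactly the unchecked step described above.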
|
||||
|
||||
|
||||
\begin{figure}
|
||||
|
@ -271,40 +334,7 @@ The function that handles the interrupt must then use the instruction \textit{ir
|
|||
\label{fig:amd64-long-mode-interrupt-stac}
|
||||
\end{figure}
|
||||
|
||||
For a full context-switch, the other registers that are part of the context need to be handled by the \gls{OS}'s interrupt function.
|
||||
|
||||
|
||||
\chapter{Research Questions}
|
||||
|
||||
Setting up and maintaining the paging-structure, as well as allocating physical memory for the virtual pages is a complex task in the \gls{OS}.
|
||||
Developing this part of the \gls{OS} is error-prone, and is not well-supported by mainstream \glspl{proglang}.
|
||||
|
||||
\section{Definition Of Additional Analysis Rules To Extend Safety Checks}
|
||||
% TODO: How can Business Logical
|
||||
% Examples:
|
||||
% TLB needs to be reset on Task Change
|
||||
% Registers need to be
|
||||
|
||||
\subsubsection{Software Fault Isolation}
|
||||
% TODO: content from \cite{Balasubramanian2017}
|
||||
|
||||
\subsection{More Detailed Research Questions}
|
||||
% TODO Which language items help with managing memory?
|
||||
% TODO How generic can the memory allocators be written?
|
||||
|
||||
% TODO Guarantees to be statically checked:
|
||||
% TODO * Control access to duplicates in page tables
|
||||
% TODO * Tasks can't access unallocated (physical) memory
|
||||
% TODO * Tasks can't access other tasks memory
|
||||
|
||||
\subsection{Interrupts}
|
||||
% TODO https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf p. 2848
|
||||
|
||||
\section{Software Tests}
|
||||
% TODO: describe that tests are mostly semantics as opposed to static checks being mostly syntactical and technical
|
||||
% TODO: They necessary in addition to static checks to cover the well-known use-cases and edge-cases.
|
||||
% TODO: example?
|
||||
|
||||
For a full context-switch, the other registers that are part of the context need to be handled by the \gls{os}'s interrupt function.
|
||||
|
||||
\chapter{Porting \glsentrytext{C} Vulnerabilities}
|
||||
\label{rnd::porting-c-vulns}
|
||||
|
@ -312,8 +342,8 @@ In this chapter, the weakness manifestations from \cref{context::common-mem-safe
|
|||
|
||||
\chapter{\glsentrytext{LX} Modules Written In \glsentrytext{Rust}}
|
||||
|
||||
\chapter{Existing \glsentrytext{OS}-Development Projects Based On Rust}
|
||||
\label{rnd::existing-os-dev-wity-rust}
|
||||
\chapter{Existing \glsentrytext{os}-Development Projects Based On Rust}
|
||||
\label{rnd::existing-os-dev-with-rust}
|
||||
|
||||
\section{Libraries}
|
||||
|
||||
|
@ -326,8 +356,9 @@ In this chapter, the weakness manifestations from \cref{context::common-mem-safe
|
|||
\subsection{Blog OS}
|
||||
\subsection{Redox}
|
||||
\subsection{Tock}
|
||||
%TODO: mention papers by the Tock OS team
|
||||
|
||||
\chapter{\glsentrytext{imezzos}: Adding Preemptive \glsentrytext{OS}-Level Multitasking}
|
||||
\chapter{\glsentrytext{imezzos}: Adding Preemptive \glsentrytext{os}-Level Multitasking}
|
||||
\label{rnd::imezzos-preemptive-multitasking}
|
||||
|
||||
\section{Timed Interrupts For Scheduling and Dispatching}
|
||||
|
|