parts/context: work on introduction

To figure out what exactly will follow, I need to know what I can build upon.
This commit is contained in:
steveej 2017-07-13 21:03:02 +02:00
parent 1f4d579dcc
commit 77d01e79b1
6 changed files with 228 additions and 146 deletions


@ -120,6 +120,9 @@ in pkgs.stdenv.mkDerivation {
let g:vimtex_complete_recursive_bib = 1
let g:vimtex_indent_enabled = 1
let g:vimtex_indent_bib_enabled = 1
let g:vimtex_fold_enabled = 1
let g:vimtex_fold_comments = 1
let g:vimtex_fold_preamble = 1
if !exists('g:ycm_semantic_triggers')
let g:ycm_semantic_triggers = {}
@ -138,7 +141,7 @@ in pkgs.stdenv.mkDerivation {
function! ViewerCallback() dict
call self.forward_search(self.out)
endfunction
let g:vimtex_view_zathura_hook_callback = 'ViewerCallback'
"let g:vimtex_view_zathura_hook_callback = 'ViewerCallback'
" }}}
autocmd BufWritePost * execute ':silent ! cp /home/steveej/src/mendeley/Static-Code-Analysis-Kernel-Memory-Saftey.bib /home/steveej/src/steveej/msc-thesis/src/docs/thesis.bib >/dev/null 2>&1'
@ -157,6 +160,7 @@ in pkgs.stdenv.mkDerivation {
];
};
})
pkgs.bashInteractive
mytexlive
pkgs.zathura
];


@ -1,10 +1,10 @@
% // vim: set ft=tex:
\newglossaryentry{rustlang}{
name = Rust,
description = {
The Rust Programming Language.
},
\newglossaryentry{Rust}{
name=Rust
, description={
TODO programming language
}
}
\newglossaryentry{compiler}{
@ -219,3 +219,30 @@
}
}
\newglossaryentry{C}{
name=C programming language
, description={
TODO C
}
}
\newglossaryentry{CPU}{
name=Central Processing Unit
, description={
TODO CPU
}
}
\newglossaryentry{MMU}{
name=Memory Management Unit
, description={
TODO MMU
}
}
\newglossaryentry{sysadmin}{
name=System Administrator
, description={
TODO sysadmin
}
}


@ -1,47 +1,68 @@
% // vim: set ft=tex:
\chapter{Introduction}
This thesis studies the feasibility of using static code analysis, as found in the \gls{rustlang} \gls{compiler}, to ensure memory safety within an \gls{OS} kernel.
Because an \gls{OS} is nothing but a \gls{app}, this study could be applied to all \glspl{app}, but the focus is on the implementation of \glspl{OS} which is the \gls{app} that is responsible for managing the system's resources.
The \gls{OS} is the only \gls{app} that has unrestricted access to these resources, and in order to protect them it needs to be secure.
This thesis studies the feasibility of using static code analysis, as found in the \gls{Rust} \gls{compiler}, for ensuring safety within an \gls{OS} kernel.
Because an \gls{OS} is nothing but a \gls{app}, this study could be applied to all \glspl{app}, but the focus is on the implementation of \glspl{OS}, which are the \glspl{app} responsible for managing the system's resources and providing abstractions for higher-level applications.
The \gls{OS} is the only \gls{app} that has unrestricted access to these resources, with the task of managing these safely according to the rules that were set up by the \gls{sysadmin}.
\section{A Definition of Memory Safety In The \gls{OS}}
A clear definition of memory safety is laid out in this section.
For decades, computer systems have been able to execute instructions that they have previously loaded into their main memory.
The details of loading the instructions for the \gls{OS} are irrelevant for the understanding, but it is assumed that the persistent memory that holds these instructions is the responsibility of the \gls{sysadmin} and not the \gls{OS}.
Once the \gls{OS} is in execution, it is responsible for loading the instructions and data for other \glspl{app} into main memory.
This happens either automatically through configured jobs, or based on well-defined events which can be any form of input via the system's interfaces.
The latter is potentially dangerous because it requires an extensive amount of care and foresight from the developers of the \gls{OS} and \glspl{app} to prepare a system for the various events that might possibly occur.
This is not an easy task, especially if the interface or the environment of a system is diverse and complex.
In this context, memory safety is the ability to prevent an alteration of the memory content that would otherwise lead to malfunctioning at best, and malicious behavior at worst.
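The definition above can be illustrated with a minimal sketch in \gls{Rust}.
The helper \texttt{write\_byte} is hypothetical and exists only for this illustration; it assumes that the memory to protect is a plain byte buffer.
A write outside the buffer is refused instead of altering whatever happens to be stored behind it.

```rust
/// Hypothetical helper for illustration: writes `value` at `index` only if
/// the index lies inside `buffer`; returns whether the write happened.
fn write_byte(buffer: &mut [u8], index: usize, value: u8) -> bool {
    match buffer.get_mut(index) {
        // The index is inside the buffer, so the write is allowed.
        Some(slot) => {
            *slot = value;
            true
        }
        // The index is outside the buffer: nothing is altered.
        None => false,
    }
}

fn main() {
    let mut buffer = vec![0u8; 4];
    let inside = write_byte(&mut buffer, 2, 0xAB);
    let outside = write_byte(&mut buffer, 4, 0xCD);
    println!("inside: {}, outside: {}, buffer: {:?}", inside, outside, buffer);

    // Ownership: `drop` takes the buffer by value, so it cannot be used afterwards.
    drop(buffer);
    // write_byte(&mut buffer, 0, 1); // rejected at compile time (use of moved value)
}
```

The commented-out last line shows the second aspect: a use of the buffer after it has been released does not compile, so that class of alteration is ruled out before the program runs.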
\section{Academic And Industrial Activities}
% Primary Research Questions
% The primary research question is the basis for data collection and arises from the Purpose of the Study. There may be one, or there may be several. When the research is finished, the contribution to the knowledge will be the answer to these questions. Do not confuse the primary research questions with interview questions in a qualitative study, or survey questions in a quantitative study. The research questions in a qualitative study are followed by both a null and an alternate hypothesis.
% Hypotheses
% A hypothesis is a testable prediction for an observed phenomenon, namely, the gap in the knowledge. Each research question will have both a null and an alternative hypothesis in a quantitative study. Qualitative studies do not have hypotheses. The two hypotheses should follow the research question upon which they are based. Hypotheses are testable predictions to the gap in the knowledge. In a qualitative study the hypotheses are replaced with the primary research questions.
* TODO: mention redox, tockos, intermezzOS and more activities
* TODO: mention papers by tockos team
* TODO: mention electrolyte, formal verification for Rust
According to my best-effort literature research in Q1/2017, the hypothesis that \textit{Rust's static code analysis can guarantee memory safety in the \gls{OS}} has not been studied explicitly.
This comes as a surprise because, as explained in more detail in this chapter, the situation in \glspl{OS} is critical and \gls{Rust} offers attractive features to help improve it.
However, the hypothesis cannot be trivially confirmed or refuted, which drives the research efforts for my final thesis project.
% Purpose of the Study
%The Purpose of the Study is a statement contained within one or two paragraphs that identifies the research design, such as qualitative, quantitative, mixed methods, ethnographic, or another design. The research variables, if a quantitative study, are identified, for instance, independent, dependent, comparisons, relationships, or other variables. The population that will be used is identified, whether it will be randomly or purposively chosen, and the location of the study is summarized. Most of these factors will be discussed in detail in Chapter 3.
The purpose of this study is to evaluate Rust's feasibility to guarantee memory safety when it's used for \gls{OS} development.
The results will be of qualitative nature by implementing and analyzing popular memory management techniques in Rust, discerning the level of memory safety improvements - or guarantees - in comparison to implementations in C.
The results will be of qualitative nature, captured by analyzing existing and self-developed \gls{Rust} implementations of popular memory management techniques.
In addition to the analysis of the \gls{Rust} implementations themselves, comparisons will be made to discern the level of memory safety guarantees gained over implementations in \gls{C} that serve a similar purpose.
\section{Status Quo: Zero Memory-Safety A Day}
% Significance of the Study
% The significance is a statement of why it is important to determine the answer to the gap in the knowledge, and is related to improving the human condition. The contribution to the body of knowledge is described, and summarizes who will be able to use the knowledge to make better decisions, improve policy, advance science, or other uses of the new information. The “new” data is the information used to fill the gap in the knowledge.
A very popular \gls{OS} that has been developed with C (and some assembly) is \gls{Linux}.
A very popular and widespread \gls{OS} is \gls{Linux}, which has been developed with \gls{C} (and some assembly).
Recent years have shown how prone it is to vulnerabilities that result from the unsafe language design and programming errors.
With the growing number of vulnerabilities, various solutions have been proposed to increase the safety of C, either with static code analysis or via checks imposed at runtime. (TODO: reference).
The former is complex to perform on a language that has not been designed to be safety-analysed. TODO? reference?
Despite its complexity, attempts exist to define a subset of the C language that can be safety checked, namely Safe-C.
Despite its complexity, attempts exist to define a subset of the \gls{C} language that can be safety checked. TODO: references of Cyclone, CCured, etc.
The performance overhead of the latter is immense which makes it an unviable option in the domain of \gls{OS} development, where there exists code paths which must be very fast to ensure the operation of high speed I/O devices\cite{Balasubramanian2017}.
Safety checks that are performed at runtime introduce a high degree of overhead, which makes them an unviable option in the domain of \gls{OS} development, where many code paths must be very fast to ensure the operation of high-speed I/O devices\cite{Balasubramanian2017} or other tasks with hard or soft real-time requirements. (TODO: explain real-time requirements)
This has been forcing \gls{OS} developers to prioritize performance over safety. (TODO: reference)
Details about the challenge of writing code that does memory management safely, and related vulnerabilities are given in \autoref{chap:mmt}.
Details about the challenge of writing code that does memory management safely, and related vulnerabilities are given further along in \autoref{chap:mmt}.
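The difference between a check imposed at runtime and one that is settled before the program runs can be sketched in \gls{Rust}.
Both functions below are hypothetical examples and are not taken from an existing kernel.
The first validates every index while the program runs; the second iterates over the slice directly, so there is no index that could be out of range in the first place.

```rust
/// Sums the first `count` elements; every access is validated at runtime.
/// Returns `None` as soon as an index falls outside of `data`.
fn sum_indexed(data: &[u32], count: usize) -> Option<u32> {
    let mut total = 0u32;
    for i in 0..count {
        // `get` performs the bounds check for this single access.
        total = total.wrapping_add(*data.get(i)?);
    }
    Some(total)
}

/// Sums all elements; the iterator only ever yields elements of `data`,
/// so no per-access index validation is written or needed here.
fn sum_iterated(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, x| acc.wrapping_add(*x))
}

fn main() {
    let data = [1u32, 2, 3];
    println!("{:?}", sum_indexed(&data, 3));
    println!("{:?}", sum_indexed(&data, 4));
    println!("{}", sum_iterated(&data));
}
```

How much time the runtime variant costs depends on the workload and on compiler optimizations; the sketch only shows where the check takes place, not how expensive it is.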
\section{Static Code Analysis}
* TODO: Difference between static- and runtime checks
\section{Programming the OS in Rust: Guaranteed Memory Safety?}
% Primary Research Questions
% The primary research question is the basis for data collection and arises from the Purpose of the Study. There may be one, or there may be several. When the research is finished, the contribution to the knowledge will be the answer to these questions. Do not confuse the primary research questions with interview questions in a qualitative study, or survey questions in a quantitative study. The research questions in a qualitative study are followed by both a null and an alternate hypothesis.
\section{Hypotheses}
% Hypotheses
% A hypothesis is a testable prediction for an observed phenomenon, namely, the gap in the knowledge. Each research question will have both a null and an alternative hypothesis in a quantitative study. Qualitative studies do not have hypotheses. The two hypotheses should follow the research question upon which they are based. Hypotheses are testable predictions to the gap in the knowledge. In a qualitative study the hypotheses are replaced with the primary research questions.
\section{Research Design}
\section{Assessing Memory-Safety}
% In Chapter 1 this is a summary of the methodology and contains a brief outline of three things: (a) the participants in a qualitative study or the subjects of a quantitative study (human participants are referred to as participants, non-human subjects are referred to as subjects), (b) the instrumentation used to collect data, and (c) the procedure that will be followed. All of these elements will be reported in detail in Chapter 3. In a quantitative study, the instrumentation will be validated in Chapter 3 in detail. In a qualitative study, if it is a researcher-created questionnaire, validating the correctness of the interview protocol is usually accomplished with a pilot study. For either a quantitative or a qualitative study, using an already validated survey instrument is easier to defend and does not require a pilot study; however, Chapter 3 must contain a careful review of the instrument and how it was validated by the creator.
% In a qualitative study, which usually involves interviews, the instrumentation is an interview protocol a pre-determined set of questions that every participant is asked that are based on the primary research questions. A qualitative interview should contain no less than 10 open-ended questions and take no less than 1 hour to administer to qualify as “robust” research.
% In the humanities, a demographic survey should be circulated with most quantitative and qualitative studies to establish the parameters of the participant pool. Demographic surveys are nearly identical in most dissertations. In the sciences, a demographic survey is rarely needed.
* TODO: what is memory?
* TODO: when it is considered safe?
* TODO: Explain how memory-safety can be measured
\section{Theoretical Framework}
\section{Compilers And Static Code Analysis}
% The theoretical framework is the foundational theory that is used to provide a perspective upon which the study is based. There are hundreds of theories in the literature. For instance, if a study in the social sciences is about stress that may be causing teachers to quit, Apples Intensification Theory could be cited as the theory was that stress is cumulative and the result of continuing overlapping, progressively stringent responsibilities for teachers that eventually leads to the desire to quit. In the sciences, research about new species that may have evolved from older, extinct species would be based on the theory of evolution pioneered by Darwin.
% Some departments put the theoretical framework explanation in Chapter 1; some put it in Chapter 2.
* TODO: put in some scientific background about static checks
* affine types
\section{Assumptions, Limitations, and Scope (Delimitations)}
% Assumptions are self-evident truths. In a qualitative study, it may be assumed that participants be highly qualified in the study is about administrators. It can be assumed that participants will answer truthfully and accurately to the interview questions based on their personal experience, and that participants will respond honestly and to the best of their individual abilities.
@ -50,19 +71,41 @@ Details about the challenge of writing code that does memory management safely,
% Scope is the extent of the study and contains measurements. In a qualitative study this would include the number of participants, the geographical location, and other pertinent numerical data. In a quantitative study the size of the elements of the experiment are cited. The generalizability of the study may be cited. The word generalizability, which is not in the Word 2007 dictionary, means the extent to which the data are applicable in places other than where the study took place, or under what conditions the study took place.
% Delimitations are limitations on the research design imposed deliberately by the researcher. Delimitations in a social sciences study would be such things as the specific school district where a study took place, or in a scientific study, the number of repetitions.
\section{Definition of Terms}
% The definition of terms is written for knowledgeable peers, not people from other disciplines As such, it is not the place to fill pages with definitions that knowledgeable peers would know at a glance. Instead, define terms that may have more than one meaning among knowledgeable peers.
\section{Premised Trust In Hardware}
* TODO: is it worth explaining ECC?
* TODO: explain that the hardware might be unsafe but this is not in scope of the thesis
\section{Summary}
% Summarize the content of Chapter 1 and preview of content of Chapter 2.
\chapter{Memory Management Techniques}
\label{chap:mmt}
The \autoref{chap:mmt} gives a detailed introduction to memory management in contemporary architectures and \glspl{OS}.
\chapter{Sophisticated Memory Management Techniques}
* TODO: in the beginnings application software had full control over memory
* TODO: from single-job via batch systems to multiprocessing
As the result of collaborations between hard- and software developers, the memory management task in the \gls{OS} is partially delegated to the \gls{CPU}'s \gls{MMU}.
A complete understanding of this task is necessary in order to reason about its safety.
This chapter provides a thorough introduction to modern memory management techniques on the x86\_64 architecture.
\section{Abstraction And Protection Of Resources}
* TODO: recap that management has been motivated by multiprocessing without side-effects
* TODO: brief history and market share of x86\_64 processors and ARM
\section{Virtual Addresses}
* TODO: describe dynamic addresses
* TODO: describe swapping
* TODO: describe virtual address
* TODO: describe segmentation
* TODO: describe paging
% * TODO: parse http://wiki.osdev.org/Memory_Management_Unit
\section{Multi-Level Paging}
\subsection{Top-Levle Pagetable Self-Reference}
\section{Paging}
\subsection{Multi-Level Paging}
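As a minimal sketch of multi-level paging, the following \gls{Rust} snippet splits a virtual address into its table indices.
It assumes 4-level paging on x86\_64 with 4 KiB pages, where each of the four levels is indexed by 9 bits and the lowest 12 bits are the offset into the page.
The function name \texttt{split\_virtual\_address} is hypothetical.

```rust
const INDEX_BITS: u64 = 9; // each table has 2^9 = 512 entries
const OFFSET_BITS: u64 = 12; // 4 KiB pages
const INDEX_MASK: u64 = (1u64 << INDEX_BITS) - 1;

/// Splits a virtual address into four table indices (top level first)
/// and the offset inside the page. Assumes 4-level paging with 4 KiB pages.
fn split_virtual_address(addr: u64) -> ([usize; 4], u64) {
    let offset = addr & ((1u64 << OFFSET_BITS) - 1);
    let mut indices = [0usize; 4];
    for level in 0..4 {
        // Level 0 uses bits 39..=47, level 3 uses bits 12..=20.
        let shift = OFFSET_BITS + INDEX_BITS * (3 - level as u64);
        indices[level] = ((addr >> shift) & INDEX_MASK) as usize;
    }
    (indices, offset)
}

fn main() {
    let addr = (1u64 << 39) | (2u64 << 30) | (3u64 << 21) | (4u64 << 12) | 5u64;
    let (indices, offset) = split_virtual_address(addr);
    println!("indices: {:?}, offset: {:#x}", indices, offset);
}
```

The \gls{MMU} performs the equivalent lookup in hardware: each index selects an entry in one table, and that entry points to the table of the next level.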
\subsection{Top-Level Page table Self-Reference}
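The effect of a self-referencing entry can be sketched with a small address computation in \gls{Rust}.
The sketch assumes 4-level paging on x86\_64 with 9-bit table indices and a 12-bit page offset, and that one entry of the top-level table, at a freely chosen index, points back to the table itself.
A virtual address that uses this index at all four levels then resolves to the top-level table.
The function name \texttt{recursive\_top\_level\_address} is hypothetical.

```rust
/// Computes the virtual address under which the top-level page table is
/// visible when its entry `recursive_index` points back to the table itself.
/// Assumes 4-level paging with 9-bit indices and 48-bit virtual addresses.
fn recursive_top_level_address(recursive_index: u64) -> u64 {
    assert!(recursive_index < 512, "a table has 512 entries");
    let mut addr = 0u64;
    // Use the same index at every level, so each lookup step lands on
    // the top-level table again.
    for shift in [39u64, 30, 21, 12].iter() {
        addr |= recursive_index << *shift;
    }
    // Canonical form: bits 48..=63 must be copies of bit 47.
    if addr & (1u64 << 47) != 0 {
        addr |= 0xFFFF_0000_0000_0000;
    }
    addr
}

fn main() {
    println!("{:#018x}", recursive_top_level_address(511));
    println!("{:#018x}", recursive_top_level_address(510));
}
```

Which index is reserved for the self-reference is a design decision of the \gls{OS}; the two values in \texttt{main} are only examples.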
\subsection{Caching Lookups}
\subsection{Full Example}
* http://taptipalit.blogspot.de/2013/10/theory-recursive-mapping-page.html
* https://www.coresecurity.com/blog/getting-physical-extreme-abuse-of-intel-based-paging-systems-part-2-windows


@ -4,13 +4,12 @@
\section{Static Checks}
* TODO: Difference between static- and runtime checks
\subsection{Define Additional Analysis Rules}
* Example: TLB needs to be reset on Task Change
\subsection{Static Variable Declaration}
\section{Virtual Memory Management In Hard- and Software}
* Architecture choice: x86\_64
* CPU supports


@ -1,19 +1,65 @@
Automatically generated by Mendeley Desktop 1.17.10
Automatically generated by Mendeley Desktop 1.17.8
Any changes to this file will be lost if it is regenerated by Mendeley.
BibTeX export options can be customized via Options -> BibTeX in Mendeley Desktop
@article{Szekeres2013,
abstract = {Memory corruption bugs in software written in low-level languages like C or C++ are one of the oldest problems in computer security. The lack of safety in these languages allows attackers to alter the program's behavior or take full control over it by hijacking its control flow. This problem has existed for more than 30 years and a vast number of potential solutions have been proposed, yet memory corruption attacks continue to pose a serious threat. Real world exploits show that all currently deployed protections can be defeated. This paper sheds light on the primary reasons for this by describing attacks that succeed on today's systems. We systematize the current knowledge about various protection techniques by setting up a general model for memory corruption attacks. Using this model we show what policies can stop which attacks. The model identifies weaknesses of currently deployed techniques, as well as other proposed protections enforcing stricter policies. We analyze the reasons why protection mechanisms implementing stricter polices are not deployed. To achieve wide adoption, protection mechanisms must support a multitude of features and must satisfy a host of requirements. Especially important is performance, as experience shows that only solutions whose overhead is in reasonable bounds get deployed. A comparison of different enforceable policies helps designers of new protection mechanisms in finding the balance between effectiveness (security) and efficiency. We identify some open research problems, and provide suggestions on improving the adoption of newer techniques.},
author = {Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias and Wei, Tao and Song, Dawn},
doi = {10.1109/SP.2013.13},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/SoK$\backslash$: Eternal War in Memory.pdf:pdf},
isbn = {9780769549774},
issn = {10816011},
journal = {Proceedings - IEEE Symposium on Security and Privacy},
pages = {48--62},
title = {{SoK: Eternal war in memory}},
year = {2013}
@article{Levy2015a,
abstract = {Rust, a new systems programming language, provides compile-time memory safety checks to help eliminate runtime bugs that manifest from improper memory management. This feature is advantageous for operating system development, and especially for embedded OS development, where recovery and debugging are particularly challenging. However, embedded platforms are highly event-based, and Rust's memory safety mechanisms largely presume threads. In our experience developing an operating system for embedded systems in Rust, we have found that Rust's ownership model prevents otherwise safe resource sharing common in the embedded domain, conflicts with the reality of hardware resources, and hinders using closures for programming asynchronously. We describe these experiences and how they relate to memory safety as well as illustrate our workarounds that preserve the safety guarantees to the largest extent possible. In addition, we draw from our experience to propose a new language extension to Rust that would enable it to provide better memory safety tools for event-driven platforms.},
author = {Levy, Amit and Andersen, Michael P. and Campbell, Bradford and Culler, David and Dutta, Prabal and Ghena, Branden and Levis, Philip and Pannuto, Pat},
doi = {10.1145/2818302.2818306},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/tock-plos2015.pdf:pdf},
isbn = {9781450339421},
journal = {PLOS: Workshop on Programming Languages and Operating Systems},
keywords = {embedded operating systems,linear types,ownership,rust},
pages = {21--26},
title = {{Ownership is Theft: Experiences Building an Embedded OS in Rust}},
url = {http://dl.acm.org/citation.cfm?id=2818302.2818306},
year = {2015}
}
@article{Affairs2015,
author = {Beingessner, Alexis},
file = {:home/steveej/src/steveej/msc-thesis/docs/You can't spell trust without Rust.pdf:pdf},
title = {{You Can't Spell Trust Without Rust}},
year = {2015}
}
@article{Merity2016,
abstract = {Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus.},
archivePrefix = {arXiv},
arxivId = {1609.07843},
author = {Merity, Stephen and Xiong, Caiming and Bradbury, James and Socher, Richard},
eprint = {1609.07843},
journal = {Arxiv},
title = {{Pointer Sentinel Mixture Models}},
url = {http://arxiv.org/abs/1609.07843},
year = {2016}
}
@misc{Endler,
author = {Endler, Matthias},
title = {{A curated list of static analysis tools, linters and code quality checkers for various programming languages}},
url = {https://github.com/mre/awesome-static-analysis}
}
@inproceedings{Kuznetsov2014,
abstract = {Systems code is often written in low-level languages like C/C++, which offer many benefits but also delegate memory management to programmers. This invites memory safety bugs that attackers can exploit to divert control flow and compromise the system. Deployed defense mechanisms (e.g., ASLR, DEP) are incomplete, and stronger defense mechanisms (e.g., CFI) often have high overhead and limited guarantees [19, 15, 9]. We introduce code-pointer integrity (CPI), a new design point that guarantees the integrity of all code pointers in a program (e.g., function pointers, saved return addresses) and thereby prevents all control-flow hijack attacks, including return-oriented programming. We also introduce code-pointer separation (CPS), a relaxation of CPI with better performance properties. CPI and CPS offer substantially better security-to-overhead ratios than the state of the art, they are practical (we protect a complete FreeBSD system and over 100 packages like apache and postgresql), effective (prevent all attacks in the RIPE benchmark), and efficient: on SPEC CPU2006, CPS averages 1.2{\%} overhead for C and 1.9{\%} for C/C++, while CPI's overhead is 2.9{\%} for C and 8.4{\%} for C/C++. A prototype implementation of CPI and CPS can be obtained from http://levee.epfl.ch.},
author = {Kuznetsov, Volodymyr and Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias},
booktitle = {Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation},
isbn = {9781931971164},
pages = {147--163},
title = {{Code-pointer integrity}},
url = {https://www.usenix.org/conference/osdi14/technical-sessions/presentation/kuznetsov{\%}5Cnhttps://www.usenix.org/system/files/conference/osdi14/osdi14-paper-kuznetsov.pdf?utm{\_}source=dlvr.it{\&}utm{\_}medium=tumblr},
year = {2014}
}
@article{Caballero2012,
abstract = {Use-after-free vulnerabilities are rapidly growing in popularity, especially for exploiting web browsers. Use-after-free (and double-free) vulnerabilities are caused by a program operating on a dangling pointer. In this work we propose early detection, a novel runtime approach for finding and diagnosing use-after-free and double-free vulnerabilities. While previous work focuses on the creation of the vulnerability (i.e., the use of a dangling pointer), early detection shifts the focus to the creation of the dangling pointer(s) at the root of the vulnerability. Early detection increases the effectiveness of testing by identifying unsafe dangling pointers in executions where they are created but not used. It also accelerates vulnerability analysis and minimizes the risk of incomplete fixes, by automatically collecting information about all dangling pointers involved in the vulnerability. We implement our early detection technique in a tool called Undangle. We evaluate Undangle for vulnerability analysis on 8 real-world vulnerabilities. The analysis uncovers that two separate vulnerabilities in Firefox had a common root cause and that their patches did not completely fix the underlying bug. We also evaluate Undangle for testing on the Firefox web browser identifying a potential vulnerability.},
author = {Caballero, Juan and Grieco, Gustavo and Marron, Mark and Nappa, Antonio},
doi = {10.1145/2338965.2336769},
isbn = {9781450314541},
issn = {1450314546},
journal = {ISSTA},
keywords = {automated testing,binary analysis,debugging,dynamic analysis},
pages = {133},
title = {{Undangle: early detection of dangling pointers in use-after-free and double-free vulnerabilities}},
url = {http://dl.acm.org/citation.cfm?doid=2338965.2336769},
year = {2012}
}
@article{Lattner2005,
abstract = {The LLVM Compiler Infrastructure (http://llvm.cs.uiuc.edu) is a robust system that is well suited for a wide variety of research and development work. This brief paper introduces the LLVM system and provides pointers to more extensive documentation, complementing the tutorial presented at LCPC.},
@ -33,62 +79,27 @@ title = {{The LLVM Compiler Framework and Infrastructure Tutorial}},
url = {http://dx.doi.org/10.1007/11532378{\_}2},
year = {2005}
}
@misc{Endler,
author = {Endler, Matthias},
title = {{A curated list of static analysis tools, linters and code quality checkers for various programming languages}},
url = {https://github.com/mre/awesome-static-analysis}
}
@article{Balasubramanian2017,
abstract = {Rust is a new system programming language that offers a practical and safe alternative to C. Rust is unique in that it enforces safety without runtime overhead, most importantly, without the overhead of garbage collection. While zero-cost safety is remarkable on its own, we argue that the super-powers of Rust go beyond safety. In particular, Rust's linear type system enables capabilities that cannot be implemented efficiently in traditional languages, both safe and unsafe, and that dramatically improve security and reliability of system software. We show three examples of such capabilities: zero-copy software fault isolation, efficient static information flow analysis, and automatic checkpointing. While these capabilities have been in the spotlight of systems research for a long time, their practical use is hindered by high cost and complexity. We argue that with the adoption of Rust these mechanisms will become commoditized.},
author = {Balasubramanian, Abhiram and Baranowski, Marek S and Burtsev, Anton and Irvine, Uc and Rakamari, Zvonimir and Ryzhyk, Leonid and Research, Vmware},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/DRAFT$\backslash$: System Programming in Rust$\backslash$: Beyond Safety.pdf:pdf},
title = {{DRAFT: System Programming in Rust: Beyond Safety}},
year = {2017}
}
@inproceedings{Kuznetsov2014,
abstract = {Systems code is often written in low-level languages like C/C++, which offer many benefits but also delegate memory management to programmers. This invites memory safety bugs that attackers can exploit to divert control flow and compromise the system. Deployed defense mechanisms (e.g., ASLR, DEP) are incomplete, and stronger defense mechanisms (e.g., CFI) often have high overhead and limited guarantees [19, 15, 9]. We introduce code-pointer integrity (CPI), a new design point that guarantees the integrity of all code pointers in a program (e.g., function pointers, saved return addresses) and thereby prevents all control-flow hijack attacks, including return-oriented programming. We also introduce code-pointer separation (CPS), a relaxation of CPI with better performance properties. CPI and CPS offer substantially better security-to-overhead ratios than the state of the art, they are practical (we protect a complete FreeBSD system and over 100 packages like apache and postgresql), effective (prevent all attacks in the RIPE benchmark), and efficient: on SPEC CPU2006, CPS averages 1.2{\%} overhead for C and 1.9{\%} for C/C++, while CPI's overhead is 2.9{\%} for C and 8.4{\%} for C/C++. A prototype implementation of CPI and CPS can be obtained from http://levee.epfl.ch.},
author = {Kuznetsov, Volodymyr and Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias},
booktitle = {Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation},
isbn = {9781931971164},
pages = {147--163},
title = {{Code-pointer integrity}},
url = {https://www.usenix.org/conference/osdi14/technical-sessions/presentation/kuznetsov{\%}5Cnhttps://www.usenix.org/system/files/conference/osdi14/osdi14-paper-kuznetsov.pdf?utm{\_}source=dlvr.it{\&}utm{\_}medium=tumblr},
year = {2014}
}
@article{Getreu2016,
annote = {- runtime checks are expensive
- critical with energy restriction on the target device},
author = {Getreu, Jens},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/Embedded System Security with Rust - Case Study of Heartbleed.pdf:pdf},
pages = {1--24},
title = {{Embedded System Security with Rust}},
year = {2016}
}
@article{Xu2015,
abstract = {Since vulnerabilities in Linux kernel are on the increase, attackers have turned their interests into related exploitation techniques. However, compared with numerous researches on exploiting use-after-free vulnerabilities in the user applications, few efforts studied how to exploit use-after-free vulnerabilities in Linux kernel due to the difficulties that mainly come from the uncertainty of the kernel memory layout. Without specific information leakage, attackers could only conduct a blind memory overwriting strategy trying to corrupt the critical part of the kernel, for which the success rate is negligible. In this work, we present a novel memory collision strategy to exploit the use-after-free vulnerabilities in Linux kernel reliably. The insight of our exploit strategy is that a probabilistic memory collision can be constructed according to the widely deployed kernel memory reuse mechanisms, which significantly increases the success rate of the attack. Based on this insight, we present two practical memory collision attacks: An object-based attack that leverages the memory recycling mechanism of the kernel allocator to achieve freed vulnerable object covering, and a physmap-based attack that takes advantage of the overlap between the physmap and the SLAB caches to achieve a more flexible memory manipulation. Our proposed attacks are universal for various Linux kernels of different architectures and could successfully exploit systems with use-after-free vulnerabilities in kernel. Particularly, we achieve privilege escalation on various popular Android devices (kernel version{\textgreater}=4.3) including those with 64-bit processors by exploiting the CVE-2015-3636 use-after-free vulnerability in Linux kernel. To our knowledge, this is the first generic kernel exploit for the latest version of Android. Finally, to defend this kind of memory collision, we propose two corresponding mitigation schemes.},
author = {Xu, Wen and Li, Juanru and Shu, Junliang and Yang, Wenbo and Xie, Tianyi and Zhang, Yuanyuan and Gu, Dawu},
doi = {10.1145/2810103.2813637},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/From Collision To Exploitation$\backslash$: Unleashing Use-After-Free Vulnerabilities in Linux Kernel.pdf:pdf},
isbn = {978-1-4503-3832-5},
issn = {15437221},
journal = {Ccs},
keywords = {linux kernel exploit,memory collision,use-after-free vulnerability},
pages = {414--425},
title = {{From Collision To Exploitation: Unleashing Use-After-Free Vulnerabilities in Linux Kernel}},
url = {http://dl.acm.org/citation.cfm?doid=2810103.2813637},
@article{Arpaci-Dusseau2015,
abstract = {A book covering the fundamentals of operating systems, including virtualization of the CPU and memory, threads and concurrency, and file and storage systems. Written by professors active in the field for 20 years, this text has been developed in the classrooms of the University of Wisconsin-Madison, and has been used in the instruction of thousands of students.},
author = {Arpaci-Dusseau, Remzi and Arpaci-Dusseau, Andrea},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/operating{\_}systems{\_}{\_}three{\_}easy{\_}pieces{\_}{\_}electronic{\_}version{\_}0{\_}91{\_}.pdf:pdf},
journal = {Arpaci-Dusseau},
number = {0.91},
pages = {665},
title = {{Operating Systems: Three Easy Pieces}},
volume = {Electronic},
year = {2015}
}
@inproceedings{Ma2013,
abstract = {Aiming at the problem of higher memory consumption and lower execution efficiency during the dynamic detecting to C/C++ programs memory vulnerabilities, this paper presents a dynamic detection method called ISC. The ISC improves the Safe-C using pointer analysis technology. Firstly, the ISC defines a simple and efficient fat pointer representation instead of the safe pointer in the Safe-C. Furthermore, the ISC uses the unification-based analysis algorithm with one level flow static pointer. This identification reduces the number of pointers that need to be converted to fat pointers. Then in the process of program running, the ISC detects memory vulnerabilities through constantly inspecting the attributes of fat pointers. Experimental results indicate that the ISC could detect memory vulnerabilities such as buffer overflows and dangling pointers. Comparing with the Safe-C, the ISC dramatically reduces the memory consumption and lightly improves the execution efficiency.},
author = {Ma, Rui and Chen, Lingkui and Hu, Changzhen and Xue, Jingfeng and Zhao, Xiaolin},
booktitle = {Proceedings - 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing, DASC 2013},
doi = {10.1109/DASC.2013.37},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/A Dynamic Detection Method to C-C++ Programs Memory Vulnerabilities Based on Pointer Analysis.pdf:pdf},
isbn = {9781479933815},
keywords = {dynamic detecting,fat pointer,improved Safe-C,memory vulnerability,pointer analysis},
pages = {52--57},
title = {{A dynamic detection method to C/C++ programs memory vulnerabilities based on pointer analysis}},
@article{Szekeres2013,
abstract = {Memory corruption bugs in software written in low-level languages like C or C++ are one of the oldest problems in computer security. The lack of safety in these languages allows attackers to alter the program's behavior or take full control over it by hijacking its control flow. This problem has existed for more than 30 years and a vast number of potential solutions have been proposed, yet memory corruption attacks continue to pose a serious threat. Real world exploits show that all currently deployed protections can be defeated. This paper sheds light on the primary reasons for this by describing attacks that succeed on today's systems. We systematize the current knowledge about various protection techniques by setting up a general model for memory corruption attacks. Using this model we show what policies can stop which attacks. The model identifies weaknesses of currently deployed techniques, as well as other proposed protections enforcing stricter policies. We analyze the reasons why protection mechanisms implementing stricter policies are not deployed. To achieve wide adoption, protection mechanisms must support a multitude of features and must satisfy a host of requirements. Especially important is performance, as experience shows that only solutions whose overhead is in reasonable bounds get deployed. A comparison of different enforceable policies helps designers of new protection mechanisms in finding the balance between effectiveness (security) and efficiency. We identify some open research problems, and provide suggestions on improving the adoption of newer techniques.},
author = {Szekeres, L{\'{a}}szl{\'{o}} and Payer, Mathias and Wei, Tao and Song, Dawn},
doi = {10.1109/SP.2013.13},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/SoK$\backslash$: Eternal War in Memory.pdf:pdf},
isbn = {9780769549774},
issn = {10816011},
journal = {Proceedings - IEEE Symposium on Security and Privacy},
pages = {48--62},
title = {{SoK: Eternal war in memory}},
year = {2013}
}
@article{Chisnall2015,
@ -104,15 +115,30 @@ title = {{Beyond the PDP-11 : Architectural support for a memory-safe C abstract
url = {http://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201503-asplos2015-cheri-cmachine.pdf},
year = {2015}
}
@article{Merity2016,
abstract = {Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus.},
archivePrefix = {arXiv},
arxivId = {1609.07843},
author = {Merity, Stephen and Xiong, Caiming and Bradbury, James and Socher, Richard},
eprint = {1609.07843},
journal = {Arxiv},
title = {{Pointer Sentinel Mixture Models}},
url = {http://arxiv.org/abs/1609.07843},
@article{Balasubramanian2017,
abstract = {Rust is a new system programming language that offers a practical and safe alternative to C. Rust is unique in that it enforces safety without runtime overhead, most importantly, without the overhead of garbage collection. While zero-cost safety is remarkable on its own, we argue that the super-powers of Rust go beyond safety. In particular, Rust's linear type system enables capabilities that cannot be implemented efficiently in traditional languages, both safe and unsafe, and that dramatically improve security and reliability of system software. We show three examples of such capabilities: zero-copy software fault isolation, efficient static information flow analysis, and automatic checkpointing. While these capabilities have been in the spotlight of systems research for a long time, their practical use is hindered by high cost and complexity. We argue that with the adoption of Rust these mechanisms will become commoditized.},
author = {Balasubramanian, Abhiram and Baranowski, Marek S and Burtsev, Anton and Rakamari{\'{c}}, Zvonimir and Ryzhyk, Leonid},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/DRAFT$\backslash$: System Programming in Rust$\backslash$: Beyond Safety.pdf:pdf},
title = {{DRAFT: System Programming in Rust: Beyond Safety}},
year = {2017}
}
@inproceedings{Ma2013,
abstract = {Aiming at the problem of higher memory consumption and lower execution efficiency during the dynamic detecting to C/C++ programs memory vulnerabilities, this paper presents a dynamic detection method called ISC. The ISC improves the Safe-C using pointer analysis technology. Firstly, the ISC defines a simple and efficient fat pointer representation instead of the safe pointer in the Safe-C. Furthermore, the ISC uses the unification-based analysis algorithm with one level flow static pointer. This identification reduces the number of pointers that need to be converted to fat pointers. Then in the process of program running, the ISC detects memory vulnerabilities through constantly inspecting the attributes of fat pointers. Experimental results indicate that the ISC could detect memory vulnerabilities such as buffer overflows and dangling pointers. Comparing with the Safe-C, the ISC dramatically reduces the memory consumption and lightly improves the execution efficiency.},
author = {Ma, Rui and Chen, Lingkui and Hu, Changzhen and Xue, Jingfeng and Zhao, Xiaolin},
booktitle = {Proceedings - 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing, DASC 2013},
doi = {10.1109/DASC.2013.37},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/A Dynamic Detection Method to C-C++ Programs Memory Vulnerabilities Based on Pointer Analysis.pdf:pdf},
isbn = {9781479933815},
keywords = {dynamic detecting,fat pointer,improved Safe-C,memory vulnerability,pointer analysis},
pages = {52--57},
title = {{A dynamic detection method to C/C++ programs memory vulnerabilities based on pointer analysis}},
year = {2013}
}
@article{Getreu2016,
author = {Getreu, Jens},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/Embedded System Security with Rust - Case Study of Heartbleed.pdf:pdf},
pages = {1--24},
title = {{Embedded System Security with Rust}},
year = {2016}
}
@article{Dhurjati2003,
@ -130,35 +156,17 @@ title = {{Memory safety without runtime checks or garbage collection}},
volume = {38},
year = {2003}
}
@article{Levy2015a,
abstract = {Rust, a new systems programming language, provides compile-time memory safety checks to help eliminate runtime bugs that manifest from improper memory management. This feature is advantageous for operating system development, and especially for embedded OS development, where recovery and debugging are particularly challenging. However, embedded platforms are highly event-based, and Rust's memory safety mechanisms largely presume threads. In our experience developing an operating system for embedded systems in Rust, we have found that Rust's ownership model prevents otherwise safe resource sharing common in the embedded domain, conflicts with the reality of hardware resources, and hinders using closures for programming asynchronously. We describe these experiences and how they relate to memory safety as well as illustrate our workarounds that preserve the safety guarantees to the largest extent possible. In addition, we draw from our experience to propose a new language extension to Rust that would enable it to provide better memory safety tools for event-driven platforms.},
author = {Levy, Amit and Andersen, Michael P. and Campbell, Bradford and Culler, David and Dutta, Prabal and Ghena, Branden and Levis, Philip and Pannuto, Pat},
doi = {10.1145/2818302.2818306},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/tock-plos2015.pdf:pdf},
isbn = {9781450339421},
journal = {PLOS: Workshop on Programming Languages and Operating Systems},
keywords = {embedded operating systems,linear types,ownership,rust},
pages = {21--26},
title = {{Ownership is Theft: Experiences Building an Embedded OS in Rust}},
url = {http://dl.acm.org/citation.cfm?id=2818302.2818306},
year = {2015}
}
@article{Caballero2012,
abstract = {Use-after-free vulnerabilities are rapidly growing in popularity, especially for exploiting web browsers. Use-after-free (and double-free) vulnerabilities are caused by a program operating on a dangling pointer. In this work we propose early detection, a novel runtime approach for finding and diagnosing use-after-free and double-free vulnerabilities. While previous work focuses on the creation of the vulnerability (i.e., the use of a dangling pointer), early detection shifts the focus to the creation of the dangling pointer(s) at the root of the vulnerability. Early detection increases the effectiveness of testing by identifying unsafe dangling pointers in executions where they are created but not used. It also accelerates vulnerability analysis and minimizes the risk of incomplete fixes, by automatically collecting information about all dangling pointers involved in the vulnerability. We implement our early detection technique in a tool called Undangle. We evaluate Undangle for vulnerability analysis on 8 real-world vulnerabilities. The analysis uncovers that two separate vulnerabilities in Firefox had a common root cause and that their patches did not completely fix the underlying bug. We also evaluate Undangle for testing on the Firefox web browser identifying a potential vulnerability.},
author = {Caballero, Juan and Grieco, Gustavo and Marron, Mark and Nappa, Antonio},
doi = {10.1145/2338965.2336769},
isbn = {9781450314541},
issn = {1450314546},
journal = {ISSTA},
keywords = {automated testing,binary analysis,debugging,dynamic analysis},
pages = {133},
title = {{Undangle: early detection of dangling pointers in use-after-free and double-free vulnerabilities}},
url = {http://dl.acm.org/citation.cfm?doid=2338965.2336769},
year = {2012}
}
@article{Affairs2015,
author = {Beingessner, Alexis},
file = {:home/steveej/src/steveej/msc-thesis/docs/You can't spell trust without Rust.pdf:pdf},
note = {Master's thesis in Computer Science, Carleton University},
title = {{You Can't Spell Trust Without Rust}},
@article{Xu2015,
abstract = {Since vulnerabilities in Linux kernel are on the increase, attackers have turned their interests into related exploitation techniques. However, compared with numerous researches on exploiting use-after-free vulnerabilities in the user applications, few efforts studied how to exploit use-after-free vulnerabilities in Linux kernel due to the difficulties that mainly come from the uncertainty of the kernel memory layout. Without specific information leakage, attackers could only conduct a blind memory overwriting strategy trying to corrupt the critical part of the kernel, for which the success rate is negligible. In this work, we present a novel memory collision strategy to exploit the use-after-free vulnerabilities in Linux kernel reliably. The insight of our exploit strategy is that a probabilistic memory collision can be constructed according to the widely deployed kernel memory reuse mechanisms, which significantly increases the success rate of the attack. Based on this insight, we present two practical memory collision attacks: An object-based attack that leverages the memory recycling mechanism of the kernel allocator to achieve freed vulnerable object covering, and a physmap-based attack that takes advantage of the overlap between the physmap and the SLAB caches to achieve a more flexible memory manipulation. Our proposed attacks are universal for various Linux kernels of different architectures and could successfully exploit systems with use-after-free vulnerabilities in kernel. Particularly, we achieve privilege escalation on various popular Android devices (kernel version{\textgreater}=4.3) including those with 64-bit processors by exploiting the CVE-2015-3636 use-after-free vulnerability in Linux kernel. To our knowledge, this is the first generic kernel exploit for the latest version of Android. Finally, to defend this kind of memory collision, we propose two corresponding mitigation schemes.},
author = {Xu, Wen and Li, Juanru and Shu, Junliang and Yang, Wenbo and Xie, Tianyi and Zhang, Yuanyuan and Gu, Dawu},
doi = {10.1145/2810103.2813637},
file = {:home/steveej/src/github/steveej/msc-thesis/docs/From Collision To Exploitation$\backslash$: Unleashing Use-After-Free Vulnerabilities in Linux Kernel.pdf:pdf},
isbn = {978-1-4503-3832-5},
issn = {15437221},
journal = {Ccs},
keywords = {linux kernel exploit,memory collision,use-after-free vulnerability},
pages = {414--425},
title = {{From Collision To Exploitation: Unleashing Use-After-Free Vulnerabilities in Linux Kernel}},
url = {http://dl.acm.org/citation.cfm?doid=2810103.2813637},
year = {2015}
}

@ -1,4 +1,6 @@
\documentclass[12pt,a4paper]{report}
\documentclass[draft,12pt,a4paper]{report}
\overfullrule=5mm
\usepackage[utf8]{inputenc}
@ -123,7 +125,7 @@
\chapter*{Preface}
This thesis is original, unpublished, independent work by the author, \authorOne.
I strongly believe in openness and collaboration in the development of new technology; therefore, the development will be based solely on open-source software.
The results of this project will be freely available on my personal GitHub site\footnote{https://github.com/steveeJ/msc-thesis} once the academic process of this project is complete.
The results of this project will be freely available on my personal GitLab site\footnote{https://gitlab.com/steveeJ/msc-thesis} once the academic process of this project is complete.
\tableofcontents
@ -138,14 +140,13 @@
\label{part:research}
\include{parts/research/research}
\part{Development}
\part{Conclusion}
\newpage
%TODO \listofmyequations
\listoftables
%TODO \lstlistoflistings
\listoffigures
\listoffigures
\bibliography{thesis}
\end{document}