A Formal Model of Checked C
Abstract
We present a formal model of Checked C, a dialect of C that aims to enforce spatial memory safety. Our model pays particular attention to the semantics of dynamically sized, potentially null-terminated arrays. We formalize this model in Coq, and prove that any spatial memory safety errors can be blamed on portions of the program labeled unchecked; this is a Checked C feature that supports incremental porting and backward compatibility. While our model's operational semantics uses annotated ("fat") pointers to enforce spatial safety, we show that such annotations can be safely erased: Using PLT Redex we formalize an executable version of our model and a compilation procedure from it to an untyped C-like language, and use randomized testing to validate that generated code faithfully simulates the original. Finally, we develop a custom random generator for well-typed and almost-well-typed terms in our Redex model, and use it to search for inconsistencies between our model and the Clang Checked C implementation. We find these steps to be a useful way to co-develop a language (Checked C is still in development) and a core model of it.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2022
- DOI:
- 10.48550/arXiv.2201.13394
- arXiv:
- arXiv:2201.13394
- Bibcode:
- 2022arXiv220113394L
- Keywords:
-
- Computer Science - Programming Languages;
- Computer Science - Software Engineering;
- D.3.1
- E-Print:
- This is an extended version of a paper that appears at the 2022 Computer Security Foundations Symposium