A Formal Model of Checked C

doi:10.48550/arXiv.2201.13394

We present a formal model of Checked C, a dialect of C that aims to enforce spatial memory safety. Our model pays particular attention to the semantics of dynamically sized, potentially null-terminated arrays. We formalize this model in Coq and prove that any spatial memory safety error can be blamed on portions of the program labeled unchecked, a Checked C feature that supports incremental porting and backward compatibility. While our model's operational semantics uses annotated ("fat") pointers to enforce spatial safety, we show that such annotations can be safely erased: using PLT Redex, we formalize an executable version of our model and a compilation procedure from it to an untyped C-like language, and we use randomized testing to validate that the generated code faithfully simulates the original. Finally, we develop a custom random generator for well-typed and almost-well-typed terms in our Redex model, and use it to search for inconsistencies between our model and the Clang Checked C implementation. We find these steps to be a useful way to co-develop a language (Checked C is still in development) and a core model of it.
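
To make the idea of annotated ("fat") pointers and their erasure concrete, here is a minimal sketch in plain C. It is not the paper's Coq or Redex model and it does not use Checked C syntax. The names fat_ptr, fat_from_array, fat_add, fat_load, and erased_load are hypothetical, chosen only for illustration, and the sketch assumes int elements and ignores null-terminated arrays entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: a raw pointer annotated with its bounds. */
typedef struct {
    int *base;      /* start of the array, or NULL */
    size_t len;     /* number of accessible elements */
    ptrdiff_t off;  /* current offset from base; may leave the bounds */
} fat_ptr;

static fat_ptr fat_from_array(int *base, size_t len) {
    assert(base != NULL || len == 0);  /* a null pointer has empty bounds */
    fat_ptr p = { base, len, 0 };
    return p;
}

/* Pointer arithmetic only moves the offset; nothing is checked yet. */
static fat_ptr fat_add(fat_ptr p, ptrdiff_t n) {
    p.off += n;
    return p;
}

/* A load is checked at the point of dereference. Returning false means
   a null or out-of-bounds access was caught instead of performed. */
static bool fat_load(fat_ptr p, int *out) {
    if (p.base == NULL || p.off < 0 || (size_t)p.off >= p.len)
        return false;
    *out = p.base[p.off];
    return true;
}

/* Assumed "erased" counterpart: a plain pointer with no attached
   metadata. The bounds travel as ordinary values and the same check
   is written out explicitly. */
static bool erased_load(const int *base, size_t len, ptrdiff_t off, int *out) {
    if (base == NULL || off < 0 || (size_t)off >= len)
        return false;
    *out = base[off];
    return true;
}
```

In this sketch, fat_load and erased_load accept and reject exactly the same accesses, which is the kind of agreement between the annotated and erased forms that the abstract describes validating by randomized testing.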

Publication: arXiv e-prints

Pub Date: January 2022

DOI: 10.48550/arXiv.2201.13394

arXiv: arXiv:2201.13394

Bibcode: 2022arXiv220113394L

Keywords: Computer Science - Programming Languages; Computer Science - Software Engineering; D.3.1

E-Print: This is an extended version of a paper that appears at the 2022 Computer Security Foundations Symposium
