Wednesday, October 16, 2024

Google’s memory safety plan still involves unsafe C/C++


Google has revealed that its approach to making programming code more memory safe involves both the adoption of memory safe languages and making unsafe languages more secure – to the extent that’s possible.

The Chocolate Factory has been an avid booster of memory safety for the past few years – celebrating the security benefits that accrue when code is written or rewritten in a language that, like Rust, offers guarantees of memory safety.

But the biz also acknowledges that legacy C and C++ code can’t all be revised or discarded. So it’s trying to balance its memory safety evangelism with the reality that C and C++ codebases will exist for decades to come, and they must be hardened.

This two-pronged approach has been discussed for some time, but the part about learning to live with unsafe code often gets drowned out by the appreciative odes to Rust and other memory safe languages (MSLs) like Java, Kotlin, Go, and Python.

“Our long-term objective is to progressively and consistently integrate memory-safe languages into Google’s codebases while phasing out memory-unsafe code in new development,” explained Googlers Alex Rebert, Chandler Carruth, Jen Engel, and Andy Qin in a blog post. “Given the amount of C++ code we use, we anticipate a residual amount of mature and stable memory-unsafe code will remain for the foreseeable future.”

Memory safety bugs date back more than 50 years and occur when code tries to read or write memory in a way that’s undefined – a concern that Rust contributor Steve Klabnik argues goes beyond memory safety. Undefined behaviour may occur, for example, when a program in an unsafe language tries to access an object’s memory outside of its allocated memory region. The result is an out of bounds error.
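To make the out of bounds case concrete, here is a minimal C++ sketch. The helper name `checked_read` is hypothetical and not taken from Google's post; it only contrasts the standard library's bounds-checked `at()` with unchecked `operator[]`, which is undefined behaviour when the index is past the end.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (an illustration, not Google's code): returns the
// element at `index`, or std::nullopt when the index is out of bounds.
std::optional<int> checked_read(const std::vector<int>& values, std::size_t index) {
    try {
        // at() checks the index and throws std::out_of_range on failure.
        return values.at(index);
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
    // By contrast, values[index] with index >= values.size() is undefined
    // behaviour: the read may crash, return garbage, or appear to work.
}
```

Calling `checked_read(v, 3)` on a three-element vector yields an empty optional rather than reading past the allocation.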

Other memory safety flaws arise when, for example, a pointer references heap-allocated memory that has been freed.
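A use-after-free can be sketched in the same spirit. The example below does not show MiraclePtr or any Google mechanism; it swaps in the standard library's `std::weak_ptr`, and the helper name `read_if_alive` is hypothetical. The point is that a non-owning handle can detect that its target has been freed instead of dereferencing a dangling raw pointer.

```cpp
#include <memory>
#include <optional>

// Hypothetical helper (an illustration only): reads through a non-owning
// handle, but only if the heap object it refers to is still alive.
std::optional<int> read_if_alive(const std::weak_ptr<int>& handle) {
    // lock() returns an empty shared_ptr once the object has been destroyed,
    // so a freed object is never dereferenced.
    if (auto owner = handle.lock()) {
        return *owner;
    }
    return std::nullopt;
}
```

With a raw `int*` in place of the `weak_ptr`, the same read after the owner released the object would be undefined behaviour.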

Such issues turn out to be rather common in C and C++, which make programmers responsible for memory management.

Which may be why 75 percent of the CVEs used in zero-day exploits are memory safety vulnerabilities, according to Google. About 70 percent of severe vulnerabilities in large codebases are attributable to such bugs.

The repeated citation of such statistics over the past few years has led to an international campaign – backed by government cyber security agencies – to use MSLs where possible, as well as initiatives to convert existing unsafe code into something more sound.

Google has embraced MSLs and tried to harden C++. “We have allocated a portion of our computing resources specifically to bounds-checking the C++ standard library across our workloads,” explained Rebert et al, adding that the promising results of this effort will be shared at a later date.

In addition to Chrome’s MiraclePtr mechanism, which has cut use-after-free memory bugs by 57 percent, Google’s ongoing efforts to expand isolation techniques like sandboxing and privilege reduction have led to projects like the beta release of the V8 heap sandbox, an LLM-based vulnerability hunting tool called Project Naptime, support for Arm’s Memory Tagging Extension (MTE), and research into Capability Hardware Enhanced RISC Instructions (CHERI) architecture.

Google is not alone in its work to fortify C and C++. The Open Source Security Foundation has published a guide to hardening C and C++ code. The C++ Alliance recently published a Safe C++ Extensions proposal. C23 – a draft of the latest version of the C programming language – has features like N3020, Qualifier-preserving Standard Functions, which help improve read-only memory safety.

Also, Bjarne Stroustrup, creator of C++, has proposed Safety Profiles [PDF] – a set of rules that makes certain safety guarantees.

Memory safe languages may be the future – but for some time to come, so are C and C++. ®
