Safegcd’s Implementation Formally Verified

Introduction

The security of Bitcoin, and other blockchains, such as Liquid, hinges on the use of digital signatures algorithms such as ECDSA and Schnorr signatures. A C library called libsecp256k1, named after the elliptic curve that the library operates on, is used by both Bitcoin Core and Liquid, to provide these digital signature algorithms. These algorithms make use of a mathematical computation called a modular inverse, which is a relatively expensive component of the computation.

In “Fast constant-time gcd computation and modular inversion,” Daniel J. Bernstein and Bo-Yin Yang develop a new modular inversion algorithm. In 2021, this algorithm, referred to as “safegcd,” was implemented for libsecp256k1 by Peter Dettman. As part of the vetting process for this novel algorithm, Blockstream Research was the first to complete a formal verification of the algorithm’s design by using the Coq proof assistant to formally verify that the algorithm does indeed terminate with the correct modular inverse result on 256-bit inputs.

The Gap between Algorithm and Implementation

The formalization effort in 2021 only showed that the algorithm designed by Bernstein and Yang works correctly. However, using that algorithm in libsecp256k1 requires implementing the mathematical description of the safegcd algorithm within the C programming language. For example, the mathematical description of the algorithm performs matrix multiplication of vectors that can be as wide as 256 bit signed integers, however the C programming language will only natively provide integers up to 64 bits (or 128 bits with some language extensions).

Implementing the safegcd algorithm requires programming the matrix multiplication and other computations using C’s 64 bit integers. Additionally, many other optimizations have been added to make the implementation fast. In the end, there are four separate implementations of the safegcd algorithm in libsecp256k1: two constant time algorithms for signature generation, one optimized for 32-bit systems and one optimized for 64-bit systems, and two variable time algorithms for signature verification, again one for 32-bit systems and one for 64-bit systems.

Verifiable C

In order to verify the C code correctly implements the safegcd algorithm, all the implementation details must be checked. We use Verifiable C, part of the Verified Software Toolchain for reasoning about C code using the Coq theorem prover.

Verification proceeds by specifying preconditions and postconditions using separation logic for every function undergoing verification. Separation logic is a logic specialized for reasoning about subroutines, memory allocations, concurrency and more.

Once each function is given a specification, verification proceeds by starting from a function’s precondition, and establishing a new invariant after each statement in the body of the function, until finally establishing the post condition at the end of the function body or the end of each return statement. Most of the formalization effort is spent “between” the lines of code, using the invariants to translate the raw operations of each C expression into higher level statements about what the data structures being manipulated represent mathematically. For example, what the C language regards as an array of 64-bit integers may actually be a representation of a 256-bit integer.

The end result is a formal proof, verified by the Coq proof assistant, that libsecp256k1’s 64-bit variable time implementation of the safegcd modular inverse algorithm is functionally correct.

Limitations of the Verification

There are some limitations to the functional correctness proof. The separation logic used in Verifiable C implements what is known as partial correctness. That means it only proves the C code returns with the correct result if it returns, but it doesn’t prove termination itself. We mitigate this limitation by using our previous Coq proof of the bounds on the safegcd algorithm to prove that the loop counter value of the main loop in fact never exceeds 11 iterations.

Another issue is that the C language itself has no formal specification. Instead the Verifiable C project uses the CompCert compiler project to provide a formal specification of a C language. This guarantees that when a verified C program is compiled with the CompCert compiler, the resulting assembly code will meet its specification (subject to the above limitation). However this doesn’t guarantee that the code generated by GCC, clang, or any other compiler will necessarily work. For example, C compilers are allowed to have different evaluation orders for arguments within a function call. And even if the C language had a formal specification any compiler that isn’t itself formally verified could still miscompile programs. This does occur in practice.

Lastly, Verifiable C doesn’t support passing structures, returning structures or assigning structures. While in libsecp256k1, structures are always passed by pointer (which is allowed in Verifiable C), there are a few occasions where structure assignment is used. For the modular inverse correctness proof, there were 3 assignments that had to be replaced by a specialized function call that performs the structure assignment field by field.

Summary

Blockstream Research has formally verified the correctness of libsecp256k1’s modular inverse function. This work provides further evidence that verification of C code is possible in practice. Using a general purpose proof assistant allows us to verify software built upon complex mathematical arguments.

Nothing prevents the rest of the functions implemented in libsecp256k1 from being verified as well. Thus it is possible for libsecp256k1 to obtain the highest possible software correctness guarantees.

This is a guest post by Russell O’Connor and Andrew Poelstra. Opinions expressed are entirely their own and do not necessarily reflect those of BTC Inc or Bitcoin Magazine.

Source: bitcoinmagazine.com