The data_encoding library is used throughout the Tezos codebase for various serializing/deserializing tasks. Because of this, in our effort to formally verify the protocol, there are many proofs involving encoding.
data_encoding has a series of primitives for encoding/decoding scalar
data types and combinators for non-scalar data types, combinators work by
combining primitives together with functions and letting users compose smaller
encodings into bigger ones.
Having axioms and lemmas for primitives and combinators enable us to verify the encodings, but in practice, these proofs can be very mechanical.
For example here is a proof involving data_encoding.
Lemma fixed_encoding_is_valid : Data_encoding.Valid.t (fun _ => True) Constants_repr.fixed_encoding. eapply Data_encoding.Valid.implies. eapply Data_encoding.Valid.conv, Data_encoding.Valid.obj10; constructor; try first [ apply Data_encoding.Valid.uint8 | apply Data_encoding.Valid.uint16 | apply Data_encoding.Valid.int31 ]; eapply Data_encoding.Valid.list_value; eapply Data_encoding.Valid.conv_with_guard; apply Data_encoding.Valid.int64_value. intro x; destruct x; intuition. repeat constructor.Qed.
Data_encoding.Valid.* lines can now be replaced by a simple call of
data_encoding_auto. This tactic will try to apply each of the combinator axioms
Data_encoding.Valid and then use the
Data_encoding_db hint database to
solve the remaining subgoals regarding
data_encoding primitives. If the proof
cannot be fully solved it will remain subgoals that are specific for each encoding.
Now here is the same proof but with
Lemma fixed_encoding_is_valid : Data_encoding.Valid.t (fun _ => True) Constants_repr.fixed_encoding. Data_encoding.Valid.data_encoding_auto. intro x; destruct x; intuition. repeat constructor.Qed.
data_encoding combinators involve arbitrary functions; these are
for encoding variants of sum types, and
conv for providing a conversion
between encodings. Because of this it is not always possible to fully automate the
encoding proofs, but we can leave all the mechanical work for the machine so we
can focus on things that are specific for each encoding.
As we write more encoding proofs we can increment the hints database. Since encodings have a nesting nature, we can use these hints to prove bigger encodings making it easier to write encoding proofs and avoiding repetitive proofs.