Skip to main content

Translate the Tenderbake's code to Coq

ยท 11 min read
Foobar.land (for Nomadic Labs)

Recently, an important merge request landed in Tezos to implement Tenderbake. This new consensus algorithm offers deterministic finality:

"a block that has just been appended to the chain of some node is known to be final once it has two additional blocks on top of it, regardless of network latency." (Nomadic Labs' blog)

This merge request and other changes resulted in almost 10,000 new Coq lines ๐Ÿ“ in the translated code.

In this blog post, we explain how we dealt with all recent code changes to maintain the Coq translation of the protocol. We use the tool coq-of-ocaml to do our translation. The resulting Coq code is what you can browse on this website. All our changes on the OCaml code are in our fork branch guillaume-claret@proto_alpha-coq-of-ocaml. The procedure to compile the protocol to Coq is in this HOWTO.

Record in a functor#

We have never dealt with definitions of record types in a functor, and this feature was broken in coq-of-ocaml. The definition is in a new file sampler.ml:

module Make (Mass : Mass) : S with type mass = Mass.t = struct  type mass = Mass.t
  type 'a t = {    total : Mass.t;    support : 'a FallbackArray.t;    p : Mass.t FallbackArray.t;    alias : int FallbackArray.t;  }
  (* ... *)end

Our initial translation was:

Module Make.  Class FArgs {Mass_t : Set} := {    Mass : Mass (t := Mass_t);  }.  Arguments Build_FArgs {_}.    Definition mass `{FArgs} : Set := Mass.(Mass.t).    Module t.    Record record {a : Set} : Set := Build {      total : Mass.(Mass.t);      support : FallbackArray.t a;      p : FallbackArray.t Mass.(Mass.t);      alias : FallbackArray.t int }.    Arguments record : clear implicits.    Definition with_total {t_a} total (r : record t_a) :=      Build t_a total r.(support) r.(p) r.(alias).    Definition with_support {t_a} support (r : record t_a) :=      Build t_a r.(total) support r.(p) r.(alias).    Definition with_p {t_a} p (r : record t_a) :=      Build t_a r.(total) r.(support) p r.(alias).    Definition with_alias {t_a} alias (r : record t_a) :=      Build t_a r.(total) r.(support) r.(p) alias.  End t.  Definition t := t.record.

We use a special type class FArgs to represent the current functor's parameters. However, we need to give the class as a parameter to all our definitions using `FArgs (the additional backquote is there to also include implicit type parameters of the class, in this case Mass_t). This is something we forgot in all our definitions related to records. In addition to that, having an extra parameter changes the implicit settings of the type record, as the polymorphic type a must be explicit but not the class parameter FArgs.

We did not have time to implement all these changes yet in coq-of-ocaml. So instead, we added new replacement rules to our patch file scripts/alpha/patch.rb so that we generate the following Coq code:

Module Make.  Class FArgs {Mass_t : Set} := {    Mass : Mass (t := Mass_t);  }.  Arguments Build_FArgs {_}.    Definition mass `{FArgs} : Set := Mass.(Mass.t).    Module t.    Record record `{FArgs} {a : Set} : Set := Build {      total : Mass.(Mass.t);      support : FallbackArray.t a;      p : FallbackArray.t Mass.(Mass.t);      alias : FallbackArray.t int }.    Arguments record {_ _}.    Definition with_total `{FArgs} {t_a} total (r : record t_a) :=      Build _ _ t_a total r.(support) r.(p) r.(alias).    Definition with_support `{FArgs} {t_a} support (r : record t_a) :=      Build _ _ t_a r.(total) support r.(p) r.(alias).    Definition with_p `{FArgs} {t_a} p (r : record t_a) :=      Build _ _ t_a r.(total) r.(support) p r.(alias).    Definition with_alias `{FArgs} {t_a} alias (r : record t_a) :=      Build _ _ t_a r.(total) r.(support) r.(p) alias.  End t.  Definition t `{FArgs} := t.record.

Replacement rules are simple "search and replace" on the Coq code generated by coq-of-ocaml.

When we will implement the support of records in functors in coq-of-ocaml, we should be more cautious with our tests and check the definition of inductive types in functors works too. Note that, in practice, defining a new type in a functor is rarely done in Tezos. What happens more frequently is the definition of new synonym types.

Flat polymorphic variants#

In the protocol, in token.ml, we define three related types:

type container =  [ `Contract of Contract_repr.t  | `Collected_commitments of Blinded_public_key_hash.t  | `Delegate_balance of Signature.Public_key_hash.t  | `Frozen_deposits of Signature.Public_key_hash.t  | `Block_fees  | `Legacy_deposits of Signature.Public_key_hash.t * Cycle_repr.t  | `Legacy_fees of Signature.Public_key_hash.t * Cycle_repr.t  | `Legacy_rewards of Signature.Public_key_hash.t * Cycle_repr.t ]
type source =  [ `Invoice  | `Bootstrap  | `Initial_commitments  | `Revelation_rewards  | `Double_signing_evidence_rewards  | `Endorsing_rewards  | `Baking_rewards  | `Baking_bonuses  | `Minted  | `Liquidity_baking_subsidies  | container ]
type sink =  [ `Storage_fees  | `Double_signing_punishments  | `Lost_endorsing_rewards of Signature.Public_key_hash.t * bool * bool  | `Burned  | container ]

The types source and sink include the type container. Thus when we use the constructor `Block_fees for example it could be of type container, source or sink depending on the context. We use the ability to do sub-typing with the polymorphic variants of OCaml.

There are no direct equivalents in Coq of the polymorphic variants. We can probably emulate their behavior using coercions from container to source or sink. For now we prefer to do an explicit change in the OCaml code of our fork of the protocol to use algebraic data types instead:

type container =  | Contract of Contract_repr.t  | Collected_commitments of Blinded_public_key_hash.t  | Delegate_balance of Signature.Public_key_hash.t  | Frozen_deposits of Signature.Public_key_hash.t  | Block_fees  | Legacy_deposits of Signature.Public_key_hash.t * Cycle_repr.t  | Legacy_fees of Signature.Public_key_hash.t * Cycle_repr.t  | Legacy_rewards of Signature.Public_key_hash.t * Cycle_repr.t
type source =  | Invoice  | Bootstrap  | Initial_commitments  | Revelation_rewards  | Double_signing_evidence_rewards  | Endorsing_rewards  | Baking_rewards  | Baking_bonuses  | Minted  | Liquidity_baking_subsidies  | SourceContainer of container
type sink =  | Storage_fees  | Double_signing_punishments  | Lost_endorsing_rewards of Signature.Public_key_hash.t * bool * bool  | Burned  | SinkContainer of container

The syntax is a little heavier since we have to use the constructors SourceContainer or SinkContainer to lift from a container to a source or a sink. However, we believe this change is not too important and helps to make the code more explicit. We will see if this change can be integrated into the master branch of Tezos and seek the coercion approach if it cannot.

The translated Coq code for these algebraic types is straightforward:

Inductive container : Set :=| Contract : Contract_repr.t -> container| Collected_commitments : Blinded_public_key_hash.t -> container| Delegate_balance : Signature.public_key_hash -> container| Frozen_deposits : Signature.public_key_hash -> container| Block_fees : container| Legacy_deposits : Signature.public_key_hash -> Cycle_repr.t -> container| Legacy_fees : Signature.public_key_hash -> Cycle_repr.t -> container| Legacy_rewards : Signature.public_key_hash -> Cycle_repr.t -> container.
Inductive source : Set :=| Invoice : source| Bootstrap : source| Initial_commitments : source| Revelation_rewards : source| Double_signing_evidence_rewards : source| Endorsing_rewards : source| Baking_rewards : source| Baking_bonuses : source| Minted : source| Liquidity_baking_subsidies : source| SourceContainer : container -> source.
Inductive sink : Set :=| Storage_fees : sink| Double_signing_punishments : sink| Lost_endorsing_rewards : Signature.public_key_hash -> bool -> bool -> sink| Burned : sink| SinkContainer : container -> sink.

Top-level name collisions#

In Coq, we cannot define two times the same value at top-level. This creates errors on the Coq side when a name is re-defined in OCaml at top-level. To avoid that, we do some renaming on our OCaml fork of the protocol. An example of such renaming occurs in raw_context.ml. The function non_consensus_operations is defined two times:

let[@inline] non_consensus_operations ctxt =  ctxt.back.non_consensus_operations
(* ... some code in between ... *)
let non_consensus_operations ctxt =  List.rev (non_consensus_operations ctxt)

We rename the first definition to non_consensus_operations_rev. We should probably also rename the corresponding field in ctxt.back to non_consensus_operations_rev. We believe this top-level collision could result in bugs as it can be unclear which one we use and which one reverses the list.

Another frequent collision we have is for init expressions. With coq-of-ocaml, we convert an OCaml code of the form:

let () =  (* some code *)

to:

Definition init_module : unit :=  (* some code *)

The translated code is mostly useless as we can prove it to be equal to tt of type unit since there are no side-effects in Coq. The translated code has some interests to see how the translated code would look like or if we want to do extraction to OCaml. In the file round_repr.ml there are several let () = statements at top-level. Thus we rename each of them to init_module1, init_module2, ... We wonder if we should always group the init expressions of a file together and if we should replace init expressions by init functions:

let init () =  (* some code *)

The advantage of the init functions is that we control when (and if) they are called. Having the init code in a monad would also help to verify it by capturing the side-effects.

Other name collisions#

Another frequent name collision is the collision between type names and value names. In Coq, types and values are in the same namespace, essentially because this is a dependently typed language. A collision happens when we use the name of a type for a value (for example list) or a single letter name (like a), as polymorphic types often use single letter names too. We use an explicit list of value names to rename, appending the _value suffix. We needed to add k and round_and_offset to the list of values to rename for the new code changes.

We had one collision between module and module type names in sampler.ml. We renamed the module signature Mass to avoid a name collision with the module Mass.

Finally, in OCaml, we can have the same constructor name for several different constructors, using type information to distinguish them. This is not possible in Coq, so we namespace these constructors by putting them into different modules. This happened for the file operation_repr.ml.

Module sub-typing#

We have limited support of module sub-typing in coq-of-ocaml. This is because we translate modules to records, and records do not have sub-typing. In storage.ml, the module Int31_index has two signatures which are disjoint on their value fields:

  • Storage_functors.INDEX with type t = int and
  • Storage_sigs.VALUE with type t = int.

We chose to split this module into two different modules.

Unpacking of first-class modules#

Due to universe consistency issues, we cannot unpack first-class modules to plain modules at top-level in Coq. With coq-of-ocaml, we represent first-class modules in Coq by records with existential types in impredicative sets for the abstract module types. There are a few examples of such unpacking for the cache system. For now we replace these unpacking by an axiom as we do not verify anything on the cache system yet. An alternative solution is to disable the universe-level check in Coq or rewrite the OCaml code to avoid unpacking first-class modules at top-level.

Axiomatized GADTs#

Despite recent efforts, we are still not able to translate the GADTs of the Michelson interpreter. As a result, we needed to axiomatize some functions matching on these GADTs. This is for example the case for a few functions defined in ticket_scanner.ml.

Fixpoints#

Coq requires fixpoints to only make recursive calls on syntactically smaller parameters. As a result, we sometimes need to disable the termination check using the Guard Checking flag. For some of the definitions, we set the mutually recursive functions as notations using the attributes coq_mutual_as_notation of coq-of-ocaml. For example, we use this technique for the function has_tickets_of_ty in the file ticket_scanner.ml.

For a future version of coq-of-ocaml, we are considering removing these annotations from the code and having them in a separated annotation file for those who prefer. Indeed, this seems to be a request from some OCaml developers.

Update in the proofs#

We had no proofs to update. However, we had to remove some of them as their corresponding definitions or file disappeared. Overall, our proof coverage decreased as the size of the translated Coq code increased by 20%.