Verified Effectful Programming in F*

fstar-logo

Cătălin Hriţcu, Inria Paris

Trends in Functional Programming

University of Kent, Canterbury, 21 June 2017

The current F* team

Microsoft Research, Inria Paris, Univ of Edinburgh, MIT, …

  • Danel Ahman
  • Benjamin Beurdouche
  • Karthikeyan Bhargavan
  • Antoine Delignat-Lavaud
  • Victor Dumitrescu
  • Cédric Fournet
  • Armaël Guéneau
  • Cătălin Hriţcu
  • Samin Ishtiaq
  • Markulf Kohlweiss
  • Tomer Libal
  • Qunyan Magnus
  • Kenji Maillard
  • Guido Martínez
  • Clément Pit-Claudel
  • Gordon Plotkin
  • Jonathan Protzenko
  • Tahina Ramananandro
  • Aseem Rastogi
  • Nikhil Swamy
  • Peng (Perry) Wang
  • Christoph M. Wintersteiger
  • Santiago Zanella-Béguelin
  • Jean-Karim Zinzindohoué

Program verification: Shall the twain ever meet?

Interactive proof assistants Semi-automated verifiers of imperative programs
Coq, CompCert,   air Dafny, Verve,
Isabelle, seL4, FramaC, IronClad,
Agda, Bedrock, Why3 miTLS
Lean, 4 colors   gap Vale
  • In the left corner: Very expressive dependently-typed logics,
    but only purely functional programming

  • In the right: effectful programming, SMT-based automation,
    but only first-order logic

Bridging the gap: F*

  • Functional programming language with effects

  • Semi-automated verification system using SMT

    • like Dafny, FramaC, Why3,
  • Interactive proof assistant based on dependent types

    • like Coq, Lean, Agda,
  • Other tools in this space:

    • DML/ATS, HTT, Idris, Trellys/Zombie, CoqHammer,

F* in action, at scale

  • Functional programming language with effects

    • F* is programmed in F*, but not (yet) verified
  • Semi-automated verification system

    • Project Everest: verify and deploy new, efficient HTTPS stack
      • miTLS*: Verified reference implementation of TLS (1.2 and 1.3)
      • HACL*: High-Assurance Cryptographic Library
      • Vale: Verified Assembly Language for Everest
  • Proof assistant based on dependent types

    • Fallback when SMT fails; also for mechanized metatheory:
      • MicroFStar: Fragment of F* formalized in F*
      • Wys*: Verified DSL for secure multi-party computations
      • ReVer: Verified compiler to reversible circuits

The rest of this talk

  • The functional core of F*

  • Verifying functional programs

  • Monadic effects in F*

  • Verifying effectful programs

  • Under the hood: Dijkstra monads for free

  • Using very simple examples throughout

The functional core of F*

Recursive functions

val factorial : int -> int
let rec factorial n = (if n = 0 then 1 else n * (factorial (n - 1)))

Inductive datatypes

type list (a:Type) =
  | Nil  : list a
  | Cons : hd:a -> tl:list a -> list a

val map : ('a -> 'b) -> list 'a -> list 'b
let rec map f x = match x with
  | [] -> []
  | h :: t -> f h :: map f t

Lambdas

map (fun x -> x + 42) [1;2;3]

Refinement types

type nat = x:int{x>=0}

Refinements introduced by type annotations (code unchanged)

val factorial : nat -> nat
let rec factorial n = (if n = 0 then 1 else n * (factorial (n - 1)))

Logical obligations discharged by SMT (simplified)

n >= 0, n <> 0 |= n - 1 >= 0
n >= 0, n <> 0, (factorial (n - 1)) >= 0 |= n * (factorial (n - 1)) >= 0

Refinements eliminated by subtyping: nat<:int

let i : int = factorial 42

Dependent types

Dependent function types ($\Pi$), here together with refinements:

val incr : x:int -> y:int{x < y}
let incr x = x + 1

Can express pre- and post- conditions of pure functions

val incr : x:int -> y:int{y = x + 1}

Total functions in F*

The F* functions we saw so far were all total

Tot effect (default) = no side-effects, terminates on all inputs

val factorial : nat -> Tot nat
let rec factorial n = (if n = 0 then 1 else n * (factorial (n - 1)))

How about giving this weak type to factorial?

val factorial : int -> Tot int
let rec factorial n = (if n = 0 then 1 else n * (factorial (n - 1)))
                                                            ^^^^^
Subtyping check failed; expected type (x:int{(x << n)}); got type int

factorial (-1) loops! (int type in F* is unbounded)

Semantic termination checking

  • based on well-founded ordering on expressions (<<)
    • naturals related by < (negative integers unrelated)
    • inductives related by subterm ordering
    • lex tuples %[a;b;c] with lexicographic ordering
  • order constraints discharged by the SMT solver
  • arbitrary total expression as decreases metric
      val ackermann: m:nat -> n:nat -> Tot nat (decreases %[m;n])
      let rec ackermann n m =
        if m=0 then n + 1
        else if n = 0 then ackermann 1 (m - 1)
        else ackermann (ackermann (n - 1) m) (m - 1)
  • default metric is lex ordering of all (non-function) args
      val ackermann: m:nat -> n:nat -> Tot nat

The divergence effect (Dv)

We might not want to prove all code terminating

    val factorial : int -> Dv int

Some useful code really is not always terminating

val eval : exp -> Dv exp
let rec eval e =
  match e with
  | App (Lam x e1) e2 -> eval (subst x e2 e1)
  | App e1 e2 -> eval (App (eval e1) e2)
  | Lam x e1 -> Lam x (eval e1)
  | _ -> e
let main = eval (App (Lam 0 (App (Var 0) (Var 0)))
                     (Lam 0 (App (Var 0) (Var 0))))
./Divergence.exe

F* effect system encapsulates effectful code

Pure code cannot call potentially divergent code

Only pure code can appear in specifications

val factorial : int -> Dv int
    type tau = x:int{x = factorial (-1)}
                     ^^^^^^^^^^^^^^^^^^
Expected a pure expression; got an expression ... with effect "DIV"

Sub-effecting: Tot t <: Dv t
(e.g. divergent code can include pure code)

incr 2 + factorial (-1) : Dv int

Verifying pure programs

Variant #1: intrinsically (at definition time)

  • Using refinement types (saw this already)
    val factorial : nat -> Tot nat
  • Can equivalently use pre- and post- conditions for this
    val factorial : x:int -> Pure int (requires (x >= 0))
                                    (ensures (fun y -> y >= 0))
  • Each computation type contains
    • effect (Pure, Div), result type (int), spec (e.g. pre and post)
  • Tot can be seen as just an abbreviation
    Tot t = Pure t (requires True) (ensures (fun _ -> True))

Verifying potentially divergent programs

The only variant: intrinsically (partial correctness)

  • Using refinement types
    val factorial : nat -> Dv nat
  • Or the Div computation type (pre- and post- conditions)
    val eval_closed : e:exp -> Div exp (requires (closed e))
                                     (ensures (fun e' -> Lam? e' /\ closed e'))
    let rec eval_closed e =
    match e with
    | App e1 e2 -> let Lam e1' = eval_closed e1 in
                   below_subst_beta 0 e1' e2;
                   eval_closed (subst (sub_beta e2) e1')
    | Lam e1 -> Lam e1
  • Dv just an abbreviation
    Dv t = Div t (requires True) (ensures (fun _ -> True))

Another way to look at this

  • Two classes of types

    • Value types (t): int, list int,
    • Computation types (C): Tot t   and   Dv t
  • Dependent function types of the form: x:t -> C

    • argument can't have side-effects, so value type
  • Two forms of refinement types

    • Refined value types: x:t{p}
    • Refined computation types:
      • Pure t pre post   and   Div t pre post
      • these will get more interesting for more interesting effects

Verifying pure programs

Variant #2: extrinsically using SMT-backed lemmas

let rec append (#a:Type) (xs : list a) (ys : list a) : Tot (list a) =
  match xs with
  | [] -> ys
  | x :: xs' -> x :: append xs' ys
let rec append_length (#a:Type) (xs : list a) (ys : list a) :
    Pure unit
      (requires True)
      (ensures (fun _ -> length (append xs ys) = length xs + length ys))
= match xs with
  | [] -> ()
  | x :: xs' -> append_length xs' ys

Syntax sugar (Lemma)

let rec append_length (#a:Type) (xs : list a) (ys : list a) :
    Lemma (ensures (length (append xs ys) = length xs + length ys)) = ...

Verifying pure programs

Variant #3: extrinsically using proof terms

val preservation : #e:exp -> #e':exp -> #g:env -> #t:typ ->
       ht:(typing g e t) -> hs:step e e' -> Tot (typing g e' t) (decreases ht)
let rec preservation #e #e' #g #t (TyApp h1 h2) hs =
  match hs with
  | SBeta tx e1' e2' -> substitution_beta h2 (TyLam?.hbody h1)
  | SApp1 e2' hs1   -> TyApp (preservation h1 hs1) h2
  | SApp2 e1' hs2   -> TyApp h1 (preservation h2 hs2)
val progress : #e:exp -> #t:typ -> h:typing empty e t ->
                         Pure (cexists (fun e' -> step e e'))
                              (requires (~ (is_value e)))
                              (ensures (fun _ -> True)) (decreases h)
let rec progress #e #t h =
  match h with
  | TyApp #g #e1 #e2 #t11 #t12 h1 h2 ->
     match e1 with
     | ELam t e1' -> ExIntro (subst (sub_beta e2) e1') (SBeta t e1' e2)
     | _          -> let ExIntro e1' h1' = progress h1 in
                     ExIntro (EApp e1' e2) (SApp1 e2 h1')

Note: match exhaustiveness check also semantic (via SMT)

The functional core of F*

  • Dependent type theory variant

    • $\lambda$, $\Pi$, inductives, matches, universe polymorphism
  • Recursion and semantic termination check

    • potential non-termination is an effect
  • Refined value and computation types

    • with proof irrelevance, discharged by SMT (classical)
  • Subtyping and sub-effecting

  • Extensional equality

    • type conversion up to SMT

Monadic effects in F*

Effects defined using monads built from total functions

type st (mem:Type) (a:Type) = mem -> Tot (a * mem)
total new_effect {
 STATE_m (mem:Type) : a:Type -> Effect
 with repr = st mem;
      return = fun (a:Type) (x:a) (m:mem) -> x, m;
      bind = fun (a b:Type) (f:st mem a) (g:a -> st mem b) (m:mem) ->
             let z, m' = f m in g z m';
      get = fun () (m:mem) -> m, m;
      put = fun (m:mem) _ -> (), m  }
total new_effect STATE = STATE_m heap
  • Monadic representation of STATE effect usually kept abstract
    • e.g. primitively implemented under the hood by ML heap or C stack+heap
    • other effects implemented in terms of their monadic representation
  • Whether implemented primitively or not
    • monadic effect definition is the model used by F* to reason about effectful code

Monadic lifts in F*

State and exception monad

let stexn a = nat -> Tot ((either a string) * nat))
new_effect {
  STEXN: a:Type -> Effect with
    repr    = stexn;
    return  = fun (a:Type) (x:a) s -> Inl x, s;
    bind    = fun (a b:Type) (f:stexn a) (g:a -> stexn b) s0 ->
                 let (r,s1) = f s0 in
                 match r with
                 | Inl ret -> Inl (g ret s1), s1
                 | Inr m -> Inr m, s1
    raise   = fun (a:Type) (msg:string) s -> Inr msg, s
}
sub_effect STATE ~> STEXN {
  lift = fun (a:Type) (e:st nat a) -> (fun s -> let (x,s') = e s in Inl x, s')
}

Programming with effects, in direct style

  • In F*, the programmer writes:

    let incr () =
        let x = STATE.get() in
        STATE.put (x + 1);
        let y = STATE.get() in
        assert (y > x)
  • Made explicitly monadic via type and effect inference

    let incr () =
      STATE.bind (STATE.get ()) (fun x -> 
      STATE.bind (STATE.put (x + 1)) (fun _ ->
      STATE.bind (STATE.get ()) (fun y ->
      STATE.return (assert (y > x)))))

Programming with multiple effects

  • Programmer writes:

      ( / ) : int -> x:int{x<>0} -> Tot int
      let divide_by (x:int) : STEXN unit ...
        = if x <> 0 then put (get () / x) else raise "Divide by zero"
  • Elaborated to:

      let divide_by x = 
        if x <> 0 then STATE_STEXN.lift (STATE.bind (STATE.get())
                                          (fun n -> STATE.put (n / x)))
        else STEXN.raise "Divide by zero"
  • F* infers the least effect of each sub-term

    • automatically lifts computations to use the suitable effect
    • ensures that reasoning isn't needlessly polluted by unused effects

Verifying effectful programs

Variant #1: intrinsically (at definition time)

  • Hoare (type theory) style pre- and post-conditions

    val incr : unit -> ST unit (requires (fun n0 -> True))
                               (ensures (fun n0 _ n1 -> n1 = n0 + 1))
    let incr () =
        let x = STATE.get() in
        STATE.put (x + 1)
  • Beyond what can be expressed with refinements

  • Now pre-condition is relation on initial states; post-condition is relation on initial states, result values, and final states

    • so type of pre- and post- conditions depends on the effect
    • the more complex the effect, the more complex the specs (and proofs)

Verifying effectful programs

Variant #2: extrinsically (by monadic reification)

  • Revealing pure effect representation

      STATE.reify (e : ST a pre post)
        : n0:nat -> Pure (a * nat) (requires (pre n0))
                                   (ensures (fun r -> post n0 (fst r) (snd r))
  • Allows us to give weak specification to an effectful function

      let incr () : ST a (requires (fun _ -> True)) (ensures (fun _ _ _ -> True))
        = let n = STATE.get() in STATE.put (n + 1)
  • then prove lemmas about reification of effectful computation

      let incr_increases (s0:s) : Lemma (snd (ST.reify (incr()) s0) = s0 + 1) = ()
  • reify/reflect only when the abstraction permits (e.g. ghost code)

Reification works better than we expected

  • Reducing effectful verification to pure verification

    • for which F* already has good support (e.g. SMT automation)
  • Recent experiments using this for “relational reasoning”

    • Correctness of program transformations
    • Information flow control
    • Proofs of algorithmic optimizations (memoization, union-find)
    • Simple game-based cryptographic proofs

Under the hood

Weakest pre-conditions

Dijkstra monads for free

C-types indexed by predicate transformers

Pre- and post- conditions are just syntactic sugar:

Pure t (pre : Type0) (post : t->Type0)
  = PURE t (fun k -> pre /\ forall y. post y ==> k y)
val factorial : x:int -> Pure int (requires (x >= 0)) (ensures (fun y -> y >= 0))
val factorial : x:int -> Pure (fun k -> x >= 0 /\ forall y. y >= 0 ==> k y)

Same for user-defined effects, like state:

ST t (pre : nat -> Type0) (post : nat -> t -> nat -> Type0) 
  = STATE t (fun n0 k -> pre n0 /\ forall x n1. post n0 x n1 ==> k x n1)
val incr : unit -> St unit (requires (fun n0 -> True))
                           (ensures (fun n0 _ n1 -> n1 = n0 + 1))
val incr : unit -> STATE unit (fun n0 k -> k () (n0 + 1))

Computing weakest preconditions, by example

let incr () = STATE.bind (STATE.get()) (fun x -> STATE.put (x + 1))

By inferring type for incr against following interface:

STATE.get : unit -> STATE nat (STATE.get_wp())
STATE.put : n:nat -> STATE unit (STATE.put_wp n)
STATE.bind : STATE 'a 'wa ->
             (x:'a -> STATE 'b ('wb x)) ->
             STATE 'b (STATE.bind_wp 'wa 'wb)

we compute the weakest precondition for incr

val incr : unit -> STATE unit
  (STATE.bind_wp (STATE.get_wp()) (fun x -> STATE.put_wp (x + 1)))
= (fun n0 k -> k () (n0 + 1)) 

Generic way of computing weakest-preconditions for all effects, provided we have a monad on predicate transformers

Predicate transformers monad for state

aka Dijkstra monad

let STATE.wp t = (t -> nat -> Type0) -> (nat -> Type0)
val STATE.return_wp : 'a -> Tot (STATE.wp 'a)
val STATE.bind_wp : (STATE.wp 'a) ->
                    ('a -> Tot (STATE.wp 'b)) ->
                      Tot (STATE.wp 'b)
val STATE.get_wp : unit -> Tot (STATE.wp nat)
val STATE.put_wp : nat -> Tot (STATE.wp unit)

we need to implement this Dijkstra monad:

let STATE.return_wp v = fun p -> p v
let STATE.bind_wp wp f = fun p -> wp (fun v -> f v p)
let STATE.get_wp () = fun p n0 -> p n0 n0
let STATE.put_wp n = fun p _ -> p () n

and for a while we wrote such things by hand;
but this is quite tricky and comes with strong proof obligations
(correctness with respect to effect definition, monad laws, )

Dijkstra monads for free

STATE.wp t  = (t -> nat -> Type0) -> (nat -> Type0)
           ~= nat -> (t * nat -> Type0) -> Type0

This can be automatically derived from the state monad

STATE.repr t = nat -> t * nat

by selective continuation-passing style (CPS) returning Type0

This works well for large class of monadic effects:
state, exceptions, continuations, etc.

From monadic effect definition we can derive a correct-by-construction weakest-precondition calculus for this effect.

Formalization

  • Two calculi

    • DMF: simply-typed with an abstract base monad, and restricted;
      used to define monads, actions, lifts
    • EMF*: dependently-typed, user-defined effects, reifying/reflecting
  • Two translations from well-typed DMF terms to EMF*

    • *-translation: gives specification (selective CPS)
    • elaboration: gives implementation (essentially an identity)
  • *-trans gives correct Dijkstra monad for elaborated terms

Graphically

dm4free

Correct reasoning about PURE

  • PURE is the only primitive EMF* effect (F* also has DIV)

  • A WP for PURE is of type

    PURE.wp t = (t -> Type0) -> Type0
    • Dijkstra monad for PURE is exactly the continuation monad
  • Total Correctness of PURE:
    If ⊢ e : PURE t wp   and   ⊢ wp p   then   e ↝* v   s.t.   ⊨ p v

Correct reasoning about STATE

  • Say we have a term e : nat -> t × nat

  • From logical relation, we get

    • e : s₀ : nat -> PURE (t × nat) (e* s₀)
  • From previous and correctness of PURE, we get
    Correctness of STATE
    If ⊢ e : nat -> t × nat   and  e* s₀ p
    then   e s₀ ↝* (v,s)   s.t.   ⊨ p (v,s)

Extra properties of the translations

  • *-translation preserves equality

    • Monads mapped to Dijkstra monads
    • Lifts mapped to Dijkstra lifts
    • Laws about actions preserved
  • e* is monotonic: maps weaker post's to weaker pre's

    • (∀x. p₁ ⇒ p₂) ⇒ e* p₁ ⇒ e* p₂
  • e* is conjunctive: distributes over ∧ and ∀

    • e* (fun x -> p₁ x ∧ p₂ x) ⇔ e* p₁ ∧ e* p₂
  • so for any DMF monad we produce correct Dijkstra monad,
    that's usable within the F* verification system

Some dreams and ongoing work on F*

  • Find best balance between automation and control

    • adding tactics à la Lean/Idris, using F* itself for meta-programming
  • Fully work out metatheory of F*

    • formalizing preorder-indexed state monads and F* to SMT encoding
    • longer term: semantic model; self-certify implementation
  • Verify and deploy new, efficient HTTPS stack (Everest)

    • verified interoperability of F* (OCaml), Low* (C), and Vale (ASM)
  • Opportunities brought in by extensible effect system

    • beyond ML effects: probabilities, concurrency,
    • non-termination as defined effect? extrinsic reasoning about DIV?
    • can DM4Free be also applied to algebraic handlers?

F*: verification of effectful programs

  • Functional programming language with effects

  • Semi-automated verification system using SMT

  • Interactive proof assistant based on dependent types

  • Open source, code on GitHub

  • Tutorial, papers, slides at fstar-lang.org

  • PhD internships at Microsoft Research (Cambridge, Redmond, Bangalore); application in Dec 2017 - Jan 2018