Documentation for MLsem

Basic usage
Syntax of the language
Caveat

Basic usage

The prototype consists of an online text editor (based on Monaco Editor).

The basic Monaco keyboard shortcuts (e.g. Ctrl-/ for a single-line comment) are available. The command palette can be invoked from the context menu or with F1.
F2 presents a menu of predefined examples.
Ctrl-Enter type-checks the buffer.

Computed types are printed as lenses (above each definition). In case of error, the error message is printed instead.

Syntax

Definitions

A program is a sequence of toplevel statements:

    prog ::= stmt*

    stmt ::= type id [(tvar, ..., tvar)] = t [and ...]
           | abstract type id [(tvar, ..., tvar)]
           | let gid param* = e [and ...]
           | let mut id [: t] = e [and ...]
           | val [mut] gid [: t]

    param ::= id
            | ( pat : t )

A statement can be either

a type definition, composed of a type name, an optional list of type variables (parameters), and a type expression. Several mutually recursive type definitions can be defined together using the and keyword.
an abstract type definition, composed of a type name and an optional list of type variables (parameters).
a toplevel mutable or immutable variable definition which follows, roughly, OCaml's syntactic convention. Function parameters can be either simple identifiers or patterns and optionally feature a type constraint.
a type signature for a mutable or immutable variable, composed of a simple identifier and the associated type. If no type is given, the identifier will be given the dynamic type (gradual typing).

Note that top-level definitions are considered recursive. A definition may refer to an identifier whose type signature is declared later. However, as top-level value definitions are typed sequentially, mutually-recursive definitions that have no explicit type signatures must be defined together using the and keyword.

Identifiers

Identifiers come in several flavours:


    gid ::= id | ( op )

    id ::= [a-z_][a-zA-Z0-9_']*

    cid ::= [A-Z][a-zA-Z0-9_']*

    tvar ::= '[a-zA-Z][a-zA-Z0-9_]*

value identifiers are either variable names (starting with a lowercase letter or an underscore) or prefix or infix operator symbols in parentheses.
constructor identifiers start with an uppercase letter.
variable and type identifiers share the syntax of variable names.
type variables start with a single quote character (').
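
The lexical classes above can be tried out with a small Python sketch. It is not part of MLsem: the function name classify is hypothetical, the three regular expressions are transcribed from the grammar, and reserved keywords (such as let or match) are not excluded, so they would be reported as plain identifiers.

```python
import re

# Lexical classes transcribed from the grammar above.
ID = re.compile(r"[a-z_][a-zA-Z0-9_']*")
CID = re.compile(r"[A-Z][a-zA-Z0-9_']*")
TVAR = re.compile(r"'[a-zA-Z][a-zA-Z0-9_]*")


def classify(token):
    """Return 'id', 'cid' or 'tvar' for a candidate identifier, else None.

    Hypothetical helper: keywords are not filtered out, and operator
    symbols (the "( op )" form of gid) are not handled.
    """
    for name, pattern in (("id", ID), ("cid", CID), ("tvar", TVAR)):
        if pattern.fullmatch(token):
            return name
    return None
```

For instance, classify("x'") falls in the id class, classify("Some") in cid, and classify("'a") in tvar, while a token such as 1x belongs to none of them.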

Type expressions


    t ::= simple_type
        | t -> t
        | t ,..., t
        | t :: t
        | [ tregex ]
        | { id :[?] t;...; id :[?] t [..]}
        | t | t
        | t & t
        | t \ t
        | ~ t
        | t where id tvar* = t [and ...]

    simple_type ::= b
                | id [(t, ..., t)]
                | cid [(t)]
                | tvar
                | (t)
    b ::= ()
        | lit
        | (int..int) | (..int) | (int..) | (..)
        | any | empty | tuple | tuplen | true | false | bool
        | int | char | unit | string | list | nil

Types are the usual set-theoretic types with tuple (product), arrow and record constructors and union, intersection, difference and negation operators. A sequence type constructor is provided as well. The content of a sequence type can be a regular expression over types, using the usual operators (*, + and ?). For instance, the type [ 'a* (bool|int)? ] is equivalent to the type definition


     t where t = 'a::t | s and s = [] | (bool | int)::[]
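
To see why the two forms accept the same sequences, the recursive definition can be modelled as a membership test. The following is a Python sketch, not MLsem code: sequences are Python lists, the type variable 'a is represented by a caller-supplied predicate is_a (an assumption made for the sake of the model), and the names in_t and in_s are hypothetical.

```python
def is_bool_or_int(v):
    # Models (bool | int). In Python, bool is a subclass of int,
    # so a single isinstance check covers both.
    return isinstance(v, int)


def in_s(xs):
    # s = [] | (bool | int)::[]
    return len(xs) == 0 or (len(xs) == 1 and is_bool_or_int(xs[0]))


def in_t(xs, is_a):
    # t = 'a::t | s
    # The type variable 'a is modelled by the predicate is_a.
    if in_s(xs):
        return True
    return len(xs) > 0 and is_a(xs[0]) and in_t(xs[1:], is_a)
```

With 'a modelled as "is a string", in_t accepts any run of strings optionally followed by one Boolean or integer, which is exactly what the regular expression 'a* (bool|int)? describes.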

Record types are given by the list of their fields, that is, labels each associated with a type. A field may be optional (denoted by :?). Open record types end their field list with two dots (..). Polymorphic types can be instantiated by giving a list of type parameters. Basic types consist of literals (which denote their own singleton type), integer interval types, and a set of built-in type identifiers.

Expressions


        e ::= lit
            | gid
            | cid [(e)]
            | (e : t)
            | (e :> t) | (e :>> t) | (e :>>> t)
            | { [e with] id=e;...;id=e }
            | [ e; ... ;e ]
            | e ,..., e
            | fun param+ -> e
            | let gid param* = e in e
            | let mut id [: t] = e in e
            | let (pat) = e in e
            | if e [is t] then e else e
            | fst e | snd e | hd e | tl e
            | match e with [|] pat -> e | ... | pat -> e end
            | e ; e
            | id := e
            | if e [is t] do e [else e] end
            | while e [is t] do e end
            | return e | break | continue

        lit ::= [0-9]+
            | 'char'
            | "char*"
            | float
            | () | false | true | []

Expressions can be

integer, character (delimited with single quotes), string (delimited with double quotes) or floating point literals
predefined enums such as false, true, …
identifiers
constructors (note that constructors are created on-the-fly, they do not need to be declared)
type casts, e.g. (x : bool)
type coercions, e.g. (42 :> int) which can be used locally to coerce the type of an expression into a supertype. Two other type coercion operators are available: (x :>> int) allows coercion of the dynamic part of a type into any type, and (x :>>> int) allows coercion of a type into any type (unchecked coercion). You can write # as target type for a coercion to the dynamic type.
constructors for products, lists and records
anonymous functions
local let binding (defined using either a variable or a pattern)
a type case. In the tested type, every occurrence of the arrow constructor must be of the form empty->any. If the tested type is omitted, the test is equivalent to is true
first and second projections for pairs, and head and tail for lists
a pattern matching construct. Patterns follow a first match policy.
a sequence
an assignment to a mutable variable
an imperative conditional (branches return ())
an imperative while loop (body returns ())
a control flow escape operation (return for functions, break or continue for loops)

Patterns


        pat ::= :simple_type
            | id
            | lit
            | cid [(pat)]
            | pat,...,pat | pat|pat | pat&pat
            | (pat)
            | [ pat; ... ;pat ]
            | { id [= pat];...; id [= pat] [..]}
            | id = lit

Patterns are essentially types with capture variables. For instance, the following expression


            match y with
            | :[ int* ] -> false
            | ( x & :bool, :int ) | x = false -> x
            end

first checks whether y is a list of integers, in which case it returns false. Otherwise, the second branch either matches y against a pair of a Boolean and an integer and captures the Boolean in x, or, if y is not such a pair, binds x to the constant false. In both cases it then returns x.
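
The first-match behaviour of this expression can be mimicked in Python. This is only a model under stated assumptions, not MLsem code: lists are Python lists, pairs are 2-tuples, and the names match_y and is_int are hypothetical. Since Python's bool is a subclass of int, the sketch excludes Booleans explicitly when testing for integers.

```python
def is_int(v):
    # Python's bool is a subclass of int; exclude it to model a
    # separate int type.
    return isinstance(v, int) and not isinstance(v, bool)


def match_y(y):
    # Branch 1, pattern :[ int* ]: y is a list of integers.
    if isinstance(y, list) and all(is_int(v) for v in y):
        return False
    # Branch 2, first alternative ( x & :bool, :int ):
    # capture the Boolean component in x.
    if (isinstance(y, tuple) and len(y) == 2
            and isinstance(y[0], bool) and is_int(y[1])):
        x = y[0]
    else:
        # Branch 2, second alternative x = false:
        # always matches and binds x to false.
        x = False
    return x
```

Branches are tried in order, so a list of integers never reaches the second branch, and the alternative x = false acts as a default that makes the second branch match every remaining value.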

Caveat

The syntax given above is only a rough approximation. The usual priority of common operators should apply, but in case of doubt, parentheses can be used to disambiguate expressions, types or patterns.
This online prototype uses Js_of_ocaml with support for algebraic effects, and is about 15 to 20 times slower than when compiled to native code. Some examples may not terminate in reasonable time or cause a stack overflow due to the relatively shallow recursion stack of web browsers.