Reference Guide

Sterling is a system for defining extensible languages using inference rules. It has three parts for defining languages: the semantic portion, describing the abstract syntax and judgments in the language; the concrete portion, describing the concrete syntax of the language; and the main function portion, describing how the language should be run on a program.

This is a reference for specific features, not a tutorial. A Sterling tutorial can be found elsewhere.

Running Sterling

To run Sterling to check a module named mo:du:le, run

sterling mo:du:le

This will look for files in the directory mo/du/le rooted in the current directory. To add another root directory location, use the -I flag:

sterling -I .. mo:du:le

This will find the module when run from within the mo directory. The -I flag may be used multiple times, with each given directory added as a location. The root directories of the module being checked and each of the modules on which it builds must be added if they are not located in the directory where Sterling is being run.

Running Sterling without flags other than the -I flag will check for errors but not result in other actions. To see more options for running Sterling, use the --help flag.

Semantic Portion

The semantic portion of the language is written in .sos files. It includes declaring abstract syntax and declaring and defining judgments. This is also where modules on which the current one builds are declared, with builds-on declarations like

Builds on mo:du:le

Declaring Abstract Syntax

Syntactic categories and sets of constructors for them are declared together:

expr ::= or(expr, expr)
       | and(expr, expr)
       | true
       | false
       | abs(string, ty, expr)
       | app(expr, expr)

If a constructor takes no arguments we may leave the parentheses off, as with true above.

We can add new constructors to an existing category by starting the declaration of constructors with an ellipsis:

expr ::= ...
       | plus(expr, expr)
       | num(int)
       | greater(expr, expr)

Each new syntactic category requires a declaration of its projection type. For example, we might define projection for an expression to depend on a typing context:

Projection expr : typingCtx

Quite often, we find we don't want projections to depend on anything, so we can have no arguments:

Projection stmt :

We might project a repeat-while construct into a regular while loop:


----------------------------------------------------------------- [Proj-RepeatWhile]
|{stmt}- repeatWhile(Body, Cond) ~~> seq(Body, while(Cond, Body))

We might project a let binding to the basic simply-typed lambda calculus by checking the type of the bound expression:

typeOf Ctx E1 Ty
------------------------------------------------------- [Proj-Let]
Ctx |{expr}- let(X, E1, E2) ~~> app(abs(X, Ty, E2), E1)

Declaring and Defining Judgments

The syntax of terms in Sterling includes the following constructs:

c(t1, t2, t3) constructor application (parentheses optional for constructors without arguments)

Var variable (starts with capital letter)

<integer> integer literal

"string" string literal

(t1, t2, t3) tuple literal

t1::t2 cons list

[] empty list

[t1, t2, t3] list literal

(t : ty) ascription

The syntax of judgments in Sterling includes the following constructs:

t1 = t2 equality check

t1 != t2 inequality check

t1 < t2 less than for integers

t1 > t2 greater than for integers

t1 <= t2 less than or equal to for integers

t1 >= t2 greater than or equal to for integers

t1 + t2 = t3 integer addition

t1 - t2 = t3 integer subtraction

t1 * t2 = t3 integer multiplication

t1 / t2 = t3 integer division

t1 % t2 = t3 integer modulus

t1 ++ t2 = t3 append for two strings or two lists

t1, t2, t3 |{c}- t4 ~~> t5 projection

rel t1 t2 t3 relation judgment

!rel t1 t2 t3 negation of a relation judgment

Fixed judgments have all their rules defined in the module introducing them. We can declare a fixed ternary judgment frel as

Fixed Judgment frel : a b c

Rules for fixed judgments have the form

<premise judgment 1>
...
<premise judgment n>
===================== [<rule name>]
<conclusion judgment>

The line must include at least five equals signs. Rule names are expected to be unique.

Fixed judgments may be declared to be polymorphic by including type variables, written using capital letters, in their type declarations. For example,

Fixed Judgment rel : [A] A

declares a relation relating a list containing some type of element to a term of that type.

Extensible judgments have their rules extended by further extensions building on the module in which they are introduced. We can declare an extensible binary judgment erel as

Judgment erel : a b*

The * marks the primary component around which the relation is defined, required for each extensible judgment. Rules for extensible judgments have the form

<premise judgment 1>
...
<premise judgment n>
--------------------- [<rule name>]
<conclusion judgment>

The line must include at least five hyphens. Extensions building on the module introducing a relation may add new rules defining it, provided the conclusion of each new rule has its primary component built at the top level by a constructor introduced by the extension introducing the rule.

Extensible judgments declared with a primary component category from a module on which the current one builds (as opposed to a category the current module introduces) must declare a lifting rule, marked with * after the rule name. These often have the form of a copy rule,

|{cat}- B ~~> B_Proj
erel A B_Proj
--------------------- [Erel-Proj]*
erel A B

but may have any form. The conclusion must have a variable as its primary component argument.

Extensibella Stand-In Rules

When using Sterling to write specifications for Extensibella reasoning, we often need to specify stand-in rules that will be used to model the behavior of other, unknown extensions. These are written as regular extensible rules, but within an Extensibella_Stand_In environment:

Extensibella_Stand_In {
  <premises>
  ----------- [Erel-Q]
  erel <args>
}

These rules do not have any effect on a language's semantics and are only used in the generated Extensibella specification. All arguments to the conclusion must be unique variables; structured terms or repeated variable names are not allowed, as they would limit the applicability of this rule that is supposed to represent all unknown rules. As with default rules, stand-in rules can only be introduced by the module introducing the relation being defined.

Concrete Portion

Terminals are declared with names and regular expressions, such as

name_t /[a-z][a-zA-Z0-9_]*

Regular expressions are as expected. Special characters must be escaped using backslashes. Spaces are ignored by default, but can be included by backslash escaping them.

Terminals that will be ignored by all productions in the grammar can be declared as ignore terminals, such us

ignore /[\n\r\t\ ]+/

Strings matching ignore terminals may occur anywhere in a string being parsed by the grammar and will be skipped.

Nonterminals and productions are declared together, similar to how abstract syntax categories and constructors are declared. Nonterminals are also declared with the type of abstract syntax built for them:

nt<abs ty> ::= <prod 1>
            | <prod 2>
            | <prod 3>

When adding new productions to existing nonterminals, the nonterminal type is given without an abstract syntax type and the productions are prefaced with an ellipsis:

nt ::= ..
     | <new prod 1>
     | <new prod 2>

Productions have two parts, the signature and the abstract syntax term, joined by ~~>. Production signatures are built from terminals and nonterminals, which may be given names to reference them. For example,

backslash_t Name::name_t dot_t Body::term_c

might be the signature for a lambda abstract, where backslash_t, name_t, and dot_t are terminals and term_c is a nonterminal.

The syntax of terms for building abstract syntax in Sterling:

Name names of production elements from the signature

c(t1, t2, t3) abstract syntax constructors

"string" string literals

<integer> integer literal

$to_int(t) convert a string to an integer

t1[t2:t3] substring of a string with zero-based indices

t1::t2 cons list

[] empty list

[t1, t2, t3] list literal

(t1, t2, t3) tuple literal

Note that the type of a named production element is string for a terminal and, for a nonterminal, the type of its built abstract syntax tree.

Main Function Portion

The main function portion of the language uses functions to tie together the parsing and judgments written in the other two parts of the language. Functions are declared as

Function <name> : <name 1 : type 1> <name 2 : type 2> <name 3 : type 3> -> <return type>{
   <body>
}

Each parameter name must be capitalized. The types may be the types in the semantic portion of the language as well as bool and unit types in only this portion of the language.

The starting point for running a Sterling module is the main function with type [string] -> int.

Each function's body is a single expression:

true Boolean true

false Boolean false

e1 || e2 Boolean disjunction

e1 && e2 Boolean conjunction

If e1 Then e2 Else e3 conditional

e1 + e2 integer addition

e1 - e2 integer subtraction

e1 * e2 integer multiplication

e1 / e2 integer division

e1 % e2 integer modulus

<integer> integer literal

e1 < e2 integer less than

e1 > e2 integer greater than

e1 <= e2 integer less than or equal to

e1 >= e2 integer greater than or equal to

e1 = e2 equality comparison

"string" string literals

fun(e1, e2, e3) function call

e1[e2] zero-based list indexing

e1 ++ e2 list and string concatenation

Let <Var> := e1 In e2 let binding (variable must be capitalized)

<Var> variables referring to arguments to the function or let-bound variables

e1 Before e2 expression sequencing (first expression must have type unit)

Print e print the string form of the expression's value

Write e1 to e2 write the string form of expression e1's value to a file with name e2

Read e read the contents of a file

Parse <nt> from e parse the given nonterminal from the string expression (detailed discussion below)

Derive {<judgment>} for <input vars> assigning [<output vars>] derive the given relation for the given arguments, setting the output variables (detailed discussion below)

Parsing takes a nonterminal name, for the nonterminal to parse, and a string expression for the string from which to parse it. A Parse returns a triple of type (bool, <abstract syntax>, string). These are

whether the parse was successful (true) or not
the abstract syntax produced from the parse tree (only defined if the first element is true)
errors describing why the string didn't parse (only defined if the first element is false)

A common structure for parsing, then, is

Let Success, A, Err := Parse nt from String
In
  If Success
  Then <use A>
  Else <use Err>

In deriving judgments, we need to treat them as functions so we identify both input variables and output variables. Input variables take their values from the current environment and output variables produce values. For example, in

      Derive {typeOf [] Tm Ty} for Tm assigning [Ty]

This reads the value of Tm, then derives a type for it. Deriving a judgment produces a tuple of n + 1 values, where n is the number of output variables. The first element of the tuple has type bool and tells whether the derivation is successful. The other elements in the tuple are the values found for the output variables. In the example, the output type of the expression would be (bool, ty).

Miscellaneous

Names

All defined names start with a lowercase letter, followed by zero or more lowercase letters, capital letters, digits, and underscores. All variables must start with capital letters.

Qualified Names

All names introduced in any portion of the language are implicitly qualified by the module in which they are declared. For example, a type expr declared in a module mo:du:le has as its qualified name mo:du:le:expr. The non-qualified name, the short name, can be used as long as no other name for the same type of thing is in scope, in which case the qualified names must be used for both. For example, if two categories a:c and b:c are both in scope, they must be referenced by their qualified names. If instead b:c is a constructor, they do not interfere and may be referenced by their short names.

Built-In Types

Sterling has a few built-in types:

string
int
lists, written [ty]
n-ary tuples, written as comma-separated lists of types in parentheses (e.g. (int, expr, [int]))

Emacs Support

Sterling has modes written for Emacs for each part of the language, found in the Sterling GitHub repository. To use these, load them in your .emacs file with the following lines:

(load "path/to/sterling/ide/sterling.el")
(load "path/to/sterling/ide/sterling-conc.el")
(load "path/to/sterling/ide/sterling-main.el")

This will make Emacs know to open Sterling files using the appropriate mode automatically.

`c(t1, t2, t3)`	constructor application (parentheses optional for constructors without arguments)
`Var`	variable (starts with capital letter)
`<integer>`	integer literal
`"string"`	string literal
`(t1, t2, t3)`	tuple literal
`t1::t2`	cons list
`[]`	empty list
`[t1, t2, t3]`	list literal
`(t : ty)`	ascription

`t1 = t2`	equality check
`t1 != t2`	inequality check
`t1 < t2`	less than for integers
`t1 > t2`	greater than for integers
`t1 <= t2`	less than or equal to for integers
`t1 >= t2`	greater than or equal to for integers
`t1 + t2 = t3`	integer addition
`t1 - t2 = t3`	integer subtraction
`t1 * t2 = t3`	integer multiplication
`t1 / t2 = t3`	integer division
`t1 % t2 = t3`	integer modulus
`t1 ++ t2 = t3`	append for two strings or two lists
`t1, t2, t3 \|{c}- t4 ~~> t5`	projection
`rel t1 t2 t3`	relation judgment
`!rel t1 t2 t3`	negation of a relation judgment

`Name`	names of production elements from the signature
`c(t1, t2, t3)`	abstract syntax constructors
`"string"`	string literals
`<integer>`	integer literal
`$to_int(t)`	convert a string to an integer
`t1[t2:t3]`	substring of a string with zero-based indices
`t1::t2`	cons list
`[]`	empty list
`[t1, t2, t3]`	list literal
`(t1, t2, t3)`	tuple literal

`true`	Boolean true
`false`	Boolean false
`e1 \|\| e2`	Boolean disjunction
`e1 && e2`	Boolean conjunction
`If e1 Then e2 Else e3`	conditional
`e1 + e2`	integer addition
`e1 - e2`	integer subtraction
`e1 * e2`	integer multiplication
`e1 / e2`	integer division
`e1 % e2`	integer modulus
`<integer>`	integer literal
`e1 < e2`	integer less than
`e1 > e2`	integer greater than
`e1 <= e2`	integer less than or equal to
`e1 >= e2`	integer greater than or equal to
`e1 = e2`	equality comparison
`"string"`	string literals
`fun(e1, e2, e3)`	function call
`e1[e2]`	zero-based list indexing
`e1 ++ e2`	list and string concatenation
`Let <Var> := e1 In e2`	let binding (variable must be capitalized)
`<Var>`	variables referring to arguments to the function or let-bound variables
`e1 Before e2`	expression sequencing (first expression must have type `unit`)
`Print e`	print the string form of the expression's value
`Write e1 to e2`	write the string form of expression `e1`'s value to a file with name `e2`
`Read e`	read the contents of a file
`Parse <nt> from e`	parse the given nonterminal from the string expression (detailed discussion below)
`Derive {<judgment>} for <input vars> assigning [<output vars>]`	derive the given relation for the given arguments, setting the output variables (detailed discussion below)