Plasma Language Reference

Paul Bone
<paul@plasmalang.org>
version 0.1, April 2017
Draft. Copyright © 2015-2017 Plasma Team License: CC BY-SA 4.0
Table of Contents

As the language is under development this is a working draft. Many choices may be described only as bullet points. As the language develops these will be filled out and terms will be clarified.

Lexical analysis and parsing

The "front end" passes of Plasma compilation work as follows:

  • Tokenisation converts a character stream into a token stream.

  • Parsing converts the token stream into an AST.

  • AST→Core transformation converts the AST into the core representation. This phase also performs symbol resolution, converting textual identifiers in the AST into unique references.

Lexical analysis

  • Input files are UTF-8

  • Comments begin with a # and extend to the end of line.

  • Curly braces for blocks/scoping

  • Whitespace is only significant when it separates two tokens what would otherwise form a single token

  • Statements and declarations are not delimited. The end of a statement can be determined by the statement alone. Therefore: there are no statement terminators or separators (such as semicolons in C) nor significant whitespace (as in Python or Haskell).

  • String constants are surrounded by double quotes and may contain the following escapes. \n \r \t \v \f \b \\. Escaping the double quote character is not currently supported, using character codes is not currently supported. Escaping any other character prints that character as is; this allows \' to work as many programmers may expect, even though it’s not necessary.

Parsing

Plasma’s EBNF is given in prices throughout this document as concepts are introduced. However the top level and some shared definitions are given here. In this ENBF syntax I use ( and ) to denote groups and ? + and * to denote optional, one or more, and zero or more.

Plasma := ModuleDecl ToplevelItem*

ModuleDecl := module ident

ToplevelItem := ExportDirective
              | ImportDirective
              | TypeDefinition
              | ResourceDefinition
              | FuncDefinition

ModuleQualifiers := ( ident . )*
QualifiedIdent := ModuleQualifiers ident

IdentList := ident ( , ident )*

A note on case and style.

Sometimes it is necessary to use case to distinguish symbols in different namespaces that may appear in the same expression. For example type names and type variables can both appear in type expressions. In other situations there is no requirement but it can be useful to adopt a convention that makes it easier to read code.

Disambiguation based on case is done as part of AST→Core transformation.

Plasma either requires or suggests the following cases in the following situations.

Requirement

Suggestion

Notes

Variable

first letter lower

lower_case

Function Name

first letter lower

lower_case

Module Name

-

UpperCase

Case insensitive

Type Name

first letter upper

UpperCase

Type Variable

first letter lower

lower_case

Data constructor

first letter upper

UpperCase

to distinguish construction from function application or variable use.

Field selector

first letter lower

lower_case

Must be the same as function names.

Interface

?

UpperCase

Instance

?

lower_case

not first class, but may appear in expressions.

Resources

-

lower_case

Note that there may be more symbol namespaces in the future.

The rationale for these decisions is:

Variables, functions and field selectors

The most common symbols should be in lower case and use _ to separate words are preferred, but not enforced.

Module names

It is useful to visually distinguish module names from other symbols that can appear within expressions. Currently module names can only be used to module-qualify other symbols, but this may change in the future. This may also become a requirement rather than a suggestion.

Types and type variables

Type variables must be distinguished from types. Note that variables don’t need to be distinguished from functions as this is available from context: free variables do not exist and a bound variable has the same semantics as a defined function name.

Therefore plasma uses case to distinguish between type names (uppercase first letter) and type variables (lowercase first letter).

List(t)

Is a list of t where List is the list type and t is a type variable, it may stand for any type. Note that this is the same as Haskell but the opposite of Mercury.

Data constructors

Code that does different things should look different. Therefore data construction should stand apart from function calls, and hence it is useful if data constructors to begin with capital letters. It could be argued that the same is true for field selection. Suggestions welcome.

Interfaces and Interface Instances

Interfaces are to instances as types are to values, This is reflected in our decision to suggest that interfaces should be CamelCase and instances lower_case. Also, instances and module qualifiers can both appear within expressions as a prefix to another symbol. Instances will also appear distinct from module qualifiers.

Environment

The environment is a concept we will consider for Plasma’s scoping rules. The environment maps symbols to their underlying items (modules, types, functions, variables etc). Even though no environment exists at runtime, and the compile-time structure is an implementation detail of the compiler (pre.env), it is useful to think of scoping in these terms, as it explains most scoping behaviours.

Some languages allow overloading of symbols, usually based on a symbol’s type and sometimes on it’s arity. Plasma does not support any overloading.

Scopes

When a new name is defined it is added to the current environment.

print!(x)     # x does not exist.
x = "hello"   # x (a variable) is added to the environment.
print!(x)     # We may now refer to x.

When a nested block starts, it creates a new environment based upon the old environment.

x = "hello"
if (...) {
    print!(x)   # Ok
}

When a nested block ends, the original environment is restored.

if (...) {
    x = "Hello"
    print!(x)   # Ok
}
print!(x)     # Error
Variables

When all paths (of a branching statement) assign the same variable, then that variable is added to the original environment. No earlier declaration is required. This only makes sense for branching statements, and only happens with variables.

if (...) {
    x = "Hello"
    y = "Hi"
} else {
    x = "Goodbye"
}
print!(x)     # Ok
print!(y)     # Error
Other symbols

This is an error for other symbols.

if (...) {
    import SortedListSet as Set
    # Some code using Set
} else {
    import RBTreeSet as Set
    # Some code using Set
}
# Cannot use set here.

This is a desirable feature however it complicates the type system. It might be added in the future.

Shadowing

Shadowing refers to a new binding with the same name as an old binding being permitted and dominant in an inner or later scope. Shadowing is not permitted for variables at all. It is permitted for other symbols.

Note
TODO: Decide on rules for a symbol of one type overriding a symbol of another type. For example it should probably be an error for a module import to shadow an interface declaration. But it’s probably okay for a variable to overload a function, unless that function is defined within another function (a closure).

Variables

A variable cannot shadow another variable.

x = 3
x = 4   # Error

if (...) {
  x = 5 # Error
}
Note
We are considering a special syntax to use with variables that allows shadowing.

Other symbols

Symbols other than variables allow shadowing, for example module imports can create shadowing of their contents (types, functions etc). Including when import is used with a wildcard. Therefore we can use a different Set implementation in the inner scope:

import SortedListSet as Set
...
# some code
...
{
  import RBTreeSet as Set
  ...
  # some code using RBTreeSets
  ...
}
...
# back to SortedListSet
...

(Yes, module imports may appear within function bodies and so-on.)

However, a binding that cannot be observed such as:

import SortedListSet as Set
import RBTreeSet as Set

Doesn’t make sense, and the compiler should generate a warning.

TODO: Figure out if context always tells us enough about the role of a symbol that modules do not need to shadow types and constructors. I suspect this is true but I’ll have to define the rest of the language first.

Namespaces

The environment maps names to items. Names might be qualified and if so the qualifier is required to refer to that name. For example.

import Set
my_set1 = Set.new   # Ok
my_set2 = new       # Undefined symbol

Or they can be unqualified

import Set.new
my_set1 = Set.new   # Undefined symbol Set
my_set2 = new       # Ok

The name within the namespace does not need to correspond to the name as it was defined.

import Set.new as new_set
my_set = new_set    # Ok

This applies to all symbols except for variables, which can never be qualified. There is no syntax that would allow a variable to be defined with a qualifier.

Modules

Each file is a module, the file name must match the module name (case insensitive). By convention CamelCase is used.

Note
I prefer lower case filenames, but I also want these to match. Maybe I’ll grow to like CamelCase file names.

Each module begins with a module declaration.

module MyModule

Module Imports

ImportDirective := import QualifiedIdent
                 | import QualifiedIdent . *
                 | import QualifiedIdent as ident

Modules may be imported with an import declaration.

import RBTreeMap
import RBTreeMap as Map
import IO.getpid as getpid
import IO.*

import imports one or more symbols from a module. Lines one and two add a module name (RBTreeMap or Map, respectively) to the current environment. Line three imports only the getpid function from IO and names it getpid in the current environment. While line four imports everything in IO, adding them all to the current environment.

Code using symbols imported by lines one and two will require module qualification (either RBTreeMap or Map), while code using getopt (or other symbols from IO) will not.

A module cannot be used without an import declaration.

Module exports

ExportDirective := export IdentList
                 | export *

Symbols can be exported with export directives.

export my_function

If a module has no export directives then nothing is exported. Which probably makes the module useless.

To export everything from a module use a *.

export *

TODO: Syntax for exporting types abstractly & fully.

Types

The Plasma type system supports:

  • Algebraic types

  • parametric polymorphism (aka generics)

  • Abstract types

  • Other features may be considered for a later version

  • Type variables are lower case, type names begin with an uppercase letter (Haskell style).

See also Type system design which reflects more up-to-date ideas.

Type expressions refer to types.

TypeExpr := TypeName
          | TypeName '(' TypeExpr ( , TypeExpr )* ')'
          | TypeVar

TypeName := QualifiedIdent
TypeVar := ident

Type names must begin with an upper case first letter, and TypeVars with a lower case letter, otherwise they are both identifiers.

We can define new types using type definitions

TypeDefinition := type UpperIdent TypeParams? = OrTypeDefn ( '|' OrTypeDefn )*
TypeParams := '(' IdentList ')'
OrTypeDefn := ConstructorName
            | ConstructorName '(' TypeField ( , TypeField )+ ')'
TypeField := FieldName ':' TypeExpr
           | TypeExpr         # Not supported

ConstructorName := ident
FieldName := ident

TypeParams is a comma separated list of lowercase identifiers.

TypeField will need lookahead, so for now all fields must be named, but the anonymous name (_) is supported.

TODO: We use vertical bars to separate or types. Vertical bars mean "or" and are used in Haskell, but in C commas (for enums) and semicolons (for unions) are used. Which is best? Mercury uses semicolons as these mean "or" in Mercury.

TODO: We use parens around the arguments of constructors, like Mercury, and because fancy brackets aren’t required. However curly braces would be more familiar to C programmers.

Builtin types

How "builtin" these are varies. Ints are completely builtin and handled by the compiler where as a List has some compiler support (for special symbols & no imports required to say "List(t)") but operations may be via library calls.

  • Int

  • Uint

  • Int8, Int16, Int32, Int64

  • Uint8, UInt16, UInt32, UInt64

  • Char (a unicode codepoint)

  • Float (NIY)

  • Array(t)

  • List(t)

  • String (neither a CString or a list of chars).

  • Function types

These types are implemented in the standard library.

  • CString

  • Map(t)

  • Set(t)

  • etc…

User types

User defined types support discriminated unions (here a Map is either a Node or Empty), and generics (k and v are type parameters).

type Map(k, v) = Node(
                      m_key   : k,
                      m_value : v,
                      m_left  : Map(k, v),
                      m_right : Map(k, v)
                  )
                | Empty

TODO: Syntax will probably change, I don’t like , as a separator, I prefer a terminator, or nothing to match the rest of the language. Curly braces? | is also used as a separator here.

Types may also be defined abstractly, with their details hidden behind module abstraction.

Interfaces

Interfaces are a lot like OCaml modules. They are not like OO classes and only a little bit like Haskell typeclasses.

Interfaces are used to say that some type and/or code behaves in a particular way.

The Ord interface says that values of type Ord.t are totally ordered and provides a generic comparison function for Ord.t.

type CompareResult = LessThan | EqualTo | GreaterThan

interface Ord {
    type t

    func compare(t, t) -> CompareResult
}

t is not a type parameter but Ord itself may be a parameter to another interface, which is what enables t to represent different types in different situations; compare may also represent different functions in different situations.

We can create instances of this interface.

instance ord_int : Ord {
    type t = Int

    func compare(a : Int, b : Int) -> CompareResult {
        if (a < b) {
            LessThan
        } else if (a > b) {
            GreaterThan
        } else {
            EqualTo
        }
    }
}

Note that in this case each member has a definition. This is what makes this an interface instance (plus the different keyword), rather than an (abstract) interface. The importance of this distinction is that interfaces cannot be used by code directly, instances can.

Code can now use this instance.

r = ord_int.compare(3, 4)

Interfaces can also be used as parameter types for other interfaces. Here we define a sorting algorithm interface using an instance (o) of the Ord interface.

interface Sort {
    type t
    func sort(List(t)) -> List(t)
}

instance merge_sort(o : Ord) : Sort {
    type t = o.t
    func sort(l : List(t)) -> List(t) {
        ...
    }
}

merge_sort is an instance, each of its members has a definition, but it cannot be used without passing an argument (an instance of the Ord interface). A list of +Int+s can now be sorted using:

sorted_list = merge_sort(ord_int).sort(unsorted_list)
Note
This example is somewhat contrived, I think it’d be more convenient for sort to take a higher order parameter. But the example is easy to follow.

merge_sort(ord_int) is an instance expression, so is ord_int in the example above. Instance expressions will also allow developers to name and reuse specific interfaces, for example:

instance s = merge_sort(ord_int)
sorted_list = s.sort(unsorted_list)

More powerful expressions may also be added.

Instances can also be made implicit within a context:

implicit_instance merge_sort(ord_int)
sorted_list = sort(unsorted_list)

This is useful when an instance defines one or more operators, it makes using the interface more convenient. Suitable instances for the basic types such as Int are implicitly made available in this way.

Only one implicit instance for the given interface and types may be used at a time.

Resources

ResourceDefinition := 'resource' UpperIdent 'from' QualifiedIdent

This defines a new resource. The resource has the given name and is a child resource of the specified resource. SuperRes is the ultimate resource and is already defined, along with it’s child resource such as IO. See Handling effects below.

Code

Functions

FuncDefinition := func ident '(' ( Param ( , Param )* )? ')'
                      (-> TypeExpr (, TypeExpr)*)? Uses* Block

Param := ident ':' TypeExpr

# Uses denotes which resources a function may use.
Uses := uses IdentList
      | observes IdentList

Block := '{' Statement* '}'

TODO: Probably add support for naming return parameters

TODO: Consider adding optional perens to enclose return parameters.

TODO: More expressions and statements

Code is organised into functions.

A function has the following form.

func Name(arg1 : type1, arg2 : type2, ...) -> ret_type1, ret_type2
        Resources?
{
    Statement+
}

In the future if the types are omitted from a non-exported function’s argument list the compiler will attempt to infer them. For now all types are required.

TODO: Find a way that return parameters can be named. This will change the behaviour of functions WRT having the value of their last statement.

TODO: What if neither the name or type of a return value is specified?

Resources is optional and may either or both "uses" or "observes" clauses, which are either the uses or observes keywords followed by a list of one or more comma separated resource names.

Statements

Statement := Call               % An effectful/observing call
           | IdentList = Expr
           | 'return' TupleExpr
           | 'match' Expr '{' Case+ '}'

Call := ExprPart1 '!'? '(' Expr ( , Expr )* ')'

Case := Pattern '->' '{' Statement* '}'

Assignment

Statements may be assignments.

variable = expr

An expression may have more than one value.

var1, var2 = expr # expr returns a two items.

This works when the expression is a multi-valued call.

TODO: it should also work when the expression is for example a if or switch expression that returns multiple items. This is a property of the if or switch expression, not the assignment statement.

var1, var2 = if (...) then expr1, expr2 else expr3, expr4

Plasma is a single assignment language. Each variable can only be assigned to once along any execution path, and must be assigned on each execution path that falls-through (see Environment).

Call

A statement may also be a call. These calls don’t assign their result to anything and therefore they only make sense if they use some resource.

function_name!(Args);

Calls may also be expressions (see below), as an expression a call might still use or observe some resource. However only one call per statement may observe the same or a related resource, this ensures that effects happen in a clear order.

Return

return expr

return expr1, expr2
Note
We could have said that return isn’t necessary, instead making all expressions also statements, and a statement sequence’s value the value of the last statement in the sequence. However by introducing return we separate statements and expressions which simplifies some implementation details. We may revisit this in the future (maybe making it optional) but I kind of like having it.

Pattern matching

Pattern := Number
         | IdentLower
         | '_'          # Actually parsed as IdentLower
         | IdentUpper ( '(' Pattern ',' ( Pattern ',' )+ ')' )?

Pattern matching is also a statement (as well as an expression). Cases are tried in the order they are written, the compiler should provide a warning if a case will never be executed, or a value is not covered by any cases. All variables in the pattern (the LHS of the ) must be new.

match (n) {
  0 -> {
    beer = "There's no beer!"
  }
  1 -> {
    beer = "There's only one beer"
  }
  m -> {
    beer = "There are " ++ show(m) ++ " bottles of beer"
  }
}
print!(beer)

If a variable bound by one of the cases is used outside the match (like beer) then it must be bound by every case. If it is not used outside the match, then each cases' variable of the same name is "named apart" (see Environment).

Currently either all cases must have a return statement or none of them. Matches where some return and others do not will be added in the future.

If-then-else

if (expr) {
    statements
} else if (expr) {
    statements
} else {
    statements
}

There may be zero or more else if parts.

Plasma’s single-assignment rules imply that if the "then" part of an if-then-else binds a non-local variable, then there must be an else part that also binds the variable (or does not fall-through). Else branches aren’t required if the then branch does not fall-through or does not bind anything (it may have an effect).

Loops

Note
Not implemented yet.
Note
I’m seeking feedback on this section in particular.
# Loop over both structures in a pairwise way.
for [x <- xs, y <- ys] {
    # foo0 and foo form an accumulator starting at 0.  The value of foo
    # becomes the value of foo0 in the next iteration.
    accumulator foo0 foo initial 0

    # The loop body.
    z = f(x, y)
    foo = foo0 + bar(x)

    # This loop has three outputs.  "list" and "sum" are names of
    # reductions.  Reductions are instances of the reduction
    # interfaces.  They "reduce" the values produced by each iteration
    # into a single value.
    output zs = list of z
    output sum = sum of x
    # foo is not visible outside the loop, an output is required to
    # expose it.  value is a keyword, it is handled specially and
    # simply takes the last value encountered.
    output foo_final = value of foo
}
Note
the accumulator syntax will probably change after the introduction of some kind of state variable notation.

TODO: Introduce a more concise syntax for one-liners and expressions. Similar in succinctness to using map and foldl calls.

The loop will iterate over corresponding items from multiple inputs. When they’re not of equal length the loop will stop after the shortest one is exhausted. This decision allows them to be used with a mix of finite and infinite sequences.

Looping over the Cartesian combination of all items should also be supported (syntax not yet defined, maybe use &). This is equivalent to using nested loops in many other languages.

Valid input structures are: lists, arrays and sequences. Sequences are coroutines and therefore can be used to iterate over the keys and values of a dictionary, or generate a list of numbers.

TODO: Possibly allow this to work on keys and values in dictionaries. If the keys are unmodified during the loop then the output dictionary can be rebuilt more easily, its structure doesn’t need to change. Lua has the ability to require keys to be sorted, or to drop this requirement.

The output declarations include a reduction. This is how the loop should build the result.

TODO: Reduction isn’t a good word for it, since the output type can be either a scalar or a vector.

The reduction can be completely different from the type of any of the inputs. This builds an array from a list (or other ADT). This uses the array reduction.

for [x <- xs] {
    y = f(x)
    output ys = array of y
}

Many reductions will be possible: array, list, sequence, min, max, sum, product, concat_list. Developers will be able to create their own as these are interfaces.

Loops are implemented in terms of coroutines. Coroutines return the values for the inputs and the loop body and coroutines handle building the value of the outputs (list and sum are coroutines above). Coroutines offer the most flexibility as some of their state is kept on the stack.

Simpler implementations should be used as an optimisation when it is possible. In these cases some loops may be optimised to calls to map or foldl, or even simpler inline code.

Auto-parallelisation (a future goal) will work better with reductions that are known to be either:

  • Order independent

  • Associative / commutative, but whose input type is the same as the output

  • Mergable, with a known identity value.

Accumulators are implemented more directly (not coroutines). However they require the iterations to be processed in a specific order and may inhibit parallelisation. A dependency analysis on the body and separating out the code for each accumulator may mitigate this, especially if it can be combined with the same analyses as reductions above.

Expressions

Expressions are broken into two parts. This allows us to parse call expressions properly, with the correct precedence and without a left recursive grammar. Binary operators are described as a left recursive grammar, but are not implemented this way, their precedence rules are documented below.

TupleExpr := Expr ( ',' Expr )*

Expr := Expr BinOp Expr
      | UOp Expr
      | ExprPart1 '!'? '(' Expr ( , Expr )* ')'         % A call or
                                                        % construction
      | ExprPart1 '[' '-'? Expr ( '..' '-'? Expr )? ']' % array access
      | ExprPart1

ExprPart1 := '(' Expr ')'
           | '[' ListExpr ']'
           | '[:' TupleExpr? ':]'       # An array
           | QualifiedIdent             # A value
           | Const                      # A constant value

BinOp := '+'
       | '-'
       | '*'
       | '/'
       | '%'
       | '++'
       | '>'
       | '<'
       | '>='
       | '<='
       | '=='
       | '!='
       | 'and'
       | 'or'

UOp := '-'      # Minus
     | 'not'    # Logical negation

UOp operators have higher precedence than BinOp, BinOp precidence is as follows, group 1: * / %, group 2: + - group 3: < > ⇐ >= == !=, group: 4-7: and or ++ ,

Lists have the following syntax (within square brackets)

ListExpr := e
          | Expr ( ',' Expr )* ( '|' Expr )?

Examples of lists are:

# The empty list
[]

# A cons cell
[ head | tail ]

# A list 1, 2, and 3 are "consed" onto the empty list.
[ 1, 2, 3 ]

# Consing multiple items at once onto a list.
[ 1, 2, 3 | list ]

Arrays elements may be access by subscripting the array. Eg a[3] will retrieve the 3rd element (1-based). A dash before the subscript expression will count backwards from the end of the array, a[-2] is the second last element. This syntax currently clashes with unary minus and so is currently unimplemented. Array slices will use the .. token and are also unimplemented.

TODO: Streams

Any control-flow statement is also an expression.

x = if (...) { statements } else { statements }

In this case the branches cannot bind anything visible outside of themselves, and the value of a branch is the value of the last statement in that branch.

TODO: Pattern matching expressions.

Ideas

These are just ideas at this stage, they are probably bad ideas.

If a multi-return expression is used as a sub-expression in another context then that expression is in-turn duplicated.

x, y = multi_value_expr + 3

is

x0, y0 = multi_value_expr
x = x0 + 3
y = y0 + 3

Therefore calls involved in these expressions must not "use resources".

Another idea to consider is that a multiple return expression in the context of function application applies as many arguments as values it returns. We probably won’t do this.

... = bar(foo(), Z);

Is the same as

x, y = foo();
... = bar(x, y, z);

Handling effects (IO, destructive update)

Plasma is a pure language, we need a way to handle effects like IO and destructive update. This is called resources. A function call that uses a resource (such as print()), may only be called from functions that declare that they use a resource. This means that a callee cannot use a resource that a caller doesn’t expect (resource usage is transitive) and anyone looking at a functions' signature can tell that it might use a resource.

A resource usage declaration looks like:

func main() -> Int uses IO

Here main() declares that it uses (technically may use) the IO resource. Resources can be either used or observed; and a function may use or observe any number of resources (decided statically). An observed resource may be read but is never updated, a used resource may be read or updated. This distinction allows two observations of a resource to commute (code may be re-arranged during optimisation), but two uses of a resource may not commute.

Developers may declare new resources, the standard library will provide some resources including the IO resource. Examples of IO 's children might be Filesystem and Time, Filesystem might have children for open files (WIP), although none of these have been decided / implemented.

A call is valid if:

Callee is Pure

Callee may Observe

Callee may Use

Caller is Pure

Y

N

N

Caller may Observe

Y

Y

N

Caller may Use

Y

Y

Y

You’ll find that this is very intuitive. It’s shown in a table for completeness.

Resource hierarchy

Resources form a hierarchy (not yet defined). For a call to be valid either the resource, or its parent must be available in the caller. For example if mkdir() uses the Filesystem resource, which is a child of IO then any caller that uses IO can call mkdir().

Temporary resources (NIY)

Some resources can be creating and destroyed, and rather than being a part of their parent always (Filesystem is always a part of IO) they are subsumed by their parent instead. For example an array uses some memory as its resource, that memory is allocated and freed when the array is initialised and then goes out of scope (it is unique). But if that the memory resource is created and destroyed within the same function, it’s caller does not need the uses declaration, memory and possibly some other resources are special cases.

Resources in statements

Every call that uses a resource must have the ! suffix. For example:

    print!("Hello world\n")

This makes it clear to anyone reading the code to beware something happens, changes or might be observed to have happened or have changed. This is also the entire reason to have it in the language, it serves no other function, but the compiler will make sure that it is present on every call that either uses or observes something.

Multiple calls with ! may be used in the same statement, provided that their resources do not overlap, or they are all observing the resource and not modifying it. (Note that we are debating) this at the moment).

Commutativity of resources

Optimisation may cause code to be executed in a different order than written. The following reorderings of two related (ancestor/descendant) resources are legal.

None

Observe

Use

None

Y

Y

Y

Observe

Y

Y

N

Use

Y

N

N

Non-related resources may be reordered freely.

Higher order code (NIY)

This aspect of Plasma is currently under consideration.

  • Higher order functions need to handle resources, otherwise their usefulness is reduced.

  • Resource usage from such code needs to be safe (WRT order of operations).

  • We want to encourage polymorphism here, otherwise people will write higher-order abstractions that can’t be used with resources.

  • We’d prefer to make code concise that isn’t intended to be used with resources, but ought to be resource-capable anyway.

Proposal 1

Higher order values may have uses/observes declarations (added to their type) values without such declarations are pure. All higher order calls have the usual ! sigil and the statement rules apply.

Map over list looks like:

func map(f : a -> b using r, l : List(a)) -> List(b) using r {
    switch (l) {
      case []       -> {
        return []
      }
      case [x0 | xs0] -> {
        x = f!(x0)
        xs = map!(f, xs0)
        return [x | xs]
      }
    }
}

Note that the calls to f and map must be in separate statements.

This has the disadvantage that it is not as concise, and that people who aren’t planning to use resources, won’t write resource-capable code, if that code is in a library it may be annoying to modify if it needs to be used with a resource later.

Other proposals

There are several other ideas and their combinations that may help.

  • All higher order code implicitly uses resources, a function like map therefore also uses that resource since it contains such calls. Unless there’s a way to show this in the function’s signature I’m not fond of it. Unless-unless we leave that to type inference.

  • Require all higher-order code to handle resources, users may feel that the compiler is being overly-pedantic.

  • Higher order calls are exempt from the one-resource-per-statement rule. Making the code more concise (it still includes a !).

  • Either expressions have a well-ordered declarative semantics or

  • resources must be declared as don’t-care ordering so they can be placed in the same statements.

Linking to and storing as data (NIY)

Linking a resource with a real piece of data, such as a file descriptor, is highly desirable. Likewise putting such data inside a structure to be used later, such as a pool of warmed-up database connections, will be necessary.

There are a couple of ideas. We could add information to the types to say that they are resources and what their parent resource type is. So that the variable can stand-in for the resource.

type Fd =
    resource from Filesystem

func write(Fd, ...) uses Fd