|
This document presents the syntax and semantics of the Nemerle language in a semi-formal way. It is not meant to be a tutorial.
We often refer to .NET terminology [FIXME: reference it].
Programs are written using the Unicode character set. Every Nemerle source file is reduced to a sequence of lexical units (tokens) separated by sequences of whitespace characters (blanks).
There are five classes of lexical tokens:
    /* A comment. */
    // Also a comment
    foo                 // identifier
    foo_bar foo' foo3   // other identifiers
    42                  // integer literal
    0x2a                // hexadecimal integer literal
    0o52                // octal integer literal
    0b101010            // binary integer literal
    'a'                 // character literal
    '\n'                // also a character literal
    "foo\nbar"          // string literal
    @"x\n"              // same as "x\\n"
    @if                 // keyword used as identifier
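As a cross-check of the integer literal forms shown in the example, the following sketch uses Python as a stand-in (it is not Nemerle code). It relies on the fact that Python's `int(text, 0)` happens to accept the same `0x`, `0o` and `0b` prefixes; the helper name `literal_value` is made up for this illustration.

```python
# Illustration in Python, not Nemerle: the 0x / 0o / 0b prefixes used in the
# example coincide with the prefixes that int(text, 0) understands.
LITERALS = ["42", "0x2a", "0o52", "0b101010"]


def literal_value(text):
    """Return the numeric value of a decimal, hex, octal or binary literal."""
    # Base 0 asks int() to infer the base from the prefix.
    return int(text, 0)


values = [literal_value(text) for text in LITERALS]
```

All four spellings denote the same number.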
Spaces, vertical and horizontal tabulation characters, new-page characters, new-line characters and comments (collectively called blanks) are discarded, but can separate other lexical tokens.
A traditional comment begins with /* and ends with */. An end-of-line comment starts with // and ends with the line terminator (the ASCII LF character).
Type variables occur in polymorphic type definitions. A type variable can be any valid identifier.
Ordinary identifiers consist of letters, digits, underscores and apostrophes, but cannot begin with a digit or an apostrophe. An identifier may be quoted with the @ character, which is stripped. Quoting removes any lexical and syntactic meaning from the characters that follow, up to the next blank, which lets the programmer use keywords as identifiers.
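The rule for ordinary identifiers can be approximated with a regular expression. The sketch below is written in Python purely as a modelling language and assumes, for brevity, that "letters" means ASCII letters only; the names `is_ordinary_identifier` and `unquote` are hypothetical helpers, not part of any Nemerle tool.

```python
import re

# Assumption: "letters" is narrowed to ASCII here; the real lexer works on Unicode.
ORDINARY_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_']*")


def is_ordinary_identifier(text):
    """True if text is built from letters, digits, _ and ' and starts with a letter or _."""
    return ORDINARY_IDENTIFIER.fullmatch(text) is not None


def unquote(text):
    """Strip a leading @, as in @if, leaving the plain identifier text."""
    return text[1:] if text.startswith("@") else text
```

Under these assumptions `foo'` and `foo3` are accepted, while `3foo` and `'foo` are rejected because of their first character.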
There is an important difference between identifiers starting with the underscore character _ and all other identifiers. When you define a local value whose name starts with _ and do not use it, the compiler does not complain. It does warn about other unused values, though.
Symbolic identifiers consist of the following characters: =, <, >, @, ^, |, &, +, -, *, /, $, %, !, ?, ~, ., :, #. Symbolic identifiers are treated like standard identifiers, except that they are always treated as infix operators.
The following identifiers are used as keywords, and may not be used unquoted in any other context: [[FIXME: update this list.]] _, abstract, and, array, as, base, class, const, def, else, ensure, enum, extern, finally, fun, if, in, interface, internal, let, macro, match, module, mutable, namespace, new, null, using, out, private, protected, public, throw, ref, require, sealed, static, struct, then, this, try, tymatch, type, variant, void, volatile, where, with.
The following infix identifiers are reserved keywords: =, $, ?, |, <-, ->, =>, <[, ]>, &&, ||.
There are a few kinds of literals:
floating_point_literal ::= |
digits ::= { }
exponential_marker ::= e
exponential_marker ::= E
sign ::= +
sign ::= -
compilation_unit ::= |
A Nemerle program consists of one or more compilation units. Compilation units are text files with the .n extension. A compilation unit consists of namespace-related declarations and the types within them.
toplevel_declaration ::= using qualified_identifier ;
Adds the specified namespace (which, unlike in C#, can also be a type name) to the symbol search path. Every symbol, until the end of the current namespace or compilation unit (if not within a namespace), is also searched for in the location specified by this path.
toplevel_declaration ::= namespace IDENTIFIER = qualified_identifier ;
Define an alias for a namespace. After namespace Foo = Bar.Baz; any reference to Foo.bar will be expanded to Bar.Baz.bar.
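The expansion of a namespace alias can be pictured as a simple rewrite of the leading component of a qualified name. The following Python sketch is only a model of that rewrite; `expand_alias` and the `aliases` table are hypothetical names, and the actual compiler may resolve aliases differently.

```python
def expand_alias(name, aliases):
    """Replace the first component of a dotted name if it is a known alias."""
    head, dot, rest = name.partition(".")
    if head in aliases:
        return aliases[head] + dot + rest
    return name


# Models the declaration: namespace Foo = Bar.Baz;
aliases = {"Foo": "Bar.Baz"}
expanded = expand_alias("Foo.bar", aliases)
```

Names whose first component is not an alias are returned unchanged.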
toplevel_declaration ::= |
Put declarations within the specified namespace. Namespaces can be nested, creating a tree of namespaces.
toplevel_declaration ::= |
Define a new top level type.
This section lists grammar rules common to most other sections.
identifier_or_dummy ::= IDENTIFIER
identifier_or_dummy ::= _
In several places the _ keyword can be used to indicate that a parameter or return value is deliberately ignored. In such places _ stands for a freshly generated temporary name.
qualified_identifier ::= IDENTIFIER { . IDENTIFIER }
Identifiers can be qualified with namespaces.
Types are defined at the top level, within namespaces or within other types. Top level type names are prefixed with the namespace they are defined in. Nested type names are prefixed with the parent type name. Nesting affects accessibility of a type.
type_header ::= |
The type header is similar to its .NET counterpart. The main difference is the optional type_parameters list, defined below.
type_parameters ::= < TYPE_VARIABLE { , TYPE_VARIABLE } >
where_constraints ::= |
When defining a polymorphic type, one has to list its type variables in the declaration. The list can have the following form:
    class Foo <a, b>
An optional list of where clauses can be used to add constraints to the type variables (type coercion).
    where a : Nemerle.Collections.IEnumerable, IComparable
    where b : Nemerle.Collections.IDictionary
type_declaration ::= |
This type declaration creates an alias to another type.
type_declaration ::= |
Nemerle interfaces are similar to .NET interfaces.
type_declaration ::= |
A class definition is similar to its .NET counterpart.
type_declaration ::= |
A module is much like a class, but all module members are static. There is no need to place static attributes on module members. It is also not possible to create instances of module types.
type_declaration ::= |
A variant declaration consists of a type name and a list of bar-separated constructors enclosed in brackets.
variant_option ::= |
A constructor declaration describes a constructor associated with this variant type. A constructor may take an argument. The constructor name must be capitalized.
The semantics of attributes is the same as in C#.
attributes ::= |
attribute ::= new
attribute ::= public
attribute ::= protected
attribute ::= internal
attribute ::= private
attribute ::= abstract
attribute ::= sealed
attribute ::= override
attribute ::= static
The following members are allowed in a class or module body:
type_member ::= |
type_member ::= |
type_member ::= |
field_definition ::= |
Unless the optional mutable keyword is used, the field can be modified only inside the constructor.
interface_member ::= |
The new keyword is necessary when the declared method hides a method inherited from another interface.
method_definition ::= |
This is the definition of a method within a class or module. The program entry point is the static method Main.
method_type_parameters ::= < TYPE_VARIABLE { , TYPE_VARIABLE } >
The declaration of a polymorphic method needs its type variables listed after the identifier.
method_header ::= IDENTIFIER [ method_type_parameters ] ( method_parameters ) : type [ where_constraints ] method_implements
This is the declaration of a method. Unlike in C#, the type is specified after the parameter list.
method_header ::= |
The special method named this specifies a constructor. This declaration cannot specify a method type, and the method has to have type void.
method_implements ::= |
method_parameter ::= |
A method parameter is a pair consisting of an identifier or _ and its type specification. The type can be omitted in local function definitions.
method_parameters ::= |
Method parameters are a comma-separated list of parameter specifications.
method_body ::= = extern STRING_LITERAL ;
A method body can be linked to an external function, for example:
    static ps (s : string) : void = extern "System.Console.Write";
This feature is used to give meaning to infix operators. It is not yet fully supported and should not be considered stable.
method_body ::= ;
Empty method body (a ;) is a method declaration.
method_body ::= |
This is the classical method definition.
Type expressions relate to type declarations much like function calls relate to function and value definitions. Type declarations define the ways types can be constructed, and type expressions define actual types.
Types are both a static and a dynamic characterization of values. The static type of an expression depends on its building blocks and is defined in the paragraph describing the given expression. The dynamic (runtime) type is bound to a value at the moment it is created, and remains bound until the value is garbage collected.
The type system is modeled after the .NET Generics design, except for tuple and function types, which are new but can easily be simulated using generics.
primary_type ::= |
A type constructor (defined with a type declaration) can be applied to zero or more arguments, forming a type expression. The number of type arguments in a type application must match the number of type arguments in the definition. Moreover, the actual type arguments must satisfy the where constraints imposed on the formal type arguments.
primary_type ::= TYPE_VARIABLE
Refers to the type substituted for the given type variable. The type variable has to be defined (bound, quantified) before it is used. A type variable can be defined in type arguments or in a method header (of a global or local function).
primary_type ::= ( type )
This construct has no semantic meaning -- it exists only to enforce particular syntax decomposition.
primary_type ::= void
This is mostly an alias for System.Void, a type with exactly one inhabiting value. That value is, however, first class: it can be passed as a function parameter as well as returned from functions.
The name comes from System.Void, but should in fact be unit.
primary_type ::= ref primary_type
primary_type ::= out primary_type
These are for parameters passed by reference. This is not implemented yet, but will have semantics similar to those in C#.
type ::= |
Constructs a product (tuple) type. This operator is not associative, which means that any two of the following types are different:
    int * int * int
    (int * int) * int
    int * (int * int)
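The non-associativity of the product operator can be mimicked with nested Python tuples, which are likewise distinguished by their nesting. This is only an analogy in Python, not Nemerle code; the `shape` helper is invented for the illustration and assumes every leaf is an int.

```python
def shape(value):
    """Describe the nesting of a tuple, writing 'int' for every leaf."""
    if isinstance(value, tuple):
        return tuple(shape(item) for item in value)
    return "int"


flat = (1, 2, 3)            # models int * int * int
left = ((1, 2), 3)          # models (int * int) * int
right = (1, (2, 3))         # models int * (int * int)

shapes = {shape(flat), shape(left), shape(right)}
```

The three values carry the same leaves but have three distinct shapes, just as the three types are pairwise different.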
type ::= |
Constructs a function type with the specified argument and return types, respectively. The -> operator is right associative, which means that the following types are equivalent:
    int -> int -> int
    int -> (int -> int)
Multi-argument function types are written using tuple notation. For example, after the local declaration:
    def some_function (a : int, b : string) : float { ... }
the expression some_function has type int * string -> float.
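Right associativity of -> means that an unparenthesised chain nests to the right. The Python sketch below models just that grouping; `parse_arrow` is a hypothetical helper that handles only chains without parentheses and treats everything between arrows (such as `int * string`) as an opaque argument type.

```python
def parse_arrow(text):
    """Group 'a -> b -> c' as ('a', '->', ('b', '->', 'c')).

    Simplification: parentheses are not supported; each segment between
    arrows is kept as an opaque string.
    """
    parts = [part.strip() for part in text.split("->")]
    result = parts[-1]
    # Fold from the right so that the rightmost arrow binds first.
    for part in reversed(parts[:-1]):
        result = (part, "->", result)
    return result


curried = parse_arrow("int -> int -> int")
tupled = parse_arrow("int * string -> float")
```

The first result is nested on the right-hand side, matching int -> (int -> int); the second keeps the tuple type as the single argument of one arrow.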
These are used in expressions and patterns.
literal ::= true
literal ::= false
These literals have type bool and represent the true and false boolean values, respectively.
literal ::= null
Represents the null reference, one that does not refer to any object. It possesses the types of all reference types, so it can be used in any context where a reference type is required. It does not, however, possess the void type or any value type (like System.Int32 or System.Single).
literal ::= ( )
Indicates returning no value. It is the only possible value of type void. See also void type.
literal ::= STRING_LITERAL
Represents a string constant. Nemerle supports two forms of string literals:
A regular string literal consists of zero or more characters enclosed in double quotes. It may include simple escape sequences (such as \n for the newline character) as well as hexadecimal and Unicode escape sequences (see character literals for details).
A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. The characters between the double quotes are taken verbatim; the only exception is the sequence "", which denotes a single " character. Simple, hexadecimal and Unicode escape sequences are not recognized in verbatim string literals. A verbatim string literal may span multiple lines.
Examples:
    def s1 = "Nemerle string !";         // Nemerle string !
    def s2 = @"Nemerle string !";        // Nemerle string !
    def s3 = "Nemerle\tstring !";        // Nemerle string !
    def s4 = @"Nemerle\tstring !";       // Nemerle\tstring !
    def s5 = "I heard \"zonk !\"";       // I heard "zonk !"
    def s6 = @"I heard ""zonk !""";      // I heard "zonk !"
    def s7 = "\\\\trunk\\ncc\\ncc.exe";  // \\trunk\ncc\ncc.exe
    def s8 = @"\\trunk\ncc\ncc.exe";     // \\trunk\ncc\ncc.exe
    def s9 = "\"Nemerle\"\nstring\n!";   // "Nemerle"
                                         // string
                                         // !
    def s10 = @"""Nemerle""              // same as s9
    rocks !";
String s10 is a verbatim string literal that spans 3 lines.
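The decoding rule for verbatim string literals is small enough to model directly. The Python sketch below is an illustration, not Nemerle code; `decode_verbatim` is a hypothetical helper that receives the text between the opening @" and the closing " and applies the single exception mentioned in the description: a doubled quote stands for one quote.

```python
def decode_verbatim(body):
    """Decode the body of a verbatim literal: only "" is special."""
    # Backslashes are left untouched; escape sequences are not recognized.
    return body.replace('""', '"')


s6 = decode_verbatim('I heard ""zonk !""')
s8 = decode_verbatim(r"\\trunk\ncc\ncc.exe")
```

Here `s6` and `s8` model the values of the examples with the same names.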
literal ::= NUMBER_LITERAL
Represents a value of one of the numeric types. See literals for details on how particular numeric types are written.
literal ::= CHARACTER_LITERAL
A character literal consists of one character enclosed in single quotes (' ') or an escape sequence of the form '\X', where X can be one of the following: [FIXME : characters with (N) are not implemented yet (will they ?)]
It has type char.
Primary expressions are a grammar category covering expressions that have a closed structure and are otherwise simple. Primary expressions and plain expressions do not differ at the semantic level.
primary_expr ::= |
The value and type of a literal expression are the value and type of the respective literal.
primary_expr ::= |
The result of this expression is the variable itself (not its value). [[FIXME: hem?!]] The type of this expression is ref 'a, where 'a is the type of the variable being referenced.
primary_expr ::= this
This expression can only be used within non-static methods. It denotes a reference to the current instance of the class that owns the method.
An expression like this.foo can be shortened to foo, unless that would make the identifier ambiguous with a variable in the same lexical scope.
primary_expr ::= ( expr )
A grouping expression enforces a particular syntactic decomposition of an expression.
primary_expr ::= |
This expression performs a dynamic type coercion. The check is done at run time, and System.InvalidCastException is thrown if the coercion cannot be performed. If it succeeds, the type of this expression is type.
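The run-time behaviour of such a coercion can be modelled as a checked identity function. The sketch below is in Python, not Nemerle; `InvalidCast` is a made-up stand-in for System.InvalidCastException and `dynamic_cast` is a hypothetical helper.

```python
class InvalidCast(Exception):
    """Stand-in for System.InvalidCastException in this model."""


def dynamic_cast(value, target_type):
    """Return value unchanged if it has the target type at run time, else raise."""
    if isinstance(value, target_type):
        return value
    raise InvalidCast(
        f"cannot cast {type(value).__name__} to {target_type.__name__}"
    )


text = dynamic_cast("hello", str)
```

The value itself is never converted; only the check can fail.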
primary_expr ::= |
This expression performs static type enforcement. It is checked at compile time, and an error is reported if the type of expr is not a subtype of type. Only type widening is allowed. If the check succeeds, the type of this expression is type.
primary_expr ::= primary_expr . IDENTIFIER
This expression refers to a field or method of the object denoted by primary_expr.
primary_expr ::= |
This expression creates a tuple of expressions whose types may differ. The type of the tuple is type_1 * ... * type_n, where type_i is the type of the i-th expression.
primary_expr ::= |
This expression refers to an indexed field (possibly indexed by several indices) of the object denoted by the leftmost expr; the second and further expr are the index values. The leftmost expr must denote an indexable object.
expr ::= |
The value and type are the same as those of the primary_expr being referred to.
expr ::= |
Calls a function with the given parameters. The type of the function call expression is the same as the type of the function's return value; that is, if the function's type is 'a -> 'b, then the type of the function call expression is 'b. The value of the whole expression is the return value of the function.
expr ::= primary_expr <- expr
Assigns a value to a variable. The left side of the assignment expression must evaluate to a mutable variable. The type of the assignment is always void.
expr ::= |
expr is matched sequentially against the patterns in the given match cases. If one of the patterns is consistent with the value of expr, the corresponding computation branch of that match case is evaluated. Patterns in all the match cases must be of the same type. The expressions forming the computation branches of all the match cases must be of the same type as well. The type of the match expression is the type of those computation branches.
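The sequential, first-match-wins evaluation can be modelled with a list of (test, branch) pairs. This is a Python sketch of the evaluation order only, not Nemerle code; `run_match` and `describe` are hypothetical names, and raising ValueError when nothing matches is an assumption, since the description does not cover that case.

```python
def run_match(value, cases):
    """Try each (matches, branch) pair in order and evaluate the first hit."""
    for matches, branch in cases:
        if matches(value):
            return branch(value)
    # Assumption: behaviour for an unmatched value is not specified here.
    raise ValueError("no match case is consistent with the value")


def describe(number):
    return run_match(number, [
        (lambda v: v == 0, lambda v: "zero"),
        (lambda v: v > 0, lambda v: "positive"),
        (lambda v: True, lambda v: "negative"),  # plays the role of _
    ])
```

Every branch returns a string, mirroring the requirement that all computation branches share one type.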
expr ::= throw expr
Throws the given exception. The expression must be of type System.Exception.
expr ::= |
If the evaluation of expr does not throw an exception, the result is that of the evaluation of expr. Otherwise, the runtime type of the thrown exception is compared against each type description in the handlers. The first matching handler is executed and its value is returned. If none of the handlers matches, the exception is propagated. The type of the whole expression is the same as the type of the guarded expression. The value is the value of the guarded expression or of the launched handler. Consult the .NET specification to learn more about exceptions.
expr ::= |
Evaluates the first expression and then, regardless of whether that evaluation finished normally or threw an exception, evaluates the second expression. The value (and thus the type) of the whole expression is the value of the first expression.
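Python's own try/finally statement behaves in a comparable way, so it can serve as a rough model of the evaluation order (this is Python, not Nemerle). The function names and the `log` list are invented for the illustration; that the exception continues to propagate after the second expression is Python's behaviour, shown here as an assumption about the analogous Nemerle case.

```python
log = []


def guarded_value():
    try:
        log.append("first")
        return 42                    # value of the whole construct
    finally:
        log.append("second")         # runs even though we are returning


def guarded_failure():
    try:
        raise KeyError("boom")
    finally:
        log.append("second, after failure")


result = guarded_value()
try:
    guarded_failure()
except KeyError:
    log.append("exception propagated")
```

The second part runs in both cases, and the result of the successful case is the value produced by the first part.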
expr ::= OPERATOR expr
expr ::= |
expr ::= |
The value (and thus the type) of the whole expression is the value of the last expression in the sequence.
expr ::= |
Create an array consisting of given elements. All elements must be of the same type. If the elements are of the type 'a then the whole expression is of the type array ('a).
expr ::= |
Defines the binding between the variables in the pattern and the value of the expression expr which will be known to all subsequent expressions in the current block.
expr ::= |
Defines the functions which will be known to all subsequent expressions in the current block. Names of all defined functions are put into the symbol space before their bodies are parsed.
expr ::= mutable IDENTIFIER <- expr
Defines a new variable, whose value can be changed at any time using the assignment expression.
This section describes expressions that are in fact just syntactic sugar over Core Expressions. We just present translation of Secondary Expressions into Core Expressions.
expr ::= |
The standard branch. It evaluates and returns the value of the first expression if the condition evaluates to true, or of the second expression otherwise.
Internally it is translated into
    match (cond) {
      | true => expr1
      | false => expr2
    }
expr ::= |
A loop that executes the body expression as long as the condition is true. The condition is always checked before the body is executed, and the loop ends once it evaluates to false. The body must be of type void.
A while loop is translated internally into the following code
    def loop () {
      if (cond) {
        body;
        loop ()
      } else ()
    };
    loop ()
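The same translation can be replayed in Python to see that a recursive local function reproduces loop behaviour. This is a model, not Nemerle code: `run_while`, `counter`, `trace` and `step` are invented names, and the condition and body are passed as callables because Python evaluates arguments eagerly. Python does not eliminate tail calls, so this sketch is only suitable for short loops.

```python
def run_while(cond, body):
    """Model of the translation: a local recursive function named loop."""
    def loop():
        if cond():
            body()
            loop()
        # The else branch yields () in the translation; None plays that role here.
    loop()


counter = {"i": 0}
trace = []


def step():
    trace.append(counter["i"])
    counter["i"] += 1


run_while(lambda: counter["i"] < 3, step)
```

The condition is tested before every execution of the body, including the first one.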
expr ::= |
A version of if with only one branch: the body is executed only when the condition is satisfied. If the condition is false, nothing is done (i.e. () is returned).
Its semantics is the same as
    if (cond) body else ()
expr ::= |
The opposite of when. It executes and returns the value of the body only if the condition is not satisfied (i.e. evaluates to false).
Its semantics is the same as
    if (cond) () else body
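The two one-branch forms, when and unless, mirror each other, which a small Python model makes explicit. This is not Nemerle code: the bodies are passed as callables so that they run only when selected, and None stands in for (). The names `when`, `unless` and `log` are chosen for the illustration.

```python
def when(cond, body):
    """Model of: if (cond) body else ()"""
    return body() if cond else None


def unless(cond, body):
    """Model of: if (cond) () else body"""
    return None if cond else body()


log = []
when(True, lambda: log.append("when ran"))
when(False, lambda: log.append("when skipped"))
unless(False, lambda: log.append("unless ran"))
unless(True, lambda: log.append("unless skipped"))
```

Only the bodies whose condition selects them are executed.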
expr ::= |
Lambda expressions can be thought of as anonymous local functions. This construct defines such a function and returns it as a functional value. This value can be used just like the name of a regular local function.
Example:
    List.Iter (fun (x) { printf ("%d\n", x) }, intList)
is equivalent to
    def tmpfunc (x) { printf ("%d\n", x) };
    List.Iter (tmpfunc, intList)
A lambda expression is in fact translated internally to
expr ::= def temporary_name [ method_type_parameters ] ( method_parameters ) [ : type ] [ where_constraints ] block
where temporary_name is automatically created by the compiler.
expr ::= |
[1, 2, 3] is translated to Cons (1, Cons (2, Cons (3, Nil ()))).
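The desugaring of list literals is a right fold over the elements. The Python sketch below models Cons and Nil as tagged tuples; it illustrates the nesting only and says nothing about how the compiler represents lists. `cons`, `NIL` and `desugar_list` are hypothetical names.

```python
NIL = ("Nil",)


def cons(head, tail):
    """Model of the Cons constructor as a tagged tuple."""
    return ("Cons", head, tail)


def desugar_list(items):
    """Build Cons (x1, Cons (x2, ... Nil ())) from a sequence of items."""
    result = NIL
    # Start from the end so the first item ends up outermost.
    for item in reversed(items):
        result = cons(item, result)
    return result


nested = desugar_list([1, 2, 3])
```

An empty literal desugars to Nil alone.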
This section describes some constructs used in the Expressions section.
sequence ::= |
Expressions in the sequence are evaluated sequentially, and the value (and thus the type) of the sequence is the value of the last expression in the sequence.
The values of all expressions except the last one are ignored, so a warning is generated if the type of such an expression is not void.
block ::= { sequence }
This is just a standard execution of a sequence of expressions. The value (and type) of this block is the same as that of the last expression in the sequence.
block ::= |
This syntax is a shortcut for matching the parameters of the defined function against a given list of patterns. It is equivalent to making a tuple from the parameters of the function and creating a match expression.
    def f (p1, p2, p3) {
      | (1, 3, "a") => 1
      | _ => 2
    }
translates to
    def f (p1, p2, p3) {
      match ((p1, p2, p3)) {
        | (1, 3, "a") => 1
        | _ => 2
      }
    }
Note also that when the function has only one parameter, matching is done on that parameter itself (there are no one-element tuples).
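The effect of the shortcut can be reproduced by hand: pack the parameters into a tuple and compare that tuple against each case in order. The following Python function is a hand-written model of the example f, not generated or Nemerle code; the local name `scrutinee` is invented.

```python
def f(p1, p2, p3):
    # Pack the parameters into one tuple, as the translation does.
    scrutinee = (p1, p2, p3)
    if scrutinee == (1, 3, "a"):   # | (1, 3, "a") => 1
        return 1
    return 2                        # | _ => 2
```

Any argument combination other than (1, 3, "a") falls through to the wildcard case.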
try_catch_handler ::= |
parameter ::= |
parameter ::= |
ref is used to denote a parameter passed by reference. This is not implemented yet, but will have semantics similar to those in C#.
guarded_pattern ::= |
A guarded pattern requires the expression expr to be of type bool. A given expression e satisfies the guarded pattern only if it matches pattern and expr evaluates to true.
match_case ::= |
A given expression e satisfies this match case if and only if it satisfies one of the guarded patterns in the match case.
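The two rules combine as a conjunction inside each guarded pattern and a disjunction across the guarded patterns of one match case. The Python sketch below models patterns and guards as predicates; it is not Nemerle code, and all names in it (`guarded`, `match_case`, `is_pair` and the rest) are invented for the illustration.

```python
def guarded(pattern, guard):
    """Guarded pattern: the pattern must match AND the guard must hold."""
    return lambda value: pattern(value) and guard(value)


def match_case(*guarded_patterns):
    """Match case: satisfied if at least one guarded pattern is satisfied."""
    return lambda value: any(g(value) for g in guarded_patterns)


def is_pair(value):
    return isinstance(value, tuple) and len(value) == 2


ascending = guarded(is_pair, lambda v: v[0] < v[1])
equal = guarded(is_pair, lambda v: v[0] == v[1])
non_descending = match_case(ascending, equal)
```

The guard is evaluated only after the pattern itself has matched, so it may safely inspect the matched structure.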
Patterns are a way of accessing data structures, especially trees. Patterns can match values. The definition of what it means to match is given with each pattern construct. The main idea behind patterns, however, is that they match values that look like them.
Patterns are used in match expressions and value definitions.
pattern ::= |
The identifier should refer to the name of a variant option. This pattern matches a value iff the value is the specified variant option and the sub-pattern matches the variant option's payload.
pattern ::= _
This pattern matches any value.
pattern ::= |
This pattern matches a value of a class that has all the specified fields (this is checked statically), provided that the value of each field matches the respective pattern.
pattern ::= ( pattern ) as IDENTIFIER
This pattern matches the same values as the enclosed pattern does. In addition, the value matched by the enclosed pattern is bound to the specified variable, which can be used in the when guard or the match body.
pattern ::= |
This pattern matches a tuple with the specified contents (each tuple member is matched by the respective pattern).
In addition, when a tuple pattern is seen where a record pattern would otherwise be expected, the tuple pattern is transformed into a record pattern by adding field identifiers in the order they appear in the definition of the given class. A tuple pattern transformed into a record pattern cannot match fields inherited from the base class.
pattern ::= |
The following two lines are equivalent:
    ::
    Cons ( , )
pattern ::= |
The following are equivalent:
    [, , ... , ]
    Cons ( , Cons ( , ... Cons ( , Nil) ... ))
pattern ::= |
This pattern matches specified constant value.
Please refer to macros.html for now.