Table of Contents
*****************

The L Documentation
1 Introduction
  1.1 L in brief
  1.2 L features
2 Tutorial
  2.1 L tutorial for the C programmer
    2.1.1 Translating C into L
      2.1.1.1 Variables declaration
      2.1.1.2 Type declaration and utilisation
      2.1.1.3 Grammar for types
      2.1.1.4 Type construction and memory allocation
    2.1.2 Extensions to C
      2.1.2.1 Blocks and conditional as expressions
      2.1.2.2 Tuples
      2.1.2.3 Keyword and default arguments
      2.1.2.4 Recursive types and functions
      2.1.2.5 Macros
      Introduction
      Macros and type
      Use of macros
      Care with macros
      Using multiple levels of macros
      Conclusion
      2.1.2.6 Expanders
      2.1.2.7 Extensible syntax
    2.1.3 Restrictions to C
3 L presentation
  3.1 L structure
  3.2 L concrete syntax
    3.2.1 Extending the syntax
  3.3 Cocytus
    3.3.1 Language constructs
      3.3.1.1 Constructs for local structure
      3.3.1.2 Constructs for global structure
      3.3.1.3 Constructs for changing flow of control
      3.3.1.4 Constructs for iteration
      3.3.1.5 Construct for affectation
      3.3.1.6 Constructs for pointer manipulation
      3.3.1.7 Constructs for structure manipulation
      3.3.1.8 Construct for aggregating several values together
      3.3.1.9 Construct for function calling
    3.3.2 Definers
    3.3.3 Chunk
  3.4 Malebolge
    3.4.1 Type classes
    3.4.2 Defining new macros
    3.4.3 Creating new expanders
    3.4.4 Defining coercions


The L Documentation
*******************

This manual documents the L language.  It is both a user guide to L
and complete documentation of the inner workings of the language.

  It is not perfectly in sync with the source code; some parts of the
source code are not yet documented, and some parts of the documentation
describe features that do not exist in the code yet.  In most cases, the
latter is explicitly noted in the documentation.

  L is still alpha; although many parts already work, some important
ones do not.  Development has focused on showing what is unique to L,
and what you can do with it.  Convenient libraries for writing concrete
code do not exist yet, but you are welcome to help!  If you do, you
should know that the primary means of communication is the L mailing
list, l-lang-devel at nongnu dot org.

  This manual is for L, version 0.0.1

  Copyright (C) 2006 Matthieu Lemerre

     Permission is granted to copy this manual, modify it, and publish
     your modifications according to the GNU Free Documentation
     License, as published by the Free Software Foundation.

1 Introduction
**************

1.1 L in brief
==============

L is:

   * A compiled language with a C-like syntax and Lisp-like macros.

   * An extensible programming language: even the syntax is modifiable
     at run-time.

   * A mostly-safe language, with strong typing and encouraged
     confinement of dangerous constructs.

   * Finally, thanks to its extensibility, a _universal_ language: it
     can be used both for low-level and system programming and for
     creating complex high-level applications.  Using a certain set of
     predefined modules, it could even be used as a scripting language.

  So, L can be seen both as:

   * C with stronger typing + extensible compiler support (macros,
     parser, expanders) + fully expression-based.

   * or Lisp with low-level capabilities, static typing, support for
     custom syntaxes, and type-aware macros.


  Hence the name, L: L combines C and Lisp.

  But L is not just a combination of two languages: it is a language of
its own.  Just as Lisp and C are each defined by just a small number of
specific characteristics, L's features make it unique.

1.2 L features
==============

L has numerous features that make it a cool language to hack with.
Among them:

   * L is a compiled language.  It can thus run very fast.  Compilation
     lets you catch errors sooner, and execution is faster than
     with other techniques.  L programs can have a very low memory
     footprint.  L can sometimes outperform C; for instance, its type
     system allows more aggressive assumptions about memory aliasing.

   * L is interactive.  Speed up your development by dropping the
     "edit-compile-test" development cycle.  Write part of a
     function and immediately test it.  Interactively compiling
     functions is very fast, so you don't have to wait for long
     compile/link cycles.

   * L is extensible.  If you need to add anything to the language, you
     don't need to wait 10 years for a committee to decide to enhance
     the language.  You just do it.  Extensibility is at the heart of L
     programming, and is a programming paradigm in itself.

   * L is multi-paradigm.  L's extensibility lets you write programs
     using different paradigms at the same time.  For example, you can
     freely mix functional, imperative, and object-oriented code.

   * L has a flexible type system.  Types are there to help you and
     automatically manage things for you; they don't get in your way.

   * L helps you write safe programs.  Traditional languages are split
     between two extremes:

        - Safe languages.  Unfortunately, clever optimisations aren't
          allowed with them.

        - Unsafe languages, like C.  They let you write any code, but
          programs written with them often crash or lead to security
          issues.


     With L, it is easy to write safe programs with some confined unsafe
     parts.  Just write the unsafe parts carefully, and your program
     will be both optimized and free of crashes.

     For instance, you can use pointers (both powerful and unsafe), but
     L provides many ways to avoid their use (for instance, using
     tuples and multiple return values) or to confine them (using
     macros, like the `foreach' macro for iterating over lists).

   * L interfaces well with C.  You can directly use C libraries from L,
     and use L libraries from C.  No need to write long wrappers for
     that.

   * L enables extreme factorisation of your programs.  Typical L
     programs are a lot smaller than equivalent C programs.  As such,
     they are easier to understand as a whole.

     As a corollary, with L, you can optimize your program without
     sacrificing readability.  Dirty hacks can be confined and
     factorised.  If you later change your mind about a hack, you just
     have to change one place, something that is not always possible
     e.g. in C.

2 Tutorial
**********

2.1 L tutorial for the C programmer
===================================

In this first part, I will assume that you are already accustomed to a
C-like language, preferably C.  If you are not, skip to the unwritten
node "L tutorial for the newbie programmer", or to "L tutorial for the
Ruby programmer", or to "L tutorial for the Lisp programmer".
Unfortunately for you, these tutorials are not yet written.

2.1.1 Translating C into L
--------------------------

First, you will want to be at least as proficient with L as you were
with C.

  L is basically compatible with C, but there are some syntactic
differences.

  Let's see our first example:

     int
     foo(int a, int b)
     {
       return a * (a + b);
     }

  In this short example, you can see that L syntax is quite close to
that of C.  Still, here are the main differences that you will find when
you want to translate C code into L:

2.1.1.1 Variables declaration
.............................

You have to precede all your local variable definitions with `let':

     int
     foo(int a, int b)
     {
       let int c = a + b;
       return a * c;
     }

  I can already hear some old UNIX hackers yelling: so L is
more verbose than C, even more verbose than Java?

  No: in L, type annotations are optional.  That is, you can define
exactly the same function if you use:

     int
     foo(int a, int b)
     {
       let c = a + b;
       return a * c;
     }

  Type annotations are useful to programmers as checks of what is
going on; you can see them as partial specifications of your program.

  Knowing the types of the parameters (and the types of the different
global variables) is in fact sufficient for deducing the types of all
local variables; that's why the compiler does not need type annotations.

  Thus you avoid things like:

     BufferedInputStream bufferedInputStream = new BufferedInputStream(....);

which are quite redundant, and for which the type of the variable is
obvious.

  C's type qualifiers also have a different syntax (UNIMPLEMENTED):

     let Int : extern global_variable;
     let Float : static other_global_variable;

  This ":" syntax will become clearer when we have seen *Note Type
classes::.

2.1.1.2 Type declaration and utilisation
........................................

Type declarations (`typedef' in C) are written as follows:

     type Toto = struct { Int foo; Int baz;};

  L's convention for types is that they should be capitalized, and if
you have an identifier that spans several words, separate them with
underscores and capitalize each word, Like_That.

  But this is really just a convention; your own program can be
written as you like.

  The `type =' declaration isn't like `typedef', because it introduces
a new type that is incompatible with the first. (COMMENT : the = is
maybe misleading in that case?)

  If you write, for instance:

     type Kelvin = Float;

  then you cannot have:

     let Kelvin k = 3.0;

  You have to do this instead: (UNIMPLEMENTED: does not work for now,
explicit casts needed)

     let Kelvin k = Kelvin(3.0);

  or simply:

     let k = Kelvin(3.0);

  Similarly, the `+' operation works on `Float's, and because `Kelvin's
are not `Float's, you cannot add `Kelvin's.

  L provides a shortcut (UNIMPLEMENTED) for specifying which operations
are still allowed for the new type:

     type Kelvin = Float | allows +,-;

  Forbidding `*' is interesting, for instance, because multiplying
Kelvins together makes no sense.  Multiplying them by a Float makes
sense, however. (UNIMPLEMENTED: so we need a way to tell that).

  Finally, writing all this can be factorized to:
     import std.unit;

     type Kelvin = Float : unit;

  `unit' declares a type that does all this (in fact, that respects the
`unit' interface).  `numeric' declares a type that has `+', `-', `*',
`/'.  All these are in fact type classes, see *Note Type classes::.

  If you just want something similar to C `typedef', you can use L's
`typealias':

     typealias Floating_Point = Float;
     typealias Integer = Int;

  This defines `Floating_Point' as an alias for `Float', and `Integer'
as an alias for `Int' (i.e. they are the same types with just different
names).
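
  The difference between `type' and `typealias' has a rough analogue in
C itself.  The following is only a sketch in C, not L: wrapping the
float in a one-field struct gives a distinct type, while `typedef'
gives a mere alias.  The names `kelvin' and `kelvin_add' are
hypothetical helpers invented for this illustration.

```c
/* Hypothetical C analogue of L's `type Kelvin = Float;':
   the one-field struct makes Kelvin a distinct type. */
typedef struct {
    float value;
} Kelvin;

/* Analogue of L's `typealias Floating_Point = Float;':
   the same type under a different name. */
typedef float Floating_Point;

/* Plays the role of the constructor call `Kelvin(3.0)'. */
Kelvin kelvin(float value)
{
    Kelvin k = { value };
    return k;
}

/* Only the operations we choose to write exist for Kelvin
   (compare the `allows +,-' shortcut in the text above). */
Kelvin kelvin_add(Kelvin a, Kelvin b)
{
    return kelvin(a.value + b.value);
}
```

  With these definitions, passing a plain `float' where a `Kelvin' is
expected is rejected by the C compiler, whereas `Floating_Point' and
`float' remain interchangeable.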

2.1.1.3 Grammar for types
.........................

The grammar for types is quite different from C's.  First, all
type information is grouped together; you don't have some parts that
are prefixes to the identifier and some that are postfixes (like `*'
and `[]' in C).

  Second, the notation for functions is simply `<-'.

  Here are some examples:

C                                    L
-------------------------------------------------------------------------- 
int i;                               let Int i;
int *pi;                             let Int *pi;
int pi[3];                           let Int[3] pi;
int *pi[3];                          let Int *[3] pi;
int (*pi)[3];                        let Int [3]* pi;
float (*fi)(int);                    let (Float <- Int)* fi;
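
  The C column of the table is easy to misread, which is part of the
motivation for L's grouped notation.  The following C sketch (function
names are hypothetical, chosen only for this illustration) shows that
the two array declarations really denote different types, and how the
function pointer declaration is used.

```c
#include <stddef.h>

/* int *a[3]: an array of three pointers to int (L: Int *[3]). */
size_t array_of_pointers_size(void)
{
    int *a[3];
    return sizeof a;            /* size of three pointers */
}

/* int (*p)[3]: a pointer to an array of three ints (L: Int [3]*). */
size_t pointed_array_size(void)
{
    int (*p)[3];
    return sizeof *p;           /* size of three ints; p is not dereferenced */
}

/* A function from int to float, used as a pointer target below. */
float half_of(int n)
{
    return n / 2.0f;
}

/* float (*fi)(int): a pointer to such a function (L: (Float <- Int)*). */
float call_through_pointer(int n)
{
    float (*fi)(int) = half_of;
    return fi(n);
}
```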

2.1.1.4 Type construction and memory allocation
...............................................

By default, when creating a new type, L also creates a new generic
constructor, which depends on the created type.

  For instance, you must use

     type Kelvin = int;

     let degree = Kelvin(24);

  to create new Kelvin objects.

  When the type is a struct, or a pointer to a struct, the constructor
is used as in the following:

     type Point = struct { int x; int y; } *;
     type Point_Struct = struct { int x; int y; };

     ...

     let p = Point(x:4, y:5);
     let p2 = Point_Struct(x:4, y:5);

  The difference between the two is that `p2' is allocated on the stack
(as a regular C struct), while `p' is allocated on the heap.

  The `x:' and `y:' are keys for keyword arguments.  The rationale
behind using them is that if you later change your mind about the
structure contents, you will know it immediately (this wouldn't be the
case if you used normal arguments).

  In order not to forget this '*', you can use the `record' type
constructor:

     type Point = record { int x; int y; };
     // same as type Point = struct { int x; int y; } *;

  In fact, it isn't really the same: `record'-declared types are by
default garbage collected, whereas `struct *'-declared types must be
explicitly freed.

  This can be overridden by the `alloc:' parameter:

     let p1 = Point(x:4, y:5);
     // Allocated and managed by the most appropriate garbage collector

     let p2 = Point(x:4, y:5, alloc:gc); // Same as p1

     let p3 = Point(x:4, y:5, alloc:refcount);
     // Same as p1, but explicitly asks for the refcount GC

     let p4 = Point(x:4, y:5, alloc:mark_sweep);
     // Same as p1, but explicitly asks for the mark&sweep GC

     let p5 = Point(x:4, y:5, alloc:heap); // No garbage collection is done

     let p6 = Point(x:4, y:5, alloc:stack); // Allocated on the stack

  By default, all the type fields have to be given as arguments.
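
  For readers coming from C, the stack and heap cases correspond roughly
to the following.  This is a sketch in C, not L, and covers only the
`struct' and `struct *' cases (C has no garbage collector to stand in
for `record'); `point_new' and `point_struct_make' are hypothetical
names for what the generated constructors do.

```c
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} Point_Struct;

typedef Point_Struct *Point;

/* Rough equivalent of `Point(x:4, y:5)': the object lives on the heap
   and must be released explicitly with free(). */
Point point_new(int x, int y)
{
    Point p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}

/* Rough equivalent of `Point_Struct(x:4, y:5)': the object is an
   ordinary value, stored wherever the caller's variable lives. */
Point_Struct point_struct_make(int x, int y)
{
    Point_Struct p = { .x = x, .y = y };
    return p;
}
```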

2.1.2 Extensions to C
---------------------

L has numerous extensions to C, but here are some extensions that are
useful when programming in the small.

2.1.2.1 Blocks and conditional as expressions
.............................................

In L, blocks can return a value.  For instance:

     let Int a = { let Int sum = 0;
                   let Int i = 0;
                   for(i = 0; i <= 10; i++) { sum += i; }
                   sum };

  The above code creates a new block in which the two variables `sum'
and `i' are declared.  An iteration is done, and then the value of
`sum' is returned and assigned to `a'.

  Note the syntax: if a block is supposed to return a value, it is
composed of a list of _statements_ followed by one expression.  By
contrast, the block in the `for' does not return a value, and is
composed only of _statements_.  This will be detailed more in *Note
Extending the syntax::.

  This will be very useful when you write *Note Macros::; but it is also
good programming style: it is better to pass values around using this
mechanism than to create a new variable to do it.

  Conditionals can also return a value.  Here is how to compute the
absolute value of `x':

     let Int abs_x = if(x >= 0) x else -x;

  This eliminates the need for a distinct `? ... :' operator.

  Note: You can still use conditionals as you would in C.  The above
example could also be written:

     let Int abs_x;
     if(x >= 0)
        abs_x = x;
     else
        abs_x = -x;

  or even:

     let Int abs_x = x;
     if(x < 0)
       abs_x = -x;
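
  For comparison, standard C has no block expressions, so the `sum'
example needs a helper function (or a non-standard compiler
extension), and the conditional needs the `? :' operator.  A minimal
sketch in C, with hypothetical function names:

```c
/* What the L block `{ let Int sum = 0; ...; sum }' computes, written
   as a separate C function because a C block cannot yield a value. */
int sum_up_to(int limit)
{
    int sum = 0;
    for (int i = 0; i <= limit; i++) {
        sum += i;
    }
    return sum;
}

/* What `if(x >= 0) x else -x' computes, using C's separate
   conditional operator. */
int abs_int(int x)
{
    return x >= 0 ? x : -x;
}
```

  Here `sum_up_to(10)' plays the role of the block assigned to `a' in
the L example.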

2.1.2.2 Tuples
..............

Tuples are a way to manipulate several values at once.

  Creating a new tuple does not allocate memory.  Tuples don't have
addresses, their content is not placed contiguously in memory.  They
_really_ are just a way to consider several values at once.

  For instance, the tuple (4, 5, 'toto') is a constant.  Thus you can
create multiple-values constants with L.

  When a tuple contains _complex_ expressions (that is to say, anything
except a constant or a variable), _the order of evaluation is defined
to be from the left to the right._

   * Passing arguments to functions is done using a tuple.  I.e.

           let h = hypot(side1, side2);

     calls `hypot' with the tuple `(side1, side2)'.  As a side effect
     of the order of evaluation rule stated before, _function arguments
     are evaluated from left to right_.  In C, this is undefined, which
     causes many portability problems.

   * Return values of functions are also tuples.  So, functions can
     return multiple values in L:

          let (return_value, is_present) = gethash('key');
          let (num_char_scanned, num1_scanned, num2_scanned) = scanf("%d %d");

     In most cases, the use of pointers to return several values should
     be replaced by a multiple return value form.  This is one of the
     examples where pointers can be easily avoided in L (thus providing
     more safety).

     In the implementation, the return values are passed in clobbered
     registers, which is very efficient.

   * Finally, tuples are also useful for doing multiple assignments
     simultaneously, without having to explicitly declare local
     storage:

          (str,len) = ("foo",4); // same as str = "foo"; len = 4;

          (point.x, point.y) = (3.0, 5.0);

          a = 2;
          b = 3;

          (a,b) = (b, a); // Now b = 2, and a = 3.

          let (c,d) = (4, 'foo');

     When doing multiple assignments, you can "skip" one by using the
     special symbol '_':

          let i = 0;
          (a,_,c) = (++i,++i,++i); //a = 1; c=3;

     This is most useful when you want to receive values from functions:

          (value, present) = gethash(key, hash_table);
          (value, _) = gethash(key, hash_table);
          value = gethash(key, hash_table);

  Finally, on an implementation note, using tuples is highly efficient,
because each component of a tuple can be a register.

  For instance, the `(a,b) = (b,a)' construct may use the efficient
`xchg' instruction on CISC machines; it is difficult for a standard C
compiler to use these instructions, and the corresponding code would
use three instructions and a supplementary register.

  Tuples are thus both a pleasant and an efficient abstraction.

  Note: Depending on the architecture, Word64 may or may not be a tuple.
But most code can completely ignore the issue.
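
  To see what tuples save you, here is how the same two idioms are
usually written in C.  This is only a sketch: `Lookup', `lookup' (a toy
stand-in for `gethash' over a tiny fixed table) and `swap_ints' are
hypothetical names invented for the illustration.

```c
#include <stdbool.h>
#include <string.h>

/* C has no tuples, so the two values that L returns as
   `(value, present)' travel together in a struct. */
typedef struct {
    int  value;
    bool present;
} Lookup;

/* Toy stand-in for gethash: looks a key up in a tiny fixed table. */
Lookup lookup(const char *key)
{
    static const char *const keys[]   = { "one", "two", "three" };
    static const int         values[] = { 1, 2, 3 };

    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; i++) {
        if (strcmp(keys[i], key) == 0) {
            return (Lookup){ values[i], true };
        }
    }
    return (Lookup){ 0, false };
}

/* C needs explicit temporary storage (and pointers) for what L
   writes as `(a,b) = (b,a)'. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```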

2.1.2.3 Keyword and default arguments
.....................................

L functions can have default arguments (UNIMPLEMENTED), like C++ ones:

     Int foo(Int bar = 3)
     {
       bar
     }

     foo(5); //5
     foo();  //3

  L function arguments may also be passed using keyword arguments
(UNIMPLEMENTED):

     Int foo(Int bar, Int baz)
     {
       10 * bar + baz
     }

  If you write:
     Int foo(Int bar = 1, Int baz)
     {
       10 * bar + baz
     }

  Then `foo' can only be called like this:

     foo(3, 4)  //OK, 34
     foo(baz:5) //OK, 15
     foo(5)     //Wrong

  Use of keyword arguments is a really good style when you create
functions that create complex objects (structures and records),
especially when they contain fields that have the same type.

  For instance:
     let richards_shoes = Shoes(color:Green, sole_color:Brown);

  is much better style than
     let richards_shoes = Shoes(Green, Brown);

  In general, it is better to use them when a function has several
arguments of the same type, or simply many arguments.  It makes your
code more readable.

  You can also use them for "cosmetic" usage, as in:
     foreach(a, in:list) {...}
     foreach(element:a, in:list) {...}
     divide(25, by:73);

2.1.2.4 Recursive types and functions
.....................................

L handles recursive types and functions very well.  In functional
programming languages, recursive type definition is often
worked around by using a "rec" keyword.

  In L, the rule is that "every code simultaneously entered is treated
as a whole".

  For instance, if you feed L with:

     type Toto = struct { Tata; } *;
     type Tata = struct { Toto; } *;

  L would correctly interpret this. UNIMPLEMENTED: not yet.  Only the

     type Int_List = struct { Int head; Int_List tail; } *;

  recursive definition works for now.

  But if you feed L with:

     type Toto = struct { Tata; } *;

  and (after) with:

     type Tata = struct { Toto; } *;

  You would get an error after the first statement, because the
definition you supply is incomplete.

  See *Note Chunk:: for more information.

2.1.2.5 Macros
..............

Macros are L's replacement for the C preprocessor, or C++ templates.
They allow you to factor out patterns in your code, so that your code is
clearer and easier to understand.

  More specifically, macros are a templating language that replaces a
construct by a sequence of code.

Introduction
............

Let's start with an example to see how it works:
     macro square(x)
     { let _value = $x;
       _value * _value }

  The effect of the macro is as follows: if you write:
     let Int number = square(4);

  It will be converted into:
     let Int number = {let _value = 4; _value * _value };

  thus yielding 16.

  If you had typed:
     let Float number = square(0.5);

  it would have been converted to
     let Float number = {let _value = 0.5; _value * _value };

  thus yielding 0.25.

  The process of converting a "macro call" into what it stands for is
called _macro expansion_.

  Here you can see how macros are type independent, since you can use
them with Floats and Ints. In fact, you can use them on every type T
for which the operation `*(T,T)' exists.

  Here you can see the benefit of:
   * `let' without type annotation (which helps to create type
     independent macros easily), and

   * Blocks that can return values (which allows a macro to create
     local variables and still return a value).

Macros and type
...............

One of the distinguishing features of L macros is that they are typed.
L macros do not just perform text rewriting, like the C preprocessor:
they can check types, and even act differently according to the type.

     macro power(Float x, Float y)
     {
       powf($x, $y)
     }

     macro power(Long x, Long y)
     {
       powl($x, $y)
     }

  Note: this is in fact how overloading is implemented.
UNIMPLEMENTED: or will be.

  The use of types allows earlier type-checking. If we take the previous
example:

     macro square(x)
     { let _value = $x;
       _value * _value }

  And if we write `square("25")', we will end up with:

     error: (in expansion from square)
          :  in file test.c, line 25:
            * does not accept (String, String) arguments.

  If we had written instead:
     macro square(Int x)
     { let _value = $x;
       _value * _value }

  The error message would have become:
     error: in file test.c, line 25: square does not accept String argument.

  which is much more readable.

  But if we want the definition of square to be more generic (for
instance, to allow both Ints and Floats, and any type T that has a
`*(T,T)' operation), just specify the macro like this:

     macro square(x : numeric)
     { let _value = $x;
       _value * _value }

  UNIMPLEMENTED: this relies on type classes, which are not yet
implemented.
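
  The closest thing C offers to a macro that acts according to the type
of its argument is the C11 `_Generic' selection.  The sketch below is
C, not L, and is far less general than typed L macros; `square_int',
`square_double' and `SQUARE' are hypothetical names for this
illustration.

```c
/* One ordinary function per supported type. */
int square_int(int x)
{
    return x * x;
}

double square_double(double x)
{
    return x * x;
}

/* C11 generic selection: picks a function from the static type of x.
   The controlling expression is not evaluated, so x is evaluated
   only once, in the final call. */
#define SQUARE(x) _Generic((x),           \
        int:    square_int,               \
        double: square_double)(x)
```

  As with the typed `square' macro above, an unsupported argument such
as `SQUARE("25")' is rejected at the call site, because no branch of
the selection matches its type.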

Use of macros
.............

The main use of macros is not for `square'-like examples: inline
functions are there for this (even if inline functions are implemented
as macros in L).

  Macros are mainly useful for introducing new programming constructs.
For instance, in L, `while' is not part of the language: it is defined
as a macro.  Here is its definition:

     macro while(Bool condition, body)
     {
       loop {
         if(!$condition)
           break;
         $body; }
     }

  Then `while' can be used like this:
     let i = 25;
     while(i > 0,
         { print(i--, '\n');
         });

  This isn't really like the C while.  To really obtain the C definition
in this case, you also have to hook the parser: this is explained in
*Note Extending the syntax::.

  `++' and `--' are also not part of the language, but are defined as
macros, and could be defined as:

     //pre_inc(x) is the same as ++x
     macro pre_inc(x)
     { $x = $x + 1 }

     //post_inc(x) is the same as x++
     macro post_inc(x)
     { let _temp = $x;
       $x = $x + 1;
       _temp }

  Note: In fact, to make the Cocytus more readable, and the C output
look more like real code, `while', `++' and `--' are part of the
Cocytus language.  But their default implementations are the above ones.

Care with macros
................

Use macros with care: for instance if you define

     macro square(Int x)
     {$x * $x}

  Then `square(i++)' will be transformed into `(i++) * (i++)' which is
certainly not what you want.

  In this example, you must write:
     macro square(Int x)
     { let _x_value = $x;
       _x_value * _x_value }

  That would yield the correct result.  (Note: in this case, an inline
function would have been better than a macro).
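
  C programmers know this pitfall from the preprocessor.  The following
C sketch makes the double evaluation observable by counting calls; it
uses a function with a side effect rather than `i++', because
`(i++) * (i++)' has undefined behaviour in C.  All names are
hypothetical and exist only for the illustration.

```c
/* Counts how many times next_value() has been called. */
static int calls = 0;

int next_value(void)
{
    return ++calls;
}

/* Naive macro: the argument text is inserted twice. */
#define SQUARE_NAIVE(x) ((x) * (x))

/* A function evaluates its argument exactly once. */
int square_once(int x)
{
    return x * x;
}

/* Number of next_value() calls triggered by the macro version. */
int calls_made_by_macro(void)
{
    calls = 0;
    (void)SQUARE_NAIVE(next_value());
    return calls;
}

/* Number of next_value() calls triggered by the function version. */
int calls_made_by_function(void)
{
    calls = 0;
    (void)square_once(next_value());
    return calls;
}
```

  The macro version calls `next_value' twice and the function version
once, which is the same difference as between the two L definitions of
`square' above.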

  In general, it is dangerous to insert an argument directly more than
once, and you must do it only if it is the effect that you want to
achieve, as in

     macro do_twice(x)
     { $x;
        $x;
     }

  The second problem you can encounter is called _variable capture_.
For instance, suppose you define `do_times' like this (the following
example is inspired by Paul Graham's On Lisp):
     macro do_times(Unsigned_Int max, body)
     { let max_value = $max;
       let i = 0;
       loop {
       if(i++ == max_value)
          break;
        $body;
        }
     }

  then your code would work in most cases:
     do_times(4) { print("hello\n"); } //print hello four times

  But if you write:
     let i = 0;
     do_times(4) { i++; }
     //Here, i is still 0

  (Note: if you had written `do_times(5)' instead of `do_times(4)',
then the code would never have returned.)

  The reason is that the local variable `i' used by the `do_times'
macro and the one used in the block passed as an argument "clash"
because they have the same name.  Here is the full expansion of the
above call:

     let i = 0;
     { let max_value = 4;
       let i = 0;
       loop {
       if(i++ == max_value)
          break;
        { i++ };
       }
     }

  The naive solution to this problem is to use such ugly names that they
can never clash with identifiers that a programmer would choose, like
`srchroukjuk239'.  This doesn't really make your code pretty, and
worse, it does not work when uses of the macro are nested:

     do_times(4) { do_times(5) { print("Hello\n");}}

  The real solution to this problem is to use _generated symbols_, i.e.
to generate, for each new expansion, a new symbol for the name of the
variable; these symbols are guaranteed to be unique, and thus never
clash.

  L provides a simple way to do this: just prepend a '_' to your
identifiers in your template code.  L will recognize that and
will generate a new symbol for each new expansion of the macro.  (Note:
moreover, the generated symbols are generated in such a way that they
are readable, and that you can find out in what macro they have been
generated.  This simplifies debugging of macros).

  In normal code, you are not allowed to begin an identifier with an
underscore (much as C reserves many such identifiers for the
implementation).

  Thus, the correct definition for the above macro is:

     macro do_times(Unsigned_Int max, body)
     { let _max_value = $max;
       let _i = 0;
       loop {
       if(_i++ == _max_value)
          break;
        $body;
        }
     }

  And macro expansion becomes:

     let i = 0;
     { let do_times#max_value_1 = 4;
       let do_times#i_1 = 0;
       loop {
       if(do_times#i_1++ == do_times#max_value_1)
          break;
        { i++ };
       }
     }
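
  The same capture problem exists with C preprocessor macros, which
have no generated symbols at all.  The sketch below is C, not L: the
first macro reproduces the capture, and the second avoids it only by
the naive approach of a hand-picked, unlikely name.  The macro and
function names are hypothetical.

```c
/* Captures any `i' used in the body: the body sees the macro's i. */
#define DO_TIMES_CAPTURING(max, body)          \
    do {                                       \
        int i = 0;                             \
        while (i++ != (max)) { body; }         \
    } while (0)

/* Hand-picked name standing in for a generated symbol. */
#define DO_TIMES(max, body)                    \
    do {                                       \
        int do_times_i_ = 0;                   \
        while (do_times_i_++ != (max)) { body; } \
    } while (0)

/* The body increments the macro's own counter, not the caller's i. */
int count_with_capture(void)
{
    int i = 0;
    DO_TIMES_CAPTURING(4, i++);
    return i;
}

/* The body increments the caller's i, once per iteration. */
int count_without_capture(void)
{
    int i = 0;
    DO_TIMES(4, i++);
    return i;
}
```

  As in the L example, the capturing version leaves the caller's `i' at
0, while the other version runs the body four times.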

  Note: You may wonder why all `let' definitions are not simply
transformed into generated symbols.  First, sometimes you may want to
generate a new symbol, even if there is no `let' (imagine, for
instance, that you have written a `my_let' macro that expands into
`let').

  Second, there are some cases where variable capture is actually
wanted, and it would be silly to prevent it.

  L can however generate a warning when it detects a variable capture
because of a `let'; this warning can be disabled like this:

     macro with_log_level(number, body) captures(log_level)
     {
       let log_level = $number;
       $body;
     }

  although this changes `log_level' only in the current lexical scope;
compare it with this:

     macro with_log_level(number, body)
     {
      let _save_log_level = log_level;
      log_level = $number;
      $body;
      log_level = _save_log_level;
     }

Using multiple levels of macros
...............................

Uses of macros can be nested, as in:

     let x = 0;
     while(i <= 25) { i = square(x); print(i); x++; }

  In this case, inner expansions occur _before_ outer expansions.
That is to say, the above example is first converted into:

     let x = 0;
     while(i <= 25) { i = {let square#temp = x; square#temp * square#temp };
                      print(i); x++; }

  before being converted to:

     let x = 0;
     loop {
        if(!(i <= 25)) break;
        {
           i = {let square#temp = x; square#temp * square#temp };
           print(i); x++;
        }
     }

Conclusion
..........

Macros are a templating language for L; they allow code to be more
understandable, and let the compiler do things for you.

  The combination of macros and types allows many powerful definitions
that depend on the types of the arguments.  You can really define new
programming language constructs, and use them as if they were part of
the language.

  Note: as you have to use { } in a macro, variables declared in a
macro are always local to the macro, and are thus limited to computation
inside the macro.  This catches programmer errors that are hard to find.

  If you want to implicitly declare new variables in your code, or more
generally if you want to generate code and a templating language is not
powerful enough for this, you have to use the more general *Note
Expanders::.

  Note: using macros creates a lot of redundant code that a programmer
wouldn't have written.  Fortunately, this redundancy is often
eliminated when the code is converted to SSA form:

     let a = { let temp i; i = 3; i };

  is converted into:

     let a = 3;

  in the optimisation pass of the compiler.

2.1.2.6 Expanders
.................

Expanders are a generalisation of the notion of macro.  (In fact, it is
more accurate to say that macros are a special case of expanders.)

  Instead of using a fixed template for replacing code, code can be
dynamically constructed based on the parameters given (which do not
need to be expanded).

  TO_WRITE

2.1.2.7 Extensible syntax
.........................

TO_WRITE: how to hook the parser, the parse language...

2.1.3 Restrictions to C
-----------------------

TO_WRITE

3 L presentation
****************

The L compiler is composed of many parts.

  TO_COMPLETE

3.1 L structure
===============

L's compilation process is composed of several parts:

   * The `Parser', which takes the buffer of characters and converts it
     into a tree of forms, the abstract syntax tree.  The input of the
     parser is what is called "L"; and the parser acts according to the
     *Note L concrete syntax::.

   * The `Expander', which transforms the abstract, high level forms
     into concrete, low level ones.  It is responsible for macro
     expansion.  It is also called the `Malebolge compiler', because the
     parsed L forms are called Malebolge.  Malebolge deals with the
     abstract syntax of the language.

   * The `Code generators', which take the low level forms and
     transform them into output.  This step is highly dependent on the
     output; we could consider that there are in fact several code
     generators, and these are called `backends'.  It is also called
     the `Cocytus compiler', because the low level forms are the
     Cocytus language.  Cocytus deals with the precise semantics of the
     language.

  Malebolge is to Cocytus roughly what C++ is to C: a superset of the
language that allows more abstraction and more information hiding.
But when reading C++ code, one cannot say exactly what will be
executed; by contrast, any experienced C programmer knows exactly
what is going on when reading a C program.  The same holds for
Malebolge and Cocytus: reading a Malebolge program provides a global
understanding of the program; you can read the intention of the
programmer and understand the algorithms.  Reading the Cocytus
transformation, you can see how this intention was transformed by the
Malebolge compiler, and how it is translated into machine language.

  Malebolge programs can be converted to Cocytus: both languages
coexist.  This is important, because it allows programmers to manually
check what is going on in their program, AND to have precise control
over the language.  So in L, we combine both advantages.  Separating
the language into two levels makes both the global comprehension and
the deep comprehension of a program easy.

  We already have a complete compilation process that takes L source
code, parses it, expands it, and generates x86 code dynamically in
memory (that's why the compiler is _interactive_).

  The following (outdated) diagram shows what the complete L
implementation could look like:


                        L source file (.l)
                               |
                            parser* (produce L forms)
                            /  | \
        parsed file generator  |  sexped file generator
                          /    |   \
           L parsed file (.pl) |    L sexped file (.sl)
                               |
                               |
                         Malebolge compiler*  (produce L Cocytus forms)
                           /   |     \     \------------------\
     L cocytus source generator |    L slim binaries generator  \
                         /     |       \                        GNU C generator
                        /      |      L binary file (.bl)           \
      L cocytus source (.cl)    |                                   C output file (.c)
                               |
                          Cocytus compiler* (calls the backend functions)
                        /      |         \
                       /       |          \-------\-----------------
                      /   dynamic x86 code*        \                \
          L GCC frontend                       ANSI C generator      \
                    /                                            L header files (.lh)
                  GCC
                  /  \---------------
                 /                   \
             assembly file (.S)   object file (.o)

  *: already implemented

3.2 L concrete syntax
=====================

The parser handles the concrete syntax of the language.

  TO_WRITE

  S-expressions (sexps): a notation to represent trees.

  Concept of tree: `(a + b) * c' is converted into `(* (+ a b) c)'. In
this representation, the first symbol represents the head of the tree,
and subsequent fields represent the children, which can be trees
themselves.  So the above code is really:

     *--- + --- a
      \    \----b
       \--c

  Another example: the L code
     function1(toto, function2(24,49), 3+i)

  has the tree representation:

                     function1
                    /     |    \
                toto function2  +
                       /   \    |\
                      24   49   3 i

  And the sexp representation:

     (function1 toto
                (function2 24 49)
                (+ 3 i))

  So the sexp notation is just a convenient way to write trees.
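
  As an illustration only (written in Python, not in L, and not taken
from the L parser), a tree can be held as a nested tuple whose first
element is the head.  The helper name `to_sexp' is hypothetical:

```python
# Illustrative sketch, not part of L: a tree is a nested tuple whose
# first element is the head and whose remaining elements are the children.

def to_sexp(tree):
    """Render a nested-tuple tree in sexp notation."""
    if not isinstance(tree, tuple):
        return str(tree)          # a leaf: symbol or number
    return "(" + " ".join(to_sexp(child) for child in tree) + ")"

# The tree for: function1(toto, function2(24,49), 3+i)
call = ("function1", "toto", ("function2", 24, 49), ("+", 3, "i"))
print(to_sexp(call))   # (function1 toto (function2 24 49) (+ 3 i))
```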

3.2.1 Extending the syntax
--------------------------

TO_WRITE

3.3 Cocytus
===========

Cocytus is the set of core language constructs of L.  All L code is
transformed into an assembly of Cocytus operations that form a tree.

  The goal of the Cocytus language is to be semantically unambiguous;
that is to say, to express very precisely what the computer will do.
Cocytus output should be readable; it can be used to manually verify
what a piece of code does.  It is much higher level than assembly, and
approximately as expressive as C, but more precise and less ambiguous.

3.3.1 Language constructs
-------------------------

In L, new language constructs can be defined, and existing language
constructs can be redefined.  There are many cases where this may be
useful:

   * If you have a processor with special instructions, you need to
     create some backend-specific language constructs to exploit them.

   * In general, every addition to the language that needs some assembly
     manipulation requires a new language construct.

  L defines a few language constructs that, taken alone, make L
approximately as expressive as C.  To have full expressiveness, you
need macros and expanders, which are defined in the next section, *Note
Malebolge::.

  The standard language constructs are the following:

3.3.1.1 Constructs for local structure
......................................

 -- Language construct: seq (form1..., form_n)
     `seq' executes each of its subforms in turn, and returns the
     result of the last one.  `seq' must have at least one form.

     L's standard syntax for this construct is `,':
          x = 3, y = 4, x * y

     has the abstract syntax tree:

          (seq (= x 3)
               (= y 4)
               (*#Int x y))

     whose result is `12'.
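
  The behaviour of `seq' can be sketched with a tiny evaluator.  This
is Python, not L, and not the Cocytus compiler: the `evaluate' function
is hypothetical, `*' stands in for `*#Int', and it is an assumption of
this sketch that `=' yields the assigned value.

```python
# Illustrative sketch, not the L implementation: evaluate a tiny subset
# of forms (seq, =, *) held as nested tuples.

def evaluate(form, env):
    if isinstance(form, int):
        return form                      # literal
    if isinstance(form, str):
        return env[form]                 # variable lookup
    head, *args = form
    if head == "seq":
        if not args:
            raise ValueError("seq must have at least one form")
        result = None
        for sub in args:                 # each subform in turn
            result = evaluate(sub, env)
        return result                    # value of the last one
    if head == "=":
        name, expr = args
        env[name] = evaluate(expr, env)  # assumption: `=' yields the value
        return env[name]
    if head == "*":
        left, right = args
        return evaluate(left, env) * evaluate(right, env)
    raise ValueError(f"unknown form: {head}")

env = {}
tree = ("seq", ("=", "x", 3), ("=", "y", 4), ("*", "x", "y"))
print(evaluate(tree, env))   # 12
```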

 -- Language construct: block (form1..., form_n)
     `block' acts like `seq', except that it also begins a new block of
     code.  All subsequent `let' definitions have a scope that ends at
     the end of the current block.  Like `seq', `block' must have at
     least one form.

     Note that L's blocks can return a value, unlike C's.  The syntax
     for L blocks thus differs from that of C (and even GNU C):

          let Int a = { let Int sum = 0;
                         for(let Int i = 0; i <= 10; i++)
                           { sum += i; }
                         sum };

     This code creates two local variables to do a local calculation,
     before returning a result.  This is particularly handy in
     combination with macros.

 -- Language construct: let (type_form, variable_name)
     `let' creates a new local variable that exists from its
     declaration until the end of the block.

     A `let' form can be used as an lvalue, as in the construct:

          let Int i = 3;

     which has the abstract syntax tree:

          (= (let Int i) 3)

     but it cannot be used as an rvalue.

3.3.1.2 Constructs for global structure
.......................................

 -- Language construct: define (definition_type name rest)
     `define' defines NAME as being a DEFINITION_TYPE with value REST.
     This special form simply calls the definer associated with
     DEFINITION_TYPE, with parameters DEFINITION_TYPE, NAME, and REST.

     Examples of DEFINITION_TYPE are `function' (for defining new
     functions), `type' (for creating new types), `expander', and
     `thread-local' (unimplemented) and `global' (for thread-local and
     global variables).

     See *Note Definers:: for the details.
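
  The dispatch performed by `define' can be sketched as follows.  This
is Python, not L; the names `definers', `register_definer' and
`define_function' are hypothetical and only illustrate the idea of
looking up a definer by definition type.

```python
# Illustrative sketch, not the L implementation: `define' looks up the
# definer registered for a definition type and calls it.

definers = {}          # definition type -> definer function (hypothetical)
functions = {}         # names defined by the hypothetical `function' definer

def register_definer(definition_type, definer):
    definers[definition_type] = definer

def define(definition_type, name, rest):
    """Call the definer associated with definition_type."""
    try:
        definer = definers[definition_type]
    except KeyError:
        raise ValueError(f"no definer for {definition_type!r}") from None
    # The definer receives the same three parameters as `define'.
    return definer(definition_type, name, rest)

def define_function(definition_type, name, rest):
    functions[name] = rest
    return name

register_definer("function", define_function)
define("function", "square", ("lambda", ("x",), ("*", "x", "x")))
print(functions["square"])
```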

3.3.1.3 Constructs for changing flow of control
...............................................

 -- Language construct: goto (label_name) UNIMPLEMENTED
     `goto' branches to the label LABEL_NAME.  For the goto to be
     valid, the following condition must be met:

     The label must appear in a scope "accessible" by the goto
     instruction, i.e. either the current scope or any enclosing
     parent scope.  It is an error to jump to a label in an
     inaccessible scope.

 -- Language construct: return (value) UNIMPLEMENTED
     Aborts the execution of the current function and returns to the
     caller, with return value VALUE.

3.3.1.4 Constructs for iteration
................................

You will notice that L does not have any of the standard constructs for
iteration built in, like C's `for', `while', or `do ... while'.  These
can be defined by user libraries; so a standard library defines them,
but they are not part of the language.

 -- Language construct: loop (form)
     Repeatedly executes FORM until control leaves the loop through
     `break', or through one of the two preceding flow-control
     constructs: `goto' (if the label is outside of the loop) or
     `return'.  `continue' restarts FORM from its beginning.

 -- Language construct: break ()
     `break' exits the current loop; i.e., acts as if the enclosing
     loop had finished executing.

 -- Language construct: continue ()
     `continue' continues the execution at the beginning of the loop.
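
  As a sketch of how a library could build `while' on top of `loop'
and `break', the following Python function rewrites one tree into
another.  It is not the L standard library; `expand_while' is a
hypothetical name, and the `if' and `not' heads are assumptions made
for the example, since they are not documented in this section.

```python
# Illustrative sketch: express a `while' form using only `loop' and
# `break'.  The `if' and `not' heads are assumptions of this example.

def expand_while(form):
    """Rewrite ("while", cond, body) into a loop/break tree."""
    head, cond, body = form
    if head != "while":
        raise ValueError("expected a while form")
    return ("loop",
            ("seq",
             ("if", ("not", cond), ("break",)),   # leave when cond fails
             body))                               # otherwise run the body

tree = expand_while(("while", ("<", "i", 10), ("=", "i", ("+", "i", 1))))
print(tree)
```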

3.3.1.5 Construct for affectation
.................................

L can be close to hardware, and thus is an imperative language, in the
sense that it defines an imperative construct, `='.  By restricting its
use, you can obtain a purely functional language if you prefer; this is
discussed in later sections.

 -- Language construct: = (assignee expression)
     `=' assigns the value of EXPRESSION to ASSIGNEE, which must be a
     valid lvalue.

3.3.1.6 Constructs for pointer manipulation
...........................................

L permits the use of pointers; forbidding them results in a huge
performance penalty, as seen in the "modern" languages, and gives up
all hope of doing low level programming.  L is thus an unsafe language.

  However, L makes it easy to hide the use of pointers behind safe
constructs; if you make the usage of these constructs mandatory (which
is not feasible for now), you transform L into a safe, AND efficient,
language.

 -- Language construct: ref

 -- Language construct: deref
     `deref' can be used as an lvalue.

3.3.1.7 Constructs for structure manipulation
.............................................

3.3.1.8 Construct for aggregating several values together
.........................................................

 -- Language construct: tuple expression_1 ... expression_n
     L has a builtin notion of tuple that is quite different from what
     is called a tuple in many other languages.

     The only purpose of L's tuple is to consider several values at
     once.  In particular, _no special assumption is made about the
     location of the components of the tuple_.  Unlike structure
     fields, for instance, tuple components are not necessarily placed
     contiguously in memory.

     The following code:
          (block (let Int a)
                 (let Int b)
                 (tuple a b))

     does not do anything, for instance.

     Tuples are really useful when it comes to _simultaneous
     assignments_.  For instance:

          (a,b) = (b,a);

     exchanges the values in `a' and `b'.  As there is no assumption
     about the memory placement of the tuple, potentially optimized
     instructions can be used here; for instance, if `a' and `b' were
     stored in registers, the above example could use the x86
     instruction `xchg', which is never used by a normal C compiler.

     L defines the order of evaluation of the expressions in a tuple to
     be from left to right.  Unlike C programs, L programs can rely on
     this fact.

     It is also important to notice that all expressions of the tuple
     are evaluated before the assignment takes place.  This is what
     makes the above example work, as opposed to sequential
     assignment.
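
  The difference between simultaneous and sequential assignment can be
sketched in Python (not L).  The helper `assign_tuple' is hypothetical;
it evaluates every expression from left to right before storing any
result, which is the rule described above.

```python
# Illustrative sketch in Python, not L: simultaneous versus sequential
# assignment.

def assign_tuple(env, targets, expressions):
    """Evaluate every expression left to right, then store the results."""
    values = [expr(env) for expr in expressions]   # all evaluated first
    for name, value in zip(targets, values):       # then assigned
        env[name] = value

env = {"a": 1, "b": 2}
assign_tuple(env, ["a", "b"], [lambda e: e["b"], lambda e: e["a"]])
print(env)            # {'a': 2, 'b': 1}

# Sequential assignment loses the old value of `a'.
seq_env = {"a": 1, "b": 2}
seq_env["a"] = seq_env["b"]
seq_env["b"] = seq_env["a"]
print(seq_env)        # {'a': 2, 'b': 2}
```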

3.3.1.9 Construct for function calling
......................................

funcall: calls a function.  Takes a tuple as an argument, returns a
tuple: thus we have multiple return values (this removes the need for
many uses of pointers).

  UNIMPLEMENTED: partial assignment of a tuple, using _:

     (x,_,color) = f(i);  f: (Float,optional Float, Color)<-(Int);
     (x,y,_) = f(i);   f: (Float,Float,optional Color)<-(Int);
     (x,y) = f(i);   f: (Float,Float,optional Color)<-(Int);
     x = f(i);   f: (Float,optional Float,optional Color)<-(Int);
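
  For comparison only, here is how multiple return values replace
output pointers in Python (not L).  The function `divide' is a
hypothetical example, and `_' is merely a Python naming convention for
an ignored component; it does not model L's optional return values.

```python
# Illustrative sketch in Python, not L: a function returns several
# values at once instead of writing through pointer arguments.

def divide(numerator, denominator):
    """Return quotient and remainder together."""
    return numerator // denominator, numerator % denominator

quotient, remainder = divide(17, 5)
print(quotient, remainder)          # 3 2

_, only_remainder = divide(17, 5)   # ignore the first component
print(only_remainder)               # 2
```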

3.3.2 Definers
--------------

Definers are used to define new global values.  The standard L Cocytus
package provides several standard definers:

 -- Definer type: function
     This definer makes it possible to define new functions, that can
     later be compiled.

 -- Definer type: type
     With this definer, you can create new types.

  L's style encourages the creation of new definers.  For instance,
imagine that you are developing a PCI device driver interface.  Then, a
device driver should look like:

     pci_device_driver my_driver
     {
       name: "my-driver";
       probe: my_probe_function;
       remove: my_remove_function;
       suspend: my_suspend_function;
       resume: my_resume_function;
     }

     Error_Code my_probe_function(Pci_Info inf)
     { ... }

  Any change of interface would be immediately known at compile time:
for instance, if `suspend' is deprecated, the definer can warn the
developer when it is used; and so on.
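
  The kind of check such a definer could perform is sketched below in
Python (not L).  Everything here is an assumption made for
illustration: the function `define_pci_device_driver', the choice of
required fields, and the idea that `suspend' is the deprecated one.

```python
import warnings

# Hypothetical field tables for a `pci_device_driver' definer.
REQUIRED = {"name", "probe", "remove"}     # assumed mandatory fields
DEPRECATED = {"suspend"}                   # assumed deprecated field
KNOWN = REQUIRED | DEPRECATED | {"resume"}

def define_pci_device_driver(name, fields):
    """Check a driver definition at the time it is defined."""
    unknown = set(fields) - KNOWN
    if unknown:
        raise ValueError(f"{name}: unknown fields {sorted(unknown)}")
    missing = REQUIRED - set(fields)
    if missing:
        raise ValueError(f"{name}: missing fields {sorted(missing)}")
    for field in sorted(DEPRECATED & set(fields)):
        warnings.warn(f"{name}: field '{field}' is deprecated",
                      DeprecationWarning, stacklevel=2)
    return {"driver": name, **fields}

driver = define_pci_device_driver(
    "my_driver",
    {"name": "my-driver", "probe": "my_probe_function",
     "remove": "my_remove_function", "resume": "my_resume_function"})
print(driver["name"])   # my-driver
```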

  UNIMPLEMENTED: This is also how Cocytus backends should be declared:
     Cocytus_Backend pretty_GNU_C_backend
     {
       let: pgc_compile_let;
       =: pgc_compile_assign;

     }

  A warning or error message could then be issued when a Cocytus backend
does not follow a change in Cocytus, or does not fully implement it,
for instance.

  Finally, the ability to add new definers makes it possible to
transform L into a declarative language, or more generally to use
declarative constructs: writing what you want to have instead of
writing how to obtain it.

3.3.3 Chunk
-----------

Cocytus is compiled by chunks.  If you reference something, it has to
be defined in the same chunk, or already defined in a previous chunk.
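
  The chunk rule can be sketched as a small check, written in Python
(not L).  The function `check_chunks' and its data layout are
hypothetical: each chunk is modelled as a pair of name sets, the names
it defines and the names it references.

```python
# Illustrative sketch, not the Cocytus compiler: verify that every
# reference is defined in the same chunk or in a previous one.

def check_chunks(chunks):
    """Return a list of (chunk_index, name) for unresolved references."""
    known = set()
    errors = []
    for index, (definitions, references) in enumerate(chunks):
        visible = known | set(definitions)    # same chunk or earlier ones
        for name in sorted(set(references) - visible):
            errors.append((index, name))
        known = visible
    return errors

chunks = [
    ({"f"}, {"f", "g"}),   # `g' is only defined in a later chunk
    ({"g"}, {"f"}),        # `f' comes from the previous chunk
]
print(check_chunks(chunks))   # [(0, 'g')]
```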

3.4 Malebolge
=============

TO_WRITE

3.4.1 Type classes
------------------

3.4.2 Defining new macros
-------------------------

3.4.3 Creating new expanders
----------------------------

3.4.4 Defining coercions
------------------------