2.1.2.5 Macros

Macros are L's replacement for the C preprocessor and for C++ templates. They let you factor recurring patterns out of your code, so that it is clearer and easier to understand.

More specifically, macros are a templating language that replaces a construct by a sequence of code.

Introduction

Let's start with an example to see how it works:

     macro square(x)
     { let _value = $x;
       _value * _value }

The effect of the macro is as follows: if you write:

     let Int number = square(4);

It will be converted into:

     let Int number = {let _value = 4; _value * _value; };

thus yielding 16.

If you had typed:

     let Float number = square(0.5);

it would have been converted to

     let Float number = {let _value = 0.5; _value * _value; };

thus yielding 0.25.

The process of converting a “macro call” into what it stands for is called macro expansion.

Here you can see how macros are type independent, since you can use them with Floats and Ints. In fact, you can use them on every type T for which the operation *(T,T) exists.

Here you can see the interest of:

Macros and type

One of the distinguishing features of L macros is that they are typed. L macros do not just perform text rewriting, like the C preprocessor: they can check types, and even act differently according to the type.

     macro power(Float x, Float y)
     {
       powf($x, $y)
     }
     
     macro power(Long x, Long y)
     {
       powl($x, $y)
     }

Note: this is in fact how overloading is implemented. UNIMPLEMENTED: or will be.

The use of types allows earlier type-checking. If we take the previous example:

     macro square(x)
     { let _value = $x;
       _value * _value }

And if we write square("25"), we will end up with:

     error: (in expansion from square)
          :  in file test.c, line 25:
            * does not accept (String, String) arguments.

If we had written instead:

     macro square(Int x)
     { let _value = $x;
       _value * _value }

The error message would have become:

     error: in file test.c, line 25: square does not accept String argument.

which is much more readable.

But if we want the definition of square to be more generic (for instance, to allow both Ints and Floats, and any type T that has a *(T,T) operation), we can specify the macro like this:

     macro square(x : numeric)
     { let _value = $x;
       _value * _value }

UNIMPLEMENTED: this relies on type classes, which are not yet implemented.
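
For readers coming from C, the idea of a macro that accepts several numeric types and rejects everything else can be approximated with the C11 _Generic selection. The following is only a comparison sketch in C, not L code; the helper names square_int and square_double are hypothetical.

```c
/* Hypothetical helpers: one implementation per accepted argument type. */
static int    square_int(int x)       { return x * x; }
static double square_double(double x) { return x * x; }

/* _Generic selects a function from the static type of (x). The controlling
   expression is not evaluated, so x is evaluated exactly once, in the call.
   A type with no association (a string, for instance) is rejected at the
   point where the macro is used. */
#define SQUARE(x) _Generic((x),     \
        int:    square_int,         \
        double: square_double)(x)
```

With these definitions, SQUARE(4) calls square_int and SQUARE(0.5) calls square_double, while SQUARE("25") does not compile.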

Use of macros

The main use of macros is not square-like examples: inline functions exist for that (even though inline functions are implemented as macros in L).

Macros are mainly useful for introducing new programming constructs. For instance, in L, while is not part of the language: it is defined as a macro. Here is its definition:

     macro while(Bool condition, body)
     {
       loop {
         if(!$condition)
           break;
         $body; }
     }

Then while can be used like this:

     let i = 25;
     while(i > 0,
         { print(i--, '\n');
         });

This isn't really like the C while. To really obtain the C definition in this case, you also have to hook the parser: this is explained in Extending the syntax.
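
The same construction (an endless loop plus a conditional break) can be tried out with the C preprocessor. This is a rough C rendering for experimentation, not L code; the names WHILE and countdown_sum are hypothetical, and unlike an L macro this one performs no type checking on its condition.

```c
/* Hypothetical C rendering of the while macro: an endless loop that
   leaves as soon as the condition is false. */
#define WHILE(condition, body) \
    for (;;) {                 \
        if (!(condition))      \
            break;             \
        body;                  \
    }

/* Adds start + (start - 1) + ... + 1, looping with the macro. */
static int countdown_sum(int start)
{
    int i = start;
    int sum = 0;
    WHILE(i > 0, { sum += i; i--; })
    return sum;
}
```

As in the L version, the body is passed as a second argument rather than written after the closing parenthesis.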

++ and -- are also not part of the language, but defined as macros, and could be defined as:

     //pre_inc(x) is the same as ++x
     macro pre_inc(x)
     { $x = $x + 1 }
     
     //post_inc(x) is the same as x++
     macro post_inc(x)
     { let _temp = $x;
       $x = $x + 1;
       _temp }
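
The difference between the two macros lies only in the value they yield. A small C sketch makes this explicit, assuming plain functions that take a pointer in place of L's macro arguments; these functions are illustrative and are not how L implements the operators.

```c
/* Yields the value after the increment, like ++x. */
static int pre_inc(int *x)
{
    *x = *x + 1;
    return *x;
}

/* Yields the value before the increment, like x++. */
static int post_inc(int *x)
{
    int temp = *x;
    *x = *x + 1;
    return temp;
}
```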

Note: In fact, to make the Cocytus more readable, and the C output look more like real code, while, ++ and -- are part of the Cocytus language. But their default implementations are the above ones.

Care with macros

Use macros with care: for instance if you define

     macro square(Int x)
     {$x * $x}

Then square(i++) will be transformed into (i++) * (i++) which is certainly not what you want.

In this example, you must write:

     macro square(Int x)
     { let _x_value = $x;
       _x_value * _x_value }

That would yield the correct result. (Note: in this case, an inline function would have been better than a macro).
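
The C preprocessor has the same pitfall, which makes it easy to observe. In the sketch below, the argument is a call to a hypothetical next_value function that counts how often it runs, because (i++) * (i++) has undefined behaviour in C. All names are illustrative.

```c
/* Counts how many times the argument expression is evaluated. */
static int calls = 0;
static int next_value(void) { return ++calls; }

/* Inserts its argument twice, so the argument is evaluated twice. */
#define SQUARE_UNSAFE(x) ((x) * (x))

/* A function evaluates its argument once, before the call. */
static int square_safe(int x) { return x * x; }

static int calls_made_by_unsafe(void)
{
    calls = 0;
    (void)SQUARE_UNSAFE(next_value());
    return calls;
}

static int calls_made_by_safe(void)
{
    calls = 0;
    (void)square_safe(next_value());
    return calls;
}
```

Here calls_made_by_unsafe() reports two evaluations and calls_made_by_safe() reports one, which is the reason for binding the argument to a local value first.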

In general, it is dangerous to insert an argument directly more than once; do so only when repeated evaluation is the effect you want to achieve, as in

     macro do_twice(x)
     { $x;
        $x;
     }

The second problem you can encounter is called variable capture. For instance, suppose you define do_times like this (the following example is inspired by Paul Graham's On Lisp):

     macro do_times(Unsigned_Int max, body)
     { let max_value = $max;
       let i = 0;
       loop {
       if(i++ == max_value)
          break;
        $body;
        }
     }

then your code would work in most cases:

     do_times(4) { print("hello\n"); } //print hello four times

But if you write:

     let i = 0;
     do_times(4) { i++; }
     //Here, i is still 0

(Note: if you had written do_times(5) instead of do_times(4), then the code would never have returned.)

The reason is that the local variable i used by the do_times macro and the one used in the block passed as an argument “clash” because they have the same name. Here is the full expansion of the above call:

     let i = 0;
     { let max_value = 4;
       let i = 0;
       loop {
       if(i++ == max_value)
          break;
        { i++ };
       }
     }
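
The same capture can be reproduced with a C preprocessor macro, which is a convenient way to watch it happen. This is a C sketch with hypothetical names (DO_TIMES_CAPTURING, capture_demo); the runs counter is an addition that records how often the body executed.

```c
/* Unhygienic on purpose: the macro's own counter is named i, so any i
   used in the body refers to the macro's counter, not the caller's. */
#define DO_TIMES_CAPTURING(max, body)  \
    do {                               \
        unsigned max_value = (max);    \
        unsigned i = 0;                \
        for (;;) {                     \
            if (i++ == max_value)      \
                break;                 \
            body;                      \
        }                              \
    } while (0)

/* Returns the caller's i after the loop; *runs counts body executions. */
static unsigned capture_demo(unsigned *runs)
{
    unsigned i = 0;
    DO_TIMES_CAPTURING(4, { i++; (*runs)++; });
    return i;
}
```

The caller's i is returned unchanged, and the body runs only twice instead of four times, because each i++ in the body advances the macro's own counter.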

The naive solution to this problem is to use such ugly names that they can never clash with identifiers that a programmer would choose, like srchroukjuk239. This doesn't really make your code pretty, and worse, it does not work when macro definitions are nested:

     do_times(4) { do_times(5) { print("Hello\n");}}

The real solution to this problem is to use generated symbols, i.e. to generate, for each expansion, a new symbol for the name of the variable; such symbols are guaranteed to be unique, and thus never clash.

L provides a simple way to do this: just prepend a '_' to the identifiers in your template code. L will recognize this and generate a new symbol for each expansion of the macro. (Note: moreover, the symbols are generated in such a way that they are readable, and that you can find out in which macro they were generated. This simplifies the debugging of macros.)

In normal code, you are not allowed to begin an identifier with an underscore (in C, such identifiers are largely reserved for the implementation).

Thus, the correct definition for the above macro is:

     macro do_times(Unsigned_Int max, body)
     { let _max_value = $max;
       let _i = 0;
       loop {
       if(_i++ == _max_value)
          break;
        $body;
        }
     }

And macro expansion becomes:

     let i = 0;
     { let do_times#max_value_1 = 4;
       let do_times#i_1 = 0;
       loop {
       if(do_times#i_1++ == do_times#max_value_1)
          break;
        { i++ };
       }
     }
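
C has no generated symbols, but a common approximation is to paste the line number of the macro call onto each local name. The sketch below assumes this __LINE__ idiom and hypothetical names (UNIQUE, DO_TIMES, hygienic_demo). It only reduces the risk of a clash: two calls written on the same line, or a user variable that happens to use the same prefix and number, would still collide, which is what L's generated symbols rule out.

```c
/* Two-step concatenation so that __LINE__ is expanded before pasting. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b)  CONCAT_(a, b)
/* Appends the line number of the macro call, e.g. do_times_i_42. */
#define UNIQUE(name)  CONCAT(name, __LINE__)

#define DO_TIMES(max, body)                                      \
    do {                                                         \
        unsigned UNIQUE(do_times_max_) = (max);                  \
        unsigned UNIQUE(do_times_i_) = 0;                        \
        for (;;) {                                               \
            if (UNIQUE(do_times_i_)++ == UNIQUE(do_times_max_))  \
                break;                                           \
            body;                                                \
        }                                                        \
    } while (0)

/* Returns the caller's i after the loop; *runs counts body executions. */
static unsigned hygienic_demo(unsigned *runs)
{
    unsigned i = 0;
    DO_TIMES(4, { i++; (*runs)++; });
    return i;
}
```

This time the body runs four times and the caller's i ends at 4, since the macro's counter no longer shares a name with it.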

Note: You may wonder why all let definitions are not simply transformed into generated symbols. First, sometimes you may want to generate a new symbol, even if there is no let (imagine, for instance, that you have written a my_let macro that expands into let).

Second, there are some cases where variable capture is actually wanted, and it would be silly to prevent it.

L can however generate a warning when it detects a variable capture caused by a let; this warning can be deactivated like this:

     macro with_log_level(number, body) captures(log_level)
     {
       let log_level = $number;
       $body;
     }

Note, however, that this changes log_level only in the current lexical scope; compare it with the following version, which saves and restores the existing variable:

     macro with_log_level(number, body)
     {
      let _save_log_level = log_level;
      log_level = $number;
      $body;
      log_level = _save_log_level;
     }
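
The difference between the two versions shows up as soon as the body calls code defined elsewhere. The following C sketch assumes a global log_level and hypothetical macro and function names; it illustrates the scoping behaviour only and is not L code.

```c
static int log_level = 1;

/* Stands for code outside the macro's lexical scope. */
static int level_seen_by_callee(void) { return log_level; }

/* First version: declares a new local that shadows the global. */
#define WITH_LOG_LEVEL_LOCAL(number, body) \
    do { int log_level = (number); body; } while (0)

/* Second version: changes the global, then restores it. */
#define WITH_LOG_LEVEL_SAVED(number, body)   \
    do {                                     \
        int saved_log_level_ = log_level;    \
        log_level = (number);                \
        body;                                \
        log_level = saved_log_level_;        \
    } while (0)

/* Returns what a callee sees; *level_in_body is what the body sees. */
static int callee_level_with_local(int *level_in_body)
{
    int seen = 0;
    WITH_LOG_LEVEL_LOCAL(5, { *level_in_body = log_level; seen = level_seen_by_callee(); });
    return seen;
}

static int callee_level_with_saved(void)
{
    int seen = 0;
    WITH_LOG_LEVEL_SAVED(5, { seen = level_seen_by_callee(); });
    return seen;
}
```

With the shadowing version the body sees 5 but the callee still sees 1; with the save-and-restore version the callee sees 5, and the global is back to 1 afterwards.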

Using multiple levels of macros

Use of macros can be nested, as in:

     let x = 0;
     while(i <= 25) { i = square(x); print(i); x++; }

In this case, inner expansions occur before outer expansions. That is to say, the above example is first converted into:

     let x = 0;
     while(i <= 25) { i = {let square#temp = x; square#temp * square#temp }; print(i); x++; }

Before being converted to:

     let x = 0;
     loop {
        if(!(i <= 25)) break;
        {
           i = {let square#temp = x; square#temp * square#temp };
           print(i);
           x++;
        }
     }

Conclusion

Macros are a templating language for L; they make code more understandable and let the compiler do work for you.

The combination of macros and types allows many powerful definitions that depend on the type of the arguments. You can really define new programming language constructs, and use them as if they were part of the language.

Note: as you have to use { } in a macro, variables declared in a macro are always local to the macro, and are thus limited to computation inside the macro. This catches programmer errors that are hard to find.

If you want to implicitly declare new variables in your code, you have to use the more powerful Expanders. In general, if you want to generate code and a templating language is not powerful enough for the task, you have to use the more general Expanders.

Note: using macros creates a lot of redundant code that a programmer would not have written. Fortunately, this redundancy is often eliminated when the code is converted to SSA form:

     let a = { let temp i; i = 3; i };

is converted into:

     let a = 3;

in the optimisation pass of the compiler.