Variable Expression Library

Peter Simons


Table of Contents

Introduction
Purpose of this library
Downloading the latest version
Supported Expressions
Variable Expressions
Operations on Variables
Quoted Pairs
Arrays of Variables
Looping
The Complete EBNF Grammar
The varexp::expand Function
Writing Lookup Callbacks
Configuring the Parser
The varexp::unescape Function
Exceptions Thrown by libvarexp
Example Program
License

Introduction

Purpose of this library

libvarexp is a C++ library that allows its users to detach any kind of information from the representation of that information by providing a simple-to-use but powerful text-template mechanism. Similar mechanisms have been available in tools like sh(1), make(1), or perl(1) forever and have proven to be very useful. The basic idea is that the relevant information is made available in variables, which the author of the template can then use within the text itself as he or she sees fit.

Consider, for example, a tool that calculates the monthly financial reports of a small company. Such a program should only calculate the required values; it should not worry about writing the resulting reports into an HTML file, a CSV file, or whatever format is desired. Instead, it should make the results of the calculation available in the variables $TURNOVER, $PROFIT, and $INCREASE. Then, using libvarexp, it could load an arbitrary template file and have the actual values inserted at the appropriate positions. Without changing a single line of code, one could generate the monthly report in HTML:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title>Financial Report</title>
  </head>
  <body>

    <h1>Financial report</h1>

    <p>This month, our glorious company reached a
total turnover of ${TURNOVER} Euros, totalling ${PROFIT}
Euros before taxes. That means we have increased our
profit by ${INCREASE} percent compared to last
month.</p>

  </body>
</html>

Or you can send it out as a plain-text e-mail:

From: nobody@example.org (Monthly Financial Data)
Subject: This month's financial report

Dear Colleagues,

we have earned a total of ${PROFIT} Euros this month!
This means that we have increased profits by ${INCREASE}
percent compared to last month, totalling a turnover
of ${TURNOVER} Euros.

Sincerely yours,
    The Statistics Program

Even better, by using such templates to generate the output you are effectively independent of the language you choose! The last report, for example, could also read:

From: nobody@example.org (Grssssötmpf!)
Subject: Ajahaha Mzoodeutschmark

Nuwarskvu,

Quhußaour Maou Ahosetuh Cravullitstziki Nakaou ${PROFIT}
Akqissäeüß Blaga: ${TURNOVER} Stauhr ${INCREASE}!!!!

This version is -- as you can probably easily recognize -- in German.

libvarexp offers application developers two functions that do all this for them. In addition, the end-user has numerous ways not only to insert variables into his template files but also to modify the variables' contents on the fly, do full-blown regular expression search-and-replaces, or loop over the contents of arrays of variables.

Furthermore, the parser included in libvarexp can be re-configured to use tokens different from the ones shown in the examples; one could as well use %{NAME}, change the set of allowed characters for variable names, etc.

And last but certainly not least, these variables are not limited to environment variables at all. The programmer is free to provide a callback function to libvarexp that will be used to map a variable name to its contents. Thus, the variables your application provides can reside entirely inside the application. In fact, they can reside pretty much anywhere you want and contain pretty much anything you want, as long as you write the callback.

Downloading the latest version

The latest version of libvarexp available for download is varexp-1.2.tar.gz.

Supported Expressions

Variable Expressions

libvarexp distinguishes variables into simple and complex expressions. A simple expression has the form $NAME and will basically only replace the variable in the text buffer with its contents. Complex expressions have the form ${NAME:operation1:operation2:…} and may perform various operations on the variable's contents before inserting it into the text buffer.

Please note that due to the way simple expressions are parsed, it may not always be possible to use the simple-expression form even though you do not want to perform any operations. If your input text was This is a $FOObar, but the trailing bar is meant to be a literal string, you'd have to use This is a ${FOO}bar, because the parser will interpret any valid variable-name character following the dollar as part of the variable name; it will not recognize that $FOO would exist while $FOObar would not.
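The greedy scan described above can be sketched in a few lines. This is only an illustration that assumes the default name characters a-zA-Z0-9_; scan_simple_name is a hypothetical helper and not part of libvarexp:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the variable name that starts right after the '$' found at
// position `dollar`. Assumes the default name characters [a-zA-Z0-9_].
std::string scan_simple_name(const std::string& text, std::size_t dollar)
{
    std::size_t end = dollar + 1;
    while (end < text.size() &&
           (std::isalnum(static_cast<unsigned char>(text[end])) || text[end] == '_'))
        ++end;                          // greedy: take every valid name character
    return text.substr(dollar + 1, end - dollar - 1);
}
```

For the text This is a $FOObar the helper yields FOObar, which is why the braces are needed to stop the name after FOO.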

Also, libvarexp does not interpret the case of variable names in any way. For the library, $FoObAr and $fOoBaR are just strings -- whether they refer to the same variable or not is entirely up to the application that provides the callback used to resolve variables to their contents.

If you want to enter a text like $foo literally, you'll have to escape the $ sign by prefacing it with a backslash: \$foo. Then libvarexp won't interpret this expression as a variable.

Operations on Variables

In addition to just inserting the variable's contents into the buffer, you can use various operations to modify its contents before the expression is expanded. Such operations are used by appending a colon plus the appropriate command character to the variable name in a complex expression, for example: ${FOOBAR:l}. Furthermore, you can chain any number of operations simply by appending another command to the last one: ${FOOBAR:l:u:l:u:…}.

The supported operations are:

${NAME:#}

This operation will expand the expression to the length of the contents of $NAME. If, for example, $FOO is foobar, then ${FOO:#} will result in 6.

${NAME:l}

This operation will turn the contents of $NAME to all lower-case, using the system routine tolower(3).

${NAME:u}

This operation will turn the contents of $NAME to all upper-case, using the system routine toupper(3).
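To illustrate how chained operations are applied from left to right, here is a small sketch of the three operations described so far. apply_ops is a hypothetical helper that takes the command characters without the separating colons; it does not reflect how libvarexp is implemented internally:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Applies the command characters in `ops` one after another:
// 'l' lower-cases, 'u' upper-cases, '#' replaces the value by its length.
std::string apply_ops(std::string value, const std::string& ops)
{
    for (char op : ops) {
        if (op == '#') {
            value = std::to_string(value.size());
            continue;
        }
        for (char& c : value) {
            const unsigned char uc = static_cast<unsigned char>(c);
            if (op == 'l') c = static_cast<char>(std::tolower(uc));
            if (op == 'u') c = static_cast<char>(std::toupper(uc));
        }
    }
    return value;
}
```

Calling apply_ops("FooBar", "lu") corresponds to ${NAME:l:u}: the value is first lower-cased and then upper-cased again.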

${NAME:*word}

This operation will expand to word if $NAME is empty. If $NAME is not empty, it will expand to an empty string.

word can be an arbitrary text. In particular, it may contain other variables or even complex variable expressions, for example: ${FOO:*${BAR:u}}.

${NAME:-word}

This operation will expand to word if $NAME is empty. If $NAME is not empty, it will expand to $NAME's contents.

word can be an arbitrary text. In particular, it may contain other variables or even complex variable expressions, for example: ${FOO:-${BAR:u}}.

${NAME:+word}

This operation will expand to word if $NAME is not empty. If $NAME is empty, it will expand to an empty string.

word can be an arbitrary text. In particular, it may contain other variables or even complex variable expressions, for example: ${FOO:+${BAR:u}}.
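The three conditional operations differ only in which of the two values they select. The following sketch restates the rules given above; apply_conditional is a hypothetical name, and word is assumed to have been expanded already:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Selects the result of ${NAME:*word}, ${NAME:-word}, or ${NAME:+word}
// from the variable's contents (`value`) and the already expanded `word`.
std::string apply_conditional(char op, const std::string& value, const std::string& word)
{
    switch (op) {
    case '*': return value.empty() ? word : std::string();  // word only if empty
    case '-': return value.empty() ? word : value;          // word as default
    case '+': return value.empty() ? std::string() : word;  // word only if set
    default:  throw std::invalid_argument("unsupported operation");
    }
}
```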

${NAME:ostart,end}

This operation will expand to a part of $NAME's contents, which starts at start and ends at end. Both parameters start and end are unsigned numbers.

Please note that the character at position end is included in the result; ${FOOBAR:o3,4}, for instance, will return a two-character string. Also, please note that start positions begin at zero (0)!

If the end parameter is not specified, as in ${FOOBAR:o3,}, the operation will return the string starting from position 3 to the end of the string.

${NAME:ostart-length}

This operation will expand to a part of $NAME's contents, which starts at start and is length characters long. Both parameters start and length are unsigned numbers.

${FOOBAR:o3-4}, for example, means to return the next 4 characters starting at position 3 in the string. Please note that start positions begin at zero (0)!

If the length parameter is left out, as in ${FOOBAR:o3-}, the operation will return the string from position 3 to the end.
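The difference between the two offset forms is easy to get wrong, so here is a sketch of both. The helper names are hypothetical, and the bounds checks merely stand in for the exceptions the library throws for out-of-range offsets:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// ${NAME:ostart,end}: characters start..end, both inclusive, zero-based.
std::string cut_range(const std::string& s, std::size_t start, std::size_t end)
{
    if (start > end || end >= s.size())
        throw std::out_of_range("range outside of string");
    return s.substr(start, end - start + 1);
}

// ${NAME:ostart-length}: `length` characters beginning at `start`.
std::string cut_length(const std::string& s, std::size_t start, std::size_t length)
{
    if (start > s.size() || length > s.size() - start)
        throw std::out_of_range("range outside of string");
    return s.substr(start, length);
}

// ${NAME:ostart,} and ${NAME:ostart-}: everything from `start` to the end.
std::string cut_tail(const std::string& s, std::size_t start)
{
    if (start > s.size())
        throw std::out_of_range("offset outside of string");
    return s.substr(start);
}
```

With the contents foobarbaz, cut_range(s, 3, 4) gives the two characters ba, while cut_length(s, 3, 4) gives the four characters barb.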

${NAME:s/pattern/string/gti}

This operation will perform a search-and-replace operation on the contents of $NAME and return the result. The behavior of the search-and-replace may be modified by the following flags: If the t flag has been provided, a plain-text search-and-replace is performed; otherwise, the default is to do a regular expression search-and-replace as in the system utility sed(1). If the g flag has been provided, the search-and-replace will replace all instances of pattern with string, instead of replacing only the first instance (the default). If the i flag has been provided, the search-and-replace will take place case-insensitively; otherwise, the default is to search case-sensitively.

The parameters pattern and string can be an arbitrary text. In particular, they may contain other variables or even complex variable expressions, for example: ${FOO:s/${BAR:u}/$FOO/ti}.
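As a rough illustration of the t and g flags, the sketch below performs a plain-text replacement of either the first or all occurrences. It deliberately leaves out regular expressions and the i flag; replace_text is a hypothetical helper, not the library's code:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Plain-text search-and-replace (the 't' flag). With `global` set (the 'g'
// flag) every occurrence is replaced, otherwise only the first one.
std::string replace_text(std::string s, const std::string& pattern,
                         const std::string& replacement, bool global)
{
    if (pattern.empty())
        throw std::invalid_argument("empty search string");
    std::size_t pos = 0;
    while ((pos = s.find(pattern, pos)) != std::string::npos) {
        s.replace(pos, pattern.size(), replacement);
        pos += replacement.size();      // continue behind the inserted text
        if (!global)
            break;
    }
    return s;
}
```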

${NAME:y/ochars/nchars/}

This operation will translate all characters in the contents of $NAME that are found in the ochars class to the corresponding character in the nchars class -- just like the system utility tr(1) does. Both ochars and nchars may contain character range specifications, for example a-z0-9. A hyphen as the first or last character of the class specification is interpreted literally. Both the ochars and the nchars class must contain the same number of characters after all ranges are expanded; otherwise, an error is raised.

If, for example, $FOO contains foobar, then ${FOO:y/a-z/A-Z/} would yield FOOBAR. Another goodie is to use that operation to ROT13-encrypt or decrypt a string with the expression ${FOO:y/a-z/n-za-m/}.

The parameters ochars and nchars can be an arbitrary text. In particular, they may contain other variables or even complex variable expressions, for example: ${FOO:y/${BAR:u}/$TEST/}.
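The translation can be pictured as two steps: expand both class specifications, then map each character by position. The sketch below follows the rules stated above; expand_class and translate are hypothetical names, and std::invalid_argument stands in for the library's own exceptions:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Expands ranges such as "a-z0-9" into the full list of characters.
// A hyphen in first or last position is taken literally.
std::string expand_class(const std::string& spec)
{
    std::string out;
    for (std::size_t i = 0; i < spec.size(); ++i) {
        if (i + 2 < spec.size() && spec[i + 1] == '-') {
            unsigned char lo = spec[i], hi = spec[i + 2];
            if (lo > hi)
                throw std::invalid_argument("range start exceeds range end");
            for (unsigned c = lo; c <= hi; ++c)
                out += static_cast<char>(c);
            i += 2;                     // skip the '-' and the range end
        } else {
            out += spec[i];
        }
    }
    return out;
}

// Replaces every character found in `ochars` by its counterpart in `nchars`.
std::string translate(std::string s, const std::string& ochars, const std::string& nchars)
{
    const std::string from = expand_class(ochars), to = expand_class(nchars);
    if (from.size() != to.size())
        throw std::invalid_argument("classes differ in size");
    for (char& c : s) {
        const std::size_t pos = from.find(c);
        if (pos != std::string::npos)
            c = to[pos];
    }
    return s;
}
```

Applying the ROT13 mapping twice, as in translate(translate(s, "a-z", "n-za-m"), "a-z", "n-za-m"), returns the original lower-case text.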

${NAME:p/width/string/align}

This operation will pad the contents of $NAME with string according to the align parameter, so that the result is at least width characters long. Valid parameters for align are l (left), r (right), or c (center). The string parameter may contain multiple characters, if you see any use for that.

If, for example, $FOO is foobar, then ${FOO:p/20/./c} would yield .......foobar.......; ${FOO:p/20/./l} would yield foobar..............; and ${FOO:p/20/./r} would yield ..............foobar.

The parameter string can be an arbitrary text. In particular, it may contain other variables or even complex variable expressions, for example: ${FOO:p/20/${BAR}/r}.
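The padding rules can be sketched as follows. Two details are assumptions of this sketch and not documented behavior: a multi-character fill string is repeated cyclically, and for centered output an odd number of fill characters puts the extra one on the right. pad is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Pads `s` to at least `width` characters. `align` is 'l', 'r', or 'c' and
// describes where the original text ends up, as in ${NAME:p/width/fill/align}.
std::string pad(const std::string& s, std::size_t width, const std::string& fill, char align)
{
    if (fill.empty())
        throw std::invalid_argument("empty fill string");
    if (s.size() >= width)
        return s;                       // already long enough
    const std::size_t total = width - s.size();
    // Assumption: for 'c' an odd remainder puts the extra character on the right.
    const std::size_t left = align == 'r' ? total : align == 'c' ? total / 2 : 0;
    std::string out;
    for (std::size_t i = 0; i < left; ++i)
        out += fill[i % fill.size()];   // assumption: fill is repeated cyclically
    out += s;
    for (std::size_t i = 0; i < total - left; ++i)
        out += fill[i % fill.size()];
    return out;
}
```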

Quoted Pairs

In addition to the variable expressions discussed in the previous sections, libvarexp can also be used to expand so-called quoted pairs in the text. Quoted pairs are well-known from programming languages like C, for example. A quoted pair consists of the backslash followed by another character, for example: \n.

Any character can be quoted by a backslash; the terms \= or \@, for instance, are valid quoted pairs. But these quoted pairs don't have any special meaning to the library and will be expanded to the quoted character itself. There are a number of quoted pairs, though, that do have a special meaning and expand to some other value. The complete list is shown below. Please note that the name quoted pair is actually a bit inaccurate, because libvarexp supports some expressions that are not pairs in the sense that they consist of more than one quoted character. But the name quoted pair is very common for them anyway, so I stuck with it.

The quoted pairs supported by libvarexp are:

\t, \r, \n

These expressions are replaced by a tab, a carriage return, and a newline, respectively.

\abb

This expression is replaced by the value of the octal number abb. Valid digits for a are in the range from 0 to 3; either position b may be in the range from 0 to 7. Please note that an octal expression is recognized only if the backslash is followed by three valid digits! The expression \1a7, for example, is interpreted as the quoted pair \1 followed by the verbatim text a7, because a is not valid for octal numbers.

\xaa

This expression is replaced by the value of the hexadecimal number aa. Both positions a must be in the range from 0 to 9 or from a to f. For the letters, either case is recognized, so \xBB and \xbb will yield the same result.

\x{…}

This expression denotes a set of grouped hexadecimal numbers. The part between the braces may consist of an arbitrary number of hexadecimal pairs, such as in \x{}, \x{ff}, or \x{55ffab04}. The empty expression \x{} is a no-op; it will not produce any output.

This construct may be useful to specify multi-byte characters (as in Unicode). \x{0102} is effectively equivalent to \x01\x02, but the grouping of values may be useful in other contexts, even though for libvarexp it makes no difference.
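To make the equivalence of \x{0102} and \x01\x02 concrete, here is a sketch that decodes the digits found between the braces. decode_hex_group and hex_value are hypothetical helpers, and std::invalid_argument stands in for the library's own exceptions:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Value of a single hexadecimal digit; either letter case is accepted.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hexadecimal digit");
}

// Decodes the digits of a \x{...} group, two digits per output byte.
std::string decode_hex_group(const std::string& digits)
{
    if (digits.size() % 2 != 0)
        throw std::invalid_argument("odd number of hexadecimal digits");
    std::string out;
    for (std::size_t i = 0; i < digits.size(); i += 2)
        out += static_cast<char>(hex_value(digits[i]) * 16 + hex_value(digits[i + 1]));
    return out;                         // empty input yields empty output
}
```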

Arrays of Variables

In addition to normal variables, libvarexp also supports arrays of variables. An array may only be accessed in a complex expression -- $NAME[1] is not correct syntax. Use ${NAME[1]} instead. The reason for this limitation is that the brackets used to specify the index ([ and ]) have a different meaning in ordinary text; see the section called “Looping” for further discussion.

Which variables are arrays -- and which are not -- is entirely up to the application developer. In some applications, every variable may be accessed as both a normal variable and an array. In other applications, normal variables and arrays are different things. libvarexp does not dictate this. There exists the convention that accessing an array with a negative index, such as ${ARRAY[-1]} should return the number of elements the array contains. But again, this is not a behavior required by libvarexp; different applications may behave differently here.

When specifying the index of the array's element you wish to access, you can use complete arithmetic expressions to calculate the entry. libvarexp supports the operators + (addition), - (subtraction), * (multiplication), / (division), and % (modulo).

These operations may be used on any signed integer. A valid expression is, for example: ${ARRAY[-12/4+5]}. Please note that libvarexp follows the usual operator precedence. To group expressions explicitly, put parentheses around them: ${ARRAY[-12/(2+4)]}.

In any place you can write a number in such an expression, you can also use a simple or complex variable expression. If $TWO is 2, the following expression would access the entry with index 5 in the $FOO array: ${FOO[10/$TWO]}.
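The precedence rules can be checked with a small recursive-descent evaluator. The sketch below handles integers, the five operators, unary signs, and parentheses; variables and the # token are left out to keep it short. index_parser and eval_index are hypothetical names and do not mirror libvarexp's parser:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

struct index_parser {
    const std::string& s;
    std::size_t pos;

    explicit index_parser(const std::string& text) : s(text), pos(0) {}

    int operand()                       // number, signed number, or ( ... )
    {
        if (pos < s.size() && s[pos] == '(') {
            ++pos;
            int v = sum();
            if (pos >= s.size() || s[pos] != ')')
                throw std::invalid_argument("missing ')'");
            ++pos;
            return v;
        }
        int sign = 1;
        if (pos < s.size() && (s[pos] == '+' || s[pos] == '-'))
            sign = (s[pos++] == '-') ? -1 : 1;
        if (pos >= s.size() || !std::isdigit(static_cast<unsigned char>(s[pos])))
            throw std::invalid_argument("number expected");
        int v = 0;
        while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
            v = v * 10 + (s[pos++] - '0');
        return sign * v;
    }

    int product()                       // * / % bind tighter than + -
    {
        int v = operand();
        while (pos < s.size() && (s[pos] == '*' || s[pos] == '/' || s[pos] == '%')) {
            char op = s[pos++];
            int rhs = operand();
            if (op != '*' && rhs == 0)
                throw std::domain_error("division by zero");
            v = op == '*' ? v * rhs : op == '/' ? v / rhs : v % rhs;
        }
        return v;
    }

    int sum()
    {
        int v = product();
        while (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
            char op = s[pos++];
            int rhs = product();
            v = op == '+' ? v + rhs : v - rhs;
        }
        return v;
    }
};

// Evaluates a complete index expression such as "-12/4+5".
int eval_index(const std::string& text)
{
    index_parser p(text);
    int v = p.sum();
    if (p.pos != text.size())
        throw std::invalid_argument("trailing characters");
    return v;
}
```

Under these rules -12/4+5 evaluates to 2, whereas -12/(2+4) evaluates to -2.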

Looping

Obviously, arithmetic in array indices would be quite pointless without a looping construct. libvarexp offers such a construct, which can model both a for and a while loop. Let's start with the while-like version, which is slightly simpler.

If the index delimiters [ and ] are found in the text, they start a looping construct. An example would be This is a test: [ $FOO ]. What happens now is that all text between the loop delimiters is repeated again and again until all variables found in the body of the loop say they're undefined for the current index. The current index starts counting at zero (0) and is increased with every iteration of the loop. In the index specifier of the variable, it is available as #.

Hence, if we assume that the variable ARRAY[] had three entries: entry1, entry2, and entry3, then the loop [${ARRAY[#]}] would expand to entry1entry2entry3. Once the counter reaches index 3, all arrays in the loop's body are undefined and the loop terminates.
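The termination rule can be modelled in plain C++. In this sketch each inner vector plays the role of one array referenced in the loop body as ${NAME[#]}; expand_loop is a hypothetical helper, and it assumes that an array which is already exhausted simply contributes nothing to the body:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Models a loop body that references every array in `arrays` with index '#'.
// The loop ends at the first index for which no array has an entry.
std::string expand_loop(const std::vector<std::vector<std::string> >& arrays)
{
    std::string out;
    for (std::size_t idx = 0; ; ++idx) {
        bool any_defined = false;
        std::string body;
        for (const std::vector<std::string>& a : arrays) {
            if (idx < a.size()) {       // lookup succeeds for this index
                any_defined = true;
                body += a[idx];
            }
        }
        if (!any_defined)
            break;                      // every lookup failed: the loop ends
        out += body;
    }
    return out;
}
```

A body without any array reference corresponds to an empty `arrays` vector, for which the very first iteration already finds nothing defined and the result is the empty string.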

That raises the question what the first example we presented, This is a test: [ $FOO ], would expand to? The answer is: To the empty string! The loop would start expanding the body with index 0 and right at the very first iteration, all arrays in the body were empty -- that is, no array would have been expanded, because there weren't any arrays.

Thus, this form of looping only makes sense if you do specify arrays in the loop's body. If you do, though, you can do some weird things, like [${ARRAY[#%2]}], which expands to ${ARRAY[0]} for even numbers and to ${ARRAY[1]} for odd numbers. But the expression has another property: It will never terminate, because the array lookup will never fail, assuming that indices 0 and 1 are defined!

That is unfortunate but can't be helped, I'm afraid. Users of libvarexp may choose to disable looping for the users of their application to prevent the end-user from shooting himself in the foot with infinite loops, though. But if you want to use loops, you must know what you're doing. There ain't no such thing as a free lunch, right?

There is another form of the looping construct available that resembles a for loop more closely. In this form, the start value, the step value, and the stop value of the loop can be specified explicitly like this: [$FOO]{start,step,stop}. This loop will start to expand the body using index start, it will increase the current index in each iteration by step, and it will terminate when the current index is greater than stop. (Please note that greater than is a concept that needs much thought if you use negative values here! There may be some infinite loops coming. You have been warned.)

If any of the first two values are omitted, the following defaults will be assumed: start = 0 and step = 1. If stop is omitted, the loop will terminate if none of the arrays in the loop's body is defined for the current index. Consequently, using the loop-limits {,,} is equivalent to not specifying any limits at all.

Since most users will not need the step parameter frequently, a shorter form {start,stop} is allowed, too.
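The sequence of indices visited by the for-like form can be sketched as below. loop_indices is a hypothetical helper; the max_iterations cap is an addition of this sketch that keeps a zero or negative step from running forever, which the library itself does not promise:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lists the indices visited by [body]{start,step,stop}: begin at `start`,
// add `step` after each iteration, stop once the index is greater than `stop`.
std::vector<int> loop_indices(int start, int step, int stop,
                              std::size_t max_iterations = 1000)
{
    std::vector<int> indices;
    for (int idx = start; idx <= stop && indices.size() < max_iterations; idx += step)
        indices.push_back(idx);
    return indices;
}
```

Note that stop itself is still visited: {1,2,7} yields the indices 1, 3, 5, and 7.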

By the way: Loops may be nested. :-)

To confuse the valued reader completely, let's look at this final example. Assume that the arrays ${FOO[]} and ${BAR[]} have the following values:

    FOO[0] = "foo0"
    FOO[1] = "foo1"
    FOO[2] = "foo2"
    FOO[3] = "foo3"

and

    BAR[0] = "bar0"
    BAR[1] = "bar1"

Then the expression:

    [${BAR[#]}: [${FOO[#]}${FOO[#+1]:+, }]${BAR[#+1]:+; }]

would expand to:

    bar0: foo0, foo1, foo2, foo3; bar1: foo0, foo1, foo2, foo3

Have fun!

The Complete EBNF Grammar

input      : ( TEXT
              | variable
              | START-INDEX input END-INDEX ( loop-limits )?
              )*

loop-limits: START-DELIM
                (num-exp)? ',' (num-exp)? ( ',' (num-exp)? )?
             END-DELIM


variable   : '$' (name|expression)

expression : START-DELIM (name|variable)+
             (START-INDEX num-exp END-INDEX)?
             (':' command)* END-DELIM

name       : (VARNAME)+

command    : '-' (EXPTEXT|variable)+
           | '+' (EXPTEXT|variable)+
           | 'o' (NUMBER ('-'|',') (NUMBER)?)
           | '#'
           | '*' (EXPTEXT|variable)+
           | 's' '/' (variable|SUBSTTEXT)+ '/' (variable|SUBSTTEXT)* '/'
             ('g'|'i'|'t')*
           | 'y' '/' (variable|SUBSTTEXT)+ '/' (variable|SUBSTTEXT)* '/'
           | 'p' '/' NUMBER '/' (variable|SUBSTTEXT)* '/' ('r'|'l'|'c')
           | 'l'
           | 'u'

num-exp    : operand
           | operand ('+'|'-'|'*'|'/'|'%') num-exp

operand    : ('+'|'-')? NUMBER
           | CURR-INDEX
           | '(' num-exp ')'
           | variable

START-DELIM: '{'

END-DELIM  : '}'

START-INDEX: '['

END-INDEX  : ']'

CURR-INDEX : '#'

VARNAME    : '[a-zA-Z0-9_]+'

NUMBER     : '[0-9]+'

SUBSTTEXT  : '[^$/]'

EXPTEXT    : '[^$}:]+'

TEXT       : '[^$[\\]]+'

The varexp::expand Function

The heart of libvarexp is the varexp::expand function, which is defined as follows:

void varexp::expand(std::string const& input,
                    std::string& result,
                    varexp::callback_t& lookup,
                    varexp::config_t* config = 0);

The parameters are pretty intuitive: input is obviously a reference to the input buffer in which variable expressions should be expanded. result is a reference to the target buffer, where the expanded result will be stored. The contents of result will be overwritten by varexp::expand. It is legal to provide the same string instance for both input and result if the original template is no longer required after the expansion.

The lookup parameter contains a reference to a user-supplied class that serves as the lookup callback for accessing variable's contents. Such a callback class must be derived from varexp::callback_t. More details on this topic can be found in the section called “Writing Lookup Callbacks” below.

The last parameter, config, can be used to change the lexical tokens of the parser's grammar. If you omit this parameter -- and thus pass 0 through the default value -- the default configuration will be used. The default configuration is what has been used in the examples throughout this manual; changing it will hardly ever be necessary. If you do want to change it, though, for instance to disable looping or to use variables of the form $(NAME) rather than ${NAME}, please refer to the section called “Configuring the Parser” for a detailed discussion.

In case of success, varexp::expand simply returns; otherwise, one of the exceptions listed in the section called “Exceptions Thrown by libvarexp” is thrown.

Writing Lookup Callbacks

libvarexp's header file, varexp.hh, defines the abstract base class varexp::callback_t, which serves as an interface to user-supplied variable-lookup callbacks. The class is defined like this:

class varexp::callback_t {
    virtual void operator()(const std::string& name,
                            std::string& data);

    virtual void operator()(const std::string& name,
                            int idx,
                            std::string& data);
};

The first operator() is called by varexp::expand to resolve a normal variable such as $NAME. The parameter name will contain the name NAME in this case, and data is a reference to a string where the callback function should place the variable's contents.

The second operator() is called to resolve an array variable, such as ${NAME[i]}. The two parameters name and data have the same meaning in this case, but an additional parameter is provided, idx, which will contain the index i.

Either callback function may throw any exception it sees fit in case of an error, but there are two exceptions that have a special meaning: varexp::undefined_variable should be thrown by either function in case the requested variable is not defined, and the array version of the callback should throw varexp::array_lookups_are_unsupported when it has been called but should not have been.

Throwing varexp::undefined_variable in case of an undefined variable is very important because in some cases this exception will be caught by varexp::expand -- for example during the looping construct! -- and will change the course of action in the routine. Any other exception thrown by these callbacks will leave varexp::expand and abort processing. Make sure your application catches them!

Sometimes it is useful to be able to determine the size of an array in the template. libvarexp does not provide any construct that would do that, mostly because most of the array's behavior is implementation-defined anyway, but a good convention is to have the array callback return the size of the array in case a negative index is looked up.

In order to illustrate how to write a callback of your own, here is a short example callback that will return variables from the Unix environment. The source code has been taken from the test program regression-tests/expand3.cc, so you might want to look there for further examples of more complex callbacks.

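#include <cstdlib>      // getenv
#include <string>
#include "varexp.hh"

using namespace std;
using namespace varexp;

// Resolves variables from the process environment.
struct env_lookup : public callback_t
    {
    virtual void operator()(const string& name, string& data)
        {
        const char* p = getenv(name.c_str());
        if (p == NULL)
            throw undefined_variable();   // the name is not set in the environment
        else
            data = p;
        }
    // The environment has no arrays, so indexed lookups are rejected.
    virtual void operator()(const string& name, int idx, string& data)
        {
        throw array_lookups_are_unsupported();
        }
    };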
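For applications that do want arrays, a callback could follow the negative-index convention mentioned above. The sketch below is one possible way to do that. So that it compiles without libvarexp, it uses stand-in declarations in a namespace called sketch that only mimic the interface shown in this section; with the real library you would include varexp.hh and derive from varexp::callback_t instead. map_lookup and its policy of answering a plain $NAME with element 0 are assumptions made for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins mirroring the declarations shown above (NOT the real library).
namespace sketch {
struct undefined_variable : std::runtime_error {
    undefined_variable() : std::runtime_error("undefined variable") {}
};
struct callback_t {
    virtual ~callback_t() {}
    virtual void operator()(const std::string& name, std::string& data) = 0;
    virtual void operator()(const std::string& name, int idx, std::string& data) = 0;
};
}

// Hypothetical callback that serves every variable as an array from a map.
struct map_lookup : sketch::callback_t {
    std::map<std::string, std::vector<std::string> > vars;

    // A plain $NAME is answered with element 0 (one possible policy).
    virtual void operator()(const std::string& name, std::string& data)
    {
        (*this)(name, 0, data);
    }

    virtual void operator()(const std::string& name, int idx, std::string& data)
    {
        std::map<std::string, std::vector<std::string> >::const_iterator it = vars.find(name);
        if (it == vars.end())
            throw sketch::undefined_variable();
        if (idx < 0) {                  // convention: negative index yields the size
            data = std::to_string(it->second.size());
            return;
        }
        if (static_cast<std::size_t>(idx) >= it->second.size())
            throw sketch::undefined_variable();   // lets a loop terminate
        data = it->second[idx];
    }
};
```

Throwing the undefined-variable exception for an index past the end is what allows a loop over the array to terminate.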

Configuring the Parser

One of the parameters passed to varexp::expand is a pointer to a data structure of type varexp::config_t. This structure defines the elementary tokens used by the parser to determine what is a variable expression and what is not. The structure is defined as follows:

struct varexp::config_t {
    char  varinit;
    char  startdelim;
    char  enddelim;
    char  startindex;
    char  endindex;
    char  current_index;
    char  escape;
    char* namechars;
    config_t();
};

The structure has a default constructor that will initialize the members of the instance to the default values used throughout this documentation:

varexp::config_t()
    {
    varinit       = '$';
    startdelim    = '{';
    enddelim      = '}';
    startindex    = '[';
    endindex      = ']';
    current_index = '#';
    escape        = '\\';
    namechars     = "a-zA-Z0-9_";
    }

If you want to use this default configuration, don't mess with a varexp::config_t structure at all; passing 0 to varexp::expand or leaving config out entirely will use exactly this configuration. If you want to parse a different syntax than the default, though, create a local instance of the varexp::config_t class, modify those values, and pass a pointer to the instance into varexp::expand.

The members of the structure have the following meaning:

varinit

This character defines the character that starts a variable in the input text.

startdelim, enddelim

These characters define the delimiters that must be used to enclose a complex variable expression.

startindex, endindex

These characters are used to delimit both an index specification to an array variable and the start and end of the looping construct. You may set these entries to 0 in order to disable array support and looping altogether.

current_index

This entry defines the character to be replaced by the current loop counter in an index specification.

escape

This entry defines the character that will escape a varinit or startindex character in the input text so that varexp::expand interprets it literally and not as a special character.

namechars

This string defines the set of characters that are legal for a variable name. The specification may contain character ranges.

Please note that it is possible to shoot yourself in the foot with an incorrect parser configuration. The namechars entry, for example, must not contain any of the specials defined above or the parser will not be able to determine the end of a variable expression anymore. There is a set of consistency checks that will be run by varexp::expand, which will throw a varexp::invalid_configuration exception in case the configuration is erroneous, but these checks will probably not catch all configurations that don't make sense. So better be careful when defining your own configuration for the parser.
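If you build configurations at run time, you may want to test them yourself before handing them to the library. The sketch below checks only the one rule stated above, namely that no special character is matched by namechars. The struct config is a stand-in with the same members as varexp::config_t so that the example compiles on its own; the helper names are hypothetical, and the checks libvarexp actually performs may differ:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in with the same members as varexp::config_t (NOT the real type).
struct config {
    char varinit, startdelim, enddelim, startindex, endindex, current_index, escape;
    const char* namechars;
};

// True if `c` is matched by a class specification such as "a-zA-Z0-9_".
bool in_class(const std::string& spec, char c)
{
    for (std::size_t i = 0; i < spec.size(); ++i) {
        if (i + 2 < spec.size() && spec[i + 1] == '-') {
            if (spec[i] <= c && c <= spec[i + 2])
                return true;
            i += 2;                     // skip the '-' and the range end
        } else if (spec[i] == c) {
            return true;
        }
    }
    return false;
}

// True if none of the special characters is also a legal name character.
bool namechars_are_consistent(const config& cfg)
{
    const char specials[] = { cfg.varinit, cfg.startdelim, cfg.enddelim, cfg.startindex,
                              cfg.endindex, cfg.current_index, cfg.escape };
    for (char c : specials)
        if (c != 0 && in_class(cfg.namechars, c))   // 0 means "feature disabled"
            return false;
    return true;
}
```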

The varexp::unescape Function

The missing piece in libvarexp is the varexp::unescape function. It will expand the quoted pairs described in the section called “Quoted Pairs”. Its prototype, as defined in varexp.hh, is:

void varexp::unescape(std::string const& input,
                      std::string& result,
                      bool unescape_all);

The parameters input and result are references to the input and output buffer, respectively. It is legal to pass the same std::string instance as input and output if the original buffer isn't required anymore. The third parameter, unescape_all, determines whether varexp::unescape should expand only the known quoted pairs or whether it should expand all quoted pairs.

If this parameter is set to false, only the quoted pairs described in the section called “Quoted Pairs” are expanded; all other quoted pairs -- the unknown ones -- will be left untouched. If unescape_all is set to true, though, any combination of \a will be expanded to a.
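The effect of the flag can be seen in a stripped-down model of the function. The sketch below knows only \t, \r, and \n and omits the octal and hexadecimal forms entirely; unescape_sketch is a hypothetical stand-in and not the library's implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Expands \t, \r, and \n. Any other quoted pair is kept untouched unless
// `unescape_all` is set, in which case \a simply becomes a.
std::string unescape_sketch(const std::string& in, bool unescape_all)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\\') {
            out += in[i];
            continue;
        }
        if (i + 1 == in.size())
            throw std::invalid_argument("input ends after a backslash");
        const char c = in[++i];
        switch (c) {
        case 't': out += '\t'; break;
        case 'r': out += '\r'; break;
        case 'n': out += '\n'; break;
        default:
            if (!unescape_all)
                out += '\\';            // keep the unknown pair intact
            out += c;
        }
    }
    return out;
}
```

With unescape_all set to false, the input a\tb\$c keeps its \$ pair, so a later expansion pass can still see that the dollar sign was escaped.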

You will need this parameter if you want to combine varexp::unescape with varexp::expand, because an input buffer might contain unknown quoted pairs that have a special meaning to variable constructs! One example is the quoted pair \1, which is used in regular expression search-and-replace. Another example is the string \${Not a variable}. These quoted pairs must be preserved for varexp::expand, so the usual approach for combining varexp::unescape and varexp::expand is to call the functions in the following order:

  1. Call varexp::unescape with unescape_all set to false.

  2. Call varexp::expand on the resulting buffer.

  3. Call varexp::unescape on the resulting buffer with unescape_all set to true.

This approach is illustrated in the example program shown in the section called “Example Program”.

varexp::unescape will return if no error occurred. If the input buffer contained syntax errors, the appropriate exception as described in the section called “Exceptions Thrown by libvarexp” will be thrown.

Exceptions Thrown by libvarexp

libvarexp throws various exceptions in case of a syntax error or when required system resources (memory) cannot be reserved. The complete list is found below. In addition to these, libvarexp may throw practically any of the exceptions thrown by the STL's containers.

All of the following exceptions are derived from the abstract base class varexp::error, so by catching this exception, you can catch all of them. The varexp::error exception provides the following interface:

class varexp::error : public std::runtime_error {
    error(std::string const& what_msg);
    virtual const char* what();
    size_t current_position;
};

As you can see, varexp::error is derived from std::runtime_error. This inheritance relationship also defines the what member function that will return a short, clear-text description of the error that caused the actual exception instance to be thrown. In addition to this member function, the member variable current_position is available, which contains the offset position in the input buffer that was parsed when the error occurred.

Here is the complete list of all libvarexp-specific exceptions:

varexp::incomplete_hex

The input buffer ended before a hexadecimal \xaa quoted pair was complete.

varexp::invalid_hex

Any of the a characters in an \xaa quoted pair was not a valid hexadecimal character.

varexp::octal_too_large

The first digit of an octal \abb quoted pair was not in the range from 0 to 3.

varexp::invalid_octal

A digit of an octal \abb expression was not in the range from 0 to 7.

varexp::incomplete_octal

The input buffer ended before an octal \abb quoted pair was complete.

varexp::incomplete_grouped_hex

A hexadecimal \x{} expression contained an odd number of characters in the parameter.

varexp::incorrect_class_spec

In a character range specification a-b, the start of the range a was bigger (in terms of the ASCII code) than the end of the range b.

varexp::invalid_configuration

varexp::expand's configuration is inconsistent.

varexp::incomplete_variable_spec

Either the input buffer ended right after a variable initializer token ($) was found, or a complex variable expression was not correctly terminated, meaning that the closing } bracket was missing.

varexp::undefined_variable

This exception is supposed to be thrown by the user-provided callback when an unknown variable is requested.

varexp::input_isnt_text_nor_variable

This exception is thrown in the rather unlikely case that the parser could not process the complete buffer, yet no error occurred. When should this happen? Well, not at all. But since the error is theoretically possible, I defined it.

varexp::unknown_command_char

In an ${NAME:c} expression, c was none of the supported operations.

varexp::malformatted_replace

In an ${NAME:s…} expression, one of the required parameters is missing.

varexp::unknown_replace_flag

An unsupported flag was provided in an ${NAME:s…} expression.

varexp::invalid_regex_in_replace

The regular expression given as pattern in an ${NAME:s…} expression failed to compile.

varexp::missing_parameter_in_command

The required word parameter was missing in an ${NAME:-word}, ${NAME:+word}, or ${NAME:*word} expression.

varexp::empty_search_string

In an ${NAME:s…} expression, the search parameter was empty.

varexp::missing_start_offset

The start parameter was missing in an ${NAME:ostart,end} expression.

varexp::invalid_offset_delimiter

In an ${NAME:ostart,end} or ${NAME:ostart-end} expression, the delimiter between start and end was neither a , nor a -.

varexp::range_out_of_bounds

The end parameter in an ${NAME:ostart,end} or ${NAME:ostart-end} expression exceeded the actual length of the string.

varexp::offset_out_of_bounds

The start parameter in an ${NAME:ostart,end} or ${NAME:ostart-end} expression exceeded the actual length of the string.

varexp::offset_logic

In an ${NAME:ostart,end} expression, start was larger than end.

varexp::malformatted_transpose

In an ${NAME:y…} expression, one of the required parameters was missing.

varexp::transpose_classes_mismatch

The ochars range does not have the same number of characters as the nchars range in an ${NAME:y…} expression.

varexp::empty_transpose_class

In an ${NAME:y…} expression, either the ochars or the nchars range was empty.

varexp::incorrect_transpose_class_spec

In a character range given in an ${NAME:y…} expression, the start of the range was larger (in terms of the ASCII code) than the end character.

varexp::malformatted_padding

In an ${NAME:p…} expression, one of the required parameters was missing.

varexp::missing_padding_width

The width parameter in an ${NAME:p…} expression was empty.

varexp::empty_padding_fill_string

The fill parameter in an ${NAME:p…} expression was empty.

varexp::unknown_quoted_pair_in_replace

In the replace parameter of an ${NAME:s…} expression, an invalid quoted pair was specified. Only quoted pairs of the form \digit are valid.

varexp::submatch_out_of_range

In the replace parameter of an ${NAME:s…} expression, a submatch with a number greater than the number of submatches defined in the search parameter was accessed.

varexp::incomplete_quoted_pair

The input buffer ended right after a backslash character.

varexp::array_lookups_are_unsupported

This exception is supposed to be thrown by the user-supplied callback when the array lookup function is called even though arrays should not occur. If you don't intend to support arrays, though, you should disable them via the parser's configuration instead.

varexp::invalid_char_in_index_spec

The index specification of an array variable contains an invalid character, that is, a character that is not part of a num-exp.

varexp::incomplete_index_spec

The input buffer ended inside an open variable index specification, meaning that the terminating ] delimiter was missing.

varexp::unclosed_bracket_in_index

An arithmetic group in an index specification was not closed properly with a ) bracket.

varexp::division_by_zero_in_index

Division by zero error in index specification.

varexp::unterminated_loop_construct

The buffer ended in the midst of an open looping construct.

varexp::invalid_char_in_loop_limits

The loop limits specification contained invalid characters.
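The three octal-related conditions listed above (first digit outside 0 to 3, a digit outside 0 to 7, and a buffer that ends too early) can be read as three separate checks on the digits that follow the backslash. The first digit is limited to 3 because the largest three-digit octal value that still fits into one byte is 377 (decimal 255). The following is only a sketch of such checks, not the library's implementation: the namespace sketch, the function decode_octal, the exception class first_digit_too_large, and the order in which the checks run are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch
    {
    // Hypothetical stand-ins for the conditions described in the list;
    // the real exception classes live in namespace varexp.
    struct first_digit_too_large : public std::runtime_error
        {
        first_digit_too_large() : std::runtime_error("first octal digit not in 0..3") { }
        };
    struct invalid_octal : public std::runtime_error
        {
        invalid_octal() : std::runtime_error("octal digit not in 0..7") { }
        };
    struct incomplete_octal : public std::runtime_error
        {
        incomplete_octal() : std::runtime_error("buffer ended inside octal pair") { }
        };

    // Decode the three octal digits of an \abb quoted pair. 'pos' is the
    // index of the first digit (the character after the backslash).
    unsigned char decode_octal(const std::string& buf, std::size_t pos)
        {
        // The buffer must still hold three characters.
        if (pos > buf.size() || buf.size() - pos < 3)
            throw incomplete_octal();

        // Every character must be an octal digit.
        for (std::size_t i = 0; i < 3; ++i)
            if (buf[pos + i] < '0' || buf[pos + i] > '7')
                throw invalid_octal();

        // A first digit above 3 would produce a value larger than 255.
        if (buf[pos] > '3')
            throw first_digit_too_large();

        return static_cast<unsigned char>((buf[pos]     - '0') * 64 +
                                          (buf[pos + 1] - '0') * 8  +
                                          (buf[pos + 2] - '0'));
        }
    }
```

With these checks, the digits 101 decode to the byte value 65 (the letter A), whereas 477 is rejected because of its first digit, 108 because 8 is not an octal digit, and 10 because the third digit is missing.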

Example Program

The following source code may be found in regression-test/expand5.cc. You might want to check the other test programs there for more complex examples, especially expand6.cc, which also makes use of arrays and loops!

#include <cstdio>
#include <cstdlib>
#include <cerrno>
#include <cstring>
#include <string>     // std::string
#include <stdexcept>  // std::runtime_error
#include "../varexp.hh"
using namespace varexp;
using namespace std;

struct env_lookup : public callback_t
    {
    virtual void operator()(const string& name, string& data)
        {
        const char* p = getenv(name.c_str());
        if (p == NULL)
            throw undefined_variable();
        else
            data = p;
        }
    virtual void operator()(const string& name, int idx, string& data)
        {
        throw runtime_error("Not implemented.");
        }
    };

int main(int argc, char** argv)
    {
    const char* input =                            \
        "\\$HOME      = '${HOME}'\\n"              \
        "\\$OSTYPE    = '${$FOO${BAR}}'\\n"        \
        "\\$TERM      = '${TERM}'\\n";
    const char* output =                           \
        "$HOME      = '/home/regression-tests'\n"  \
        "$OSTYPE    = 'regression-os'\n"           \
        "$TERM      = 'regression-term'\n";
    string tmp;
    env_lookup lookup;

    if (setenv("HOME", "/home/regression-tests", 1) != 0 ||
        setenv("OSTYPE", "regression-os", 1) != 0 ||
        setenv("TERM", "regression-term", 1) != 0 ||
        setenv("FOO", "OS", 1) != 0 ||
        setenv("BAR", "TYPE", 1) != 0)
        {
        printf("Failed to set the environment: %s.\n",
               strerror(errno));
        return 1;
        }
    unsetenv("UNDEFINED");

    expand(input, tmp, lookup);
    unescape(tmp, tmp, true);

    if (tmp != output)
        {
        printf("The buffer returned by var_expand() " \
               "is not what we expected.\n");
        return 1;
        }

    return 0;
    }
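The callback in the program above reads from the process environment, but the same shape works for any other source of variables. The following sketch keeps them in a std::map. So that the block compiles on its own, it declares minimal stand-ins for callback_t and undefined_variable in a namespace called sketch. These stand-ins are assumptions modelled on the signatures used in the example, not copies of the declarations in varexp.hh, and map_lookup is a hypothetical name. In a real program you would include varexp.hh and derive from varexp::callback_t instead.

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace sketch
    {
    // Stand-in for varexp::undefined_variable (assumption).
    struct undefined_variable : public std::runtime_error
        {
        undefined_variable() : std::runtime_error("undefined variable") { }
        };

    // Stand-in for varexp::callback_t, with the two call operators
    // that the example program overrides (assumption).
    struct callback_t
        {
        virtual ~callback_t() { }
        virtual void operator()(const std::string& name, std::string& data) = 0;
        virtual void operator()(const std::string& name, int idx, std::string& data) = 0;
        };

    // Looks variables up in a std::map instead of the environment.
    struct map_lookup : public callback_t
        {
        std::map<std::string, std::string> vars;

        virtual void operator()(const std::string& name, std::string& data)
            {
            std::map<std::string, std::string>::const_iterator i = vars.find(name);
            if (i == vars.end())
                throw undefined_variable();
            data = i->second;
            }

        virtual void operator()(const std::string&, int, std::string&)
            {
            // This lookup has no notion of arrays.
            throw std::runtime_error("Not implemented.");
            }
        };
    }
```

Filling vars before calling the expansion function plays the role that the setenv() calls play in the program above, without touching the environment of the process.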

License

Copyright (c) 2002-2010 Peter Simons <simons@cryp.to>
Copyright (c) 2001 The OSSP Project <http://www.ossp.org/>
Copyright (c) 2001 Cable & Wireless Deutschland <http://www.cw.com/de/>

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.