NAME

DocStrip – a documentation extractor

OVERVIEW

DocStrip extracts documentation embedded in C-like sources or scripts and outputs it in a given format.

It can handle two different sets of documentation per file: one for ordinary users and another for developers.

DocStrip uses its own macro language to format docs.

DocStrip is written in Tensile and is part of its distribution (see sl(1)).

SYNOPSIS

docstrip -mode=[user|dev[eloper]] -type=[man|plain|html] -options=global-opts -verbose -quiet -format=fmt-file-name -docpath=add-doc-path -docfmt=self-doc-file -define=macro-name[=value] -dir[ectory]=cur-dir -outdir=output-dir debug [infilename [outfilename]]
mode
chooses developer or user documentation (default user)
type
chooses output format (default man)
options
global options to pass to a DocStrip backend
verbose
print \trace output
quiet
do not issue error and warning messages
format
use format file fmt-file-name (default docstrip.fmt)
docpath
add add-doc-path to the path that is searched for DocStrip files. Multiple options are cumulative. The default path is the current directory, then scripts/docstrip within the Tensile location, then the top of the Tensile location.
define
predefine a macro macro-name with value as its body
directory
changes directory to cur-dir
outdir
uses output-dir as a prefix for output filenames (note that the prefix is added even if a filename has a directory part)
docfmt
specifies a file where documentation on defined macros should go
debug
turns on issuing debugging info (for developers only)
If no input filename is given, stdin is assumed; if no output filename is given, stdout is assumed.

GENERAL STRUCTURE

DocStrip input is embedded in a source file inside comments. It supports both C and C++ style comments as well as traditional shell hash comments. A DocStrip block starts with ///, /*** or ###. The block ends at the first non-comment line or at the closing */. Comment-introducing characters, asterisks and spaces at line beginnings are swallowed. Non-empty lines are joined together. Line continuations (\) are recognized.
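The block rules above can be illustrated with a minimal Python sketch. It is not DocStrip's implementation: the function name extract_blocks is hypothetical, line continuations are ignored, and non-empty lines are assumed to be joined with a single space.

```python
def extract_blocks(source):
    """Collect DocStrip-style comment blocks from source text (illustrative only)."""
    blocks, current, style = [], None, None

    def close():
        nonlocal current
        # Join consecutive non-empty lines; empty lines separate paragraphs.
        paragraphs, para = [], []
        for text in current:
            if text:
                para.append(text)
            elif para:
                paragraphs.append(" ".join(para))
                para = []
        if para:
            paragraphs.append(" ".join(para))
        blocks.append("\n\n".join(paragraphs))
        current = None

    for line in source.splitlines():
        text = line.strip()
        # A line-comment block ends at the first non-comment line.
        if current is not None and style != "/*" and not text.startswith(style):
            close()
        if current is None:
            for opener, kind in (("/***", "/*"), ("///", "//"), ("###", "#")):
                if text.startswith(opener):
                    current, style = [], kind
                    text = text[len(opener):]
                    break
            else:
                continue  # ordinary source line outside any block
        ended = False
        if style == "/*" and "*/" in text:
            # A C-style block ends at the closing */.
            text, ended = text.split("*/", 1)[0], True
        # Swallow comment characters, asterisks and spaces at the line start.
        current.append(text.lstrip("/#* \t").rstrip())
        if ended:
            close()
    if current is not None:
        close()
    return blocks
```

Each returned string is the text of one block, with paragraphs separated by an empty line.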

Essentially, a DocStrip block is a sequence of text characters and macros introduced with a backslash. Macros may have arguments, which are enclosed in brackets. A macro name is either a single character or a sequence of alphanumeric characters and underscores starting with a letter. If a backslashed character has no macro definition, it is treated as an ordinary character. After alphanumeric macro names, spaces are swallowed. Whether they are kept after arguments is macro-dependent.

If the first line of a block doesn't start with a macro, it is treated as a subsection header.

Empty lines separate paragraphs.

There is a set of DocStrip primitives. Other common macros are defined in a format file. Some primitives expand their arguments using a limited subset of primitives. User-defined macros are expanded with arguments textually substituted, and the result of expansion is then re-examined.

SPECIAL CHARACTERS

Some characters and character sequences are treated specially by DocStrip backends. They include:
~
non-breaking space (to get a real tilde, type it twice)
-
en dash –
--
em dash —
(c)
copyright sign ©
-
single right arrow →
--
same
<
single left arrow ←
<-
same
=
greater or equal ≥
>
same
<
less or equal ≤
==
double right arrow ⇒
<
double left arrow ⇐
<-
single bidi arrow ↔
<=
double bidi arrow ⇔
!
not equal !=
+
plus-or-minus ±
DEL
(ASCII 127) a special non-joiner character which is used to break the above ligatures

PRIMITIVES

A macro's arguments are given on the line below its name. Each argument name is enclosed in brackets. Arguments which should not be empty are marked with an exclamation mark. Expandable arguments are marked with an asterisk. If a missing/empty argument is considered to have some default value, it is given after sign.

If a primitive itself is expanded in arguments, it is indicated with a plus after the argument list. If a primitive swallows spaces even after arguments, it is indicated with an @.

NOTE: DocStrip has some other primitives besides those listed below, but they are intended only for format-file developers and hence are not documented here.

href

[description*][location*]

Emits a hyper-reference whose user-visible part is description and which points to location. If location is empty, it is guessed from the description. In particular, man page references in the form id(section) are recognized. A location may be either a URL or an anchor reference starting with #. In the former case, if the description is empty, it is considered equal to the location. Otherwise, the result is backend-dependent.

anchor

[name*]

Sets an anchor at the current point, to be referenced by href. If name is empty, a unique name is generated.

lastanchor

+

Expands to the most recently defined anchor name (useful with generated names)

gensym

+

Expands to a unique name suitable, e.g., for anchor names

verbatim

This primitive takes the next input character as a delimiter and then treats all the following characters up to the next delimiter as ordinary. Special characters are not recognized within verbatim

begin

+

Sets a mark at the current point (not an anchor!)

define

[name!][body][argcnt][doc-string] @

Defines a macro name with body. The macro will accept at most argcnt arguments. A body may contain argument references, which are given as #number or #(number). A literal hash must be doubled.

The name must not already be defined.

If doc-string is not empty and a self-documentation file is defined, doc-string is written to that file prepended by a macro name as a subsection.
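To make the argument-reference syntax concrete, here is a small Python sketch of textual substitution into a macro body. It is only an illustration, not DocStrip's code: the function name substitute_args is hypothetical, and it assumes that arguments are numbered from 1 and that a reference to a missing argument expands to an empty string.

```python
import re

# "##" is a literal hash; "#(n)" and "#n" refer to the n-th argument.
_ARG_REF = re.compile(r"##|#\((\d+)\)|#(\d+)")

def substitute_args(body, args):
    """Substitute argument references in a macro body (illustrative only)."""
    def repl(match):
        if match.group(0) == "##":
            return "#"
        index = int(match.group(1) or match.group(2))
        # Assumption: numbering starts at 1; missing arguments become empty.
        return args[index - 1] if 1 <= index <= len(args) else ""
    return _ARG_REF.sub(repl, body)
```

In this sketch the #(number) form is useful when a digit follows the reference directly: #(2)0 means the second argument followed by a literal 0, whereas #20 would be read as argument 20.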

redefine

[name][body] @

Redefines an already-defined macro to be body. The argument count remains the same.

undefine

[name]

Undefines a macro name. If there is no such macro, does nothing.

let

[name][body*] @

Like redefine, but expands body

bodyof

[name] +

Expands to the body of a macro name or to an empty string if it is not defined. No argument substitutions occur.

repeat

[count*][body] +

Expands to body repeated count times.

par

Starts a new paragraph

indent

[label*] @

Starts an indent block of text marked with label (if not empty). Sequential calls of indent are not cumulative

noindent

Cancels indenting caused by indent

list

Starts a list

item

[label*]

Starts a new list item (and a new list if there isn't one) possibly marking it with label. If there is no label, the marker is backend-specific, but usually a bullet will be used.

endlist

Ends a list. A list is also ended by the end of the block. Some backends may end a list when they encounter indent or sectioning commands.

endlists

Ends all the active lists.

options

[opts] @

Sends opts to the current backend. Options are a set of semicolon-separated keyword=value pairs.
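The option string format can be illustrated with a short Python sketch that splits it into keyword/value pairs. The function name parse_options and the treatment of surrounding spaces and empty segments are assumptions; actual option keywords depend on the backend and are not listed here.

```python
def parse_options(opts):
    """Split 'keyword=value;keyword=value' into a dict (illustrative only)."""
    result = {}
    for pair in opts.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # assumption: empty segments are ignored
        key, _, value = pair.partition("=")
        result[key.strip()] = value.strip()
    return result
```

For example, with two made-up keywords, parse_options("title=DocStrip manual; section=1") yields a dict with the keys title and section.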

divert

[cmds!] @

Starts a diversion, so that all the following output is kept at a special location. cmds are to be executed after the diversion end. Diversions cannot nest.

onflush

[cmds]

Arranges cmds to be executed at the next flush (see flush)

onpar

[cmds]

Arranges cmds to be executed after the end of each paragraph

flush

Ends the current diversion if there is one, and executes the commands registered with onflush

NOTE: An implicit flush occurs at the end of a block

insdiv

Issues the content of a previously defined diversion

if

[val1][val2] @+

Skips all the input until a matching else or endif if val1 != val2

ifx

[val1*][val2*] @+

Same as if, but arguments are expanded

ifnum

[op][num1][num2] @+

Performs a numerical comparison on expanded strings num1 and num2. op may be , or .

iftext

+

True if there was text after the last begin or the start of the argument

ifdef

[name] @+

True if a macro name is defined

ifkept

[kind][item*] @+

True if there is a stored item in a table kind.

ifout

[type] @+

True if the current output type is type

ifuser

+

True if in user doc mode

ifdev

+

True if in developer doc mode

else

+

See if

endif

+

See if

include

[name] @

Includes a file name, which is searched for in the document path unless it starts with `/', `./' or `../'
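The lookup rule for included files can be sketched in Python as follows. This is an illustration under assumptions, not the actual lookup code: the function name candidate_paths is hypothetical, and the Tensile location /opt/tensile used in the usage note is invented for the example.

```python
import posixpath

def candidate_paths(name, doc_path):
    """Return the file names to try, in order, for an included file."""
    # Names starting with /, ./ or ../ bypass the document path.
    if name.startswith(("/", "./", "../")):
        return [name]
    # Otherwise each directory of the document path is tried in turn.
    return [posixpath.join(directory, name) for directory in doc_path]
```

With the default path described under docpath and a Tensile location of /opt/tensile, candidate_paths("common.fmt", [".", "/opt/tensile/scripts/docstrip", "/opt/tensile"]) lists the current directory first, then scripts/docstrip, then the top of the Tensile location.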

output

[type][name*][options*]

Uses a file name as a new output file of the type, sending it options. The current output file is remembered unless the first char of name is `='

If type is empty, the current type is assumed. If name is empty, the output is switched to the previous output file; if there is no such file, nothing happens. If options start with `;', options passed on command line are prepended

echo

[msg] @

Issues a msg to stderr if not in quiet mode

trace

[msg*] @

Issues a msg to stderr if in verbose mode

cdecl

Extracts a C-like declaration from source following the end of the block and outputs it

cstmt

Like cdecl, but operates on C statements

cdeclid

+

Expands to the identifier caught by the last cdecl

thisfile

+

Expands to the name of the current input file

source

[cnt] @

Extracts up to cnt lines of source following the end of the block and outputs them. A line containing \end terminates extraction. If cnt is empty, no limit is assumed.

NOTE: The standard format file redefines this primitive so that it enforces developer mode

sourceindex

[kind] @

The source code that follows the block is treated as an array of string literals. Those are extracted and stored in a table kind

BUG: No escape sequences in literals are recognized

keep

[table][key*][property]

Stores a key associated with a property in a table. If a key has been already stored, the property is overwritten.

set

[table][key*][property*]

Like keep, but also expands property

insertall

[table][cmds][sep-cmds][prefix*] @+

Expands to all the records from a table whose keys start with prefix. cmds are executed for each such record, substituting the record's key for %key; and its property for %prop;. A literal % should be typed twice inside cmds. sep-cmds are inserted between records (no substitutions are performed there).
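The substitution performed by insertall can be illustrated with a Python sketch that models a table as a dict. It is not DocStrip's implementation: the function name insert_all is hypothetical, the result is plain text rather than executed commands, and visiting records in sorted key order is an assumption.

```python
import re

def insert_all(table, cmds, sep_cmds="", prefix=""):
    """Render cmds for every record whose key starts with prefix."""
    def render(key, prop):
        def repl(match):
            token = match.group(0)
            if token == "%%":
                return "%"  # a doubled % stands for a literal %
            return key if token == "%key;" else prop
        return re.sub(r"%%|%key;|%prop;", repl, cmds)

    # Assumption: records are visited in sorted key order.
    selected = [(k, table[k]) for k in sorted(table) if k.startswith(prefix)]
    # sep_cmds goes between records and is not subject to substitution.
    return sep_cmds.join(render(k, v) for k, v in selected)
```

Only records whose keys begin with the prefix contribute to the result, and the separator appears between records but not after the last one.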

insert

[table][key*] +

Expands to a property associated with a key in a table; or to an empty string if no such key is stored.

zap

[table] @

Clears a table. If table is empty, clears all tables

add

[val1*][val2*] +

Expands to the sum of val1 and val2

divide

[val1*][val2*] +

Expands to the integral quotient of val1 and val2

length

[val*] +

Expands to the length of expanded val

alpha

[num*] +

Expands to numth Latin lower-case letter

Alpha

[num*] +

Expands to numth Latin upper-case letter

number

[val*] +

Expands to an opaque number representation suitable for correct sorting

unnumber

[num*] +

Converts that opaque representation back to arabic digits
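The need for a sortable representation can be shown with a Python sketch. The encoding DocStrip actually uses is not documented here, so zero-padding to a fixed width is purely an assumption chosen for illustration, as are the function bodies and the WIDTH constant; the sketch handles non-negative integers only.

```python
WIDTH = 10  # hypothetical fixed width, for illustration only

def number(val):
    """Encode a non-negative integer so that string order matches numeric order."""
    return str(int(val)).zfill(WIDTH)

def unnumber(num):
    """Convert the padded representation back to plain arabic digits."""
    return str(int(num))

# Plain decimal strings sort lexically, which puts "10" before "2".
plain_order = sorted(["9", "10", "2"])
# Encoded strings sort in numeric order and can be decoded afterwards.
numeric_order = [unnumber(n) for n in sorted(number(v) for v in ["9", "10", "2"])]
```

Here plain_order comes out with "10" first, while numeric_order is in increasing numeric order.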

date

Issues the current date

symbol

[sym] +

Issues a character with ASCII code sym

eat

A null primitive used to eat unnecessary spaces

builtin

[cmd]

Executes cmd as a primitive even if it is shadowed by a macro

protect

[arg] +

Prevents its argument from expansion

user

Marks the following doc as user's

dev

Marks the following doc as developer's

any

Marks the following doc as ambivalent

restore

Restores the user/developer mode before the last user, dev or any

fmtdoc

[self-doc] @

Writes self-doc into a self-documentation file if it is defined; otherwise does nothing

STANDARD FORMAT MACROS

There are some shorthands for primitives defined:
= \add
^
= \insert
&
= \protect
|
= \symbol[0x7F]
Internal-use macros and some DocStrip 1 compatibility macros are not documented here.

comment

Skips its argument. May be written as \% too

space

Issues a space if there was text after the last mark

incr

Increments a macro body by 1

section

[header]

Starts a new section

subsection

[header]

Starts a new subsection

paragraph

[header]

Starts a new paragraph

All three commands put a number before the header if the corresponding level \numbering[1-3] is defined. \section and \subsection headers are always put into the table of contents, but \paragraph headers only if \toc3 is defined

*

[index][entry]

Makes an index entry

see

[index][entry]

Makes a cross-reference for an index entry

plainindexof

[index]

Issues an index contents without section references

indexof

[index]

Issues an index contents with section references

em

[arg]

Emphasizes its argument

strong

[arg]

Strongly highlights its argument

sample

[arg]

Marks its argument as a sample

var

[arg]

Marks its argument as a variable name

code

[arg]

Marks its argument as a piece of code

type

[arg]

Marks its argument as a type name

watermark

[arg]

Renders its argument invisible

copyright

[notice]

Defines a copyright notice

author

[name]

Defines the author's name. This command should go before any output-generating commands

bug

With no arguments, just marks the following text as a bug. If given two arguments, it also stores a bug record (first argument) with a short description (second argument)

note

Marks the following text as a note

tip

Marks the following text as a tip

error

Marks its argument as an error message

todo

Stores its argument as a TODO-item. An optional second argument denotes priority (5 by default)

readme

Stores a README file chunk

returns

Marks the following text as a return value description

ref

Like href, but in addition stores its argument in the see-also list.
NOTE: It takes only one argument

undocumented

Marks the current text division as undocumented; stores a record in the TODO list

unimplemented

Like \undocumented, but marks an unimplemented idea

seealso

Issues a see-also list

toc

Issues the table of contents. An optional argument may denote a part of the table

maintoc

Issues the table of sections

group

Starts a group of related source objects

endgroup

Ends that group

C source annotations

The following macros take either a single argument or two. The first argument is the name of an object; the second, if present, specifies the number of objects. For most of these macros, the source code after the end of the block is examined, declarations are processed and the names of objects are extracted. Explicit name arguments override those default ones. An index is built for every kind of object with an obvious name (functions, variables, fields).

function

Marks a function

functions

Marks n functions

struct

Marks a structure

union

Marks a union

field

Marks a field of a structure/union

arg

Marks a function argument

typedef

Marks a type definition

typedefs

Marks n type definitions

enum

Marks an enumeration

variable

Marks a (global) variable

variables

Marks n variables

macro

Marks a one-line macro

xmacro

Marks a multi-line macro; its end must be marked with \end

file

Describes the current source file

!

With an argument, annotates the following n lines of source; without an argument, annotates the following C statement

todolist

Outputs a to-do list (if there is one) to a specified file in a specified format

buglist

Outputs a bug list (if there is one) to a specified file in a specified format

trailer

Produces AUTHOR and COPYRIGHT sections if the corresponding data were given. Generates TODO, BUGS and README files if the current backend is man. Generates TOC if the current backend is html. The argument specifies a suffix for those filenames.

BUG: Macro descriptions are too short

SEE ALSO

sl(1)

AUTHOR

Artem V. Andreev

COPYRIGHT

Copyright © 2001, 2002 Artem V. Andreev — See
sl(1) for details
