86. flatkeydb — Flat Text File Database.

A simple database stored as a flat text file.

(C) 2005 Benedict Verhegghe.
Distributed under the GNU GPL version 3 or later.
class flatkeydb.FlatDB(req_keys=[], comment='#', key_sep='=', beginrec='beginrec', endrec='endrec', strip_blanks=True, strip_quotes=True, check_func=None)

A database stored as a dictionary of dictionaries.

Each record is a dictionary where keys and values are just strings. The field names (keys) can be different for each record, but there is at least one field that exists for all records and will be used as the primary key. This field should have unique values for all records.

The database itself is also a dictionary, with the value of the primary key as key and the full record as value.

On constructing the database a list of keys must be specified that will be required for each record. The first key in this list will be used as the primary key. Obviously, the list must at least have one required key.

The database is stored in a flat text file. Each field (key,value pair) is put on a line by itself. Records are delimited by a (beginrec, endrec) pair. The beginrec marker can be followed by a (key,value) pair on the same line. The endrec marker should be on a line by itself. If endrec is an empty string, each occurrence of beginrec will implicitly end the previous record.

Lines starting with the comment string are ignored. They can occur anywhere between or inside records. Blank lines are also ignored, except that they serve as record delimiters if endrec is empty.

Thus, with the initialization:

FlatDB(req_keys=['key1'], comment='com', key_sep='=',
       beginrec='rec', endrec='')

the following is a legal database:

com This is a comment
com
rec key1=val1
   key2=val2
rec
com Yes, this starts another record
   key1=val3
   key3=val4
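
As a rough illustration of how such a file maps onto a dictionary of dictionaries, the sketch below parses the example above. It is a simplified stand-in, not the FlatDB implementation: the name parse_flat, its parameters and its treatment of malformed input are assumptions made for this illustration, and it only covers the case where endrec is an empty string.

```python
EXAMPLE = """\
com This is a comment
com
rec key1=val1
   key2=val2
rec
com Yes, this starts another record
   key1=val3
   key3=val4
"""

def parse_flat(text, primary_key="key1", comment="com", key_sep="=", beginrec="rec"):
    """Hypothetical mini-parser for the endrec == '' case (not FlatDB itself)."""
    db = {}
    record = None

    def store(rec):
        # Store a finished, non-empty record under its primary key value.
        if rec:
            db[rec[primary_key]] = rec

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # With an empty endrec, a blank line closes the open record.
            store(record)
            record = None
            continue
        if line.startswith(comment):
            continue
        words = line.split(None, 1)
        if words[0] == beginrec:
            # beginrec implicitly ends the previous record.
            store(record)
            record = {}
            line = words[1] if len(words) > 1 else ""
            if not line:
                continue
        if record is None:
            continue  # assumption: fields outside a record are skipped
        key, _, value = line.partition(key_sep)
        record[key.strip()] = value.strip()
    store(record)
    return db

db = parse_flat(EXAMPLE)
```

Here db ends up with two entries, keyed by 'val1' and 'val3', each holding the full record as a plain dictionary.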

The readFile function can even be instructed to ignore anything not between a (beginrec,endrec) pair. This allows multiple databases to be stored in the same file, even with their records intermixed.

Keys and values can be any strings, except that a key cannot begin or end with a blank and cannot be equal to any of the comment, beginrec or endrec markers. Whitespace around the key is always stripped. By default, this is also done for the value (though this can be switched off). If strip_quotes is True (default), a single pair of matching quotes surrounding the value will be stripped off. Whitespace is stripped before stripping the quotes, so that by including the value in quotes, you can keep leading and trailing whitespace in the value.
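
The order of the two steps (whitespace first, quotes second) is what makes quoting useful. The following sketch shows that order with a hypothetical helper named clean_value; the function and its parameter names mirror the constructor options but are not part of the module.

```python
def clean_value(raw, strip_blanks=True, strip_quotes=True):
    """Hypothetical helper: strip whitespace first, then one pair of quotes."""
    value = raw.strip() if strip_blanks else raw
    if (
        strip_quotes
        and len(value) >= 2
        and value[0] == value[-1]
        and value[0] in "'\""
    ):
        # Only a matching pair of identical quote characters is removed.
        value = value[1:-1]
    return value

plain = clean_value("   plain   ")
padded = clean_value('  "  padded  "  ')
mismatched = clean_value("'it\"")
```

With these inputs, plain loses its surrounding blanks, padded keeps the blanks that were inside the quotes, and mismatched is returned unchanged because its quote characters differ.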

A record checking function can be specified. It takes a record as its argument. It is called whenever a new record is inserted in the database (or an existing one is replaced). Before calling this check_func, the system will already have checked that the record is a dictionary and that it has all the required keys.

Two error handlers may be overridden by the user:

  • record_error_handler(record) is called when the record does not pass the checks;
  • key_error_handler(key) is called when a duplicate key is encountered.

The default for both is to raise an error. Overriding is done by changing the instance attribute.
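
Overriding through an instance attribute can be sketched as follows. TinyDB is a hypothetical stand-in written for this illustration, not the real FlatDB class, and the choice of RuntimeError for the default handler is an assumption. The point is only that a plain function assigned on the instance shadows the method and is called with the key alone.

```python
class TinyDB(dict):
    """Hypothetical stand-in for FlatDB, reduced to duplicate-key handling."""

    def __init__(self, primary_key):
        super().__init__()
        self.primary_key = primary_key

    def key_error_handler(self, key):
        # Default behaviour in this sketch: refuse duplicates loudly.
        raise RuntimeError(f"duplicate key: {key!r}")

    def append(self, record):
        key = record[self.primary_key]
        if key in self:
            self.key_error_handler(key)
            return  # only reached if the handler returns
        self[key] = record


def ignore_error(dummy):
    """Do nothing, so the offending data is silently skipped."""


db = TinyDB("name")
db.append({"name": "a", "v": "1"})
db.key_error_handler = ignore_error  # instance attribute shadows the method
db.append({"name": "a", "v": "2"})   # duplicate: ignored, old record kept
```

After the second append, db['a'] still holds the first record, because the overriding handler returned instead of raising.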

newRecord()

Returns a new (empty) record.

The new record is temporary storage. It should be added to the database by calling append(record). This method can be overridden in subclasses.

checkKeys(record)

Check that record has the required keys.

checkRecord(record)

Check a record.

This function checks that the record is a dictionary type, that the record has the required keys, and that check_func(record) returns True (if a check_func was specified). If the record passes, just return True. If it does not, call the record_error_handler and (if it returns) return False. This method can safely be overridden in subclasses.
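
The sequence of checks described above can be sketched as a standalone function. The name check_record and the on_error parameter are hypothetical; in the class itself the required keys, the check function and the error handler are attributes of the database rather than arguments.

```python
def check_record(record, req_keys, check_func=None, on_error=None):
    """Hypothetical sketch of the three checks, evaluated in order."""
    ok = (
        isinstance(record, dict)
        and all(key in record for key in req_keys)
        and (check_func is None or bool(check_func(record)))
    )
    if not ok and on_error is not None:
        # Stand-in for record_error_handler; it may raise or return.
        on_error(record)
    return ok


rejected = []
good = check_record({"name": "a", "size": "3"}, ["name"], lambda r: r["size"].isdigit())
missing = check_record({"size": "3"}, ["name"], on_error=rejected.append)
not_a_dict = check_record(["name"], ["name"], on_error=rejected.append)
```

Because the checks short-circuit, check_func is never called on something that is not a dictionary or that lacks a required key.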

record_error_handler(record)

Error handler called when a check error on record is discovered.

Default is to raise a runtime error. This method can safely be overridden in subclasses.

key_error_handler(key)

Error handler called when a duplicate key is found.

Default is to raise a runtime error. This method can safely be overridden in subclasses.

insert(record)

Insert a record into the database, overwriting existing records.

This is equivalent to __setitem__, but uses the value stored in the primary key field of the record as the key for storing the record. It is also similar to append(), except that it overwrites an old record with the same primary key.

append(record)

Add a record to the database.

Since the database is a dictionary, keys are unique and appending a record with an existing key is not allowed. If you want to overwrite the old record, use insert() instead.
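
The difference between the two methods can be shown on a plain dictionary. The functions below are hypothetical free-standing equivalents that take the database and the primary key name as arguments; in this sketch a duplicate simply raises, whereas the class routes it through key_error_handler.

```python
def insert(db, record, primary_key):
    """Store the record under its primary key, replacing any old one."""
    db[record[primary_key]] = record


def append(db, record, primary_key):
    """Store the record only if its primary key is not yet present."""
    key = record[primary_key]
    if key in db:
        raise RuntimeError(f"duplicate key: {key!r}")
    db[key] = record


db = {}
append(db, {"id": "1", "color": "red"}, "id")
insert(db, {"id": "1", "color": "blue"}, "id")  # replaces the stored record
```

A second append with id '1' would raise here, while insert quietly replaced the red record with the blue one.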

splitKeyValue(line)

Split a line into a (key,value) pair.

The field is split on the first occurrence of the key_sep. Key and value are then stripped of leading and trailing whitespace. If there is no key_sep, the whole line becomes the key and the value is an empty string. If the key_sep is the first character, the key becomes an empty string.

parseLine(line)

Parse a line of the flat database file.

A line starting with the comment string is ignored. Leading whitespace on the remaining lines is ignored. Empty (blank) lines are ignored, unless the ENDREC mark was set to an empty string, in which case they count as an end of record if a record was started. Lines starting with a ‘BEGINREC’ mark start a new record. The remainder of the line is then reparsed. Lines starting with an ‘ENDREC’ mark close and store the record. All lines between the BEGINREC and ENDREC should be field definition lines of the type ‘KEY [ = VALUE ]’. This function returns 0 if the line was parsed correctly. Else, the variable self.error_msg is set.

parse(lines, ignore=False, filename=None)

Read a database from text.

lines is an iterator over text lines (e.g. a text file or a multiline string split on ‘\n’). Lines starting with a comment string are ignored. Every record is delimited by a (beginrec,endrec) pair. If ignore is True, all lines that are not between a (beginrec,endrec) pair are simply ignored. By default, such lines raise a RuntimeError.
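
The effect of the ignore flag can be sketched with explicit record markers. The function parse_lines below is a hypothetical simplification written for this illustration: it returns a list of records instead of filling a database, and its error message format is an assumption.

```python
def parse_lines(lines, beginrec="beginrec", endrec="endrec",
                comment="#", key_sep="=", ignore=False):
    """Hypothetical sketch: collect records delimited by explicit markers."""
    records = []
    current = None
    for lineno, raw in enumerate(lines, 1):
        line = raw.strip()
        if not line or line.startswith(comment):
            continue
        if current is None:
            if line.split()[0] == beginrec:
                current = {}
                # The rest of the beginrec line may hold a first field.
                line = line[len(beginrec):].strip()
                if not line:
                    continue
            elif ignore:
                continue
            else:
                raise RuntimeError(f"line {lineno}: text outside a record: {line!r}")
        if line == endrec:
            records.append(current)
            current = None
            continue
        key, _, value = line.partition(key_sep)
        current[key.strip()] = value.strip()
    return records


TEXT = """stray line
beginrec name=a
size=1
endrec
another stray line
beginrec
name=b
endrec"""

records = parse_lines(TEXT.split("\n"), ignore=True)
```

With ignore=True the two stray lines are skipped; with the default ignore=False the same input raises a RuntimeError on the first line.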

readFile(filename, ignore=False)

Read a database from file.

Lines starting with a comment string are ignored. Every record is delimited by a (beginrec,endrec) pair. If ignore is True, all lines that are not between a (beginrec,endrec) pair are simply ignored. By default, such lines raise a RuntimeError.

writeFile(filename, mode='w', header=None)

Write the database to a text file.

Default mode is ‘w’. Use ‘a’ to append to the file. The header is written at the start of the database. Make sure to start each line with a comment marker if you want to read it back!
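
The advice about the header can be illustrated with a small writer. The function write_records and the exact layout it produces (indentation, one field per line) are assumptions for this sketch; the real method writes to a named file, while an in-memory stream is used here to keep the example self-contained.

```python
import io


def write_records(stream, db, header=None, key_sep="=",
                  beginrec="beginrec", endrec="endrec"):
    """Hypothetical sketch: write a dict of record dicts as flat text."""
    if header:
        # The header is written verbatim, so it must already be commented.
        stream.write(header if header.endswith("\n") else header + "\n")
    for record in db.values():
        stream.write(beginrec + "\n")
        for key, value in record.items():
            stream.write(f"  {key}{key_sep}{value}\n")
        stream.write(endrec + "\n")


buf = io.StringIO()
write_records(buf, {"a": {"name": "a", "size": "1"}},
              header="# written by the sketch")
text = buf.getvalue()
```

Because the header line starts with the comment marker '#', a reader that skips comment lines can load the result back without tripping over it.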

match(key, value)

Return a list of records matching key=value.

The returned list holds the primary keys of the matching records, not the records themselves.
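
A minimal sketch of such a lookup over a plain dictionary of records follows. The free function match_keys is hypothetical, and treating records that lack the field as non-matching is an assumption.

```python
def match_keys(db, key, value):
    """Hypothetical sketch: primary keys of records whose field equals value."""
    return [pk for pk, record in db.items() if record.get(key) == value]


db = {
    "a": {"name": "a", "color": "red"},
    "b": {"name": "b", "color": "blue"},
    "c": {"name": "c", "color": "red"},
    "d": {"name": "d"},
}
red = match_keys(db, "color", "red")
```

The full records can then be fetched with [db[pk] for pk in red].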

86.1. Functions defined in module flatkeydb

flatkeydb.firstWord(s)

Return the first word of a string.

Words are delimited by blanks. If the string does not contain a blank, the whole string is returned.

flatkeydb.unQuote(s)

Remove one level of quotes from a string.

If the string starts with a quote character (either single or double) and ends with the SAME character, both are stripped off the string.

flatkeydb.splitKeyValue(s, key_sep)

Split a string into a (key,value) pair on the first occurrence of key_sep.

The string is split on the first occurrence of the substring key_sep. Key and value are then stripped of leading and trailing whitespace. If there is no key_sep, the whole string becomes the key and the value is an empty string. If the string starts with key_sep, the key becomes an empty string.
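
The three cases described above (separator present, separator absent, separator at the start) can be reproduced with str.partition. This is a sketch with a hypothetical name, split_key_value, and not necessarily how the module implements it.

```python
def split_key_value(s, key_sep="="):
    """Hypothetical sketch: split on the first key_sep, then strip both parts."""
    # partition returns (head, sep, tail); tail is '' when key_sep is absent.
    key, _, value = s.partition(key_sep)
    return key.strip(), value.strip()


first_only = split_key_value(" path = a=b ")
no_separator = split_key_value("flag")
empty_key = split_key_value("=orphan")
```

Only the first separator splits the string, so any later occurrence stays part of the value.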

flatkeydb.ignore_error(dummy)

This function can be used to override the default error handlers.

The effect will be to ignore the error (duplicate key, invalid record) and to not add the affected data to the database.