Mediatex 0.9: Top

Abstract

Perennial archives means we provide the resources to overcome the technological obsolescence of supports and drives we use.

Objectives

The perennial archival comes with 3 main objectives: conserve the document, make it accessible and preserve it’s understanding.

Supports reliability: On numeric ages, time life of hard drive and optical supports is 5 years. The numerical archives on hard drives or tapes are sensitive to the earth’s magnetic field. Burning WORM supports and store them in good conditions provides a compromise (NF-Z-42-013 standard).

Identify: Document understanding may be linked to other documents defining a context. Consequently, it is important to provide a way to identify references from documents to others. Grouping documents into collection makes indexes easier and gives the perimeter of data to preserve.

Other datas: The same way supports need a compatible drive, application data need a software to read them. It is important to memorise which software and format versions are related to the data file, or best, to store software’s source and format specifications. In order to do so, we need meta-data.

Geographical duplication

In order to mitigate natural and technological disasters, archives must be duplicated on several sites.

Distribute meta-data as source code: Idea is servers share meta-data using a revision control system as programmers use it to share their source codes. This is not so obvious: meta-data files must be split into acceptable size so as to be merged in memory, must be human readable as automatic merge may fails and need a human arbitrage, and consume memory to be loaded. Nevertheless, revision control system does not only provides an elegant solution for geographical duplication, but also provide change’s history on meta-data. Although using a centralised deposit, this solution offer the advantage to de-synchronise updates, that is a way to anticipate crash recoveries.

Pull data into a cache: Servers by having up to date (or nevertheless converging) meta-data are able to populate their caches in order to backup unsafe files or to expose files needed from a support. They extract a file from containers (extraction meta-data) or retrieve it from an other server (connection meta-data). Indeed, if a cache is deleted, server will rebuild it from local and remote supports.

Collaborate for data access: Each server build the same static HTML index (compiling the descriptive meta-data) so as any load-balancer is sufficient to prevent form crash disaster (connection meta-data are shared too). URL on archives do not change as they point to a CGI script that will deal with all servers and return an HTML redirection to content into a server’s cache.

Reversibility

We shall consider that the ERMS becomes obsolete itself and we need to change it. A significant effort will be require to return archived content: compress, cut and send data using a media (support or network) and export the related meta-data. Archiving on supports make sense here as we just have to bring them back. Parsers designed to load the meta-data after updates offer a native software library usable to export them into any format.

Mediatex

Mediatex is an Electronic Record Management System (ERMS), focusing on the archival storage entity define by the OAIS draft, and on the NF Z 42-013 requirements. Copyright © 2014 2015 2016 2017 Nicolas Roche.

• Overview:		Description.
• How-to:		Quick overview.
• Invoking Mediatex:		General help.
• Relational schema:		Data files.
• Conceptual model:		Specifications.
• Use cases:		Examples.
• Troubleshooting:		Administration.
• Reporting bugs:		Sending bug reports.
• GNU Free Documentation License:		Copying and sharing this documentation.
• Concept index:		Index of concepts.

Mediatex 0.9

Abstract

Objectives

Geographical duplication

Reversibility

Table of Contents

Mediatex