Overview
The French Centre for Topography of the Oceans and the Hydrosphere,
CTOH, applies innovative algorithms to the processing of altimetric
data for ocean, continental hydrology and glaciology applications.
Building on over 20 years of experience, a new data distribution
system, PyCTOH, provides uniform DAP and web access to terabytes of
heterogeneous data.
The architecture of the PyCTOH system will be described from the
bottom up: storage of datasets, catalogue indexing and access, data
retrieval for front-end servers and end-users, and visualisation. Its
main development guidelines are an optimized user experience,
system scalability and sustainability.
At a glance, PyCTOH splits data into a number of storageset files,
which are stored in a data-storage cluster and indexed in a
relational database. Retrieval is a matter of knowing which
storagesets may hold data matching the user's query, and of
efficiently fetching these storagesets from the data store.
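The index lookup described above can be sketched as follows. This is
only an illustration, not the PyCTOH implementation: the table layout,
the column names and the two helper functions are hypothetical, and
SQLite stands in for the relational database so that the example runs
without a server. It assumes each storageset is indexed by mission,
time range and latitude range, and selects every storageset whose
ranges overlap the query.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory index of storagesets (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE storageset (
            path    TEXT PRIMARY KEY,  -- location of the file in the data store
            mission TEXT NOT NULL,
            t_min   REAL NOT NULL,     -- time range covered by the file
            t_max   REAL NOT NULL,
            lat_min REAL NOT NULL,     -- latitude range covered by the file
            lat_max REAL NOT NULL
        )
        """
    )
    conn.executemany("INSERT INTO storageset VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def candidate_storagesets(conn, mission, t0, t1, lat0, lat1):
    """Return paths of storagesets that may hold data for the query.

    Two closed intervals overlap when each one starts before the other
    ends, which is what the WHERE clause tests for time and latitude.
    """
    cur = conn.execute(
        """
        SELECT path FROM storageset
        WHERE mission = ?
          AND t_min <= ? AND t_max >= ?
          AND lat_min <= ? AND lat_max >= ?
        ORDER BY path
        """,
        (mission, t1, t0, lat1, lat0),
    )
    return [row[0] for row in cur]
```

Only the storagesets returned by such a query need to be fetched from
the data store; all the others are ruled out by the index alone.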
PyCTOH is very modular, so that it can be integrated with
a wide range of network architectures, depending on users’ needs and
network managers’ means.
The proposed setup aims at serving terabytes of data at gigabit/s
throughput using free software on commodity hardware, while retaining
the features expected of modern servers: automated deployment,
load-balancing, fail-over, etc. PyCTOH can nevertheless take
advantage of high-end hardware where it is available.
Custom development has been kept to the bare minimum by using
existing quality software wherever possible. Only the DAP server
and the catalogue interface had to be developed, which reduces
maintenance costs.
Detailed documentation is a necessary condition for general usability
and maintainability.
A user guide details how to issue DAP requests to PyCTOH and how to
use the web-based interface.
An administrator manual helps system managers install, tune and
maintain their architecture.
Finally, a developer companion is available for those wanting to
implement their own acquisition data format.
All this documentation is available at http://pyctoh.nongnu.org.
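As an illustration of what a DAP request looks like, the sketch below
builds a DAP 2 URL from a projection (the variables to return) and a
list of selection clauses. The host name, dataset path and variable
names are hypothetical and are not taken from the PyCTOH
documentation; the user guide remains the reference for the datasets
and variables actually served.

```python
from urllib.parse import quote


def dap_url(dataset_url, variables, selections=(), response="ascii"):
    """Build a DAP 2 request URL (illustrative helper, not part of PyCTOH).

    The constraint expression is a comma-separated projection followed
    by one '&'-prefixed clause per selection, e.g. 'lat,lon&lat>=-10'.
    """
    projection = ",".join(variables)
    selection = "".join("&" + clause for clause in selections)
    # Percent-encode characters such as '<' and '>' while keeping the
    # separators of the constraint expression readable.
    constraint = quote(projection + selection, safe=",&=[]:")
    return f"{dataset_url}.{response}?{constraint}"


# Hypothetical dataset and variable names, for illustration only.
url = dap_url(
    "http://dap.example.org/jason/sla",
    ["lat", "lon", "sla"],
    ["lat>=-10", "lat<=10"],
)
```

The response suffix selects the kind of answer: ".dds" and ".das"
return the structure and attributes of the dataset, ".dods" the binary
data and ".ascii" a text rendering of them.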
Hardware fails. Fortunately, PyCTOH can live with that. Thanks to its
careful design, fail-over with no single point of failure and
load-balancing can be achieved with a limited number of
commodity-hardware servers.
Moreover, additional nodes can be integrated as compute or storage
needs increase.
Also, the use of widespread and well-known software (PostgreSQL,
etc.) makes it easier to delegate administrative and maintenance
tasks, which your sysadmins should appreciate.
Free software building blocks help rapid development, easy
maintenance and standards compliance.
Furthermore, PyCTOH is itself free software (licensed under the GNU
Affero General Public License, version 3), with sources available at
http://savannah.nongnu.org/p/pyctoh/.
PyCTOH is new software. Much development is still under way,
mainly for better DAP compliance.
It is, however, already running at CTOH on a 3-node cluster, serving
about 10 TB of data from various altimetric missions (Jason, ENVISAT,
GFO, etc.) and models (FSLE). Data from more and more missions are
being integrated, and the PyCTOH installation is expected to scale
seamlessly.
Recent news is published at
http://savannah.nongnu.org/projects/pyctoh/.