To prevent denial of service, we do not use a centralised database but plain text files replicated on all servers using GIT.
Consequently, the parsers need CPU and memory proportional to the size of these files (whereas databases, even without indexes, do not). The generated HTML catalogue also requires much more disk space than a dynamic web site would; moreover, the limiting factor is likely to be the number of available inodes on the partition where the HTML catalogue is stored.
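Because the catalogue consists of many small HTML files, the partition can run out of inodes before it runs out of bytes. A minimal sketch of how to check both limits (the CATALOGUE_DIR variable is an assumption; point it at the partition actually holding the catalogue):

```shell
# Check the two limits on the partition holding the HTML catalogue.
# CATALOGUE_DIR is an assumed placeholder; adapt it to your setup.
CATALOGUE_DIR="${CATALOGUE_DIR:-.}"
df -h "$CATALOGUE_DIR"   # free space in bytes
df -i "$CATALOGUE_DIR"   # free inodes (roughly one per generated HTML file)
```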
All in all, the MEDIATEX system is not designed to handle collections of more than half a million archives (whereas databases easily handle millions). It should handle several such “not so big” collections, but not too many of them.
The following tests are based on the GIT upgrade plus HTML catalogue generation, which is the most resource-consuming query (it implies parsing most of the meta-data files). They give an idea of the resources involved: size on disk, amount of memory and CPU time.
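Such a measurement can be sketched with standard shell tools: POSIX `time -p` reports the real and CPU time of the query, and `du -s` the catalogue's size on disk. In this sketch `sleep 1` stands in for the actual GIT upgrade plus HTML generation query, and the current directory for the catalogue path; both are assumptions to adapt to your installation.

```shell
# Wrap the query in the shell's timer and capture its timing report.
# `sleep 1` is a stand-in for the real query; replace it accordingly.
{ time -p sleep 1 ; } 2> timing.txt
cat timing.txt                  # real / user / sys, in seconds
# Size on disk of the generated catalogue (path is an assumption).
du -sh .
```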