Archive Management
Digital archives present a special kind of data management challenge because they are very large and long-lived. Special attention must be given to the long-term cost of managing these systems. The more an archive system manages itself and minimizes manual intervention, the greater the savings.
Self-compliant – A GridBank™ archive maintains policy compliance by encapsulating each file’s policies in an item, which also contains the file’s raw data and its metadata. Each item monitors its own compliance, periodically checking, for example, whether its content is authentic or whether its retention period is still in force.
Policies can be set at several levels:
- Global level – For the entire archive, as with protection
- User or application level
- Item level – For individual directories and files, as with retention
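The idea of a self-checking item with layered policies can be sketched roughly as follows. This is an illustrative sketch only, not GridBank™ code: the `Item` class, the `resolve_policy` helper, the policy keys (`replicas`, `retention_days`), and the rule that a more specific level overrides a broader one are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical global defaults; the key names are assumptions for this sketch.
GLOBAL_POLICY = {"replicas": 3, "retention_days": 0}


def resolve_policy(global_policy, user_policy, item_policy):
    """Merge the three levels; more specific levels override broader ones (assumed)."""
    merged = dict(global_policy)
    merged.update(user_policy)
    merged.update(item_policy)
    return merged


@dataclass
class Item:
    """Bundles a file's raw data, its metadata, and its effective policy."""
    data: bytes
    metadata: dict
    policy: dict
    created: datetime
    digest: str = ""

    def __post_init__(self):
        # Record a content fingerprint at ingest time if none was supplied.
        if not self.digest:
            self.digest = hashlib.sha256(self.data).hexdigest()

    def is_authentic(self):
        # The content is authentic if it still matches the stored fingerprint.
        return hashlib.sha256(self.data).hexdigest() == self.digest

    def retention_in_force(self, now):
        days = self.policy.get("retention_days", 0)
        return now < self.created + timedelta(days=days)

    def check_compliance(self, now):
        # A periodic job would call this on every item.
        return {
            "authentic": self.is_authentic(),
            "retention_in_force": self.retention_in_force(now),
        }


if __name__ == "__main__":
    policy = resolve_policy(GLOBAL_POLICY, {}, {"retention_days": 365})
    item = Item(b"quarterly report", {"name": "q1.txt"}, policy,
                created=datetime.now(timezone.utc))
    print(item.check_compliance(datetime.now(timezone.utc)))
```

Keeping the policy inside the item means the check needs no central lookup, which is the property the self-compliant design relies on.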
Self-configuring – Storage node servers can be added and removed on the fly without reconfiguring the cluster. The archive automatically redistributes data and metadata to ensure that the archive’s protection policy is fully enforced, and to balance load across all nodes.
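One common way to get this kind of reconfiguration-free placement is rendezvous (highest-random-weight) hashing. The sketch below uses it purely as an illustration; the source does not say which placement scheme GridBank™ uses, and the function names, node names, and replica count are hypothetical.

```python
import hashlib


def _score(node, key):
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(key, nodes, replicas=2):
    """Return the nodes that should hold `key`, highest score first."""
    ranked = sorted(nodes, key=lambda n: _score(n, key), reverse=True)
    return ranked[:replicas]


def rebalance_moves(keys, old_nodes, new_nodes, replicas=2):
    """Map each affected key to the nodes that must receive a new copy."""
    moves = {}
    for key in keys:
        before = set(place(key, old_nodes, replicas))
        after = set(place(key, new_nodes, replicas))
        if before != after:
            moves[key] = sorted(after - before)
    return moves


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]
    keys = [f"file-{i}" for i in range(10)]
    # Adding a node only moves the keys that the new node now ranks for.
    print(rebalance_moves(keys, nodes, nodes + ["node-d"]))
```

With this scheme, adding a node only pulls copies onto the new node, and removing a node only affects the keys that were stored on it, so the rest of the cluster needs no reconfiguration.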

Self-healing – GridBank™ automatically detects software and hardware malfunctions in a node and alerts the administrator, who can, if necessary, safely detach the node from the archive. Because data and metadata are replicated across many nodes, the failure of one or more nodes is transparent to users accessing archive data. The GridBank™ Platform’s distributed metadata manager always finds an alternative source for any data that resides on a failed node. In addition, the archive’s scavenging policy provides an extra layer of insurance against the loss or corruption of metadata.
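The failover lookup described above can be pictured with a small sketch. It is a simplified, single-process stand-in and not the GridBank™ metadata manager: the `MetadataIndex` class and its methods are hypothetical names, and a real distributed manager would itself be replicated.

```python
class MetadataIndex:
    """Toy index mapping each item to the nodes that hold a replica."""

    def __init__(self):
        self.locations = {}   # item id -> list of node names
        self.failed = set()   # nodes currently marked as failed

    def register(self, item_id, nodes):
        self.locations[item_id] = list(nodes)

    def mark_failed(self, node):
        self.failed.add(node)

    def find_source(self, item_id):
        """Return the first healthy node holding the item, or None."""
        for node in self.locations.get(item_id, []):
            if node not in self.failed:
                return node
        return None

    def under_replicated(self, required):
        """List items with fewer than `required` healthy replicas."""
        return sorted(
            item_id
            for item_id, nodes in self.locations.items()
            if sum(n not in self.failed for n in nodes) < required
        )


if __name__ == "__main__":
    index = MetadataIndex()
    index.register("item-1", ["node-a", "node-b"])
    index.mark_failed("node-a")
    print(index.find_source("item-1"))       # reads fall through to node-b
    print(index.under_replicated(required=2))
```

Reads keep succeeding as long as one replica survives, while the under-replicated list tells the archive which items need a fresh copy to restore the protection policy.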
Human intervention is rarely warranted, and usually only in response to system warnings that require outside action, for example a notification that the cluster load has crossed a specified threshold and new nodes must be added.


