Boulder, CO, April 10, 2018 – The Active Archive Alliance today issued its State of the Industry 2018 report. Organizations are quickly learning the value of analyzing vast amounts of previously untapped archival data. Industry studies suggest that only about 20% of all digital data is ever accessed or used again after it is stored, underscoring the archival challenge. The need to effectively store, search for and retrieve enormous volumes of archival content is fueling new advancements in archive solutions. This report describes the state of the archive market and the role the active archive plays.
Relentless Data Growth Fuels Digital Archive Market
Newly created worldwide digital data is expected to grow at 30% or more annually, reaching 163 zettabytes (1 zettabyte = 1×10^21 bytes) by 2025, according to IDC. Data is rapidly piling up in archives as retention requirements of up to 100 years to forever are now common. The top external factors driving archival and long-term retention requirements include government compliance regulations, growing dependency on security and surveillance systems, advanced 3D and 4D video capabilities, content producers in Media & Entertainment, the relentless growth of Big Data analytics, and the emerging IoT (Internet of Things).
A key emerging trend for new data creation indicates that even though transactional (high IOPS) data and high-performance applications are steadily growing, the amount of reference and archival data is growing faster. This is primarily due to countless regulations and the sheer amount of data “yet to be analyzed” that is kept in anticipation that some potential value might be discovered in the future. Most data typically reaches archival status in 90 days or less. IDC estimates that by the end of 2025, only 15% of the data in the global data-sphere of 163 zettabytes will be tagged and only 20% of that will be analyzed. Therefore, if 80% of the data created is not analyzed, that data will likely reach archival status upon creation. Recent analyst surveys indicate that less than 40% of corporations have a dedicated archive strategy in place. This reveals a huge, unaddressed and growing archive challenge that is ready to embrace modern archiving concepts.
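To put the IDC percentages above into absolute terms, the following minimal sketch multiplies them through. The function and variable names are illustrative assumptions, not part of the report; only the three input figures (163 ZB, 15% tagged, 20% of tagged data analyzed) come from the text.

```python
# Figures quoted from the report (IDC estimates for 2025).
TOTAL_ZB = 163.0                  # size of the global data-sphere in zettabytes
TAGGED_SHARE = 0.15               # share of all data that will be tagged
ANALYZED_SHARE_OF_TAGGED = 0.20   # share of the tagged data that will be analyzed


def datasphere_breakdown(total_zb, tagged_share, analyzed_share_of_tagged):
    """Return tagged and analyzed volumes (in ZB) and the analyzed share of the total.

    Hypothetical helper for illustration only.
    """
    tagged_zb = total_zb * tagged_share
    analyzed_zb = tagged_zb * analyzed_share_of_tagged
    return {
        "tagged_zb": round(tagged_zb, 2),
        "analyzed_zb": round(analyzed_zb, 2),
        "analyzed_share_of_total": round(analyzed_zb / total_zb, 4),
    }


breakdown = datasphere_breakdown(TOTAL_ZB, TAGGED_SHARE, ANALYZED_SHARE_OF_TAGGED)
print(breakdown)
```

Under these inputs, roughly 24.45 ZB would be tagged and about 4.89 ZB analyzed, which is about 3% of the total data-sphere.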
These enormous data volumes will require that businesses build their storage architectures by optimizing SSD, HDD and tape in tiered storage solutions. The greatest economic advantages of tiered storage occur when tape is included. Classifying your data upon its creation by its value, performance and capacity requirements will enable the right data to be in the right place at the right time. Tier 3 is typically referred to as the archive tier with an average of 60% of data classified as archival upon creation.
Backup and Archive Are Entirely Different Processes
It’s important to distinguish between backup and archive, as these core IT processes are often misunderstood. Many businesses still use backup copies to store archival data and repeatedly back up unchanging archives, wasting HDD space. Backup and archive are entirely different processes and have different objectives.
The backup process creates copies of data for recovery purposes which may be used to restore the original copy after a data loss or data corruption event. Backups are cycled and updated frequently to account for and protect the latest versions of important data assets. Archiving moves unchanging and less frequently used data to a new location(s) and refers to data specifically selected for long-term retention. Archival data is typically unchanging, and is not overwritten.
The Active Archive Re-awakens the Archives
The active archive supports file, block or object storage systems using advanced data management software to maintain end user accessibility to archival data regardless of the storage device it resides on. Intelligent data management software provides faster online random access, search and retrieval capability for archival data in a single virtualized storage pool, and automatically migrates data between storage tiers based on user policy. The widespread usage of SSDs and high capacity HDDs, coupled with tape’s highly favorable economics, security, and archive characteristics, has propelled the successful emergence of active archives.
Active archiving implementations can use existing storage devices to build an integrated hardware and software solution and can incorporate enhanced file systems such as the open standard LTFS (Linear Tape File System) for Linux, Apple and Windows, or TAR (Tape Archive) for Unix systems. For those who do not want to build their own repository using existing equipment, several vendors offer preconfigured active archive appliances and file systems which work with almost any tape library back-end.
Digital Archives Embrace Object Storage
Archiving is the earliest enterprise use case for object storage, which has been used for over a decade to provide scalable, long-term data preservation. Object storage enables IT managers to organize archival content with its associated metadata into containers, making it easy to retain massive amounts of unstructured data. In July 2017, IBM Spectrum Archive™ Enterprise Edition V1.2.4, which uses LTFS, announced a connection with OpenStack Swift to enable movement of cold (archive) data from object storage to more economical tape and cloud storage for long-term retention. LTFS now provides a back-end connector for open source SwiftHLM (Swift High Latency Media), a high-latency storage back end that makes it much easier to perform bulk operations using tape within a Swift data ring. Cloud storage is the most prominent use case for object storage.
An Active Archive Ecosystem Can Include Storage, Software and the Cloud
By providing a persistent online view of archival data by integrating one or more archive technologies (SSD, HDD, tape, and software – the active archive ecosystem) behind a file system, the active archive is a form of the widely used tiered storage concept specifically targeted for the data archive function. The tiered storage concept allows a system administrator to define policies for automatic data migration and retention to control the movement of petabytes of data from more expensive to less expensive storage systems. An active archive can be implemented on-premises, in the cloud, or in both places as offering archival data services quickly gains momentum for cloud providers. Active archiving brings the same benefits to the cloud (private or public) that it does to the data center.
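The policy-driven migration described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular active archive product works: the tier names, the 30-day and 90-day idle thresholds, and the functions `select_tier` and `plan_migrations` are all hypothetical (the 90-day figure echoes the report's remark that most data reaches archival status in 90 days or less).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    tier: str             # storage tier name
    max_idle_days: float  # data idle for fewer days than this belongs on this tier


# Hypothetical administrator-defined policy, ordered from most to least expensive tier.
POLICY = [
    TierRule("ssd", 30),
    TierRule("hdd", 90),
    TierRule("tape", float("inf")),
]


def select_tier(idle_days, policy=POLICY):
    """Return the first tier whose idle-time limit the data has not yet reached."""
    for rule in policy:
        if idle_days < rule.max_idle_days:
            return rule.tier
    return policy[-1].tier  # fall back to the cheapest tier


def plan_migrations(files, policy=POLICY):
    """Build a list of (name, source_tier, target_tier) moves.

    `files` maps a file name to (current_tier, days_since_last_access).
    Files already on the tier the policy selects are left alone.
    """
    moves = []
    for name, (current_tier, idle_days) in files.items():
        target = select_tier(idle_days, policy)
        if target != current_tier:
            moves.append((name, current_tier, target))
    return moves


catalog = {
    "render_cache.bin": ("ssd", 5),
    "q3_report.pdf": ("ssd", 45),
    "camera_2016.mp4": ("hdd", 400),
}
for move in plan_migrations(catalog):
    print(move)
```

Because the catalog keeps a single name for each file regardless of tier, users would continue to see one namespace while the planner moves data to cheaper storage behind it, which is the essence of the active archive approach described here.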
Select Case Studies Highlight the Value of Active Archive Solutions
Conclusion and Outlook
Enterprises archive their data either because they have to or because they want to, but either way, the magnitude of this requirement can quickly become overwhelming. Future SSD, HDD, and tape technology roadmaps promise continued innovation, reliability, and capacity increases at lower cost, giving archiving the infrastructure it needs. With the amount of archival data soaring and no end in sight, active archiving is poised to play an increasingly important role in tomorrow’s data centers as it reawakens the archives. Are you prepared to manage the relentless growth of archival data that lies ahead? It’s time to develop your game plan.
About the Active Archive Alliance
The Active Archive Alliance launched in 2010 as a collaborative industry association formed to educate end user organizations on the evolving new technologies that enable reliable, online and efficient access to their archived data. The goal of the Active Archive Alliance is to encourage a multi-vendor effort to align the education and technologies needed to meet the rapidly increasing requirements for archival data by addressing the following:
1. Reduce the complexity of long-term data storage
2. Provide scalable storage solutions
3. Reduce total cost of ownership for long-term data storage
4. Reduce risk of non-compliance and data loss
Note: The Active Archive Alliance includes representatives of FUJIFILM, HGST, Komprise, Quantum, Spectra Logic, StorageDNA, and StrongBox Data Solutions.