Intersystem Storage Tiering Strategy

Intersystem storage tiering historically hasn’t been widely used because moving data from one system to another involved labor-intensive, manual processes. Interest has spiked, however, with the advent of the cloud, big data and data lake analytics. Past methodologies, such as hierarchical storage management (HSM) and manual file copying utilities such as rsync and Robocopy, aren’t going to cut it.

HSM is based on stubs. When data is moved from one system to another, it leaves a small stub in place of the original data. When applications or users access the data, they’re actually accessing the stub, which goes and retrieves the data, rehydrating it to the original storage.
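The stub mechanism can be illustrated with a minimal in-memory sketch. It is not any vendor's implementation: the `HsmVolume` and `Stub` names, the dictionary-backed "storage" and the `recalls` counter are all assumptions made for illustration, and the sketch assumes the secondary copy is kept after a recall.

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """Hypothetical placeholder left on primary storage after tiering."""
    secondary_key: str


class HsmVolume:
    """Toy model: two dicts stand in for primary and secondary storage."""

    def __init__(self):
        self.primary = {}    # path -> bytes or Stub
        self.secondary = {}  # key -> bytes
        self.recalls = 0     # how many times data was rehydrated

    def write(self, path, data):
        self.primary[path] = data

    def tier(self, path):
        """Move the data to secondary storage and leave a stub behind."""
        entry = self.primary[path]
        if isinstance(entry, Stub):
            return  # already tiered
        key = f"tiered/{path}"
        self.secondary[key] = entry
        self.primary[path] = Stub(key)

    def read(self, path):
        """Read through the stub, rehydrating the data to primary first."""
        entry = self.primary[path]
        if isinstance(entry, Stub):
            # Raises KeyError if the data is no longer on secondary storage.
            data = self.secondary[entry.secondary_key]
            self.recalls += 1
            self.primary[path] = data  # rehydrate to the original storage
            entry = data
        return entry
```

In this sketch, the first `read` of a tiered path triggers a recall from secondary storage and puts the full data back on primary; later reads are served locally until the file is tiered again.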

There are, however, problems with HSM, including increased cloud storage costs. It’s relatively cheap to store data in the cloud, but there are egress fees when copying data out, as happens whenever HSM rehydrates data. This approach is also binary: the stub assumes the data sits in one known secondary location. If the data is moved from the secondary storage to another storage system or cloud, the HSM stub breaks because it can’t find the data.

Better intersystem tiering is available today. Some products — such as Dell EMC’s ClarityNow, Hammerspace, Komprise and StrongBox Data Solutions’ StrongLink — mount the primary storage system with admin privileges. This enables the tiering software to read all of the data. It then copies it out based on policies to one or more secondary or tertiary storage systems, including cloud and tape. Policies allow the original data to be deleted from the original storage, while a global namespace enables direct instant access to the data where it resides instead of rehydration to the original storage.
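As a rough sketch of how policy-driven tiering with a global namespace differs from stubs, the following models the namespace as a table that records where each file currently lives. The `FileRecord`, `apply_policy` and `resolve` names, the age-based policy and the tier labels are hypothetical; the products named above expose their own policy engines and interfaces, which are not shown here.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """One entry in a hypothetical global namespace."""
    path: str
    days_since_access: int
    location: str = "primary"


def apply_policy(namespace, cold_after_days, target):
    """Relocate files on primary that have been idle for `cold_after_days`."""
    moved = []
    for record in namespace.values():
        if record.location == "primary" and record.days_since_access >= cold_after_days:
            # Stands in for: copy to the target, then delete the original.
            record.location = target
            moved.append(record.path)
    return moved


def resolve(namespace, path):
    """Return where the file lives now; the caller reads it there directly."""
    return namespace[path].location


namespace = {
    "/proj/old.dat": FileRecord("/proj/old.dat", days_since_access=400),
    "/proj/new.dat": FileRecord("/proj/new.dat", days_since_access=2),
}
apply_policy(namespace, cold_after_days=90, target="cloud-archive")
```

Because the namespace is updated when data moves, a lookup returns the current location and nothing has to be copied back to the original storage before it can be read.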

Other products, such as InfiniteIO, sit in front of the storage, looking like a switch. Data is moved from one storage system to another or to a cloud via policy. This type of tiering is primarily used for unstructured data, which represents more than 80% of stored data.

