Permabits and Petabytes

October 12, 2008

Dirty Little Secrets About Dirty Little Secrets

Filed under: Jered Floyd — jeredfloyd @ 5:08 pm

Over at eWeek, Chris Preimesberger has what looks like a brutal article on archive systems. On further inspection, though, it looks more like a damning (but nameless) indictment of EMC’s Centera. In that context, he’s spot on, but I wish he hadn’t tried to drag down all of archiving with one rotten apple!

Let’s look through the problems Chris identifies.

1: Scalability. CAS (content-addressable storage) archives have a hard limit on the number of objects that can be stored.

Let’s clear one thing up right here: archives are not CAS. CAS (content-addressed storage) is a technology that some archive systems use for deduplication and data authenticity, but this has nothing to do with the system being used as an archive. CAS is not a market, is not a product, and is not an interface; it’s just a class of technologies. These are technologies that, in general, make no difference to you from a user perspective. It’s time we moved beyond talking about “CAS”, because it’s just a red herring.
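
To make the “class of technologies” point concrete, here is a minimal sketch of what content addressing does for deduplication and authenticity. It assumes SHA-256 as the hash and an in-memory dictionary as the store; the class and method names are hypothetical and do not describe any vendor’s implementation.

```python
import hashlib


class ContentAddressedStore:
    """Toy store where the address of a blob is the hash of its content."""

    def __init__(self):
        self._blobs = {}  # address (hex digest) -> bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so identical data
        # always lands at the same address and is stored only once.
        address = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Authenticity check: the content must still hash to its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored data no longer matches its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
first = store.put(b"quarterly report")
second = store.put(b"quarterly report")  # duplicate write
print(first == second, len(store))
```

Nothing here says anything about retention, availability, or any other property that makes a system an archive, which is the point: the hashing is an internal technique, not the product.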

All storage systems have to balance resources, and some choose poorly. Have you ever run out of inodes in your file system because you had too many small files? That’s the same problem that he talks about here. You won’t run into that problem with a Permabit Enterprise Archive… we’re optimized to deliver top performance with an average file size as small as 8 kilobytes. Even with potentially small files like emails you’ll still be able to use all of your archive storage.
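
The inode comparison comes down to simple arithmetic, sketched below. The capacity and object limit are hypothetical numbers chosen only to show the shape of the problem; only the 8 kilobyte average file size comes from the text above.

```python
def usable_fraction(capacity_bytes: int, max_objects: int, avg_object_bytes: int) -> float:
    """Fraction of raw capacity that can be filled before the object limit is hit."""
    storable = min(capacity_bytes, max_objects * avg_object_bytes)
    return storable / capacity_bytes


# Hypothetical system: 10 TB of disk, hard limit of 100 million objects.
capacity = 10 * 10**12
object_limit = 100_000_000

# With 8 KB average objects (small emails), the object limit is reached
# long before the disks fill up.
print(usable_fraction(capacity, object_limit, 8 * 1024))

# With 1 MB average objects, the disks fill first and all capacity is usable.
print(usable_fraction(capacity, object_limit, 10**6))
```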

2: Performance degradation. As objects pile up in an archive, the speed at which the archive runs slows down tremendously.

Again, all storage systems have the potential to slow down as they fill, not just archive systems. But the (Nexsan-specific) answer seems particularly odd… it talks entirely about databases, specifically single vs. dual databases.

At some level I suppose you can think of any storage system as a very simple database. Block storage maps a LUN and block offset to a 512 byte block. File storage maps a share and path name to a file. Object storage maps an object name to a data object. How that’s implemented internal to the storage system, however, varies greatly.
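
As a rough illustration of the “very simple database” view, the three kinds of storage can be modeled as key-value maps that differ only in what the key looks like. The store names and sample keys below are made up for the sketch; real systems implement these mappings very differently internally.

```python
from typing import Dict, Tuple

BLOCK_SIZE = 512  # bytes per block, as in the paragraph above

# Block storage: (LUN, block offset) -> one fixed-size block
block_store: Dict[Tuple[str, int], bytes] = {}

# File storage: (share, path name) -> file contents
file_store: Dict[Tuple[str, str], bytes] = {}

# Object storage: object name -> data object
object_store: Dict[str, bytes] = {}

block_store[("lun0", 0)] = bytes(BLOCK_SIZE)
file_store[("archive", "/mail/2008/msg-0001.eml")] = b"Subject: hello\n"
object_store["invoice-2008-10-12"] = b"invoice body"

print(len(block_store[("lun0", 0)]))
print(file_store[("archive", "/mail/2008/msg-0001.eml")])
print(object_store["invoice-2008-10-12"])
```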

I’ll agree with Bob Woolery that having a single, monolithic database is an absolutely wrong approach to any data storage. Scalability and performance will both suffer as that database becomes a bottleneck to all I/O with the system. But just mirroring your database, as Assureon apparently does, isn’t much better! You’ve simply doubled (or so) the capacity you can scale to before you start running into the same problems.

The only way to cleanly handle scaling is through a fully distributed architecture. There cannot be a single point, or small set of points, through which all data must flow, or that point will always become the system bottleneck. For this reason, Permabit Enterprise Archive has a fully distributed architecture. While every write operation involves multiple nodes in an Enterprise Archive system for redundancy and reliability, none of these nodes is required for all writes. Consecutive operations involve different subsets of nodes in the system. As more nodes are added to the system to increase capacity, the same number of operations uses each node less frequently. Conversely, the larger number of nodes is able to handle a greater number of operations in aggregate, increasing overall system performance as the system scales.

This sort of scalability of performance can only be achieved with a fully distributed system. Any architecture with single (or dual) databases will always slow down as capacity increases.
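
The scaling argument can be sketched with a toy placement scheme. The code below uses rendezvous (highest-random-weight) hashing to pick a small subset of nodes for each write; that technique is swapped in purely for illustration and is an assumption, not a description of how Enterprise Archive actually places data. All names and parameters are hypothetical.

```python
import hashlib
from collections import Counter
from typing import List


def replica_nodes(key: str, nodes: List[str], copies: int = 2) -> List[str]:
    """Pick `copies` distinct nodes for a key by ranking nodes on hash(key, node)."""
    def score(node: str) -> bytes:
        return hashlib.sha256(f"{key}:{node}".encode()).digest()
    return sorted(nodes, key=score)[:copies]


def busiest_node_share(num_nodes: int, num_writes: int = 2000, copies: int = 2) -> float:
    """Fraction of all writes that touch the single busiest node."""
    nodes = [f"node-{i}" for i in range(num_nodes)]
    load = Counter()
    for w in range(num_writes):
        load.update(replica_nodes(f"object-{w}", nodes, copies))
    return max(load.values()) / num_writes


# Same number of writes, more nodes: each node is involved in fewer of them.
for n in (4, 16):
    print(n, "nodes -> busiest node sees", busiest_node_share(n), "of writes")
```

With two copies per write, each node should see roughly 2/N of the writes, so no node is on the path of every operation and the per-node share shrinks as nodes are added. A single central database, by contrast, would see a share of 1.0 no matter how large the system grows.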

3: Data protection. The existence of the commodity hardware “back door.”

This seems to say that you can’t be assured of your data safety in a system that does not have integral storage, so don’t trust an archive product that acts as a gateway to a SAN. Again, I’m not sure what’s archive specific here. Even with integrated storage a malicious character can corrupt data, unless the drive bays on the storage appliance are wired to 50,000 volts.

There are a great many storage administrators using SAN gateway products for NAS and archive storage, and I’d bet they’d likely dispute this “secret”, but let me provide a far more compelling argument of why you should choose an appliance over a gateway — it’s far more cost effective. Why buy a pricey gateway to use on $30/GB SAN storage, when you can get an integrated Permabit appliance for $5/GB or less?

We have another good reason why Permabit Enterprise Archive is only available as an appliance, not a gateway, and it does have to do with data protection, but not with any multi-path “back door” concern. Permabit has developed our patent-pending RAIN-EC data protection technology, which is capable of providing data protection up to 250 times more reliable than RAID 6 on equivalent disks. RAIN-EC has advanced coding algorithms for distributing data across multiple disks in multiple nodes, so you’re protected not just against the loss of a drive but also the failure of any component (or multiple components) anywhere in the storage system. This requires precise data distribution in the system, of a sort that is not possible in a gateway architecture.

There are plenty of reasons to choose an integrated appliance for archive, but worrying about back-end access to the storage seems like the least of them. Most SAN administrators seem quite content with their SAN security.

4: Data migration: When an archive is moved, the files can become orphaned and the entire process could become exceedingly slow.

Here the complaints become almost nonsensical, focusing on the problems of proprietary APIs rather than anything related to archive at all. It’s a fair complaint that if you’re using a storage system with a proprietary API you have to bring the data back out through the original application to move it, if the original application allows for this at all. (Centera apps are notoriously tricky at this; once you have data in, it may as well be in a roach motel.)

Even if you have a standard API or other interface, however, migration is still an issue. Consider purchasing a NAS-connected archive storage system and loading your petabytes of archive data into it. In three to five years, the vendor is going to come knocking on your door again, offering to sell you the latest and greatest. And also offering to sell you migration services. Even though that device has standard interfaces, migrations are still time consuming, costly, and risk-prone endeavors. I’ve seen many a migration project cost more than the new storage system put in place!

Permabit Enterprise Archive is designed to avoid these sorts of migration headaches. Because of Enterprise Archive’s grid-based architecture, there’s no single critical point in the system. Access nodes and storage nodes are all connected together via standard Ethernet, and individual nodes can be removed and replaced without any system downtime.

New nodes can be of different generations, different capacities, or even different storage technologies. In this way, the system can be organically, piecewise upgraded over time without ever having to go offline. Over 20 years, 50 years or longer you can continually refresh every component, taking advantage of industry improvements in storage density, power efficiency, and performance. Through this whole process you never once have to pay for migration professional services.

5: Energy efficiency. Not the best in most archive systems.

This goes back again to the “single database” confusion from the strange secret number 2, so I’m not quite sure what to say here other than to repeat that Enterprise Archive has no central database bottleneck.

Deduplication is an inherently green technology. If you can get even 2x deduplication out of your archive storage system, that’s half the number of drives that have to be manufactured, purchased and spinning, which is a massive energy savings. Depending on the data set, we’ve seen savings anywhere from 20% to 300x, all of these reflecting energy efficiency over conventional primary storage.
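
Since those figures mix a percentage with a ratio, a small conversion helps put them on the same scale. This is a sketch that assumes drive count scales linearly with the amount of data stored after deduplication; the function names are made up for the example.

```python
def savings_from_ratio(ratio: float) -> float:
    """Fraction of drives avoided at a given deduplication ratio (2.0 means 2x)."""
    return 1.0 - 1.0 / ratio


def ratio_from_savings(savings: float) -> float:
    """Deduplication ratio equivalent to a given fractional savings (0.2 means 20%)."""
    return 1.0 / (1.0 - savings)


print(savings_from_ratio(2))              # 2x dedup: half the drives
print(ratio_from_savings(0.20))           # 20% savings is about a 1.25x ratio
print(round(savings_from_ratio(300), 4))  # 300x: all but about 0.33% of the drives
```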

Real “dirty secrets” you should be concerned about

Most of Chris’ storage concerns are valid… but only for a single vendor not actually named in the article. There definitely are dirty secrets that you should be asking your archive storage vendor about, though; consider the following:

Availability. Archive data may be infrequently accessed, but when it is needed it’s needed immediately. Does your archive product include full high-availability (HA) features? Is there any single point of failure that can take the system offline?

Reliability. Archive data may be the last and final copy of critical business information. How reliable is your archive storage system over the long term? Are you using older RAID technologies that might not hold up with modern high capacity drives?

Longevity. Archive data may need to be preserved for 50 years, 100 years, or even indefinitely. How does your storage system provide for long-term archival storage? How do I integrate new technologies? Will I have to migrate to new storage systems every few years to ensure data availability?

Scalability. Data sets for archive storage may be many petabytes in size, and deduplication rates will vary widely. How much disk can your archive storage system address? Don’t sell me a 30 TB box and tell me it stores a petabyte!

Cost. Long-term archive data needs to be stored inexpensively, both today and over the long term. How much is the real cost of storage in your archive storage system, before assuming any deduplication? How much will it cost to maintain and operate? How much will it cost to handle media migrations as components regularly reach the end of their usable lives?

Finally, a challenge. Move beyond “CAS”: I challenge storage industry writers to abolish the term by the end of the year. In the past six months I have not once heard a customer ask for “CAS”. CAS is a technology, not an interface, and not a user feature. Only the trade press seems to be keeping this term alive.

Archive storage is a tier of storage and a storage market. XAM is a storage interface for object storage. EMC Centera is a product with a proprietary API. There is no CAS as a market, as a tier, or as an interface. It’s time to kill CAS.


  1. […] Permabit has been shipping their cluster-based Enterprise Archive for two quarters. Permabit’s CTO, Jered Floyd, has a blog with a great post on why – among other things – it is time to stop talking about content-addressable storage (CAS). […]

    Pingback by StorageMojo » Cool kit at SNW — October 26, 2008 @ 11:16 pm

  2. Hi Jered,

    One addition to your last point: EMC Centera is a product with a proprietary API “AND an industry standard one: XAM”. Centera now supports both.

    All the best,

    Comment by Steve Todd — October 27, 2008 @ 9:37 am

  3. Jered,

    Given the purpose of an archive, my greatest concern is related to point #4, and something you and I discussed in the past.

    Assuming the integrity of the bytes on the storage medium is not compromised, and assuming the archive can reconstitute the data (itself an enormous challenge as you pointed out), there’s no guarantee that the data will be in a usable form.

    That is to say, the applications or complex systems that originally wrote the data may no longer exist in usable form. I like to use the example of a modern website consisting of HTML, style sheets, various client and server side code, one or more databases, and perhaps data generated by services external to one’s own organization.

    One could archive a PDF of some single state (or several states) of the website, or save the collection of files and data to an archive. But one must ask whether either of those methods is sufficient to meet the expectations of the business (or regulatory/legal requirements such as the FRCP). Can the archived data be reassembled in a meaningful way?

    To reconstitute a website, beyond a simple reconstitution of its files and data, one would need compatible, operational versions of every application used by the site at the time it was archived, and a means to restore the files and data into those environments. Programming language interpreters/compilers, operating systems, databases, application servers, content management systems, office applications, browsers, etc.

    This is equally true for most complex business systems including CMSs, CRMs, ERPs, SCMs, etc. And with the growth in popularity of systems developed using loosely coupled services, I believe the problem will only worsen over time.

    The roach motel metaphor falls short. It’s analogous to wearing mittens to build a 50,000 piece puzzle from a box with no picture on the cover to help guide its reconstruction.

    Until we find a suitable range of solutions for this challenge, today’s long-term digital archives are (for the most part) bit buckets from which we hope we can retrieve meaningful information in the future.

    Comment by josephmartins — October 27, 2008 @ 4:05 pm

  4. Steve,

    Fair enough — Centera now does support XAM. As a co-chair of the FCAS TWG, I certainly appreciate the support!

    Why doesn’t Atmos use XAM?


    Comment by jeredfloyd — November 20, 2008 @ 12:54 pm

  5. Joe,

    That’s a great point, and one that’s not given enough attention by archive storage vendors; after all, it’s an application problem. 😉

    When I was spending more time on SNIA LTACSI, the Long-Term Archive and Compliance Storage Initiative (quite a mouthful), I tried to draw a distinction between what I call “physical readability” and “logical readability”.

    Physical readability means that I can get back the bits that I wrote 100 years ago, in the exact order I wrote them, with perfect fidelity. Logical readability means that I can extract the semantic meaning from those bits, the same meaning that they had 100 years ago. The first problem is purely a technical one, and the second one is sadly not.

    Physical readability is something that storage vendors can address without making the customer change the way they do business. Permabit Enterprise Archive, for example, allows seamless refresh to new media or even to new kinds of technology simply by adding new nodes to an existing system. Since there’s no core component in the grid architecture to go out of date, the entire system can be piecewise refreshed over time, multiple times.

    Logical readability, on the other hand, requires that the customer change the way they handle data. It requires that they adopt widely standardized and widely documented, or self-documenting, data formats. It requires that they consider archiving virtual machine images with critical applications that are no longer supported, and identify how much of the environment must be virtualized to maintain long-term access. And most painfully, it requires that they immediately migrate data from formats that are already unsupported, lest it become more expensive to do so in the future.

    Archive is a class of storage, storage that can solve the physical readability issues, but it’s also a set of processes. You can’t implement the storage without a plan to use it as part of a complete archive environment.


    Comment by jeredfloyd — November 20, 2008 @ 1:07 pm

  6. […] Archive Challenges Filed under: Jered Floyd — jeredfloyd @ 5:40 pm After my post about dirty little secrets a few weeks ago, Joe Martins from Data Mobility Group wrote to point out the real “dirty […]

    Pingback by No Silver Bullet: Archive Challenges « Permabits and Petabytes — December 3, 2008 @ 5:46 pm

  7. […] storage) is “breakthrough technology” for archiving. Which is odd. In that industry insiders appear to have a somewhat different opinion on the value of CAS for archiving. Moreover, others point out that CAS is, itself, somewhat of a […]

    Pingback by » Blog Archive » “Top HPC trends” … or are they? — February 23, 2009 @ 8:23 am
