Permabits and Petabytes

August 22, 2008

Off the Grid, Out of the Cloud

Filed under: Jered Floyd — jeredfloyd @ 6:00 pm

No, not grid storage or cloud computing. I’m off to the wilds of Black Rock City, NV, and this year’s Burning Man festival, so you won’t be hearing from me for another two weeks after the recent flurry of activity. If I see anything particularly storage related, however, I will try to make a post!

In the meantime, be sure to check out the webinar Mike is doing next week; it promises to be interesting. Also, in addition to The Trouble With RAID, I’ve got a few more videos that will be going up over the coming weeks, explaining in more detail Permabit technologies such as RAIN-EC and SDR.

See you all again soon!

Deduplication is Not a Crime

Filed under: Jered Floyd — jeredfloyd @ 10:00 am

We’re starting to get deep into the election season, so the negative ads are coming fast and furious. Shadowy pictures and a scary voice saying things like “John Smith says that he supports healthy meals for school children, but could it really be because he’s fattening them up to be sold as meat to foreign terrorists? A child-eating terrorist supporter? Is that really the sort of person you want as your state representative?” The sort of manipulative FUD that scares people on an issue without actually presenting any evidence.

We get the same sort of thing in storage. It’s not seasonal, though.

For example, there’s been a good amount of FUD about deduplication. (more…)

August 21, 2008

Reducing Primary Storage Costs

Filed under: Jered Floyd — jeredfloyd @ 2:15 pm

I’ve written a lot here about how cost reduction is a primary driver for implementing an enterprise archive system, but I haven’t yet explained exactly how implementing a product like Permabit Enterprise Archive will directly (and immediately) save you money.

I’m going to be traveling for the next week and a half, but conveniently Mike Ivanov, our VP of Marketing, will be giving a webinar on this very topic while I’m away. Mike will be presenting Enterprise Archiving: Five Steps to Reduce Primary Storage Costs on Tuesday, August 26 at 1 PM EDT. If you can’t make that, he’ll also present again on Tuesday, September 9, also at 1 PM.

It’s a free webinar, and we’ll even send you a 1 GB USB flash drive to add to your collection. (I’m considering making some sort of tribal data storage necklace with all of mine.) The free 1 GB is not, of course, one of the five ways in which we save you money on storage, so maybe that makes it six if you attend?

Over the past few months we’ve done an extensive survey of storage and IT departments at enterprise customers. One of the big things we learned is that 72 percent of them are not seeing any growth in their IT budget, yet they have to keep up with storage growth of 50 percent or more. Everybody likes to save money, but reducing primary storage cost has now become critical.

I don’t want to steal Mike’s thunder by going into the details here on how enterprise archive cuts storage costs. I’ll write about that once I’m back in town, but for now, go tune in to his presentation! You can register here.

August 20, 2008

Are Fibre Channel and SCSI Drives More Reliable?

Filed under: Jered Floyd — jeredfloyd @ 9:04 pm

One of the adages of the storage industry has been “Fibre Channel and SCSI drives are more reliable than SATA and PATA drives”. This has always confused me. The technology in the spindles just doesn’t change that much, and in the past the difference between the SCSI and ATA models of a drive may have been as little as different drive electronics on the same spindle.

How could SCSI drives have been more reliable? Could it have something to do with them costing three times as much for the same amount of storage? Hmmm…

It used to be easy to find comparable drives in SATA and SCSI flavors, but that’s become increasingly difficult with the advent of 10K and 15K RPM drives. The drive manufacturers have created a false segmentation in the market, where 10K and 15K RPM drives are only available in SCSI, FC and SAS flavors, and almost never in SATA. Western Digital was the lone company that broke the rules of this cabal, but they seem to have been shamed back to offering only a single model, the VelociRaptor. Let’s hope the FTC decides to start looking into this. (more…)

August 19, 2008

Deduplication is Not a Feature

Filed under: Jered Floyd — jeredfloyd @ 8:23 pm

I’ve been writing an awful lot about deduplication lately, how it works, how it doesn’t, and how Permabit does it. I’ve been drumming it up a lot, so now I’m going to turn the tables and say something different: Deduplication doesn’t matter.

No, I’m not contradicting myself.

When you set out to buy an archive storage product, there are things that are features and things that are product characteristics. Examples of features are NFS protocol interface, unlimited volume size, low cost, and comes in blue, red or black. Examples of characteristics are Intel processor, SAS drives, number of gigabytes of RAM and, yes, deduplication.

These look like similar lists; what’s the difference? (more…)

August 18, 2008

Jet Engine on a Duck: You Can’t Retrofit Dedupe

Filed under: Jered Floyd — jeredfloyd @ 9:14 pm

As I wrote about last month, hash collisions are not something to be concerned about in a properly designed deduplicating storage system, despite what some FUD vendors would like you to think. You don’t have to take just my word for it; Curtis Preston wrote about this last year too.

In fact, hash-based systems are the most likely to be capable of handling enormous amounts of data for deduplication; the challenges in building an efficient system for matching hashes (or fingerprints) are entirely different from the sorts of problems storage system builders have had to solve in the past. Permabit set out from day one to build a system capable of efficiently deduplicating petabytes of storage, and this technology is realized in Enterprise Archive. We’ve developed our own file systems and distributed transaction managers to solve just these problems, so for any chunk of data written to our system we can determine in mere milliseconds if we’ve seen that information before.
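To make the idea of matching fingerprints concrete, here is a minimal sketch of in-line, hash-based deduplication. It is a toy and not how Enterprise Archive is built: the DedupStore class, the choice of SHA-256, and the in-memory dictionary standing in for a fingerprint index are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy in-line deduplicating chunk store (hypothetical, illustration only)."""

    def __init__(self):
        self.chunks = {}       # fingerprint -> chunk bytes, each kept once
        self.bytes_in = 0      # logical bytes written by clients
        self.bytes_stored = 0  # physical bytes actually kept

    def write(self, chunk: bytes) -> str:
        """Fingerprint the chunk as it arrives and store it only if unseen."""
        # SHA-256 is an assumed fingerprint function, not a documented choice.
        fingerprint = hashlib.sha256(chunk).hexdigest()
        self.bytes_in += len(chunk)
        if fingerprint not in self.chunks:
            self.chunks[fingerprint] = chunk
            self.bytes_stored += len(chunk)
        return fingerprint

    def read(self, fingerprint: str) -> bytes:
        """Return the chunk previously stored under this fingerprint."""
        return self.chunks[fingerprint]


store = DedupStore()
refs = [store.write(c) for c in (b"alpha", b"beta", b"alpha")]
print(store.bytes_in, store.bytes_stored)
```

The duplicate chunk is recognized at write time, so it never consumes space and no later clean-up pass is needed. The hard part in a real system is making that lookup fast when the index is far too large for a single machine's memory, which this sketch does not attempt.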

Because we have this in-line processing of incoming data, there’s never a need for spare storage space to cache data for later deduplication, and there’s never a deduplication window where the system has to pause to “catch up” with data that’s been written so far. These are some of the key benefits to in-line deduplication. The story isn’t so good for non-purpose-built systems, unfortunately. (more…)

August 16, 2008

Two Types of Archives

Filed under: Jered Floyd — jeredfloyd @ 4:15 pm

Blocks and Files has rapidly become one of my favo(u)rite sources of daily storage news, partly due to content and partly due to the understated, cynical British humo(u)r that pervades, continuing the tradition of more general tech news sites like The Register and The Inquirer.

Most recently they have published two related articles on the maturation of archive technology in the enterprise, and I think both are pretty much spot on. The first, “Archive Layer Cake” by editor Chris Mellor, primarily highlights the vertical integration of the technologies that bring data from primary source to archival residence. The second, “The Evolution of the Archive” by Plasmon marketing director Steve Tongish, elaborates on the horizontal archive consolidation Chris touches on in the first. Both are trends we are also seeing here at Permabit. We agree with these views of the future of archiving, as evidenced by our recent partnership with archive software vendor Atempo. (more…)

August 15, 2008

Multiple Drive Failures: RAID 6 vs. RAIN-EC

Filed under: Jered Floyd — jeredfloyd @ 1:03 am

Buried in the middle of an article earlier this week on problems with MozyPro restore performance was an interesting nugget:

A wildfire in July caused Santa Barbara to be hit with several power outages, which led to the failure last week of one of three drives in a Teddy Bear server’s RAID group. Before a replacement drive could be installed, another drive in the group failed, and the foundation’s data was lost.

This is a far more interesting story than a cloud backup service failing to perform (as if anyone should be surprised there, given the recent string of cloud-storage growing pains), because it’s a publicly documented double drive failure. Why do I find that interesting? (more…)
