My data storage and backup system

Inspired by Steven Frank’s description of his own system, I thought I’d take a moment to document how I store and back up our important data.

Backup Objectives.

  1. I want to get back up and running quickly if a computer’s startup drive dies.
  2. I want to be able to easily restore historical versions of important lost or modified files.
  3. I want to have important files stored ‘offsite,’ in case the house burns down.
  4. I want my backup system to be automated. If any part of it depends on me remembering to do something, then it will fail. (I’ve confirmed this, through personal experience, as a universal natural law.)
  5. I’d like some redundancy.

Software/Services.

My backup system relies on the following software and services:

  • SuperDuper will mirror one bootable drive to another. It can be scheduled to run automatically.
  • Time Machine. Time Machine is Apple’s archiving backup utility, which runs automatically.
  • ChronoSync is a general-purpose backup tool for OS X that can be scheduled to run automatically. Critical to my system is its ability to create an archive of changed or deleted files.
  • Backblaze is an online service that, for $50 per year, will back up an entire computer, along with any disks that are attached to it. The service provides unlimited storage. You can restore files online, or have Backblaze FedEx you DVDs or USB drives.
  • Dropbox is an online service that creates a local folder on your Mac, the contents of which are sync’d to Dropbox’s servers (Amazon S3). When installed using the same username/password on two or more machines, Dropbox can be used to keep a shared folder in sync.

Context.

My home computer is a Mac Pro, which serves the following functions:

  • iTunes server (to other computers on our home network, and an Apple TV)
  • Master repository for our important files
  • My wife’s work computer

It has four internal drives installed:

  • Everest. The startup drive. Apart from applications and preferences, the only user data it contains is my wife’s account and work files.
  • EverestMirror. A bootable mirror of Everest.
  • Pumori. This is a 2TB Hitachi drive that contains the master repository of all our important files — iTunes music and videos, home photos and videos, non-current archives of personal and business documents, purchased software installers, etc.
  • Time Machine. This is a 1TB drive, used as a Time Machine target for Everest.

In addition, the Mac Pro has a Drobo attached, whose file system looks like this:

  • /Archives
  • /Backups
    • /Pumori
    • /Dropbox

The Setup.

  • To achieve quick recovery in case of startup drive failure, I have SuperDuper automatically mirror Everest to EverestMirror each night. A Growl notification lets me know this went OK each morning. (Objectives 1 & 4)
  • All of our current personal and work files are kept in Dropbox. This keeps them synchronized between my wife’s area of the Mac Pro and my personal MacBook, and provides an online backup of them. (Objectives 2, 3, 4 & 5)
  • As a first line of recovering a lost or modified file from my wife’s working area, I have Time Machine archiving Everest to the “Time Machine” drive. (Objectives 2 & 4)
  • As a second line of recovering a lost or modified file from a portion of my wife’s working area (/Users/Wife/Dropbox), or from Pumori, I have ChronoSync perform an archiving backup to the Drobo (in /Backups/Pumori and /Backups/Dropbox) each night. To avoid filling up the Drobo, I have ChronoSync keep a maximum of five archived copies of any given file. I also have ChronoSync configured to clean up the archives by deleting files over 180 days old, but to always preserve at least one archived version of every file, regardless of how old it is. Finally, I have ChronoSync configured to email me if anything goes wrong during its nightly backup. (Objectives 2, 4 & 5)
  • For offsite backup, I have Backblaze back up Everest and Pumori. Given the large amount of data on Pumori, I decided to exclude iTunes “Movies” and “TV Shows”. They’re already backed up on the Drobo, and I figure that in the very worst case, where the house burns down, I could just repurchase those as I wanted to view them again. Music, on the other hand, is something I wouldn’t want to have to repurchase, and so Backblaze does keep that backed up (approximately 45 GB). (Objectives 3 & 4)
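
The ChronoSync archive rules described above (at most five archived copies per file, nothing older than 180 days, but never fewer than one version) are configured in the ChronoSync interface, not in code. Purely as an illustration of how those three rules interact, here is a minimal Python sketch. The function name and parameters are hypothetical, and this is not how ChronoSync itself is implemented.

```python
from datetime import datetime, timedelta


def select_versions_to_keep(archived_at, now, max_copies=5, max_age_days=180):
    """Return the archive timestamps to keep for ONE file, newest first.

    archived_at: list of datetimes, one per archived version of the file.
    All names here are illustrative assumptions, not ChronoSync settings.
    """
    newest_first = sorted(archived_at, reverse=True)

    # Rule 1: keep at most `max_copies` archived versions of a file.
    candidates = newest_first[:max_copies]

    # Rule 2: discard versions older than `max_age_days`.
    cutoff = now - timedelta(days=max_age_days)
    kept = [t for t in candidates if t >= cutoff]

    # Rule 3: always preserve at least one version, however old it is.
    if not kept and newest_first:
        kept = [newest_first[0]]

    return kept


if __name__ == "__main__":
    now = datetime(2010, 6, 1)
    versions = [now - timedelta(days=d) for d in (1, 10, 20, 30, 40, 50, 200)]
    for kept_version in select_versions_to_keep(versions, now):
        print(kept_version.date())
```

Applying the count limit first and the age limit second is a choice made for this sketch. With these particular rules the order does not change the result, since both only ever drop the oldest versions.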

Final Notes.

  • I also use the Drobo to archive files that are large, uncritical and not backed up. For example, original DVD rips of movies (the VIDEO_TS folders).
  • For probably a decade, I’ve used File Buddy to search for files (in my archives), when needed. It’s very configurable and fast.
  • Backblaze is great. Since first purchasing it, I’ve bought licenses for my own MacBook, our office server, and my mother’s computer. (More precisely, I’ve added those computers to my single Backblaze account, so that I can manage and restore from them all in a single web interface.)
  • Dropbox is also great. We’ve been using it extensively for a couple years now, and it has never failed us — which is quite a feat, considering the large number of files it manages, and the crazy folder reorganizations we’ve performed!
  • Both Backblaze and Dropbox also keep archived versions of changed or deleted files, but I prefer the convenience and configurability of my own solution. It’s nice to know those other versions are there, though.
  • As a curiosity, I suspect both Dropbox and Backblaze make use of “Content Addressable Storage,” a technique in which a piece of binary data is identified by a signature (hash) computed from its contents, so that identical data only needs to be stored once. For example, I’ve dropped a huge Adobe Installer DMG file into Dropbox, and watched it appear sync’d with Dropbox’s servers almost instantly. I take this to mean that some other Dropbox customer had already uploaded that particular file, and that my own Dropbox recognized its “signature” and realized that it didn’t need to be uploaded again. Very clever. Along with multiplying the profits of Dropbox (since they pay their provider once to store a given file, while collecting payments from each of the N customers who have it), this also benefits users, as folder reorganization doesn’t require lengthy re-uploads of your files.
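
To make the content-addressable idea concrete, here is a toy Python sketch. It is only a guess at the general technique, not Dropbox’s or Backblaze’s actual implementation, and the class and method names (ContentStore, UserFolder, put, add, move) are hypothetical. Data is stored under the SHA-256 hash of its contents, so a second upload of the same bytes, or a move within a folder, transfers nothing new.

```python
import hashlib


class ContentStore:
    """Toy server-side store: each blob is keyed by the hash of its contents."""

    def __init__(self):
        self.blobs = {}    # digest -> bytes
        self.uploads = 0   # number of times data actually had to be stored

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            # Only previously unseen content is stored ("uploaded").
            self.blobs[digest] = data
            self.uploads += 1
        return digest


class UserFolder:
    """Toy client-side folder: maps file paths to content digests."""

    def __init__(self, store: ContentStore):
        self.store = store
        self.files = {}    # path -> digest

    def add(self, path: str, data: bytes) -> None:
        self.files[path] = self.store.put(data)

    def move(self, old_path: str, new_path: str) -> None:
        # Reorganizing only changes the path; the stored content is untouched.
        self.files[new_path] = self.files.pop(old_path)


if __name__ == "__main__":
    store = ContentStore()
    alice, bob = UserFolder(store), UserFolder(store)
    installer = b"pretend this is a huge installer image"

    alice.add("Installers/Adobe.dmg", installer)
    bob.add("Downloads/Adobe.dmg", installer)     # same bytes: no new upload
    bob.move("Downloads/Adobe.dmg", "Software/Adobe.dmg")
    print("uploads:", store.uploads)
```

A real service would typically hash files in chunks rather than whole, but the whole-file version is enough to show why the installer appeared to sync instantly.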

Update 2016-06-02

Since writing this article, I’ve made some changes:

  1. I now have my entire backup strategy based on CrashPlan, as documented in this article.
  2. For file synchronization and sharing, we now use BitTorrent Sync, a peer-to-peer system which is free and doesn’t involve any third-party servers, as documented in this article.

5 thoughts on “My data storage and backup system”

  1. I have an almost identical setup with my computers, and with my backup. It’s uncanny how similar, in fact, with the exception of ChronoSync. I’ll have to take a look at it. Sounds very intriguing.

    I was curious how you set up SuperDuper to fire off a Growl alert. I assume you changed the settings for SuperDuper in the Growl Prefs Pane? Or did you use a shell script from within SuperDuper?

    I’ve recently started to move my offline backup from BackBlaze to CrashPlan. It offers a cloud backup like BackBlaze, but provides two additional benefits.

    First, you can back up between any two computers, not just to their service, and it costs nothing. I have both my parents’ computer and my wife’s iMac backing up to backup folders on my Drobo. Then, I back up my system to an external 2TB drive I sent to my parents (what’s cool is I did a backup to the drive myself first, then FedExed it to them, and CrashPlan picked up where it left off…slick). I’m also backing up to CrashPlan’s cloud-based servers, but I’m not sure I need the redundancy. For $5/mo, though, I might just keep it.

    Second, CrashPlan is much better behaved in terms of CPU load. You can throttle the % of the CPU used when at and away from your computer, and I find CrashPlan to obey these settings faithfully.

    Only downside is that the client is a Java app with a very un-Mac-like UI. It’s decent enough (though not as simple as BackBlaze) and given that it’s used only rarely, I can live with that.

    Great post. Thanks for sharing.

  2. Sir,

    I found this article very interesting following your previous backup strategy article nearly five years ago. That article made me seriously think about backup and realize that my strategy then was lacking. I changed and to this day, I still engage the strategy used by you at that time (alas, no Drobo for me), but use Dropbox and Backblaze as well.

    I would be most interested to know your current view on Synchronize! Pro from Qdea, having regard to today’s backup solutions. This is relevant to me as that software was central to your backup strategy nearly five years ago (and is still used by me), but did not get a mention in your recent article.

    Thanks.

  3. Hello, We currently run Tresorit for all relevant personal and business data and use Time Machine (backing up to a Pro w/ OS X Server running) as an extra layer of redundancy. We had a dedicated 1TB hard drive that stopped working a couple of weeks ago, so I was diving a little into the affordable backup theme to decide if we should buy new hardware or select a cloud solution.

    Then I ran into your 6 year old blog and found it interesting. You have a very robust backup system.

    If I may, does Backblaze use a zero-knowledge technology, or do their administrative and support staff have access to your data? I have been looking at their site but wasn’t convinced of the first.

    I really don’t feel comfortable relinquishing my financial information, project planning and photos to a cloud service that doesn’t use zero-knowledge. That’s the main reason we left Dropbox for Tresorit a couple of years ago.

    Thank you for your post and, in advance, for some extra guidance.

    Kind Regards, Rui

    1. Hi Rui, since writing this article, I’ve changed my system to this:

      http://go.dafacto.com/dmbw3

      1. For synchronization and sharing of files at work and privately, we use BitTorrent Sync, which is a peer to peer system that doesn’t involve any third-parties at all. And, it’s free for up to 10 sync’d folders!

      2. For both local and cloud backup, we use CrashPlan, which allows you to specify a passphrase that’s used to encrypt the data that’s stored on their servers.

      Finally, for more information about my BitTorrent Sync setup, there’s this:

      http://go.dafacto.com/o0p2c

      Hope this helps!

      1. Hi Matt,

        Thank you for your reply, it was very helpful.

        Since then I have been taking a closer look into CrashPlan and started testing it yesterday. I expect to have it implemented sometime today, and tomorrow I will simulate a recovery situation to get a more in-depth view of how it would work (if need be).

        Kind regards, Rui
