Saturday, April 25, 2009

Ma.gnolia Data Loss and Future

I've been watching the data loss at Ma.gnolia with some interest. Ma.gnolia is (was) a social bookmarking web site with a small but engaged community of users. Amongst the social bookmarking websites, we felt it was one of our stronger competitors.

But, on Jan 31, 2009, they had a catastrophic data loss and the site went down. At first, it looked like they would recover in a few hours. Hours dragged on into days, and days into weeks. Even now, some 3 months later, they have not returned. They did provide some data recovery tools to their users to try to reclaim their bookmarks from publicly available copies on the web.

I just finished watching this video interview of Larry Halff, the creator behind Ma.gnolia. At first, I thought they were being a bit flippant about losing all of their users' data, but Larry proved to be very forthcoming and contrite about the episode.

The interesting facts and insights from this video are:

  1. They lost 1/2 Terabyte of data.
  2. They did not test their backup system.
  3. They had only one backup (not a rolling backup).
  4. They ignored operational issues and performance as the site grew.
  5. Ma.gnolia was running on 4 Mac minis as web servers, with 1 main database server and 1 backup server.
  6. Despite using software RAID storage, file system corruption caused catastrophic data loss and prevented their backup from working.
  7. Ma.gnolia "never made money" (completely self-funded), but did build a loyal community of users.
  8. Larry feels that "destination social bookmarking" does not have a good prognosis - more people are using the large social sites (Facebook, Twitter) for sharing links.
  9. Ma.gnolia will return hosted on AWS (cloud infrastructure) with better backup systems in place. It will relaunch as a private beta and be invitation only (he doesn't say why).
  10. Larry suggests that web 2.0 sites disclose more about their infrastructure and backup procedures, so users can be more comfortable knowing how their data is being managed.
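
To make points 2 and 3 concrete, here is a minimal Python sketch of what a "rolling backup" policy and a basic restore check might look like. The function names, the retention windows (7 daily and 4 weekly copies), and the choice of Sunday for the weekly copy are all assumptions for illustration; this is not what Ma.gnolia or anyone else actually ran.

```python
import hashlib
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, keep_daily=7, keep_weekly=4):
    """Return the backup dates a simple rolling policy would retain.

    Hypothetical policy: keep every backup from the last `keep_daily`
    days, plus Sunday backups from the last `keep_weekly` weeks.
    """
    daily_cutoff = today - timedelta(days=keep_daily)
    weekly_cutoff = today - timedelta(weeks=keep_weekly)
    keep = set()
    for d in backup_dates:
        if d > daily_cutoff:
            keep.add(d)  # recent enough to keep as a daily copy
        elif d > weekly_cutoff and d.weekday() == 6:
            keep.add(d)  # older, but a Sunday inside the weekly window
    return keep


def same_checksum(original: bytes, restored: bytes) -> bool:
    """Compare SHA-256 digests of the original data and a restored copy."""
    return hashlib.sha256(original).digest() == hashlib.sha256(restored).digest()
```

With more than one generation of backups on hand, a corrupted latest copy does not have to be the only copy. The checksum comparison is only the last step of a restore test; the real test is restoring to a separate machine on a regular schedule.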

With this in mind, I think it appropriate to disclose some of the behind-the-scenes facts about Faves:

  1. Faves was built by a professional development team (7 developers) beginning in 2005.
  2. After having raised multiple rounds of funding, we realized that we were not earning enough money via advertising to sustain that team. In the fall of 2008, we had to lay off all full-time employees.
  3. Faves is still being operated by myself and 1 part-time operations person.
  4. While somewhat downsized, we retain a fairly large scale data center with 4 front-end web servers, and 6 backend database servers.
  5. We perform nightly incremental backups of all of our data, as well as complete database snapshots taken on a weekly basis.
  6. We have NOT done a recent full-scale data recovery test, though we used to restore a snapshot of our site to the test server we used for internal development.
  7. We recently raised a small round of financing which will enable us to continue to operate for the foreseeable future (2+ years, even without increasing our site revenue).
  8. We also have concerns about our site performance, and have prioritized migrating to a cloud-based backend.

As we evolve the service, we are looking to be more relevant to users who are increasingly adopting Facebook or Twitter as the primary means of communicating with their friends, family, and co-workers. We are also finding that a large proportion of users of our site are not authentic "bookmarkers", but rather are creating links to other web sites for marketing, or spam.

Our challenges, and priorities, going forward are:

  1. Improve site performance and reliability.
  2. Make Faves more relevant and useful to users of Facebook and Twitter.
  3. Put better systems in place for dealing with spam users and adult content.
  4. Reduce reliance on advertising as a revenue model (ask Faves users to directly support the site, via donations or premium features).

If you have feedback for me, you can send me mail at mike at