Skedsheet Blog

Where we talk about the product, calendars, organization, and business

Shifting bottlenecks


[Image: server rack, from http://www.flickr.com/photos/jamisonjudd/2433102356/]

Today is a great day! I finally finished moving all our data to the new servers. I love finishing a project, especially one that involves working until midnight every night until it’s done.

As part of this migration, I outsourced our email hosting to Rackspace, and I couldn’t be happier. They have much better spam filters than we had on our server, and I don’t have to think about it anymore. Hosting our own email wasn’t difficult, but migrating it to a new server was a pain: you have to time everything just right so you don’t lose mail or have it delivered to the wrong location. Rackspace does it better, and it’s one less thing for me to think about. Well worth the money.

For the new server setup, we’ve got a couple of dedicated file servers, and we use Windows Distributed File System (DFS) Replication to continuously replicate files between them. It’s wonderful! I’ve been trying to get to this point for many years, but I was using the wrong tools. Previously I used rsync or SyncBack, but the problem was always the time it takes to scan all the files for changes. (We currently have 700,000 files, so this is a big deal.) DFS instead uses the NTFS change journal to track which files have changed, so it doesn’t have to scan the file system to be sure things are synced up. If there’s a third-party tool that does this, I’d love to know about it, because the only problem I have with DFS is that it requires all the servers to be on the same domain (or in the same forest on a WAN). The other tools I’ve seen that use the NTFS change journal still say you have to scan all the files periodically to make sure you haven’t lost any changes somewhere along the way.
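To make the difference concrete, here is a minimal sketch in Python. It is not the NTFS change journal API or anything DFS actually runs. `full_scan_changes`, `ChangeRecord`, and `ChangeJournal` are hypothetical names for a toy model. The point is only that a scan has to visit every file, while a journal reader only touches the records written since its last checkpoint.

```python
import os
from dataclasses import dataclass


def full_scan_changes(root, last_sync_time):
    """Walk every file under root and return those modified after last_sync_time.

    The work grows with the total number of files, even if nothing changed.
    Returns (sorted changed paths, number of files visited).
    """
    changed = []
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            visited += 1
            if os.path.getmtime(path) > last_sync_time:
                changed.append(path)
    return sorted(changed), visited


@dataclass
class ChangeRecord:
    sequence: int  # monotonically increasing record number
    path: str      # file that was created or modified


class ChangeJournal:
    """Toy stand-in for a file system change journal (an assumption, not NTFS)."""

    def __init__(self):
        self._records = []
        self._next_sequence = 1

    def record_change(self, path):
        """Called by the (imaginary) file system whenever a file changes."""
        self._records.append(ChangeRecord(self._next_sequence, path))
        self._next_sequence += 1

    def changes_since(self, last_sequence):
        """Return (sorted unique paths, newest sequence) after last_sequence.

        The work grows with the number of changes, not the number of files.
        """
        newer = [r for r in self._records if r.sequence > last_sequence]
        paths = sorted({r.path for r in newer})
        newest = newer[-1].sequence if newer else last_sequence
        return paths, newest
```

With 700,000 files and a handful of edits, the first function still stats all 700,000 files, while the second only reads a handful of records. The replicator keeps the returned sequence number as its checkpoint for the next pass.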

When we started out many years ago, we had only one server. To back up the data, we first created a database backup, which involves reading the entire database and writing out a backup file. Then we copied all the files (or just the changes) to a backup, which at a minimum means scanning all the files for changes and then reading every changed file to send it to the backup. That’s a very disk-intensive process, so while the backup was running, it hurt the performance of the application. While it would have been nice to have frequent backups, the performance penalty meant we could only take backups during off-peak hours.

For our second generation of servers, we bought the fastest drives we could. That doubled the cost of the servers, but it meant the backups didn’t hurt application performance as much. That worked OK, but we still had other problems around disaster recovery. If a server failed and took all its data with it, we’d have to recover from backup. That involved moving hundreds of GBs of data, which is slow no matter how fast the drives are.

Now we’re on the third generation of servers, and we’ve taken a different approach. The application servers are only responsible for periodically creating database backups on a file server. That’s much less disk-intensive, and has minimal impact on application performance. We’ve shifted all the rest of the backup processing to the file servers, and even that is spread out over a couple of servers to make sure it doesn’t impact application performance. So this not only improves performance, but it also gives us more redundancy and allows us to do more frequent backups during the workday.
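The application server’s side of that arrangement can be sketched roughly as follows. This is an illustration under assumptions, not our actual script: `publish_backup`, the timestamped file name, and the copy-then-rename step are all hypothetical choices. It assumes the database has already written its backup file locally, and that the file server share is reachable as an ordinary directory.

```python
import shutil
from datetime import datetime
from pathlib import Path


def publish_backup(backup_file, share_dir, now=None):
    """Copy a finished database backup to the file server share.

    This is the only backup work the application server does in this sketch.
    Replicating and archiving the file is left to the file servers.
    Returns the destination path.
    """
    backup_file = Path(backup_file)
    share_dir = Path(share_dir)
    share_dir.mkdir(parents=True, exist_ok=True)

    # Timestamped name (an assumption) so frequent backups don't overwrite each other.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dest = share_dir / f"{backup_file.stem}-{stamp}{backup_file.suffix}"

    # Copy under a temporary name, then rename, so a half-written file
    # never appears under its final name.
    partial = dest.with_name(dest.name + ".partial")
    shutil.copyfile(backup_file, partial)
    partial.replace(dest)
    return dest
```

A scheduled task on each application server could call something like this after every database backup. Because it is one sequential copy of one file, it can run during the workday without the scan-everything cost described earlier.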

So does this fix all our problems? No way. But it changes things so much, I don’t know what the next problem is going to be. This week I’ll be using the excellent PAL tool from Clint Huffman to analyze the server performance, and start designing the next iteration of the server architecture. Some days it’s hard to tell I’m in the software business.


Written by Ted Pitts

July 18, 2009 at 10:59 am

Posted in Uncategorized

One Response


  1. […] that, there are the ongoing costs of running skedsheet.  We’ll need servers, which will probably start around $500/month, and grow as we’re successful.  I’m sure we […]

