Posted by David Felstead
Sat, 17 May 2008 16:31:00 GMT
So that’s probably not a title you see floating around the blogowebs too often… switching AWAY from OS X? HERESY!!!
Now don’t get me wrong – I still loves me some OS X, in fact I’m writing this post on my MacBook Pro right now. But I have to be practical, and for developing code at Microsoft, the most practical course is to run Windows. That’s not to say that I’m disallowed from using my OS X machine at work, far from it – walk around campus and you’ll see plenty of folks working from MacBooks or MacBook Pros. But for dev work in search it’s really just not practical; there are just too many hoops to jump through and the cons end up outweighing the pros.
So as one of the douchebags who used OS X, Rails, TextMate and unix shell terminals to do nearly all of my coding, moving back to Windows has been interesting, as I’ve had to “reinvent” my dev environment somewhat. To clarify, my desktop machine runs Vista Enterprise x64, whereas my dev machine runs Windows Server 2003 x64.
So without further ado, here are the apps I’ve chosen that make up my new Windows based dev environment:
- UltraEdit – it’s an awesome text editor. It’s not free, but it’s cheap and it has nearly all the features of TextMate and a few extra to boot. It’s stable, works great with projects and has snippet support through injectable templates. I also love its column mode, which is really useful when you’re writing repetitive stuff. Plus its file search is ridiculously fast.
- Console2 – this neat little program allows you to wrap a nice, consistent interface over the Windows command shell and any other console apps that you like. As anyone who’s ever used the command line in Windows knows, it blows hard, but this little app gives you proper line based copy/pasting, a tabbed interface for running multiple shells, and some nice eye candy with transparency. I like this app a lot, and I have shortcuts in it to start up new dev environments for my various code enlistments, and a shortcut key to start IRB whenever I need to do some quick script hackery.
- Visual Studio – I don’t use it as an IDE, but it is hands down the best debugger on the planet. It’s saved me so much time, it’s ridiculous.
- UnxUtils – I do loves me some sed, awk, grep, find, etc… Yes, there’s Cygwin, but to me that’s way too heavyweight and I can’t stand the weird hybrid windows/bash environment. I’m pretty comfortable with the windows command shell anyways, and these utils just give me the little extras I need to be that little bit more productive.
- Pidgin – A multi-protocol IM client that’s almost as nice as Adium. I use it for my external IM (Live Messenger, AIM, Jabber, etc) and use Office Communicator 2007 for my internal IM.
- ...and I know I’m probably going to catch some flak for this, but Outlook 2007 is by far the best integrated mail/calendaring system I’ve used. Together with Exchange it syncs with my Blackjack and keeps me well and truly organised. I’m surprised to say that I like it a lot more than Mail and iCal in OS X.
- Robocopy – the new xcopy on steroids, it’s got a ton of options and a great status console, plus it’s NTFS permissions aware, which I don’t use much, but when I have to, it’s a life saver.
- Firefox with Firebug – This barely needs a mention, every web developer needs these. Honourable mention goes to the new developer tools built into the new IE8 beta, but they’re still nowhere near Firebug yet.
There are also a whole bunch of internal MS tools like Odd and tsSwitch that I use and that are great, but unfortunately they’re only available to MS employees – it’s amazing what kinds of great internal only software appears on the intranet, it’s seriously a gold mine. Hopefully some of it will someday see the light of day in one form or another, perhaps bundled with various resource kits and whatnot.
I might post a few more later, but I’d definitely be interested in other people’s experiences with coming back to Windows after an extended stay on OS X or Linux. It’s been a learning experience for sure, but I have to say I’ve been pleasantly surprised!
Posted in Miscellaneous, Development Methodologies | no comments
Posted by David Felstead
Sun, 11 May 2008 16:32:00 GMT
Wow, this blog sure has accumulated some dust over the past couple of years… A huge amount of stuff has changed in that time, including changes in career and even country!
Around July last year, I was offered a position at Microsoft in Redmond (Washington, USA for those not acquainted with the geography of the area), specifically working on Live’s Multimedia Search; that is, the Image Search and the Video Search.
It’s been a pretty huge transition—I was very sorry to leave Site5, it’s one hell of a company filled with really smart folks, but luckily we remain on good terms. The move from Australia to the greater Seattle area was a bit of a culture shock, but luckily myself, my kids and my wife have all settled in nicely.
The work at Microsoft is pretty amazing stuff… no doubt it’s pretty obvious that Live Search is gunning pretty hard against the Big G and the advances that I’ve seen on the search platform just in the nine odd months that I’ve been there have been nothing short of phenomenal. There’s still a ways to go, but things are going to get very interesting in the near future, I can promise you that.
From a technical standpoint, moving from being a Ruby/Rails guy to working primarily on a massive, distributed, hugely performant C++ system has been a pretty huge step, but it’s coming along really well, and is really interesting stuff. I used to think that working out ways to manage the couple of hundred servers at Site5 was a complex problem… until I came up against tens of thousands of servers :)
Anyways, I figured I might as well post an update, hopefully shake the dust off the site and maybe keep it going… we’ll see how it goes!
BTW, if you’re an awesome programmer, preferably with some C/C++ and web experience and the willingness to work on interesting, complex problems, we’re always on the lookout for new engineers – drop me a line at davfeld at microsoft dot com!
no comments | no trackbacks
Posted by David Felstead
Thu, 22 Jun 2006 01:24:00 GMT
Anyone who has been around Ruby and Rails (or any new software technology) for a while has seen or heard the question asked countless times: Does it scale?
Inevitably the one who is asking the question is referring to the scalability of the software framework, asking whether it can be easily expanded to accommodate copious amounts of requests and traffic. The question itself is one of those that can’t be answered simply, and no doubt the Rails aficionados will be rolling their eyes at the same question being raised yet again, as the vast majority of applications written with Rails will never grow to the size that will require them to scale. The Site5 Engineering team’s recent work on our new server monitoring and task management system Squire has made us look more closely into the scaling issue, and not just on a technological basis. With the massive growth that Site5 has been experiencing in recent months, it seems that some of the ways we used to take care of things with regards to server management just weren’t going to carry us through into the future.
The crux of my argument is this: when there is enough growth to warrant a re-evaluation of the scalability of an application, chances are that you’re going to have to re-evaluate your business processes in a similar way. Luckily, the Site5 Management team recognizes this, and have planned accordingly – in fact, most of the engineering team’s effort is being poured into future-proofing our fleet.
Site5 now has hundreds of new customers joining us every month and new servers being added all the time, so tasks that were once very simple for our support staff and system-admin gurus start to become more difficult. The off-the-shelf systems we use to monitor our server fleet start to become inadequate – they don’t provide enough detail on the sources of any issues occurring, and often require manual intervention from support staff to resolve. Though most issues take only a few minutes to resolve, a few minutes multiplied by a few hundred servers starts to become a big drain on resources. It might have worked before, it might still work today, but it won’t work in the future.
So enters our new server monitoring system Squire. This neat little piece of software (written completely from scratch by us on the Site5 Engineering Team) is a purpose-built web-hosting monitoring system. It’s closely integrated with our hand-built Synco CRM system and even with Site5 Backstage to proactively monitor and gather detailed statistics on each machine in our fleet. It automatically detects and resolves common issues on our machines and instantly notifies the support team when a problem that can’t be resolved automatically is encountered. In addition, it provides the customer service staff and support staff with quick links to customer information should customers need to be contacted in the case of problems. In fact, we have a few little Backstage features in store to keep our customers that much more informed… stay tuned!
Our system administration and customer service teams work extremely hard to keep our server fleet healthy and our customers happy. With Squire we hope to not only make their lives easier, but also to keep our customers up to date, informed, and, most importantly: happy.
Posted in Rails, Miscellaneous, Site5 | 1 comment | no trackbacks
Posted by David Felstead
Sat, 18 Mar 2006 01:54:00 GMT
Two days ago I was interviewed by Charles Wright of The Bleeding Edge, a blog that is a spin-off of his column of the same name in The Age and The Sydney Morning Herald newspapers. Charles has been working on a 2000-word magazine feature on Web 2.0 services and technologies and has been interviewing people from around Melbourne, Australia who are involved in this new web shift.
We had a nice chat on the phone about my own views on the whole “Web 2.0” hype, Ruby on Rails, why Site5 is the best web host out there and about some of our projects, especially Flashback. A brief summary of the results of our chat can be seen here, and I’ll be sure to post as soon as his magazine feature is released!
Posted in Rails, Ruby, Miscellaneous, Site5 | no comments | no trackbacks
Posted by David Felstead
Sun, 12 Mar 2006 03:10:00 GMT
Automated unit testing is now a mainstream concept in software development. The basic idea, for those who haven’t experienced it, is to write a battery of methods to poke and probe components of your application to make sure it’s doing what it should be, and report any failures – some sort of script or program is then run over the test battery to pick out any problems. Although it involves a larger up-front time investment, as your code evolves and expands it’s a massive time saver as it takes care of a lot of your regression testing automatically. Couple that with a continuous integration tool (we Site5 engineers use CIA with our Ruby on Rails projects) and you can (with a little effort) end up with a very tight, well tested development cycle.
One of the more useful concepts in automated unit testing and test driven development is the idea of mock objects – basically these are clones or extensions of your application’s objects modified slightly to allow them to operate in your test environment. Typically you will mock objects because:
- You cannot use the real object (perhaps it interfaces to an external component like a credit card processing gateway) or;
- The object you are testing against isn’t finalized or completed or;
- You want the object to behave (or fail) in a certain, predetermined way.
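To make the first reason concrete, here is a minimal sketch of a conventional, hand-written mock. The `PaymentGateway` and `MockPaymentGateway` classes and the `charge` method are hypothetical names invented for illustration; they are not from any Site5 code.

```ruby
# Hypothetical production class: the real one would talk to an external service,
# which is exactly what we cannot do from a test run.
class PaymentGateway
  def charge(amount_cents)
    raise NotImplementedError, "would contact the real gateway"
  end
end

# Hand-written mock: same interface, canned behaviour, and it records its calls
# so a test can inspect them afterwards.
class MockPaymentGateway < PaymentGateway
  attr_reader :charges

  def initialize(succeed = true)
    @succeed = succeed
    @charges = []
  end

  def charge(amount_cents)
    @charges << amount_cents
    @succeed ? :approved : :declined
  end
end

gateway = MockPaymentGateway.new(false)
puts gateway.charge(1999)     # the canned "declined" response
puts gateway.charges.inspect  # the amounts the code under test tried to charge
```

Every new failure scenario needs either another constructor flag or another mock class, which is where the tedium comes from.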
However, like all good programmers, I’m kind of lazy, and unfortunately, writing mock objects can be a very testing (heh) and arduous process, especially writing a million different mocks for a million different scenarios. The dynamic nature of ruby makes extending on the fly much easier, and I’m going to outline a little hack I thought up to help myself with point 3 above.
Now overriding the functionality in all the objects of a particular class is spectacularly easy in ruby – you can simply do something like this:
Before:
    puts 1   # => 1
    # Here's our weird contrived mock object
    # (Fixnum was merged into Integer in Ruby 2.4; this targets the Ruby of the time)
    class Fixnum
      def to_s
        "I don't like numbers"
      end
    end
After:
    puts 1   # => I don't like numbers
What happens though, if you want to override a method on a single instance of a class, and not on all of its kind? Typically you would define a mock object, and create an instance of it. But, in Ruby there is an easier and faster way that doesn’t involve writing a different mock class for each different scenario – and it is made possible by the singleton class. This clever bit of Ruby hackery lets you override the behaviour of a single instance of a class, creating what I’ve decided to call a partial mock object. To demonstrate, I’ve written a small method called override_method which will override the behaviour of the specified method in the passed object, like so:
    # Overrides the method +method_name+ in +obj+ with the passed block
    def override_method(obj, method_name, &block)
      # Get the singleton class/eigenclass for 'obj'
      klass = class << obj; self; end
      # Undefine the old method (using 'send' since 'undef_method' is private)
      klass.send(:undef_method, method_name)
      # Create the new method on the singleton class only
      klass.send(:define_method, method_name, block)
    end

    # Just an example class
    class Foo
      def do_stuff
        "I'm okay!"
      end
    end

    # Test code
    list = []
    5.times { list.push(Foo.new) }
    # We override the method here!
    override_method(list.first, :do_stuff) { "I'm NOT okay!" }
    list.each_with_index { |f, i| puts "(#{i}) #{f.do_stuff}" }
Outputs:
(0) I'm NOT okay!
(1) I'm okay!
(2) I'm okay!
(3) I'm okay!
(4) I'm okay!
As you can see, only the behaviour of the first object in the array has been changed – the rest remain untouched. Because of this, you can plant these partial dynamic mock objects deep in your code without specially instantiating a mock object there, or writing a ‘clever mock’ that only triggers the desired behaviour in certain objects.
Where this code comes in really handy is when you need an object to raise a difficult to simulate exception (like a disk full error) on a certain method to test your error handling – simply call override_method and pass in a call to raise and voila! Dynamic partial mock objects on the fly!
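The disk-full case can be sketched as follows. To keep the snippet self-contained, `override_method` is re-declared here using `define_singleton_method` (available since Ruby 1.9), which is a swapped-in shortcut rather than the implementation shown earlier; `DiskWriter` and `save` are hypothetical names used only for illustration.

```ruby
# Hypothetical class standing in for something that writes to disk
class DiskWriter
  def save(data)
    "saved #{data.length} bytes"
  end
end

# Stand-in for override_method: defines the block as a method on this one object only
def override_method(obj, method_name, &block)
  obj.define_singleton_method(method_name, &block)
end

writer = DiskWriter.new
# Make this single instance behave as if the disk were full
override_method(writer, :save) { |_data| raise Errno::ENOSPC }

# The error-handling path under test
result = begin
  writer.save("hello")
rescue Errno::ENOSPC => e
  "handled: #{e.class}"
end

puts result
puts DiskWriter.new.save("hello")  # other instances keep the original behaviour
```

Only `writer` raises; any other `DiskWriter` still saves normally, so the failure can be injected into exactly one object in a larger test.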
2 comments | no trackbacks
Posted by David Felstead
Fri, 03 Mar 2006 05:21:00 GMT
One of the major hurdles in developing FlashbackPRIME was handling repository locking between processes. Since we basically started from the ground up, unfortunately we lost some of the benefits of having established libraries do this work for us, and as it happened, had to re-invent the wheel in some areas.
What we wanted was a very simple, barebones method of locking an arbitrary resource (at the application level) so that all requests to it could be serialized. The more difficult part was making sure that the locks were available between processes, since FlashbackPRIME consists of several components – the web application and the back end daemons handling sweeping and restoring. So the inter-process requirement pretty much ruled out ruby’s Mutex and its associates, and I didn’t want to have to rely on a daemon or service running, so that eliminates DRb and Rinda. Going back to basics, it seemed that simple filesystem based file locks were a good match. The filesystem allows exclusive locking of files, and it is more or less a portable solution – preferable, since I develop on Mac OS X whereas Site5’s production servers are mostly CentOS Linux.
The final requirement is that the locking mechanism needs to be resilient – if a lock collision is detected, the application should continue to attempt to attain the lock several times before raising an exception. Since there is no single central arbiter to handle distributing locks, I decided that a random exponential backoff retry strategy (ala Ethernet) would be sufficient – access to the resource should be rare and more or less randomly distributed, so whilst this isn’t a foolproof method, it has tested very well.
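The primitive underneath all of this is `File#flock` combined with `File::LOCK_NB`, which returns `false` instead of blocking when someone else holds the lock. A minimal sketch, assuming a Unix-like system where a lock belongs to each separately opened file handle (so two handles in one process contend just like two processes would); the temp directory and file name are illustrative only:

```ruby
require 'tmpdir'

results = {}
Dir.mktmpdir do |dir|
  path = File.join(dir, 'demo.lock')  # illustrative lockfile name
  first  = File.open(path, 'a')
  second = File.open(path, 'a')
  begin
    # First handle takes the exclusive lock without blocking
    results[:first] = first.flock(File::LOCK_EX | File::LOCK_NB)
    # Second handle collides; with LOCK_NB the call returns false rather than waiting
    results[:second] = second.flock(File::LOCK_EX | File::LOCK_NB)
    # Once the first handle unlocks, the second attempt can succeed
    first.flock(File::LOCK_UN)
    results[:second_retry] = second.flock(File::LOCK_EX | File::LOCK_NB)
  ensure
    first.close
    second.close
  end
end

puts results.inspect
```

On Linux I would expect a successful `flock` to return `0` and the collision to return `false`; behaviour on other platforms may differ. The full routine wraps this call in a retry loop.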
    # Try to lock the resource and execute passed block within context of lock
    def try_lock(options={})
      # Set default options
      lockfile_path = options[:lock_file] || 'lockfile.lock'
      retries       = options[:retries] || 10
      retry_period  = options[:retry_period] || 0.5
      # Shared or exclusive lock?
      locking_method = options[:readonly_lock] ? File::LOCK_SH : File::LOCK_EX
      retries.times do |attempt|
        lockfile = File.open(lockfile_path, "a")
        # With LOCK_NB, flock returns false instead of blocking if the lock is held elsewhere
        locked = lockfile.flock(locking_method | File::LOCK_NB)
        if locked then
          begin
            # Record the owning pid in the lockfile
            lockfile.truncate(0)
            lockfile.puts(Process.pid)
            lockfile.flush
            return yield
          ensure
            # Closing the file releases the lock, whether or not the block raised
            lockfile.close
          end
        else
          lockfile.close rescue nil
          # Calculate exponential random backoff ala ethernet
          backoff_time = rand * retry_period * (2 ** attempt)
          STDERR.puts("Lock on '#{lockfile_path}' failed (pid:#{Process.pid}) - " +
                      "#{attempt+1}/#{retries} (backing off " +
                      "#{sprintf("%.2f", backoff_time)} seconds)")
          sleep(backoff_time)
        end
      end
      # If we get here, we're out of retries
      raise "Locking Error"
    end
It’s not the shortest piece of code, but it’s proven to be very reliable thus far. It’s used in the following way (all parameters are optional by the way, the defaults are in the code above):
    try_lock(:lock_file     => '/var/run/lockfile.pid',
             :readonly_lock => false,
             :retries       => 5,
             :retry_period  => 0.5) {
      # ...code accessing shared resource goes in here...
    }
This is a nice little way of synchronizing access to shared resources via application level code – it’s relatively portable (Linux, Mac OS X and Windows so far) and most importantly of all, it’s fairly resilient and self-repairing. Even deadlocks aren’t a major issue, as they will time out eventually – not ideal, but better than the alternative. One final caveat though: these lockfiles will not work across NFS mounted drives! That can be done, but I suspect you’ll need to look at doing some cleverer POSIX style locking using fcntl and its brethren.
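How long "eventually" is follows from the backoff formula in try_lock: each sleep is a random fraction of `retry_period * 2**attempt`, so the upper bounds double on every attempt. A small sketch of that arithmetic, using the defaults from the code above (`max_backoffs` is a hypothetical helper name, not part of the original):

```ruby
# Upper bound on each backoff sleep: retry_period * 2**attempt.
# The actual sleep is rand * that bound, so on average about half of it.
def max_backoffs(retries, retry_period)
  (0...retries).map { |attempt| retry_period * (2**attempt) }
end

bounds = max_backoffs(10, 0.5)
puts bounds.inspect
puts "worst-case total wait: #{bounds.sum} seconds"
puts "expected total wait:   #{bounds.sum / 2} seconds"
```

With 10 retries and a 0.5 second period the worst case is a little over eight and a half minutes before the "Locking Error" is raised, so callers that need to fail fast would want a smaller `:retries` value.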
1 comment | no trackbacks
Posted by David Felstead
Thu, 23 Feb 2006 04:01:00 GMT
It’s one of the cornerstone concepts of programming these days – Don’t Re-Invent the Wheel. These days there are so many third party libraries, utilities and frameworks available that more often than not you would be crazy to write the difficult stuff yourself. Occasionally, however, you find yourself outside the “more often”, and run into one of those “not” situations, one where just throwing more hardware at the problem won’t make it go away. Just recently, the Site5 Engineering Team (of which I am a member) ran into one of those “not” situations. The product? Flashback.
The problem
Flashback is a really nice piece of software. It’s a file explorer for your webspace with a difference – it not only allows you to see your website now, but also as it was a day ago. Or a week ago. Or a month. You get the idea. Any changes you make in your webspace are picked up and versioned by the Flashback engine and are recorded for posterity. You want to revert back to your old layout? No problem. Want to retrieve those images you accidentally deleted? They’re there.
The core of the original Flashback used to be the source control management software Subversion (or SVN), which is a great tool to add to any developer’s repertoire – and joy of joys, it even comes with an external API and, more importantly to us, bindings for Ruby. Now at first glance, one would assume that SVN would be pretty fast and performant – after all, it’s written in C and has a thriving open-source community contributing to its development. Unfortunately, that assumption (the word should have raised alarm bells) came back to bite us. Whilst being a great source control system, it turns out that when it comes to performance and efficiency, SVN is a real dog. And you know what? That’s fine. “More often” than “not”, you don’t care about performance in managing your source code, and for what it’s designed for, SVN ain’t so bad. Anywhere outside its comfort zone though… BZZZZZT! – no good.
YOU (yes you) can always do it better
Off on a tangent for a second – back when I was at university, probably in my second year of a computer science degree, we were assigned the typical task of implementing a Quicksort algorithm in C, and benchmarking it against various other sorting algorithms. Of course, the cynics and the realists in the group wondered what the point of this was. Any programmer worth their salt knows that the C standard library’s qsort function implements the Quicksort – Why Re-invent the Wheel? The surprise came when the class implemented the algorithm themselves and benchmarked it against the original qsort function. The result? Around 80% of the class had implemented a faster version of the algorithm, and these were second year uni students! A similar revelation came when a friend of mine, studying for his PhD, re-implemented some of the functions in string.h (rather than relying on the standard library) in a very CPU intensive experimental search engine. The result? It ran about 40% faster.
The moral? When it comes to performance, you can always do it better. Why? Because you know the problem you’re trying to solve.
FlashbackPRIME – a faster, more efficient wheel
So it turns out that SVN wasn’t up to scratch, and not viable for long term deployment – it’s just too slow and too much of a resource hog. So what to do? The first step was taking a few benchmarks. As a test, I implemented a few algorithms (change detection and repository updating) in pure Ruby and measured them against the same functions in SVN. The performance results were amazing – the pure Ruby solution outperformed the C based SVN (with Ruby bindings) by several orders of magnitude – it was literally hundreds (sometimes thousands) of times faster. With this data in hand, the Site5 Management Team gave me the go-ahead to re-implement the guts of Flashback, and with our lovely modular design of the first system, slotting it in was a breeze.
The final feature set of FlashbackPRIME is comparable to SVN’s:
- Both systems use a filesystem based repository
- They both have atomic, transactional commits with rollback capabilities
- Both have storage engines based on delta compression
- Both can store arbitrary metadata on items
There are a lot of things that SVN does that FlashbackPRIME does not, but the guts of the functionality is the same… and the results? Incredible. Here are some rough timings:
| Task | Flashback/SVN | FlashbackPRIME |
| --- | --- | --- |
| Populating large repository (several gigabytes, thousands of files) | about 2 hours | about 126 seconds |
| Sweeping same repository for changes | about 38 minutes | about 81 seconds |
| Sweeping smaller repositories | about 15 seconds | less than 1 second! |
Very unscientific figures, but you get the gist.
Sometimes it comes to a point where the wheel just won’t cut it any more, and these are the times that YOU as a developer need to take control and say “You know what? I can do better than that.”
Posted in Ruby, Site5, Development Methodologies | 4 comments | no trackbacks
Posted by David Felstead
Tue, 29 Nov 2005 05:47:00 GMT
ActiveRecord has a little-known column modifier called serialize that allows you to store arbitrary data in a column. Many people use it to store arbitrary or variable properties in hashes, something like this:
    class MyData < ActiveRecord::Base
      serialize :properties
    end
and then using it like this:
    a = MyData.create(:properties => {'name' => 'fred', 'food' => 'fish'})
    b = MyData.create(:properties => {'food' => 'chicken', 'colour' => 'purple'})
For a quick and dirty way of storing arbitrary data, it's great! All it does is turn the object you're passing into the 'properties' attribute into a YAML string and store it in the database in text form.
However, the problem with this is you can't query the data easily - you can't find, for example, all the items with property 'food' set to 'fish' without pulling all the data out of the DB and iterating through it to find your items. This is obviously a dumb thing to do from an efficiency perspective.
However, by generating an SQL condition to search the YAML strings, you can query the data quite easily, using the code below:
    class MyData < ActiveRecord::Base
      serialize :properties

      # Assumes the properties column data is a YAML-ised hash
      def self.hashfind_by_properties(key, value)
        # Dump a one-pair hash and strip the YAML document header line ("---")
        condition = YAML.dump(key => value).sub(/\A---[^\n]*\n/, '').chomp
        # Pass the pattern as a bound parameter: a string interpolated directly
        # into :conditions is NOT sanitized by ActiveRecord
        find(:all, :conditions => ["properties LIKE ?", "%\n#{condition}\n%"])
      end
    end
NB: Code has been modified since original post to simplify query
A bit nasty, but works well. It could be easily extended to chain conditions together though - if someone wants to do that, have fun! Similarly, it might go well in a Rails plugin - feel free to do that too.
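The idea behind the query can be checked without a database. The sketch below builds the same kind of pattern with plain Ruby and emulates `LIKE '%...%'` with a substring test; `yaml_like_pattern` is a hypothetical helper name, and the YAML layout assumed is the one produced by Ruby's bundled YAML library for a flat hash of strings.

```ruby
require 'yaml'

# Build a LIKE pattern matching one "key: value" line inside a YAML-serialized hash
def yaml_like_pattern(key, value)
  # Dump a one-pair hash and drop the "---" document header line
  line = YAML.dump(key => value).sub(/\A---[^\n]*\n/, '').chomp
  # Anchor on the surrounding newlines so "food: fish" does not match "food: fishcake"
  "%\n#{line}\n%"
end

stored_a = YAML.dump('name' => 'fred', 'food' => 'fish')
stored_b = YAML.dump('food' => 'chicken', 'colour' => 'purple')
pattern  = yaml_like_pattern('food', 'fish')

# Emulate SQL LIKE '%...%' by stripping the wildcards and testing for a substring
needle = pattern[1..-2]
puts stored_a.include?(needle)
puts stored_b.include?(needle)
```

This only works for simple scalar values; nested structures, or values containing the LIKE wildcards `%` and `_`, would need extra handling.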
Posted in Rails, Ruby | 1 comment | no trackbacks
Posted by David Felstead
Tue, 25 Oct 2005 00:39:00 GMT
Well, Site5 Flashback is basically complete and now in internal testing… finally! The development process, contributed to by the entire engineering team, as well as our illustrious leaders Matt and Rod, has been challenging, but a great experience for all of us. It’s been very much a case of: ”Once you finish the first 90% of a project, you have the other 90% to do.”
However, we have learned the following things:
- Trying to customise unsupported Linux kernel patches and apply them to non-standard kernels is about as much fun as it sounds.
- The ‘d’ in dnotify does in fact stand for suck.
- There is an implementation of unionfs for Linux, which is cool[1].
- Unix filesystem hardlinks are both very neat and very scary.
- Scott Deming has the amazing ability to make Linux do things that mere mortals would only dream of with no more than a smattering of C code and a few environment variables.
- That M# should actually be pronounced ‘Moctothorpe’
[1] What is less cool is that I think this is cool.
Anyways, enough of the tech-geekery and onto some eye-candy… The first release screenshots of Flashback !!! Woohoo!


These are just a few samples of some of the grooviness in Site5’s Flashback – and if you guys can’t make a good guess at what it is with these screenshots, well… :P
Anyways, it will be launching unbelievably soon, and it will without-a-doubt change the way you think about web hosting altogether…
Posted in Rails, Site5, Development Methodologies | 7 comments
Posted by David Felstead
Fri, 23 Sep 2005 10:43:00 GMT
Ruby on Rails is a great framework, but to get any kind of usable performance out of it, it needs to be run as a persistent application. Out of the box (or gem), Rails apps on Apache run as CGI applications – this means that for every request sent to your Rails app, Apache will go through the process of loading the entire framework, executing your request, and shutting it back down again. This, as you can imagine, is horrendously inefficient.
Up until recently, the only way to get Rails hosted with any decent amount of speed was to use FastCGI, or FCGI. FastCGI differs from regular CGI in that a number of persistent processes run on the webserver, and requests are farmed off to them as they are received. What this gets us is a big performance boost – the framework only needs to load once (for each worker process), and can serve requests with very low overhead. There are some very high traffic sites running with Rails on FastCGI, so it is a viable solution. But is it a good solution for Rails? Well… no.
The theory behind FCGI is fine – no worries there – it’s the implementation that’s a problem. FCGI with Ruby on Rails:
- Is a royal pain in the proverbial to set up;
- Is very difficult to monitor;
- Requires scheduled tasks/cron jobs to clean up after itself – ph33r the reaper (not to mention the spawner and the spinner)
- ...and is not currently under active development (the base library, anyways).
So… what to do? Well, enter Zed Shaw, SCGI, and his SCGI Rails Runner (or SRR for short) script. SCGI is FCGI done nicely – it’s the same concept, but with a significantly less complex implementation – that’s gotta be a big plus for anyone. The SRR, along with mod_scgi in Apache and lighttpd delivers the same performance as FastCGI, with a pure Ruby implementation – this is pretty impressive and a credit to Zed as a coder – FCGI uses compiled native code, yet SRR, a mere interpreted piece of Ruby script can keep pace with it. Impressive.
My experience installing SRR with Apache2, once I had some instructions, was very good – bear in mind I’ve dicked around with FastCGI for hours and still had little success, but going from installing mod_scgi and copying the scgi_rails script to being up and running took less than fifteen minutes. Nice. Very nice.
There are aspects of configuring Apache that still need looking at in my configuration; the most notable of them is the fact that all web requests, bar those with a ’.’ in the filename, are routed through SCGI - I’d like to make sure that only Rails stuff is routed through there, ala the dispatch.fcgi method in FastCGI. No doubt this can be achieved with some mod_rewrite funkiness, but my knowledge of its inner workings is somewhat limited. However, this is a minor thing and probably makes little difference to the overall performance – I’m just pig-headed and like things working my way. :)
SRR is still very young, but the fact that it is already at a usable level is unbelievably impressive, and to me highlights the promise in the project. I have no doubt that SCGI and Zed’s SRR will dethrone FastCGI on Rails very shortly…
Given its obvious advantages, we’re strongly considering offering it alongside RoR with FastCGI here at Site5 for all your Ruby on Rails web hosting needs – from there, the choice is yours. Let us know if you’re interested!
Posted in Rails, Ruby, Site5 | 1 comment | 2 trackbacks