Search

Map Reduce Jobs in Ruby with Wukong

Patrick Van Stee

2 min read

Oct 24, 2011

Map Reduce Jobs in Ruby with Wukong

Hadoop + Ruby

I recently got to work on some really interesting, big data problems at Highgroove. One of our clients needed to record every api call and analyze specific time periods for averages and usage metrics. Hadoop fits this use case pretty well with a distributed file system and map reduce framework built in. But, I’ll be honest, I wasn’t too happy about having to write map reduce jobs in Java. While looking for alternatives I found Wukong; a gem that adds a Ruby wrapper around the Hadoop Streaming utility. Here’s an example of how easy it makes writing map reduce jobs.

Now there are always trade offs. Streaming queries, especially to a Ruby script, are definitely slower than their pure Java counterparts. But, between the base startup time of any map reduce job (about 30 seconds) and the speed up from running all jobs in parallel across a cluster of workers, it’s really not that bad and totally worth the improvements in readability and maintenance. Also, because this script runs over the entire cluster, you have to make sure that each machine has the right ruby version and and all necessary gems installed, but we easily solved this with a few lines in a Chef recipe.

We’ve really only touched the surface of what is included with Wukong. There are plenty of different mapper and reducer classes depending on how you have your data structured and how you want to query it. With the script class provided by Wukong, you can run scripts across an entire Hadoop cluster with a single command. It even includes support for writing pig latin scripts in Ruby. So if you know Ruby and want to get involved with some big data projects, Wukong + Hadoop is a great way to get started.

Zack Simon

Reviewer Big Nerd Ranch

Zack is an Experience Director on the Big Nerd Ranch design team and has worked on products for companies ranging from startups to Fortune 100s. Zack is passionate about customer experience strategy, helping designers grow in their career, and sharpening consulting and delivery practices.

Speak with a Nerd

Schedule a call today! Our team of Nerds are ready to help

Let's Talk

Related Posts

We are ready to discuss your needs.

Not applicable? Click here to schedule a call.

Stay in Touch WITH Big Nerd Ranch News