Hadoop is a great solution to the big data problem, and with instant access to servers and storage in the cloud, it’s easier than ever to spin up and manage your own cluster. If you haven’t heard much about it yet, Hadoop provides a distributed file system along with a framework for running MapReduce jobs over the data. It takes care of replicating chunks of data across nodes and running jobs in parallel for you. However, when you want to expand your Hadoop cluster across availability zones, you can run into some unexpected problems. So let’s dig into the ideas we tried and the final solution that worked best for our configuration.
At Highgroove, we like forward momentum. We know that not every delivery can be perfect, so instead of worrying about perfection, we worry about progress.
As a developer, I always try to follow the “Boy Scout Principle” when it comes to the code I’m working with. Simply put:
In Ruby, blocks are kind of a big deal. We use them for everything from basic iteration to executing callbacks. They are also really handy for writing Domain Specific Languages, or DSLs for short. For example, check out how Blather uses blocks to respond to an XMPP message.
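To make the block-as-DSL idea concrete, here is a minimal sketch in plain Ruby. `MiniBot`, `on_message`, and `dispatch` are hypothetical names made up for illustration; this is not Blather's API, just the general pattern of registering blocks as handlers and evaluating a configuration block with `instance_eval`.

```ruby
# Hypothetical mini DSL (NOT Blather's API): handlers are registered as blocks.
class MiniBot
  def initialize
    @handlers = []
  end

  # Store a block to run for messages whose body matches the pattern.
  def on_message(pattern, &block)
    @handlers << [pattern, block]
  end

  # Call every matching handler and collect the replies in an array.
  def dispatch(body)
    @handlers.select { |pattern, _| pattern.match?(body) }
             .map { |_, handler| handler.call(body) }
  end

  # Evaluate the configuration block in the context of a new bot,
  # so on_message can be called inside it without a receiver.
  def self.define(&block)
    bot = new
    bot.instance_eval(&block)
    bot
  end
end

bot = MiniBot.define do
  on_message(/hello/i) { |body| "Hi there! You said: #{body}" }
  on_message(/bye/i)   { |_body| "See you later." }
end

puts bot.dispatch("Hello, bot")
```

The blocks do double duty here: the outer `do ... end` block reads like configuration, while the inner blocks are callbacks that only run when a matching message comes in.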
When I went to Europe for the first time, first to England and then to France, I was deeply obsessed with track cycling. Since I was (kind of) writing my master’s thesis while my wife was (actually) doing serious research for hers, I had plenty of spare time to hunt down every concrete or wood-banked oval in all of France. My process started out by simply looking for the famous ones like Roubaix and Vélodrome d’Hiver. I quickly realized that every town in France has a great website, and they nearly always listed the address of their velodrome if they had one! With that in hand, I would then search for it in Google Earth and put a push-pin there. This was a strange and silly obsession, but it taught me a few things. It taught me to read a bit of French, to provide correct accents to Google France’s search site, and to do my best to search like I was actually a French speaker. I’m not sure if I succeeded, but this is the exact opposite of what we want our users to have to do with any of the sites we create. Luckily, Ruby on Rails provides extensive support for Internationalization (i18n for short) via mechanisms such as Translation and Localization. The former simply maps keys (and parameters) to copy in the language being displayed. The latter formats things like numbers, times and dates, and currencies correctly for the locale we are targeting. This alone should make us ecstatic, but we can leverage the tools provided in even more exciting ways.
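The difference between translation and localization can be sketched in a few lines of plain Ruby. This is not the Rails I18n API (in Rails you would reach for `I18n.t` and `I18n.l` with locale files); the `TRANSLATIONS` and `NUMBER_FORMATS` tables and the `translate` and `localize_number` helpers are assumptions made up for illustration.

```ruby
# Illustrative only: hypothetical lookup tables, not Rails locale files.
TRANSLATIONS = {
  en: { greeting: "Hello, %{name}!" },
  fr: { greeting: "Bonjour, %{name} !" }
}.freeze

NUMBER_FORMATS = {
  en: { delimiter: ",", separator: "." },
  fr: { delimiter: " ", separator: "," }
}.freeze

# Translation: look up the copy for a key and interpolate parameters.
def translate(key, locale:, **params)
  template = TRANSLATIONS.fetch(locale).fetch(key)
  format(template, params)
end

# Localization: format a non-negative number with the locale's
# thousands delimiter and decimal separator (two decimal places).
def localize_number(number, locale:)
  fmt = NUMBER_FORMATS.fetch(locale)
  whole, fraction = format("%.2f", number).split(".")
  grouped = whole.reverse.scan(/\d{1,3}/).join(fmt[:delimiter]).reverse
  "#{grouped}#{fmt[:separator]}#{fraction}"
end

puts translate(:greeting, locale: :fr, name: "Marie")
puts localize_number(1234.5, locale: :en)
puts localize_number(1234.5, locale: :fr)
```

The same key and the same number produce different output per locale, and none of the calling code has to know anything about French.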
ROWE (Results-Only Work Environment) means focusing on results and not worrying about when the work happens or how long it takes. This is a great way to get things done, but the amount of time spent on tasks can still be an extremely useful metric.
In case you missed it, the awesome Global Day of Coderetreat occurred on December 3rd. The amount of fun I experienced was unexpected and impressive! I learned some things too. Read on to find out what.