A quick look at the Redis source code

Having been writing predominantly Java and Scala for the last 7 years, my C skills are pretty rusty.  In fact they’re practically non existent.  Apart from the occasional hack I’ve not had the occasion to write much C since University.  There is a widely held opinion that reading other people’s code is an excellent way to learn, particularly if those people are experts or the codebase is held in high regard in terms of its quality.  I’ve decided to take a look at one such codebase.

redis-logo

Redis is an open source data structure server written in ANSI C.  ”Data structure server” is another way of saying really, really neat key-value store. Not only can you store simple values like strings against keys but also hashes (or maps, or dicts even), lists, sets and sorted sets.  We use Redis a lot at Top10, mostly for indexing hotels in (near) real time depending on their availability and price for the dates the user is searching on.  I’ve also discovered that it has a pretty easily understandable code base, even for a C newbie like myself.  The code is cleanly written, relatively small (around 45k lines of code), mostly single threaded, and with few dependencies.  The dependencies are all included with the source making building it as simple as cloning the repo and typing make.

I decided to dive straight in to the code by adding a new command to Redis.  Something simple that will give me an idea of how Redis handles a command and dispatches a response.  A command rand that accepts a single integer argument max and returns a random integer between 0 and max (exclusively).  Not an ideal use of a key value store but implementing it should be instructive.  I certainly won’t be submitting a pull request!

Disclaimer – as mentioned before I’m by no means an expert in C so take all the code and interpretation of code here on those terms.  Also I’m linking to the unstable branch of Redis on github so the links may be just that – unstable.  You’ll probably get more out of this post if you clone the Redis source yourself and follow along in your favourite editor, particularly if you compile and run the code changes found here.

The command table is found near the very top of src/redis.c.  It is an array of instances of the redisCommand struct.  redisCommand is defined in src/redis.h but there’s a very handy block comment explaining each of its fields above the declaration of redisCommandTable.  Here is the definition of the get command:

The first field "get" is the name of the command.  The second is a pointer to the function that implements the command (you can see the implementation in t_string.c).

The third field is the command’s arity (number of arguments it accepts).  Specifying this means the command lookup and execution code can prevalidate a request before passing control by calling the function pointer.  This reduces the error handling code necessary in each of the command functions.  The argument count appears to include the name of the command itself so the get command accepts two arguments; its name and the name of the key whose value should be fetched.

The fourth field, set to "r", is specifying that the command is read only and doesn’t modify any keys’ value or state.  There are a whole bunch of one letter flags that you can specify in this string that are explained in detail in the nearby block comment.  The field following this string  should always be set to zero, and will be computed later.  It’s simply a bitmask representation of the information implied by the string.

The sixth field is NULL because it is only necessary when you need complex logic to tell Redis which arguments of the command are actually keys.  A key implies a reference to a value stored in Redis as opposed to simple value parameters such as our max argument.  This allows Redis to extract the values of the keys (and check that they exist) before calling the command implementation.  If this field were used it would be a pointer to a function that would return an integer array of argument indexes (zunionInterGetKeys in db.c is an example of this).  In the case of the get command though (and most other command) this information can be conveyed with the next 3 integer fields.  There is only one argument to getCommand and it is a key.  Therefore the first argument that is a key is at index 1, the last argument that is a key is at index 1, and the step increment to find all the keys is; 1,1,1.

The last two fields of a redisCommand represent metrics about the command, are set by Redis and should always be initialised to 0.

Let’s add our rand command to the bottom of the table:

The command is called "rand", randCommand is the pointer to the implementation (not implemented yet) and it takes 2 arguments (the name and max).  As for the flags – it’s read only (r), returns random, non deterministic output (R) and can be called while Redis is still loading the database (l).  There are no key arguments.

The next step is to add the randCommand function prototype to src/redis.h.  A redis command implementation takes one argument, a redisClient struct that represents the command arguments but can also be used to send the response to the actual client:

This prototype ought to be placed in src/redis.h near the all the other command prototypes.  Grepping for this line:

will help you find where.

Let’s add an empty implementation to src/redis.c:

I added mine near to the infoCommand definition.  Now let’s run make

and run the server we’ve built (hint if you usually have an instance of redis running locally now would be a good time to stop it):

And let’s run a Redis client in another terminal and try out our command:

First let’s try out the error handling:

Good to see the arity checking working.  Let’s specify an argument this time:

… and Redis hangs.  I suppose that should have been expected given that we’re not responding with anything from the randCommand function.  Let’s ctrl-c the server and get back to the source.

We want to respond with a number so I dug around looking for an example of how to do that and found the zcardCommand in src/t_zset.c.  This command uses addReplyLongLong to reply to the client with a response that is a 64-bit integer (a long long).  Let’s try that:

Now when we make again and run the command:

OK, so it’s not very random but it’s a start.  Let’s parse our max argument from the command now and return a random value limited by max:

Whilst Redis uses primitive types and C strings throughout the codebase it also has its own internal object system for representing strings, longs and more complex types in a more generic fashion.  An example of Redis’s use of these objects is the representation of each command’s arguments.  Each command argument is stored as a Redis object in the argv array on the redisClient instance c.   To get a Redis object as a long I found an example in the getrangeCommand function in src/t_string.c that uses the getLongFromObjectOrReply function from src/object.c.

getLongFromObjectOrReply takes a redisClient instance, checks that the object in its second parameter is an object that represents a long, places the value of that long at the pointer specified by its third parameter and returns REDIS_OK.  If the argument is not a long (or overflows) it will return REDIS_ERR.  The beauty of this method is that if we receive REDIS_ERR we can just return from our randCommand function as any necessary error response will have already been sent to the client.  Let’s try out our command again:

Looks pretty good!  rand is an entirely pointless command but I learned quite a bit about Redis from implementing it and I hope you did too by following along.  Please use the comments to let me know if I’ve made any glaring errors in this post.  It would also be good to know if you’ve found this post useful or enjoyable.  I’ll certainly consider writing more posts like this about Redis or other open source codebases.

 Discuss on Hacker News

Posted in C, Redis | Tagged , , , , | 9 Comments

Scala with Mockito gotcha

We do a lot of Scala over at Top10 nowadays.  We also do a lot of unit testing and that means a lot of mocking.  We’ve been using Mockito due to our familiarity with it as Java developers and because Scala’s excellent interoperability with Java means that features like the @Mock annotation still work on Scala objects.

One thing to bear in mind when using Mockito matchers with Scala is that Mockito, being a Java library knows nothing of a Scala method’s default arguments.  So if you have a method like:

and you try to mock it thusly:

you’ll probably be met with the following complaint from Mockito:

You can’t mix regular values with Mockito matchers when mocking a method’s behavior and the default values will be supplied as plain old values to the mockito object.  The only option is to call all the default parameters with matchers:

or if you want to be more specific (and generally you should):

(This highlights another gotcha.  eq is a method on AnyRef so it hides any static import of Matchers.eq).

Posted in Development, Java, Scala | Tagged , , , , | 1 Comment

Leaving the BBC. Joining Top10

After ~20 months it’s time for me to leave the Big British Castle. It’s been an honour to work for such a fantastic organisation which is very close to my heart. I’ve got to work on some great projects and with an awesome bunch of very smart people.

I won’t go into my reasons for leaving other than to say that I’m super excited to be starting work at Top10 tomorrow! The name of the site speaks for itself as does the subtitle on the front page “The Top 10 of Everything, Created by Everyone”.

I love the concept of the site and where the guys are taking it. Top10 will be the first startup I’ve joined and I can’t wait to start working on the site with some new technologies and another great team.

Here are my Top 10s.

Posted in BBC, top10 | Tagged , , | Leave a comment

Most tweeted Columbo episodes

Sadly, Peter Falk died this week.  This means a lot of people are tweeting about Columbo.  Instead of doing all the things I ought to be doing today like exercising, working on an upcoming project and feeding myself I wrote a Python script to find the most tweeted about Columbo episodes.  Here are the results:

The script searches for the episode title in quotes plus the word Columbo (as episodes like “Undercover” and “Dead Weight” were returning a lot of false positives).  I’ll stick the script on github later, first I have to get to the supermarket before it closes.

Update: Script is on github.  It’s dead simple, probably could have just been a gist.  I got the list of episodes from here.  There are some sample results in the git project.

Posted in Development, Fun, Python, Twitter | Tagged , , , , , | Leave a comment

Twitter rate limiting and Google App Engine

Twitter blocked icon@markovator has been silent for some time. He’d been replying to people but he’d not been able to come up with his own original tweets. The reason for this is the that he was being rate limited by Twitter. Initially this seemed strange to me as markovator tweets quite infrequently (at the very most once every minute if people are constantly pestering him). However the markovator code was inconsistent in how it authenticated with the twitter api. When it didn’t seem necessary to authenticate I didn’t (for example when requesting an unprotected user’s tweets). Twitter applies the rate limit to the IP of the machine making the request and when you’re running on Google App Engine you never know anything about the node you’re making http requests from or who else is using it. Whilst App Engines IPs seem to have been whitelisted (they can make 20000 requests per hour) all it would take (and did take, it seems) to cause markovator’s requests to be refused is a few more apps with very heavy unauthenticated twitter usage.

So you’ve been warned. If you’re building an app the uses twitter on app engine always OAuth authenticate, even when this seems unnecessary or a pain otherwise you’ll always be at the whim of twitter’s IP limiting. Obviously markovator always authenticates as @markovator now. If you usually authenticate on behalf of your users then you should use the twitter account you used to register your twitter app just for making those public, read only requests you might have thought could go unauthenticated.  You can avoid the OAuth flow by getting a single access token for that account from your twitter app page on dev.twitter.com

On a side note the account/rate_limit_status endpoint seems a little capricious. I have an admin endpoint that reports the status of the markovator app, including it’s authenticated rate limit status. When run locally it returns 350 per hour whilst when run on app engine it returns 20000 per hour.  Perhaps the whitelisted status of App Engine overrides the fact that the status request is authenticated.  Despite this the authenticated rate limiting does seem to apply when making authenticated requests from App Engine to Twitter.

Links:

Posted in Development, markovator, Python | Tagged , , , , | Leave a comment

Minecraft Tumblr blog

Postcards from Minecraft

I created a Tumblr blog for recording our minecraft adventures rather than filling this site up.

I’ve cross posted the original minecraft post from here and added a couple more.

More to come soon. Enjoy :)

Posted in Fun, Gaming, Minecraft | Tagged , , | 1 Comment

Minecraft

Caragh looking out at the monsters from the castle wallsI’ve been playing Minecraft single player for a while now and it’s great.  Everything you’ve heard about the game is true.  The whole mechanic hinges around the simple rules that govern the physics of the world and the unique experiences that emerge from them (apparently this is called emergent gameplay).  The adventures you can have battling monsters, exploring caves and going off on hikes and sea voyages can be exhilarating, scary and challenging.  The simple graphics are charming and sometimes quite awe inspiring once you reach a mountain peak and get a good view of the impressive terrains that get generated.  There’s nothing like building your own world to really get you invested in what goes on in it and that’s exactly what you do as you stack tons of cobblestones into towering battlements around your castle keep, dig endlessly downwards into your labyrinth mines and caves and carefully landscape grassy forests and deserts into the perfect back garden.

Singleplayer was great but what has been even more fun is setting up a server and convincing Caragh to join me in a multiplayer world.  Here are some screenshots of the action we’ve encountered so far.

Minecraft beach/desert screenshot

This is the spawn location in our world, every time you die you’re yanked back here and all your stuff stays where you were so you have to run back and collect it before it disappears.

Screenshot Caragh of cutting down trees

Caragh’s cutting down logs while I head off to find coal.  It’s important to gather coal and wood on your first day so you can craft torches to get through that long first night.

Sun's going down, here come the monsters

The sun’s going down which means the monsters will be coming out.  Time to start to build a shelter so we can hide away.

Time to dig a shelter into the hillside

Got our chest, crafting bench and furnace set up

We’ve got a little hole to hide away in now, with a door, a furnace, crafting bench, a chest and most importantly, torch light.

Started digging a mine

At first the only thing to do at night is to start digging a mineshaft so you can mine out some raw materials like coal, iron and maybe even diamond.

A scary cave to explore at the bottom of our mineWhilst mining you’ll often breakthrough into huge natural caverns or tight twisting passages with zombies and skeletons hiding around every corner.  It’s up to you whether to hastily brick them up or to grab a torch and a sword and venture in.

I think Caragh got Creepered!

Caragh standing in a creeper crater.  Creepers will sneak up on you soundlessly before making a hissing noise and exploding.  Creeper death was our number two cause of death starting out, after falling down our mineshaft.

Finally we have a small farm and some safe walls

Eventually we get a nice set of castle walls up outside our shelter so we can come out at night in relative safety.  The plants down there are the start of a wheat farm and a sugar cane crop.I built a porch out the front

We built a porch out the front because we kept opening our front door to find a creeper or a spider waiting for us.  The elevated porch is harder for the creepers to get as near to and the glass allows us to look out and see where they are.

Looking nicer inside, we have cake!

We expanded the inside around our mine a little bit and made cake.  The cake replenishes health and is made from the crops we farmed outside (and some eggs and milk).  We found the pumpkin outside and put a torch inside to make a Jack-O-Lantern.

I climbed a big tree on the mountainside to get a better view of our castle

I climbed a tall tree on the hillside to get a better view of our castle/shelter.  You can see the skylight that lights our mineshaft from here.

Venturing further from home

Venturing even further from home.

Begging to be explored!

In the other direction is a huge island.  I can’t wait to take a boat over there and explore.

Posted in Gaming, Minecraft | Tagged , , | 7 Comments

A History of the World in 100 Seconds

I attended the History Hackday at the Guardian and helped Gareth implement his cunning plan to scrape Wikipedia for all historical events with associated geo coordinates and present them chronologically.  The video above is the result.  Unfortunately I couldn’t be there on Sunday but whilst Gareth set about doing the hard bit, using Python and Mongo to extract the events from his dump of Wikipedia, I managed to code up a framework in Java for Gareth to produce the visualisation video from.  The dataset is pretty fuzzy, we went through each Wikipedia year page and kept all pages that were linked to for that year that also had coords.  I’m pretty pleased with the results, we won a prize for “best visualisation”, although I’d like to spend a little more time tweaking the graphics if I find the time.  The code is available on github.

The hackday itself was pretty cool, well stocked with beer, pizza and cool people. Props for a well organised event go to @fidothe, I’m sorry I couldn’t have been there for the whole thing.

Posted in Development, Java, Python, Wikipedia | Tagged , , , , | 2 Comments

Neon Trails Live Wallpaper for Android

Since version 2.1, Android has had a live wallpaper feature which allows dynamic wallpapers to be built for the Android home screen that can animate and respond to touch and other inputs.  They’re remarkably easy to develop so I spent some of my time off this holiday season building one I call Neon Trails.  The wallpaper draws little trails that wind themselves across the screen and around each other.Neon Live Wallpaper

The algorithm for the trails is relatively simple, each moves forward one unit and then chooses a random direction to travel in before moving forward another unit and picking another direction.  Each trail won’t pick a direction that will cause it to overlap with itself or another and will stop moving as soon as is it gets trapped.

The rendering was done using the 2D API that Android provides via the Canvas class.  The little lights at the end of active trails were achieved by faking a bloom effect by repeatedly rendering a circle with a number of BlurMaskFilters, each with a smaller radius than the last.

Neon Trails Live Wallpaper for Android with a green color scheme

Most of the work was spent optimising various things.  A more optimised wallpaper will scroll better as the user drags between virtual home screens and will use less battery.  Memory allocation and garbage collection are generally quite expensive on the Dalvik VM so avoiding using the new keyword where ever possible, except on start up, is generally a good idea when optimising for Android (there is still some work to do here as both the Trail and Grid class frequently allocate new Point instances).  The DDMS eclipse tool that you get with Android is excellent for tracking these things.

The technique that seemed to give the greatest benefit in terms of performance was generating an offscreen buffer (an instance of Bitmap) to render the “terminated” trails to (active trails are moving too much so would invalidate the buffer on every frame).  Referencing the buffer with a SoftReference meant that the system could always garbage collect the image if it couldn’t spare the memory required.  This is generally good practice for all non-essential offscreen buffers when working on optimised UIs in Java where memory might be tight.

Neon Trails Live Wallpaper for Android with yellow color scheme

The wallpaper is fairly customisable in terms of appearance, you can alter the color scheme and appearance of the trails and can even set a lower frame rate if you’re worried about the wallpapers effect on battery life.  The wallpaper is available for free on the Android Market and you can find the code on Github.

QR Code for Neon Trails Live Wallpaper for Android

QR Code for Neon Trails Live Wallpaper for Android

Posted in Android, Development | Tagged , , , | Leave a comment

Markovator – Python + Google App Engine + Markov chains + Twitter

Can good TV shows have a dog eating cat poop.
@markovator_dev
Markovator Dev

As a child a chat bot program called NIALL made quite an impression on me.  It came on one of those demo floppy disks from some PC magazine I forget the name of now.  NIALL stands for Non Intelligent Acquired Language Learner and what it (he?) would do is remember each word you “said” to it and the frequency that any following word appeared after it.  So for example when given the sentence “the cat sat on the mat” NIALL would remember that “cat” always precedes “sat” but that “the” only precedes “cat” half the time, the other half of the time it precedes “mat”.  Whilst slowly building up this dataset of word frequencies with each message you would type in it would use them to generate its own nonsensical, sometimes amusing replies.  You can find a better description of the process on the how it works page of this GNOME implementation.

Continue reading

Posted in Development, Python | Tagged , , , , | 1 Comment