Redis, Ruby, and Some Surprising Uses

Any developer worth their salt knows that Redis is great for caching. As an in-memory cache, it gets the job done. You certainly don’t have to take my word for it; the major sponsors of Redis (redislabs) wrote a white paper to explain it. What isn’t quite as widely known is that Redis has some other uses worth considering. I’ll list the ones I’m aware of (and have used), all of which are available with open-source Redis.

As a fan of Ruby, naturally my few examples (and links to examples) will be written in Ruby, but there should be equivalents for your favorite language. Most of Redis in Action from redislabs actually provides its examples in Python, if that’s your cup of tea.

Redis as Key/Value Store (a NoSQL DB)

This one is probably the least surprising; take an in-memory key/value store and add some persistence and now you’ve got a lightning-quick DB that’ll persist beyond a restart. My favorite way to interact with Redis is Moneta, which abstracts connections for lots of different kinds of NoSQL DBs/generic storage options. I won’t attempt to replace its README here, but I’ll provide a couple examples to show just how easy it is to get started using Redis as a DB.

Moneta needs to be told what DB adapter class (in this case, :Redis) to use with the URL we provide (the 'redis://...' part). You’ll probably need to change redis.host.name to whatever is hosting your Redis instance. Also notice the /0 in the URL; Redis supports multiple databases on a single host and there are some advantages to separating things out for different use-cases (such as faster key scans, less risk from a FLUSHDB, etc.). These databases aren’t named, they’re indexed, and Redis by default supports up to 16 of them (indexes 0-15).
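To make the pieces of that URL concrete, here is a minimal sketch that pulls them apart using only Ruby's standard library. The helper name redis_url_parts is my own invention for illustration (it is not part of Moneta or the redis gem), and it assumes the database index is simply the path component of the URL.

```ruby
require 'uri'

# Hypothetical helper: split a Redis connection URL into its parts.
def redis_url_parts(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    port: uri.port || 6379,                        # 6379 is the default Redis port
    db:   uri.path.to_s.delete_prefix('/').to_i    # "/0" -> 0; a missing path also -> 0
  }
end

puts redis_url_parts('redis://redis.host.name:6379/0').inspect
```

Changing the trailing /0 to, say, /3 is all it takes to point a client at a different database on the same host.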

In the above example, I created a Hash and stored it under the key profile:some.user@example.com. Moneta handles marshaling/unmarshaling the Hash for us since Redis itself and the redis gem only support storing String objects (at least in this use-case).
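If you are curious what that marshaling step amounts to, here is a rough sketch using only the standard library. StringOnlyStore, save_object, and load_object are hypothetical stand-ins I made up so the snippet runs without a Redis server; the stand-in rejects non-String values outright, which is a simplification of how a real client behaves. Moneta's default value serializer is Marshal, so this is roughly the work it does on your behalf.

```ruby
# Hypothetical stand-in for a store that only accepts String values.
class StringOnlyStore
  def initialize
    @data = {}
  end

  def set(key, value)
    raise TypeError, 'values must be Strings' unless value.is_a?(String)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end
end

# Serialize any Ruby object to a String before storing it.
def save_object(store, key, object)
  store.set(key, Marshal.dump(object))
end

# Read the String back and rebuild the original object (nil if the key is absent).
def load_object(store, key)
  raw = store.get(key)
  raw && Marshal.load(raw)
end

store = StringOnlyStore.new
save_object(store, 'profile:some.user@example.com', { name: 'Some User', roles: %w[admin editor] })
puts load_object(store, 'profile:some.user@example.com').inspect
```

With Moneta and a reachable Redis, I would expect something like `store = Moneta.new(:Redis, url: 'redis://redis.host.name:6379/0')` followed by `store['profile:some.user@example.com'] = profile` to cover both steps, on the assumption that the `url:` option is passed straight through to the redis gem.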

So what’s the difference between this and just normal caching? Not much, really. On the client side, the main change is that we’re not setting expirations on keys; the rest of the work is done on the server side. Redis supports a couple of different modes for persisting data (RDB snapshots and the append-only file, or AOF) that are worth checking out, and you can use them both together.

There are a few differences with using Redis compared to a “normal” DB if you’re used to something like an RDBMS. First, Redis is (mostly) single-threaded. Read the relevant Redis FAQs to understand why this is probably not a big deal. Another major difference is the increased need for sharding for large datasets: Redis is an in-memory database that happens to be capable of persisting to disk, so the amount of data a single instance can hold is bounded by the memory available to that process. Finally, Redis doesn’t have a concept like foreign keys or relationships.

That said, Redis can get a bad rap as a database because of its more typical usage as a cache and its emphasis on speed. Redis has continued to improve and offer more complex features and capabilities. It is much more than a key/value store now; Redis supports Hashes and Sets (among other data types), along with a whole slew of associated commands for working with them. While I probably still wouldn’t use it for things like an authoritative store for bank transactions, Redis shouldn’t be completely discounted as a viable option for many DB workloads.

Redis as a FIFO Queue

In much the same way you might use RabbitMQ/ActiveMQ, you can use Redis for queuing via the pub/sub or pull capabilities it offers clients.

For this, I use resque, written way back in 2009 by GitHub and still (as far as I know) used to handle their truly massive amount of background/asynchronous tasks. While I could throw a ton of code as examples at you, I’ll invite you to review the overview provided by the resque project and provide a link to a repository of poorly-documented but mostly-working code to set up a git-based job runner I started working on (but never really finished) a while back.

Resque makes full use of Redis’ features for background tasks and spins up workers that listen to specific queues to do their work. This is a fantastic way to accomplish a looser coupling between application components, offer better scalability for select components of a system, deal with highly variable workloads, or just take care of slow/expensive processes outside of your main program. Jobs can be treated as a transaction and will be properly marked as failed if they fail. Resque also provides a pretty slick (and optional) web UI for monitoring your jobs and even rerunning them if required.
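To give a feel for the shape of this, here is a small sketch. A Resque job is just a plain Ruby class with a @queue name and a self.perform method, and its arguments travel through Redis as JSON. WelcomeEmail and TinyQueue are hypothetical names of my own; TinyQueue is an in-memory stand-in for the Redis list and the worker loop so the snippet runs without a Redis server. It is not how resque itself is implemented.

```ruby
require 'json'

# A Resque-style job: a queue name plus a class-level perform method.
class WelcomeEmail # hypothetical job, for illustration only
  @queue = :mailers

  class << self
    attr_reader :queue
  end

  def self.perform(address, plan)
    "Welcome, #{address}! You are on the #{plan} plan."
  end
end

# In-memory stand-in for the Redis-backed queue and a worker.
class TinyQueue
  def initialize
    @lists = Hash.new { |hash, name| hash[name] = [] }
  end

  # Push a JSON payload naming the job class and its arguments.
  def enqueue(job_class, *args)
    payload = JSON.generate('class' => job_class.name, 'args' => args)
    @lists[job_class.queue].push(payload)
  end

  # Take the oldest payload off the queue (FIFO) and run the job it names.
  def work_one(queue_name)
    payload = @lists[queue_name].shift
    return nil if payload.nil?

    job = JSON.parse(payload)
    Object.const_get(job['class']).perform(*job['args'])
  end
end

queue = TinyQueue.new
queue.enqueue(WelcomeEmail, 'some.user@example.com', 'free')
puts queue.work_one(:mailers)
```

With the real gem, the enqueue side is `Resque.enqueue(WelcomeEmail, 'some.user@example.com', 'free')`, and a separately started worker process picks the job up. Because the payload is JSON, arguments should be simple values (strings, numbers, arrays, hashes) rather than arbitrary Ruby objects.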

While Redis and resque continue to improve, they do lack some MQ capabilities you might want in extremely critical applications (at least out of the box), like automatic retries, “fan out” or distributed jobs, and more, though many of these features are available via community plugins like resque-retry, resque-fanout, and others available on the plugins page. Before writing off Redis as a queue, consider that it runs many mission-critical workloads for GitHub.

Redis for Scheduled Tasks

Building on the FIFO queuing example, Redis can also back delayed tasks, allowing jobs to be placed on a queue for actual processing after a certain time. Redis has no built-in notion of a delayed job; the resque-scheduler plugin builds this on top of Redis and offers both delayed tasks (meaning run after some point in time, as soon as a worker is available to do so) and scheduled tasks (which either run at a specific time if a worker is available or not at all). Scheduled tasks can either be one-off or recurring.
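The "run after some point in time" behavior of delayed tasks can be sketched in a few lines. DelayedQueue below is a hypothetical in-memory illustration of the semantics only and makes no claim about how resque-scheduler stores things in Redis. I borrowed the method names enqueue_at and enqueue_in from the plugin's API, but the signatures here are simplified and the `now:` parameter is my own addition to make the example deterministic.

```ruby
# Hypothetical in-memory illustration of delayed-job semantics.
class DelayedQueue
  def initialize
    @pending = [] # [run_at, job] pairs that are waiting for their time
  end

  # Make the job eligible at a specific time.
  def enqueue_at(run_at, job)
    @pending << [run_at, job]
  end

  # Make the job eligible a number of seconds from now.
  def enqueue_in(seconds, job, now: Time.now)
    enqueue_at(now + seconds, job)
  end

  # Remove and return every job whose time has come, earliest first.
  # Jobs are never returned before their run_at, only at or after it.
  def pop_due(now = Time.now)
    due, @pending = @pending.partition { |run_at, _job| run_at <= now }
    due.sort_by { |run_at, _job| run_at }.map { |_run_at, job| job }
  end
end

start = Time.now
delayed = DelayedQueue.new
delayed.enqueue_in(300, :send_reminder, now: start)
puts delayed.pop_due(start).inspect        # nothing is due yet
puts delayed.pop_due(start + 600).inspect  # the reminder is now eligible
```

In the real plugin, the equivalent calls are `Resque.enqueue_at(timestamp, SomeJob, *args)` and `Resque.enqueue_in(seconds, SomeJob, *args)`, where SomeJob is a placeholder for your own job class, and a separate scheduler process moves jobs onto their regular queue once they are due.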

These scheduled/delayed tasks need workers available to run them, and those workers are extra processes, though they can be extremely lightweight and are very easy to dockerize (I do it at work all the time).

Others

These are capabilities I’m aware of and looked into but haven’t yet used:

  • Distributing Files – Redis can be used for distributing files. These can be read, either in their entirety, line-by-line, or block-by-block. Combined with sharding and replication, this could be a very interesting approach to a distributed filesystem or just for shipping around files.
  • Search – While I’m not sure I’d use it to replace Solr or Elasticsearch just yet, Redis does provide some capabilities for doing word-based full-text search like you’d get from Lucene. It would be interesting to see how performance and capabilities differ for applications where I’m already using Redis and don’t want to include Solr just to add search.

Hopefully this demonstrates that Redis is capable of a whole lot more than just caching.
