Redisql: the lightning fast data polyglot


For about a year, I have been using the NOSQL datastore redis in various web-serving environments as a very fast backend to store and retrieve key-value data and data that best fits in lists, sets, and hash-tables. In addition to redis, my backend also employed mysql, because some data fits much better in a relational table. Getting certain types of data to fit into redis data objects would have added to the complexity of the system, and in some cases it’s simply not doable. BUT, I hated having 2 data-stores, especially when one (mysql) is fundamentally slower; this created an imbalance in how my code was architected. The mysql calls can take orders of magnitude longer to execute, which is exacerbated when traffic surges. So I wrote Redisql, which is an extension of redis that also supports a large subset of SQL. The idea was to have a single roof to house both relational data and redis data, with both types of data exhibiting similar lookup/insert latencies under similar concurrency levels, i.e. a balanced backend.

Redisql supports all redis data types and functionality (as it’s an extension of redis) and it also supports SQL SELECT/INSERT/UPDATE/DELETE (including joins, range-queries, multiple indices, etc…) -> lots of SQL, short of stuff like nested joins and Datawarehousing functionality (e.g. FOREIGN KEY CONSTRAINTS). So using a Redisql library (in your environment’s native language), you can either call redis operations on redis data objects or SQL operations on relational tables; it’s all in one server accessed from one library. Redisql morph commands convert relational tables (including range query and join results) into sets of redis data objects. They can also convert the results of redis commands on redis data objects into relational tables. Denormalization from relational tables to sets of redis hash-tables is possible, as is normalization from sets of redis hash-tables (or sets of redis keys) into relational tables. Data can be reordered and shuffled into the data structure (relational table, list, set, hash-table, OR ordered-set) that best fits your use cases, and the archiving of redis data objects into relational tables is made possible.
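To make the denormalize/normalize round trip concrete, here is a rough sketch in Python. It is NOT Redisql's morph command syntax (which this post does not show): sqlite3 stands in for the relational table, a plain dict stands in for a set of redis hash-tables, and the function names, the "table:pk" key format, and the sample data are all made up for illustration.

```python
import sqlite3

# Illustration only: sqlite3 plays the relational table, a dict of dicts plays
# a set of redis hash-tables. None of these names come from Redisql itself.

def table_to_hashes(conn, table, pk):
    """Denormalize: one hash (dict) per row, keyed as '<table>:<pk value>'."""
    cur = conn.execute(f"SELECT * FROM {table}")
    cols = [d[0] for d in cur.description]
    hashes = {}
    for row in cur:
        fields = dict(zip(cols, row))
        hashes[f"{table}:{fields[pk]}"] = fields
    return hashes

def hashes_to_table(conn, table, hashes):
    """Normalize: insert each hash back as one relational row."""
    for fields in hashes.values():
        cols = ", ".join(fields)
        marks = ", ".join("?" for _ in fields)
        conn.execute(
            f"INSERT INTO {table} ({cols}) VALUES ({marks})",
            list(fields.values()),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ann", "Oslo"), (2, "bob", "Rome")],
)

# Table -> hashes, then hashes -> a second table with the same columns.
hashes = table_to_hashes(conn, "users", "id")
conn.execute("CREATE TABLE users_copy (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
hashes_to_table(conn, "users_copy", hashes)
```

The point of the sketch is only the shape of the data: each row becomes one hash whose fields are the column names, and a set of such hashes carries enough information to rebuild the table.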

Not only is all the data under a single data roof in Redisql, but the lookup/insert speeds are uniform: you can predict the speed of a SET, an INSERT, an LPOP, a SELECT range query … so application code runs w/o kinks (no unexpected bizarro waits due to mysql table locks -> that lock up an apache thread -> that decrease the performance of a single machine -> which creates an imbalance in the cluster).

Uniform data access patterns between front-end and back-end can fundamentally change how application code behaves. On a 3.0GHz CPU core, Redis SET/GET run at 110K/s and Redisql INSERT/SELECT run at 95K/s, both w/ sub-millisecond mean latencies, so all of a sudden the application server can fetch data from the datastore w/ truly minimal delay. The oh-so-common bottleneck, “I/O between app-server and datastore”, is cut to a bare minimum, which can even push the bottleneck back into the app-servers, and that’s great news as app-servers are dead simple to scale horizontally (e.g. add a server). Redisql is an event-driven non-blocking asynchronous-I/O in-memory database, which I have dubbed an Evented Relational Database, for brevity’s sake.
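As a quick back-of-the-envelope check on those throughput figures, the snippet below turns operations per second into an average time budget per operation. The two rates are the ones quoted in this post; everything else (the names, the idea of comparing them as a ratio) is just illustration, and the result is the inverse of throughput, not a measured request latency.

```python
# Rates quoted in the post, single 3.0GHz core.
REDIS_OPS_PER_SEC = 110_000
REDISQL_OPS_PER_SEC = 95_000

def micros_per_op(ops_per_sec):
    """Average time budget per operation, in microseconds (1e6 / rate)."""
    return 1_000_000 / ops_per_sec

redis_us = micros_per_op(REDIS_OPS_PER_SEC)
redisql_us = micros_per_op(REDISQL_OPS_PER_SEC)
ratio = REDISQL_OPS_PER_SEC / REDIS_OPS_PER_SEC

print(f"redis:   {redis_us:.1f} us/op")
print(f"redisql: {redisql_us:.1f} us/op")
print(f"redisql runs at {ratio:.0%} of redis throughput")
```

So the relational path costs roughly 10.5 microseconds per operation against roughly 9.1 for plain redis, i.e. about 86% of the throughput, which is what "balanced backend" means in practice.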

During the development of Redisql, it became evident that the number of bytes a row occupies was an incredibly important metric to optimize, as Redisql is an In-Memory database (w/ disk persistence snapshotting). Unlike redis, Redisql can function if you go into swap space, but this should be done w/ extreme care. Redisql has lots of memory optimisations; it has been written from the ground up to allow you to put as much data as possible into your machine’s RAM. Relational table per-row overhead is minimal and TEXT columns are stored in compressed form when possible (using algorithms w/ negligible performance hits). Analogous to providing predictable request latencies at high concurrency levels, Redisql gives predictable memory usage overhead for data storage and provides detailed per-table, per-index memory usage via the SQL DESC command, as well as per-row memory usage via the “INSERT … RETURN SIZE” command. The predictability of Redisql, which translates into tweakability for the seasoned programmer, changes the traditional programming landscape where the datastore is slower than the app-server.
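The "compressed when possible" idea for TEXT columns can be sketched as follows. This is an assumption-laden illustration, not Redisql's storage format: the post does not name the compression algorithm, so zlib at its fastest level is swapped in here, and the one-byte tag and the function names are hypothetical.

```python
import zlib

# Hypothetical one-byte tags marking how a value was stored.
RAW, ZLIB = b"\x00", b"\x01"

def pack_text(text):
    """Store compressed only when that actually saves bytes, else store raw."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 1)  # level 1 = fastest; zlib is a stand-in
    if len(packed) < len(raw):
        return ZLIB + packed
    return RAW + raw

def unpack_text(stored):
    """Reverse pack_text by checking the leading tag byte."""
    tag, body = stored[:1], stored[1:]
    if tag == ZLIB:
        body = zlib.decompress(body)
    return body.decode("utf-8")

long_value = "redis " * 200   # repetitive text compresses well
short_value = "hi"            # too short to benefit, so it stays raw

stored_long = pack_text(long_value)
stored_short = pack_text(short_value)
```

The design choice worth noticing is the fallback: very short or already-dense strings get bigger under compression, so "when possible" has to mean "when it is smaller", and the stored form needs a marker so reads know which path to take.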

Redisql is architected to handle the c10K problem, so it is world class in terms of networking speed AND all of Redisql’s data is in RAM, so there are no hard disk seeks to engineer around and you get all your data in a predictably FAST manner AND you can pack a lot of data into RAM as Redisql aggressively minimizes memory usage AND Redisql combines SQL and NOSQL under one roof, unifying them w/ commands to morph data betwixt them … the sum of these parts, when integrated correctly w/ a fast app-server architecture, is unbeatable as a dynamic web page serving platform with low latency at high concurrency.

The goal of Redisql is to be the complete datastore solution for applications that require the fastest data lookups/inserts possible. Pairing Redisql w/ an event-driven language or framework like Node.js, Ruby Eventmachine, or Twisted Python should yield a dynamic web page serving platform capable of unheard-of low latency at high concurrency, which, when paired w/ intelligent client-side programming, could process user events in the browser quickly enough to finally realize the browser as an applications platform.

Redisql, the polyglot that speaks SQL and redis, was written to be the Evented Relational Database, the missing piece in the 100% event-driven architecture spanning from browser to app-server to database-server and back.

This entry was posted in concurrency, node.js, redis, Redisql.

7 Responses to Redisql: the lightning fast data polyglot

  1. Tim says:

    impressive work, jak! sounds almost too good to be true ;)

    • jaksprats says:

      Thanks Tim, a big compliment from a veteran redis user (one who has made some great presentations about redis at European conferences).
      I will use 2 examples to show it is not “too good to be true”
      1.) VoltDB (http://voltdb.com/) is also an evented relational database, and it is horizontally scalable; the only annoying thing is you have to rewrite all of your SQL into stored procedures and then compile them :)
      2.) HandlerSocket (http://bit.ly/a9B7Gh) is an event driven daemon that talks directly to InnoDB. Very cool; the drawbacks are that the only functionality is GET/SET and the SET speed is no quicker than normal InnoDB.
      Both of these evented relational databases are amazingly fast and they are fast because they are single threaded and run directly out of the event loop (something that Salvatore[aka antirez] the author of redis did PERFECTLY)

  2. Pixy Misa says:

    Looks very interesting. Now we just need Redis to support graph structures and full text indexing, and we can take over the world!

    • jaksprats says:

      Pixy, let’s not forget DocumentStore structures too (like CouchDB and MongoDB). DocumentStores can be really cool, and similar to how I made two-way functions to morph data back and forth between ALL of redis’ data structures and relational-tables, it is also possible to do this w/ DocumentStores and RDBMSes (and redis, to an extent), but it would be difficult.

      Having made the “morph” features for Redisql, I can say it took much longer than I thought, and I changed them about 20 times until they got a natural feel to them; hopefully people will be able to pick them up quickly and do amazing things w/ them. If someone extends this “morph” idea to other types of databases (graph, full text search engine, document stores) it would be amazingly helpful for people who need to change/extend what they are currently doing w/ their data, and it would help fight the fire named NOSQL-data-lock-in that is growing as quickly as NOSQL is. NOSQL really has to avoid data lock-in; it’s a long term gotcha that will scare away potential adopters.

      But if I had to pick a next thing to integrate into redis it would be a graph database … I think redis’ architecture would provide a solid (and minimalistic) foundation if someone wrote the logic of a graph database and then integrated it into redis (it worked for an RDBMS :)

  3. Pingback: Die wunderbare Welt von Isotopp

  4. Pingback: The case for Datastore-Side-Scripting « Jaksprats's Blog

  5. jaksprats says:

    I have renamed this project to AlchemyDatabase.
    The project home page is here: http://code.google.com/p/alchemydatabase/
    The github repo is here: https://github.com/JakSprats/Alchemy-Database
    The project has matured quite a bit since this blog post :)
