You are viewing kostja_osipov

Fish Magic - New algorithm for taking snapshot in Tarantool


September 28th, 2013


01:55 am - New algorithm for taking snapshot in Tarantool
Just merged in a patch which I think gives Tarantool one more small but important edge over any other free in-memory database on the market.
The patch changes the snapshotting algorithm (a snapshot is Tarantool's consistent online backup) from fork() + copy-on-write to delayed garbage collection. The overhead per tuple (Tarantool's name for a record) is only 4 bytes, which store the added MVCC version. And since delayed garbage collection works at the record level rather than the page level, it is far more fine-grained than the page splits that follow a fork(), so the extra memory "headroom" required for a snapshot is now within 10% of all memory dedicated to an instance.
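
To get a feel for why record-level granularity matters, here is a back-of-the-envelope sketch in Python. It is not Tarantool code: the 4096-byte page size, the 64-byte tuple size, the contiguous layout and both function names are assumptions made up for illustration. It compares the extra memory held during a snapshot when writers touch one tuple per page.

```python
PAGE_SIZE = 4096  # assumed OS page size, in bytes


def cow_extra_bytes(updated_offsets, page_size=PAGE_SIZE):
    """fork() + copy-on-write: every page touched by a write is duplicated."""
    touched_pages = {offset // page_size for offset in updated_offsets}
    return len(touched_pages) * page_size


def delayed_gc_extra_bytes(updated_count, tuple_size, total_tuples,
                           version_bytes=4):
    """Delayed GC: only the old copies of updated tuples are retained,
    plus the per-tuple version field (4 bytes, as stated in the post)."""
    return updated_count * tuple_size + total_tuples * version_bytes


# Hypothetical workload: 1,000,000 tuples of 64 bytes stored back to back,
# with exactly one tuple updated in every page while the snapshot runs.
total_tuples = 1_000_000
tuple_size = 64
updated_offsets = range(0, total_tuples * tuple_size, PAGE_SIZE)

cow = cow_extra_bytes(updated_offsets)
delayed = delayed_gc_extra_bytes(len(updated_offsets), tuple_size, total_tuples)
print(cow, delayed)
```

In this deliberately unfavourable case for fork(), copy-on-write ends up duplicating every page, while the record-level scheme keeps only the replaced tuples plus the version fields. The exact figures depend entirely on the assumed workload.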

This feature goes into 1.5, which is, technically speaking, frozen :), but the patch has quite good locality and has been tested in production for a few months already, so I couldn't resist the temptation of making it available ASAP.

Speaking of our master, 1.6, it has already got online add/drop space/index, space and index names, and is now getting ready to switch to msgpack as the primary data format. But since we held off on making incompatible changes for almost 3 years, there are still a lot of wants and wishes for 1.6. So the current best bet is to get 1.6 out of alpha by the end of the year.


Comments:


From:mkevac
Date:September 28th, 2013 12:29 pm (UTC)
What is delayed garbage collection?
From:kostja_osipov
Date:September 28th, 2013 01:52 pm (UTC)
Mm... perhaps not a very nice term.
Tuples are reference counted in Tarantool, which lets you use the same piece of data in Lua (which is itself a garbage collected language) and as part of a space without copying. A tuple is "garbage collected" when its reference count drops to 0.
So the basic idea of delayed free is that when it's time to get rid of a dead tuple you also take into account that it may be used by the snapshot process, even though this process doesn't maintain a direct reference to it.
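
A minimal sketch of that idea in Python, to make it concrete. This is not Tarantool's implementation: the Tuple and Store classes, the single global version counter and the "delayed" list are all assumptions for illustration, and the sketch says nothing about how the real snapshot process walks the data. It only shows the decision made when a reference count hits 0.

```python
class Tuple:
    def __init__(self, data, version):
        self.data = data
        self.version = version  # version in effect when the tuple was created
        self.refs = 0


class Store:
    """Hypothetical container showing delayed free of dead tuples."""

    def __init__(self):
        self.version = 0              # bumped each time a snapshot starts
        self.snapshot_version = None  # None while no snapshot is running
        self.delayed = []             # dead tuples the snapshot may still need
        self.freed = []               # stands in for actually releasing memory

    def new_tuple(self, data):
        return Tuple(data, self.version)

    def ref(self, t):
        t.refs += 1

    def unref(self, t):
        t.refs -= 1
        if t.refs == 0:
            self._dispose(t)

    def _dispose(self, t):
        # A tuple older than the running snapshot may still be visible to it,
        # so its free is postponed; anything newer can go right away.
        if self.snapshot_version is not None and t.version < self.snapshot_version:
            self.delayed.append(t)
        else:
            self.freed.append(t)

    def begin_snapshot(self):
        self.version += 1
        self.snapshot_version = self.version

    def end_snapshot(self):
        # The snapshot is done, so everything postponed can now be released.
        self.snapshot_version = None
        self.freed.extend(self.delayed)
        self.delayed.clear()


store = Store()
old = store.new_tuple("written before the snapshot")
store.ref(old)
store.begin_snapshot()
new = store.new_tuple("written during the snapshot")
store.ref(new)
store.unref(old)  # dead, but possibly still needed by the snapshot
store.unref(new)  # dead and invisible to the snapshot
print(len(store.delayed), len(store.freed))
```

The version comparison is what makes the extra 4 bytes per tuple necessary in this sketch: without it there would be no cheap way to tell whether a dead tuple predates the running snapshot.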
