Concurrent programming is difficult

Pursuing your vision is difficult, especially if that vision is a multi-threaded in-memory database.

A couple of days ago I got a working implementation of a 3-thread layout for the Tarantool core.

Mm... what's that supposed to mean if Tarantool is single-threaded? First, it's not. The binary log
is written in a separate thread. Checkpointing (snapshotting) is done in a separate thread. Replication
relays run in their own threads. But the transaction processor still runs in 1 thread only. You're
supposed to shard anyway, so why not begin with sharding on a single multi-core server? That's the idea.

One thing, however, that was part of the transaction processor thread but didn't belong there was the
client/server protocol: handling the socket I/O.

So, now we have 3 major threads: the network thread, the transaction processor thread, and the
binlog thread. Once in a while there is a checkpoint thread that gets involved.

The benchmark numbers were supposed to go up. But not only that, they were supposed to go up
while still maintaining at least comparable CPU utilization. I wasn't interested in a 20% performance
boost for a 100% increase in CPU usage, even though that's exactly what I got at first.

So, finally, after a day of tuning, patching, and cooling off the hot mutexes, I got my numbers:
a +25% performance increase for +30% higher CPU utilization. It's only +25% because in this specific
benchmark network I/O is only 25% of the original performance profile; the rest is B-tree operations
and transaction management, stuff that stayed in the transaction thread.
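
As a rough sanity check on that figure, here is some back-of-the-envelope arithmetic. It is only a bound, and it rests on assumptions not measured here: that the transaction thread is the bottleneck, that the whole 25% network share moves off it, and that the remaining 75% is unchanged with no new synchronization cost.

```latex
\[
\text{speedup}_{\max} = \frac{1}{1 - 0.25} = \frac{1}{0.75} \approx 1.33
\]
```

Under those assumptions the ceiling is roughly +33%, so a measured +25% is in the expected range once the cost of passing requests between threads is paid.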

Happily, I pushed the patch to the next-release tree. And yesterday we ran the first YCSB benchmarks with it.

Now, YCSB is a stupid idea for a benchmark. For example, the YCSB RO benchmark is N clients (for example, N is 16)
issuing tiny requests in synchronous fashion. Essentially, with few clients it means the clients and the
server operate in lock-step fashion: the clients fire off a bunch of requests over the network and wait for results.
The server kicks in, handles the input, and responds. 30% user time, 70% sys time during the benchmark.

Now, in this particular test Tarantool's results got worse, while CPU usage went up significantly. On top of a lot of switching between kernel space and user space, the patch added a bunch of switching between the network and transaction threads. The same lock step, one step further.

And now I have to think what to do with it. We can't look silly even on a silly benchmark.

Multi-threading, evidently, needs more work.

One of our project's contributors wrote a letter about assholes. I don't know whether it will help or not, but here is a repost.

Originally posted by unera at An open letter to the President of the Russian Federation on information policy
In the modern world it has turned out that the institution of private property sometimes acts as an engine for the development of society, and sometimes as a brake.

Well-known world thinkers, such as Richard Stallman, were able several decades ago to assess the threat that extending private property laws to the products of intellectual activity poses to the world.


Performance of stdarg.h

Most discussions I was able to find online about functions with a variable number of arguments in C and C++ focus on syntax and type safety. Perhaps it has to do with C++11 support for such functions. But how much slower are they, actually?

I wrote a small test to find out:

kostja@olah ~/snippets % gcc -std=c99 -O3 stdarg.c; time ./a.out
./a.out 0.18s user 0.00s system 99% cpu 0.181 total
kostja@olah ~/snippets % vim stdarg.c
kostja@olah ~/snippets % gcc -std=c99 -O3 stdarg.c; time ./a.out
./a.out 0.31s user 0.00s system 98% cpu 0.320 total

The 64-bit ABI allows passing some function arguments in C via registers. Apparently this is not the case for functions with a variable number of arguments. I don't know for sure how many registers can be used, but the speed difference between a standard and a variadic function call grows as the number of arguments increases.
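
The stdarg.c used for the timings above is not shown, so the following is only a sketch of what a comparison of this kind might look like, not the original test. All function names are made up for illustration, the choice of four int arguments is arbitrary, and `__attribute__((noinline))` is a GCC/Clang extension used here on the assumption that, without it, -O3 would inline the calls and leave nothing to measure.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

/* Fixed-arity callee: four ints passed as ordinary parameters.
 * noinline (a GCC/Clang attribute) keeps the optimizer from removing the call. */
__attribute__((noinline))
long sum_fixed(int a, int b, int c, int d)
{
    return (long)a + b + c + d;
}

/* Variadic callee: the same kind of sum, with operands read through va_arg. */
__attribute__((noinline))
long sum_variadic(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Call the fixed-arity function `iterations` times with changing arguments. */
long bench_fixed(long iterations)
{
    long acc = 0;
    for (long i = 0; i < iterations; i++) {
        int x = (int)(i & 0xff);
        acc += sum_fixed(x, x + 1, x + 2, x + 3);
    }
    return acc;
}

/* The same loop, going through the variadic function instead. */
long bench_variadic(long iterations)
{
    long acc = 0;
    for (long i = 0; i < iterations; i++) {
        int x = (int)(i & 0xff);
        acc += sum_variadic(4, x, x + 1, x + 2, x + 3);
    }
    return acc;
}

/* Time both loops with clock() and print the CPU seconds each one took. */
void run_comparison(long iterations)
{
    clock_t start = clock();
    long fixed = bench_fixed(iterations);
    clock_t middle = clock();
    long variadic = bench_variadic(iterations);
    clock_t end = clock();

    assert(fixed == variadic); /* both loops must compute the same sum */
    printf("fixed:    %.3fs\n", (double)(middle - start) / CLOCKS_PER_SEC);
    printf("variadic: %.3fs\n", (double)(end - middle) / CLOCKS_PER_SEC);
}
```

Calling `run_comparison` from `main` with a large iteration count and building with `gcc -std=c99 -O3` gives two timings to compare. The variadic version carries an extra `count` argument and a loop, so the sketch measures the whole variadic calling pattern rather than argument passing in isolation.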

Launchpad bug tracker

The issue tracker on our source code host, GitHub, has matured enough for the team to make a decision to move.
It's probably not the best idea to criticize a free home for an open source project. After all, Launchpad wasn't making any money from hosting us. But, truth be told, it has fallen behind in features and usability, and perhaps the lack of a business model is the reason.

Just for the record, the most important problems with bugs at Launchpad for us were:
- 7-digit bug ids. Tarantool is a small project and we will perhaps never go beyond 4 digits, and you often need a quick and easy "handle" for a bug in conversation or in an email
- too many attributes of a bug. The milestone and series system was again designed for a large project, and only complicated matters for us
- bug states were quite nice, but then again we only used a few of them. At the same time there was no "legal" way to mark a bug as a duplicate - perhaps something related to the internal policies at Canonical.
- no way to cross-link a bug and a commit, unless (I guess) you're using Bazaar
- no bulk operations on bugs.

GitHub issues solve a lot of the above, plus, and this is actually the main reason, the issue tracker and the code both benefit from being close to each other.

New algorithm for taking snapshot in Tarantool

Just merged in a patch which I think gives Tarantool one more small but important edge over any other free in-memory database on the market.
The patch changes the algorithm of snapshotting (consistent online backup in Tarantool) from fork() + copy-on-write to delayed garbage collection. The overhead per tuple (the Tarantool name for a record) is only 4 bytes, to store the added MVCC version. And since delayed garbage collection works on the record level, not on the page level, it is far more fine-grained than page splits after a fork(), so the extra memory "headroom" required for a snapshot is now within 10% of all memory dedicated to an instance.
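
To make the idea concrete, here is a minimal sketch of version-based delayed garbage collection. It is not Tarantool's code: every name in it (`struct snapshot_gc`, `tuple_release`, and so on) is hypothetical, and the rule it uses is an assumption about how such a scheme could work, namely that a tuple created before the snapshot started is kept on a delayed list until the snapshot ends, while a newer tuple is freed at once.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tuple: the 4-byte version is the only snapshot-related field. */
struct tuple {
    uint32_t version; /* snapshot generation in which the tuple was created */
    int value;        /* stand-in for the real record payload */
};

/* Hypothetical collector state shared by the writer and the snapshot. */
struct snapshot_gc {
    uint32_t current_version; /* bumped each time a snapshot starts */
    int snapshot_in_progress;
    struct tuple **delayed;   /* tuples the running snapshot may still read */
    size_t delayed_count;
    size_t delayed_capacity;
};

/* Allocate a tuple stamped with the current generation. */
struct tuple *tuple_new(struct snapshot_gc *gc, int value)
{
    struct tuple *t = malloc(sizeof(*t));
    if (t == NULL)
        abort();
    t->version = gc->current_version;
    t->value = value;
    return t;
}

/* Start a snapshot: every tuple created so far has an older version. */
void snapshot_begin(struct snapshot_gc *gc)
{
    gc->current_version++;
    gc->snapshot_in_progress = 1;
}

/* Drop a tuple. Returns 1 if it was freed at once, 0 if the free was delayed. */
int tuple_release(struct snapshot_gc *gc, struct tuple *t)
{
    if (gc->snapshot_in_progress && t->version < gc->current_version) {
        if (gc->delayed_count == gc->delayed_capacity) {
            size_t cap = gc->delayed_capacity ? gc->delayed_capacity * 2 : 16;
            struct tuple **grown = realloc(gc->delayed, cap * sizeof(*grown));
            if (grown == NULL)
                abort();
            gc->delayed = grown;
            gc->delayed_capacity = cap;
        }
        gc->delayed[gc->delayed_count++] = t;
        return 0;
    }
    free(t);
    return 1;
}

/* Finish the snapshot and free everything that was delayed on its behalf. */
size_t snapshot_end(struct snapshot_gc *gc)
{
    size_t freed = gc->delayed_count;
    for (size_t i = 0; i < gc->delayed_count; i++)
        free(gc->delayed[i]);
    gc->delayed_count = 0;
    gc->snapshot_in_progress = 0;
    return freed;
}
```

In this sketch the extra memory held during a snapshot is proportional to the number of old tuples replaced or deleted while it runs, which is the record-level granularity described above. The delayed list itself is bookkeeping that a real implementation would likely organize differently.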

This feature goes into 1.5, which is, technically speaking, frozen :), but the patch has quite good locality and has been tested in production for a few months already, so I couldn't resist the temptation of making it available ASAP.

Speaking of our master, 1.6, it has already got online add/drop of spaces and indexes, space and index names, and is now getting ready to switch to msgpack as the primary data format. But since we refrained from making incompatible changes for almost 3 years, there are still a lot of wants and wishes for 1.6. So the current best bet is to get 1.6 out of alpha by the end of the year.