Our background is in low-latency trading. We are obsessed with performance and have always wanted to build the fastest tech out there.
But let’s keep the suspense for a minute and introduce ourselves first. QuestDB is an open-source SQL time-series database. InfluxDB is the current market leader in time-series, and we thought it would only be fair if we had a stab at their ingestion format called Influx line protocol (“ILP”) to compare data ingestion performance between QuestDB and InfluxDB. It would not be an overstatement to say that InfluxDB uses a lot of CPU. We set ourselves to build a receiver for ILP, which stores data faster than InfluxDB while being hardware efficient.
We built QuestDB in zero-GC Java. Hopefully, the Java community will be proud!
Starting with QuestDB 4.0.4, users can ingest data through ILP to leverage SQL to query Influx data alongside other tables in a relational database while keeping the flexibility of ILP.
Store (fast) as many — query fast) as one.
Data loss over UDP
We have conducted our testing over UDP, thus expecting some level of data loss. However, we did not anticipate that InfluxDB would lose so much.
We have built a sender, which caches outgoing messages in a small buffer before sending them to a UDP socket. It sends data as fast as possible to eventually overpower the consumers and introduce packet loss. To test for different use cases, we have throttled the sender by varying the size of its buffer. A smaller buffer results in more frequent network calls and results in lower sending rates.
The benchmark publishes 50 million messages at various speeds. We then measure the number of entries in each DB after the fact to calculate the implied capture rate.
We use the Dell XPS 15 7590, 64Gb RAM, 6-core i9 CPU, 1TB SSD drive. In this experiment, both the sender and QuestDB/InfluxDB instance run on the same machine. UDP publishing is over loopback. OS is Fedora 31, OS UDP buffer size (net.core.rmem_max) is 104_857_600.
It comes down to ingestion speed
Database performance is the bottleneck that results in packet loss. Messages are denied entry, and the loss rate is a direct function of the underlying database speed.
By sending 50m messages at different speeds, we get the following outcome.
Capture rate as a function of sender speed
InfluxDB’s capture rate rapidly drops below 50%, eventually converging toward single-digit rates.
Capture rate as a function of sending speed.
Implied ingestion speed in function of Sender speed
QuestDB’s ingestion speed results are obtained through ILP. Our ingestion speed is considerably higher while using our native input formats instead.
Why is the sender’s rate slower for InfluxDB compared to QuestDB?
In this test, we run the sender and the DB on the same machine, and it turns out that InfluxDB slows down our UDP sender by cannibalizing the CPU. Here is what happens to your CPUs while using InfluxDB:
InfluxDB’s CPU usage when serving requests
When in use, InfluxDB saturates all of the CPU. As a consequence, it slows down any other program running on the same machine.
QuestDB’s secret sauce
We maximise the utilization of each CPU, from which we extract as much performance as possible. For the example below, we compared InfluxDB’s ingestion speed using 12 cores to QuestDB using one CPU core only. Despite utilizing one core instead of 12, QuestDB still outperforms InfluxDB significantly.
If spare CPU capacity arises, QuestDB will execute multiple data ingestions in parallel, leveraging multiple CPUs at the same time, but with one key difference; QuestDB uses work-stealing algorithms to ensure every last bit of CPU capacity is used while never being idle. Let us illustrate why this is the case.
Modern network cards have much superior throughput than the single receiver. Being limited to one receiver by design, InfluxDB considerably under-utilizes the network card, which is the limiting factor in the pipeline.
All CPU cores open one single receiver that under-utilizes the network card
Conversely, QuestDB can open parallel receivers (requiring one core each), fully utilizing the network card capabilities. The following illustration assumes that there would be spare CPU capacity in other cores to be filled. In such a scenario we would get QuestDB utilizing 12 cores, with each one of those being considerably faster than InfluxDB’s combined 12 cores!
Each CPU core opens an independent receiver working in parallel that fully leverages the network card
Besides ingestion, InfluxDB also saturates the CPU on queries. The current user cannibalizes the whole CPU, while other users have to wait for their turn.
Users monopolize all CPU cores one after the other
By contrast, QuestDB uses each core separately, allowing multiple users to query or write concurrently without delay. The performance gap between QuestDB and InfluxDB grows significantly as the number of simultaneous users increases.
Users share CPU cores and are served concurrently, fast. They also use cores to the maximum.
QuestDB supports ILP over UDP multicast and unicast sockets. TCP support will follow shortly. You don’t need to change anything in your application. For Telegraf, you can configure the UDP sender for QuestDB’s address and port.
You can use JOINs while modifying your data structure on the fly and querying it all in SQL.