

To better understand how fast a real-time analytic system can be, I ran some performance tests to evaluate the latency and throughput of different real-time analytic systems under different workloads.

In the test, latency is defined as T5-T0, an end-to-end latency that our test tool can easily observe.

All the stacks can reach 30K eps, but the latency data shows a big difference. Timeplus achieves less than 10ms latency, which is 100x faster than Materialize and 1000x faster than Splunk. One thing to mention here is that the max throughput observed from Materialize is higher than 30K, even though data is generated at 30K eps, so the query processing measurement might not be accurate, or the processing speed is not stable.

In the larger configuration, both Splunk and Timeplus can reach 110K+ eps throughput, while Materialize can only process 35K (29% of write EPS). Timeplus latency increased to 20ms, which is still far faster than Splunk at 1.4s (69x) and Materialize at 1.5s (76x). Interestingly, Splunk latency is lower than in the small-size configuration. This is because the hit ratio decreased 10 times while the write EPS only increased 4 times, so the search carries a lower load, which has a significant impact on Splunk search latency.
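The ratios quoted in these results are simple arithmetic, and a short sketch makes them easy to check. The function names `speedup` and `throughput_share` are hypothetical, and the 120K write EPS is an assumption derived from the stated 4x increase over 30K. With the rounded latencies from the text (20ms, 1.4s, 1.5s) the ratios come out near 70x and 75x; the 69x and 76x in the article presumably come from unrounded measurements.

```python
def speedup(baseline_latency_ms: float, latency_ms: float) -> float:
    """How many times faster a system is than a baseline, by latency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return baseline_latency_ms / latency_ms


def throughput_share(processed_eps: float, write_eps: float) -> float:
    """Fraction of the written events per second that a system keeps up with."""
    if write_eps <= 0:
        raise ValueError("write EPS must be positive")
    return processed_eps / write_eps


if __name__ == "__main__":
    # Rounded figures from the text; 120_000 is assumed (4 x 30K write EPS).
    print(f"Timeplus vs Splunk: {speedup(1400, 20):.0f}x")
    print(f"Timeplus vs Materialize: {speedup(1500, 20):.0f}x")
    print(f"Materialize share of write EPS: {throughput_share(35_000, 120_000):.0%}")
```

Working in milliseconds keeps the inputs as whole numbers, which avoids small floating-point surprises when dividing values such as 1.4 by 0.02.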
T0: Raw events are generated at a data source. This is what we call event time, and it is when the event starts its life.

T1: Raw events are collected and sent to the data analytics system, or pulled by it. The event itself has not changed. We can call this enqueue time (considering data ingestion using a queue).

T2: Raw events are processed for internal storage, usually an indexer, file, database or memory. This is usually called the data ingestion process. Raw events are transformed into whatever format the system believes is best for later use.

T3: Query processing loads the indexed events and starts executing the query computation. One specific raw event might be involved in different queries, so the processing time is an attribute of a query.

T4: Query processing is complete, the query result is ready, and the system starts sending the result to the end user or downstream systems. This time is called emit time.

T5: Finally, the end user or downstream system receives the query result. For a specific query, this is the end of the whole life cycle.

Preprocessing data is sometimes time consuming. When data ingestion takes a long time, the data cannot be involved in any query even though the raw data is ready, so a real-time system must be able to ingest data very quickly. Some data analytic systems build indexes that help load data quickly by skipping unnecessary reads; the compromise is increased indexing time (T2-T1). Other systems, like Flink or Materialize, do not index data: raw events are stored directly in memory and consumed by query processing. The compromise is loss of persistence for event storage. Every database system works on optimizing query processing, and making query processing fast contributes to reduced latency.
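The event life cycle can be sketched as a small record of timestamps from which the latencies are derived. This is a minimal illustration, not the test tool used in the article: the class `EventTimeline`, its field names, and the stage names are hypothetical, and it assumes one timestamp per stage, T0 through T5, in milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTimeline:
    """Hypothetical record of one event's life cycle timestamps, in ms."""
    t0: int  # event generated at the data source (event time)
    t1: int  # event collected by the analytics system (enqueue time)
    t2: int  # event stored or indexed (ingestion finished)
    t3: int  # query processing starts working on the event
    t4: int  # query result ready and being sent (emit time)
    t5: int  # result received by the end user or downstream system

    def __post_init__(self) -> None:
        stamps = [self.t0, self.t1, self.t2, self.t3, self.t4, self.t5]
        if stamps != sorted(stamps):
            raise ValueError("timestamps must be in non-decreasing order")

    def indexing_time(self) -> int:
        """Time spent on ingestion and indexing (T2-T1)."""
        return self.t2 - self.t1

    def end_to_end_latency(self) -> int:
        """Latency seen from outside the system (T5-T0)."""
        return self.t5 - self.t0

    def stage_durations(self) -> dict:
        """Duration of each stage between consecutive timestamps."""
        names = ["collect", "ingest", "wait_for_query", "process", "deliver"]
        stamps = [self.t0, self.t1, self.t2, self.t3, self.t4, self.t5]
        return {
            name: later - earlier
            for name, earlier, later in zip(names, stamps, stamps[1:])
        }


if __name__ == "__main__":
    # Made-up timestamps purely for illustration.
    timeline = EventTimeline(t0=0, t1=2, t2=5, t3=6, t4=9, t5=10)
    print(timeline.indexing_time(), timeline.end_to_end_latency())
    print(timeline.stage_durations())
```

Because the stage durations always sum to T5-T0, a breakdown like this shows which stage dominates the end-to-end latency, even when only the end-to-end figure is easy to observe from outside.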
