top of page
Writer's pictureSarwar Bhuiyan

Alerting and Throttling with Timeplus


A common approach is to take the results of a stream processing pipeline and generate alerts from them. However, for alerting, it may not be ideal to send an alert (email, Slack message, PagerDuty alert) for every single event in the result stream, as this could overwhelm the recipients with redundant messages. Instead, there might be a need to group events into an incident, which should only be reported once and not again until it’s acknowledged. Alternatively, we might want to throttle the alerts to limit their frequency over a certain time period.


I thought I'd have a go at building this sort of alert throttling with a very simple example and work out my thoughts and requirements. It turns out, it's not just a matter of producing some rules as a derived stream and sending them to Kafka. But implementing throttling and clearing alerts isn’t much more complex either.



Take a simple example with some devices producing temperature readings. I want to be alerted when the temperature exceeds 50C but not give me multiple alerts from every single temperature event. In other words, I want to receive a warning and then, for as long as the temperature continues to come above 50C, don't emit anything. If the temperature drops, then "clear" the alert and then if it meets the rule again, send another alert. I might want to send an alert if I hadn't sent an alert in the last 5 minutes too even if it wasn't cleared. 



How I did this with Timeplus


Assuming an incoming stream called device_events with schema (device, number, time), I create two additional streams. 


CREATE MUTABLE STREAM alert_acks (
   device string, 
   state string, 
   created_at datetime )
PRIMARY KEY device

This can keep track of alerts created per device and allow me to update the state to acked so that the alert is "cleared". 


CREATE MATERIALIZED VIEW device_ack_gen INTO alert_acks AS WITH new_alert AS
(
	SELECT de.device AS device, 'unacked' AS st, now() AS ca
	FROM device_events AS de 
	LEFT JOIN alert_acks AS aa ON de.device = aa.device
	WHERE de.number > 50 AND (aa.device = '' or (now() - 5m > 			aa.created_at) OR aa.state = 'acked')
)
SELECT device, st AS state, ca AS created_at FROM new_alert

This is a slightly long materialized stream (materialized views are derived streams which write state) which writes an "unacked" state when the rule is met but no existing alert was found or if the previous alert was more than 5 minutes old. Since alert_acks is a Mutable Stream (See our latest blog on Mutable Streams)


Now, I can hook up this alert_acks stream to push out the emitted alerts to Kafka with the help of an External Stream as follows:


CREATE EXTERNAL STREAM kafka_alerts_target (
 device string,
 state string,
 created_at datetime)
SETTINGS type='kafka',
		 brokers='kafka:9092',
		 topic='alerts_topic',
		 data_format='JSONEachRow',
		 one_message_per_row=true;

CREATE MATERIALIZED VIEW kafka_alerts_mv INTO kafka_alerts_target AS (
 SELECT device, state, created_at
 FROM alert_acks
)

The above is something hacked together manually, but one can easily implement an alerts management system (perhaps using our Native JDBC Driver). You can make use of templates and CTEs (Common Table Expressions) to make it such that any stream can be combined with a generic alert_acks table shown above to store suppress alerts by id or timeout.


If the type of queries you alert on are stateless, these can be very cheap and one can run thousands of these on Timeplus. However, if they are stateful and using things like windows or grouping with high cardinality, that would need some benchmarking to ensure the right sizing is done.


Happy hacking and try Timeplus Enterprise  or join our community slack for more discussions.


bottom of page