As the Head of Product at Timeplus, I am quite obsessed with streaming data. I am always thinking of how we can create a product that makes building powerful real-time analytics super easy for our friends in the developer community. In today’s post, I’m excited to share some new examples of how Timeplus can help you develop a more efficient real-time analytics stack. I will walk you through the process of building a real-time security app, with just a few lines of code. With Timeplus, it’s super easy to define security rules and prevent data breaches with sub-second latency.
But first, some context. In my last blog, I introduced our top 10 patterns for streaming SQL:
Streaming Tail & Filter
Tumble Window
Hopping Window
Session Window
Late Event Detection
Stateful Processing
Time Travel
User Defined Function
Streaming Data Transformation
Streaming Join
The prior post gives context to some of the powerful patterns for streaming SQL, and is a good primer for this discussion. For this article, I’ll focus on a concrete example of how using streaming SQL can simplify your security use cases without compromising on functionality.
Use Cases
Since many organizations rely on Slack to connect people, tools, customers and partners in a digital HQ, I thought a tutorial on Timeplus security integrations with Slack would be a great way to showcase how to process real-time data and trigger real-time actions via the Timeplus platform. There are several scenarios where data security practices are mission critical:
Customers’ social security numbers, credit card numbers, and addresses should not be discussed in Slack channels;
Sensitive IT or engineering information such as login passwords or API tokens should not be shared in Slack directly;
While collaborating with partners, vendors or customers via Slack Connect, sharing documents with confidential information should be prohibited;
Analyzing activities across multiple Slack channels or direct messages can reveal other suspicious behavior, such as a user who always forwards a document to other individuals once it is shared in a team channel;
For customers with Slack Enterprise Grid, all messages can be encrypted with Customer Managed Keys (CMK) on AWS KMS (Key Management Service). In the event of a data breach, the organization can invalidate the CMK for certain channels, certain time periods, or certain files.
There are some existing solutions that analyze Slack audit logs, or batch-process the various Slack APIs, and raise red flags for suspicious behavior. However, with Timeplus, developers can trigger actions within the same second the message is sent. Sensitive information is removed immediately, and inappropriate behavior triggers an automatic warning. Timeplus can analyze millions of activities every second without any human interaction, which greatly enhances data privacy and security.
Easy as 1, 2, 3…
Building a real-time security app (as a chat bot) with Timeplus requires just three steps:
Load. The real-time data needs to be sent to Timeplus continuously with the lowest latency possible;
Query. Business logic is implemented as streaming SQL. These queries run continuously in the background, scanning all incoming data and emitting new results without re-scanning past data;
Action. For each new result from the streaming SQL, actions are triggered with sub-second latency. Sample actions include triggering a webhook, adding messages to Kafka/Redpanda topics, or sending email or Slack notifications.
In this sample real-time security app, the system overview is:
The key building blocks for the sample app are:
A Python script, just a few lines of code using the Timeplus SDK, continuously gets chat activities via the Slack Real Time Messaging (RTM) API and sends them to data streams in Timeplus;
A few streaming SQL queries are defined as the pattern detection rules;
Those streaming queries trigger AWS Lambda (via the Timeplus built-in webhook sink) to delete, update, or post messages in real time, or even revoke the AWS KMS keys to prevent sensitive messages or files from being viewed by others.
This serverless application can process millions of events per second and trigger actions with sub-second latency. The only code you need to develop is a simple Python script to sync Slack messages to Timeplus data streams, and another Python script, running for example on AWS Lambda, to turn SQL results into Slack API calls.
Here is a short demonstration of the real-time application:
Step 1: Load Slack activities to Timeplus in real-time
Kudos to our friends at Slack. The Real Time Messaging (RTM) API (https://api.slack.com/rtm) is awesome and very intuitive to use. (We hope you feel the same way about Timeplus!)
In the Python script, I chose to skip the Slack SDK and directly called the REST API. All you need is a Slack bot token:
Timeplus has published the Python SDK (https://pypi.org/project/timeplus/) to make it super easy to authenticate with Timeplus Cloud, then manage streams and ingest data. You can also check How to Build a Github App in 2 Minutes to learn more details.
The last line of the above code snippet is the key: we get the JSON document for each Slack activity from the Slack API and, depending on the message type, send it to the matching data stream in Timeplus without any transformation. Each event carries just the raw JSON document as a single field, which Timeplus can easily process in SQL.
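The routing described here could be sketched roughly as follows. This is a minimal illustration rather than the script from the sample app: the stream names other than slack_events, the `raw` column name, and the `to_stream_row` helper are all assumptions, and the actual SDK call that ships the row to Timeplus is deliberately left out.

```python
import json
from typing import Tuple

# Hypothetical mapping from Slack event type to Timeplus stream name.
# Only "slack_events" appears in this post; the others are placeholders.
STREAM_BY_TYPE = {
    "message": "slack_events",
    "file_shared": "slack_files",
}
DEFAULT_STREAM = "slack_misc"


def to_stream_row(raw_event: str) -> Tuple[str, dict]:
    """Pick a target stream and wrap the untouched JSON in a single field."""
    event_type = json.loads(raw_event).get("type", "")
    stream = STREAM_BY_TYPE.get(event_type, DEFAULT_STREAM)
    # No transformation: the whole document is kept as one string column
    # (the column name "raw" is an assumption).
    return stream, {"raw": raw_event}


stream, row = to_stream_row('{"type": "message", "text": "hi"}')
print(stream, row)
```

Keeping the event as an opaque string means the ingestion script never has to change when Slack adds fields; all parsing is deferred to the streaming SQL.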
Step 2: Streaming query
Once you add the Slack bot to a test Slack workspace and channel and start the Python script, messages will flow into the related data stream in Timeplus. You can easily explore the real-time data by clicking on the data stream name:
You can try different rules in this SQL workbench. With SQL auto-complete, you can write a query, run it, check the result, then stop it and refine it. Unlike other stream processing tools, you can run ad-hoc streaming queries in Timeplus without defining data pipelines. You don’t need to wait half a minute for the results after clicking the Preview button.
Here are a few examples of how to identify potential security issues in real-time chats:
SELECT .. WHERE text='password'
SELECT .. WHERE multi_search_any(text,['password','token','secret'])
SELECT .. WHERE match(text,'arn:aws:kms:us-east-1:\d{12}:key/.{36}')
You can also check our blog post on The Top 10 Streaming SQL Patterns for more examples.
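If you are less familiar with these SQL functions, the three example predicates behave roughly like the Python checks below. This is only a sketch for intuition, assuming that multi_search_any does a case-sensitive substring search for any of the needles and that match does an unanchored regular expression search; the function names are my own.

```python
import re

KEYWORDS = ["password", "token", "secret"]
# Same pattern as the third rule: a KMS key ARN in us-east-1.
KMS_ARN = re.compile(r"arn:aws:kms:us-east-1:\d{12}:key/.{36}")


def is_exact_password(text: str) -> bool:
    """Rule 1: the whole message is exactly 'password'."""
    return text == "password"


def mentions_keyword(text: str) -> bool:
    """Rule 2: the message contains at least one sensitive keyword."""
    return any(keyword in text for keyword in KEYWORDS)


def contains_kms_arn(text: str) -> bool:
    """Rule 3: the message contains something shaped like a KMS key ARN."""
    return KMS_ARN.search(text) is not None


print(mentions_keyword("here is the api token"))
```

Note how narrow the first rule is: it only fires when the message is the single word, which is why the keyword and regex variants are usually more useful in practice.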
We can choose to develop a user-defined function (UDF) and call it right in the SQL to remove messages or revoke the AWS KMS key, such as:
SELECT slack_del_msg(channel,ts) FROM SLACK WHERE text LIKE '%password%'
In this sample app, I decided to use a Lambda function as a webhook sink. This is to ensure that all security rules can use the same Lambda function to perform different actions. The only requirement for each streaming SQL is to add 2 columns to guide the Lambda to take corresponding actions:
`tp_action` column to guide the Lambda function to either remove the message, revoke the AWS KMS key, or do nothing;
`tp_message` column to send a bot message to the channel to remind the user to avoid a data breach.
For example, the SQL to remove any chat message containing the keyword ‘confidential’ and post a reminder message that mentions the user is:
SELECT *, 'delete' AS tp_action, concat('<@',user_id,'>, do not send messages with confidential information') AS tp_message FROM slack_events WHERE text LIKE '%confidential%'
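To make the output of this rule concrete, here is a small Python mirror of the query. It is only a sketch: it assumes each event has `text` and `user_id` fields as in the SQL, that LIKE is case-sensitive, and that a result row is the original columns plus the two extra ones. The `apply_confidential_rule` name is hypothetical.

```python
from typing import Optional


def apply_confidential_rule(event: dict) -> Optional[dict]:
    """Return the enriched row for a matching event, or None if no match."""
    if "confidential" not in event.get("text", ""):
        return None
    # Same string the SQL builds with concat('<@', user_id, '>, ...').
    reminder = (
        "<@" + event["user_id"] + ">, do not send messages with confidential information"
    )
    # SELECT * keeps every original column; two guidance columns are added.
    return {**event, "tp_action": "delete", "tp_message": reminder}


row = apply_confidential_rule(
    {"channel": "C1", "ts": "1.0", "user_id": "U42", "text": "this is confidential"}
)
print(row["tp_action"], row["tp_message"])
```

Because every rule emits the same two columns, the downstream action code does not need to know which rule produced a row.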
Step 3: Take action in real-time via AWS Lambda
Lambda is a great serverless tool with ultra-low latency, high performance, as well as low cost. It costs less than 1 USD to process 1 million events (assuming each invocation takes 100 milliseconds).
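As a back-of-the-envelope check on that cost figure, the arithmetic can be written out as below. The prices and the 128 MB memory size are my assumptions (roughly the published x86 list prices at the time of writing, ignoring the free tier), so please check the current AWS pricing page before relying on them.

```python
REQUEST_PRICE_PER_MILLION = 0.20    # USD per 1M requests (assumed list price)
PRICE_PER_GB_SECOND = 0.0000166667  # USD per GB-second (assumed list price)
MEMORY_GB = 128 / 1024              # assumed 128 MB function
DURATION_SECONDS = 0.1              # 100 milliseconds, as stated in the text


def lambda_cost_usd(invocations: int) -> float:
    """Estimate the cost of a number of invocations under the assumptions above."""
    request_cost = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    compute_cost = invocations * DURATION_SECONDS * MEMORY_GB * PRICE_PER_GB_SECOND
    return request_cost + compute_cost


print(round(lambda_cost_usd(1_000_000), 2))  # about 0.41 under these assumptions
```

Under these assumptions the request charge and the compute charge are each about 20 cents per million events, so the total stays well below 1 USD.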
One month ago, an exciting feature was announced: function URLs for AWS Lambda. You can follow the wizard in the AWS Console to create a Python-based Lambda function.
You can set the Slack tokens in the Lambda environment variables. Besides the bot token, you also need a user token, since a bot token can only delete messages that the bot itself posted:
Similar to the data ingestion script, we don’t need to import the Slack SDK. Just call the REST API with the proper bot or user tokens:
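A Lambda handler along these lines could look like the sketch below. It is not the function from the sample app: it assumes the webhook delivers one result row per request as a plain JSON body with `channel`, `ts`, `tp_action` and `tp_message` fields, and the environment variable names and the token-to-method mapping are hypothetical. The Slack methods used, chat.delete and chat.postMessage, are regular Web API methods.

```python
import json
import os
import urllib.request

SLACK_API = "https://slack.com/api/"
# Assumed environment variable names and token mapping; adjust to your app.
TOKEN_ENV_BY_METHOD = {
    "chat.delete": "SLACK_USER_TOKEN",
    "chat.postMessage": "SLACK_BOT_TOKEN",
}


def build_slack_calls(row: dict) -> list:
    """Translate one streaming SQL result row into (method, payload) pairs."""
    calls = []
    if row.get("tp_action") == "delete":
        calls.append(("chat.delete", {"channel": row["channel"], "ts": row["ts"]}))
    if row.get("tp_message"):
        calls.append(
            ("chat.postMessage", {"channel": row["channel"], "text": row["tp_message"]})
        )
    return calls


def call_slack(method: str, payload: dict, token: str) -> dict:
    """POST a JSON payload to a Slack Web API method and return the reply."""
    request = urllib.request.Request(
        SLACK_API + method,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": "Bearer " + token,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def lambda_handler(event, context):
    row = json.loads(event["body"])  # assumes a non-base64 JSON body
    for method, payload in build_slack_calls(row):
        call_slack(method, payload, os.environ[TOKEN_ENV_BY_METHOD[method]])
    return {"statusCode": 200, "body": "ok"}
```

Keeping `build_slack_calls` free of network access makes the mapping from SQL columns to Slack calls easy to unit test, and a new action such as revoking a KMS key would only need one more branch there.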
If you choose to revoke the KMS key, you can attach an IAM role to the Lambda function, so that you don’t need to embed any static AWS access key or secret key.
The final step is to generate an HTTPS URL for the Lambda function, then set it as the webhook URL in the Timeplus UI.
The streaming result will be sent to the webhook as a JSON payload in the HTTP request body.
The end-to-end latency is usually less than 1 second, from when you send the message in the Slack app to when the message is removed automatically and you receive a reminder. Here is what happens under the hood:
The Slack message is loaded via the Slack RTM API and synced to Timeplus;
Even before the message is written to disk, Timeplus’s purpose-built streaming engine evaluates it against all related streaming queries. For any matching queries, a new event is created with the extra `tp_action` and `tp_message` columns;
The streaming results are sent to the webhook as JSON;
The AWS Lambda function receives these requests, parses the JSON payload into SQL result objects, checks the action type, then calls the Slack API to remove or send messages.
Conclusion
In this blog, I walked you through three steps to build a real-time security app with minimal custom code. Without Timeplus, such low-latency real-time solutions usually require a complex data infrastructure with Apache Kafka, Apache Flink or Apache Spark, with lots of servers and custom micro-services. With Timeplus, data can be pushed directly to Timeplus without extra message buses. Within a few milliseconds, events are filtered and aggregated, and webhooks are triggered in real time. Compared to custom code with Spark/Flink, by our calculation Timeplus cuts developer time and cost by 10x. More importantly, this unlocks potential for all your data analysts. Anyone with basic SQL skills can now build real-time applications without the help of engineers, and developers can focus on their mission-critical business applications without starting from scratch.
We believe similar development workflows can be applied to build more real-time applications, such as real-time marketing and real-time customer profiling. We welcome you to join our beta program to experience Timeplus yourself, or ask any questions in the Timeplus Community Slack. As always, we welcome your feedback, and look forward to hearing from you soon.