How to leverage append-only streams and key-value streams to monitor securities portfolios in real time.
Speed matters in capital markets. Just ask the shareholders of SVB.
One key component in capital markets that supports traders is post-trade analysis, the analyzing of trading activities and market data after trades have been executed. When a trade is executed, for example, all of the information attached to it is deployed to update accounts and ledgers, recalculate risk and kick off settlement processes.(1) Developing architectures that monitor firm-wide risk positions and regulatory requirements in real time helps financial service firms gain insight on opportunities (and risk) as soon as trades happen. Good post-trade analytics can dramatically improve risk-adjusted performance. But too frequently, we see star traders leveraging antiquated technologies to make decisions, which means they are making risk decisions with data that is sometimes already one day old. As we have seen with the fall of SVB, 24 hours in an eternity in capital markets time. However, with streaming data, transactions are completed seamlessly and investment strategies can be continually reviewed.
In today’s blog, I will explore the real-time data analytics concept of two different type of streams in Timeplus and show examples of how to use these tools to turn real-time data into real-time insight in a post-trade capital markets scenario:
Append-only stream which is composed by a sequence of events
Key-value streams which represent the current state of the target object
To visualize these use cases, try our live demo for securities trading scenarios at demo.timeplus.cloud.
Append-only stream
An append-only stream is a sequence of events in which:
The stream represents what happened, so the event is read-only and cannot be updated
The event is usually ordered by time, while it is possible to have late event due to network transmission delay or time skew
The stream is usually unbounded
Key-value stream
A key-value stream is also a stream composed by a sequence of events, the difference here is that it has a key. In the above diagram, the color indicates the key of the event, so this stream has four different keys: red, blue, orange, and green. In most of the analysis scenarios, it is important to get the latest value of the keys, for example, the latest price of a security or balance of a bank account. So with the key-value stream, there is a compaction process running to keep the latest value by key. With this compacted data, the stream analysis can get the latest value from a stream per specified key and once a new value is updated, the stream query can use that changed data to run incremental query processing. This is how change data capture (“CDC”) is supported by Timeplus.
Let’s take a look at the following example of how to use Timeplus’s streaming SQL with append-only stream and key-value stream to analyze and monitor a stock portfolio.
Case Studies: Security Analysis and Portfolio Monitoring
Our financial service clients use Timeplus to solve mission-critical activities that demand low latency and high system reliability. One core use case is to monitor portfolio/position performance in real time for improved risk controls and regulatory reporting. In this blog, we created a simplified case to simulate an investment and trading scenario in which an investor is trading securities and wants to analyze and monitor her portfolio in real-time. Let's take a look at how Timeplus can help to achieve this goal.
First, let's take a look at the key related data:
market_price and transaction are two append-only streams, both of which are event flow of the facts. They cannot be changed and are ordered by time.
account and position are two key-value streams, which can be used to record how much funding or securities the account has at the current time.
A trade simulator(2) is running in the background, simulating a random trading execution every 5 seconds. Note this is a highly simplified security trading simulation where we assume:
There is one trading account which has the initial funding of $1,000,000.
The portfolio contains four securities: Microsoft , Amazon, Coca cola and Walmart, with the codes: 105.MSFT, 105.AMZN, 106.KO , 106.WMT
Every 5 seconds, a random transaction is executed based on the current funding and portfolio positions. A buy or sell of one random securities among the portfolio with random quantities will be executed.
All executions will always succeed based on the latest market price.
There is no commission and other fee generated by the transaction.
Upon each execution, related data will be pushed to Timeplus for analysis.
OK, let’s start with monitoring those append only streams.
Monitor Append-only Stream: Market Price and Transactions
What is the latest price trend for each security that I have?
If you want to monitor the real-time price trend, just run a simple stream SQL from the price stream:
SQL 1
SELECT price, code FROM market_price WHERE code = '105.MSFT'
The above query will return the latest price of Microsoft in real-time. Every time there is a new price updated from the data source, the result will be pushed to the user through the SQL stream.
Also, you can monitor the price trend of multiple securities in your portfolio by running the following SQL:
SQL 2
SELECT code, price, _tp_time as time FROM market_price
WHERE code in ('105.MSFT','105.AMZN', '106.KO', '106.WMT')
transaction is also an append-only stream which records all the executions. The user can run stream SQL on this stream to monitor what’s happening to all transactions. For example, a user can monitor which security was sold the most during a 5 minutes window. The streaming SQL will be like this:
SQL 3
SELECT window_start, max(quantity * price) as sell_max, code
FROM tumble(transaction, 5m)
GROUP BY window_start, code
We can run more complex queries on an append-only stream. For example, if the user wants to know how much money did I spend on each security? This result can be achieved by aggregating all the historical transactions on each security.
SQL 4
SELECT
sum((direction * quantity) * price) AS total_cost, code, accountId
FROM transaction WHERE _tp_time>='1970-1-1'
GROUP BY code, accountId
Note, in this query, WHERE _tp_time>='1970-1-1' is applied to the query, which tells the query to aggregate all data on the transaction stream from the very earliest event. Because in order to get the current cost of specific securities, the query needs all transaction events.
Monitoring an append-only stream using stream SQL is pretty simple and straightforward, while things got a little bit complex coming to the key-value stream. Let’s take a look at what happens here.
Monitoring Key-value Stream, Portfolio Positions and Funding
Portfolio positions, like pieces in a game of chess, are not an append-only stream. It is the current state that records the quantity of each security the account is holding; it is more like a traditional OLTP (Online Transactional Processing) database table or a hash-table, where the user can update the state by running update (key, values).
If the user wants to ask the question of how many stocks do I have for each security? The following query will give the answer:
SQL 5
SELECT code, quantity FROM position
The Timeplus streaming query is different from a traditional OLTP query, as it will not only return the current state. When the stream is updated with new key-values, the stream query will also return the updated result. So that is what streaming SQL does, it will get all your latest results in real-time and will continuously update it.
Similar to monitoring account funding, if the user wants to know how much money she has in her account, she can run the following SQL:
SQL 6
SELECT funding FROM account
Append-Only Stream Join a Key-value Stream
Let's go a little further on more complex cases for streaming data analysis. What will happen if you want to join two different streams for some advanced analytic functions?
All the above queries were single stream/table queries where only one stream/table is involved. In a real analytics case, it is often required to join different streams/tables.
In the streaming world, there are different types of join, these different types of join could be used in different cases.
Append-only string join historical table
The first case is that the user want to enrich a stream by providing some lookup, for example an there is an append only stream of user’s transaction and there is another historical table recorded the each user’s address, when I want have a transaction report containing each user's address information so I can do the shipping, I can join the transaction stream with historical address table.
Append-only stream join Append-only stream
Another case in the stream analytic cases is there is a user transaction stream and a payment stream, the user wants to have a report that contains the payment status of each transaction. Then a two append-only streams join will be used. Since stream is unbounded, usually an interval join or asof join is used to specify the scope of the join.
Note, in the above diagram, assuming using interval query where the time range is within 3 time window
Append-only stream join Key-value stream
When joining the append-only stream with a historical data table, there is a limitation as what if the table changed with new value? For example, the customer moved to a new address, in this case, we need the table to be updated when calculating the joining result. This is where the key-value stream can help, when joining an append-only stream with key-value stream, the joining will always be executed on the latest value from the key-value stream, so if the customer has a new address, that address will be used for joining.
Key-value streaming join Key-value stream
It is also possible to join two key-value streams, which works similar as if you join two tables in a traditional database, but when any of the values is changed in one of the streams, the new joining result will be calculated and pushed to the user.
Note, the append-only stream join key-value stream and key-value stream join key-value stream will give you the same result, as we have mentioned, the key-value stream is just another format of the same stream which represents the latest status.
So let’s take a look at how to use join to get some more advanced streaming analysis.
Now the user came with a question: I already know the cost on each securityxz, what is my cost basis for each security? The cost basis can be calculated as the total money spent on one security / security quantity. We have a query that will return the cost (total money spent on a security) already. The security quantity is recorded in the position stream. So what the user needs to do is to join these two:
SQL 7
WITH cost AS(
SELECT
sum((direction * quantity) * price) AS total_cost, code, accountId, emit_version() AS ver
FROM transaction WHERE _tp_time>='1970-1-1'
GROUP BY code, accountId
)
SELECT total_cost, code, accountId, ver, p.quantity AS quantity,
total_cost / quantity AS cost_price
FROM cost AS c INNER JOIN position AS p ON c.code = p.code
By joining the append-only stream with stateful key-values stream in the above query, when either the new transaction has been triggered or the position data has been changed, the join will apply the latest data to calculate the join.
In the next query, the user wants to know what their portfolio is worth now:
SQL 8
SELECT
sum(total_value_per_code) AS total_value
FROM
(
SELECT
s.code AS code, latest(s.price * p.quantity) AS total_value_per_code
FROM
market_price AS s
INNER JOIN position AS p ON s.code = p.code
GROUP BY
s.code
)
To calculate the total value of all securities in the portfolio, we need to join the price stream to get the latest price of each securities to the position stream which returns the current quantity of those securities. The subquery in the above SQL will do it which returns the current value of each securities, and the query uses a sum to aggregate the result into the total value of all securities in the portfolio.
Monitoring Real-Time P&L
The last query I am going to show will join three streams. The question to answer is what is the profit and loss for each securities.
To calculate the P&L, as we have already calculated the cost price for each security, we just need to compare the latest price of the security and calculate the difference.
P&L = ( latest_price - cost_price ) * quantity
This will join the transaction stream with position stream and price stream.
Here the join requires a key-value stream instead of an append-only stream to match the latest price of specific security using security code. So we can create a key-value stream price by running a materialized view that targets the key-value stream.
As we have mentioned, it is possible to join two append-only streams using interval join or asof join, while using key-value stream will provide the best performance as the join only needs to find the latest value for each security.
SQL 9
WITH cost AS
(
SELECT
sum((direction * quantity) * price) AS total_cost, code, accountId, emit_version() AS ver
FROM transaction WHERE _tp_time>='1970-1-1'
GROUP BY code, accountId
)
SELECT
c.total_cost, c.code, c.accountId, c.ver,
po.quantity AS quantity, total_cost / quantity AS cost_price,
pr.price AS latest_price,
(latest_price - cost_price) AS profit_per_share, profit_per_share * quantity as profit
FROM cost AS c
INNER JOIN position AS po ON c.code = po.code
INNER JOIN price AS pr ON c.code = pr.code
The query will return the real-time profit and loss for each securities in the account, no matter if a new transaction is executed which caused the position change, or the security price has been changed, the updated P & L will be reported to the user in real-time.
Summary
In today’s blog, we explored the idea of two core concepts in the streaming analytic world: append-only stream and stateful key-value tables. Using an example of stock portfolio monitoring, we have shown how these two concepts can be used to support complex real-time streaming analytics. Streaming analytics are not easy to grasp, but when the right tools are applied, it will save you tons of time. We’ve designed Timeplus to make your stream analytic life easier.
The live demo of this showcase is running on demo.timeplus.cloud. Try it and tell me your comments and feedback.
Reference
Dow Jones Blog streaming-data-for-real-time-market-decisions
The transaction simulation code can be found here