sql: mixed timestamp queries #39275
Description
Problem
The feature I'm proposing has two primary use cases in which it would be beneficial.
- Combining geo-partitioning with follower reads
Imagine an application which wants to serve low-latency reads for all operations. This application would also like to perform low-latency writes for user-oriented data. The application has two primary sources of information which it draws on to complete user queries: 1) a user profile which is geo-partitioned and 2) a global index of data which is updated asynchronously.
The data required to render a view joins data from the user profile and the global index. In this case it is acceptable for the information from the global index to be stale. If a query could specify that the joined table should be read historically then a follower read could be used.
Imagine for example we have a users table:
CREATE TABLE users (
region STRING, id UUID, topics STRING[],
-- ...
PRIMARY KEY (region, id)
); -- PARTITION BY region ...
CREATE TABLE user_history (
region STRING, user_id UUID, article_id UUID, seen DATE,
PRIMARY KEY (region, user_id, article_id, seen)
); -- PARTITION BY region ...
CREATE TABLE articles (
id UUID, title STRING, content STRING, topic STRING, score FLOAT8,
INDEX (topic, score),
PRIMARY KEY (id)
);
For some dummy data we can imagine:
INSERT
INTO
users (region, id, topics)
VALUES
('us-east', '82e7b741-af16-4031-9101-20a99ae6bda6', ARRAY['memes', 'jokes']);
INSERT
INTO
articles (id, title, topic, score)
VALUES
(gen_random_uuid(), e'A meme I haven\'t seen', 'memes', 42),
(gen_random_uuid(), 'A meme I saw', 'memes', 7.4),
(gen_random_uuid(), 'A joke', 'jokes', 1.1);
INSERT
INTO
user_history (region, user_id, article_id, seen)
SELECT
u.region, u.id, a.id, now()
FROM
users AS u, articles AS a
WHERE
a.title = 'A meme I saw';
Finding the articles for the user might look something like:
SELECT
*
FROM
articles AS a, users AS u
WHERE
a.topic = ANY u.topics
AND u.region = 'us-east'
AND u.id = '82e7b741-af16-4031-9101-20a99ae6bda6'
AND a.id
NOT IN (
SELECT uh.article_id FROM user_history AS uh WHERE uh.user_id = u.id
)
ORDER BY
a.score DESC
LIMIT
10;
The problem with this query is that it requires going to the leaseholder for the articles index. It would be possible to exploit the duplicate indexes pattern to build geo-replicated copies of the index, but that approach carries tremendous extra cost: each region needs to store additional replicated copies of the index. One way to mitigate this cost would be to keep only a single replica of each copy of the index, but such a configuration would not be fault tolerant. A better solution would be to use a stale copy of the index by exploiting follower reads. We could use follower reads for this entire join, but that is less than optimal because content recently added to the user_history table would not be reflected. Worse, the topics might be stale, so a user who updated their topics would need to wait until the follower read delay has passed before seeing the change. Maybe that would be acceptable if the follower read delay were shorter (#37083).
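For reference, the whole-query follower read discussed above applies a single historical timestamp to every table in the statement, which is why the recent user_history rows and topic updates go missing:

```sql
-- Reads users, user_history, AND articles at the stale timestamp.
SELECT
	*
FROM
	articles AS a, users AS u
AS OF SYSTEM TIME experimental_follower_read_timestamp()
WHERE
	a.topic = ANY u.topics
	AND u.region = 'us-east'
	AND u.id = '82e7b741-af16-4031-9101-20a99ae6bda6'
	AND a.id
		NOT IN (
			SELECT uh.article_id FROM user_history AS uh WHERE uh.user_id = u.id
		)
ORDER BY
	a.score DESC
LIMIT
	10;
```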
- Using historical data with CREATE TABLE AS SELECT / INSERT INTO / UPDATE
See #39123
This technique could be very useful to materialize the results of an expensive query without sending all of the data to the client and without contending with foreground traffic.
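As a sketch of what this could look like with the per-table AS OF SYSTEM TIME form this issue proposes (the syntax is hypothetical and the table name is illustrative):

```sql
-- Materialize an expensive historical aggregate server-side, without
-- sending the data to the client or contending with foreground traffic.
-- (hypothetical syntax)
CREATE TABLE article_scores_snapshot AS
	SELECT
		topic, avg(score) AS avg_score
	FROM
		(articles AS OF SYSTEM TIME experimental_follower_read_timestamp())
	GROUP BY
		topic;
```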
Describe the solution you'd like
The proposal is primarily about enabling queries which read from tables at mixed timestamps. I believe this will also imply the need for a mechanism to perform historical reads inside of a transaction. In the above query the solution would look like the following:
SELECT
*
FROM
(articles AS OF SYSTEM TIME experimental_follower_read_timestamp()) AS a,
users AS u
WHERE
a.topic = ANY u.topics
AND u.region = 'us-east'
AND u.id = '82e7b741-af16-4031-9101-20a99ae6bda6'
AND a.id
NOT IN (
SELECT uh.article_id FROM user_history AS uh WHERE uh.user_id = u.id
)
ORDER BY
a.score DESC
LIMIT
10;
The syntax of the proposed solution is to augment the SQL grammar, extending relation_expr with ( table_name as_of_clause ) as an additional form. The more challenging portion of the solution will be extending the semantics to allow table readers to read at a timestamp other than the transaction timestamp. Each table_ref will need to carry a timestamp. Constraints can only be used by the optimizer when the timestamps are all the same for the indexes and tables which share a constraint (and the constraint must exist at that timestamp). This proposal is certainly not simple to implement but could be quite useful.
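A rough sketch of the grammar change, assuming a yacc-style relation_expr production like the one in pkg/sql/parser (the existing alternatives shown here are approximate):

```
relation_expr:
    table_name
  | table_name '*'
  | ONLY table_name
  | ONLY '(' table_name ')'
  | '(' table_name as_of_clause ')'   -- new: per-table historical read
```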
Describe alternatives you've considered
The main alternative is to perform all of the logic that integrates data from different times on the client. The downside of performing this work on the client is that it adds a burden for the client and may be dramatically less efficient, as the data will need to be transmitted to the client for processing. Furthermore, it eliminates some opportunities for query optimization.
An alternative to achieve low latency in the first case might be to keep extra copies of the global index (discussed above). Another mitigation for the first case might be to reduce the follower read delay but that remains less than optimal.
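For completeness, the extra-copies alternative can be sketched with the duplicate indexes pattern via zone configurations (the index name and region value are illustrative, not from the original issue):

```sql
-- One covering secondary index per region, with its leaseholder
-- pinned locally so reads in that region stay fast.
CREATE INDEX idx_us_east ON articles (topic, score) STORING (title, content);
ALTER INDEX articles@idx_us_east
	CONFIGURE ZONE USING
		num_replicas = 3,
		lease_preferences = '[[+region=us-east]]';
-- Repeat for each region; storage and write amplification grow
-- linearly with the number of regions, which is the cost discussed above.
```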