@jeffreylovitz jeffreylovitz commented Mar 3, 2021

Resolves #1589

@swilly22 swilly22 left a comment

Let's redesign the timeout/cron interface

src/timeout.h Outdated
Comment on lines 11 to 15
typedef struct {
ExecutionPlan *plan; // the query's ExecutionPlan
bool query_completed; // whether the query has finished successfully
bool changes_committed; // whether the graph has been locked for commits
} TimeoutCtx;

I like the idea of a timeout source file. I think the API of this header should be:

  1. Timeout_SetTimeOut
  2. Timeout_ClearTimeOut

Timeout_ClearTimeOut might accept as input a CronTaskID generated by Cron_AddTask; this ID can be saved at the QueryCtx level.

Timeout_QueryCompleted and Timeout_ChangesCommitted reveal too much internal information; please remove them.

Also, let's try to find a more suitable location for this file.

I believe we can remove both query_completed and changes_committed.
The first time a write query tries to acquire the write lock, it should call ClearTimeout before acquiring the lock. If it succeeds in clearing the timeout (given that a timeout is present), the query can proceed. Otherwise the timeout has already been triggered: the query shouldn't acquire the lock and should return an exception.
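
The race described above (query thread clearing the timeout vs. cron thread firing it) can be sketched with a single atomic state word. This is only an illustration under stated assumptions: `Query`, `TimeoutState`, `timeout_try_clear`, `timeout_try_fire`, and `query_acquire_write_lock` are hypothetical names, not RedisGraph's API, and the boolean field merely stands in for the real write lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical timeout states; not RedisGraph's actual representation.
typedef enum {
    TIMEOUT_NONE,     // no timeout was set for this query
    TIMEOUT_PENDING,  // timeout scheduled, not yet fired or cleared
    TIMEOUT_CLEARED,  // query cleared the timeout in time
    TIMEOUT_FIRED     // cron thread fired the timeout first
} TimeoutState;

typedef struct {
    atomic_int timeout;     // holds one of TimeoutState
    bool holds_write_lock;  // stand-in for the real write lock
} Query;

static void query_init(Query *q, bool has_timeout) {
    atomic_init(&q->timeout, has_timeout ? TIMEOUT_PENDING : TIMEOUT_NONE);
    q->holds_write_lock = false;
}

// Query thread: returns true if the query may proceed, i.e. no timeout
// was set or the timeout was cleared before it fired.
static bool timeout_try_clear(Query *q) {
    int expected = TIMEOUT_PENDING;
    if (atomic_compare_exchange_strong(&q->timeout, &expected, TIMEOUT_CLEARED)) {
        return true;
    }
    // On failure, 'expected' now holds the state that was actually observed.
    return expected == TIMEOUT_NONE || expected == TIMEOUT_CLEARED;
}

// Cron thread: returns true only if the timeout was still pending.
static bool timeout_try_fire(Query *q) {
    int expected = TIMEOUT_PENDING;
    return atomic_compare_exchange_strong(&q->timeout, &expected, TIMEOUT_FIRED);
}

// Clear the timeout first; take the lock only if that succeeded.
// A false return means the caller should raise a timeout error.
static bool query_acquire_write_lock(Query *q) {
    if (!timeout_try_clear(q)) return false;
    q->holds_write_lock = true;
    return true;
}
```

Because both sides use a compare-and-swap from the pending state, exactly one of them wins: a query that has taken the lock can no longer be timed out, and a query that has been timed out never takes the lock.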

src/util/cron.c Outdated

static void CRON_FreeTask(CRON_TASK *t) {
ASSERT(t);
rm_free(t->pdata);

Cron_AddTask should document that it takes ownership of pdata.

@swilly22 swilly22 left a comment

Feel free to call if you have any questions.
If performing a timeout on WRITE queries turns out to be a problem, we can scrap that and apply it only to READ queries.

// execute due tasks
CRON_TASK *task = NULL;
while((task = CRON_Peek()) && CRON_TaskDue(task)) {
task = CRON_RemoveTask();

Between while((task = CRON_Peek()) && CRON_TaskDue(task)) { and task = CRON_RemoveTask(); the task might be removed by a worker thread. Moreover, between CRON_Peek() and CRON_RemoveTask() a new task might be inserted. This entire process (checking whether there's a due task and removing it from the heap) needs to be atomic.

Contributor Author

Changed to atomic. I additionally changed the while to an if so that we don't block entering queries while waiting for a timeout to expire.
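
One way to make the check-and-remove step atomic is to do both under a single mutex acquisition. The sketch below assumes a small sorted array instead of the heap used in the PR; `TaskQueue`, `queue_add`, and `queue_pop_if_due` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 16

typedef struct {
    long due_ms;  // absolute due time in milliseconds
    int id;       // illustrative task identifier
} Task;

typedef struct {
    pthread_mutex_t lock;
    Task tasks[MAX_TASKS];  // kept sorted by due_ms, earliest first
    size_t count;
} TaskQueue;

static void queue_init(TaskQueue *q) {
    pthread_mutex_init(&q->lock, NULL);
    q->count = 0;
}

// Insert a task, keeping the array sorted. Returns false if the queue is full.
static bool queue_add(TaskQueue *q, Task t) {
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (q->count < MAX_TASKS) {
        size_t i = q->count;
        while (i > 0 && q->tasks[i - 1].due_ms > t.due_ms) {
            q->tasks[i] = q->tasks[i - 1];
            i--;
        }
        q->tasks[i] = t;
        q->count++;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

// Peek and remove under one lock acquisition, so no other thread can
// remove or insert a task between the "is it due?" check and the removal.
static bool queue_pop_if_due(TaskQueue *q, long now_ms, Task *out) {
    bool popped = false;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0 && q->tasks[0].due_ms <= now_ms) {
        *out = q->tasks[0];
        for (size_t i = 1; i < q->count; i++) q->tasks[i - 1] = q->tasks[i];
        q->count--;
        popped = true;
    }
    pthread_mutex_unlock(&q->lock);
    return popped;
}
```

The function pops at most one task per call, which mirrors the change from `while` to `if` mentioned in the reply above: the caller decides how often to poll rather than draining the queue in one pass.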

src/util/heap.c Outdated

/* ensure heap property */
__pushup(h, idx);
if (idx < h->count)

This check can be performed a bit earlier; there's no need to perform:

h->array[idx] = h->array[h->count - 1];
h->array[h->count - 1] = NULL;

if this is the last item in the heap.
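
A minimal sketch of removing an arbitrary element from a binary min-heap with the early "last item" check suggested above. It uses a fixed-capacity heap of ints rather than the pointer heap in `src/util/heap.c`; `IntHeap`, `heap_push`, and `heap_remove_at` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CAP 32

typedef struct {
    int array[HEAP_CAP];
    size_t count;
} IntHeap;

static void swap_items(IntHeap *h, size_t a, size_t b) {
    int tmp = h->array[a];
    h->array[a] = h->array[b];
    h->array[b] = tmp;
}

// Move the item at index i up until its parent is not larger.
static void sift_up(IntHeap *h, size_t i) {
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->array[parent] <= h->array[i]) break;
        swap_items(h, parent, i);
        i = parent;
    }
}

// Move the item at index i down until both children are not smaller.
static void sift_down(IntHeap *h, size_t i) {
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, min = i;
        if (left < h->count && h->array[left] < h->array[min]) min = left;
        if (right < h->count && h->array[right] < h->array[min]) min = right;
        if (min == i) break;
        swap_items(h, min, i);
        i = min;
    }
}

// Returns 1 on success, 0 if the heap is full.
static int heap_push(IntHeap *h, int value) {
    if (h->count == HEAP_CAP) return 0;
    size_t i = h->count++;
    h->array[i] = value;
    sift_up(h, i);
    return 1;
}

// Remove and return the item at idx.
static int heap_remove_at(IntHeap *h, size_t idx) {
    assert(idx < h->count);
    int removed = h->array[idx];
    h->count--;
    // Early exit: removing the last slot needs no move and no re-heapify.
    if (idx == h->count) return removed;
    h->array[idx] = h->array[h->count];
    sift_up(h, idx);
    sift_down(h, idx);
    return removed;
}
```

Checking for the last slot before touching the array avoids a self-assignment and two pointless sift passes.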

self.env.assertContains("Query timed out", str(error))

def test04_timeout_during_commit_stage(self):
query = "CREATE (a:M) WITH a UNWIND range(1,10000) AS ctr SET a.v = ctr"

Nice

src/timeout.c Outdated
void Timeout_ClearTimeout() {
CronTask task = QueryCtx_GetTimeoutJob();
if(task == NULL) return;
Cron_RemoveTask(task);

Cron_RemoveTask should indicate whether it managed to remove the task. This indication should be returned by Timeout_ClearTimeout.
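
The suggestion above amounts to propagating a boolean from the task removal up through the timeout API. The following single-threaded sketch shows only that return-value plumbing; `Cron`, `TaskHandle`, `cron_remove_task`, and `timeout_clear` are hypothetical stand-ins, and returning `true` when no timeout was set is an assumption, not something the thread specifies. A real implementation would also need locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef int TaskHandle;  // hypothetical handle; 0 means "no task"

typedef struct {
    TaskHandle pending[MAX_PENDING];  // tasks that have not run yet
    size_t count;
} Cron;

// Returns true only if the task was still pending and has been removed.
static bool cron_remove_task(Cron *c, TaskHandle t) {
    for (size_t i = 0; i < c->count; i++) {
        if (c->pending[i] == t) {
            // Order is irrelevant here, so fill the hole with the last entry.
            c->pending[i] = c->pending[--c->count];
            return true;
        }
    }
    return false;  // already executed, or never scheduled
}

typedef struct {
    TaskHandle timeout_job;  // 0 when the query has no timeout
} QueryCtx;

// Returns true if the query may proceed: either it had no timeout
// (assumed behavior) or the timeout task was removed before running.
static bool timeout_clear(Cron *c, QueryCtx *ctx) {
    if (ctx->timeout_job == 0) return true;
    bool removed = cron_remove_task(c, ctx->timeout_job);
    ctx->timeout_job = 0;
    return removed;
}
```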

src/query_ctx.h Outdated
/* Set the last writer which needs to commit */
void QueryCtx_SetLastWriter(OpBase *op);
/* Set the query's associated timeout job. */
void QueryCtx_SetTimeoutJob(CronTask task);

See if it makes sense to change this to: QueryCtx_SetTimeout(ms, ...)

Contributor Author

I don't think that would work easily - the QueryCtx tracks whether the query has an associated timeout, it does not instantiate it.

src/query_ctx.c Outdated
Comment on lines 175 to 177
// Changes are being committed, clear the timeout job.
Timeout_ClearTimeout();

This should be the first thing we try to do, right before acquiring the GIL (line 149). If we fail to clear the timeout (the timeout has been triggered), we should emit an exception.

@swilly22 swilly22 merged commit 964b268 into master Mar 9, 2021
@swilly22 swilly22 deleted the runtime-timeouts branch March 9, 2021 12:00
jeffreylovitz added a commit that referenced this pull request Mar 10, 2021
* Add run-time configuration for default query timeouts

* Timeout for write queries that haven't committed changes

* define TIMEOUT_NO_TIMEOUT

* Refactor timeout logic

* Address PR comments

* Do not use timeouts for write queries

Co-authored-by: swilly22 <roi@redislabs.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 964b268)
swilly22 added a commit that referenced this pull request Mar 15, 2021
* RedisGraph benchmark automation  (#1557)

* Added local and remote benchmark definition and automation

* [fix] Fixes per PR review. Added option to specify benchmark via BENCHMARH=<benchmark name>. Updated benchmark template

Co-authored-by: filipecosta90 <filipecosta.90@gmail.com>
(cherry picked from commit f6f1ab2)

* Updated benchmark UPDATE-BASELINE to be less restrictive in the latency KPI (#1577)

Given we're still experimenting with the benchmarks CI KPI validation, this PR increases the `OverallClientLatencies.Total.q50` to be lower than 2.0 ( before was 1.5 ) so that we can collect further data and adjust afterwards...

(cherry picked from commit 611a0f0)

* * log redisgraph version (#1567)

When pulling container image tagged as `latest` or `edge` I sometimes
don't know which version I'm running, and it would be much faster to
find out if the information was displayed at startup. This patch logs
this information.

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit fe2e7ce)

* [add] Triggering nightly CI benchmarks; Early return on CI benchmarks for forked PRs (#1579)

(cherry picked from commit a529c1e)

* use PRIu64 to format uint64_t (#1581)

(cherry picked from commit c0e00d5)

* [fix] Fixed missing github_actor on ci nightly benchmark automation (#1583)

(cherry picked from commit 8abad84)

* Fix idx assertion (#1580)

* Fix flawed assertion in index deletion logic

* Reduce KPI for updates_baseline benchmark

* Address PR comments

* Address PR comments

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 6bad20a)

* Always report run-time errors as the sole reply (#1590)

* Always report run-time errors as the sole reply

* Update test_timeout.py

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit c9ba776)

* remove wrong assertion (#1591)

(cherry picked from commit 12ef8ac)

* Report 0 indices created on duplicate index creation (#1592)

(cherry picked from commit e00f2c8)

* Multi-platform build (#1587)

Multi-platform build

(cherry picked from commit 26ace7a)

* Multi-platform build, take 2 (#1598)

(cherry picked from commit acde693)

* Moved common benchmark automation code to redisbench-admin package. Improved benchmark specification file (#1597)

(cherry picked from commit ebea927)

* Added readies submodule (#1600)

* Added readies submodule

* fixes 1

(cherry picked from commit efbfeaf)

* Dockerfle: fixed artifacts copy (#1601)

(cherry picked from commit f722f2d)

* CircleCI: fixed version release (#1602)

(cherry picked from commit 9f218d6)

* CircleCI: release-related fix (#1604)

(cherry picked from commit 15cf291)

* remove redundent include (#1606)

(cherry picked from commit 7ea1c43)

* Threaded bulk insert (#1596)

* Update the bulk updater to execute on a thread

* Bulk loader endpoint locks for minimal time

* TODOs

* Use a separate thread pool for bulk operations

* Update test_thread_pools.cpp

* refactor bulk-insert

* Fix PR problems

* count number of pings during bulk-insert, only create graph context on BEGIN token

Co-authored-by: swilly22 <roi@redislabs.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 2d43f9d)

* Use system gcc in Ubuntu 16 (#1615)

(cherry picked from commit 0c10130)

* wrongly assumed add op had only 2 operands (#1618)

(cherry picked from commit 6b06095)

* Updated benchmark requirements version (#1616)

* Updated benchmark requirements version

* Update requirements.txt

(cherry picked from commit db080d4)

* Runtime timeouts (#1610)

* Add run-time configuration for default query timeouts

* Timeout for write queries that haven't committed changes

* define TIMEOUT_NO_TIMEOUT

* Refactor timeout logic

* Address PR comments

* Do not use timeouts for write queries

Co-authored-by: swilly22 <roi@redislabs.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 964b268)

* Fix typo in assertion

* bump version to 2.2.16

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
Co-authored-by: filipe oliveira <filipecosta.90@gmail.com>
Co-authored-by: bc² <odanoburu@users.noreply.github.com>
Co-authored-by: Rafi Einstein <raffapen@outlook.com>
jeffreylovitz added a commit that referenced this pull request Mar 15, 2021
swilly22 added a commit that referenced this pull request Mar 17, 2021
* Threaded bulk insert (#1596)

* Update the bulk updater to execute on a thread

* Bulk loader endpoint locks for minimal time

* TODOs

* Use a separate thread pool for bulk operations

* Update test_thread_pools.cpp

* refactor bulk-insert

* Fix PR problems

* count number of pings during bulk-insert, only create graph context on BEGIN token

Co-authored-by: swilly22 <roi@redislabs.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 2d43f9d)

* set score to 1 for each document (#1607)

* set score to 1 for each document

* test fulltext search scoring

* Update proc_fulltext_query.c

* Add documentation

Co-authored-by: Jeffrey Lovitz <jeffrey.lovitz@gmail.com>
(cherry picked from commit bd1fdca)

* wrongly assumed add op had only 2 operands (#1618)

(cherry picked from commit 6b06095)

* Updated benchmark requirements version (#1616)

* Updated benchmark requirements version

* Update requirements.txt

(cherry picked from commit db080d4)

* Runtime timeouts (#1610)

* Add run-time configuration for default query timeouts

* Timeout for write queries that haven't committed changes

* define TIMEOUT_NO_TIMEOUT

* Refactor timeout logic

* Address PR comments

* Do not use timeouts for write queries

Co-authored-by: swilly22 <roi@redislabs.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
(cherry picked from commit 964b268)

* Use master version of CircleCI config

* bump version to 2.4.2

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>
Co-authored-by: filipe oliveira <filipecosta.90@gmail.com>
pnxguide pushed a commit to CMU-SPEED/RedisGraph that referenced this pull request Mar 22, 2023
Development

Successfully merging this pull request may close these issues.

default timeout
