Query cancellation with CTRL-C for Python client #3749

Merged
Mytherin merged 2 commits into duckdb:master from hannes:pypending, Jun 8, 2022

hannes (Member) commented Jun 1, 2022

Follow-up to #3747, adding query cancellation to the Python client. Should fix #3742. One caveat we may still need to look into is that checking for interrupts requires the GIL. Hopefully that does not lead to a performance degradation.
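
To make the caveat concrete, here is a minimal sketch of the general polling idea, not DuckDB's actual implementation: the query runs as a series of small tasks, and the caller checks for a pending interrupt between tasks. In CPython, a C extension would typically do that check with `PyErr_CheckSignals`, which needs the GIL; whether this PR does exactly that is an assumption. `PendingQuery`, `QueryInterrupted` and `run_until_done` are hypothetical names used only for illustration.

```python
class QueryInterrupted(RuntimeError):
    """Raised when a pending query is cancelled (hypothetical name)."""


class PendingQuery:
    """Hypothetical stand-in for a query that runs as a series of small tasks."""

    def __init__(self, total_tasks):
        self.remaining = total_tasks
        self.interrupted = False

    def execute_task(self):
        # Run one unit of work; return True once nothing is left.
        self.remaining -= 1
        return self.remaining <= 0

    def interrupt(self):
        # Mark the query as cancelled so no further tasks are run.
        self.interrupted = True


def run_until_done(query, interrupt_requested):
    """Alternate between one interrupt check and one task."""
    while True:
        # In a real client this check is the step that would need the GIL.
        if interrupt_requested():
            query.interrupt()
            raise QueryInterrupted("query interrupted")
        if query.execute_task():
            return "finished"


if __name__ == "__main__":
    # No interrupt is ever requested, so all five tasks run to completion.
    print(run_until_done(PendingQuery(5), lambda: False))
```

The cost in question is one interrupt check per task, so smaller tasks mean more frequent checks and more GIL acquisitions.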

pdet (Collaborator) commented Jun 1, 2022

Oops, I worked on the same thing yesterday, but you beat me to the PR :).

The GIL thing is indeed a pity; maybe we should run some numbers?

hannes (Member, Author) commented Jun 1, 2022

Great idea, please do that :P

pdet self-requested a review June 8, 2022 12:10

pdet (Collaborator) left a comment

I tried different data sizes and numbers of threads, and there seems to be no considerable impact.

import duckdb
from datetime import datetime
import threading
import queue as Queue

class DuckDBThreaded:
    def __init__(self, duckdb_insert_thread_count, thread_function):
        self.duckdb_insert_thread_count = duckdb_insert_thread_count
        self.threads = []
        self.thread_function = thread_function

    def multithread_test(self, size, if_all_true=True):
        duckdb_conn = duckdb.connect(check_same_thread=False)
        queue = Queue.Queue()
        results = []

        # Every thread shares the same connection and reports True or False through the queue.
        for i in range(self.duckdb_insert_thread_count):
            self.threads.append(threading.Thread(target=self.thread_function, args=(duckdb_conn, queue, size), name='duckdb_thread_' + str(i)))

        for thread in self.threads:
            thread.start()
            # Read exactly one result per thread. queue.get() blocks until a result
            # arrives, so each thread finishes its query before the next one starts.
            results.append(queue.get())

        for thread in self.threads:
            thread.join()

        # Require every thread to succeed, or at least one when if_all_true is False.
        return_value = all(results) if if_all_true else any(results)
        assert return_value


def fetchone_query(duckdb_conn, queue, size):
    # Run one aggregate query and report success or failure through the queue.
    try:
        duckdb_conn.execute("select sum(range) from range(" + str(size) + ")").fetchone()
        queue.put(True)
    except Exception:
        queue.put(False)

def run_test(num_threads, size):
    # Time one run with the given number of threads and range size.
    cur = datetime.now()
    duck_threads = DuckDBThreaded(num_threads, fetchone_query)
    duck_threads.multithread_test(size)
    print(str(num_threads) + "T " + str(size) + " elements")
    print(datetime.now() - cur)

run_test(10, 100000000)
run_test(20, 100000000)
run_test(100, 100000)
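
Note that the script above calls `queue.get()` right after each `start()`, so every thread has reported its result before the next one is started. A variant that starts all workers first and collects results afterwards, so that the workers overlap, might look like the sketch below. It uses only the standard library and a placeholder workload instead of DuckDB; `run_concurrently` and `placeholder_workload` are hypothetical names, and nothing here reproduces the measurements reported in this comment.

```python
import queue
import threading
import time


def run_concurrently(num_threads, worker, *args):
    """Start all workers first, then collect one result per worker."""
    results = queue.Queue()

    def wrapped():
        # Report success or failure instead of letting the thread die silently.
        try:
            worker(*args)
            results.put(True)
        except Exception:
            results.put(False)

    threads = [
        threading.Thread(target=wrapped, name=f"worker_{i}")
        for i in range(num_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - start

    # Every worker has put exactly one result by the time join() returns.
    outcomes = [results.get() for _ in threads]
    return all(outcomes), elapsed


def placeholder_workload(size):
    # Stand-in for a query: sums a range in pure Python.
    return sum(range(size))


if __name__ == "__main__":
    ok, elapsed = run_concurrently(4, placeholder_workload, 100_000)
    print(ok, elapsed)
```

To time a real query, `placeholder_workload` would be swapped for a function that executes it on a shared connection, as `fetchone_query` does above.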

Mytherin merged commit 05499e7 into duckdb:master Jun 8, 2022

Mytherin (Collaborator) commented Jun 8, 2022

Great! Thanks :)

Development

Successfully merging this pull request may close these issues.

Feature request: respond to "interrupt kernel" button in Jupyter

4 participants