@smatvienko-tb
Contributor

Hi guys!
Please consider printing the COUNT results even if some failures occur.
It makes no sense to query billions of rows for a few hours and then show nothing because a failure happened.
In general, 2-5 failures are acceptable for a few billion rows, and a retry strategy may fix those failures.
The results are always approximate anyway, because the cluster keeps writing new data during the dsbulk run.
In my example, I counted 6.5 billion rows with 2 failures, spent 2 hours, and got no results at all.
Here is an example of what I got after a few tries:
image
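
To illustrate the requested behavior, here is a minimal sketch in Python. It is not dsbulk's actual code: `CountReport`, `count_with_partial_result`, the range names, and the retry count are all hypothetical. The idea is to retry each failed range a few times, keep the running total, and still print it together with the number of failures.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class CountReport:
    """Running total plus the ranges that could not be counted (hypothetical type)."""
    total: int = 0
    failures: List[Tuple[str, str]] = field(default_factory=list)

    def render(self) -> str:
        # Always show the total; flag it as partial when some ranges failed.
        if not self.failures:
            return str(self.total)
        return f"{self.total} (partial: {len(self.failures)} range(s) failed)"


def count_with_partial_result(
    ranges: Iterable[str],
    count_range: Callable[[str], int],
    max_retries: int = 2,  # assumed value, for illustration only
) -> CountReport:
    """Count every range, retrying failures, and never discard the total."""
    report = CountReport()
    for name in ranges:
        last_error = None
        for _ in range(max_retries + 1):
            try:
                report.total += count_range(name)
                last_error = None
                break
            except Exception as exc:  # a real tool would catch narrower errors
                last_error = exc
        if last_error is not None:
            # Record the failure instead of aborting the whole operation.
            report.failures.append((name, str(last_error)))
    return report


def fake_count(name: str) -> int:
    """Stand-in for a real per-range query; one range always times out."""
    if name == "range-b":
        raise TimeoutError("read timed out")
    return {"range-a": 10, "range-c": 5}[name]


report = count_with_partial_result(["range-a", "range-b", "range-c"], fake_count)
print(report.render())  # prints "15 (partial: 1 range(s) failed)"
```

The failed range is reported next to the total rather than replacing it, so the user can judge whether the approximate count is good enough.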

…o query billions rows for a few hours and show nothing if some failure happens
@adutra
Contributor

adutra commented Jan 18, 2022

Also @smatvienko-tb I would need you to sign our CLA if you haven't done it before:

https://cla.datastax.com/

Thanks!

@smatvienko-tb
Contributor Author

smatvienko-tb commented Jan 18, 2022 via email

@adutra adutra merged commit 950c11a into datastax:1.x Jan 19, 2022