Describe the problem
During a large data import, migrating a huge MongoDB database into CockroachDB, the process suddenly stopped importing. The reason: a duplicate key value error, even though ON CONFLICT DO NOTHING was active. I had a suspicion and was able to reproduce it:
We use bulk inserts to speed up the import. It turns out that within a single batch the same key (a distinct key made from 3 fields) appeared at least twice. CockroachDB does not handle this well yet. This looks clearly like a bug, and it only happens in this scenario: if the database already holds a row for the duplicated key, the conflict handling works even when the statement contains multiple entries for that key.
Expectation
What I expect from CockroachDB: if the same bulk insert statement contains a key conflict and ON CONFLICT DO NOTHING is specified, write the first value and ignore the later ones. That is how the rows are already treated when a value with that key exists in the database.
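To make the expected "first value wins" semantics concrete, here is a minimal Python sketch that applies them to a batch on the client side before it is sent. It can also serve as a workaround until the database handles this case. The helper name `dedupe_first_wins` and the tuple-based row layout are assumptions for illustration, not part of our import code.

```python
from typing import Iterable, Sequence


def dedupe_first_wins(rows: Iterable[Sequence], key_indexes: Sequence[int]) -> list:
    """Keep only the first row seen for each key tuple, preserving batch order."""
    seen = set()
    kept = []
    for row in rows:
        # Build the composite key from the key columns (here: key1, key2, key3).
        key = tuple(row[i] for i in key_indexes)
        if key in seen:
            continue  # later duplicate within the same batch: ignore it
        seen.add(key)
        kept.append(row)
    return kept


# Hypothetical batch: the first two rows share the key (12, 12, 13).
batch = [(12, 12, 13, "1"), (12, 12, 13, "2"), (12, 12, 14, "3")]
deduped = dedupe_first_wins(batch, key_indexes=(0, 1, 2))
print(deduped)
```

With this batch, only the row carrying "1" survives for the key (12, 12, 13), which matches what I would expect the statement itself to do.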
To Reproduce
Create a table with a unique constraint (ours was a primary key spanning 3 columns) and execute a bulk insert statement. The table must not yet contain a record with the same key.
INSERT INTO "table" ("key1", "key2", "key3", "randomfield") VALUES (12,12,13,'1'), (12,12,13,'2') ON CONFLICT DO NOTHING;
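For comparison, the following sketch runs the same kind of statement against an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in here to show the behavior I am asking for. It says nothing about how CockroachDB executes the statement, and the column types are assumptions. It needs SQLite 3.24 or newer for the ON CONFLICT clause.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Composite primary key over three columns, as in the report (types assumed).
conn.execute(
    'CREATE TABLE "table" ('
    '"key1" INTEGER, "key2" INTEGER, "key3" INTEGER, "randomfield" TEXT, '
    'PRIMARY KEY ("key1", "key2", "key3"))'
)

# The table is empty; both rows in the same statement share one key.
conn.execute(
    'INSERT INTO "table" ("key1", "key2", "key3", "randomfield") '
    "VALUES (12,12,13,'1'), (12,12,13,'2') ON CONFLICT DO NOTHING"
)

rows = conn.execute(
    'SELECT "key1", "key2", "key3", "randomfield" FROM "table"'
).fetchall()
print(rows)
```

Here the statement succeeds and the table ends up with a single row holding '1'; the second row is silently skipped.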