
DepState: Detecting Synchronization Failure Bugs in Distributed Database Management Systems

DepState is a tool for detecting synchronization failure bugs in distributed database management systems.

The artifact consists of the following contents:

- One folder (DepState) containing the tool binaries, the instrumented binaries for the NDB cluster, and the corresponding system configuration file. Note that the mysqld binary was compressed after instrumentation due to size limitations, so please unzip it before use.
- Three folders containing the tool's coverage metadata for the three databases under test (ndb_log, mariadb_log, innodb_log).
- readme.md, which describes how to use the tool to test the databases, including the corresponding configuration.
- bug, which describes the errors detected by DepState and contains the log files for the bugs.
- NDB cluster-container.md, which explains how to build a MySQL NDB Cluster test environment with Docker containers, since we test in a distributed scenario.
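Since the exact archive name and format of the compressed mysqld are not specified in this README, the sketch below picks a decompression command by extension; the path `DepState/mysqld.zip` is an assumption -- check what the DepState folder actually ships and adjust accordingly.

```shell
# Sketch: choose a decompression command for the packed mysqld binary.
# The archive name/format below are assumptions, not confirmed by the repo.
unpack_cmd() {
  case "$1" in
    *.zip)    echo "unzip -o $1 -d DepState/" ;;
    *.tar.gz) echo "tar -xzf $1 -C DepState/" ;;
    *.gz)     echo "gunzip -k $1" ;;
    *)        echo "unknown archive format: $1" ;;
  esac
}
unpack_cmd DepState/mysqld.zip
```

Remember to restore the executable bit (`chmod +x DepState/mysqld`) after unpacking.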

Synchronization Failure Bugs detected by DepState

| # | DDBMS | Bug Type | Root Cause Analysis | Status |
|---|-------|----------|---------------------|--------|
| 1 | MySQL NDB Cluster | Crash | Engine mishandles metadata synchronization and locking. | Confirmed |
| 2 | MySQL NDB Cluster | Crash | Concurrent cluster restart during node reboot causes state inconsistency. | Confirmed |
| 3 | MySQL NDB Cluster | Crash | Improper lock handling during node removal in synchronization. | Confirmed |
| 4 | MySQL NDB Cluster | Crash | Internal error occurs during signal transmission in nodes. | Confirmed |
| 5 | MySQL NDB Cluster | Crash | Improper metadata lock handling during synchronization with complex dependencies. | Confirmed |
| 6 | MySQL NDB Cluster | Crash | FindTable function failure during BACKUP. | Investigating |
| 7 | MySQL NDB Cluster | Crash | SimulatedBlock component's signal processing fails. | Confirmed |
| 8 | MySQL NDB Cluster | Crash | Data check fails: the specified table or table pointer could not be found. | Investigating |
| 9 | MySQL NDB Cluster | Crash | SUMA bucket switch failure during asynchronous event processing. | Confirmed |
| 10 | MySQL NDB Cluster | Crash | Forced shutdown-induced signal processing error causes cascading node restarts. | Investigating |
| 11 | MySQL NDB Cluster | Crash | Some operations are not supported when synchronizing complex SQL queries. | Investigating |
| 12 | MySQL NDB Cluster | Hang | Timeout mechanism failure in NDB Cluster during complex query execution. | Confirmed |
| 13 | MySQL NDB Cluster | Hang | Failure in query plan generation and optimization when handling complex nested queries. | Confirmed |
| 14 | MySQL NDB Cluster | Hang | Transaction optimization enters an infinite loop. | Confirmed |
| 15 | MySQL NDB Cluster | Hang | ID allocation failure disrupts synchronization during node rejoin. | Confirmed |
| 16 | MySQL NDB Cluster | Hang | Synchronization fails during complex query processing. | Confirmed |
| 17 | MySQL NDB Cluster | Inconsistency | Failure to send synchronization signal in function. | Investigating |
| 18 | MySQL NDB Cluster | Inconsistency | Error occurs when updating automatic index statistics. | Investigating |
| 19 | MySQL InnoDB Cluster | Crash | Incompatible data types cause synchronization to fail. | Investigating |
| 20 | MySQL InnoDB Cluster | Crash | A data type conversion error after a network connection failure causes the node to exit. | Investigating |
| 21 | MariaDB Galera Cluster | Crash | Missing records and duplicate key conflicts in delete and update operations. | Investigating |
| 22 | MariaDB Galera Cluster | Inconsistency | Data type mismatches due to invalid default values are ignored by WSREP. | Investigating |

Environment for DepState and Target DDBMS

We conducted all experiments on a 64-bit Ubuntu 22.04 machine equipped with an AMD EPYC 7742 processor (128 cores @ 2.25 GHz) and 488 GiB of main memory.

The distributed cluster consists of one manager node, four data nodes, and four SQL client nodes. In a MySQL NDB Cluster this corresponds to one ndb_mgm, four ndbd, and four mysqld processes, where each ndbd and mysqld pair must run on the same node, as configured in NDB cluster-container.md.
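The topology above can be sketched as a minimal NDB Cluster `config.ini`. The data node IPs below are taken from the control-main example later in this README; the manager IP, `NoOfReplicas`, and `DataDir` are assumptions -- use the values from NDB cluster-container.md for your own containers.

```ini
[ndbd default]
NoOfReplicas=2                 # assumed replica count

[ndb_mgmd]
HostName=192.172.10.8          # manager node (assumed IP)
DataDir=/var/lib/mysql-cluster # assumed path

# Four data nodes; each also hosts a mysqld (SQL node).
[ndbd]
HostName=192.172.10.9
[ndbd]
HostName=192.172.10.10
[ndbd]
HostName=192.172.10.11
[ndbd]
HostName=192.172.10.12

[mysqld]
HostName=192.172.10.9
[mysqld]
HostName=192.172.10.10
[mysqld]
HostName=192.172.10.11
[mysqld]
HostName=192.172.10.12
```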

Example of using DepState to test a DDBMS (NDB as an example)

The following steps assume that you have already built a cluster that meets the test conditions; we use NDB as an example:

First, on all nodes, replace mysql, mysqld, mysqladmin, ndbd, ndb_mgm, and ndb_mgmd in the existing environment with the versions in the DepState folder, and make sure these files have executable permissions. The DepState folder contains the binaries after instrumentation.
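A small helper (sketch) for granting the executable permission in one pass; the directory argument is assumed to be the unpacked DepState folder on the node.

```shell
# Make every instrumented binary in the given directory executable.
# Binaries that are absent on this node are silently skipped.
make_executable() {
  BIN_DIR="$1"
  for bin in mysql mysqld mysqladmin ndbd ndb_mgm ndb_mgmd; do
    [ -f "$BIN_DIR/$bin" ] && chmod +x "$BIN_DIR/$bin"
  done
  return 0
}
# Example (path is an assumption): make_executable /root/DepState
```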

Then, on all nodes, copy log_file.txt from the DepState folder to the /home/mine-code/ directory. If this path does not exist, please create it.

docker cp DepState/log_file.txt <containerID>:/home/mine-code/

Subsequently, copy the control-main file from the DepState folder to the /home/mine-code/ directory of the manager node and the node-main file to the /home/mine-code/ directory of each data node.

docker cp DepState/control-main <containerID>:/home/mine-code/
docker cp DepState/node-main <containerID>:/home/mine-code/
...

Then, please place the instrumented binaries into the corresponding directories on each node: the manager node needs ndb_mgm and ndb_mgmd replaced, and each data node needs ndbd, mysql, mysqladmin, and mysqld replaced.

docker cp DepState/ndb_mgm <containerID>:/bin/ndb_mgm
docker cp DepState/ndb_mgmd <containerID>:/bin/ndb_mgmd
docker cp DepState/ndbd <containerID>:/bin/ndbd
docker cp DepState/mysql <containerID>:/bin/mysql
docker cp DepState/mysqladmin <containerID>:/bin/mysqladmin
docker cp DepState/mysqld <containerID>:/bin/mysqld
...
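The per-binary `docker cp` commands above can be wrapped in a small loop. This is a dry-run sketch that only prints the commands (drop the `echo` to execute them); the container names `mgm-node` and `data-node` are placeholders for your actual container IDs.

```shell
# Print the docker cp command for each instrumented binary.
# First argument: container ID/name; remaining arguments: binaries to copy.
copy_bins() {
  CONTAINER="$1"; shift
  for bin in "$@"; do
    echo "docker cp DepState/$bin $CONTAINER:/bin/$bin"  # dry run
  done
}
copy_bins mgm-node ndb_mgm ndb_mgmd                 # manager node
copy_bins data-node ndbd mysql mysqladmin mysqld    # each data node
```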

Finally, please create a mytest database in the cluster; an empty database is fine.

create database mytest;

Then you can run DepState: first run node-main on the four data nodes, then run control-main on the manager node to start testing.

On data node:

./node-main <outfile> <dryrunflag> <timeout> <sql_max> <dbname> <SERVER_PORT>

For example:
./node-main /home/mine-code/output_v32-11 1 90 8 mytest103 101
./node-main /home/mine-code/output_v32-11 0 90 8 mytest103 101
./node-main /home/mine-code/output_v32-11 0 90 8 mytest103 101
./node-main /home/mine-code/output_v32-11 0 90 8 mytest103 101

-outfile:
    Path to the output file.
-dryrunflag:
    Whether this node is responsible for database initialization (1 = yes, 0 = no).
-timeout:
    Timeout for a single SQL statement.
-sql_max:
    Number of SQL sequences included in one SQL scenario.
-dbname:
    Name of the database under test.
-SERVER_PORT:
    SQL service port number.

Only the first node is responsible for table creation, so its dryrunflag is 1 and the rest are 0.
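The rule above can be sketched as a tiny helper that builds the node-main command line for the i-th data node; the fixed argument values mirror the example invocations earlier in this section.

```shell
# Build the node-main invocation for data node i.
# Only node 1 initializes the database, so it alone gets dryrunflag=1.
node_main_cmd() {
  if [ "$1" -eq 1 ]; then DRYRUN=1; else DRYRUN=0; fi
  echo "./node-main /home/mine-code/output_v32-11 $DRYRUN 90 8 mytest103 101"
}
node_main_cmd 1   # dryrunflag=1 on the first node
node_main_cmd 2   # dryrunflag=0 on the others
```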

On manager node:

./control-main <max_test> <outfile> <max_timeout> <state> <row_or_table_len> <ip1> <ip2> <ip3> <ip4> <SERVER_PORT> <dbname> <time_num>

For example:
./control-main 10 /home/mine-code/output_v32-11 5 10 10 192.172.10.9 192.172.10.10 192.172.10.11 192.172.10.12 101 mytest103 10

-max_test:
    Number of tests to run in one invocation.
-outfile:
    Path to the output file.
-max_timeout:
    Number of execution failures tolerated before the MySQL client is restarted.
-state:
    Number of state-sequence mutations applied per SQL scenario before moving on to the next scenario.
-row_or_table_len:
    For each SQL statement, the relative probability of table-level versus column-level mutation.
-ip1:
    IP address of the 1st data node.
-ip2:
    IP address of the 2nd data node.
-ip3:
    IP address of the 3rd data node.
-ip4:
    IP address of the 4th data node.
-SERVER_PORT:
    SQL service port number.
-dbname:
    Name of the database under test.
-time_num:
    Maximum loop count.

The test results are stored in the output file.
