When populating Cloudant databases, the data often starts out as JSON documents in a file.
cloudantimport is designed to import such data into Cloudant efficiently. Pipe a file of JSON documents into cloudantimport, tell it which database to write to, and it will group the documents into batches and send them using Cloudant's bulk import API.
You will need to download and install the Go compiler. Clone this repo, then run:
go build ./cmd/cloudantimport
Then copy the resulting binary, cloudantimport (or cloudantimport.exe on Windows systems), into a directory on your PATH.
cloudantimport authenticates with your chosen Cloudant service using environment variables, as documented here, e.g.
CLOUDANT_URL=https://xxxyyy.cloudantnosqldb.appdomain.cloud
CLOUDANT_APIKEY="my_api_key"
Pipe a JSON file (one document per line) into cloudantimport and supply the database you want to write to using the --dbname/--db parameter:
cat myfile.json | cloudantimport --db mydb
By default, only one bulk write API call is in flight at any one time. This can be increased with the --concurrency/--c option:
# import data with a maximum of 5 bulk write API calls in flight at once
cat myfile.json | cloudantimport --db mydb --concurrency 5
cloudantimport can be paired with datamaker to generate any amount of sample data:
# template ---> datamaker ---> 100 JSON docs ---> cloudantimport ---> Cloudant
echo '{"_id":"{{uuid}}","name":"{{name}}","email":"{{email true}}","dob":"{{date 1950-01-01}}"}' | datamaker -f json -i 100 | cloudantimport --db people
written {"docCount":100,"successCount":1,"failCount":0,"statusCodes":{"201":1}}
written {"batch":1,"batchSize":100,"docSuccessCount":100,"docFailCount":0,"statusCodes":{"201":1},"errors":{}}
Import complete
or with the template as a file:
cat template.json | datamaker -f json -i 10000 | cloudantimport --db people
The output comes in two parts. Firstly, one line per bulk write request is written to stderr:
2025/11/20 09:51:49 201 176 500 0
2025/11/20 09:51:49 201 165 500 0
2025/11/20 09:51:50 201 165 500 0
This shows the date/time, HTTP status code, latency (ms), number of documents successfully written and the number that failed.
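If you want to post-process the stderr lines, for example to chart latency, a parser along these lines may help. It is a sketch that assumes the six whitespace-separated fields shown above; the `requestLog` type and `parseRequestLog` function are hypothetical names, not part of cloudantimport.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// requestLog holds the fields of one per-request stderr line.
// (Hypothetical type for illustration only.)
type requestLog struct {
	When      time.Time
	Status    int // HTTP status code
	LatencyMS int // request latency in milliseconds
	Written   int // documents successfully written
	Failed    int // documents that failed
}

// parseRequestLog parses a line such as "2025/11/20 09:51:49 201 176 500 0".
func parseRequestLog(line string) (requestLog, error) {
	fields := strings.Fields(line)
	if len(fields) != 6 {
		return requestLog{}, fmt.Errorf("expected 6 fields, got %d", len(fields))
	}
	when, err := time.Parse("2006/01/02 15:04:05", fields[0]+" "+fields[1])
	if err != nil {
		return requestLog{}, fmt.Errorf("bad timestamp: %w", err)
	}
	// The remaining four fields are all integers.
	nums := make([]int, 4)
	for i, f := range fields[2:] {
		n, err := strconv.Atoi(f)
		if err != nil {
			return requestLog{}, fmt.Errorf("bad number %q: %w", f, err)
		}
		nums[i] = n
	}
	return requestLog{
		When:      when,
		Status:    nums[0],
		LatencyMS: nums[1],
		Written:   nums[2],
		Failed:    nums[3],
	}, nil
}

func main() {
	rec, err := parseRequestLog("2025/11/20 09:51:49 201 176 500 0")
	if err != nil {
		panic(err)
	}
	fmt.Printf("status=%d latency=%dms written=%d failed=%d\n",
		rec.Status, rec.LatencyMS, rec.Written, rec.Failed)
}
```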
Then at the end comes a summary to stdout:
{"statusCodes":{"201":20},"errors":{"conflict":10},"docs":9990,"batches":20}
which lists the counts of each HTTP status code, counts of document write errors, total docs written and total number of bulk write API calls.
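Because the summary is a single JSON object on stdout, it is straightforward to consume from another program. The sketch below decodes the example summary shown above; the `summary` struct, `parseSummary` and `failedDocs` are hypothetical names, with field tags taken from the keys in that example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary mirrors the keys in the example summary line.
// (Hypothetical type for illustration only.)
type summary struct {
	StatusCodes map[string]int `json:"statusCodes"` // HTTP status code -> request count
	Errors      map[string]int `json:"errors"`      // error name -> document count
	Docs        int            `json:"docs"`        // total documents written
	Batches     int            `json:"batches"`     // total bulk write API calls
}

// parseSummary decodes one summary line.
func parseSummary(data []byte) (summary, error) {
	var s summary
	err := json.Unmarshal(data, &s)
	return s, err
}

// failedDocs totals the per-error document counts.
func (s summary) failedDocs() int {
	total := 0
	for _, n := range s.Errors {
		total += n
	}
	return total
}

func main() {
	line := []byte(`{"statusCodes":{"201":20},"errors":{"conflict":10},"docs":9990,"batches":20}`)
	s, err := parseSummary(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("written=%d failed=%d batches=%d\n", s.Docs, s.failedDocs(), s.Batches)
}
```

In the example summary, 9990 written documents plus 10 conflicts account for all 10000 generated documents, sent in 20 batches of 500.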
To remind myself of what's going on, this diagram helps:
