glynnbird/cloudantimport

cloudantimport

Introduction

When populating Cloudant databases, the source data often starts life as a file of JSON documents.

cloudantimport is designed to help import such data into Cloudant efficiently. Pipe a file of JSON documents into cloudantimport, tell it which database to write to, and it will group the documents into batches and send them using Cloudant's bulk write API.

Installation

You will need to download and install the Go toolchain. Clone this repo, then:

go build ./cmd/cloudantimport

Then copy the resulting cloudantimport binary (or cloudantimport.exe on Windows) into a directory in your PATH.

Configuration

cloudantimport authenticates with your chosen Cloudant service using environment variables, e.g.

CLOUDANT_URL=https://xxxyyy.cloudantnosqldb.appdomain.cloud
CLOUDANT_APIKEY="my_api_key"

Usage

Pipe a JSON file (one document per line) into cloudantimport and supply the database you want to write to using the --dbname/--db parameter:

cat myfile.json | cloudantimport --db mydb

By default, only one bulk write API call is in flight at any one time. This can be increased with the --concurrency/--c option:

# import data with a maximum of 5 bulk write API calls in flight at once
cat myfile.json | cloudantimport --db mydb --concurrency 5
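One common way to cap the number of in-flight bulk writes in Go is a buffered channel used as a semaphore. The sketch below is an illustration of that pattern under assumed names (`upload` stands in for the real bulk write call); it is not cloudantimport's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// upload runs all batches concurrently, but the buffered channel sem
// ensures at most `limit` uploads are in flight at once. It returns
// the total number of docs "written" so the effect is observable.
func upload(batchSizes []int, limit int) int {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var mu sync.Mutex
	total := 0
	for _, n := range batchSizes {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` uploads are in flight
		go func(n int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// a real implementation would POST {"docs":[...]} here
			mu.Lock()
			total += n
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(upload([]int{500, 500, 165}, 5)) // prints 1165
}
```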

Generating random data

cloudantimport can be paired with datamaker to generate any amount of sample data:

# template ---> datamaker ---> 100 JSON docs ---> cloudantimport ---> Cloudant
echo '{"_id":"{{uuid}}","name":"{{name}}","email":"{{email true}}","dob":"{{date 1950-01-01}}"}' | datamaker -f json -i 100 | cloudantimport --db people
written {"docCount":100,"successCount":1,"failCount":0,"statusCodes":{"201":1}}
written {"batch":1,"batchSize":100,"docSuccessCount":100,"docFailCount":0,"statusCodes":{"201":1},"errors":{}}
Import complete

or with the template as a file:

cat template.json | datamaker -f json -i 10000 | cloudantimport --db people

Understanding the output

The output comes in two parts. First, one line per bulk write request is written to stderr:

2025/11/20 09:51:49 201 176 500 0
2025/11/20 09:51:49 201 165 500 0
2025/11/20 09:51:50 201 165 500 0

This shows the date/time, HTTP status code, latency (ms), the number of documents successfully written, and the number that failed.

Then at the end comes a summary to stdout:

{"statusCodes":{"201":20},"errors":{"conflict":10},"docs":9990,"batches":20}

which lists counts of each HTTP status code, counts of document write errors, the total number of docs written, and the total number of bulk write API calls made.

How does it work?

To remind myself of what's going on, this diagram helps:

[architecture diagram]
