A micro multi-website analytics database service designed to be fast and robust, built with Go and SQLite.
Analytics databases tend to grow fast and exponentially. Requesting data for one specific website from a single database thus become very slow over time. But analytics data are highly decoupled between two websites.
The idea behind µAnalytics is to shard your analytics data on a key, which is usually a website name. Each shard thus only contains a specific website data, allowing faster response times and easy horizontal scaling.
To handle requests even faster, µAnalytics automatically manages a pool of connections to multiple shards at a time.
By default, the service keeps 10 connections alive.
But you can easily increase/decrease the max number of alive shards with the --connections flag when launching the app.
$ go install github.com/GitbookIO/micro-analyticsTo launch the application, simply run:
$ ./micro-analytics
The command takes the following optional parameters:
| Parameter | Environment Variable | Usage | Type | Default Value |
|---|---|---|---|---|
--user, -u |
MA_USER |
Username for basic auth | String | "" |
--password, -w |
MA_PASSWORD |
Password for basic auth | String | "" |
--port, -p |
MA_PORT |
Port to listen on | String | "7070" |
--root, -r |
MA_ROOT |
Database directory | String | "./dbs" |
--connections, -c |
MA_POOL_SIZE |
Max number of alive shards connections | Number | 1000 |
--idle-timeout, -i |
MA_POOL_TIMEOUT |
Idle timeout for DB connections in seconds | Number | 60 |
--cache-directory, -d |
MA_CACHE_DIR |
Cache directory | String | ".diskache" |
If --user is provided, the service will automatically use basic access authentication on all requests.
The actual cache directory will be a subdirectory named after the app major version. The default will then be ./.diskache/0.
All shards of the µAnalytics database share the same TABLE schema:
CREATE TABLE visits (
time INTEGER,
event TEXT,
path TEXT,
ip TEXT,
platform TEXT,
refererDomain TEXT,
countryCode TEXT
)Every query for a specific website can be executed using a time range. Every following GET request thus takes the two following optional query string parameters:
| Name | Type | Description | Default | Example |
|---|---|---|---|---|
start |
Date | Start date to query a range | none | "2015-11-20T12:00:00.000Z" |
end |
Date | End date to query a range | none | "2015-11-21T12:00:00.000Z" |
The dates can be passed either as:
- ISO (RFC3339)
"2015-11-20T12:00:00.000Z" - UTC (RFC1123)
"Fri, 20 Nov 2015 12:00:00 GMT" - A Unix timestamp as a String
"1448020800"
| Name | Type | Description | Default | Example |
|---|---|---|---|---|
unique |
Boolean | Include the total number of unique visitors in response | none | true |
Except for GET /:website, every response to a GET request will contain the two following values:
| Name | Type | Description |
|---|---|---|
total |
Integer | Total number of visits |
unique |
Integer | Total number of unique visitors based on ip, set to 0 unless unique=true is passed as a query string parameter |
Returns the full analytics for a website.
{
"list": [
{
"time": "2015-11-25T16:00:00+01:00",
"event": "download",
"path": "/somewhere",
"ip": "127.0.0.1",
"platform": "Windows",
"refererDomain": "gitbook.com",
"countryCode": "fr"
},
...
]
}Returns the count of analytics for a website. The unique query string parameter is not necessary for this request.
{
"total": 1000,
"unique": 900
}Returns the number of visits per countryCode.
label contains the country full name.
{
"list": [
{
"id": "fr",
"label": "France",
"total": 1000,
"unique": 900
},
...
]
}Returns the number of visits per platform.
{
"list": [
{
"id": "Linux",
"label": "Linux",
"total": 1000,
"unique": 900
},
...
]
}Returns the number of visits per refererDomain.
{
"list": [
{
"id": "gitbook.com",
"label": "gitbook.com",
"total": 1000,
"unique": 900
},
...
]
}Returns the number of visits per event.
{
"list": [
{
"id": "download",
"label": "download",
"total": 1000,
"unique": 900
},
...
]
}Returns the number of visits as a time serie. The interval in seconds can be specified as an optional query string parameter. Its default value is 86400, equivalent to one day.
| Name | Type | Description | Default | Example |
|---|---|---|---|---|
interval |
Integer | Interval of the time serie | 86400 (1 day) |
3600 |
Example with interval set to 3600:
{
"list": [
{
"start": "2015-11-24T12:00:00.000Z",
"end": "2015-11-24T13:00:00.000Z",
"total": 450,
"unique": 390
},
{
"start": "2015-11-24T13:00:00.000Z",
"end": "2015-11-24T14:00:00.000Z",
"total": 550,
"unique": 510
},
...
]
}Insert new data for the specified website.
{
"time": "2015-11-24T13:00:00.000Z", // optional
"event": "download",
"ip": "127.0.0.1",
"path": "/README.md",
"headers": {
// ...
// HTTP headers received from your visitor
}
}The time parameter is optional and is set to the date of your POST request by default.
Passing the HTTP headers in the POST body allows the service to extract the refererDomain and platform values.
The countryCode will be deduced from the passed ip parameter using Maxmind's GeoLite2 database.
Insert a list of analytics for a specific website. The analytics can be sent directly in DB format, with time being a String value.
time can be passed as either:
- ISO (RFC3339)
"2015-11-20T12:00:00.000Z" - UTC (RFC1123)
"Fri, 20 Nov 2015 12:00:00 GMT" - A Unix timestamp as a String
"1448020800"
If the time parameter is not provided, it will be defaulted to the exact time of the server processing the POST request.
As for the POST /:website method, the analytics can also have an optional headers parameter.
If the refererDomain and/or platform values are not passed in the JSON body, the headers parameter will be used to set these values automatically.
{
"list": [
{
"time": "1450098642",
"ip": "127.0.0.1",
"event": "download",
"path": "/somewhere",
"platform": "Apple Mac",
"refererDomain": "www.gitbook.com",
"countryCode": "fr"
},
{
"time": "2015-11-20T12:00:00.000Z",
"ip": "127.0.0.1",
"event": "login",
"path": "/someplace",
"headers": {
// ...
// HTTP headers received from your visitor
}
}
]
}The countryCode will be reprocessed by the service using GeoLite2 based on the ip.
Insert a list of analytics for different websites. The analytics have the same format as POST /:website/bulk, with a mandatory website parameter.
{
"list": [
{
"website": "website-1",
"time": "1450098642",
"ip": "127.0.0.1",
"event": "download",
"path": "/somewhere",
"platform": "Apple Mac",
"refererDomain": "www.gitbook.com",
"countryCode": "fr"
},
{
"website": "website-2",
"time": "2015-11-20T12:00:00.000Z",
"ip": "127.0.0.1",
"event": "login",
"path": "/someplace",
"headers": {
// ...
}
}
]
}Fully delete a shard from the file system.
