Motivation
We need anonymous statistics about netdata. These statistics will be used for:
- Quality assurance, to help us understand whether netdata behaves as expected and to identify recurring issues on particular distributions or environments.
- Usage statistics, to help us focus on the parts of netdata that are used the most and to gauge the extent to which our development decisions affect the community (e.g. can we drop a certain collector? which collector is used the most and is worth optimizing? can we drop node.d.plugin? is anyone using netdata statsd? which backends are used, and by how many users?).
Collection points
These are the points at which we can collect these statistics:
- The netdata installer, updater and uninstaller. These reports will no longer exist once we distribute binary packages.
- When netdata crashes (an unlikely event).
- When netdata starts (1 minute after it starts, so that the report also covers the plugins and modules collecting data).
- When a user accesses a netdata dashboard.
Opt out
Users should be able to opt out of this data collection.
To opt out globally, users create the file /etc/netdata/.opt-out-from-anonymous-statistics containing one of the following values:
- all, to opt out of all kinds of statistics, including crash reports
- usage, to opt out of usage statistics only (points 3 and 4 above)
- user, to opt out of user usage statistics only (point 4 above)
The custom installer scripts will generate this file when given the command line option --opt-out-from-anonymous-statistics VALUE, so a user who is not willing to share anything can pass this option to the installer and no data will be shared at all.
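The opt-out rules can be sketched as a small lookup. This is only an illustration of the semantics described here, not the proposed implementation: the function names, the numbering constants and the choice to treat an unrecognized value as all are assumptions.

```python
from pathlib import Path

# Collection points, numbered in the order they are listed under "Collection points".
INSTALLER, CRASH, STARTUP, DASHBOARD = 1, 2, 3, 4

# Which points each opt-out value suppresses (derived from the text above).
SUPPRESSED = {
    "all": {INSTALLER, CRASH, STARTUP, DASHBOARD},
    "usage": {STARTUP, DASHBOARD},
    "user": {DASHBOARD},
}

OPT_OUT_FILE = Path("/etc/netdata/.opt-out-from-anonymous-statistics")


def read_opt_out(path=OPT_OUT_FILE):
    """Return the value stored in the opt-out file, or None if the file is missing."""
    try:
        return Path(path).read_text().strip().lower()
    except FileNotFoundError:
        return None


def may_report(point, opt_out):
    """Decide whether a collection point may send data for a given opt-out value."""
    if opt_out is None:
        return True  # no opt-out file: every point may report
    # Assumption: an empty or unrecognized value is treated as "all",
    # so that a typo in the file never causes data to be shared.
    return point not in SUPPRESSED.get(opt_out, SUPPRESSED["all"])
```

For example, may_report(CRASH, read_opt_out()) would tell the crash handler whether it is allowed to send a report on this machine.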
Public reports
We will publish reports based on this data. All data will be public, for everyone to see and examine, with the exception of 2 values:
- The IPs that sent the usage statistics
- The unique netdata machine IDs that sent the usage statistics
These values will be used in calculating statistics, but will be deleted from the database as soon as the calculations allow.
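One way to read "used in calculating statistics, then deleted" is to reduce the raw reports to counts and keep only the counts. The sketch below assumes each raw report is a dict with the hypothetical keys machine_id, ip and version. The proposal does not define a storage format, so this only illustrates the idea.

```python
from collections import defaultdict


def summarize(raw_reports):
    """Count distinct machine IDs per netdata version.

    The returned dict contains only version strings and counts,
    so it carries no IPs and no machine IDs.
    """
    machines_by_version = defaultdict(set)
    for report in raw_reports:
        machines_by_version[report["version"]].add(report["machine_id"])
    return {version: len(ids) for version, ids in machines_by_version.items()}
```

Once summarize() has run over a batch, the raw rows holding IPs and machine IDs are no longer needed for this statistic and can be deleted.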
Information to be collected
We should maintain a public wiki explaining the usage statistics we keep, the purpose we collect them for, and the actual reports we generate from them.
All this information will be public.
netdata info
The following information about netdata itself:
- version, as given by netdata -V
- unique netdata machine ID (random, generated when netdata starts), as stored in /var/lib/netdata/registry/netdata.public.unique.id
operating system info
- The following information from /etc/os-release:
  - NAME, like Debian GNU/Linux, Gentoo, Manjaro Linux
  - VERSION_ID, like 9
  - ID, like debian, gentoo, manjaro
  - ID_LIKE, like arch
- The following information from uname:
  - Kernel name, as given by uname -s, example Linux
  - Kernel version, as given by uname -r, example 4.19.2-1-MANJARO
  - Architecture, as given by uname -m, example x86_64
- Information about the virtualization technology:
  - The output of the command systemd-detect-virt, example none, or kvm
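A minimal sketch of gathering these fields follows. The function names and the returned key names (kernel_name and so on) are assumptions made for illustration. On Unix systems platform.uname() reports the same system, release and machine values as uname -s, uname -r and uname -m.

```python
import platform
import shlex
import shutil
import subprocess

OS_RELEASE_KEYS = ("NAME", "VERSION_ID", "ID", "ID_LIKE")


def parse_os_release(text):
    """Extract the fields of interest from the contents of /etc/os-release."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in OS_RELEASE_KEYS:
            # Values may be shell-quoted, e.g. NAME="Debian GNU/Linux".
            parts = shlex.split(value)
            info[key] = parts[0] if parts else ""
    return info


def kernel_info():
    """Kernel name, kernel version and architecture of the running system."""
    u = platform.uname()
    return {
        "kernel_name": u.system,
        "kernel_version": u.release,
        "architecture": u.machine,
    }


def detect_virt():
    """Output of systemd-detect-virt, or "unknown" if the command is unavailable."""
    exe = shutil.which("systemd-detect-virt")
    if exe is None:
        return "unknown"
    # The command exits non-zero when it prints "none", so the exit code is ignored.
    result = subprocess.run([exe], capture_output=True, text=True)
    return result.stdout.strip() or "unknown"
```

Keeping the os-release parser separate from the file read makes it easy to check against sample contents from different distributions.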
Installation reports
Installation reports (install, update, uninstall) should also report the final status of the operation (OK, FAILED, CANCELLED).
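As an illustration, an installation report could be a small record that is validated before sending. The field names and the JSON encoding below are assumptions; the proposal only fixes the three operations and the three status values.

```python
import json

VALID_ACTIONS = {"install", "update", "uninstall"}
VALID_STATUSES = {"OK", "FAILED", "CANCELLED"}


def installation_report(action, status, netdata_version):
    """Build the JSON body of an installation report (hypothetical layout)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return json.dumps(
        {"action": action, "status": status, "version": netdata_version},
        sort_keys=True,
    )
```

Rejecting unknown values at this point keeps the published reports limited to the documented statuses.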
Crash reports
Netdata daemon crash reports should provide a stack trace when possible.
Implementation
We could provide a script that will be installed as /usr/libexec/netdata/plugins.d/send-anonymous-usage-info.sh to take care of all reports, except the user one (point 4).
This script will collect and send all reports, respecting the contents of /etc/netdata/.opt-out-from-anonymous-statistics.
This script will be called by all installation scripts and netdata itself to send the information required by points 1, 2 and 3.
Netdata itself should read the contents of /etc/netdata/.opt-out-from-anonymous-statistics and expose this information to the dashboard, so that the dashboard will not send user statistics (point 4).