Collect metrics from Fedora CoreOS machines #86

The idea of a Linux distribution collecting metrics about its installed base has long been controversial.  Many Linux users do not want their operating system to phone home about the state of their system.  At the same time, it is difficult to allocate development resources effectively for an operating system whose installed base is not well understood.  Decisions often need to be made about which platforms or container runtimes to prioritize; which hardware, system services, and system or network configurations are commonly used; which corner cases need to be especially well tested during upgrades; and which third-party services impose important compatibility constraints.  Metrics can inform these decisions, which benefits the operating system and the user base as a whole.  External mechanisms for measuring the installed base, such as analysis of logs from download mirrors, are inaccurate at best; much better data can be collected from installed machines directly.

We would like to collect metrics from Fedora CoreOS systems by default.  We want to be clear about exactly what will be collected, how it will be used, and how to disable that collection.  We may also allow opting into collection of additional metrics beyond the default set.

## Background: Container Linux

Container Linux metrics are collected via the update system.  CoreUpdate collects metrics about each machine that checks in: its update channel, its state in the client state machine, what OS version is running, what version was originally installed, the OEM ID (platform) of the machine, and its checkin history.  This works, but it gives us an incomplete picture of the installed base: we receive no information about machines behind private CoreUpdate servers, machines behind a third-party update server such as [CoreRoller](https://github.com/coreroller/coreroller), or machines that have updates disabled.

## Fedora CoreOS

Fedora CoreOS will not couple metrics to its update system.  This not only allows greater freedom in the design of both, but also allows privacy-conscious users to disable metrics while continuing to receive automatic updates.  (In any event, the Cincinnati update protocol (#83) is not designed to collect client metrics.)  We will need a separate metrics-collection system, including:
- A client daemon or timer unit to perform the collection
- A cloud service to collect and aggregate metrics data

Initial metrics might include:
- Random unique identifier for deduplicating reports
- Platform (cloud environment or hypervisor)
- On bare-metal systems, a summary of hardware
- On cloud systems, the instance type
- Original OS version
- Current OS version
- Container runtimes in use
- Summary of network configuration
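
To make the list above concrete, here is a minimal sketch of what a single report could look like and how the random identifier would let the collection service deduplicate reports.  Nothing here is a decided format: the `MetricsReport` class, every field name, and the JSON encoding are assumptions made purely for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MetricsReport:
    """Hypothetical report shape; all field names are illustrative assumptions."""
    machine_id: str                          # random unique identifier
    platform: str                            # cloud environment or hypervisor
    original_os_version: str
    current_os_version: str
    instance_type: Optional[str] = None      # cloud systems only
    hardware_summary: Optional[str] = None   # bare-metal systems only
    container_runtimes: List[str] = field(default_factory=list)
    network_summary: Optional[str] = None


def new_machine_id() -> str:
    """Generate a random identifier that carries no machine-derived data."""
    return str(uuid.uuid4())


def to_json(report: MetricsReport) -> str:
    """Serialize a report; JSON is an assumed wire format."""
    return json.dumps(asdict(report), sort_keys=True)


def deduplicate(reports: List[MetricsReport]) -> List[MetricsReport]:
    """Keep only the most recent report seen for each machine_id."""
    latest = {}
    for report in reports:
        latest[report.machine_id] = report
    return list(latest.values())
```

Generating the identifier randomly, rather than deriving it from hardware or network details, keeps it useful for deduplication without revealing anything else about the machine.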

Metrics might be grouped into multiple levels, such as `minimal` and `full`, which collect different amounts of information.  The default could be `minimal`, with the option to switch to `full` or `off` by writing a config file via Ignition.  Corresponding documentation should explain what metrics are collected, how the metrics are used, and the consequences of disabling them.
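
One way to picture the levels is as a filter applied to a report before it leaves the machine.  The sketch below assumes three levels named `off`, `minimal`, and `full`, as suggested above; which fields belong to `minimal` is an arbitrary assumption here, not a proposal, and the field names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Level(Enum):
    OFF = "off"
    MINIMAL = "minimal"
    FULL = "full"


# Assumption for illustration only: the real minimal set is undecided.
MINIMAL_FIELDS = frozenset({"machine_id", "platform", "current_os_version"})


def filter_report(report: dict, level: Level) -> Optional[dict]:
    """Return the subset of a report allowed at this level, or None if off."""
    if level is Level.OFF:
        return None  # nothing is sent at all
    if level is Level.MINIMAL:
        return {k: v for k, v in report.items() if k in MINIMAL_FIELDS}
    return dict(report)  # full: send every collected field
```

Filtering on the client means that data outside the selected level never leaves the machine, which is easier to explain in documentation than server-side discarding.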

Due to time constraints, functioning metrics collection may not be included in the first release of Fedora CoreOS.  However, the first release should include at least the following:
- A configuration file
- A service that parses the config file and rejects invalid configs
- Documentation for configuring metrics collection
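
The parse-and-reject behavior of such a service could be as small as the following sketch.  The config format is not specified anywhere in this proposal: the INI syntax, the `[metrics]` section, the `level` key, and the `ConfigError` type are all hypothetical, chosen only so the example runs with the Python standard library.

```python
import configparser

VALID_LEVELS = ("off", "minimal", "full")
DEFAULT_LEVEL = "minimal"  # assumed default, matching the proposal above


class ConfigError(ValueError):
    """Raised when the metrics config is malformed or has unknown content."""


def parse_config(text: str) -> str:
    """Return the configured level, or raise ConfigError for invalid input."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        raise ConfigError(f"malformed config: {exc}") from exc

    unknown_sections = set(parser.sections()) - {"metrics"}
    if unknown_sections:
        raise ConfigError(f"unknown sections: {sorted(unknown_sections)}")
    if not parser.has_section("metrics"):
        return DEFAULT_LEVEL  # an empty config selects the default

    unknown_keys = set(parser["metrics"]) - {"level"}
    if unknown_keys:
        raise ConfigError(f"unknown keys: {sorted(unknown_keys)}")

    level = parser["metrics"].get("level", DEFAULT_LEVEL)
    if level not in VALID_LEVELS:
        raise ConfigError(f"invalid level {level!r}; expected one of {VALID_LEVELS}")
    return level
```

Rejecting unknown sections and keys, rather than ignoring them, means a typo such as `levle = off` fails loudly instead of silently leaving collection at the default.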

In other words, metrics collection should be configurable and documented from day 1.  This should reduce unpleasant surprises for the user community when the metrics infrastructure is deployed and enabled.
