Right now, we estimate installed Fedora systems by counting unique IP addresses which show up in our updates mirror statistics. We need better data than that. There are some proposals for more complicated systems, but a quick thing we can do is implement a per-system UUID (unique identifier) and count that instead of IP addresses.
This is what openSUSE does — see https://metrics.opensuse.org/ for live stats. See also this previous Fedora Council discussion for background.
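To make the difference between the two counting methods concrete, here is a minimal sketch in Python. The `Hit` record and its field names are hypothetical (they are not the format of the real mirror logs), and the addresses and identifiers are made-up sample data. It only illustrates how shared and changing IP addresses skew an IP-based count while a per-system identifier does not.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """One request to the updates mirror (hypothetical record layout)."""
    ip: str
    system_uuid: str


def count_by_ip(hits: Iterable[Hit]) -> int:
    """Count systems the current way: one per unique IP address."""
    return len({h.ip for h in hits})


def count_by_uuid(hits: Iterable[Hit]) -> int:
    """Count systems the proposed way: one per unique UUID."""
    return len({h.system_uuid for h in hits})


# Made-up sample data.
hits = [
    # Three machines behind a single NAT address: IP counting sees one.
    Hit("203.0.113.5", "uuid-a"),
    Hit("203.0.113.5", "uuid-b"),
    Hit("203.0.113.5", "uuid-c"),
    # One laptop seen from two networks: IP counting sees two.
    Hit("198.51.100.7", "uuid-d"),
    Hit("192.0.2.44", "uuid-d"),
]

print("by IP:", count_by_ip(hits))
print("by UUID:", count_by_uuid(hits))
```

In this sample the two errors partly cancel out, which is one reason the IP-based estimate is hard to interpret: the total can look plausible while being wrong in both directions.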
- Name: Matthew Miller
- Email: mattdm
- Release notes owner:
- Targeted release: Fedora 30
- Last updated: 2019-01-07
- Tracker bug: <will be assigned by the Wrangler>
- A. Currently, we can only count Fedora OS use by observing IP addresses. This undercounts because of NAT, and overcounts because of short DHCP leases and laptops moving between work or school and home or coffee shop.
- B. We can count what releases are observed, but we can’t distinguish variants.
- C. We can’t count quickly because various logs are copied back to a central server and data is not consistent for several days.
- The Fedora community cares about privacy and is averse to tracking measures. We don't want to track; just count.
- For this reason, we don’t want to use any identifier like /etc/machine-id which may be used for other purposes.
- And, also for that reason, there needs to be a relatively easy way to opt out.
- This needs to work with Yum/DNF, MicroDNF, PackageKit, Cockpit, rpm-ostree, GNOME Software, Muon, Apper, and software update mechanisms used in other spins.
- We need to be able to distinguish between short-lived instances (like temporary containers or test machines) and actual installations.
- We don’t want to track users, just count systems.
- Except for distinguishing temporary installations from “real” use, we don’t need to track systems over time. We just want a daily or weekly moment-in-time count.
- Being able to see how systems are upgraded over time might be interesting but isn’t as important as privacy concerns.
- VARIANT_ID will be set in /etc/os-release. See Changes/Label Our Variants. We want that, plus VERSION_ID and machine architecture.
- We may also want each report to contain a boolean flag showing whether the system has been in use for at least 24 hours to help separately categorize test and other throw-away instances.
- openSUSE already uses a UUID in zypper; this is ground already traveled.
- Yum and DNF have built-in support for fileset variables, which can be ‘removed’ to deal with privacy issues.
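As a rough illustration of the data described above, the following Python sketch assembles the non-identifying fields of a report: VERSION_ID and VARIANT_ID from os-release content, the machine architecture, and the 24-hour flag. The function names, the dictionary keys, and the idea of a stored `first_seen` timestamp are assumptions made for this example. The actual implementation in DNF and other tools may look quite different, and the UUID itself is deliberately left out here.

```python
DAY_SECONDS = 24 * 60 * 60


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines from os-release content.

    Simplified: strips one layer of surrounding quotes and ignores
    shell-style escapes, which is enough for typical values.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def build_report(os_release_text: str, arch: str,
                 first_seen: float, now: float) -> dict:
    """Build the countable fields of one report (hypothetical layout).

    first_seen and now are Unix timestamps; how first_seen would be
    recorded on a real system is not decided by this sketch.
    """
    fields = parse_os_release(os_release_text)
    return {
        "version_id": fields.get("VERSION_ID", ""),
        "variant_id": fields.get("VARIANT_ID", ""),
        "arch": arch,
        # True once the system has been in use for at least 24 hours.
        "in_use_24h": (now - first_seen) >= DAY_SECONDS,
    }


# Made-up os-release content for illustration.
sample = 'NAME=Fedora\nVERSION_ID=30\nVARIANT_ID=workstation\n'
report = build_report(sample, "x86_64", first_seen=0.0, now=90000.0)
print(report)
```

On a real system the text would come from reading /etc/os-release, and the architecture could be taken from something like `platform.machine()`. A system without VARIANT_ID set simply reports an empty variant, which is why spin maintainers need to set it.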
Benefit to Fedora
- Better metrics overall
- Public stats page updated automatically
- Better knowledge of relative use of different variants
- Insight into Fedora's use in short-lived test systems and temporary containers vs. longer-term installations
- Proposal owners: work with DNF team and infrastructure to implement the UUID feature and corresponding backend data collection
- DNF team: feature work
- Maintainers of other package management tools: make sure feature works in these cases as well
- Other developers: Spin maintainers should make sure that VARIANT_ID is being set in /etc/os-release
- Release engineering: #Releng issue number (a check of an impact with Release Engineering is needed) -- should have no impact
- List of deliverables: affects all deliverables
- Policies and guidelines: none
- Trademark approval: none
Older versions will not have the UUID counting enabled; we will keep collecting stats in the traditional way for those systems.
How To Test
Once the system is in place, we will see data collected.
User experience will not change. Users who wish to opt out of counting will have an easy way to do so.
- Contingency mechanism: continue counting the old way
- Contingency deadline: does not block release; we can ship with the feature incomplete, although it would certainly be most useful to have it available at GA
- Blocks release? No
- Blocks product? No
Release notes need to be written, along with documentation describing how to opt out.
This needs to be written but depends on exact implementation.