Skip to content

Add backend functionality necessary for graphing DHCP stats#3766

Merged
lunkwill42 merged 12 commits intomasterfrom
dhcpstats-backend
Feb 12, 2026
Merged

Add backend functionality necessary for graphing DHCP stats#3766
lunkwill42 merged 12 commits intomasterfrom
dhcpstats-backend

Conversation

@jorund1
Copy link
Copy Markdown
Collaborator

@jorund1 jorund1 commented Feb 6, 2026

Scope and purpose

First part of fix for #2373 (partly; see #3633 as well).

This pull request

Refactors #3397 by:

  • Moving generic dhcpstats functionality away from kea_dhcp.py and into common.py
  • Consolidating much of this generic dhcpstats functionality into a DhcpPath class that contains most data about a dhcpstats metric. Data contained in this class is e.g. <server-name>, <first-ip>, <last-ip>, <allocation-type>, <group-name> (but, perhaps surprisingly, not metric names such as assigned, total, etc.). The DhcpPath class makes it easy to convert to/from dhcpstats Graphite metric paths, which will be used in up-and-coming frontend functionality.
  • Making the kea_dhcp.py client also start collecting stats about unassigned addresses, so that the frontend can show this without needing to do arithmetic on total and assigned addresses, which would be inflexible (for consumers of Graphite stats) and also either complicated (when done right) or brittle (when done in a simple way).
  • Changing the format of dhcpstats Graphite metric paths from
    nav.dhcp.<ip-version>.pools.<server-name>.<pool-name>.<first-ip>.<last-ip>.{assigned,total,declined}
    
    to the more explicit format of
    nav.dhcp.servers.<server-name>.{range,pool,subnet}.{custom_groups,special_groups}.<group-name>.<ip-version>.<first-ip>.<last-ip>.{assigned,total,declined,unassigned}
    
    which first and foremost allows for
    • using the third component in the path (which here is servers) as a namespace,
    • differentiating between a range of addresses (which cannot contain gaps), a pool of addresses (which can contain gaps), and a subnet; three different kinds of fidelity a DHCP server may serve its statistics in.
    • namespacing group-names (previously pool-names) into either custom-groups for group-names sourced externally or special-groups for group-names generated by NAV e.g. when a group-name is missing.
  • Expecting [server_<server-name>] sections in dhcpstats.conf instead of [endpoint_<server-name>] for sake of consistency (the new code has backwards-compability with the old naming-scheme)
  • Renaming the user_context_poolname_key option in dhcpstats.conf (used by kea_dhcp.py) to user_context_groupname_key and also changing its default value from "name" to "group" (the new code does not have backwards-compability with the old naming-scheme; NAV works fine when this option is absent, but old configurations with relied on the default value of "name" must now explicitly set it since the default value is now "group")

Contributor Checklist

  • Added a changelog fragment for towncrier
  • Added/amended tests for new/changed code
  • Added/changed documentation see Add documentation for DHCP stats #3785
  • Linted/formatted the code with ruff
  • Wrote the commit message so that the first line continues the sentence "If applied, this commit will ...", starts with a capital letter, does not end with punctuation
  • Based this pull request on the correct upstream branch: For a patch/bugfix affecting the latest stable version, it should be based on that version's branch (<major>.<minor>.x). For a new feature or other additions, it should be based on master.
  • If applicable: Created new issues if this PR does not fix the issue completely/there is further work to be done
  • If it's not obvious from a linked issue, described how to interact with NAV in order for a reviewer to observe the effects of this change first-hand (commands, URLs, UI interactions) see Add functionality to graph DHCP stats on each VLAN page and Prefix page in NAV  #3633
  • If this results in changes in the UI: Added screenshots of the before and after
  • If this adds a new Python source code file: Added the boilerplate header to that file

This way, we can pass wildcard strings such as "*" and "{foo,bar}" to graphite
metric template functions and be assured that they won't be escaped to "_" and
"_foo_bar_".
@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 6, 2026

Test results

    21 files      21 suites   24m 57s ⏱️
 2 904 tests  2 904 ✅ 0 💤 0 ❌
16 272 runs  16 272 ✅ 0 💤 0 ❌

Results for commit bb55cb4.

♻️ This comment has been updated with latest results.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 87.58170% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.26%. Comparing base (40e206f) to head (7413313).
⚠️ Report is 96 commits behind head on master.

Files with missing lines Patch % Lines
python/nav/bin/dhcpstats.py 41.66% 7 Missing ⚠️
python/nav/dhcpstats/common.py 91.02% 7 Missing ⚠️
python/nav/dhcpstats/kea_dhcp.py 91.22% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3766      +/-   ##
==========================================
+ Coverage   63.14%   63.26%   +0.11%     
==========================================
  Files         614      615       +1     
  Lines       45502    45631     +129     
  Branches       43       43              
==========================================
+ Hits        28734    28870     +136     
+ Misses      16758    16751       -7     
  Partials       10       10              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jorund1
Copy link
Copy Markdown
Collaborator Author

jorund1 commented Feb 6, 2026

Seems like there's still some lint in the commits that I didn't catch; hopefully it's fine for me to force-push a rebase that fixes this..

Copy link
Copy Markdown
Member

@lunkwill42 lunkwill42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good refactor.

I'm a bit concerned that the metric paths have changed again, and what effect this will have on someone who's already been using this. That being said, since there has been no GUI so far, it is unlikely anyone has made use of it. How much work would it be to migrate existing data if there were any? I.e. would a guide to migration take a lot to write?

That being said, I approve of merging this as-is. I do want, however, that you make follow-up issues for the inline things I've noted 😄

Comment on lines +167 to +170
group_name_source = "special_groups"
group_name = "standalone"
else:
group_name_source = "custom_groups"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking: These "magic" strings appear all over the code and should probably be defined as constants instead - if they ever need to change, you only need to do it in one place.

They are also not the only "magic" strings that appear in the code...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to do this without worsening legibility too much would be to have two enums,

class StandaloneGroup(Enum):
    NAME = "standalone"
    NAME_SOURCE = "special_groups"

class CustomGroup(Enum):
    NAME_SOUCE = "custom_groups"

# ...

@dataclass(frozen=True, order=True)
class DhcpPath:
    # ...
    group_name_source: Literal[StandaloneGroup.NAME_SOURCE, CustomGroup.NAME_SOURCE]
    group_name: str
    # ...     

    def is_standalone(self):
        return (
            self.group_name_source == StandaloneGroup.NAME_SOURCE
            and self.group_name == StandaloneGroup.NAME
        )

But then I think it's best to wait until Python 3.11 so that we can subclass StrEnum instead such that DhcpPath.group_name can continue to be typed as a string. In the meantime, I don't think the original problem is too dire since DhcpPath.group_name_source is type checked to only accept the two "magic" strings. Also, DchpPath.group_name_source is not expected to be used outside of the class itself (e.g. neither classmethod has it as a parameter); the only reason it's not named as a private attribute is for symmetry with the graphite paths to be had.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorund1
Copy link
Copy Markdown
Collaborator Author

jorund1 commented Feb 12, 2026

I'm a bit concerned that the metric paths have changed again, and what effect this will have on someone who's already been using this. That being said, since there has been no GUI so far, it is unlikely anyone has made use of it. How much work would it be to migrate existing data if there were any? I.e. would a guide to migration take a lot to write?

There's a migration guide in #3785 (https://github.com/jorund1/nav/blob/dhcpstats-documentation/doc/howto/migrate-dhcpstats.rst)

@lunkwill42
Copy link
Copy Markdown
Member

Great - but the new fixup commit prevents us from merging. Will you rebase, @jorund1 ?

Before, we only needed to translate from runtime data to Graphite paths, so we
only needed the metric_path_for_dhcp() function. But now we'll need to start
translating from Graphite paths back to runtime data because the Graphite paths
contains the <first_ip> and <last_ip> segments used to decide which
VLANs/Prefixes a stat belongs to. Further, the Graphite paths also contains the
<server_name>, <allocation_type>, <ip_version>, <group_name>, and
<group_name_source> segments used to decide which graph on that specific
VLAN/Prefix page a stat should be displayed in.

Thus this commit adds a new class, DhcpPath, which basically has the purpose of
containing runtime path data and translating between path data sourced from
external sources (DHCP server APIs), runtime data, and Graphite paths, like so:

  external path data ---> runtime path data <---> Graphite path
     (DHCP server)           (DhcpPath)            (Graphite)
More specifically, use the class to do the following translations:

  external path data ---> runtime path data ---> Graphite path
     (Kea API)               (DhcpPath)           (Graphite)

Thus delegating validity checks and transformations.

Also, this commit renames some variables and comments to stay consistent
with the variables and comments used in DhcpPath.
Because we didn't previously test for the case when a Kea DHCP configuration
file *does* contains a pool with a user-context object but that *does not*
contain the "group" key wanted by the kea_dhcp.py client.
This is part of simplifying the terminology used in the up-and-coming
documentation and configuration files for dhcpstats and its frontend;
terminology in the implementation and terminology used in the (user) interface
termonology should diverge as little as possible.

The documentation has more or less used the terms 'endpoint' and '(DHCP) server'
interchangeably. We fix this so that only the term '(DHCP) server' is used.
Where the user-facing interfaces are affected, backwards compability is added to
avoid the most dramatical errors.
Originally, we stored only these three stats:

- 'total' addresses,
- 'assigned' addresses, and
- 'declined' addresses.

When creating graphs, we want to show 'total', 'assigned', and also 'unassigned'
addresses (perhaps also 'declined' in the future). The thought was originally to
use Graphite functions to subtract 'assigned' from 'total' to get 'unassigned'
addresses when creating graphs. However, Graphite does not handle very well the
case where 'total' is missing some data but 'assigned' is not, or vice versa:

- Subtracting a number N from 'null' or vice versa in Graphite yields N, not
  'null'.
- If there are no data for 'total' at all (which could be the case for some
  DHCPv6 APIs), we obviously shouldn't then try to create 'unassigned' out of
  'total' and 'assigned'. Thus the need to check for the existence of data for
  'total'. This requires an extra round-trip to Graphite if the check is done in
  Python code, and is unnecessarily complicated to check if the check is done
  using Graphite-functions (the latter method is required when monitoring
  Graphite metrics based purely off of a Graphite SeriesList expression).

This change will increase the amount of storage needed for DHCP data in Graphite
by 33%, or about 37KiB [^1] per tracked range/pool/subnet.

[^1]: Assuming each timeseries value occupy 10 bytes, the calculation is
      ((((60*24*7) / 5) + ((60*24*12) / 30) + ((60*24*50) / 120) + 600) * 10)
We really never need get-config-hash to work; it's just used to spare one
round-trip. Catching dhcpstats.errors.CommunicationError means we now should be
gracefully handling any API errors and any HTTP errors; a superset of what we
did earlier.
This type represents a graphite metric; a type which is useful
in other modules than just dhcpstats/kea_dhcp.py.
This fix allows HTTPS schemes with one or more uppercased letters to also be
detected.
This is mostly so that we're able to support HTTP Basic Authentication in the
kea_dhcp.py client where the password is the empty string.
@sonarqubecloud
Copy link
Copy Markdown

@lunkwill42 lunkwill42 merged commit 9b67897 into master Feb 12, 2026
16 checks passed
@lunkwill42 lunkwill42 deleted the dhcpstats-backend branch February 17, 2026 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants