Improve readability of lockfiles #58

Merged: victor-linroth-sensmetry merged 19 commits into main from vl/readable-lockfiles on Nov 7, 2025

@victor-linroth-sensmetry
Collaborator

@victor-linroth-sensmetry victor-linroth-sensmetry commented Oct 14, 2025

Summary

The current lockfile structure is difficult to inspect manually and can potentially
produce noisy diffs in version control. These updates aim to make the lockfile easier
to understand and audit.

  • Using toml_edit for manual serializing, allowing inline tables and bracketed
    (multiline) lists.
  • Decreased nesting. The general info and meta entries are too deeply nested
    to be displayed well inline so they were replaced by specific entries for
    name, version, exports* and usages.
  • Adding a comment at the top of the lockfile explaining that it is
    automatically generated and not intended to be edited manually. (To provide
    extra clarity for the less technical users.)
  • Allow canonicalization, which includes sorting the various lists (for more
    readable diffs in version control).
  • Reordering the entries roughly based on what is interesting to a user, e.g.
    checksum goes last since you're unlikely to want to visually inspect that.
  • Changing the name of iris to identifiers.

*exports are the top-level symbols exported by the project and are taken from the
index in .meta.json. They were included because in a way these are the true "names"
of a project as seen from KerML/SysML, and can be part of lockfile validation (and
maybe also dependency solving at some point).

Example

A project now fits under a single entry, for example:

[[project]]
name = "PLEML"
version = "0.5.0"
exports = [
    "PLEML",
    "Product Line Engineering Modeling Language",
]
identifiers = [
    "urn:kpar:pleml",
]
usages = [
    { resource = "urn:kpar:analysis-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:cause-and-effect-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:data-type-library", version_constraint = "^1.0.0" },
    { resource = "urn:kpar:function-library", version_constraint = "^1.0.0" },
    { resource = "urn:kpar:geometry-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:metadata-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:quantities-and-units-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:requirement-derivation-library", version_constraint = "^2.0.0" },
    { resource = "urn:kpar:semantic-library", version_constraint = "^1.0.0" },
    { resource = "urn:kpar:systems-library", version_constraint = "^2.0.0" },
]
sources = [
    { remote_kpar = "https://beta.sysand.org/fae1dcf670974c6a51808908c0595550407f0152244f53a7d51538765f1b01de/0.5.0.kpar", remote_kpar_size = 3311 },
]
checksum = "fd1743a85695eece48f2184b5a16749a38d1b5b82e61531c9349c32d08ac5e5c"
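
As a rough illustration of the layout and sorting described in the summary, here is a std-only Rust sketch that renders a simplified entry. It is not the toml_edit-based code from this PR: the `Project` struct, `render_list` and `render_project` are hypothetical names, only a subset of the fields is shown, and string escaping relies on Rust's `{:?}` formatting, which only coincides with TOML basic strings for plain text.

```rust
/// Hypothetical, simplified stand-in for a lockfile project entry.
pub struct Project {
    pub name: String,
    pub version: String,
    pub exports: Vec<String>,
    pub identifiers: Vec<String>,
    pub checksum: String,
}

/// Render a list as a bracketed, one-item-per-line array.
/// The items are sorted first so the output does not depend on the
/// order in which they were collected (this keeps diffs small).
fn render_list(key: &str, list: &[String]) -> String {
    let mut sorted = list.to_vec();
    sorted.sort();
    let mut out = format!("{key} = [\n");
    for item in sorted {
        // Assumption: `{:?}` escaping is good enough for plain text.
        out.push_str(&format!("    {item:?},\n"));
    }
    out.push_str("]\n");
    out
}

/// Render one `[[project]]` entry, with the checksum placed last.
pub fn render_project(p: &Project) -> String {
    let mut out = String::from("[[project]]\n");
    out.push_str(&format!("name = {:?}\n", p.name));
    out.push_str(&format!("version = {:?}\n", p.version));
    out.push_str(&render_list("exports", &p.exports));
    out.push_str(&render_list("identifiers", &p.identifiers));
    out.push_str(&format!("checksum = {:?}\n", p.checksum));
    out
}
```

Because every list is sorted before rendering, two lockfiles generated from the same set of projects should differ only where the underlying data differs.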

Motivation

Lockfile project entries have been evaluated for inclusion and ordering based on
the following points:

Readability: Relates to the user's experience of manually inspecting the lockfile
directly, or of looking at diffs from version control.

Validation: There are two stages at which validation can be considered:
after lockfile generation and when loading a lockfile for syncing. Currently the
lockfile generation process is not that complex, but as it gets more complex it
can be good to have a validation step at the end to catch bugs early. An externally
loaded lockfile has no guarantees of validity and should be checked in order to
"fail fast" and hopefully produce better error messages. When loading a lockfile
it is not always the case that all relevant project files will be immediately
available, so it can be advantageous if (some) validation can be performed with only
the contents of the lockfile. Validation can also be used as part of testing.

Reproducibility: The ability to ensure that the same model is loaded for each
user. Note that this may not always be desired, for example when developing multiple
projects in conjunction. For this reason checksum should probably be made optional,
but that change can be added at a later time.

  • name
    • Readability: Only really serves the role of a header for the project table
      and as a potential reference in error messages.
  • version
    • Readability: Shows version changes in diffs for lockfile updates.
    • Validation: Together with identifiers will be used to check that a usage is
      satisfied.
  • exports
    • Readability: Typically these will be the packages exported by the project,
      and as such are the real "names" from a KerML/SysML perspective.
    • Validation: As there is no way for tools to sensibly handle name collisions
      in a standard-compliant way, we should make sure there is no overlap between
      projects.
  • identifiers
    • Readability: These are the "names" used when adding dependencies.
    • Validation: Together with version will be used to check that a usage is satisfied.
    • Reproducibility: Technically required for Sysand to install a dependency
      (so not required for the current project).
  • usages
    • Readability: Gives the reader an idea of which projects are bringing
      in dependencies. In particular, shows if a version update changes dependencies.
    • Validation: Each usage will have to be satisfied by a project in the lockfile.
  • sources
    • Readability: For auditing the source of the project.
    • Reproducibility: Needed for finding dependencies in a reproducible fashion.
  • checksum
    • Reproducibility: For checking that you have the same project as was used when
      the lockfile was generated.
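
The "each usage will have to be satisfied" rule can be sketched using only the contents of the lockfile. The following is a minimal std-only Rust sketch under stated assumptions, not the validation code from this PR: the type and function names are hypothetical, and only caret constraints of the form `^MAJOR.MINOR.PATCH` (as in the example entry) are handled, using the usual semver caret semantics.

```rust
/// Hypothetical, simplified lockfile types for this sketch.
pub struct Usage {
    pub resource: String,
    pub version_constraint: String,
}

pub struct Project {
    pub name: String,
    pub version: String,
    pub identifiers: Vec<String>,
    pub usages: Vec<Usage>,
}

/// Parse a plain `MAJOR.MINOR.PATCH` version (no pre-release or build tags).
fn parse_version(text: &str) -> Option<(u64, u64, u64)> {
    let mut parts = text.split('.').map(|part| part.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Check a version against a caret constraint such as `^2.0.0`.
/// Anything that is not a caret constraint is treated as unsatisfied.
fn satisfies_caret(version: &str, constraint: &str) -> bool {
    let Some(req) = constraint.strip_prefix('^').and_then(parse_version) else {
        return false;
    };
    let Some(ver) = parse_version(version) else {
        return false;
    };
    // The leftmost non-zero component of the requirement must not change.
    let compatible = if req.0 > 0 {
        ver.0 == req.0
    } else if req.1 > 0 {
        ver.0 == 0 && ver.1 == req.1
    } else {
        ver == req
    };
    compatible && ver >= req
}

/// Return one message per usage that no project in the lockfile satisfies.
pub fn unsatisfied_usages(projects: &[Project]) -> Vec<String> {
    let mut problems = Vec::new();
    for project in projects {
        for usage in &project.usages {
            let satisfied = projects.iter().any(|candidate| {
                candidate.identifiers.contains(&usage.resource)
                    && satisfies_caret(&candidate.version, &usage.version_constraint)
            });
            if !satisfied {
                problems.push(format!(
                    "{}: no project satisfies {} {}",
                    project.name, usage.resource, usage.version_constraint
                ));
            }
        }
    }
    problems
}
```

Note that this check needs only identifiers, version and usages, which is why those fields matter for validation even before any project files have been downloaded.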

Additional changes

Basic validation functionality has been added to better illustrate its relevance
in relation to project entries.

A preliminary check of the lockfile has been introduced that only depends on it being valid TOML. Here the lockfile version is checked and warnings are issued if unknown fields are encountered in the lockfile. This is an effort to make potential errors easier to debug.
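
The shape of such a preliminary check might look roughly like the sketch below. It is an assumption-laden illustration, not the code from this PR: the top-level table is modelled as a plain `BTreeMap<String, String>` instead of a parsed TOML document, `preliminary_check` is a hypothetical name, and the list of known top-level keys (`lock_version` and `project`) is inferred from this description.

```rust
use std::collections::BTreeMap;

/// Lockfile format version this sketch accepts.
const SUPPORTED_LOCK_VERSION: &str = "0.2";

/// Assumed set of known top-level keys (hypothetical, see lead-in).
const KNOWN_TOP_LEVEL: [&str; 2] = ["lock_version", "project"];

/// Check the lock version and collect warnings for unknown fields.
///
/// Returns `Err` if the version is missing or unsupported, otherwise
/// `Ok` with one warning message per unknown top-level key.
pub fn preliminary_check(
    top_level: &BTreeMap<String, String>,
) -> Result<Vec<String>, String> {
    match top_level.get("lock_version").map(String::as_str) {
        Some(SUPPORTED_LOCK_VERSION) => {}
        Some(other) => return Err(format!("unsupported lock_version {other:?}")),
        None => return Err("missing lock_version".to_string()),
    }

    let warnings: Vec<String> = top_level
        .keys()
        .filter(|key| !KNOWN_TOP_LEVEL.iter().any(|known| *known == key.as_str()))
        .map(|key| format!("unknown field `{key}` in lockfile"))
        .collect();
    Ok(warnings)
}
```

Treating an unsupported version as a hard error and unknown fields as warnings keeps a hand-edited or newer-format lockfile from failing with an obscure deserialization message further down the line.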

To signify the changes, the lockfile name has also been changed to sysand-lock.toml and lock_version has been bumped to "0.2".

Collaborator

@andrius-puksta-sensmetry andrius-puksta-sensmetry left a comment

Probably should also have a test for serialization round-trip.

@victor-linroth-sensmetry
Collaborator Author

> Probably should also have a test for serialization round-trip.

I was thinking about that but wasn't sure how to best implement it. It doesn't seem to be a good fit for a unit test, since a round-trip isn't a unit of work. Not sure it's a good integration test either since the deserialization isn't really part of our library interface.

I've been thinking about setting up some property-based testing for Sysand, and I think this would be the best fit for round-trip testing. Would have to be its own issue though.

But maybe I'm just overthinking things.

@victor-linroth-sensmetry
Collaborator Author

There's something wonky with the 'Python / Linux' jobs. The first one stalled waiting for some Header, but seems to work after I manually restarted it. I'll try the same with the others.

@andrius-puksta-sensmetry
Collaborator

> I've been thinking about setting up some property-based testing for Sysand, and I think this would be the best fit for round-trip testing.

Yes, it seems to be a good fit. Until that is done, we should have a few simple tests for this. Unit tests are best for it: even though a round-trip exercises too much functionality at once for a strict unit test, the code being tested is completely self-contained and thus easy to write unit tests for.
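
To make the idea concrete, here is a small std-only sketch of a round-trip check. Everything in it is a stand-in: `serialize` and `deserialize` are a hypothetical mini-format for a list of strings, not the lockfile code, and the tiny `Lcg` generator replaces a real property-testing crate so the example stays self-contained.

```rust
/// Hypothetical mini-serializer: one quoted, escaped string per line.
pub fn serialize(items: &[String]) -> String {
    let mut out = String::from("[\n");
    for item in items {
        let escaped = item.replace('\\', "\\\\").replace('"', "\\\"");
        out.push_str(&format!("    \"{escaped}\",\n"));
    }
    out.push_str("]\n");
    out
}

/// Inverse of `serialize`; returns `None` on malformed input.
/// Assumes items contain no line breaks.
pub fn deserialize(text: &str) -> Option<Vec<String>> {
    let mut lines = text.lines();
    if lines.next()? != "[" {
        return None;
    }
    let mut items = Vec::new();
    for line in lines {
        if line == "]" {
            return Some(items);
        }
        let body = line.strip_prefix("    \"")?.strip_suffix("\",")?;
        let mut item = String::new();
        let mut chars = body.chars();
        while let Some(c) = chars.next() {
            // A backslash escapes the character that follows it.
            if c == '\\' {
                item.push(chars.next()?);
            } else {
                item.push(c);
            }
        }
        items.push(item);
    }
    None // the closing bracket was never seen
}

/// Tiny deterministic pseudo-random generator (stand-in for a real crate).
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Generate `cases` random lists and check that each survives a round trip.
pub fn round_trip_holds(seed: u64, cases: usize) -> bool {
    const ALPHABET: [char; 6] = ['a', 'Z', ' ', '"', '\\', ':'];
    let mut rng = Lcg(seed);
    (0..cases).all(|_| {
        let len = rng.next_u64() % 4;
        let items: Vec<String> = (0..len)
            .map(|_| {
                let chars = rng.next_u64() % 8;
                (0..chars)
                    .map(|_| ALPHABET[(rng.next_u64() % 6) as usize])
                    .collect::<String>()
            })
            .collect();
        deserialize(&serialize(&items)) == Some(items)
    })
}
```

A handful of fixed unit tests of the form `deserialize(&serialize(x)) == Some(x)` would cover the short-term need; the generator-driven variant is what a property-based setup would later replace.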

@andrius-puksta-sensmetry
Collaborator

> First one stalled waiting for some Header

It's a network timeout. Happens sometimes to all jobs.

@tilowiklundSensmetry
Member

Could we use a name for exports that more closely mirrors what's in .meta.json? I realise index is confusing, but maybe index_symbols?

@tilowiklundSensmetry
Member

> exports are the top level symbols exported by the project and are taken from the index in .meta.json. It was included because in a way these are the true "names" of a project as seen from KerML/SysML, and should be part of lockfile validation (and maybe also dependency solving at some point).

I'm confused by how this would be used during validation and resolution. Under what circumstances (where you would not also have access to the full .project.json and .meta.json contents) would it be used?

@victor-linroth-sensmetry
Collaborator Author

> Could we use a name for exports that more closely mirrors what's in .meta.json? I realise index is confusing, but maybe index_symbols?

I considered this at first but decided against it, mostly because I'm not sure the average user will care about the content of .meta.json. Maybe they'll eventually get to know .project.json, but .meta.json seems like something most won't really bother with (and ideally it would be automatically handled by tools, I guess). I think names like index_symbols risk just being even more confusing to your typical engineer.

@victor-linroth-sensmetry
Collaborator Author

> I'm confused by how this would be used during validation and resolution. Under what circumstances (where you would not also have access to the full .project.json and .meta.json contents) would it be used?

Well, if you pull a git repo with a lockfile and do a sync, you're not going to have the manifests for all the dependencies immediately at hand. I think it seems like good practice to do a quick sanity check on the lockfile before you start downloading stuff.

As far as I know the dependency solver doesn't take name collisions in the exports into account, but I feel like that would be a reasonable thing to do since, from what I can tell, KerML/SysML has no way of handling that at all. One of the imports will just get overridden by the other.

Sure we don't absolutely need to include exports, but it seemed fitting since for the most part these are the actual package names and it's the names the user will interact with inside of KerML/SysML.
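
A check for overlapping exports needs nothing beyond the lockfile itself. The sketch below is a hedged illustration only: the `Project` type and `export_collisions` function are hypothetical names, and it assumes each project lists a given export at most once.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified project entry for this sketch.
pub struct Project {
    pub name: String,
    pub exports: Vec<String>,
}

/// Map every export that appears in more than one project to the names
/// of the projects that export it.
pub fn export_collisions(projects: &[Project]) -> BTreeMap<&str, Vec<&str>> {
    let mut owners: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for project in projects {
        for export in &project.exports {
            owners
                .entry(export.as_str())
                .or_default()
                .push(project.name.as_str());
        }
    }
    // Keep only the exports claimed by two or more projects.
    owners.retain(|_, names| names.len() > 1);
    owners
}
```

An empty result means no two projects in the lockfile export the same top-level name; a non-empty one could be reported before any download starts.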

@tilowiklundSensmetry
Member

tilowiklundSensmetry commented Oct 17, 2025

> > I'm confused by how this would be used during validation and resolution. Under what circumstances (where you would not also have access to the full .project.json and .meta.json contents) would it be used?
>
> Well, if you pull a git repo with a lockfile and do a sync, you're not going to have the manifests for all the dependencies immediately at hand. I think it seems like good practice to do a quick sanity check on the lockfile before you start downloading stuff.
>
> As far as I know the dependency solver doesn't take name collisions in the exports into account, but I feel like that would be a reasonable thing to do since, from what I can tell, KerML/SysML has no way of handling that at all. One of the imports will just get overridden by the other.
>
> Sure we don't absolutely need to include exports, but it seemed fitting since for the most part these are the actual package names and it's the names the user will interact with inside of KerML/SysML.

So the general point would be that you propose to be able to validate the lock file early (in certain circumstances). Early in the sense that you do not have all the mentioned projects ready in local environments yet. It's not clear to me under what circumstances this would be useful.

That being said, from a human readability standpoint I think you make a good point that to many actual users, top level names exported by the project may actually be more recognisable than the project name 👍

@tilowiklundSensmetry
Member

> > Could we use a name for exports that more closely mirrors what's in .meta.json? I realise index is confusing, but maybe index_symbols?
>
> I considered this at first but decided against it, mostly because I'm not sure the average user will care about the content of .meta.json. Maybe they'll eventually get to know .project.json, but .meta.json seems like something most won't really bother with (and ideally it would be automatically handled by tools, I guess). I think names like index_symbols risk just being even more confusing to your typical engineer.

Fair enough.

@tilowiklundSensmetry
Member

tilowiklundSensmetry commented Oct 17, 2025

I'm really happy with the above as the default generation, but I'd like to have the following tiers:

Strictly mandatory:

  • iris
  • checksum
  • sources

Recommended (meaning sysand always produces them, but accepts reading files without them):

  • name
  • version
  • exports
  • usages

Optional (meaning sysand never produces them, but accepts files containing them):

  • maintainers
  • licence
  • tags
  • website
  • metamodel
  • includes_derived
  • includes_implied
  • created

I don't expect to have round-tripping work for the optional ones, so it can simply be implemented as not giving an "unexpected field" error in case they're present. This can even be done by simply allowing arbitrary additional fields, potentially with a warning message like "ignoring field X". In particular, there is no need to implement any options to include this optional information during lock file generation.

This makes the format more backward compatible in case we want to introduce changes later (even once we're out of 0.x.y).

The main reason for making recommended fields optional is to force us to not accidentally depend on them being present when they're purely informative.
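
One way to picture the mandatory/recommended split is to model recommended fields as `Option`, so the compiler forces every consumer to handle their absence. This is a minimal sketch under assumptions, not the PR's data model: `ProjectEntry` and `display_label` are hypothetical names, `identifiers` is used for the field called iris above, and only a few fields are shown.

```rust
/// Hypothetical lockfile entry with tiered fields.
pub struct ProjectEntry {
    // Strictly mandatory: reading fails without these.
    pub identifiers: Vec<String>,
    pub sources: Vec<String>,
    pub checksum: String,
    // Recommended: always written, but may be absent when reading.
    pub name: Option<String>,
    pub version: Option<String>,
}

/// Build a label for messages without relying on recommended fields.
/// Falls back to the first identifier if the name is absent.
pub fn display_label(entry: &ProjectEntry) -> String {
    let base = entry
        .name
        .as_deref()
        .or(entry.identifiers.first().map(String::as_str))
        .unwrap_or("<unnamed project>");
    match &entry.version {
        Some(version) => format!("{base} {version}"),
        None => base.to_string(),
    }
}
```

Because `name` and `version` are `Option`, code that tries to use them unconditionally does not compile, which is the "don't accidentally depend on them" property in type form.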

@victor-linroth-sensmetry
Collaborator Author

> This can even be done by simply allowing arbitrary additional fields

This is the default behaviour of serde so there's no reason for us to add optional fields until we are implementing them.

@tilowiklundSensmetry
Member

> > This can even be done by simply allowing arbitrary additional fields
>
> This is the default behaviour of serde so there's no reason for us to add optional fields until we are implementing them.

Considering the amount of other ad-hoc validation we're doing, wouldn't it make sense to also check that the fields are correct?

@victor-linroth-sensmetry
Collaborator Author

> Considering the amount of other ad-hoc validation we're doing, wouldn't it make sense to also check that the fields are correct?

Maybe? We could potentially emit warnings for additional fields.

@tilowiklundSensmetry
Member

> > Considering the amount of other ad-hoc validation we're doing, wouldn't it make sense to also check that the fields are correct?
>
> Maybe? We could potentially emit warnings for additional fields.

I don't feel strongly either way, but it might save us some headache in case people mess with their lockfiles.

Collaborator

@andrius-puksta-sensmetry andrius-puksta-sensmetry left a comment

LGTM, aside from a few nits above.

Member

@tilowiklundSensmetry tilowiklundSensmetry left a comment

I think we've converged as much as we will in the short term. Looks good, let's merge this.

Signed-off-by: victor.linroth.sensmetry <victor.linroth@sensmetry.com>
@victor-linroth-sensmetry victor-linroth-sensmetry merged commit 080c9e5 into main Nov 7, 2025
39 checks passed
@victor-linroth-sensmetry victor-linroth-sensmetry deleted the vl/readable-lockfiles branch November 7, 2025 11:07
simonas-drauksas-sensmetry added a commit that referenced this pull request Nov 21, 2025
The lockfile was renamed in #58

Signed-off-by: Simonas Draukšas <simonas.drauksas@sensmetry.com>
simonas-drauksas-sensmetry added a commit that referenced this pull request Nov 21, 2025
* chore(docs): Fix case of 'SysandLock.toml' to 'sysand-lock.toml'

The lockfile was renamed in #58

* chore(docs): Change file paths to generic project structure

Updated file paths in the tutorial to use a generic project path.

---------

Signed-off-by: Simonas Draukšas <simonas.drauksas@sensmetry.com>