Java: Regenerate framework models automatically#7767
Conversation
| - name: Build database | ||
| env: | ||
| SLUG: ${{ matrix.slug }} | ||
| REF: ${{ matrix.ref }} | ||
| run: | | ||
| mkdir dbs | ||
| cd repos/${REF} | ||
| SHORTNAME=${SLUG//[^a-zA-Z0-9_]/} | ||
| codeql database create --language=java ../../dbs/${SHORTNAME} |
There was a problem hiding this comment.
We could consider using actions/cache for this
|
@tamasvajk In case you have some time, I'd appreciate if someone could look at this action. Happy to pair if anything is unclear |
michaelnebel
left a comment
There was a problem hiding this comment.
This is really good work!
I have added a couple of suggestions, but these could also just be follow ups - so not anything blocking.
| import sys | ||
|
|
||
|
|
||
| lgtmSlugToModelFile = { |
There was a problem hiding this comment.
Should we consider introducing an invariant instead of an explicit mapping?
The root path could be java/ql/lib/semmle/code/java/frameworks/ and the slug can then be used to decide subfolder an file name.
eg. slug = "apache/commons-beanutils" -> apache/CommonsBeanutilsGenerated.qll.
This would solve the minor code cohesion issue that we otherwise need to remember to update both the workflow (slug, ref) pair and add an entry into lgtmSlugToModelFile.
There was a problem hiding this comment.
I used to do that but quickly realized that we have different locations for our models. And yes, we could try and move/rename them all into the same spot but hold back on that. I'm happy either way.
There was a problem hiding this comment.
Are you thinking about models that has been auto generated and then don't follow this pattern on location or hand written models?
There was a problem hiding this comment.
Yes, mostly because of not having a good naming pattern (apache/commons-io -> apache/IO.qll vs apachecommonsio.qll vs apache/commonsio.qll). E.g. when we have multiple models from an organization, we keep them in folders whereas others, we split one framework into multiple parts (still to be discussed when it comes to the generator). Generally, I agree we can extract the root path though
There was a problem hiding this comment.
Alright. Thank you for the context. If this is undecided, then just leave it as it is :-)
| print("============================================================") | ||
| print("Generating models for " + lgtmSlug) | ||
| print("============================================================") | ||
| modelFile = lgtmSlugToModelFile[lgtmSlug] |
There was a problem hiding this comment.
Maybe add some exception handling as we might have forgotten to update the lgtmSlugToModelFile, when adding entries to the strategy matrix in the workflow (if not removing the lookup)?
|
@michaelnebel Good points overall on the |
Without any experience, it is hard for me to say, but my immediate response would be to keep it as a python script, just to ease potential debugging/extensions/rewriting of the script. |
tamasvajk
left a comment
There was a problem hiding this comment.
Looks plausible to me.
BTW, where did you test this workflow? I used to push my workflow changes first to dsp-testing to see them in action (pun intended).
| slug: ["placeholder"] | ||
| ref: ["placeholder"] | ||
| include: | ||
| - slug: "apache/commons-io" | ||
| ref: "8985de8fe74f6622a419b37a6eed0dbc484dc128" | ||
| exclude: | ||
| - slug: "placeholder" | ||
| ref: "placeholder" |
There was a problem hiding this comment.
Why do we need the placeholders here (, which are excluded)?
There was a problem hiding this comment.
In order to create a paramterized job without the combinatoric nature of a matrix build (and all axis are required to have at least one value)
Fair enough, keeping it for now
|
Takes the model generator and any framework model we have on main and regenerates the model using the latest and greatest model generator. This enables us to update and evolve the models without manual intervention.