Skip to content

version 0.6: new design, code factoring and improvements#185

Merged
VincentDary merged 1 commit intocmsis-svd:mainfrom
VincentDary:version-1.0
Oct 14, 2024
Merged

version 0.6: new design, code factoring and improvements#185
VincentDary merged 1 commit intocmsis-svd:mainfrom
VincentDary:version-1.0

Conversation

@VincentDary
Copy link
Copy Markdown
Member

@VincentDary VincentDary commented Jan 7, 2024

This pull request proposes a new version (0.6) of the cmsis-svd Python package with the following improvements. Note that this new version introduces a new cmsis-svd Python model which breaks the compatibility with the previous versions
of this package. The parser interface remains the same.

  • Support for CMSIS-SVD version 1.3.10.

  • New version of the cmsis-svd Python model based on Python Dataclass. This choice brings several benefits as: code factoring, attribute inheritance for cmsis-svd special elements (DimElementGroup, RegisterPropertiesGroup), automatic model object representation, attach attribute metadata (for example for future strict data validation).

  • cmsis-svd Python model improvements.

    • Model structure improvements to be closest of the CMSIS-SVD XML structure.
    • Support for peripheral array, field array, field enumerated values metadata,
      nested register cluster.
    • Support for peripheral prepend/append pattern to register name.
    • Constant parsing converted to Python enumeration for AccessType,
      AddressBlockUsageType, DataTypeType, EndianType, EnumUsageType,
      ModifiedWriteValuesType, ReadActionType.
  • New cmsis-svd Python model methods to access to peripherals, registers and fields:

    • SVDDevice.get_peripherals()
    • SVDPeripheral.get_registers()
    • SVDRegister.get_fields()
  • New version of the cmsis-svd Python parser with some improvements and code factoring.

  • cmsis-svd Python tests upgrade to support this new version.

  • Python type hint for cmsis-svd Python model, parser and tests.

  • PEP8 code review for cmsis-svd Python model, parser and tests.

  • Add AUTHOR file which lists all the contributors of the cmsis-svd/cmsis-svd project listed in the github contributors section. Only the contributor to the model, parser, and tests have been selected, since the data directory has been moved to a dedicated directory. Following the creation of this file, the banner licence of each file has been changed to Copyright 2015-2024 cmsis-svd Authors. @posborne I will hope you will agree with this change. The goal of this change is to represent all participants in this collaborative project.

  • Add pyproject.toml to avoid pip install deprecation warning.

Any feedback and code review is welcome and will be discussed.

@ckudera, maybe this new version can help you in your future research. This new version brings some improvements as support for nested cluster, type hint and, dim list and sub items naming review.

UPDATE 2024-01-27:

  • The pull request has been updated for display_name place holder (%s) support, to implement the feedback of the PR 187 from @ccalmels.

  • The pull request has been updated to explain how to use the cmsis-svd-data repository together with the Python package, in the README of the Python package. Following the feedback of @zkrx int the issue 186.

UPDATE 2024-09-30:

  • The pull request has been updated to fix regex SyntaxWarning message during bitfield range parsing, reported by @brainstorm in this feedback

UPDATE 2024-10-03:

  • The pull request has been updated to fix incompatibility with the CMSIS-SVD schema for enumeratedValues in fieldType reported by @gschwaer in this feedback

UPDATE 2024-10-06:

UPDATE 2024-10-12:

  • Python README improvements and updates.
  • Replace Python dev-requirements.txt file by a DEV extras_require in setup.py.
  • Move Python tests and exemples directories to the root of the Python package.
  • CMSIS SVD Python parser improvements:
    • Add SVD XML Schema Definition files for versions: 1.0, 1.1, 1.2, 1.3.1, 1.3.2, 1.3.3, 1.3.5, 1.3.6,
      1.3.7, 1.3.9, 1.3.10.
    • SVDParser.get_device() flexible XML data validation via xml_validation, schema_version and
      schema_version_detection parameters.
    • Add method for standalone SVD XML validation via: validate_xml_tree(), validate_xml_file() and
      validate_xml_str().
    • Add parsing for XML device tag attribut namespace and noNamespaceSchemaLocation.
  • CMSIS SVD Python model improvements:
    • Add XML serializer methods: to_xml(), to_xml_file(), to_xml_node()
      (A refactored, reviewed and modified version of the SVD XML serializer from SVDSuite)
    • Improments for JSON serializer, add methods: to_json(), to_json_file().
    • Add SVDCPUNameType, SVDProtectionType, SVDSauAccessType, SVDDimArrayIndex.
    • Add namespace_xs and xs_no_namespace_schema_location attributes to SVDDevice.
    • Fix parent associations for meta cluster and meta register in SVDRegisterCluster.
    • Fix parent association for meta register in SVDPeripheral.
    • Fix parent association for meta peripheral in SVDDevice.

UPDATE 2024-10-13:

  • Remove development related artifacts (debugger, unused ref).
  • Add @brainstorm and @BenBE as maintainer of the cmsis-svd project in the AUTHOR file.
  • Rename CMSIS SVD schemas sub directory and symlink from 'cmsis_svd_schemas' to 'schemas', and 'svd_schema_files_meta.txt' to 'metadata.txt', to avoid name repetition.
  • Include the CMSIS SVD schema 1.3.11, officially in review
  • Python README feedbacks.
  • Add type hint os.PathLike for path method parameters.
  • Replace hardcoded SVD schema versions list by a scan of XSD files in schemas directory.
  • Replace hardcoded SVD schema version in method parameter by 'latest' keyword to specify the latest schema version.
  • Fix improper register field name filtering for reserved field name.
  • Move serializer_json and serializer_xml to serializers sub package.
  • Add @ckudera/@BenBE fix for path traversal (CWE-22 and related) in SVDParser.for_packaged_svd()

UPDATE 2024-10-14:

  • Add @BenBE Fix for SVDParser._get_latest_schema_version.
  • Add CHANGELOG file.

@VincentDary VincentDary requested a review from posborne January 7, 2024 12:28
@VincentDary VincentDary changed the title version 1.0: support for CMSIS-SVD v1.3.9, new python model based on … version 1.0: new design, code factoring and improvements Jan 7, 2024
@VincentDary VincentDary requested review from a team and removed request for a team January 25, 2024 20:02
@VincentDary VincentDary force-pushed the version-1.0 branch 2 times, most recently from 268c47a to 77b8650 Compare January 27, 2024 12:58
@VincentDary
Copy link
Copy Markdown
Member Author

@posborne, do you have any free time for this review ?

@VincentDary VincentDary removed the request for review from odinthenerd January 29, 2024 16:00
@ccalmels
Copy link
Copy Markdown
Contributor

ccalmels commented Feb 1, 2024

FYI I give your branch a try on my personal project : https://github.com/ccalmels/gdb-dashboard-svd/tree/cmsis_svd_v1.0 and everything seems ok with a small adaptation patch.

@gschwaer
Copy link
Copy Markdown

Hi, I found an incompatibility with the CMSIS-SVD schema. The schema states for enumeratedValues in fieldType:

<xs:complexType name="fieldType">
...
      <xs:element name="enumeratedValues" type="enumerationType" minOccurs="0" maxOccurs="2">
...
  </xs:complexType>

That is: enumeratedValues can be omitted, occur once, or twice! There are some SVDs that provide enumeratedValues twice, e.g., Texas Instruments MSP432P401Y.svd. E.g., in

peripheral = DMA
register = DMA_USEBURSTSET
field = SET

it has:

                    <field>
                        <name>SET</name>
                        <description>Returns the useburst status, or disables dma_sreq from generating DMA requests.</description>
                        <bitOffset>0x0</bitOffset>
                        <bitWidth>0x20</bitWidth>
                        <access>read-write</access>
                        <enumeratedValues>
                            <name>SET_enum_read</name>
                            <usage>read</usage>
                            <enumeratedValue>
                                <name>SET_0_READ</name>
                                <description>DMA channel C responds to requests that it receives on dma_req[C] or dma_sreq[C].

The controller performs 2R, or single, bus transfers.</description>
                                <value>0</value>
                            </enumeratedValue>
                            <enumeratedValue>
                                <name>SET_1_READ</name>
                                <description>DMA channel C does not respond to requests that it receives on dma_sreq[C].

The controller only responds to dma_req[C] requests and performs 2R transfers.</description>
                                <value>1</value>
                            </enumeratedValue>
                        </enumeratedValues>
                        <enumeratedValues>
                            <name>SET_enum_write</name>
                            <usage>write</usage>
                            <enumeratedValue>
                                <name>SET_0_WRITE</name>
                                <description>No effect. Use the DMA_USEBURST_CLR Register to set bit [C] to 0.</description>
                                <value>0</value>
                            </enumeratedValue>
                            <enumeratedValue>
                                <name>SET_1_WRITE</name>
                                <description>Disables dma_sreq[C] from generating DMA requests.

The controller performs 2R transfers.

Writing to a bit where a DMA channel is not implemented has no effect.</description>
                                <value>1</value>
                            </enumeratedValue>
                        </enumeratedValues>
                    </field>

Currently the second enumeratedValues is silently ignored.

I tried to fix it and it probably worked, but I don't know this code base too well, so here's what I tried:

diff --git a/python/cmsis_svd/model.py b/python/cmsis_svd/model.py
index 4c9fdb4..383812c 100644
--- a/python/cmsis_svd/model.py
+++ b/python/cmsis_svd/model.py
@@ -254,7 +254,7 @@ class SVDField(SVDElement, SVDDimElementGroup):
     modified_write_values: Optional[SVDModifiedWriteValuesType] = field(default=None)
     write_constraint: Optional[SVDWriteConstraint] = field(default=None)
     read_action: Optional[SVDReadActionType] = field(default=None)
-    enumerated_values: Optional[SVDEnumeratedValues] = field(default=None)
+    enumerated_values: Optional[List[SVDEnumeratedValues]] = field(default=None)
     derived_from: Optional[str] = field(default=None, metadata={'type': 'attribute'})
 
     def __post_init__(self) -> None:
@@ -265,7 +265,8 @@ class SVDField(SVDElement, SVDDimElementGroup):
             self.write_constraint.parent = self
 
         if self.enumerated_values:
-            self.enumerated_values.parent = self
+            for enum_values in self.enumerated_values:
+                enum_values.parent = self
 
     @property
     def is_enumerated_type(self) -> bool:
diff --git a/python/cmsis_svd/parser.py b/python/cmsis_svd/parser.py
index 6df8720..3ed2ba9 100644
--- a/python/cmsis_svd/parser.py
+++ b/python/cmsis_svd/parser.py
@@ -482,8 +482,10 @@ class SVDParser:
         dim_element_group = self._parse_dim_element_group(field_node)
 
         enum_values = None
-        if (enum_values_node := field_node.find('./enumeratedValues')) is not None:
-            enum_values = self._parse_enumerated_values(enum_values_node)
+        if (enum_values_nodes := field_node.findall('./enumeratedValues')) is not None:
+            enum_values = []
+            for enum_values_node in enum_values_nodes:
+                enum_values.append(self._parse_enumerated_values(enum_values_node))
 
         mod_w_value = _get_text(field_node, 'modifiedWriteValues')
         mod_w_value = (SVDModifiedWriteValuesType(mod_w_value) if mod_w_value

Btw, this is also broken on main: It just mashes all enumerated_value items from both enumerated_values into one, which results in name clashes.

Just thinking loud, but would it be possible to add some test that parses all .xmls in cmsis-svd-data and writes them out as xml again, then checks if they are equal? I think that would be a nice proof that the version 1.0 code does everything right.

@ckudera
Copy link
Copy Markdown
Member

ckudera commented Mar 24, 2024

First of all, thank you @VincentDary for your efforts and mentioning me in your initial post. Sorry for my late reply.

My project ARMify was funded, so I can work some time on a SVD parser during my work :) I would love to contribute to cmsis-svd 1.0 and in my opinion it doesn't make any sense to create another/new repository for "my" parser. However, I would suggest a new implementation for the following reasons:

derivedFrom attribute and dim arrays

I still like the pre-processing approach (#172), but I'm afraid I detected an issue with it.

Please checkout Example.zip, which contains an example for a svd file. In this example it's necessary to process the dim arrays first, since multiple derivedFrom attributes depend on array elements. Additionally, some derivedFrom attributes depend on others, so the parser requires some dependency resolving algorithm.

svdconv processes the svd file without any problems: svdconv --generate header --fields enum Github_Example.svd.

Separation of Parsing and Logic Processing

With regard to the above issue, I would suggest a two step approach: the fist step involves the parsing of a SVD file to Python dataclass models. In a second step, we implement the additional logic, e.g. (nested) clusters, dim elements (dimElementGroup attribute), and the inheritance functionality (derivedFrom attribute).

Seperation would also have the advantage that users can access the parsed data classes.

Additional Testing

As @gschwaer suggested, we should implement additional testing against existing SVD files and maybe also against existing parser implementations (e.g. GitHub - rust-embedded/svd: A CMSIS-SVD file parser)


@VincentDary, @posborne and everybody else who is interested in the development, please let me know what you think about a re-implementation and how we should handle it.

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Mar 28, 2024

@ccalmels, @gschwaer, @ckudera, Thank you very much for your review. Your support and feedback are really welcome.

@gschwaer, I'm going to prepare a patch for the model and parser, to correct the issue pointed out by your comment about fieldType.enumeratedValues. This seems trivial.

@ckudera, I'm going to take a close look at this new design proposal.

About testing. @ckudera, I think that testing this library with another library is not a good hint for the following reasons: cmsis-svd data structure may differ and the diffing may require some adaptaters; the diffing can log a large amount of false positive, which can be complicated to handle to scale the tests (cmsis-svd-data). @gschwaer, your approach seems more elegant, and that seems enough for me, to check in practice.

@ckudera I'm really happy for the funding of your project, the integration of this new version of cmsis-svd in a funded firmware reverse engineering tool would be great for this library: visibility, attracting new users, feedback, and maybe contributions. You have my full support as possible, since I work on related topics.

See you soon.

@ckudera
Copy link
Copy Markdown
Member

ckudera commented Apr 9, 2024

@VincentDary Just to let you know, I'm almost done with a parser (svd to dataclasses) and xml serializer (dataclasses back to svd), without any logic processing. I will be on vacation the next 1 1/2 weeks and plan to finish and release the parser until the end of April.

@jvogl-dev
Copy link
Copy Markdown

If I understand correctly SVDPeripheral.registers is no longer a getter but allows direct access to the registers member, which can be anything from a single SVDRegister to a SVDRegisterArray or SVDRegisterCluster(Array), which is great as I need to access SVDRegisterArray without it being expanded into a list of SVDRegister.

On the other hand SVDPeripheral.get_registers() returns a list of (expanded) SVDRegister, which might be confusing. I suggest to consider renaming SVDPeripheral.get_registers() to SVDPeripheral.get_expanded_registers() or similar.

@ckudera
Copy link
Copy Markdown
Member

ckudera commented May 3, 2024

I know this is not the best place to discuss the development of my parser, but not sure if a new issue would be better. Anyway, I just published the first version of SVDSuite. Currently, it can parse (to dataclasses, no logic processing), manipulate, validate, and generate SVD files. In the next weeks I will work on processing (derivedFrom, clusters, ...).

@AJ528
Copy link
Copy Markdown

AJ528 commented May 7, 2024

As someone who is hoping to use this parser in an upcoming project, I vote for the merge to be approved at this point. I think anyone who wants to review the code has had plenty of time, and when I start my project I'd like to start on version 1.0.
It seems like a lot of improvements were made, and I don't see any reason to not move forward with it. Worst case: a couple bugs are introduced, people submit issues, and they get squashed. I don't see any downsides. ¯\_(ツ)_/¯

@brainstorm
Copy link
Copy Markdown

brainstorm commented Sep 26, 2024

I vote for the merge to be approved at this point. I think anyone who wants to review the code has had plenty of time, (...)

Well, I just tested it locally and seems to have a syntax error? Has this PR been tested thoroughly?:

(...)
cmsis_svd/parser.py:521: SyntaxWarning: invalid escape sequence '\['
  m = re.search('\[([0-9]+):([0-9]+)\]', field.bit_range)

Easy fix, btw:

diff --git a/python/cmsis_svd/parser.py b/python/cmsis_svd/parser.py
index 6df8720..e52f889 100644
--- a/python/cmsis_svd/parser.py
+++ b/python/cmsis_svd/parser.py
@@ -518,7 +518,7 @@ class SVDParser:
         )

         if field.bit_range is not None:
-            m = re.search('\[([0-9]+):([0-9]+)\]', field.bit_range)
+            m = re.search('\\[([0-9]+):([0-9]+)\\]', field.bit_range)
             field.bit_offset = int(m.group(2))
             field.bit_width = 1 + (int(m.group(1)) - int(m.group(2)))
         elif field.msb is not None:

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Sep 30, 2024

Hello @brainstorm,

Thank you very much for your contribution, this is very welcome, I have integrated your change in this pull request, and add reference to your name in the AUTHOR file.

About your answer, this is not perfect however this PR has been tested deeply through a CI with Python 3.8, 3.9, 3.10, 3.11 and the cmsis-svd-data repository which contains a large agregation of cmsis-svd files. Take a look at the following links:

About the bug you found, this is not an error, this is a warning, the regular expression is bugged but It works. The warning message caused by this bug has not been identified because the warning message is displayed since Python 3.12, which is a recent Python version not currently tested in the CI. In addition, this code works and does not trigger any Python exception from Python 3.8 to Python 3.12. See it in practice:

/tmp/bug.py

import re
import platform
m = re.search('\[([0-9]+):([0-9]+)\]', '[3:6]')
print(f'parsed: bit {m.group(1)} to {m.group(2)}')
print(f'Python version: {platform.python_version()}')

In Python 3.11:

$ python  /tmp/bug.py
parsed: bit 3 to 6
Python version: 3.11.8

In Python 3.12, only a warning message, no Python exception:

$ python /tmp/bug.py
/tmp/bug.py:3: SyntaxWarning: invalid escape sequence '\['
  m = re.search('\[([0-9]+):([0-9]+)\]', '[3:6]')
parsed: bit 3 to 6
Python version: 3.12.6

@VincentDary
Copy link
Copy Markdown
Member Author

derivedFrom attribute and dim arrays

I still like the pre-processing approach (#172), but I'm afraid I detected an issue with it.

Please checkout Example.zip, which contains an example for a svd file. In this example it's necessary to process the dim arrays first, since multiple derivedFrom attributes depend on array elements. Additionally, some derivedFrom attributes depend on others, so the parser requires some dependency resolving algorithm.

svdconv processes the svd file without any problems: svdconv --generate header --fields enum Github_Example.svd.

Hello @ckudera,

I have reviewed your SVD test case fixtures on ARMify-Project/tests/svd). These use cases look really interesting.

But do you have any real world SVD file which use the SVD spec in this way ? I'm really curious about this.

I'm going to review this topic in depth.

If you are agree to open a pull request on cmsis-svd repository with your code, I am agree to do a full code review and merge it on the main branch ASAP, if everything is ok ! I think it's better to keep this repository than fork, this way you benefit from better feedback and visibility, and the cmsis-svd users a better code quality !

@VincentDary VincentDary force-pushed the version-1.0 branch 4 times, most recently from 35898cd to a855196 Compare October 3, 2024 10:39
@VincentDary
Copy link
Copy Markdown
Member Author

Hi, I found an incompatibility with the CMSIS-SVD schema. The schema states for enumeratedValues in fieldType:

<xs:complexType name="fieldType">
...
      <xs:element name="enumeratedValues" type="enumerationType" minOccurs="0" maxOccurs="2">
...
  </xs:complexType>

That is: enumeratedValues can be omitted, occur once, or twice! There are some SVDs that provide enumeratedValues twice, e.g., Texas Instruments MSP432P401Y.svd. E.g., in

peripheral = DMA
register = DMA_USEBURSTSET
field = SET
...
Currently the second `enumeratedValues` is silently ignored.
...
I tried to fix it and it probably worked, but I don't know this code base too well, so here's what I tried:

Hello @gschwaer

Thank you very much for your contribution, this is very welcome, I have integrated your changes (parser, model) with a test for MSP432P401Y DMA.DMA_USEBURSTSET.SET register field, and add reference to your name in the AUTHOR file.

@ckudera
Copy link
Copy Markdown
Member

ckudera commented Oct 3, 2024

@VincentDary

I have reviewed your SVD test case fixtures on ARMify-Project/tests/svd). These use cases look really interesting.

But do you have any real world SVD file which use the SVD spec in this way ? I'm really curious about this.

In my opinion, an SVD parser should at least support the functionality of svdconv to ensure the best user experience. As far as I know, the SVD files in CMSIS Packs are tested against svdconv. Other tools, however, implement additional features, such as forward references across different peripherals, whereas svdconv only supports forward referencing within the same scope. Unfortunately, the SVD specification is not always clear about these features, and sometimes the expected implementation is insufficiently documented.

In short, I cannot currently provide you with any "official" SVD files from vendors that use features like those found in my more complex test scenarios, as I have not yet searched through the official vendor SVD files for them. Nonetheless, I am planning to conduct an analysis of the SVD standard, where, among other things, I will examine which features are actually being utilized by manufacturers.

If you are agree to open a pull request on cmsis-svd repository with your code, I am agree to do a full code review and merge it on the main branch ASAP, if everything is ok ! I think it's better to keep this repository than fork, this way you benefit from better feedback and visibility, and the cmsis-svd users a better code quality !

Thank you very much for the offer!

SVDSuite is not quite finished yet. I've identified a few test cases that are currently not being handled correctly. My goal is to finalize these tasks within the next 2-3 weeks. Additionally, I'm working on setting up a dedicated repository that will provide a comprehensive collection of test cases, along with explanations, to support future SVD parser developments, potentially in other programming languages.

I still believe it doesn't make sense to have two Python packages for SVD parsing. SVDSuite has evolved beyond just parsing, and to be honest, I’ve invested so much energy over the summer diving deep into the SVD standard and developing SVDSuite that I’d prefer to keep the project under my care for the time being. However, I do think it would be beneficial to have additional maintainers for the project, so if you’re interested, I’d be happy to collaborate. One major issue I see is that you're not the maintainer of cmsis-svd on PyPI (that’s @posborne), which means no new versions can be pushed.

In my opinion, the most practical solution, also in the best interest of the users, would be to integrate SVDSuite as a submodule within cmsis-svd, wrapping it in such a way that no changes are required for current cmsis-svd users (if that's feasible). What do you think of this proposal?

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Oct 4, 2024

Hello @ckudera,

Thank's for your quick answer.

After a little review of the issues about derived from [1] [2] [3], I am agree with your feedbacks and proposals. The pre-processing step I have implemented in #172 must be avoid, and a multi stages processing post parsing must be implemented, at least to support derived from another derived item or dim-ed or mixed section. Also, forward references is interesting, since tag order declaration has no semantic significance in XML (2), but it must clearly stated about what is allowed.

About your plan to continue on SVDSuite, I understand your point of view. About the Pypi package, there are way to build and push the package to Pypi with OIDC or API token which will require a minimal effort from @posborne. I don't think there is any benefit to use SVDSuite as submodule in this package, since it requires maintenance for nothing, in this case better to use your package directly. So, I continue the development of cmsis-svd.

I have reviewed your code this night, there are very interesting things, however there are technical choices and implementation details I don't understand very well for the moment. Do you accept code review or pull request on SVDSuite, or since it is under devellopment you prefer to keep it under your full control ?

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Oct 4, 2024

@posborne,

Congratulations for your hike ! :)

Yes I have an account on Pypi, this is https://pypi.org/user/VincentDary/ , same pattern as my github account, thank you for adding me to the maintainers of the package on Pypi.

And thank's for the OIDC.

@VincentDary
Copy link
Copy Markdown
Member Author

@ckudera

Since the status of the cmsis-project is unlocked and your are part of the organisation, I don't know if you want to follow SVDSuite, merge the code of SVDSuite in CMSIS-SVD, or work on the two project at the same time.

Maybe you need time to think about these choices. In any case you have my support.

About my SVDSuite code review, my principal feedback is the following. I understand the complexity of the cases to handle, but in my opinion, for the moment the code of SVDSuite is too complexe to handle derivedFrom attributes, but maybe you need requirement/features I don't know, also as you said your code has probably change this last week. At the oposite my pull request #172 is too naïve and totaly holed to handle your specific cases.

Since yesterday I am working on a POC top of the code base of this pull request to include all your feedbacks via a post processing step. I should be able to test if it works well this week. Before, I need to test some stuff with SVDConv to list the case to handle or not, there are things not clear in the spec about full qualified path, for example derive from nested cluster: PeripheralX.ClusterD1.ClusterD2.ClusterD3, why not ! or more ambiguous path with registers, fields and enumerated values.

@ckudera
Copy link
Copy Markdown
Member

ckudera commented Oct 4, 2024

@VincentDary

Just a short answer, it's getting late.

We literally are working on the same stuff. I added you to a svdconv fork, where I added some quick & dirty print functionality to make things a little bit easier. Hope it helps 🙂

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Oct 5, 2024

Yes of course, thank you.

@VincentDary VincentDary force-pushed the version-1.0 branch 6 times, most recently from 5c7a1d3 to 5197584 Compare October 12, 2024 16:27
@VincentDary
Copy link
Copy Markdown
Member Author

Hello,

I have update this pull request to include some reviews and new features. In particular, SVD file validation with SVD XML Schema Definition, and a SVD XML serializer (A refactored, reviewed and modified version of the SVD XML serializer from SVDSuite).

Since this pull request is open for almost one year and have bring a lot of improvements (too much for just one PR), I proposed to integrate all your feedbacks ASAP about this last commit, and get approval of a maintainer in order to merge the code on the main branch.

@ckudera, @BenBE, @brainstorm, any help or feedbacks is welcome, Thanks in advance.

@VincentDary VincentDary requested review from a team and removed request for posborne October 12, 2024 17:09
Copy link
Copy Markdown
Member

@ckudera ckudera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VincentDary, thank you very much for all the time and effort you’ve invested to make this PR possible!

I’ve conducted a brief review of the changes. This includes going over the code and leaving comments on anything that stood out. Please note, however, that this is not a comprehensive code review.

I also ran the tests (unittests are fricking slow 😅) and examples locally. Everything worked as expected.

As you pointed out, this PR has been open for too long and contains quite a lot of changes. I would suggest merging it as is and handling any additional adjustments in follow-up changes. This way, users can start benefiting from the update sooner, and we can release it to PyPI in a timely manner.

As you know, I am currently working on test cases that I will share once they are ready. Afterward, we can decide whether cmsis-svd requires some specific updates, and how cmsis-svd and SVDSuite can benefit from each other. In the meantime, it seems like users will significantly benefit from the improvements in this PR.

One minor finding: Although capitalized types (e.g., List, Dict) are still supported, PEP 585 and the current Python documentation recommend using generics like list and dict. While this is not urgent, it might be worth considering for future refactoring.

Copy link
Copy Markdown
Contributor

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through the current version of this PR, I see the following items:

  • The AUTHORS file needs a small adjustment. Also from the org members I see another name.

  • For the file structure a simple subdir specs with all the SVD schema definitions should suffice. No real need or benefit from repeating the cmsis-svd part twice in the filename.

  • I guess SVD versions 1.3.4 and 1.3.8 were not released as they are not mentioned in the SVD metadata file nor is there a .xsd present for them. AFAICS you can find version 1.3.8 here and version 1.3.4 over here. That's not a reliable source for exact tags of these two files, but should be accurate enough for inclusion and also provides a source for later reference.

  • In python/cmsis_svd/parser.py around line 300ff, the filename should be restricted/sanitized somewhat for SVDParser::for_packaged_svd, as the code right now allows for path-traversal issues (cf. CWE-22 and related).

  • Maybe move serializer_json and serializer_xml down one package as serializer.json and serializer.xml. This would allow for a "clean" from cmsis_svd.serializers import XML and avoid all the clutter from the classes used there internally.

Thus except for some minor things, and that one location (and likely similar places) regarding path-traversal risk (not actually new with this PR), there's mostly some stylistic things left.

TL;DR: LGTM.

## Note: Changes from Version 0.4

For a very long period of time (2016-2023), the pypi package for cmsis-svd was out
of data and included a large number of "bundled" SVD files. This has only grown and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should read "out of date" …

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I don't understand, could you be more explicite ?

@ckudera
Copy link
Copy Markdown
Member

ckudera commented Oct 12, 2024

* In `python/cmsis_svd/parser.py` around line 300ff, the filename should be restricted/sanitized somewhat for `SVDParser::for_packaged_svd`, as the code right now allows for path-traversal issues (cf. CWE-22 and related).

@BenBE nice catch!

@VincentDary Be aware that os.path.join has the following functionality: "If a segment is an absolute path (which on Windows requires both a drive and a root), then all previous segments are ignored and joining continues from the absolute path segment" (source). Therefore, checking if the string contains .. is not enough. os.path.join(package_root, vendor, os.path.basename(filename)) should do the trick.

@brainstorm
Copy link
Copy Markdown

There's an argument to be had on dropping the usage of os.path over pathlib, but I'm not sure it's a good idea to prolong the review and merging of this PR further?:

https://betterprogramming.pub/should-you-be-using-pathlib-6f3a0fddec7e

I've recently come across unpleasant gotchas with "goold old" os.path that are not an issue with pathlib myself recently (related to PATHLIB/venvs too), but I'd rather get this merged sooner rather than later, I think it's good enough as-is, kudos!

@BenBE
Copy link
Copy Markdown
Contributor

BenBE commented Oct 13, 2024

As written in my TL;DR I think it's fine for merging; mostly delayed the full approval to get some feedback on the outstanding findings.

I think, what still should be included (as these items are easy to handle) are the first three of my list (the AUTHORS file missing one member from the team AFAICS, and the (maintainer) remark on my line; the cmsis_svd_specs vs specs subdir suggestion; the two additional SVD spec revisions 1.3.4 and 1.3.8). Also the two minor edits for the README should be easy. You could argue regarding the sub-package for the serializer classes, because those affect the API, but as long as we just merge the PR but hold off with the actual 1.0 release, we can still do this until we actually release 1.0.

Everything else from my review is cosmetic and can be done later.

On the subject of os.path vs pathlib: Let's handle this in a separate PR. That issue has been there even with the old code, thus no need to hurry. We should fix it for the actual release though … ;-)

@VincentDary
Copy link
Copy Markdown
Member Author

Thank you @ckudera, @BenBE and @brainstorm for your quick review.

@ckudera, About unit test, I know, there are very slow, but since all svd file are parsed and now serialized, it's even slower, I don't have any workaround, but this new code base move parsing/serializing of all files in test_parser2serializers.py which can be skipped if you specify test to nose2. About the PEP585, I think this is a very good hint, but since this not completely supported by all Python versions and Mypy, I keep this topic for future factoring and Python version support. Also, thank you very much for your quick proposal to fix the path traversal vulnerability.

@brainstorm, as @BenBE said, I left os.path vs pathlib for a separated PR, thanks for the hint.

@BenBE, Thank you very much for your feedbacks, particularly about schema version handling, and for the path traversal vulnerability.

@VincentDary
Copy link
Copy Markdown
Member Author

VincentDary commented Oct 13, 2024

This PR will be merged 2024/10/14 at 12h00 (Paris, France) if no additional comments.

@BenBE
Copy link
Copy Markdown
Contributor

BenBE commented Oct 13, 2024

Minor bug in _get_latest_schema_version:
The routine didn't actually raise the exception.
(Found while checking if the version compare worked properly).

Also did some streamlining of that function while at it (second part of the patch).

diff --git a/python/cmsis_svd/parser.py b/python/cmsis_svd/parser.py
index 9644a9b7..da259c4b 100644
--- a/python/cmsis_svd/parser.py
+++ b/python/cmsis_svd/parser.py
@@ -255,23 +255,15 @@ class SVDParser:
     @staticmethod
     def _get_latest_schema_version() -> str:
         if len(SVD_SCHEMA_VERSIONS) == 0:
-            SVDParserValidationError('SVD schema versions not found.')
+            raise SVDParserValidationError('SVD schema versions not found.')
 
-        versions = []
-        for ver in SVD_SCHEMA_VERSIONS:
-            split_ver = [int(part) for part in ver.split('.')]
-            for idx in range(len(split_ver), 3):
-                split_ver.append(0)
-            versions.append(split_ver)
-
-        latest = versions[0]
-        for item_version in versions:
-            for i in range(3):
-                if item_version[i] > latest[i]:
-                    latest = item_version
-                    break
-
-        return '.'.join([str(n) for n in latest])
+        def parse_version(version):
+            return list(map(int, version.split('.')))
+
+        versions = list(map(parse_version, SVD_SCHEMA_VERSIONS))
+        latest = max(versions)
+
+        return '.'.join(map(str, latest))
 
     @classmethod
     def validate_xml_tree(

- Add CHANGELOG file.
- Add AUTHOR file.
- The banner licence of each file has been changed to 'Copyright 2015-2024 cmsis-svd Authors'
- Add CMSIS SVD XML Schema Definition files for versions:
  1.0, 1.1, 1.2, 1.3.1, 1.3.2, 1.3.3, 1.3.4, 1.3.5, 1.3.6, 1.3.7, 1.3.8, 1.3.9, 1.3.10, 1.3.11.
  Schema 1.3.11 is unofficial for the moment, but officially in review by the svd spec maintainers,
  and brings major fixes, see @ckudera PR in Open-CMSIS-Pack/svd-spec#14 .
- Add CMSIS SVD Python model XML serializer SVDXmlSerializer, which is a refactored
  and modified version of the SVD XML serializer from the SVDSuite project.
- CMSIS SVD Python model improvements:
  * CMSIS SVD spec support up to 1.3.11.
  * Model classes are now based on Python dataclass.
  * New SVD data types: SVDReadActionType, SVDModifiedWriteValuesType,
    SVDEnumUsageType, SVDEndianType, SVDDataTypeType, SVDAddressBlockUsageType,
    SVDAccessType, SVDSauAccessType, SVDProtectionType, SVDCPUNameType.
  * Add XML serializer methods: to_xml(), to_xml_file(), to_xml_node().
  * Add JSON serializer methods: to_json(), to_json_file().
  * Add new methods to access to peripherals, registers and fields:
    SVDDevice.get_peripherals(), SVDPeripheral.get_registers(), SVDRegister.get_fields()
  * Python type hint and PEP8 review.
- CMSIS SVD Python parser improvements:
  * Add method for standalone SVD XML validation via: validate_xml_tree(),
    validate_xml_file() and validate_xml_str().
  * Add flexible XML data validation to SVDParser.get_device() via xml_validation,
    schema_version and schema_version_detection parameters.
  * Add parsing for SVD XML device tag attributes: namespace and noNamespaceSchemaLocation
  * Python type hint and PEP8 review.
- CMSIS SVD Python:
  * Add pyproject.toml to avoid pip install deprecation warning.
  * Replace Python dev-requirements.txt file by a DEV 'extras_require' in setup.py.
  * Python README improvements.
  * Move Python 'tests' and 'exemples' directories to the root of the Python package.
- Fix regex SyntaxWarning message during bitfield range parsing.
  Reported by @brainstorm in cmsis-svd#185 (comment)
- Fix incompatibility with the CMSIS-SVD schema for enumeratedValues in fieldType.
  Reported by @gschwaer in cmsis-svd#185 (comment)
- Fix improper register field filtering based on 'reserved' field name.
  Reported by @BenBE in cmsis-svd#185 (comment)
- Fix for path traversal (CWE-22 and related) in SVDParser.for_packaged_svd()
  Reported by @BenBE in cmsis-svd#185 (review)
  Fix proposal by @ckudera in cmsis-svd#185 (comment)
@VincentDary
Copy link
Copy Markdown
Member Author

@BenBE thank you for this great patch :)

@VincentDary VincentDary merged commit 4a23b3b into cmsis-svd:main Oct 14, 2024
@VincentDary VincentDary changed the title version 1.0: new design, code factoring and improvements version 0.6: new design, code factoring and improvements Apr 9, 2025
VincentDary added a commit to firmware-crunch/fiit that referenced this pull request Sep 5, 2025
This patch updates the 'cmsis-svd' Python dependency to version 0.6.
Now, access to peripheral and register data structures is done via the
new API methods `get_peripherals()` and `get_registers()`, which allow
all elements to be collected regardless of the specific distribution of
these structures in the cmsis-svd structure tree. The design of the
`SVDLoader` class has been modified to remove unnecessary wrapper code,
which was written to handle future ways of loading a cmsis-svd file, but
abandoned for the moment.

Previously, device and register data structures were retrieved by
iterating over the Python data structure representing the cmsis-svd xml
file. Due to the different structure of the cmsis-svd file permitted by
the cmsis-svd file format, this could result in a partial collect of
peripherals and registers. This upgrade to the version 0.6 does not
completely solve the problem, and devices and/or registers may still be
missing if the cmsis-svd file is too complex, but this fix a number of
cases. Also, version 0.6 fix a high number of issues and provide new
features, for more details see the following commit
cmsis-svd/cmsis-svd@c8f4eb8.

This upgrade follows the merge of the long pending pull request
cmsis-svd/cmsis-svd#185 proposed by @VincentDary, the creator of this
project firmware-crunch/fiit.
antoniovazquezblanco pushed a commit to cmsis-svd/svd-schema that referenced this pull request Feb 6, 2026
- Add CHANGELOG file.
- Add AUTHOR file.
- The banner licence of each file has been changed to 'Copyright 2015-2024 cmsis-svd Authors'
- Add CMSIS SVD XML Schema Definition files for versions:
  1.0, 1.1, 1.2, 1.3.1, 1.3.2, 1.3.3, 1.3.4, 1.3.5, 1.3.6, 1.3.7, 1.3.8, 1.3.9, 1.3.10, 1.3.11.
  Schema 1.3.11 is unofficial for the moment, but officially in review by the svd spec maintainers,
  and brings major fixes, see @ckudera PR in Open-CMSIS-Pack/svd-spec#14 .
- Add CMSIS SVD Python model XML serializer SVDXmlSerializer, which is a refactored
  and modified version of the SVD XML serializer from the SVDSuite project.
- CMSIS SVD Python model improvements:
  * CMSIS SVD spec support up to 1.3.11.
  * Model classes are now based on Python dataclass.
  * New SVD data types: SVDReadActionType, SVDModifiedWriteValuesType,
    SVDEnumUsageType, SVDEndianType, SVDDataTypeType, SVDAddressBlockUsageType,
    SVDAccessType, SVDSauAccessType, SVDProtectionType, SVDCPUNameType.
  * Add XML serializer methods: to_xml(), to_xml_file(), to_xml_node().
  * Add JSON serializer methods: to_json(), to_json_file().
  * Add new methods to access to peripherals, registers and fields:
    SVDDevice.get_peripherals(), SVDPeripheral.get_registers(), SVDRegister.get_fields()
  * Python type hint and PEP8 review.
- CMSIS SVD Python parser improvements:
  * Add method for standalone SVD XML validation via: validate_xml_tree(),
    validate_xml_file() and validate_xml_str().
  * Add flexible XML data validation to SVDParser.get_device() via xml_validation,
    schema_version and schema_version_detection parameters.
  * Add parsing for SVD XML device tag attributes: namespace and noNamespaceSchemaLocation
  * Python type hint and PEP8 review.
- CMSIS SVD Python:
  * Add pyproject.toml to avoid pip install deprecation warning.
  * Replace Python dev-requirements.txt file by a DEV 'extras_require' in setup.py.
  * Python README improvements.
  * Move Python 'tests' and 'exemples' directories to the root of the Python package.
- Fix regex SyntaxWarning message during bitfield range parsing.
  Reported by @brainstorm in cmsis-svd/cmsis-svd#185 (comment)
- Fix incompatibility with the CMSIS-SVD schema for enumeratedValues in fieldType.
  Reported by @gschwaer in cmsis-svd/cmsis-svd#185 (comment)
- Fix improper register field filtering based on 'reserved' field name.
  Reported by @BenBE in cmsis-svd/cmsis-svd#185 (comment)
- Fix for path traversal (CWE-22 and related) in SVDParser.for_packaged_svd()
  Reported by @BenBE in cmsis-svd/cmsis-svd#185 (review)
  Fix proposal by @ckudera in cmsis-svd/cmsis-svd#185 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants