In PowerShell, child nodes and attributes of XML elements are accessible as properties:

([xml]"<book genre='novel'/>").Book.genre
novel

It is a Property, not a NoteProperty:

([xml]"<book genre='novel'/>").Book | Get-Member genre
   TypeName: System.Xml.XmlElement

Name  MemberType Definition
----  ---------- ----------
genre Property   string genre {get;set;}

The documentation says:

Attributes are properties of the element, ...

In C#, however, there are no such properties:

XmlDocument doc = new XmlDocument();
doc.LoadXml("<book genre='novel'/>");
// Console.WriteLine(doc.Book);  // Does not work
// Console.WriteLine(doc.DocumentElement.genre);  // Does not work
'XmlDocument' does not contain a definition for ...

Is it a PowerShell feature? Or is it a .NET feature I am using incorrectly?

What is the mechanism used to implement these properties?

1 Answer

tl;dr

  • You're seeing PowerShell's adaptation of the XML DOM (.NET type System.Xml.XmlDocument, available via type accelerator [xml] in PowerShell), which presents XML child elements and attributes as properties, allowing for convenient access to them via dot notation.

  • To see only the .NET type-native members of a given type's instance, pass -View Base to Get-Member; e.g.:

    # -View Base excludes the *adapted* and possibly other ETS properties.
    [xml] '<foo/>' | Get-Member -View Base
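
To make the effect of -View Base more concrete, here is a small sketch that compares the adapted and the type-native views of a single element. The sample document and the variable names are illustrative only.

```powershell
# Sample element with one attribute (illustrative input).
$el = ([xml] '<book genre="novel"/>').DocumentElement

# Member names as reported for the adapted view (includes XML-derived properties).
$adapted = (Get-Member -InputObject $el -View Adapted).Name

# Member names defined by the .NET type System.Xml.XmlElement itself.
$base = (Get-Member -InputObject $el -View Base).Name

$adapted -contains 'genre'   # expected: True  (attribute surfaced as a property)
$base    -contains 'genre'   # expected: False (no such type-native member)
$base    -contains 'Name'    # expected: True  (XmlElement's own .Name property)
```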
    

Background information:

PowerShell decorates the object hierarchy contained in System.Xml.XmlDocument instances (created with cast [xml], for instance):

  • with properties named for the input document's specific elements and attributes[1] at every level; e.g.:

     ([xml] '<foo><bar>baz</bar></foo>').foo.bar # -> 'baz'
     ([xml] '<foo><bar id="1" /></foo>').foo.bar.id # -> '1'
    
  • turning multiple elements of the same name at a given hierarchy level implicitly into arrays (specifically, of type [object[]]); e.g.:

     ([xml] '<foo><C>one</C><C>two</C></foo>').foo.C[1] # -> 'two'
    

As the examples (and your own code in the question) show, this allows for access via convenient dot notation.
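
For comparison with the C# snippet in the question, here is a sketch of the same lookups done once with dot notation and once with type-native members only. The latter are the ones you could also call from C#. The sample document is made up for illustration.

```powershell
$doc = [xml] '<book genre="novel"><title>Emma</title></book>'

# Adapted (PowerShell-only) dot notation.
$genreViaDot = $doc.book.genre
$titleViaDot = $doc.book.title

# Type-native equivalents: these members exist on the .NET types themselves.
$genreViaNative = $doc.DocumentElement.GetAttribute('genre')
$titleViaNative = $doc.DocumentElement['title'].InnerText

$genreViaDot, $genreViaNative   # -> 'novel', 'novel'
$titleViaDot, $titleViaNative   # -> 'Emma', 'Emma'
```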

Note:

  • If you use dot notation to target an element that has at least one attribute and/or child elements, the element itself is returned (an XmlElement instance); otherwise, it is the element's text content; e.g.:

    # The <bar> element's *text content* is returned, as a [string] ('baz'),
    # because it has only a text child node and no attributes
    ([xml] '<foo><bar>baz</bar></foo>').foo.bar
    
    # The <bar> element is returned as an XmlElement instance,
    # because it has an *attribute*.
    ([xml] '<foo><bar id="1">baz</bar></foo>').foo.bar
    
    # The <bar> element is returned as an XmlElement instance,
    # because it has *child elements*.
    ([xml] '<foo><bar><baz>quux</baz></bar></foo>').foo.bar
    
  • Updating XML documents via dot notation is limited to simple, non-structural changes; the difference above comes into play:

    # OK - direct updating of the text content of a simple
    # element (no child nodes, no attributes).
    # $xml.foo.bar then yields 'new'
    ($xml = [xml] '<foo><bar>baz</bar></foo>').foo.bar = 'new'
    
    # OK - direct updating of the attribute of an
    # element.
    #  $xml.foo.bar.id then yields '2'
    ($xml = [xml] '<foo><bar id="1">baz</bar></foo>').foo.bar.id = 2
    
    # !! FAILS - because <bar> isn't a simple element in this case,
    # !! due to the presence of an *attribute*, you cannot directly assign new text content.
    # !! -> Error "Cannot set "bar" because only strings can be used as values to set XmlNode properties."
    ($xml = [xml] '<foo><bar id="1">baz</bar></foo>').foo.bar = 'new'
    
    # OK - assign to the type-native .InnerText property
    #  $xml.foo.bar.InnerText then yields 'new'
    ($xml = [xml] '<foo><bar id="1">baz</bar></foo>').foo.bar.InnerText = 'new'
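
Structural changes, such as adding an element or an attribute, are beyond what dot notation can do, so they require the type-native DOM methods. A minimal sketch, with made-up element and attribute names (qux, id):

```powershell
$xml = [xml] '<foo><bar>baz</bar></foo>'

# Create a new element owned by the document, then attach it to <foo>.
$new = $xml.CreateElement('qux')
$new.InnerText = 'new'
$null = $xml.foo.AppendChild($new)   # AppendChild returns the node; discard it

# Add an attribute that did not exist before.
$xml.foo.SetAttribute('id', '1')

$xml.OuterXml   # -> <foo id="1"><bar>baz</bar><qux>new</qux></foo>
```

Once added, the new element and attribute are again reachable via dot notation ($xml.foo.qux, $xml.foo.id).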
    

The downside of dot notation is that there can be name collisions, if an incidental input-XML element name happens to be the same as either an intrinsic [System.Xml.XmlElement] property name (for single-element properties), or an intrinsic [Array] property name (for array-valued properties; [System.Object[]] derives from [Array]).

In the event of a name collision, the outcome depends on what the property being accessed contains:

  • a single child element ([System.Xml.XmlElement]), the incidental properties, i.e. the adapted ones, win; e.g.:

    # -> 'foo': i.e. the <foo> element's own name, using
    # XmlElement's type-native .Name property.
    ([xml] '<foo><Bar>bar</Bar></foo>').foo.Name
    
    # -> !! 'bar': That is, the *adapted* .Name property - i.e. the
    #    !! child element whose name happened to be "Name" takes precedence.
    ([xml] '<foo><Name>bar</Name></foo>').foo.Name
    
    • The workaround to get predictable access to the type-native properties is to call the underlying property accessor method, .get_<propertyName>(), directly:

      # -> 'foo', thanks to .get_Name() workaround
      ([xml] '<foo><Name>bar</Name></foo>').foo.get_Name()
      
      • An alternative is to use the intrinsic psbase property:

        # -> 'foo', thanks to .psbase workaround
        ([xml] '<foo><Name>bar</Name></foo>').foo.psbase.Name
        
  • an array of child elements, the [Array] type's properties win.

    • Therefore, the following element names break dot notation with array-valued properties (obtained with the reflection command
      Get-Member -InputObject 1, 2 -Type Properties, ParameterizedProperty):

      Item Count IsFixedSize IsReadOnly IsSynchronized Length LongLength Rank SyncRoot
      
      • For example, trying to use member-access enumeration to get all item attribute values across all <bar> elements:

        # !! Outputs the definition of the parameterized .Item property
        # !! of type [Array], 
        # !! *not* the values of the "Item" attributes of the <bar> child elements.
        ([xml] '<foo><bar item="one" /><bar item="two" /></foo>').foo.bar.item
        
      • The workaround is to use explicit enumeration of array-valued properties, e.g. via the intrinsic .ForEach() method:

        # -> 'one', 'two'
        ([xml] '<foo><bar item="one" /><bar item="two" /></foo>').foo.bar.ForEach('item')
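
If you would rather bypass the adapted properties altogether in collision-prone cases, one option is to enumerate explicitly and call a type-native method such as .GetAttribute(), which is unaffected by name collisions. A sketch using the same sample document (note that .GetAttribute() is case-sensitive):

```powershell
$bars = ([xml] '<foo><bar item="one" /><bar item="two" /></foo>').foo.bar

# Explicit enumeration plus a type-native method call per element.
$values = foreach ($bar in $bars) { $bar.GetAttribute('item') }

$values   # -> 'one', 'two'
```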
        

Dot notation is invariably case-insensitive, unlike XML itself. This is an inevitable consequence of representing elements and attributes as properties, given that property access in PowerShell is generally case-insensitive:

 # Dot notation: case-INSENSITIVE
 # -> 'bar', despite the case mismatch
 ([xml] '<FOO>bar</FOO>').foo
 # -> 'BAR', 'bar', i.e. *both* elements that match case-insensitively
 ([xml] '<root><FOO>BAR</FOO><foo>bar</foo></root>').root.foo

 # Type-native XML method: case-SENSITIVE
 # -> NO output, due to the case mismatch
 ([xml] '<FOO>bar</FOO>').SelectSingleNode('foo')

Dot notation ignores XML namespaces - unlike XML-native functionality:

 # Dot notation: Namespaces are ignored.
 # -> the <foo> element (as a whole, because it has attributes)
 #    despite not specifying the namespace.
 ([xml] '<ns1:foo xmlns:ns1="https://example.org">bar</ns1:foo>').foo

 # Type-native XML methods: Explicit namespace handling required:
 # -> No output
 ([xml] '<ns1:foo xmlns:ns1="https://example.org">bar</ns1:foo>')['foo']

 # -> OK - explicit use of the namespace prefix;
 ([xml] '<ns1:foo xmlns:ns1="https://example.org">bar</ns1:foo>')['ns1:foo']

 # -> No output; see below.
 ([xml] '<ns1:foo xmlns:ns1="https://example.org">bar</ns1:foo>').SelectSingleNode('foo')
  • XPath queries with the type-native .SelectSingleNode() / .SelectNodes() methods require not only namespace prefixes in the query, but also a namespace manager, created beforehand, that maps the prefixes used in queries to their namespace URIs - see this answer for an example of this technique.
    The same applies analogously to the Select-Xml cmdlet, which uses XPath queries too - see this answer for an example.
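
As a rough sketch of the namespace-manager technique, applied to the sample document from above: the prefix n used in the queries is chosen freely here and need not match the prefix used in the document; only the namespace URI has to match.

```powershell
$xml = [xml] '<ns1:foo xmlns:ns1="https://example.org">bar</ns1:foo>'

# Map a self-chosen prefix ('n') to the document's namespace URI.
$nsm = [System.Xml.XmlNamespaceManager]::new($xml.NameTable)
$nsm.AddNamespace('n', 'https://example.org')

# Pass the manager as the second argument to the XPath method.
$node = $xml.SelectSingleNode('/n:foo', $nsm)
$node.InnerText   # -> 'bar'

# Select-Xml accepts the prefix-to-URI mapping as a hashtable instead.
$hit = Select-Xml -Xml $xml -XPath '/n:foo' -Namespace @{ n = 'https://example.org' }
$hit.Node.InnerText   # -> 'bar'
```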

[1] If a given element has both an attribute and an element by the same name, PowerShell reports both, as the elements of an array ([object[]]).

2 Comments

Superb and excellent comprehensive technical detail!
ETS. This is what it's called :) It definitely lacks an about_ topic. Thank you for this and the linked answers. Great examples of XML manipulation in PowerShell.
