Skip to content

Term names affect cycle hash when terms are structurally equivalent, leading to equivalent cycles with different hashes, and possibly different cycles with the same hash #2787

@ChrisPenner

Description

@ChrisPenner

When reviewing updates to the hashing code I realized that the canonical ordering of component terms is based on the structure of those terms, but importantly, the canonical ordering ignores self-referential variables when it does this.

This works great in most cases, but I wondered, what if two terms were structurally equivalent and differed only in that they had variables referring to different terms within the same cycle?

Here I present 3 cycles, which should all receive the same component hash because they're all equivalent aside from term names;

However, cycle 2 is given a different hash.

As far as I can tell, the reason for this is that inner1 and inner2 are structurally equivalent terms, they differ only in that they refer to different terms within their own cycle.

This means that they'll both actually receive the SAME hash when attempting to establish a canonical term ordering, despite being different terms.

As a result, whether inner1 or inner2 comes first in the canonical ordering is dependent on which order they're passed to the hashComponent function, which as far as I can tell, corresponds to the alphabetic ordering of the term names.

So, when I add an a in front of ainnerX2, it moves it before innerX1 in the ordering, and thus comes first in the canonical ordering, which results in a different cycle hash for cycle 2.

-- Cycle 1
outer1 = 1 + !inner1
outer2 = 2 + !inner2

inner1 _ = 1 + !inner2 + outer1
inner2 _ = 1 + !inner1 + outer2


-- Cycle 2 (These are equivalent to Cycle 1, but get a different component hash!)
outerX1 = 1 + !innerX1
outerX2 = 2 + !ainnerX2

innerX1 _ = 1 + !ainnerX2 + outerX1
ainnerX2 _ = 1 + !innerX1 + outerX2

-- Cycle 3 (These are equivalent to Cycle 1 and are given equal hashes as expected)
outerY1 = 1 + !innerY1
outerY2 = 2 + !innerY2

innerY1 _ = 1 + !innerY2 + outerY1
innerY2 _ = 1 + !innerY1 + outerY2
.example1> names outer1

  Term
  Hash:   #vilkvev8kh.1c4
  Names:  outer1 outerY1

.example1> names outerX1

  Term
  Hash:   #c5t6op881o.3c4
  Names:  outerX1

This is a contrived case of course, but it shows that cycles which are the same can have different hashes, and I suspect (by manipulating this property in the reverse direction) that different cycles can have the same hash (I'll try to prove this with a transcript).

So we may need to come up with a different method of resolving order conflicts like this.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions