
shed_diff smart diff not hiding dependency changes (changeset_revision etc) #205

@peterjc


The help text suggests that the smart/magic diff is the default:

$ planemo shed_diff --help
...
  --raw                      Do not attempt smart diff of XML to filter out
                             attributes populated by the Tool Shed.
...

However, this isn't being smart enough:

$ planemo shed_diff --shed_target testtoolshed ~/repositories/pico_galaxy/tools/sample_seqs/
wget -q --recursive -O - 'https://testtoolshed.g2.bx.psu.edu/repository/download?repository_id=de64f62614881770&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_i5xHt_/_testtoolshed_ --strip-components 1
mkdir "/tmp/tool_shed_diff_i5xHt_/_local_"; tar -xzf "/tmp/tmpS0q0pC" -C "/tmp/tool_shed_diff_i5xHt_/_local_"; rm -rf /tmp/tmpS0q0pC
cd "/tmp/tool_shed_diff_i5xHt_"; diff -r _local_ _testtoolshed_
diff -r _local_/tools/sample_seqs/tool_dependencies.xml _testtoolshed_/tools/sample_seqs/tool_dependencies.xml
4c4
<         <repository name="package_biopython_1_65" owner="biopython" />
---
>         <repository changeset_revision="f8d72690eeae" name="package_biopython_1_65" owner="biopython" toolshed="https://testtoolshed.g2.bx.psu.edu" />

Apart from the temporary directory names, this is identical to the raw diff output:

$ planemo shed_diff --shed_target testtoolshed ~/repositories/pico_galaxy/tools/sample_seqs/ --raw
wget -q --recursive -O - 'https://testtoolshed.g2.bx.psu.edu/repository/download?repository_id=de64f62614881770&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_VGHJk0/_testtoolshed_ --strip-components 1
mkdir "/tmp/tool_shed_diff_VGHJk0/_local_"; tar -xzf "/tmp/tmpBmc0Ck" -C "/tmp/tool_shed_diff_VGHJk0/_local_"; rm -rf /tmp/tmpBmc0Ck
cd "/tmp/tool_shed_diff_VGHJk0"; diff -r _local_ _testtoolshed_
diff -r _local_/tools/sample_seqs/tool_dependencies.xml _testtoolshed_/tools/sample_seqs/tool_dependencies.xml
4c4
<         <repository name="package_biopython_1_65" owner="biopython" />
---
>         <repository changeset_revision="f8d72690eeae" name="package_biopython_1_65" owner="biopython" toolshed="https://testtoolshed.g2.bx.psu.edu" />

I think the problem is the assumption in planemo/shed/diff.py that the special files tool_dependencies.xml and repository_dependencies.xml will only be at the tar-ball root:

def diff_and_remove(working, label_a, label_b, f):
    a_deps = os.path.join(working, label_a, "tool_dependencies.xml")
    b_deps = os.path.join(working, label_b, "tool_dependencies.xml")
    a_repos = os.path.join(working, label_a, "repository_dependencies.xml")
    b_repos = os.path.join(working, label_b, "repository_dependencies.xml")
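One possible direction, offered as a minimal sketch and not as planemo's actual fix: walk each extracted tree and collect every file with one of the two special names, so that copies nested under subdirectories such as `tools/sample_seqs/` are found too. The helper name `find_special_files` and the constant `SPECIAL_NAMES` are hypothetical and do not exist in planemo.

```python
import os

# The two file names that diff.py currently treats specially.
SPECIAL_NAMES = ("tool_dependencies.xml", "repository_dependencies.xml")


def find_special_files(root):
    """Return sorted paths, relative to root, of dependency files at any depth."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in SPECIAL_NAMES:
                full_path = os.path.join(dirpath, name)
                found.append(os.path.relpath(full_path, root))
    return sorted(found)
```

Calling this on both `os.path.join(working, label_a)` and `os.path.join(working, label_b)` would give the relative paths to compare with the smart diff, instead of assuming they sit at the tarball root.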

Also, as a possible enhancement, should the smart diff account for the Tool Shed reordering attributes? That behaviour prompted me to make this change (while testing with an older version of planemo): peterjc/pico_galaxy@698859a
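To illustrate what an order-insensitive comparison could look like, here is a rough sketch using only the standard library. It assumes that `changeset_revision` and `toolshed` are the attributes populated by the Tool Shed, which is inferred solely from the diff output above; the names `POPULATED_ATTRIBUTES`, `normalized` and `xml_equivalent` are hypothetical and not part of planemo.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the attributes the Tool Shed fills in, judging only
# from the diff output shown in this issue.
POPULATED_ATTRIBUTES = ("changeset_revision", "toolshed")


def normalized(element):
    """Describe an element as a nested tuple, ignoring populated attributes
    and the order in which the remaining attributes were written."""
    attrs = tuple(sorted(
        (key, value)
        for key, value in element.attrib.items()
        if key not in POPULATED_ATTRIBUTES
    ))
    text = (element.text or "").strip()
    children = tuple(normalized(child) for child in element)
    return (element.tag, attrs, text, children)


def xml_equivalent(xml_a, xml_b):
    """Return True if two XML strings match after normalization."""
    return normalized(ET.fromstring(xml_a)) == normalized(ET.fromstring(xml_b))
```

With this approach the two `<repository ... />` lines from the diff above would compare as equal, and so would elements that differ only in attribute order. Child element order is still treated as significant.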
