Fix Short DOI error #7191

Merged
calixtus merged 9 commits into JabRef:master from PremKolar:shortDOIexceptionsFix_issue7127
Dec 14, 2020
Conversation

@PremKolar

Contributor

Fixes #7127

In issue #6920, the detection of exact short DOIs was improved. This was relevant where short DOIs were misplaced under the wrong field of an entry. It did not affect the method public static Optional<DOI> findInText(String text), which looks for DOIs within text (not exact matches).
With this commit, I have carried the improvements over to the within-text detection. I had to make the detection more stringent to avoid misdetections in cases like bla bla 10:30 bla bla or bli blubb 10/12/2020 blibb blabb. Short DOIs within text will now only be detected if the substring contains doi or urn. For example, any of the following will be detected:

  • doi:10/12ab
  • /urn:doi:10/12ab
  • doi.org/10/1234
  • doi.org/ab123 (shortcut doi)
  • etc.

I also introduced the concept of shortcut DOIs (e.g. doi.org/xyz123) to the in-text detection.
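The rule described above can be illustrated with a minimal sketch. This is not the regex used in JabRef: the class name `ShortDoiInTextSketch`, the method `findShortDoiInText`, and the pattern itself are hypothetical, and normalizing a shortcut such as doi.org/ab123 to the form 10/ab123 is an assumption made for the example. It only shows the idea that a short DOI inside free text is accepted when it is introduced by `doi:` (which also covers `urn:doi:`) or `doi.org/`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShortDoiInTextSketch {

    // Hypothetical pattern for illustration, NOT JabRef's actual regex.
    // A prefix ("doi:" or "doi.org/") is required, so plain text such as
    // "10:30" or "10/12/2020" is never picked up.
    // The negative lookahead rejects regular DOIs like "doi:10.1007/...".
    private static final Pattern SHORT_DOI_IN_TEXT = Pattern.compile(
            "(?:doi:|doi\\.org/)(?:10/)?([a-z0-9]+)(?![a-z0-9/]|\\.[a-z0-9])",
            Pattern.CASE_INSENSITIVE);

    // Returns the short DOI in the form "10/<suffix>" (an assumed normalization).
    static Optional<String> findShortDoiInText(String text) {
        Matcher matcher = SHORT_DOI_IN_TEXT.matcher(text);
        if (matcher.find()) {
            return Optional.of("10/" + matcher.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findShortDoiInText("see doi:10/12ab for details"));
        System.out.println(findShortDoiInText("https://doi.org/ab123"));
        System.out.println(findShortDoiInText("bla bla 10:30 bla bla"));
        System.out.println(findShortDoiInText("bli blubb 10/12/2020 blibb blabb"));
    }
}
```

Requiring the prefix trades recall for precision: a bare 10/12ab in running text is no longer found, but times and dates are no longer mistaken for short DOIs.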

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

@Siedlerchr Siedlerchr changed the title from "#7127" to "Fix Short DOI error" Dec 14, 2020
dependabot Bot and others added 4 commits December 14, 2020 11:07
Bumps [unirest-java](https://github.com/Kong/unirest-java) from 3.11.05 to 3.11.06.
- [Release notes](https://github.com/Kong/unirest-java/releases)
- [Changelog](https://github.com/Kong/unirest-java/blob/main/CHANGELOG.md)
- [Commits](Kong/unirest-java@v3.11.05...v3.11.06)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [classgraph](https://github.com/classgraph/classgraph) from 4.8.93 to 4.8.94.
- [Release notes](https://github.com/classgraph/classgraph/releases)
- [Commits](classgraph/classgraph@classgraph-4.8.93...classgraph-4.8.94)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…5-r (JabRef#7187)

Bumps org.eclipse.jgit from 5.9.0.202009080501-r to 5.10.0.202012080955-r.

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix newly added entry not synced to db


Newly added entries have empty fields; don't update the field table to prevent SQL Exception
Fix shared entry not found by id
use left outer join for this

* fix checkstyle

* fix wording

* add tests for fix

* adjust test

@Siedlerchr Siedlerchr left a comment

Member

LGTM, please fix the checkstyle error
Error: eckstyle] [ERROR] /home/runner/work/jabref/jabref/src/test/java/org/jabref/model/entry/identifier/DOITest.java:260:5: 'METHOD_DEF' has more than 1 empty lines before. [EmptyLineSeparator]

Co-authored-by: Siedlerchr <siedlerkiller@gmail.com>
Co-authored-by: Tobias Diez <tobiasdiez@gmx.de>
@koppor

koppor commented Dec 14, 2020

Member

We're just having a (light) mob programming session and will take over. 😅

@koppor

koppor commented Dec 14, 2020

Member

GitHub shows too many changed files. Maybe this is due to database errors at GitHub - see master...PremKolar:shortDOIexceptionsFix_issue7127

@koppor

koppor commented Dec 14, 2020

Member

When opening the diff:

[image]

Wrong GitHub output:

[image]

@calixtus calixtus merged commit 3f4470f into JabRef:master Dec 14, 2020
@PremKolar

Contributor Author

Oh wow :) Thank you for doing all the hard work!! :)

@Siedlerchr

Member

@PremKolar We have to thank you for improving and fixing the regex. We just moved the tests to parameterized ones.

@koobs

koobs commented Dec 18, 2020


Thank you all 👍

@Siedlerchr

Member

@PremKolar As you recently adapted the DOI find-in-text detection: I just stumbled upon an issue that occurs when the method finds a DOI with a dot at the end.
This results in a "not found" error from doi.org:

doi:10.1007/s10549-018-4743-9.
It would be cool if you could add a replacement at the end

The following text is coming from the PDF content importer. It extracts the text content of the first page and then calls
DOI.findInText(firstPageContents);

HHS Public Access
Author manuscript
Breast Cancer Res Treat. Author manuscript; available in PMC 2019 July 01.

Published in final edited form as:
Breast Cancer Res Treat. 2018 July ; 170(1): 77–87. doi:10.1007/s10549-018-4743-9.

Acupuncture for breast cancer-related lymphedema: a 
randomized controlled trial

@PremKolar

Contributor Author

Hi, yes, I would love to look into it. I'm not sure it's the ., but we will see.
Maybe I'll find the time on the 25th 🙂

@Siedlerchr

Member

Thanks in advance! No hurry. Happy holidays!

@koobs

koobs commented Dec 24, 2020


I was going to report the same error, but I wanted to check whether DOIs could syntactically (validly) end in periods before I did so.
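
One possible shape of the "replacement at the end" requested above, as a rough sketch only: the class `TrailingDotSketch`, the method `findDoi`, and the simplified pattern are all hypothetical and are not JabRef's `DOI.findInText`. The sketch also assumes that a period directly after a DOI found in running text is sentence punctuation. Whether a DOI may validly end in a period is the open question raised above, so this is a heuristic, not a statement about DOI syntax.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TrailingDotSketch {

    // Simplified DOI pattern for illustration only (hypothetical, not JabRef's regex):
    // "10.", a registrant code of 4 to 9 digits, "/", then any non-whitespace suffix.
    private static final Pattern DOI_IN_TEXT = Pattern.compile("10\\.\\d{4,9}/\\S+");

    static Optional<String> findDoi(String text) {
        Matcher matcher = DOI_IN_TEXT.matcher(text);
        if (!matcher.find()) {
            return Optional.empty();
        }
        String candidate = matcher.group();
        // Assumption: trailing periods belong to the sentence, not to the DOI.
        while (candidate.endsWith(".")) {
            candidate = candidate.substring(0, candidate.length() - 1);
        }
        return Optional.of(candidate);
    }

    public static void main(String[] args) {
        String line = "Breast Cancer Res Treat. 2018 July ; 170(1): 77-87. "
                + "doi:10.1007/s10549-018-4743-9.";
        System.out.println(findDoi(line).orElse("no DOI found"));
    }
}
```

Stripping the period after matching keeps the pattern itself unchanged, so DOIs that contain periods inside the suffix are unaffected.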

Development

Successfully merging this pull request may close these issues.

Additional Short DOI fatal exception cases: java.lang.IllegalArgumentException: <string> is not a valid DOI/Short

5 participants