Skip to content

Make vocabularies multilingual#600

Merged
osma merged 12 commits intomasterfrom
issue559-multilingual-vocabularies
Aug 8, 2022
Merged

Make vocabularies multilingual#600
osma merged 12 commits intomasterfrom
issue559-multilingual-vocabularies

Conversation

@osma
Copy link
Copy Markdown
Member

@osma osma commented Aug 4, 2022

This PR implements #559 - making vocabularies multilingual, so that there is no need to use separate language-specific vocabulary id's such as yso-fi, yso-sv and yso-en. Instead the vocabulary id yso can be used for all projects and the loadvoc command needs to be executed just once as it will detect which languages are available in the vocabulary and load the labels in all available languages.

It's also possible to define/override the language of labels, for example to use vocab=lcsh(en) in a Finnish language project.

The changes need to be carefully tested as they are quite disruptive. Documentation (including the Annif tutorial) should be updated, mainly by stripping language suffixes from vocabulary id's in examples. However, old examples (vocabulary id's with a language suffix) should still keep working.

@osma osma added this to the 0.59 milestone Aug 4, 2022
@osma osma self-assigned this Aug 4, 2022
@codecov
Copy link
Copy Markdown

codecov bot commented Aug 4, 2022

Codecov Report

Merging #600 (b258873) into master (b6a1363) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #600      +/-   ##
==========================================
- Coverage   99.54%   99.52%   -0.02%     
==========================================
  Files          86       86              
  Lines        5653     5695      +42     
==========================================
+ Hits         5627     5668      +41     
- Misses         26       27       +1     
Impacted Files Coverage Δ
annif/cli.py 99.63% <100.00%> (ø)
annif/corpus/skos.py 100.00% <100.00%> (ø)
annif/corpus/subject.py 100.00% <100.00%> (ø)
annif/project.py 99.39% <100.00%> (ø)
annif/vocab.py 95.50% <100.00%> (-0.15%) ⬇️
tests/conftest.py 100.00% <100.00%> (ø)
tests/test_cli.py 100.00% <100.00%> (ø)
tests/test_project.py 100.00% <100.00%> (ø)
tests/test_vocab.py 100.00% <100.00%> (ø)
tests/test_vocab_skos.py 100.00% <100.00%> (ø)
... and 1 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@osma osma force-pushed the issue559-multilingual-vocabularies branch from b1aa810 to 4d78bb5 Compare August 5, 2022 13:12
@osma
Copy link
Copy Markdown
Member Author

osma commented Aug 5, 2022

Rebased on current master (which now contains PR #597 that was the starting point of this branch) and force-pushed.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Aug 5, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 6 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@osma osma marked this pull request as ready for review August 5, 2022 15:23
@osma
Copy link
Copy Markdown
Member Author

osma commented Aug 5, 2022

This is more or less done now, pending review.

I tried to fix the issues reported by QA tools. Code Climate still complains about load_vocabulary, but I can't figure out how to make it better. Codecov says there's one more missed line than before in annif.vocab, but I can't find it in the detailed report.

@osma osma requested a review from juhoinkinen August 5, 2022 15:25
@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Aug 5, 2022

This pull request introduces 1 alert when merging b258873 into b6a1363 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

Copy link
Copy Markdown
Member

@juhoinkinen juhoinkinen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@osma
Copy link
Copy Markdown
Member Author

osma commented Aug 8, 2022

Thanks for reviewing @juhoinkinen ! I will merge this now, although we still need to do more testing before the next release, however there are other related changes to the vocabulary functionality coming up (possibly e.g. #602) and it makes sense to test them all in one go.

@osma osma merged commit 954a8d9 into master Aug 8, 2022
@osma osma deleted the issue559-multilingual-vocabularies branch August 8, 2022 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants