Towards hierarchical keywords#1950
Conversation
lenhard
left a comment
There was a problem hiding this comment.
A few things need to be addressed before merging (especially StringUtil and the inheritance of KeywordList).
| } | ||
|
|
||
| public BibDatabaseContext(Defaults defaults, DatabaseLocation location, String keywordSeparator) { | ||
| public BibDatabaseContext(Defaults defaults, DatabaseLocation location, Character keywordSeparator) { |
There was a problem hiding this comment.
Not that this is an error or anything the like. But there is hardly ever an advantage of using Character over String (except for illusive memory benefits). So what is the reason for this change and all the related changes in this PR?
There was a problem hiding this comment.
There were some issues related to strings as keyword separators. For example, the default is , (with space) but some places tried to split strings only at the comma (i.e. recognize apple,orange). The space was only used to reformat the list as apple, orange. So I decided to force that only one character can be used as a separator and the final space is always adapted upon writing.
There was a problem hiding this comment.
Ok, sounds reasonable. Let's hope that no one ever wants to use some weird multi-keyword separator.
| import java.util.Collection; | ||
|
|
||
| import net.sf.jabref.model.entry.EntryUtil; | ||
| import net.sf.jabref.model.strings.StringUtil; |
There was a problem hiding this comment.
So if I get this correctly, you merged StringUtil into EntryUtil? We have had discussions about this before and I am not entirely happy with it. StringUtil has a high potential of becoming a god class that can do everything and is used everywhere. Please let's be careful with this.
In the end, however, I also see no point in arguing endlessly over it, so it is ok for me for now. However, please not that on current master there is the class model.util.ModelStringUtil. There is absolutely no change that I would tolerate two string util classes in model, so you'll have to merge that one as well
There was a problem hiding this comment.
Yes we now have a ModelStringUtil class, which already moved some string-related methods to model. I would propose to merge everything together: logic.StringUtil, model.ModelStringUtil and model.EntryUtil, they all deal with the same thing. I will create a separate PR with this change (because it also makes merging easier).
There was a problem hiding this comment.
Ok, I understand. Do this in a separate PR, but please do not forget it :)
| } | ||
| } | ||
| for (String s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) { | ||
| for (Keyword s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) { |
There was a problem hiding this comment.
Nitpick: Rename s to keyword :)
| new ExplicitGroup(Localization.lang("Automatically created groups"), | ||
| GroupHierarchyType.INCLUDING, | ||
| Globals.prefs.getKeywordDelimiter())); | ||
| Set<String> hs; |
There was a problem hiding this comment.
Nitpick: Improve naming in the new code here.
| * Represents a list of keyword chains. | ||
| * For example, "Type > A, Type > B, Something else". | ||
| */ | ||
| public class KeywordList extends ArrayList<Keyword> { |
There was a problem hiding this comment.
I do not like that you extend the API class here. That way, KeywordList inherits all kinds of methods that it probably does not need and is prone to weird polymorphic misuses. Instead, KeyWordList should have an internal attribute of type ArrayList<Keyword> and just provide a limited subset of methods that are really needed to modify this attribute.
Also, the current implementation supports duplicates in the list. Is this reasonable for keywords? Wouldn't duplicate elimination make sense here?
There was a problem hiding this comment.
Composition over inheritance, noted!
I wasn't sure how to handle duplicates. There was some code which tried to replace a keyword exactly at the same position: a, b, c -> a, replace, c. This was not easily possible with Set, but just one line with List. Not sure that this was the right decision through. What is your opinion?
There was a problem hiding this comment.
Tough question, really. Personally, I cannot come up with a reason to have duplicate keywords, but users are weird, as we all know :) It is probably best if we just keep things as they were for the moment. Since the keywords were prior stored as a String, it was certainly possible to add duplicates. So let us keep this for now. My fear is that it may result in random bugs, where processing stops, because a keyword is found and another occurrence of the keyword is still in the list.
As a side note: You do not have to use a Set to avoid duplicates. You can also use a List internally and purge duplicates at the level of KeywordList.
There was a problem hiding this comment.
I now changed it so that no duplicates are left upon parsing a string to a KeywordList. I feel we get more bugs if the list may actually contain duplicates.
| import org.junit.Test; | ||
| import org.xml.sax.SAXException; | ||
|
|
||
| public class GvkParserTest { |
There was a problem hiding this comment.
I don't quite get why this class is introduced in this PR.
|
|
||
| import static org.junit.Assert.assertEquals; | ||
|
|
||
| public class KeywordListTest { |
There was a problem hiding this comment.
Please add tests that document the behavior in case of duplicate keywords.
629d6ad to
2665d83
Compare
* upstream/master: Implemented Integrity NoBibtexFieldChecker (#2059) Implemented title and camel key modifiers (#1894) Fix localization task hints (#2031) Result of syncLang.py update Result of syncLang.py update (with manual correction of captial_letters, and The_BibTeX_entry...) Fix "large capitals" to "capital letters" Updated Menu_tr.properties (#2057) Updated jabref_tr.properties (#2056) fix ignore version (#2055) Towards hierarchical keywords (#1950) Fix failing test, replace \uxxx encoded chars with proper UTF8 chars. Import Italian patch
* Small code cleanup in SpecialFieldsUtils * Refactor code related to keywords * Move StringUtil to model and remove EntryUtil * Add a few more tests * Change keyword delemiter in groups to Character * Optimize imports * Removed unused keyword separator in shareddb ui manager * Fix build errors * Fix failing architecture tests * Reformat imports * Small renamings * Move from Inheritance to composition in KeywordList * Fix tests * ArXiv accepts import format preferences instead of keyword delimiter * Fix binding * Fix arXiv tests
* Small code cleanup in SpecialFieldsUtils * Refactor code related to keywords * Move StringUtil to model and remove EntryUtil * Add a few more tests * Change keyword delemiter in groups to Character * Optimize imports * Removed unused keyword separator in shareddb ui manager * Fix build errors * Fix failing architecture tests * Reformat imports * Small renamings * Move from Inheritance to composition in KeywordList * Fix tests * ArXiv accepts import format preferences instead of keyword delimiter * Fix binding * Fix arXiv tests # Conflicts: # src/main/java/net/sf/jabref/gui/importer/fetcher/EntryFetchers.java
This PR is the first step towards supporting hierarchical keywords #628.
KeywordListandKeywordKeywordListaccordingly -> new PRNote: I also moved the StringUtil class to model (in some sense it is JabRef's own String class) and removed EntryUtil (only had methods related to strings (-> StringUtil) or keywords (-> KeywordList) ).