Skip to content

Tagging capability#1914

Merged
anton-abushkevich merged 23 commits intomasterfrom
asset_tags_pr
Aug 13, 2021
Merged

Tagging capability#1914
anton-abushkevich merged 23 commits intomasterfrom
asset_tags_pr

Conversation

@anton-abushkevich
Copy link
Contributor

@anton-abushkevich anton-abushkevich commented Aug 6, 2021

Description: #1917

Frontend PR: OHDSI/Atlas#2589

ssuvorov-fls and others added 20 commits April 26, 2021 19:07
tag entity to dto conversion was added
tag entity was added to assets
some refactorings were made
some refactorings were made
endpoints for assigning and unassigning of tags for cohorts was added
tag groups were added to result list
concept set checking was added
additional fields were added to entity
endpoints for proteced tafs were added
additional fields were added to tag entity
checking of adding tag to group was added
checking of assigning protected tags was added
persmissions were added
list of mandatory tag groups was added to checking
# Conflicts:
#	src/main/java/org/ohdsi/webapi/util/GenericExceptionMapper.java
@chrisknoll
Copy link
Collaborator

chrisknoll commented Aug 10, 2021

This PR does need a description to at least the Issue that describes the functionality, but just looking through sourcecode, there are a few questions:

Q1: what does refresh stat period do? If there are statistics related to tags, what are they?

Q2: What is the purpose of coupling 'checkers' with 'tags'? I'm referring to IRChecker, CohortChecker, etc. It feels strange to see seemignly independent functions (tagging and validation) being coupled together, but I could be misunderstanding the purpose of tags (my understanding being: tags were things you attach to entities, and I don't see a warning message an 'entity').

Q3: The presence of hashing will not impact the hashing/design hash generation of an entity, right? It would be odd to have a different cohort definition hash if tags were added/removed.

@ssuvorov-fls
Copy link
Contributor

Q1: It counts the number of assigning of each tag to the entity to allow user to understand how many times each tag was used
Q2: This check allows you to find groups in which at least one tag must be assigned
Q3: Cohort definition hash does not depend on the tags, and design hashing will work as expected

@alex-odysseus
Copy link
Contributor

@chrisknoll the description is here: #1917

@anton-abushkevich anton-abushkevich changed the title Tags Tagging capability Aug 11, 2021
@chrisknoll
Copy link
Collaborator

chrisknoll commented Aug 11, 2021

Should we be concerned that the tables don't follow naming conventions:

Correct (singluar form):

  • sec_role
  • sec_user
  • batch_job_execution
  • batch_job_exuection_context
  • cc_cohort
  • cc_param
  • cohort_definition

Possible issue (note plural form):

  • cohort_tags
  • tags
  • tag_groups
  • pathway_tags

There are minor exceptions to this:

  • batch_job_execution_params
  • cohort_definition_details

But the exceptions are far fewer than the norm.

@chrisknoll
Copy link
Collaborator

I've attempted to create a nested structure of tags but I don't see them on the UI, here's what I inserted:

INSERT INTO webapi.tags (name, show_group, icon, color, multi_selection, mandatory, allow_custom, description)
VALUES ('TA', true, 'fa fa-bullseye' , '#cbffcf', true, false, false, 'Therapeutic Area');

INSERT INTO webapi.tags (name, show_group, icon, color, multi_selection, mandatory, allow_custom, description)
VALUES ('Neruological', true, 'fa fa-bullseye' , '#cbffcf', true, false, false, 'Neruological Area');
INSERT INTO webapi.tags (name, show_group, icon, color, multi_selection, mandatory, allow_custom, description)
VALUES ('Immuno', true, 'fa fa-bullseye' , '#cbffcf', true, false, false, 'Immunology Area');

INSERT INTO webapi.tag_groups (group_id, tag_id)
SELECT tag_group.id, tag.id
FROM webapi.tags tag_group, webapi.tags tag
WHERE tag_group.name = 'TA' and tag.name in ('Neruological','Immuno');

INSERT INTO webapi.tags (name, description)
VALUES ('Depression', 'Depression');

INSERT INTO webapi.tag_groups (group_id, tag_id)
SELECT tag_group.id, tag.id
FROM webapi.tags tag_group, webapi.tags tag
WHERE tag_group.name = 'Neruological' and tag.name in ('Depression');


INSERT INTO webapi.tags (name, description)
VALUES ('Psoriaisis', 'Psoriaisis');

INSERT INTO webapi.tag_groups (group_id, tag_id)
SELECT tag_group.id, tag.id
FROM webapi.tags tag_group, webapi.tags tag
WHERE tag_group.name = 'Immuno' and tag.name in ('Psoriaisis');

The structure is:

TA
  - Neurological
      - Depression
  - Immuno
      - Psoriasis

On the front end, we see:

image

I'm not actually sure how you're supposed to navigate through a hierarchy of tags.

@alex-odysseus
Copy link
Contributor

Chris, from the description "parent group ID - in case of necessity Tag Groups can be nested; there are no limitations how deep are the nested levels (for the initial implementation the nested level is 1); if the value is omitted (NULL or a specific constant to be defined) it is a top-level Tag Group"

This is crucial for the current implementation - "for the initial implementation the nested level is 1" @chrisknoll

@chrisknoll
Copy link
Collaborator

OIk, Fair enough, I didn't realize what that was referring to when I read it.

Personally, I think the idea behind nested groups should be discarded: by my thinking, the hierarchy is going to be difficult to manage (considering there's no restriction on 1 group having multiple parents), harder to understand (same reason), showing them in the UI is going to be difficult, etc.

What I love about what is implemented so far is the notion of 'tag groups' where a group becomes a column of the entity talb,e and facet filters are provided for each group. I think that's very clever, and very useful. I haven't checked, but I haven't tried to create a tag that doesn't belong to any group at all (I'm assuming that is supported? since tags and groups are defined in the same table). If we ditch the notion of hierarchies, then tags are just things that can be grouped together, making a UI to manage them very straight forward (just a form with name-value pairs and the collection of groups that the tag is assigned to).

If ditching hierarchies can be the direction, then maybe we can think of 'tag groups' separately from 'tags' and create the model from that. as it stands now, everything is a tag, which can have a parent-child relationship in tag_groups.

I don't mean to suggest any major changes to the architecture (although, I know that's how it might sound) but I have some worry about functionality that may never be realized or become overcomplicated to manage.

I guess I need to hear from those folks who had their hopes set on tag hierarcies and how they would be used in practice. I attempted to show a use case in my own SQL examples above, but anything more complicated than what I tried to do would really become difficult to organize in a larger scale implementation.

@chrisknoll
Copy link
Collaborator

chrisknoll commented Aug 12, 2021

@alondhe do you have any comments on the function of hierarchical tags? I think this subject came up in our meeting, but I thought we opted for a simple approach. Did you have a vision as to how managing a hierarchy of tags might be managed, from a UI perspective? or how you would show the hierarchy of a tag once it is assigned to an asset?

In contrast to that, the way 'members of a group' is shown as a column in the entity list (like cohort definition list). That's cool and nice. And you can also say that only 1 member of a tag group can be assigned at a time (which is cool for a type of tag that is for 'status' where you can only either be 'draft' or 'finalized'..but not both). So, I like those but I can't quite picture how nested tags would behave, and wanted to get your input on that.

Copy link
Collaborator

@chrisknoll chrisknoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eerything seems to be working (tho I did not look too closely at the logic behind the tag statistics refresh). Only thing I would request changes to is the naming conventions around the table names: tag_groups -> tag_group, tags -> tag, tags_seq -> tag_seq, etc.

Also, I noticed that certain tests were updated, but I don't think I noticed any tag-specific tests being implemented for tags, such as confirming that the individual tag color is overridden if it is provided for the specific tag vs. group tag (for example).

Will there be test cases?

@ssuvorov-fls
Copy link
Contributor

@chrisknoll
Tables were renamed in additional migration
Tests were added

@alondhe
Copy link
Contributor

alondhe commented Aug 12, 2021

Hi @chrisknoll - I agree, we didn't aim to make a multi-level (nested) hierarchy with this. It's tag group -> tag value, and that's all. We considered the idea of allowing something deeper, like TA -> TA Value -> TA sub-value, but felt it to be too complicated for the UX.

Copy link
Collaborator

@chrisknoll chrisknoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that the sequence was renamed too?

@chrisknoll
Copy link
Collaborator

Thanks @alondhe for the info. So, I don't want to throw a wrench in anything since this current impl works provided you do not try to nest groups (you can insert the data, but it won't appear in the UI). Maybe we can think about a refactoring in a later relase where we have defiend Tag and TagGroup elements that have associations with each other, and that will (I think) make the object model more clear about what's involved in tags. What has been implemented is a great acocmplishment, and I don't want to lose sight on that, so I do want to express gratitude for the efforts: Thanks to all who did pull this together, it will be very helpful and useful to use.

Once these last minor details about naming is resolved we can merge this in.

@alondhe
Copy link
Contributor

alondhe commented Aug 12, 2021

Awesome, thanks @chrisknoll!

@ssuvorov-fls
Copy link
Contributor

Can you confirm that the sequence was renamed too?

Yes

@anton-abushkevich anton-abushkevich merged commit a4737b7 into master Aug 13, 2021
@delete-merged-branch delete-merged-branch bot deleted the asset_tags_pr branch August 13, 2021 09:26
m0nhawk pushed a commit to uc-cdis/WebAPI that referenced this pull request Nov 1, 2021
* initial script for postgresql was added
tag entity to dto conversion was added
tag entity was added to assets

* index were added to script
some refactorings were made

* index were added to script
some refactorings were made

* count of assigned tags was added
endpoints for assigning and unassigning of tags for cohorts was added

* Assets Tags implementation

* added search endpoint
tag groups were added to result list

* endpoints for assigning and unassigning of tags for assets were added

* mandatory tags checking was added
concept set checking was added
additional fields were added to entity

* tests were fixed
endpoints for proteced tafs were added
additional fields were added to tag entity
checking of adding tag to group was added
checking of assigning protected tags was added
persmissions were added

* Assets Tags implementation

* description was added to tag
list of mandatory tag groups was added to checking

* Design was changed to "Tags"

* Design was changed to "Tags"

* removed unused function from AbstractDaoService

* removed unused sql statements
some refactorings

* Assets Tags i18n

* Assets Tags implementation - tag reassign confirmation added

* Master merge - compilation fix

* Fix - enable tests

* tables with tags were renamed to singular form

* tests were added

* sequence for tags was renamed to singular form

Co-authored-by: Sergey Suvorov <sergey.suvorov@odysseusinc.com>
@chrisknoll
Copy link
Collaborator

chrisknoll commented Dec 25, 2021

There was an oversight when I approved this change, resulting in a backwards-compatability error:

This change changes the signature of the object from CohortExpression to a CohortDTO. These objects are not JSON compatable (one is a raw circe expresison, the other acts like a 'wrapper' where you have the expression and expressionType).

This type of backwards compatibility change should not have been let in. We should try to understand how to revert this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants