Skip to content

Conversation

@abhiklodh
Copy link
Contributor

@abhiklodh abhiklodh commented Oct 21, 2022

📚 Context

Emojis show up as invalid if they are prefixed or postfixed with a variant selector character. This is a hidden character that is used as variant selector for emojis (more information here).

  • What kind of change does this PR introduce?

    • Bugfix
    • Feature
    • Refactoring
    • Other, please describe:

🧠 Description of Changes

  • Wrote a function clean_emoji to remove hidden unicode and return a sanitised emoji if possible.

  • Added logic to validate_emoji to handle emojis that have hidden unicode with the help of clean_emoji.

    • This is a breaking API change
    • This is a visible (user-facing) change

Revised:

st.success("This is a success message", icon="🚨") # Works fine
st.success("This is a success message", icon="️🚨") # Works fine
st.success("This is a success message", icon="\U0000FE0F\U0001F44D\U0001F3FD\U0000FE0F") # Works fine
st.success("This is a success message", icon="\U0000FE0F") # Throws an error
st.success("This is a success message", icon="\U0000FE0F\U0001F6A8\U0000FE0F") # Works fine
st.warning("This is a warning", icon="🚨") # Works fine
st.warning("This is a warning", icon="️🚨") # Works fine
st.error("This is an error", icon="🚨") # Works fine
st.error("This is an error", icon="️🚨") # Works fine
st.info("This is a purely informational message", icon="🚨") # Works fine
st.info("This is a purely informational message", icon="️🚨") # Works fine

Current:

st.success("This is a success message", icon="🚨") # Works fine
st.success("This is a success message", icon="️🚨") # Throws an error
st.success("This is a success message", icon="\U0000FE0F\U0001F44D\U0001F3FD\U0000FE0F") # Throws an error
st.success("This is a success message", icon="\U0000FE0F") # Throws an error
st.success("This is a success message", icon="\U0000FE0F\U0001F6A8\U0000FE0F") # Throws an error
st.warning("This is a warning", icon="🚨") # Works fine
st.warning("This is a warning", icon="️🚨") # Throws an error
st.error("This is an error", icon="🚨") # Works fine
st.error("This is an error", icon="️🚨") # Throws an error
st.info("This is a purely informational message", icon="🚨") # Works fine
st.info("This is a purely informational message", icon="️🚨") # Throws an error

🧪 Testing Done

  • Screenshots included
  • Added/Updated unit tests
  • Added/Updated e2e tests

🌐 References

Does this depend on other work, documents, or tickets?


Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@kajarenc
Copy link
Collaborator

Hello @abhiklodh, thanks for your contribution

I think we should not modify the emoji, and only "normalize" it for is_emoji check to keep the selector.

This comment could be helpful for you in case you want to work on this #5582 (comment)

@abhiklodh
Copy link
Contributor Author

Hello @abhiklodh, thanks for your contribution

I think we should not modify the emoji, and only "normalize" it for is_emoji check to keep the selector.

This comment could be helpful for you in case you want to work on this #5582 (comment)

Hey @kajarenc!
Thanks for the cleaner suggestion. I pushed my commit implementing the same and it works as intended.

@kajarenc
Copy link
Collaborator

kajarenc commented Nov 8, 2022

Hello @abhiklodh, thanks for your contribution
I think we should not modify the emoji, and only "normalize" it for is_emoji check to keep the selector.
This comment could be helpful for you in case you want to work on this #5582 (comment)

Hey @kajarenc! Thanks for the cleaner suggestion. I pushed my commit implementing the same and it works as intended.

@abhiklodh could you please fix formatter error, and commit again to rerun CI:

 def is_emoji(text: str) -> bool:
     """Check if input string is a valid emoji."""
-    return (text in ALL_EMOJIS) or ("\U0000FE0F" in text and text.replace("\U0000FE0F", "") in ALL_EMOJIS)
+    return (text in ALL_EMOJIS) or (
+        "\U0000FE0F" in text and text.replace("\U0000FE0F", "") in ALL_EMOJIS
+    )

Copy link
Collaborator

@vdonato vdonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @abhiklodh,

Sorry for the slow turnaround time on reviews for this PR!

This change LGTM, but I have one small change that I suggested along with a request that a couple unit tests get added to verify that these changes work as expected.

There are already some tests for this function that live here:


so adding another 2-3 cases to the parameterized list should be sufficient (copying some of the examples listed in the PR description should already be enough).

Once that's done, this PR should be good to go

@abhiklodh
Copy link
Contributor Author

Hey @vdonato
I added three test cases for the unit tests.
https://github.com/abhiklodh/streamlit/blob/736b1093b5bf7f20c54a977d104c4a7c5ca28475/lib/tests/streamlit/string_util_test.py#L27-L41

And cleaned up this line too. https://github.com/abhiklodh/streamlit/blob/736b1093b5bf7f20c54a977d104c4a7c5ca28475/lib/streamlit/string_util.py#L44-L46
Let me know if there's anything else left to be done.

@vdonato vdonato added the security-assessment-completed Security assessment has been completed for PR label Nov 20, 2022
Copy link
Collaborator

@vdonato vdonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM

Thanks @abhiklodh!

@vdonato vdonato merged commit f739c18 into streamlit:develop Nov 20, 2022
tconkling added a commit to tconkling/streamlit that referenced this pull request Nov 21, 2022
# By Abhik Lodh (1) and others
# Via GitHub
* develop:
  memo_test/singleton_test: ensure ScriptRunCtx exists (streamlit#5739)
  Update loadApp command timeout (streamlit#5731)
  Allow emojis with variant selectors to be validated (streamlit#5583)

# Conflicts:
#	lib/tests/streamlit/runtime/caching/cache_data_api_test.py
#	lib/tests/streamlit/runtime/caching/cache_resource_api_test.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

security-assessment-completed Security assessment has been completed for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Emojis are not valid if they have a variant selector character attached

3 participants