
[tune](deps): Bump tokenizers from 0.8.1.rc2 to 0.10.1 in /python/requirements #8

Closed
dependabot[bot] wants to merge 1 commit into master from dependabot/pip/python/requirements/tokenizers-0.10.1

Conversation


dependabot[bot] commented on behalf of GitHub on Feb 13, 2021

Bumps tokenizers from 0.8.1.rc2 to 0.10.1.

Release notes

Sourced from tokenizers's releases.

Rust v0.10.1

Fixed

  • #226: Fix the word indexes when there are special tokens

Python v0.10.1

Fixed

  • #616: Fix SentencePiece tokenizers conversion
  • #617: Fix offsets produced by Precompiled Normalizer (used by tokenizers converted from SPM)
  • #618: Fix Normalizer.normalize with PyNormalizedStringRefMut
  • #620: Fix serialization/deserialization for overlapping models
  • #621: Fix ByteLevel instantiation from a previously saved state (using __getstate__())

Rust v0.10.0

Changed

  • #222: All Tokenizer's subparts must now be Send + Sync

Added

  • #208: Ability to retrieve the vocabulary from the Tokenizer & Model

Fixed

  • #205: Trim the decoded string in BPEDecoder
  • [b770f36]: Fix a bug with added tokens generated IDs

Python v0.10.0

Added

  • #508: Add a Visualizer for notebooks to help understand how the tokenizers work
  • #519: Add a WordLevelTrainer used to train a WordLevel model
  • #533: Add support for conda builds
  • #542: Add Split pre-tokenizer to easily split using a pattern
  • #544: Ability to train from memory. This also improves the integration with datasets
  • #590: Add getters/setters for components on BaseTokenizer
  • #574: Add fuse_unk option to SentencePieceBPETokenizer

Changed

  • #509: Automatically stubbing the .pyi files
  • #519: Each Model can return its associated Trainer with get_trainer()
  • #530: The various attributes on each component can be read and set (e.g. tokenizer.model.dropout = 0.1)
  • #538: The API Reference has been improved and is now up-to-date.

Fixed

  • #519: During training, the Model is now trained in-place. This fixes several bugs that forced reloading the Model after training.
  • #539: Fix BaseTokenizer enable_truncation docstring

Python v0.10.0rc1

Added

  • #508: Add a Visualizer for notebooks to help understand how the tokenizers work
  • #519: Add a WordLevelTrainer used to train a WordLevel model
  • #533: Add support for conda builds

... (truncated)

Commits
  • af66d6f Rust - Bump to 0.10.1 for release
  • f9c76b6 Python - Use PyO3 0.9.2 (#227)
  • a6c33f5 Python - update some dependencies
  • d6326a6 Python - Use PyO3 0.9.2
  • bd18df0 Word indexes are None for special tokens (#226)
  • 3ad1360 Word indices are None for special tokens
  • e7949fc Python - Fix build for windows 32-bit (#224)
  • 1b9ead7 Python - Try PyO3 master to fix build
  • 33681fa Python - Check it builds for windows 32
  • b8daeae Python - Force PyO3 to 0.9.0 for now
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

Bumps [tokenizers](https://github.com/huggingface/tokenizers) from 0.8.1.rc2 to 0.10.1.
- [Release notes](https://github.com/huggingface/tokenizers/releases)
- [Commits](https://github.com/huggingface/tokenizers/compare/python-v0.8.1.rc2...rust-v0.10.1)

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies label (Pull requests that update a dependency file) on Feb 13, 2021

dependabot[bot] commented on behalf of GitHub on Apr 10, 2021

Superseded by #11.

dependabot[bot] closed this on Apr 10, 2021
dependabot[bot] deleted the dependabot/pip/python/requirements/tokenizers-0.10.1 branch on Apr 10, 2021 at 07:06
architkulkarni pushed a commit that referenced this pull request Jul 27, 2022
We encountered SIGSEGV when running Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0  0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) ()
   from /lib64/libstdc++.so.6
#1  0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2  0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3  0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4  0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] ()
   from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5  0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6  0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>)
    at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when loading `_raylet.so`, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`).

It's better not to rely on the initialization order of static variables in different compilation units, because that order is not guaranteed. I propose changing all `RAY_LOG` calls in `DelayManager::Init()` to `std::cerr`.

The crash happens in Ant's internal codebase. Not sure why this test case passes in the community version though.

BTW, I've tried different approaches:

1. Use a static local variable in `get_delay_us` and remove the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Define the global variable as a `std::unique_ptr<DelayManager>` and initialize it lazily in `get_delay_us`. This works, but it requires a lock to be thread-safe.