Package Details: python-tokenizers 0.22.2-1

Git Clone URL: https://aur.archlinux.org/python-tokenizers.git (read-only)
Package Base: python-tokenizers
Description: Fast State-of-the-Art Tokenizers optimized for Research and Production
Upstream URL: https://github.com/huggingface/tokenizers
Keywords: huggingface
Licenses: Apache-2.0
Submitter: filipg
Maintainer: xiota (daskol)
Last Packager: daskol
Votes: 12
Popularity: 0.71
First Submitted: 2021-10-23 11:17 (UTC)
Last Updated: 2026-01-28 17:59 (UTC)

Pinned Comments

xiota commented on 2024-08-30 16:15 (UTC) (edited on 2024-08-30 16:59 (UTC) by xiota)

Problems:

Latest Comments

soloturn commented on 2026-02-07 09:09 (UTC)

The install fails at a test; it stalls:

/usr/lib/python3.14/multiprocessing/popen_fork.py:70: DeprecationWarning: This process (pid=104327) is multi-threaded, use of fork() may lead to deadlocks in the child.

xiota commented on 2026-01-30 07:53 (UTC) (edited on 2026-01-30 07:55 (UTC) by xiota)

I retract my previous comment... git cleanup does not apply. This package does not use git. (However, still advisable to use makepkg -C and clean chroot.)

aliu commented on 2026-01-30 03:10 (UTC)

Could git -C "${srcdir}/${pkgname}" clean -dfx be added to prepare() in line with Arch's Python package guidelines? Thanks in advance.

musta_ruhtinas commented on 2025-12-16 14:01 (UTC)

@foxycode It builds fine in a clean chroot without exporting any variables.

foxycode commented on 2025-12-13 08:51 (UTC)

The build fails unless I run this before makepkg: export CARGO_BUILD_TARGET=x86_64-unknown-linux-gnu

The package seems to think I'm running darwin, and the build then complains that it cannot find the crates for core and std.

🍹 Building a mixed python/rust project
🔗 Found pyo3 bindings with abi3 support
📡 Using build options features, bindings from pyproject.toml
💻 Using `MACOSX_DEPLOYMENT_TARGET=10.12` for x86_64-apple-darwin by default
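Since the package reportedly builds in a clean chroot, one possible cause (an assumption, not confirmed anywhere in this thread) is a target override coming from the local environment or the user's Cargo configuration. The sketch below only prints what it finds; the function name `show_cargo_target_overrides` is hypothetical, and it checks just the `CARGO_BUILD_TARGET` variable and the Cargo home config files.

```shell
#!/bin/sh
# Hypothetical diagnostic helper: list places where a Cargo build target
# might be overridden. It changes nothing; it only reports.
show_cargo_target_overrides() {
  # Environment variable that Cargo reads for the default build target.
  echo "CARGO_BUILD_TARGET=${CARGO_BUILD_TARGET:-<unset>}"

  # Cargo home defaults to ~/.cargo unless CARGO_HOME is set.
  cargo_home="${CARGO_HOME:-${HOME:-}/.cargo}"
  for f in "$cargo_home/config.toml" "$cargo_home/config"; do
    if [ -f "$f" ]; then
      echo "--- $f"
      # Any line mentioning "target" is worth a manual look.
      grep -n 'target' "$f" || echo "(no target setting found)"
    fi
  done
}

show_cargo_target_overrides
```

If the output shows a darwin triple, removing that setting may make the explicit export unnecessary; otherwise the export above remains the known workaround.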

jamakoiv commented on 2025-05-12 09:03 (UTC) (edited on 2025-05-12 09:05 (UTC) by jamakoiv)

The build fails at the compile step. Adding export CFLAGS="$CFLAGS -std=gnu89", as suggested below, solves the issue.

Error log from clean chroot: https://pastebin.com/uiRNHBLf

Building the package in a regular environment and in a clean chroot results in the same error.

envolution commented on 2025-05-03 04:54 (UTC) (edited on 2025-05-03 04:55 (UTC) by envolution)

I was getting an error from the build process of cargo package oniguruma/onig_sys v69.8.1 related to incompatible void pointers. I was able to work around it by adding export CFLAGS="$CFLAGS -std=gnu89" just before cargo build and cargo test. There's probably a better solution but this was fine for my purposes.
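The workaround described above can be sketched as a small shell fragment. This is a minimal illustration, not the PKGBUILD's actual contents: it appends `-std=gnu89` to `CFLAGS` once, before the `cargo build` and `cargo test` steps would run. The guard against adding the flag twice is an addition of this sketch, not something from the comment.

```shell
#!/bin/sh
# Append -std=gnu89 to CFLAGS unless it is already present.
case " ${CFLAGS:-} " in
  *" -std=gnu89 "*)
    # Flag already set; leave CFLAGS untouched.
    ;;
  *)
    # Add a separating space only when CFLAGS is non-empty.
    CFLAGS="${CFLAGS:+$CFLAGS }-std=gnu89"
    ;;
esac
export CFLAGS

# In a PKGBUILD, the cargo build and cargo test calls would follow here.
echo "CFLAGS=$CFLAGS"
```

As the comment itself notes, this is a stopgap: it sidesteps the incompatible-pointer error by selecting an older C dialect instead of fixing the offending code.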

carlosal1015 commented on 2025-01-21 21:19 (UTC)

https://github.com/huggingface/tokenizers/blob/v0.21.0/bindings/python/benches/test_tiktoken.py#L10

python-multiprocess is now in [extra].

envolution commented on 2024-12-13 05:48 (UTC)

@xiota can you please bump the version?

[I 12-13 00:46:46.750 core:416] python-tokenizers: updated to 0.21.0