Package: tok
Title: Fast Text Tokenization
Version: 0.2.2.9000
Authors@R: c(
    person("Tomasz", "Kalinowski", , "tomasz@posit.co", c("ctb", "cre")),
    person("Daniel", "Falbel", , "dfalbel@gmail.com", c("aut")),
    person("Regouby", "Christophe", , "christophe.regouby@free.fr", c("ctb")),
    person(family = "Posit", role = c("cph"))
  )
Description: Interfaces with the 'Hugging Face' tokenizers library to
    provide implementations of today's most used tokenizers such as the
    'Byte-Pair Encoding' algorithm. It's extremely fast for both training
    new vocabularies and tokenizing texts.
License: MIT + file LICENSE
SystemRequirements: Cargo (Rust's package manager), rustc >= 1.77.2
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
Depends: R (>= 4.2.0)
Imports: R6, cli
Suggests: rmarkdown, testthat (>= 3.0.0), hfhub (>= 0.1.1), withr
Config/testthat/edition: 3
URL: https://github.com/mlverse/tok
BugReports: https://github.com/mlverse/tok/issues
Config/rextendr/version: 0.5.0
Config/roxygen2/version: 8.0.0
Config/pak/sysreqs: libclang-dev
Repository: https://mlverse.r-universe.dev
Date/Publication: 2026-05-19 13:09:10 UTC
RemoteUrl: https://github.com/mlverse/tok
RemoteRef: HEAD
RemoteSha: f925ad65e356dc6d295b633391aa00eae9dad8fe
NeedsCompilation: yes
Packaged: 2026-06-18 06:45:09 UTC; root
Author: Tomasz Kalinowski [ctb, cre], Daniel Falbel [aut],
    Regouby Christophe [ctb], Posit [cph]
Maintainer: Tomasz Kalinowski