Conversation
The problem with this is that extendr uses the package name in the DESCRIPTION file to generate wrappers that call |
|
@sanghoonio gotcha thanks for the explaination. I think I am partial to The consequence of this is that when you install and import you need to do this? library(gtarsr)? Thats a bit annoying it'd be nice to just do For reference:
So it'd be nice if everything was just
// cargo add gtars
use gtars::tokenizers::{Tokenizer};
# pip install gtars
import gtars
// npm i @databio/gtars
import { Tokenizer } from '@databio/gtars'So R would be the odd-one out: # install.packages("gtarsr")
library(gtarsr) |
|
The |
well the wrapper crate sort of needs to be called |
|
Ok I dont actually think the gtars_r thing is a problem, right? I think that its only really an issue if we intend to publish to crates.io, which we aren't planning on doing, nor does that make any sense. I guess put another way... because the R bindings are completely downstream of the core crates ( Put another way... I think that we only would have a problem if we tried to import this crate alongside something else like I just tested it locally with |
khoroshevskyi
left a comment
There was a problem hiding this comment.
Few questions/concerns, especially about python binding crate naming in cargo
| [patch."https://github.com/databio/gtars"] | ||
| gtars-core = { path = "gtars-core" } | ||
| gtars-io = { path = "gtars-io" } | ||
| gtars-igd = { path = "gtars-igd" } | ||
| gtars-refget = { path = "gtars-refget" } | ||
| gtars-tokenizers = { path = "gtars-tokenizers" } No newline at end of file |
There was a problem hiding this comment.
- should this dependencies be here?
- What does patch.".." mean?
There was a problem hiding this comment.
[patch] is for overriding dependencies for the R bindings. The cargo.toml in gtars-r has these dependencies:
[dependencies]
extendr-api = { git = "https://github.com/extendr/extendr", branch = "master" }
anyhow = "1.0.82"
gtars-core = { git = "https://github.com/databio/gtars", branch = "dev" }
gtars-io = { git = "https://github.com/databio/gtars", branch = "dev" }
gtars-igd = { git = "https://github.com/databio/gtars", branch = "dev" }
gtars-refget = { git = "https://github.com/databio/gtars", branch = "dev" }
gtars-tokenizers = { git = "https://github.com/databio/gtars", branch = "dev", features = ["huggingface"] }
because when you install an R package, R copies the source files to a temp directory before compiling which in our case would not include the parent gtars directory that contain the required workspace crates. So the dependencies here are linked to github instead to pull from. The problem is if you make a change to a gtars workspace crate while developing R bindings, the changes don't get reflected until you push changes to github. The [patch] is for the workspace to redirect these github dependencies to the local paths so that rextendr::document() (which compiles in place) uses the local workspace crates instead, and not the github paths. Basically it is just a workaround for a workaround to make development more convenient.
This PR should introduce both wasm bindings (and push them to npm via CI/CD) and then R bindings. We need to bump versions for the following crates:
gtars-core(added feature flags to gate bigtools)gtars-tokenizers(added someFromimpls forSpecialTokens)gtars-wasm(renamed, and then added tokenizers)will need to do the following releases:
v0.5.1, release the core crate and publish to cargo (we updatedgtars-core+gtars-tokenizers)wasm-0.5.1, release the web assembly/JS/TS bindingsTheres no need to do a release for the CLI or python bindings as nothing changed in those crates
TODO:
Version bump/release/publish plan
gtars-coreRelease to
crates.iousingcargo publishgtars-tokenizersRelease to
crates.iousingcargo publishgtarsRelease to
crates.iousingcargo publish. Create tagv0.5.1and create release on GitHub.gtars-pythonCreate tag
py-0.5.1and create release on GitHub. Release via PyPI CI/CD.gtars-wasmCreate tag
wasm-0.5.1and create release on GitHub. Release via NPM CI/CD.gtars-r+ version = "0.5.1"Create tag
r-0.5.1and create release on GitHub. no CI/CD setup for this, so that's it