Add helper function for recipe clean up#223
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
ryanking13
left a comment
There was a problem hiding this comment.
Thanks for working on this @teddygood. I left some comments.
Because the CLI design would need some discussion, I think we can separate this PR. How about
- first implementing some helper functions to clean up the recipe build artifacts,
- then connect a CLI entrypoints to those functions.
While working on (1), you can probably also refactor some of our code to make it reusable. For instance, locating the log file or build directories are already implemented in recipe package, and hardcoding those directories are quite easy to break.
pyodide_build/cli/build_recipes.py
Outdated
| from pyodide_build import build_env | ||
| from pyodide_build.build_env import BuildArgs, init_environment | ||
| from pyodide_build.common import get_num_cores | ||
| from pyodide_build.errors import ActionableError |
There was a problem hiding this comment.
There is no definition for ActionableError; maybe you forgot to define it, or the hallucination of AI code generators generated it?
pyodide_build/cli/build_recipes.py
Outdated
| # Register subcommands to the Typer app | ||
| app.command("no-deps")(build_recipes_no_deps) | ||
| app.command("build")(build_recipes) |
There was a problem hiding this comment.
Well, I would like to avoid creating more subcommands.
It was my bad that in the original issue (#218), I suggested adding a clean subcommand under the build-recipes. As there can be (at actually is) a Python package named clean, I think having a subcommand clean under the build-recipes can be confusing.
Instead, maybe let's define it as a top-level subcommand, so that we can extend the clean command itself.
pyodide clean # clean all build artifacts
pyodide clean recipes # clean only recipe-related artifacts
pyodide clean <something else>@hoodmane @agriyakhetarpal any thoughts on the CLI design?
There was a problem hiding this comment.
What you are suggesting sounds good to me.
There was a problem hiding this comment.
Yes, having the possibility to extend pyodide clean SGTM, too.
pyodide_build/cli/build_recipes.py
Outdated
| # Legacy usage: `pyodide build-recipes` without subcommand will build all packages | ||
| build_recipes( | ||
| [], # Build all packages when no subcommand is specified |
There was a problem hiding this comment.
This will be a breaking change, and I would like to avoid that. As I commented in the above comment, let's not introduce another subcommand.
There was a problem hiding this comment.
I’ll drop this commit and remove the subcommand-related changes from this PR so the current behavior stays the same.
pyodide_build/cli/build_recipes.py
Outdated
|
|
||
|
|
||
| @app.command("clean") | ||
| def clean( |
There was a problem hiding this comment.
To make it easier to write unittests (and also to reduce the dependency on the CLI package) let's split the functions.
- The CLI function should only be responsible for handling arguments and filling out the default values.
- Then, it should call the actual function that does the cleanup process.
- The function that does the clean up can also be separated to multiple small functions that can be tested separately.
There was a problem hiding this comment.
I’ll split the CLI from the cleanup logic. The CLI will only parse args/defaults and call small, testable helpers (e.g., resolve targets, remove build/log, optional dist, optional global dist). Unit tests will focus on the helpers.
|
Thanks for the feedback! To proceed, here’s a small plan - please let me know if this looks good:
If this direction sounds good, I’ll push the updates for this PR accordingly. Thanks! |
|
Thanks for summarizing the plans.
Yes, sounds good. Actually, some paths might be hardcoded in recipe classes as well, but I would like to avoid adding more of hardcoded paths. If you think you can refactor some code to make it more reusable, feel free to go ahead and refactor it. But please do not mix refactoring and feature implementation, unless the change is very trivial. Try to do it step-by-step.
Yes. Note that there are two dist directories, one for global dist directory that stores all the wheels and another for storing local wheels per package.
This part can be a matter of choice. We can change the decision at any time later. So when implementing the cleanup feature, this part should be easy to modify.
Yes, sounds good.
Maybe something like |
for more information, see https://pre-commit.ci
pyodide_build/tests/test_clean.py
Outdated
There was a problem hiding this comment.
Let's remove this file in this PR
There was a problem hiding this comment.
Thanks. I forgot to remove it. I’ll remove it in this PR.
pyodide_build/recipe/cleanup.py
Outdated
| @@ -0,0 +1,118 @@ | |||
| from __future__ import annotations | |||
There was a problem hiding this comment.
I don't think this is required. Was there some issue with type annotations without this?
There was a problem hiding this comment.
Thanks. There was no issue with type annotations. I added it out of habit. I’ll remove it.
pyodide_build/recipe/cleanup.py
Outdated
| return list(recipes.keys()) | ||
|
|
||
|
|
||
| def remove_package_build(recipe_dir: Path, build_dir_base: Path, package: str) -> bool: |
There was a problem hiding this comment.
All the remove_... functions are doing the similar thing with duplicate codes, so it would be nice to refactor the functions to be more general.
I think we can have two functions that:
- Removes any file or directory given as a parameter
- Locate a set of files to remove
Then, the function 1 would have a more simple signature, and can be used in a more general purposes.
If we have more files to delete in the future, we can update function 2 to include those files.
There was a problem hiding this comment.
Implementing as suggested I will introduce _remove_path(path) to delete files or directories and _locate_cleanup_paths_for_package(...) to compute the paths to remove then update clean_recipes(...) to use these helpers and remove the duplicated remove_* functions If we need to delete more artifacts later we will only update the locator function.
pyodide_build/recipe/cleanup.py
Outdated
| """ | ||
| Clean recipe build artifacts and optionally dist directories. | ||
|
|
||
| Returns the number of items removed. |
There was a problem hiding this comment.
I am not sure if this function should return the number of items removed. It doesn't seem very useful to me.
There was a problem hiding this comment.
Agreed. I will remove the return value and make the function void, since the count is not particularly useful and no callers rely on it.
pyodide_build/recipe/cleanup.py
Outdated
| def perform_recipe_cleanup( | ||
| *, |
There was a problem hiding this comment.
I think we can make recipe_dir and targets as positional parameters, instead of making everything keyword-only parameters.
There was a problem hiding this comment.
I will make recipe_dir and targets positional and keep the rest keyword-only in clean_recipes.
| _make_pkg_with_artifacts(pkg) | ||
|
|
||
| removed = perform_recipe_cleanup( | ||
| recipe_dir=RECIPE_DIR, |
There was a problem hiding this comment.
You don't need to use a real recipe directory for this. I think you can utilize tmp_path.
There was a problem hiding this comment.
I think to avoid using a real recipe directory we should use pytest tmp_path to create an isolated temporary recipe directory and generate minimal meta.yaml files so that the loader can discover packages Is this approach okay for these tests?
pyodide_build/recipe/cleanup.py
Outdated
| return False | ||
|
|
||
|
|
||
| def perform_recipe_cleanup( |
There was a problem hiding this comment.
naming: How about simply clean_recipes?
There was a problem hiding this comment.
Makes sense. I will rename it to clean_recipes.
for more information, see https://pre-commit.ci
ryanking13
left a comment
There was a problem hiding this comment.
Thanks! I left some minor comments, but it looks good overall.
pyodide_build/recipe/cleanup.py
Outdated
| """ | ||
| try: | ||
| if path.is_dir(): | ||
| import shutil |
There was a problem hiding this comment.
(friendly question) Is there a specific reason you wanted to import shutil dynamically? If it were to optimize the speed, I think we don't need to care much about the import speed in this case.
Otherwise, let's move this import statement to the top of the file.
There was a problem hiding this comment.
Thanks for pointing this out. I recently started using Cursor and didn’t review the import carefully, sorry about that. You’re right that there’s no need to import it dynamically, I’ll move it to the top as suggested.
| logger.info("Removing %s", str(path)) | ||
| path.unlink(missing_ok=True) # type: ignore[call-arg] | ||
| else: | ||
| # Path does not exist; nothing to do |
There was a problem hiding this comment.
Maybe let's print a debug message in this case. So we can debug the issue when it does not work well.
There was a problem hiding this comment.
Thanks, that makes sense. I’ll add a debug log for this case so it’s easier to trace when nothing happens.
| # Path does not exist; nothing to do | ||
| return | ||
| except Exception: | ||
| # Best-effort cleanup; ignore failures |
There was a problem hiding this comment.
Sure, I’ll add a debug log here too.
| # Per-package build directory | ||
| paths.append(build_dir_base / package / "build") | ||
| # Per-package build log | ||
| paths.append(recipe_dir / package / "build.log") | ||
| # Per-package dist directory (optional) | ||
| if include_dist: | ||
| paths.append(recipe_dir / package / "dist") |
There was a problem hiding this comment.
(No action required for this PR)
As I mentioned in the previous comment (#223 (review)), I think this part can be improved to reuse other existing code (or to be more object-oriented).
For instance, we might be able to do something like
builder = RecipeBuilder(recipe_dir, ...)
return [builder.build_dir, builder.log_file, ...]or maybe the RecipeBuilder class can be responsible for cleaning up the files itself.
builder = RecipeBuilder(recipe_dir, ...)
builder.clean()Of course, our RecipeBuilder class does not implement such attributes or methods yet, so you don't need to address it in this PR.
If you are interested, feel free to refactor it in a follow-up PR.
There was a problem hiding this comment.
Sure, I’ll work on this refactor in a follow-up PR. Thanks for the suggestion.
There was a problem hiding this comment.
I think this can be renamed to test_cleanup now.
There was a problem hiding this comment.
Sure, I’ll rename it.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
clean subcommand to pyodide build-recipes CLI
ryanking13
left a comment
There was a problem hiding this comment.
Thanks! I updated the PR title as the content of the PR changed. Otherwise looks good to me.
Closes #218
Summary
cleansubcommand topyodide build-recipesfor removing downloaded sources and intermediate build artifacts.dist/by default; can include with--include-dist.tag:<name>), or all packages (--all).Changes
cleancommand with:<pkg>/buildand<pkg>/build.log.--include-dist: also remove<pkg>/distand global install dir.test_clean.py).Usage