pyxplod is a command-line tool designed to refactor Python codebases by "exploding" them. It takes Python files and automatically extracts top-level classes and functions into their own separate files. The original files are then updated to import these extracted components. This process helps in breaking down large, monolithic Python files into smaller, more manageable, and modular units.
pyxplod is beneficial for:
- Developers working on large Python projects: It can help declutter modules and make the codebase easier to navigate and understand.
- Software engineers looking to refactor legacy code: It provides an automated first step towards modularizing complex files.
- Teams aiming to improve code organization: It encourages a more granular structure, potentially leading to better separation of concerns.
- Educators and students: It can be used to demonstrate concepts of code modularity, imports, and project structure in Python.
Using pyxplod offers several advantages:
- Improved Code Organization: Breaks down large files into smaller, focused modules.
- Enhanced Readability: Smaller files are generally easier to read and comprehend.
- Better Maintainability: Changes to a specific class or function are isolated to its own file, reducing the risk of unintended side effects.
- Clearer Dependencies: While
pyxplodhandles import management during explosion, the resulting structure can make it easier to visualize dependencies between components. - Facilitates Refactoring: Acts as an initial automated step that can simplify further refactoring efforts.
pyxplod is a Python package. You can install it using uv pip.
-
From PyPI (Recommended if available):
uv pip install pyxplod
(Note: This assumes the package is published to PyPI. If not, use the method below.)
-
From source (after cloning the repository):
git clone <repository_url> cd pyxplod uv pip install .
pyxplod is primarily a command-line tool.
The basic command structure is:
pyxplod <input_directory> <output_directory> --method <method_name> [--verbose]Arguments:
input_directory: The path to the directory containing the Python project or files you want to explode.output_directory: The path to the directory wherepyxplodwill save the exploded files and the modified project structure. This directory will be created if it doesn't exist.--method <method_name>: (Required) Specifies the explosion strategy.files: Extracts each class/function into a new file namedoriginal_filename_extracted_definition_name.pywithin the same relative directory structure in the output path. The original file is modified to import these new files.dirs: For each processed.pyfile (e.g.,module.py), this method creates a new directory (e.g.,module/) in the output path. Extracted classes/functions are saved as individual files (e.g.,my_function.py,my_class.py) within this new directory. An__init__.pyfile is generated inside this directory, containing necessary imports for the extracted components and any remaining module-level code from the original file. Special files like__init__.pyor__main__.pyare processed using thefilesmethod logic even ifdirsis selected.
--verbose: (Optional) Enables verbose logging, providing more detailed output about the tool's operations. Useful for debugging.
Example:
If you have a project in my_project/ and want to explode it into my_project_exploded/ using the dirs method:
pyxplod my_project/ my_project_exploded/ --method dirs --verboseWhile primarily a CLI tool, the core functionality can be accessed programmatically by importing and calling the main function from the pyxplod.cli module.
# main_script.py
from pyxplod.cli import main
if __name__ == "__main__":
input_dir = "path/to/your/input_project"
output_dir = "path/to/your/output_project"
explosion_method = "files" # or "dirs"
verbose_mode = True # or False
try:
main(input_dir, output_dir, method=explosion_method, verbose=verbose_mode)
print(f"Project exploded successfully to {output_dir}")
except Exception as e:
print(f"An error occurred: {e}")pyxplod refactors Python code by parsing it into an Abstract Syntax Tree (AST) and then restructuring it based on the chosen method.
-
File Discovery: The tool starts by scanning the specified
input_directoryrecursively for all Python files (.py), ignoring common non-code directories like__pycache__. -
AST Parsing: Each discovered Python file is read and parsed into an AST using Python's built-in
astmodule. This tree represents the syntactic structure of the code. -
Definition Identification: The AST is traversed to identify all top-level (module-level) class definitions (
ast.ClassDef) and function definitions (ast.FunctionDef). -
Extraction Process (for each definition):
- Name Usage Analysis: The AST node corresponding to the class or function definition is analyzed to determine all names (variables, functions, classes, modules) it uses. This includes names used in decorators.
- Module Variable Handling: Module-level variable assignments (e.g.,
MY_CONSTANT = 10) are identified. If an extracted definition uses any of these module-level variables, those variable assignments are also included in the new file created for the definition. - Import Filtering: The original import statements (
ast.Import,ast.ImportFrom) from the source file are analyzed. Only the imports that are actually necessary for the current definition (and any module variables it uses) are included in the new file. This ensures that extracted files are self-contained with minimal necessary imports. - New File Creation:
- A new Python file is generated. Its name and location depend on the chosen
--method. - This file contains:
- The filtered, necessary import statements.
- Any required module-level variable assignments.
- The class or function definition itself.
- The content is generated by unparsing the constructed AST for the new file.
- A new Python file is generated. Its name and location depend on the chosen
-
Original File Modification:
- The extracted class and function definitions are removed from the AST of the original file.
- New import statements are added to the original file's AST to import the extracted definitions from their new locations. For example, if
MyClasswas extracted tooriginal_module_my_class.py, anfrom .original_module_my_class import MyClass(or similar, depending on the method) would be added. - The modified AST of the original file is then unparsed back into Python code and overwrites the original file in the
output_directory.
-
Processing Methods:
-
--method files:- File Naming: New files are named using the pattern:
output_base_dir/relative_path_to_original/original_stem_snake_case_definition_name.py. For example, ifsrc/utils.pycontainsMyHelperClass, it might be extracted tooutput/src/utils_my_helper_class.py. - Original File: The original file (e.g.,
output/src/utils.py) is modified to import from these newly created sibling files (e.g.,from .utils_my_helper_class import MyHelperClass).
- File Naming: New files are named using the pattern:
-
--method dirs:- Directory Structure: For each Python file (e.g.,
my_module.pyinsrc/), a new directory is created in the output (e.g.,output/src/my_module/). - File Naming: Extracted definitions are saved as
.pyfiles within this new directory, using a snake_case version of their names (e.g.,output/src/my_module/my_function.py,output/src/my_module/my_class.py). __init__.pyGeneration: An__init__.pyfile is created within the new directory (e.g.,output/src/my_module/__init__.py). This__init__.pyfile will contain:- The original import statements from
my_module.py. - New import statements to import the extracted definitions from their files within the
my_module/directory (e.g.,from .my_function import my_function). - Any remaining module-level code (variables, simple statements) from the original
my_module.pythat was not part of an extracted definition or an import.
- The original import statements from
- Special Files: Files like
__init__.py,__main__.py, etc., are handled using thefilesmethod logic to avoid creating subdirectories for them (e.g., an__init__/directory).
- Directory Structure: For each Python file (e.g.,
-
-
Key Modules:
pyxplod.cli: Handles command-line argument parsing (usingfire), logging setup (loguru,rich), and orchestrates the overall process.pyxplod.ast_utils: Provides utilities for working with ASTs, such as extracting imports, finding definitions, analyzing name usage within nodes, and filtering imports based on usage.pyxplod.file_utils: Manages file system operations like finding Python files, validating paths, generating unique filenames for extracted code (handling potential name collisions), and writing the new AST-generated Python code to files.pyxplod.processors: Contains the core logic for processing individual Python files. It implements thefilesanddirsexplosion strategies, utilizingast_utilsandfile_utils.pyxplod.utils: General utility functions (e.g.,to_snake_case).
This project follows specific guidelines for coding and contributions, largely outlined in CLAUDE.md. Contributors are expected to familiarize themselves with it. Here's a summary:
General Principles (from CLAUDE.md):
- Iterative Development: Make gradual changes. Avoid large, sweeping modifications at once.
- Preserve Structure: Maintain existing code and project structure unless a change is well-justified.
- Clarity and Simplicity:
- Use constants instead of magic numbers.
- Write clear, explanatory docstrings and comments. Explain what the code does and why it does it. Also, note where and how the code is used or referred to elsewhere.
- Modularize repeated logic into concise, single-purpose functions.
- Favor flat structures over deeply nested ones.
- Robustness: Handle failures gracefully. Address edge cases, validate assumptions, and catch errors early.
- Efficiency: Let the computer do the work; minimize unnecessary user decisions.
- Codebase Awareness: Check for existing solutions within the codebase before implementing new ones. Maintain a holistic overview of the codebase.
- File Path Tracking: In each source file, maintain an up-to-date
this_file:comment near the top, indicating the file's path relative to the project root.
Python-Specific Guidelines (from CLAUDE.md):
- Package Management: Use
uv pipfor managing dependencies (e.g.,uv pip install <package>). - Execution: Use
python -m <module>when running Python modules. - Style and Formatting:
- Adhere to PEP 8 (Code Layout, Naming Conventions).
- Follow PEP 20 (The Zen of Python) principles – prioritize readability, simplicity, and explicitness.
- Use type hints in their simplest form (e.g.,
list,dict,str | int).
- Docstrings: Write clear, imperative docstrings as per PEP 257.
- Language Features: Utilize f-strings for string formatting. Consider structural pattern matching where appropriate.
- Logging: Implement
loguru-based logging with a "verbose" mode and debug logs. - CLI Scripts: For command-line interface scripts, use
firefor argument parsing andrichfor enhanced terminal output. Start scripts with the shebang:#!/usr/bin/env -S uv run -s # /// script # dependencies = ["PKG1", "PKG2"] # /// # this_file: path/to/current_file.py
Development Workflow (from CLAUDE.md):
- Planning:
- Create/Update
PLAN.mdwith a detailed flat list of tasks ([ ]items). - Identify key items and create/update
TODO.md([ ]items).
- Create/Update
- Implementation: Implement the changes.
- Tracking: Update
PLAN.mdandTODO.mdas you progress. - Changelog: After each significant round of changes, update
CHANGELOG.md. - Documentation: Update
README.mdto reflect any user-facing changes.
Code Quality and Testing (from CLAUDE.md):
- After making Python code changes, run the following sequence of commands to ensure code quality and correctness:
(Ensure
fd -e py -x autoflake {}; \ fd -e py -x pyupgrade --py311-plus {}; \ fd -e py -x ruff check --output-format=github --fix --unsafe-fixes {}; \ fd -e py -x ruff format --respect-gitignore --target-version py311 {}; \ python -m pytestfd-find(forfd),autoflake,pyupgrade,ruff, andpytestare installed in your development environment.)
Contribution Process (General Best Practices):
- Fork the Repository: Create your own fork of the
pyxplodrepository. - Create a Feature Branch: Branch off
main(or the relevant development branch) for your changes (e.g.,git checkout -b feature/my-new-feature). - Make Changes: Implement your feature or bug fix.
- Add Tests: Write unit tests for any new functionality or to cover bug fixes.
- Run Checks: Execute the code quality and testing script mentioned above to ensure your changes pass all checks.
- Update Documentation: If your changes affect user-facing aspects or the technical workings, update
README.mdand other relevant documentation (PLAN.md,TODO.md,CHANGELOG.md). - Commit Changes: Commit your work with clear, descriptive commit messages.
- Push to Your Fork: Push your feature branch to your fork on GitHub.
- Submit a Pull Request: Open a pull request from your feature branch to the main
pyxplodrepository. Clearly describe the changes you've made and why.
By following these guidelines, we can maintain a high-quality, understandable, and collaborative codebase.