Part of the twat collection of tools.
twat-search is a powerful and flexible multi-engine web search aggregator. It provides a unified command-line and programmatic interface to query a wide array of search engines simultaneously, returning results in a consistent and easy-to-process format.
twat-search is designed for:
- Developers who need to integrate web search capabilities into their applications.
- Researchers and analysts who require comprehensive search results from multiple sources for data gathering and analysis.
- Data scientists looking to collect web data for training models or generating insights.
- Anyone who wants to save time and effort by querying multiple search engines at once and getting aggregated results.
- Comprehensive Results: Get a broader perspective by searching across many engines, including specialized ones, reducing the chance of missing information.
- Time-Saving: Query multiple search engines with a single command or API call.
- Consistent Output: All results are normalized into a standard JSON format, making them easy to parse and use.
- Programmatic Access: Easily integrate search functionality into your Python scripts and applications.
- Rich CLI: Enjoy a user-friendly command-line interface with features like engine selection, JSON output, and easy configuration.
- Configurability: Fine-tune search parameters, select engines, and manage API keys via environment variables or direct code configuration.
- Asynchronous Operations: Leverages asyncio for efficient, non-blocking searches when used programmatically.
- Multi-Engine Support: Access a diverse range of search engines (e.g., Brave, DuckDuckGo, Google via SerpAPI, Bing Scraper, and many more).
- Unified Interface: Consistent CLI commands and Python API for all supported engines.
- Asynchronous API: High-performance asyncio-based API for concurrent searching.
- Standardized Results: Search results provided in a uniform Pydantic model, easily exportable to JSON.
- Flexible Configuration: Configure API keys, engine preferences, and default parameters through environment variables (.env supported) or Python code.
- Rich Command-Line Tool: Powered by python-fire and rich for a pleasant and informative CLI experience.
- Extensible: Designed to easily incorporate new search engines.
- Robust Error Handling: Gracefully handles issues with individual engines without disrupting the overall search process.
You can install twat-search using pip:
pip install twat-search
This will install the core package with a basic set of features.
Many search engines require additional libraries (e.g., for specific APIs or web scraping capabilities). You can install these optional dependencies as "extras".
To install support for specific engines, list them in brackets:
# Install with support for DuckDuckGo and Brave
pip install twat-search[duckduckgo,brave]
# Install with support for Falla-based engines (e.g., Google Falla)
pip install twat-search[falla]
To install support for all available optional engines:
pip install twat-search[all]
You can find the list of available extras and the engines they enable in the pyproject.toml file or the "Supported Search Engines" section of this document.
twat-search can be used both as a command-line tool and as a Python library.
The CLI provides a quick way to perform searches directly from your terminal. The main command is twat-search web.
Basic Search: Query all configured and enabled engines:
twat-search web q "latest advancements in AI"
Specify Engines: Search using only selected engines (e.g., brave and duckduckgo):
twat-search web q "Python programming best practices" -e brave,duckduckgo
Note: Engine names are typically lowercase. Use twat-search web info --plain to see available engine identifiers.
Get JSON Output: Retrieve results in a machine-readable JSON format:
twat-search web q "future of renewable energy" --json
List Available Engines: See a list of all search engines detected by the tool:
twat-search web info --plain
This will show their identifiers, whether they are enabled, and if API keys are found (if required).
Control Number of Results: Specify the number of results per engine (if the engine supports it):
twat-search web q "best Python IDEs" --num_results 5
Note: num_results is a global parameter. Some engines might also support engine-specific parameters like count via the CLI. For example, twat-search web q -e brave "query" --count 3.
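When calling the CLI from a script, the flags above can be assembled into a single argument list. The following is a minimal sketch that uses only the flags documented in this section (-e, --json, --num_results); the helper name build_search_command is hypothetical and is not part of twat-search.

```python
import shlex


def build_search_command(query, engines=None, num_results=None, as_json=False):
    """Assemble an argv list for `twat-search web q` from the documented flags."""
    cmd = ["twat-search", "web", "q", query]
    if engines:
        # Engine identifiers are passed as one comma-separated value.
        cmd += ["-e", ",".join(engines)]
    if num_results is not None:
        cmd += ["--num_results", str(num_results)]
    if as_json:
        cmd.append("--json")
    return cmd


cmd = build_search_command(
    "best Python IDEs", engines=["brave", "duckduckgo"], num_results=5, as_json=True
)
# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

With twat-search installed, the list can be handed to subprocess.run(cmd, capture_output=True, text=True), which avoids shell quoting issues with the query string.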
Integrate twat-search into your Python applications using its asynchronous API.
Basic Asynchronous Search:
import asyncio

from twat_search.web.api import search
from twat_search.web.models import SearchResult


async def main():
    query = "What is quantum computing?"
    print(f"Searching for: {query}...")

    # Search across all enabled and configured engines
    # You can also specify engines: await search(query, engines=["brave", "duckduckgo"])
    results: list[SearchResult] = await search(query)

    if not results:
        print("No results found.")
        return

    print(f"Found {len(results)} results:\n")
    for result in results:
        print(f"Engine: {result.source_engine}")
        print(f"Title: {result.title}")
        print(f"URL: {result.url}")
        print(f"Snippet: {result.snippet}")
        if result.extra_info:
            print(f"Extra: {result.extra_info}")
        print("-" * 20)


if __name__ == "__main__":
    asyncio.run(main())

This example demonstrates a basic search. The search function can be further customized with specific engines, configurations, and parameters.
twat-search offers flexible configuration options primarily through environment variables and programmatic settings.
- Environment Variables:
  - This is the most common way to configure twat-search, especially for API keys and engine enablement.
  - You can set variables in your shell, or for convenience, define them in a .env file in your project's root directory. twat-search will automatically load it.
  - Key settings include:
    - API keys for services like Brave, SerpAPI, Tavily (e.g., BRAVE_API_KEY="your_key_here").
    - Enabling or disabling specific engines (e.g., DUCKDUCKGO_ENABLED=true, GOOGLE_FALLA_ENABLED=false).
    - Setting default search parameters for engines (e.g., BRAVE_DEFAULT_PARAMS='{"count": 7, "country": "US"}').
    - Global settings like NUM_RESULTS=10 to attempt to fetch a specific number of results from all engines.
- Programmatic Configuration:
  - When using twat-search as a Python library, you can pass a Config object to the search function.
  - This allows for dynamic, in-code configuration of engines, API keys, and parameters, overriding any environment settings.
A more detailed explanation of all configuration options, including specific environment variable names and programmatic Config object usage, can be found in the Technical Deep Dive section later in this document.
This section provides a more detailed look into the inner workings of twat-search, aimed at developers who wish to understand or contribute to the project.
twat-search is built with a modular architecture, primarily centered around its Python components. Key modules and their roles include:
- twat_search.web.api:
  - This module exposes the main programmatic interface, primarily the async def search(...) function.
  - It orchestrates the search process: loading configurations, selecting and initializing engines, dispatching search queries concurrently, aggregating results, and handling errors.
- twat_search.web.cli:
  - Implements the command-line interface using the python-fire library for command parsing and rich for formatted output.
  - It translates CLI arguments into calls to the twat_search.web.api or configuration display functions.
  - The entry point for the CLI is twat-search web .... (Note: twat-search itself is a top-level namespace, and web is the subcommand for web search functionalities).
- twat_search.web.config:
  - Manages all configuration aspects of the library.
  - Defines Pydantic models (Config, EngineConfig) for structuring configuration data.
  - Loads settings from environment variables (with support for .env files via pydantic-settings) and allows for programmatic overrides.
  - Handles validation and default values for engine parameters and API keys.
- twat_search.web.engines:
  - This package is the heart of the multi-engine capability.
  - engines/__init__.py: Dynamically discovers and registers available engine implementations. It also checks for necessary dependencies and API keys for each engine.
  - engines/base.py: Defines the SearchEngine abstract base class. All individual engine implementations must inherit from this class and implement its required methods (e.g., async_search).
  - Individual Engine Modules (e.g., brave.py, duckduckgo.py, falla.py): Each file implements the logic for a specific search engine, handling its API communication or web scraping process, parameter mapping, and result parsing.
    - Engines requiring browser automation (like google_falla) might reside in sub-packages like lib_falla and utilize tools such as Playwright.
- twat_search.web.models:
  - Defines Pydantic models for data structures, most importantly SearchResult.
  - SearchResult provides a standardized format for search results (title, URL, snippet, source engine, etc.), ensuring consistency across all engines. These models are also used for JSON serialization.
- twat_search.web.exceptions:
  - Contains custom exception classes (e.g., SearchError, EngineError, ConfigError) to provide more specific error information and allow for granular error handling by users of the library.
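To make the role of the SearchEngine base class concrete, here is a minimal, self-contained sketch of an abstract engine with a simple registry. It is an illustration under assumptions, not the library's code: the ENGINE_REGISTRY dictionary, the name attribute, the EchoEngine class, and registration through __init_subclass__ are all hypothetical (the README only says engines are discovered by inspecting the engines package), and async_search is used because it is the method name mentioned above.

```python
import asyncio
from abc import ABC, abstractmethod

# Hypothetical registry: engine identifier -> engine class.
ENGINE_REGISTRY = {}


class SearchEngine(ABC):
    """Stand-in for the abstract base class described above."""

    name = ""  # engine identifier, e.g. "brave"

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass that declares an identifier.
        if cls.name:
            ENGINE_REGISTRY[cls.name] = cls

    @abstractmethod
    async def async_search(self, query):
        """Return a list of result dictionaries for the query."""


class EchoEngine(SearchEngine):
    """Toy engine that returns one fabricated result without any network call."""

    name = "echo"

    async def async_search(self, query):
        return [
            {
                "source_engine": self.name,
                "title": query.title(),
                "url": "https://example.com/",
            }
        ]


results = asyncio.run(ENGINE_REGISTRY["echo"]().async_search("hello world"))
print(results)
```

A real engine would replace the body of async_search with its API call or scraping logic and map the response into the shared result format.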
- Asynchronous Operations:
  - The core search functionality (api.search and individual engine searches) is built using asyncio and httpx.AsyncClient (for most HTTP-based engines). This allows for efficient concurrent execution of multiple search engine queries, significantly speeding up the overall search process.
- Engine Discovery and Loading:
  - Engines are typically classes inheriting from SearchEngine. They are discovered at runtime by inspecting the twat_search.web.engines package.
  - The Config object determines which discovered engines are actually enabled and used for a given search operation, based on environment variables or programmatic settings.
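The concurrent dispatch described above can be sketched with asyncio.gather. This is a simplified illustration, not the actual implementation of api.search: the two stub engines and the search_all helper are hypothetical, and no network requests are made.

```python
import asyncio


async def fast_engine(query):
    """Stub engine that succeeds after a short delay."""
    await asyncio.sleep(0.01)
    return [f"fast:{query}"]


async def failing_engine(query):
    """Stub engine that simulates an outage."""
    await asyncio.sleep(0.01)
    raise RuntimeError("API down")


async def search_all(query, engines):
    """Run every engine concurrently; collect results and per-engine errors."""
    outcomes = await asyncio.gather(
        *(engine(query) for engine in engines.values()),
        return_exceptions=True,  # a failing engine must not cancel the others
    )
    results, errors = [], {}
    # gather preserves input order, so outcomes line up with engine names.
    for name, outcome in zip(engines, outcomes):
        if isinstance(outcome, Exception):
            errors[name] = outcome
        else:
            results.extend(outcome)
    return results, errors


results, errors = asyncio.run(
    search_all("python", {"fast": fast_engine, "broken": failing_engine})
)
print(results, list(errors))
```

Because all engines wait on I/O at the same time, the total latency is close to that of the slowest engine rather than the sum of all of them.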
twat-search provides a consistent interface to a variety of search engines. The availability of each engine depends on whether required API keys are provided and necessary optional dependencies are installed.
| Engine Name | Identifier (-e flag) | API Key Req. | API Key Env Var | Package Extra (pip install twat-search[extra]) | Description |
|---|---|---|---|---|---|
| Brave Search | brave | Yes | BRAVE_API_KEY | brave (or all) | General web search via Brave Search API. |
| Brave News | brave_news | Yes | BRAVE_API_KEY | brave (or all) | News-specific search via Brave API. |
| You.com | you | Yes | YOU_API_KEY | - (core dependency) | Web search via You.com API. |
| You.com News | you_news | Yes | YOU_API_KEY | - (core dependency) | News search via You.com API. |
| Tavily | tavily | Yes | TAVILY_API_KEY | tavily (or all) | AI-powered research-focused search API. |
| Perplexity AI | pplx | Yes | PERPLEXITY_API_KEY | pplx (or all) | AI-powered search with detailed answers. |
| SerpAPI (Google) | serpapi | Yes | SERPAPI_API_KEY | serpapi (or all) | Google search results via SerpAPI. |
| HasData Google | hasdata-google | Yes | HASDATA_API_KEY | hasdata (or all) | Google search results via HasData API. |
| HasData Google Light | hasdata-google-light | Yes | HASDATA_API_KEY | hasdata (or all) | Light version of HasData Google API. |
| Critique | critique | Yes | CRITIQUE_API_KEY | - (core dependency) | Visual and textual search capabilities. |
| DuckDuckGo | duckduckgo | No | - | duckduckgo (or all) | Privacy-focused search results. |
| Bing Scraper | bing_scraper | No | - | bing_scraper (or all) | Web scraping of Bing search results. |
| Google Falla | google_falla | No | - | falla (or all) | Google search via Playwright-based scraping. |
| Google Scraper | google_scraper | No | - | google_scraper (or all) | Google search via direct scraping (less reliable). |
Notes:
- "-" in "Package Extra" means the dependencies are part of the core twat-search installation or not explicitly defined as an extra (this might need verification against pyproject.toml).
- Engine identifiers are used with the -e flag in the CLI (e.g., twat-search web q "query" -e brave,duckduckgo) and in programmatic configuration.
- Some engines might be disabled by default and require explicit enabling via environment variables (e.g., ENGINE_NAME_ENABLED=true) or programmatic configuration, in addition to API keys and optional dependencies.
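As a quick way to reason about the table, the sketch below checks which engines have their required API key present in a given environment mapping. It covers only a subset of the table, and the REQUIRED_KEYS dictionary and engines_with_keys function are hypothetical helpers, not part of twat-search. It also ignores the other two availability conditions mentioned above (installed extras and the _ENABLED flags).

```python
# Required API-key variable per engine identifier, copied from the table above.
# None means the engine needs no key.
REQUIRED_KEYS = {
    "brave": "BRAVE_API_KEY",
    "tavily": "TAVILY_API_KEY",
    "serpapi": "SERPAPI_API_KEY",
    "duckduckgo": None,
    "bing_scraper": None,
}


def engines_with_keys(env):
    """Return the identifiers whose key requirement is satisfied by env."""
    return sorted(
        name
        for name, var in REQUIRED_KEYS.items()
        if var is None or env.get(var)
    )


print(engines_with_keys({"BRAVE_API_KEY": "abc"}))
```

Passing os.environ instead of a literal dictionary checks the current shell environment.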
Configuration is primarily handled by the twat_search.web.config.Config class, which uses pydantic-settings to load values from environment variables and .env files.
Environment variables are the primary method for configuring twat-search outside of direct code. They are case-sensitive. If an .env file exists in the current working directory when twat-search is initialized, it will be loaded automatically.
Key Environment Variables:
- Global Settings:
  - LOG_LEVEL: Set the logging level (e.g., DEBUG, INFO, WARNING). Defaults to INFO.
  - NUM_RESULTS: A global suggestion for the number of results to fetch from each engine. Engines will try to honor this if they support such a parameter. Example: NUM_RESULTS=5.
- Engine-Specific Settings: Most engine settings follow a pattern: [ENGINE_IDENTIFIER_UPPERCASE]_API_KEY, [ENGINE_IDENTIFIER_UPPERCASE]_ENABLED, [ENGINE_IDENTIFIER_UPPERCASE]_DEFAULT_PARAMS.
  - API Keys:
    - BRAVE_API_KEY="your_brave_key"
    - YOU_API_KEY="your_you_key"
    - TAVILY_API_KEY="your_tavily_key"
    - PERPLEXITY_API_KEY="your_perplexity_key"
    - SERPAPI_API_KEY="your_serpapi_key" (for Google via SerpAPI)
    - HASDATA_API_KEY="your_hasdata_key" (for HasData Google engines)
    - CRITIQUE_API_KEY="your_critique_key"
    - (Add other API key variables as defined in config.py or engine modules)
  - Enabling/Disabling Engines: By default, engines that require API keys are typically disabled if the key is not found. Engines that don't require keys are often enabled by default. You can explicitly enable or disable any engine:
    - BRAVE_ENABLED=true (or false)
    - DUCKDUCKGO_ENABLED=true
    - GOOGLE_FALLA_ENABLED=false
    - (Refer to twat_search.web.config.EngineConfig and specific engine implementations for default enabled states and identifiers)
  - Default Parameters: Set default search parameters for specific engines using a JSON string. These parameters are passed to the engine during searches if not overridden by parameters in the search call.
    - BRAVE_DEFAULT_PARAMS='{"count": 10, "safesearch": "strict", "country": "US"}'
    - TAVILY_DEFAULT_PARAMS='{"max_results": 7, "search_depth": "advanced"}'
    - DUCKDUCKGO_DEFAULT_PARAMS='{"max_results": 5, "region": "us-en"}'
    - SERPAPI_DEFAULT_PARAMS='{"num": 10, "gl": "us", "hl": "en"}'
    - (Consult each engine's documentation or its module in twat_search.web.engines for available parameters.)
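The naming pattern above can be illustrated with a small reader that derives the three variable names from an engine identifier. This is a sketch only: the real loading is done by the Config class via pydantic-settings, and the engine_settings helper is hypothetical. Note that the pattern does not hold for every engine (for example, brave_news shares BRAVE_API_KEY according to the engine table).

```python
import json


def engine_settings(identifier, env):
    """Read the *_API_KEY, *_ENABLED and *_DEFAULT_PARAMS variables for one engine."""
    prefix = identifier.upper()
    enabled_raw = env.get(f"{prefix}_ENABLED")
    return {
        "api_key": env.get(f"{prefix}_API_KEY"),
        # None means "not set"; the engine's own default would apply.
        "enabled": None if enabled_raw is None else enabled_raw.strip().lower() == "true",
        # Default parameters are stored as a JSON object in a single variable.
        "default_params": json.loads(env.get(f"{prefix}_DEFAULT_PARAMS", "{}")),
    }


env = {
    "BRAVE_API_KEY": "your_brave_key",
    "BRAVE_ENABLED": "true",
    "BRAVE_DEFAULT_PARAMS": '{"count": 7, "country": "US"}',
}
print(engine_settings("brave", env))
```

Because the default parameters are parsed as JSON, the value must use double quotes inside and be wrapped in single quotes in a shell or .env file.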
Example .env file:
# .env
BRAVE_API_KEY="YOUR_ACTUAL_BRAVE_KEY"
BRAVE_ENABLED=true
BRAVE_DEFAULT_PARAMS='{"count": 5, "safesearch": "moderate"}'
DUCKDUCKGO_ENABLED=true
DUCKDUCKGO_DEFAULT_PARAMS='{"max_results": 3}'
# Disable an engine that might be enabled by default
# SOME_OTHER_ENGINE_ENABLED=false
NUM_RESULTS=7
LOG_LEVEL=INFO
For more dynamic control within Python applications, you can instantiate and pass a Config object to the search function. This will override any settings from environment variables for that specific search call.
import asyncio

from twat_search.web.api import search
from twat_search.web.config import Config, EngineConfig


async def perform_custom_search():
    custom_config = Config(
        # Global settings can also be set here if applicable, e.g. num_results
        # num_results=3,  # This would apply to all engines in this config
        engines={
            "brave": EngineConfig(
                enabled=True,
                api_key="your_brave_key_for_this_search",  # Overrides env
                default_params={"count": 3, "country": "DE"},
            ),
            "duckduckgo": EngineConfig(
                enabled=True,
                default_params={"max_results": 2, "region": "de-de"},
            ),
            "serpapi": EngineConfig(  # Example: enabling an engine that might be off by env
                enabled=True,
                api_key="your_serpapi_key_here",
                default_params={"num": 2, "gl": "de", "hl": "de"},
            ),
            # Ensure other engines are explicitly disabled if not desired
            "google_falla": EngineConfig(enabled=False),
            "bing_scraper": EngineConfig(enabled=False),
            # ... and so on for other engines to ensure precise control
        }
    )

    results = await search("example query", config=custom_config)
    for result in results:
        print(f"[{result.source_engine}] {result.title}: {result.url}")


if __name__ == "__main__":
    asyncio.run(perform_custom_search())

When using programmatic configuration:
- Any engine not explicitly defined in the engines dictionary of your Config object will typically fall back to its environment variable settings or defaults.
- To ensure only specified engines run, you might need to explicitly disable others in your Config object if they could be enabled by environment variables. Alternatively, use the engines=["engine1", "engine2"] parameter in the search() call, which takes precedence for engine selection.
All search results, whether from the CLI (using the --json flag) or the Python API, are standardized into a consistent structure defined by the twat_search.web.models.SearchResult Pydantic model.
Key Fields in SearchResult:
- query (str): The original search query.
- source_engine (str): The identifier of the engine that produced this result (e.g., "brave", "duckduckgo").
- title (str): The title of the search result.
- url (str): The URL of the search result.
- snippet (Optional[str]): A short description or snippet from the result page. May be None.
- score (Optional[float]): A relevance score, if provided by the engine. May be None.
- position (Optional[int]): The rank/position of the result in the engine's original list. May be None.
- raw (Optional[dict]): The raw, unprocessed result data from the engine, for debugging or advanced use. May be None.
- extra_info (Optional[dict]): Any additional structured information provided by the engine. May be None.
- timestamp (datetime): The timestamp when the search result was processed.
Sample JSON Output Snippet (from CLI with --json):
When using the CLI with the --json flag, the output is a JSON array of SearchResult objects.
[
  {
    "query": "Python programming",
    "source_engine": "duckduckgo",
    "title": "Python For Beginners - Learn Python Programming",
    "url": "https://www.pythonforbeginners.com/",
    "snippet": "Python For Beginners.com is a website where you can learn Python programming. We have tutorials for total beginners and advanced programmers.",
    "score": null,
    "position": 1,
    "raw": {
      "hostname": "www.pythonforbeginners.com",
      "icon": "https://www.pythonforbeginners.com/favicon.ico",
      "html": "<b>Python</b> For Beginners.com is a website where you can learn <b>Python</b> programming. We have tutorials for total beginners and advanced programmers."
    },
    "extra_info": null,
    "timestamp": "2023-10-27T10:30:00.123456"
  },
  {
    "query": "Python programming",
    "source_engine": "brave",
    "title": "Python (programming language) - Wikipedia",
    "url": "https://en.wikipedia.org/wiki/Python_(programming_language)",
    "snippet": "Python is an interpreted, high-level and general-purpose programming language. Python's design philosophy emphasizes code readability with its notable use of significant indentation.",
    "score": 0.95,
    "position": 1,
    "raw": {
      "meta_url": {
        "scheme": "https",
        "netloc": "en.wikipedia.org",
        "path": "/wiki/Python_(programming_language)",
        "favicon": "https://en.wikipedia.org/static/favicon/wikipedia.ico"
      }
    },
    "extra_info": {
      "type": "web"
    },
    "timestamp": "2023-10-27T10:30:01.567890"
  }
  // ... more results
]
Programmatically, you receive a list[SearchResult] where each item is an instance of the Pydantic model, allowing direct attribute access (e.g., result.title, result.url).
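Because the --json output is a plain JSON array, it can be post-processed with the standard library alone. The sketch below groups result URLs by engine; the abbreviated cli_output string stands in for real CLI output and keeps only a few of the fields listed above, and the urls_by_engine helper is a hypothetical name.

```python
import json
from collections import defaultdict

# Abbreviated stand-in for the output of `twat-search web q ... --json`.
cli_output = """
[
  {"query": "Python programming", "source_engine": "duckduckgo",
   "title": "Python For Beginners - Learn Python Programming",
   "url": "https://www.pythonforbeginners.com/", "position": 1},
  {"query": "Python programming", "source_engine": "brave",
   "title": "Python (programming language) - Wikipedia",
   "url": "https://en.wikipedia.org/wiki/Python_(programming_language)", "position": 1}
]
"""


def urls_by_engine(raw_json):
    """Group result URLs by the engine that returned them."""
    grouped = defaultdict(list)
    for item in json.loads(raw_json):
        grouped[item["source_engine"]].append(item["url"])
    return dict(grouped)


grouped = urls_by_engine(cli_output)
for engine, urls in grouped.items():
    print(engine, urls)
```

The same grouping works on the Python API by reading result.source_engine and result.url from each SearchResult instead of dictionary keys.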
twat-search includes a robust error handling system to manage issues that may arise during configuration, engine initialization, or the search process. Custom exceptions provide clarity and allow for targeted error management.
Custom Exception Hierarchy:
Located in twat_search.web.exceptions:
- SearchError(Exception): Base exception for all errors originating from the twat-search library.
  - ConfigError(SearchError): Raised for configuration-related problems (e.g., invalid parameter values, missing required settings if not handled by Pydantic's validation directly).
  - EngineError(SearchError): Base for errors specific to a search engine's operation.
    - APIKeyMissingError(EngineError): Raised if a required API key for an engine is not found.
    - EngineDisabledError(EngineError): Raised when an attempt is made to use an explicitly disabled engine.
    - EngineRequestError(EngineError): For issues during an engine's request to its external API or source (e.g., network errors, HTTP status code errors).
    - EngineResponseError(EngineError): For problems parsing or interpreting an engine's response.
  - NoEnginesAvailableError(SearchError): Raised if no search engines are specified or if all specified/configured engines are unavailable or disabled.
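The practical consequence of this hierarchy is that except clauses should be ordered from most specific to most general. The sketch below re-declares a few of the classes locally so that it runs without twat-search installed. The constructor signature of EngineError is an assumption made for illustration; the library's actual constructors may differ.

```python
class SearchError(Exception):
    """Local stand-in for the library's base exception."""


class EngineError(SearchError):
    """Local stand-in; the (engine_name, message) signature is assumed."""

    def __init__(self, engine_name, message):
        super().__init__(f"{engine_name}: {message}")
        self.engine_name = engine_name


class APIKeyMissingError(EngineError):
    """Local stand-in for the missing-key error."""


def classify(exc):
    """Show which handler catches a given exception."""
    try:
        raise exc
    except APIKeyMissingError:
        return "missing key"      # most specific first
    except EngineError:
        return "engine error"     # any other engine-level failure
    except SearchError:
        return "search error"     # everything else from the library


print(classify(APIKeyMissingError("brave", "BRAVE_API_KEY not set")))
print(classify(EngineError("brave", "HTTP 500")))
print(classify(SearchError("no engines")))
```

If the EngineError handler came first, it would also catch APIKeyMissingError, and the more specific branch would never run.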
Key Error Handling Features:
- Graceful Engine Failures: If one engine encounters an error (e.g., API down, invalid credentials), it logs the error and is excluded from the current search. Other configured engines will still attempt to perform the search. The overall search call will not fail unless NoEnginesAvailableError occurs.
- Clear Logging: Errors are logged with context, including the engine name and a description of the issue, aiding in debugging.
- Specific Exceptions: Using distinct exception types allows programmatic users to catch and handle different error conditions appropriately.
- Configuration Validation: Pydantic models in twat_search.web.config automatically validate many configuration parameters on load, raising errors for type mismatches or invalid values according to defined constraints.
Example (Programmatic Error Handling):
import asyncio

from twat_search.web.api import search
from twat_search.web.exceptions import (
    SearchError,
    EngineError,
    APIKeyMissingError,
    NoEnginesAvailableError,
)


async def main():
    try:
        # Intentionally use an engine that might be misconfigured or requires an API key
        results = await search("test query", engines=["misconfigured_engine", "brave"])
        for result in results:
            print(f"[{result.source_engine}] {result.title}")
    except APIKeyMissingError as e:
        print(f"API Key Error: {e}. Please check your configuration for engine '{e.engine_name}'.")
    except NoEnginesAvailableError as e:
        print(f"Search Failed: {e}. No engines could be used for the search.")
    except EngineError as e:
        # Catch other engine-specific errors
        print(f"An error occurred with engine '{e.engine_name}': {e}")
    except SearchError as e:
        # Catch any other search-related errors
        print(f"A general search error occurred: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")


if __name__ == "__main__":
    asyncio.run(main())

This structured approach to error handling makes twat-search more resilient and easier to integrate into larger applications where robust error management is critical.
We welcome contributions to twat-search! Whether it's adding a new search engine, fixing a bug, or improving documentation, your help is appreciated.
The project follows a standard Python src-layout:
- src/twat_search/: Main source code for the library.
  - web/: Core web search functionality.
    - engines/: Implementations for each search engine.
- tests/: Contains unit and integration tests.
- pyproject.toml: Defines project metadata, dependencies, build system (Hatch), and tool configurations (Ruff, Mypy, Pytest).
- README.md: This file.
- CHANGELOG.md: Tracks changes across versions.
- TODO.md: Lists planned features and improvements.
- .github/workflows/: GitHub Actions for CI (testing, linting, releases).
We use Hatch for project management and uv as a fast package installer/resolver, often via Hatch.
- Clone the repository:
  git clone https://github.com/twardoch/twat-search.git
  cd twat-search
- Create and activate the development environment: Hatch will manage this for you. To create the default environment (which includes dev, test, and all engine dependencies):
  hatch env create
  This command sets up an environment (often a virtual environment) and installs all necessary dependencies as defined in pyproject.toml.
  To activate the environment's shell if you need to run commands directly:
  hatch shell
  Alternatively, you can run commands through Hatch: hatch run <command>.
Tests are written using pytest and are located in the tests/ directory.
- Run all tests:
  hatch run test
  Or, if you have activated the shell:
  pytest
- Run tests with coverage:
  hatch run test-cov
  This will generate a coverage report in the terminal and potentially an XML report for CI. Configuration for coverage is in pyproject.toml ([tool.coverage]).
We use several tools to maintain code quality, all configured in pyproject.toml:
- Ruff: For extremely fast linting and formatting. It replaces tools like Flake8, isort, and Black.
- Mypy: For static type checking.
- Pre-commit Hooks: Configured in .pre-commit-config.yaml to run checks automatically before commits. It's highly recommended to install and use these.
Common Commands (run via Hatch):
- Check for linting issues and format code:
  hatch run lint  # Runs ruff check and ruff format
  Or, to apply automatic fixes as well:
  hatch run fix  # Runs ruff check --fix --unsafe-fixes and ruff format
- Run type checking:
  hatch run type-check
  Or within the Hatch shell:
  mypy src/twat_search tests
Pre-commit Hooks: Install pre-commit hooks to automate these checks:
pip install pre-commit  # If not already installed
pre-commit install
Now, ruff and other checks will run on staged files before each commit.
- Style: We follow PEP 8, with formatting enforced by Ruff (configured in pyproject.toml). Maximum line length is 120 characters.
- Imports:
  - Absolute imports are preferred (from twat_search.web.utils import ...).
  - Imports are sorted by Ruff (according to isort rules).
  - flake8-tidy-imports is used (via Ruff) to ban relative imports.
- Type Hinting:
  - Comprehensive type hints are mandatory for all functions and methods.
  - Mypy is used for static type checking with a strict configuration (see [tool.mypy] in pyproject.toml).
- Logging: Use the standard logging module for any informational or debug messages.
- Docstrings: Use Google-style docstrings for modules, classes, functions, and methods.
- Asynchronous Code: Use async/await syntax for all I/O-bound operations, particularly for network requests within search engines. Use httpx.AsyncClient for HTTP requests.
- Find an Issue or Feature:
  - Check the TODO.md file for a list of planned tasks.
  - Look at existing issues on GitHub, especially those labeled help wanted or good first issue.
  - If you have a new idea, consider opening an issue first to discuss it.
- Fork and Branch:
  - Fork the repository on GitHub.
  - Create a new branch for your feature or bugfix from the main branch (e.g., git checkout -b feature/add-new-engine or fix/resolve-bug-123).
- Develop:
  - Write your code, adhering to the coding conventions.
  - Add tests! New features must include corresponding tests. Bug fixes should ideally include a test that reproduces the bug and verifies the fix.
  - Ensure your code is well-documented with docstrings and comments where necessary.
- Test and Lint:
  - Run all tests: hatch run test.
  - Run linters and type checkers: hatch run lint, hatch run type-check. Fix any reported issues.
  - Using pre-commit hooks will help catch issues early.
- Update Changelog:
  - Add a concise entry to CHANGELOG.md under the "Unreleased" section, describing your changes. Include your GitHub username.
- Commit and Push:
  - Commit your changes with a clear and descriptive commit message.
  - Push your branch to your fork on GitHub.
- Submit a Pull Request (PR):
  - Open a PR from your branch to the main branch of the twardoch/twat-search repository.
  - Provide a clear description of the changes in the PR. Link to any relevant issues.
  - Ensure that CI checks (GitHub Actions) pass on your PR.
This section is primarily for project maintainers.
- Versioning: The project uses hatch-vcs for versioning, which derives the version from Git tags.
- Building: Hatch is used to build the package (wheels and sdists).
  hatch build
- Releasing:
  - Ensure the main branch is up-to-date and all tests/checks pass.
  - Update CHANGELOG.md: Move entries from "Unreleased" to a new version section.
  - Commit changelog changes: git commit -m "Docs: Update changelog for vX.Y.Z"
  - Create a Git tag for the new version: git tag vX.Y.Z
  - Push the commit and tag: git push && git push --tags
  - The release.yml GitHub Actions workflow should automatically trigger, build the package, and publish it to PyPI.
  - Verify the new version on PyPI. Create a GitHub Release from the tag with release notes based on the changelog.
The project now includes comprehensive local development scripts for improved developer experience:
- Build Script (scripts/build.sh):
  # Full build pipeline
  ./scripts/build.sh
  # Only code quality checks
  ./scripts/build.sh --quality
  # Only run tests
  ./scripts/build.sh --test
  # Only build package
  ./scripts/build.sh --build
  # Build without tests
  ./scripts/build.sh --skip-tests
- Test Script (scripts/test.sh):
  # Run all tests
  ./scripts/test.sh
  # Run unit tests only
  ./scripts/test.sh --unit
  # Run integration tests only
  ./scripts/test.sh --integration
  # Run specific test pattern
  ./scripts/test.sh --pattern "search"
  # Run tests with specific marker
  ./scripts/test.sh --marker "unit"
  # Generate comprehensive test report
  ./scripts/test.sh --report
  # Run in watch mode for continuous testing
  ./scripts/test.sh --watch
- Release Script (scripts/release.sh):
  # Release a new version
  ./scripts/release.sh 1.0.0
  # Show current version
  ./scripts/release.sh --current
  # Suggest next version
  ./scripts/release.sh --major  # Next major version
  ./scripts/release.sh --minor  # Next minor version
  ./scripts/release.sh --patch  # Next patch version
  # Dry run (show what would be done)
  ./scripts/release.sh --dry-run 1.0.0
  # Create release and push tags
  ./scripts/release.sh --push-tags 1.0.0
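For readers unfamiliar with semantic versioning, the next-version suggestions can be thought of as bumping one component and resetting the lower ones. The sketch below shows that rule in Python. It is an illustration of standard semver bumping, not a description of what scripts/release.sh actually does internally, and the bump function and TAG_PATTERN names are hypothetical.

```python
import re

# Accepts "1.2.3" or a Git tag such as "v1.2.3" (no pre-release suffixes).
TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def bump(version, part):
    """Return the next version string for the given part."""
    match = TAG_PATTERN.match(version)
    if not match:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


for part in ("major", "minor", "patch"):
    print(part, bump("v1.2.3", part))
```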
The project includes a comprehensive CI/CD pipeline with the following features:
- Multiplatform Testing: Tests run on Ubuntu, Windows, and macOS
- Python Version Matrix: Tests against Python 3.10, 3.11, and 3.12
- Code Quality Checks: Automated linting, formatting, and type checking
- Binary Builds: Generates standalone binaries for all platforms
- Coverage Reporting: Comprehensive test coverage analysis
- Semver Validation: Automatic validation of semantic versioning
- Automated Releases: Full release automation with PyPI publishing
- Pre-release Support: Manual pre-release workflow for testing
- Regular Releases:
  - Push a git tag following semantic versioning (e.g., v1.0.0)
  - GitHub Actions automatically validates, tests, builds, and publishes
  - Creates GitHub release with binaries and release notes
- Pre-releases:
  - Use the manual pre-release workflow in GitHub Actions
  - Publishes to Test PyPI for validation
  - Creates GitHub pre-release for testing
- Installation Options:
  - PyPI: pip install twat-search
  - Binary Downloads: Platform-specific executables from GitHub releases
  - Development: pip install -e . from source
The CI/CD pipeline generates standalone binaries for easy installation:
- Linux: twat-search-VERSION-ubuntu-latest
- Windows: twat-search-VERSION-windows-latest.exe
- macOS: twat-search-VERSION-macos-latest
These binaries are available in GitHub releases and require no Python installation.
This project is licensed under the MIT License. See the LICENSE file for details.