chore: add pyright type checking and ci checks for it #280

Merged
ekzhu merged 22 commits into ekzhu:master from bhimrazy:checks/add-pyright on Jan 2, 2026

bhimrazy (Contributor) commented Nov 11, 2025

What does this PR do?

  • Adds Pyright type checking configuration and CI workflow to improve code quality and type safety.
  • Fixes type hints, assertions, and error handling in storage, LSH, MinHash, and related modules.
  • Updates pyproject.toml with Pyright settings.

Checklist

  • Are unit tests passing?
  • Documentation added/updated for all public APIs?
  • Is this a breaking change? If yes, add "[BREAKING]" to the PR title.

bhimrazy changed the title from "Checks/add-pyright" to "chore: add pyright type checking and ci checks for it" on Dec 20, 2025
bhimrazy marked this pull request as ready for review on December 20, 2025 04:57
bhimrazy requested a review from ekzhu as a code owner on December 20, 2025 04:57
ekzhu (Owner) left a comment

Thanks! Could you resolve the conflict introduced by the latest merge?

bhimrazy (Contributor, Author) commented

> Thanks! Could you resolve the conflict that was introduced by the latest merge.

Sure @ekzhu

Copilot AI review requested due to automatic review settings January 2, 2026 08:34
codecov-commenter commented Jan 2, 2026

Codecov Report

❌ Patch coverage is 78.12500% with 7 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@4e29f97).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| datasketch/storage.py | 76.92% | 3 Missing ⚠️ |
| datasketch/lsh.py | 71.42% | 2 Missing ⚠️ |
| datasketch/minhash.py | 66.66% | 1 Missing ⚠️ |
| datasketch/weighted_minhash.py | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff            @@
##             master     #280   +/-   ##
=========================================
  Coverage          ?   77.52%           
=========================================
  Files             ?       15           
  Lines             ?     2056           
  Branches          ?        0           
=========================================
  Hits              ?     1594           
  Misses            ?      462           
  Partials          ?        0           
| Flag | Coverage Δ |
|---|---|
| unittests | 77.52% <78.12%> (?) |

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds Pyright static type checking to the project to improve code quality and type safety. The changes include configuration for Pyright in pyproject.toml, a new CI workflow for automated type checking, and various fixes to type hints and error handling across the codebase.

  • Configures Pyright with basic type checking mode and selective rule disabling
  • Adds GitHub Actions workflow for automated Pyright checks
  • Improves type annotations and error handling in storage, LSH, and MinHash modules
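
As an illustration of the kind of error-handling fix a type checker tends to prompt, here is a minimal sketch of narrowing optional parameters with an explicit None check. The function name `validate_bloom_params` and its bounds are hypothetical and are not taken from the datasketch code; only the idea of checking `n` and `fp` for None comes from the review summary.

```python
from typing import Optional, Tuple


def validate_bloom_params(
    n: Optional[int] = None, fp: Optional[float] = None
) -> Tuple[int, float]:
    """Hypothetical validator: reject missing or out-of-range parameters."""
    # Without this check, a type checker reports that `n <= 0` may compare
    # None with an int. After it, n is narrowed to int and fp to float.
    if n is None or fp is None:
        raise ValueError("Both n and fp must be provided.")
    if n <= 0:
        raise ValueError("n must be a positive integer.")
    if not 0.0 < fp < 1.0:
        raise ValueError("fp must be strictly between 0 and 1.")
    return n, fp


checked = validate_bloom_params(n=1000, fp=0.01)
```

Raising on the None case, rather than silently returning, keeps the failure close to the caller's mistake and gives the checker a concrete type to work with afterwards.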

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| pyproject.toml | Adds Pyright configuration with basic type checking mode and various reports disabled |
| .github/workflows/checks.yml | Adds new Pyright CI job using uv and pyright |
| datasketch/weighted_minhash.py | Changes type check from Iterable to Sized, adds redundant type annotation, refactors array initialization |
| datasketch/storage.py | Adds return type annotations, changes None returns to ValueError, adds buffer_size property, makes seeds parameter optional |
| datasketch/minhash.py | Updates import statements, changes type hints from Iterable to Sized/Union[Sized, np.ndarray], adds return type to _parse_hashvalues |
| datasketch/lshensemble.py | Extracts variable to avoid potential type checking issues with array indexing |
| datasketch/lsh_bloom.py | Adds None checks for n and fp parameters in validation |
| datasketch/lsh.py | Adds explicit type annotations for storage attributes, changes _merge return type to None, adds None check for hashfunc |
| datasketch/experimental/aio/storage.py | Adds type: ignore comments for async Redis method calls |
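
The move from Iterable to Sized noted for several files can be motivated with a small sketch. It assumes the affected code calls `len()` on its input, which the summary does not state outright; the helper `count_items_checked` is hypothetical and only shows why Sized is the narrower and safer check in that case.

```python
from collections.abc import Iterable, Sized


def count_items_checked(values: Sized) -> int:
    """Hypothetical helper: Sized guarantees that len() is available."""
    if not isinstance(values, Sized):
        raise ValueError("values must support len().")
    return len(values)


def squares():
    # A generator is Iterable but not Sized: it has __iter__ but no __len__.
    yield from (i * i for i in range(3))


gen = squares()
is_iterable = isinstance(gen, Iterable)  # passes an Iterable check
is_sized = isinstance(gen, Sized)        # fails a Sized check
list_len = count_items_checked([1, 2, 3])
```

An Iterable check would accept the generator and then fail later at `len()`, whereas the Sized check rejects it up front.
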
Comments suppressed due to low confidence (1)

datasketch/minhash.py:142

  • The _parse_hashvalues method is called twice for the same hashvalues input: once on line 121 to get the length, and again on line 142 to assign to self.hashvalues. This is inefficient - the result from line 121 should be stored and reused on line 142 to avoid parsing the same data twice.
            hashvalues = self._parse_hashvalues(hashvalues)
            num_perm = len(hashvalues)
        if num_perm > _hash_range:
            # Because 1) we don't want the size to be too large, and
            # 2) we are using 4 bytes to store the size value
            raise ValueError(
                "Cannot have more than %d number of\
                    permutation functions"
                % _hash_range
            )
        self.seed = seed
        self.num_perm = num_perm
        # Check the hash function.
        if not callable(hashfunc):
            raise ValueError("The hashfunc must be a callable.")
        self.hashfunc = hashfunc
        # Check for use of hashobj and issue warning.
        if hashobj is not None:
            warnings.warn("hashobj is deprecated, use hashfunc instead.", DeprecationWarning, stacklevel=2)
        # Initialize hash values
        if hashvalues is not None:
            self.hashvalues = self._parse_hashvalues(hashvalues)
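
The suggestion above, parsing once and reusing the result, might look roughly like the following sketch. `SketchExample` is a hypothetical stand-in and not datasketch's MinHash class: it uses plain lists instead of NumPy arrays, a placeholder default of zeros, and a `parse_calls` counter added only to make the single parse observable.

```python
from typing import List, Optional, Sequence


class SketchExample:
    """Hypothetical stand-in for illustration; not datasketch's MinHash."""

    def __init__(
        self, num_perm: int = 4, hashvalues: Optional[Sequence[int]] = None
    ) -> None:
        self.parse_calls = 0
        parsed: Optional[List[int]] = None
        if hashvalues is not None:
            # Parse once and keep the result for later use.
            parsed = self._parse_hashvalues(hashvalues)
            num_perm = len(parsed)
        self.num_perm = num_perm
        # Reuse the parsed values instead of parsing the input a second time.
        # The zero-filled default is a placeholder, not the library's behavior.
        self.hashvalues = parsed if parsed is not None else [0] * num_perm

    def _parse_hashvalues(self, hashvalues: Sequence[int]) -> List[int]:
        self.parse_calls += 1
        return [int(v) for v in hashvalues]


sketch = SketchExample(hashvalues=(7, 8, 9))
```

Holding the parsed result in a local variable also gives the type checker a single, already-narrowed value to reason about in both places.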

bhimrazy and others added 3 commits January 2, 2026 14:47
ekzhu merged commit 50ce2a0 into ekzhu:master on Jan 2, 2026
10 checks passed
ekzhu (Owner) commented Jan 2, 2026

@bhimrazy great work!

And Happy New Year to you 🎆

bhimrazy deleted the checks/add-pyright branch on January 3, 2026 06:59
bhimrazy (Contributor, Author) commented Jan 3, 2026

> @bhimrazy great work!
>
> And Happy New Year to you 🎆

Thanks, @ekzhu! 🙌
Happy New Year to you too ✨


4 participants