@gpauloski
Collaborator

Description

This PR adds the cache_defaults and target parameters to Proxy. When used together, the proxy caches default values of __class__ and __hash__ so that hash(proxy) and isinstance(proxy, ...) can be used without needing to resolve the proxy. The populate_target flag of Store methods now also sets cache_defaults=True by default.
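
To illustrate the idea, here is a minimal, self-contained sketch. It is a toy and not the implementation in this PR: the LazyProxy class, its private attributes, and resolve_count are hypothetical names, and only the target and cache_defaults parameter names are borrowed from the description above. The sketch assumes a hashable target and, as a simplification, discards the target after computing the cached values to imitate a proxy that has been sent to another process.

```python
from typing import Any, Callable, Optional


class LazyProxy:
    """Toy lazy proxy (hypothetical; not ProxyStore's Proxy class)."""

    def __init__(
        self,
        factory: Callable[[], Any],
        *,
        target: Optional[Any] = None,
        cache_defaults: bool = False,
    ) -> None:
        self._factory = factory
        # Toy simplification: the target is only used to compute the cached
        # defaults and is then discarded, imitating a proxy that was shipped
        # to another process without its target.
        self._target: Any = None
        self._resolved = False
        self.resolve_count = 0
        self._cached_class: Optional[type] = None
        self._cached_hash: Optional[int] = None
        if cache_defaults and target is not None:
            self._cached_class = type(target)
            self._cached_hash = hash(target)  # assumes a hashable target

    def _resolve(self) -> Any:
        # Call the factory at most once and remember the result.
        if not self._resolved:
            self._target = self._factory()
            self._resolved = True
            self.resolve_count += 1
        return self._target

    @property
    def __class__(self) -> type:
        # isinstance() falls back to obj.__class__ when type(obj) does not match.
        if self._cached_class is not None:
            return self._cached_class
        return type(self._resolve())

    def __hash__(self) -> int:
        if self._cached_hash is not None:
            return self._cached_hash
        return hash(self._resolve())


data = (1, 2, 3)

# With cached defaults, isinstance() and hash() do not call the factory.
cached = LazyProxy(lambda: data, target=data, cache_defaults=True)
assert isinstance(cached, tuple)
assert hash(cached) == hash(data)
assert cached.resolve_count == 0

# Without cached defaults, isinstance() has to resolve the proxy first.
plain = LazyProxy(lambda: data)
assert isinstance(plain, tuple)
assert plain.resolve_count == 1
```

The second half of the sketch shows the cost being avoided: without cached defaults, a plain isinstance() check is enough to trigger a resolve.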

Because this change added more special attributes to the Proxy class, I renamed all special attributes (__wrapped__, __factory__, __resolved__, etc.) to be prefixed by __proxy_* to better distinguish what is a property of Proxy versus the target. This is not considered a breaking change because those properties are an internal implementation detail of the Proxy class.

Fixes

Type of Change

  • Breaking Change (fix or enhancement which changes existing semantics of the public interface)
  • Enhancement (new features or improvements to existing functionality)
  • Bug (fixes for a bug or issue)
  • Internal (refactoring, style changes, testing, optimizations)
  • Documentation update (changes to documentation or examples)
  • Package (dependencies, versions, package metadata)
  • Development (CI workflows, pre-commit, linters, templates)
  • Security (security related changes)

Testing

Added new unit tests for the features.

I also validated against this Dask example:

import logging
import tempfile

from dask.distributed import Client
from proxystore.connectors.file import FileConnector
from proxystore.store import Store

logging.basicConfig(level=logging.DEBUG)

if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp_dir:
        with Store('default', FileConnector(tmp_dir)) as store:
            client = Client(processes=True)

            x = list(range(100))
            p = store.proxy(x, populate_target=True)
            y = client.submit(sum, p)

            print(f'Result: {y.result()}')

            client.close()

Here, when populate_target=False, the proxy will be resolved three times: on the client when the proxy is serialized, on the scheduler when Dask does its input management, and on the worker when the proxy is actually used. With populate_target=True, the proxy is resolved only once, on the worker, when it is actually used.

Pull Request Checklist

Please confirm the PR meets the following requirements.

  • Tags added to PR (e.g., breaking, bug, enhancement, internal, documentation, package, development, security).
  • Code changes pass pre-commit (e.g., mypy, ruff, etc.).
  • Tests have been added to show the fix is effective or that the new feature works.
  • New and existing unit tests pass locally with the changes.
  • Docs have been updated and reviewed if relevant.

The cache_defaults flag and target parameter, when combined, will
compute and cache the __class__ and __hash__ values of the proxy such
that the proxy can be hashed or used with isinstance() without needing
to resolve the proxy.

This is particularly important for certain serializers or workflow
systems (like Dask) that want to do some custom management of values
which could be proxies.

There are quite a few special attributes between the base Proxy and the
reference Proxy implementations. To keep it clear what is a special
Proxy attribute and an attribute of the target, I've changed all the
special Proxy attributes to be prefixed with __proxy_*.

This is similar to how pydantic v2 prefixes all the pydantic specific
attributes and methods with "model_". This is not a breaking change
because these are all internal implementation details of the Proxy.

There were a few places where __factory__ was accessed so I've added a
get_factory() function to make that the preferred way of getting the
factory.
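
A small sketch of why the prefix helps, under stated assumptions: ForwardingProxy below is a hypothetical toy class, not the Proxy from this PR, and this get_factory() is a stand-in written for the example rather than the function added here. It keeps the proxy's own state under __proxy_* names, so an attribute such as __wrapped__ (which functools.wraps sets on decorated functions) is unambiguously the target's.

```python
import functools
from typing import Any, Callable


class ForwardingProxy:
    """Toy proxy that forwards unknown attributes to its target (hypothetical)."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        # Proxy-owned state lives under __proxy_* names. Names that end in
        # two underscores are not name-mangled by Python.
        self.__proxy_factory__ = factory
        self.__proxy_resolved__ = False
        self.__proxy_wrapped__ = None

    def __getattr__(self, name: str) -> Any:
        # Only called when normal lookup fails, so __proxy_* attributes never
        # reach this method; everything else is forwarded to the target.
        if not self.__proxy_resolved__:
            self.__proxy_wrapped__ = self.__proxy_factory__()
            self.__proxy_resolved__ = True
        return getattr(self.__proxy_wrapped__, name)


def get_factory(proxy: ForwardingProxy) -> Callable[[], Any]:
    """Stand-in accessor so callers need not touch the attribute directly."""
    return proxy.__proxy_factory__


def original() -> str:
    return 'hi'


@functools.wraps(original)
def wrapper() -> str:
    return original()


proxy = ForwardingProxy(lambda: wrapper)

# __wrapped__ belongs to the target (set by functools.wraps), not the proxy.
assert proxy.__wrapped__ is original
# The factory is reached through the accessor instead of the raw attribute.
assert get_factory(proxy)() is wrapper
```

Had the toy proxy stored its target under __wrapped__, the first assertion would have returned the proxy's own state instead of the target's attribute.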

@gpauloski added the enhancement label on Apr 24, 2024
@gpauloski merged commit 6b85141 into main on Apr 24, 2024
@gpauloski deleted the issue-549 branch on April 24, 2024 17:53
Labels

enhancement New features or improvements to existing functionality

Development

Successfully merging this pull request may close these issues.

Support paths for hash and isinstance that does not resolve a Proxy
