Skip to content

Update tokenize to treat dict and kwargs differently#8655

Merged
jcrist merged 2 commits intodask:mainfrom
jrbourbeau:tokenize-kwargs
Feb 15, 2022
Merged

Update tokenize to treat dict and kwargs differently#8655
jcrist merged 2 commits intodask:mainfrom
jrbourbeau:tokenize-kwargs

Conversation

@jrbourbeau
Copy link
Copy Markdown
Member

This ensures, for example, tokenize(5, foo="bar") and tokenize(5, {"foo": "bar"}) yield different hashes.

cc @graingert

dask/base.py Outdated
args = args + (kwargs,)
return md5(str(tuple(map(normalize_token, args))).encode()).hexdigest()
token += str(tuple(map(normalize_token, (kwargs,))))
return md5(token.encode()).hexdigest()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub won't let me suggest since there's deleted lines, but a more efficient fix would be:

hasher = md5(str(tuple(map(normalize_token, args))).encode())
if kwargs:
    hasher.update(str(normalize_token(kwargs)).encode())
return hasher.hexdigest()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion -- agreed that looks better 👍

if kwargs:
args = args + (kwargs,)
return md5(str(tuple(map(normalize_token, args))).encode()).hexdigest()
hasher.update(str(normalize_token(kwargs)).encode())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might it be worth copying the test case from #8632 ?

@jsignell jsignell added core enhancement Improve existing functionality or make things work better labels Feb 4, 2022
Copy link
Copy Markdown
Member

@jcrist jcrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as is. Merging.

@jcrist jcrist merged commit 943a7b2 into dask:main Feb 15, 2022
@jrbourbeau jrbourbeau deleted the tokenize-kwargs branch February 15, 2022 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core enhancement Improve existing functionality or make things work better

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants