
Implement digamma#3955

Closed
zou3519 wants to merge 3 commits into pytorch:master from zou3519:digamma

Conversation

@zou3519 (Contributor) commented Nov 30, 2017

Implements torch.digamma and tensor.digamma() as per #678

cc @fritzo

Test Plan

New unit tests. The CPU unit tests cover some edge cases; the GPU unit test compares the CPU output against the GPU output.
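
As a rough illustration (not the PR's actual test code), a CPU-vs-GPU comparison along these lines might look like the following sketch, which assumes a CUDA-capable build; the input values are illustrative:

import torch

def check_digamma_cpu_vs_gpu():
    # Compare the CUDA result against the CPU result on a few inputs,
    # including a negative value and a large one.
    cpu = torch.tensor([-0.1, 0.5, 1.0, 3.0, 999.0], dtype=torch.float64)
    gpu = cpu.cuda()
    assert torch.allclose(torch.digamma(cpu), torch.digamma(gpu).cpu())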

@pytorchbot (Collaborator)

@zou3519, thanks for your PR! We identified @zdevito to be a potential reviewer.

@fritzo (Collaborator) left a comment

Thanks for implementing this!

Review thread on test/test_torch.py (outdated):

    def test_digamma(self):
        y = torch.Tensor([-10, 0])
        x = torch.Tensor([-0.1, 3, 999])


@gchanan (Contributor) commented Dec 1, 2017

fwiw I'm adding the ability to write efficient pointwise native functions in ATen directly.


- name: lgamma(Tensor self)
self: not_implemented("lgamma")
self: grad * digamma(self)

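The derivatives entry above defines the lgamma backward as grad * digamma(self), which follows from d/dx log Γ(x) = ψ(x). As a quick sanity check of that relationship (a sketch, assuming a PyTorch build where lgamma has autograd support):

import torch

x = torch.tensor([0.5, 1.5, 3.0], requires_grad=True)
torch.lgamma(x).sum().backward()
# The gradient of lgamma should match digamma elementwise.
print(torch.allclose(x.grad, torch.digamma(x)))  # expected: True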

@apaszke (Contributor) left a comment

Looks good. We should resolve the licensing issues before merging this.

Review thread on aten/src/TH/generic/THTensorMath.c (outdated):
return THTensor_(digamma_one)(1 - x) - PI / tan(PI * x);
}

// Push x to be >= 10

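The snippet above hints at the overall structure: a reflection formula for negative inputs, a recurrence that pushes x above 10, and then an asymptotic series. A minimal Python sketch of that structure (illustrative only, with a truncated series; not the PR's C implementation):

import math

def digamma_sketch(x):
    if x <= 0:
        if x == math.floor(x):
            return math.inf  # poles at non-positive integers
        # Reflection formula: psi(x) = psi(1 - x) - pi / tan(pi * x)
        return digamma_sketch(1 - x) - math.pi / math.tan(math.pi * x)
    result = 0.0
    # Recurrence psi(x + 1) = psi(x) + 1/x, used to push x to be >= 10.
    while x < 10:
        result -= 1.0 / x
        x += 1.0
    # Short asymptotic expansion for large x.
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252)))
    return result + math.log(x) - 0.5 / x - series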

Review thread on aten/src/TH/generic/THTensorMath.c (outdated):
}

/*
 * Algorithm adapted from Cephes


Review thread on test/test_cuda.py:
        tensor = tensor.unsqueeze(1)
        self.assertEqual(tensor.var(0)[0], 0.03125)

    def test_digamma(self):




@fritzo (Collaborator) commented Dec 9, 2017

FYI, this will conflict with #3978, where I've implemented a makeshift finite-difference version of THTensor_(digamma_one). If #3978 merges before this PR, you can simply delete my makeshift version.

@fritzo (Collaborator) commented Dec 15, 2017

@zou3519 can I help with this at all? Now that you've done the hard part, I'm happy to resolve merge conflicts and add tests. (We're looking forward to using this in Pyro, and we already have a wrapper to use torch.distributions.Gamma).

@zou3519 (Contributor, Author) commented Dec 15, 2017

@fritzo I haven't reached out to the author about the licensing yet. I'll do that tomorrow and we'll see how that goes :)

@fritzo (Collaborator) commented Dec 15, 2017

I see, thanks for letting me know!

@ezyang (Contributor) commented Dec 21, 2017

Any update on the license status?

@zou3519 (Contributor, Author) commented Dec 21, 2017

I sent an email to the author a few days ago but haven't heard back yet.

@fritzo (Collaborator) commented Dec 28, 2017

If we still haven't heard from the author, I could throw together a little PR that exposes the makeshift finite-difference implementation of digamma that we're already using internally in a few places. This would at least unblock users of Gamma, Beta, and Dirichlet distributions.
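
For context, a central-difference fallback of that kind might look roughly like this (a sketch, not the actual internal helper; eps is an illustrative step size, and accuracy degrades near the poles):

import torch

def digamma_finite_difference(x, eps=1e-3):
    # digamma is d/dx log(Gamma(x)), so approximate it with a
    # central difference of lgamma.
    return (torch.lgamma(x + eps) - torch.lgamma(x - eps)) / (2 * eps)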

@apaszke (Contributor) commented Dec 28, 2017

I think merging an approximation is good for now

@fritzo (Collaborator) commented Dec 28, 2017

Hmm, I looked into it, but it appears to be quite complex now that our functions using digamma are spread across C code in aten/src/TH/generic/THTensorMath.c and C++ code in aten/src/ATen/native/Distributions.cpp. @gchanan, would it be difficult for you to move dirichlet_grad() to C++ as well, so we can implement digamma() in a single place?

@gchanan (Contributor) commented Dec 28, 2017

@fritzo Sure, I'll take a look at it.

@zou3519 (Contributor, Author) commented Jan 23, 2018

Licensing issues are resolved and the PR has been rebased. I was told "You are welcome to modify and distribute the library under BSD license.", so I've added the original copyright notices to the code as comments.

@fritzo (Collaborator) commented Jan 23, 2018

That's great news, @zou3519!

Added test to check digamma float vs double.
TestCuda.test_digamma checks the CUDA {float, double} implementation
against the CPU {float, double} implementation.
@wranai commented Mar 31, 2018

Any news on this? It seems only the GPU build failed on Windows last November, and that only because it ran out of memory.

@fritzo (Collaborator) commented Mar 31, 2018

@wranai A simpler implementation of digamma and trigamma is already in master. Try torch.digamma.
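
For example, on a recent PyTorch build (values shown are approximate):

import torch

x = torch.tensor([0.5, 1.0, 2.0])
print(torch.digamma(x))  # roughly [-1.9635, -0.5772, 0.4228]
print(x.digamma())       # the tensor-method form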

@soumith closed this Mar 31, 2018
zou3519 added a commit to zou3519/pytorch that referenced this pull request Apr 11, 2018
Fixes pytorch#6190.

This is a rebase of pytorch#3955 with some tweaks for better performance around
poles. The code is ported over from cephes with permission.

By itself, the cephes code returns inf for the poles.

For better performance around the poles with float32, one intermediate
step is always computed with double precision, regardless of dtype.
This step does `PI / tan(PI * input)`. This is necessary because small (1e-6)
rounding errors in the inputs to tan have strong effects on the output
(i.e., the derivative of tan is very large at some points).
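
To see why the double-precision intermediate matters, here is a small sketch comparing the PI / tan(PI * input) term computed from a double input versus the same input first rounded to float32; near a pole (here, an input assumed to be close to -100) the two can differ noticeably:

import math
import struct

def to_float32(v):
    # Round a Python double to the nearest float32, then back to double.
    return struct.unpack("f", struct.pack("f", v))[0]

x = -99.999  # close to the pole at -100
exact = math.pi / math.tan(math.pi * x)
rounded = math.pi / math.tan(math.pi * to_float32(x))
print(exact, rounded)  # the small input rounding error shifts the result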
@zou3519 mentioned this pull request Apr 11, 2018
zou3519 added a commit that referenced this pull request Apr 13, 2018
* More precise digamma

Fixes #6190.

This is a rebase of #3955 with some tweaks for better performance around
poles. The code is ported over from cephes with permission.

By itself, the cephes code returns inf for the poles.

For better performance around the poles with float32, one intermediate
step is always computed with double precision, regardless of dtype.
This step does `PI / tan(PI * input)`. This is necessary because small (1e-6)
rounding errors in the inputs to tan have strong effects on the output
(i.e., the derivative of tan is very large at some points).

* Replace usages of finite-differences digamma with newly implemented digamma

* Better behavior near and at poles

* ScalarConvert -> scalar_cast for readability