Switch to anderson method to speed up zetazero #959
Conversation
Given the smooth nature of the zeta function, this is an unambiguous win while maintaining the bracketing guarantee.
---
Could you please provide some benchmarks, e.g. with pyperf?
```python
import pyperf
import mpmath

runner = pyperf.Runner()
for prec in [53, 100, 200, 400]:
    with mpmath.mp.workprec(prec):
        for x in [9, 99, 999, 9999]:
            runner.bench_func(f"zetazero({x}) @{prec}", mpmath.zetazero, x)
```
---
I was checking zetazeros() in 2010. Zero number 1000000000000000 is equal to … and the time spent on the computation was … It is clear that this result is not reliable. The program searches for two consecutive intervals where the theorem of Brent can be applied. To the right, it appears that there were 7 consecutive good Gram intervals. But what makes this not reliable is that apparently the program has separated … It follows that we must change the program so that, for these high values, it computes … To clear up the situation, I looked at how the Gram points lie in the studied block.

For example, we get that the Gram point n = 89 is 208514052006403.4671825650459474923. We intercalate the values into the pattern of zeros; in this way we see that the given values are consistent. (End of quote.) This is only to show you that I worked hard to get a program giving the true n-th zero, and to the correct precision.
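As a standalone aside for readers unfamiliar with Gram points (a hypothetical illustration using mpmath's public API with the small absolute index n = 89, not the block-relative index near t ≈ 2·10^14 discussed above): the n-th Gram point g_n is defined by θ(g_n) = nπ, where θ is the Riemann-Siegel theta function.

```python
from mpmath import mp, grampoint, siegeltheta, pi

mp.dps = 25
# The n-th Gram point g_n satisfies theta(g_n) = n*pi by definition,
# where theta is the Riemann-Siegel theta function.
g = grampoint(89)
assert abs(siegeltheta(g) - 89 * pi) < mp.mpf("1e-15")
```

Gram points matter here because consecutive Gram intervals usually contain exactly one zero of Z(t), which is what the separation step exploits.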
---
I am not sure what you are saying, exactly.
Are you saying that the change I made here is flawed, and you determined this by checking the 10^15-th zero? But this seems impossible, because the time listed for the computation is 1 day 9 hours, which is much longer than my PR has been open. So I must assume you mean something else. I do not dispute that you put careful engineering work into this function. Indeed, that is why the change I made here makes a minimal disturbance and should have no observable numerical impact. If it does, it would imply (to me) that the root-finding method itself is broken.
---
@d0sboots, did you also test the 'pegasus' option? BTW, you could use pyperf's compare_to command to do the comparison, e.g. as a table.
---
I'd tried pegasus already; it's not as good. I reran all the numbers to capture them as JSON so I could use compare_to. Edit: The falloff in speedup as the precision and/or zero index gets larger is expected. As these parameters get larger, the initial …
---
BTW, it seems that the change in separate_zeros_in_block() doesn't affect your tests.
---
My point the other day is that I did a lot of checking to implement zetazeros. I see your checks only cover zeros with n < 10000. This is nothing. (Also, I am not fluent in English; it is difficult for me to express myself in English.)

I recall selecting very carefully the method used to compute the zeros to high precision; it is many years ago that I implemented zetazeros. The main difficulty is to get exactly the n-th zero and not any other of the many near it. To this end the method first determines an interval [a,b] in which we know there are exactly N zeros, of which our zero is the m-th. The distribution of zeros of zeta is such that some zeros are very close to each other and some are not. When we determine the adequate number of sign changes of Z(t), we will know an interval [a',b'] ⊂ [a,b] in which Z(t) changes sign and which contains a unique zero: our zero. So we have to use a method based on bisection: Illinois, Pegasus or Anderson. I recall checking all these methods (and other ones).

You have been checking some small zeros; this is not sufficient. The problem arose when zeros were very near each other. In some cases the program returned a zero not in the interval [a',b'], that is, not the n-th zero. You have to check this type of zero. Improving the time on small zeros by some fraction of a second, when the main time is spent locating the zero, is not a great advance. Therefore, before changing the method, see that it computes the difficult cases well.

I do not find my checks, but I can give you some cases that I know were not easy. Apart from the 10^15 zero, here are some others more easily computed. For example, Van de Lune gives me two near zeros … Or check the zeros Gourdon found closest for n < 10^13: zero number 8637740722917 is …
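To make the separation step concrete for readers following along, here is a standalone sketch on small zeros using mpmath's public grampoint and siegelz (this is an illustration, not the internal zetazero code): Z(t) is real on the critical line and changes sign at each simple zero, and Gram intervals usually, though not always, bracket exactly one sign change.

```python
from mpmath import mp, grampoint, siegelz, zetazero

mp.dps = 20
# Separating zeros means finding subintervals with exactly one sign
# change of Z(t); the Gram interval (g_0, g_1) provides one such bracket.
a, b = grampoint(0), grampoint(1)
assert siegelz(a) * siegelz(b) < 0   # exactly one sign change inside
rho = zetazero(2)                    # the zero lying in (g_0, g_1)
assert a < rho.imag < b
```

The difficulty ariasdereyna describes arises exactly when this "usually" fails: very close zeros can put two sign changes in one interval, or none, forcing finer subdivision at higher precision.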
That's something worth documenting in the code.
The reason I did not check large zeros is that there should not be any correctness implications in switching between these methods; the only implication should be performance. All of these methods are based on bisection and maintain the invariant that they keep the root bracketed, as long as there is only a single root in the interval from the start. That is the same invariant that all of them require.

Let me be more plain: if you found a correctness difference between these methods when implementing zetazeros, it either points to a bug in findroot (or one of its solvers), or the original invariant of having an interval bracketing a single root wasn't holding. It sounds like in some cases the latter was true (the interval was previously too wide), but if zetazeros is correct now, that is no longer the case.

If you are worried about particular difficult cases, the right answer is for those to be unittests. If they are too large to be run regularly, they can be added as skipped tests, or there are other solutions - I'm not sure how it's done in this project, but I am happy to add more tests in this PR. Relying on manual testing for correctness is not sustainable.
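A minimal sketch of that invariant, on a hypothetical standalone function rather than Z(t): each of mpmath's bracketing solvers takes an interval with a sign change, never leaves it, and, given a single root inside, converges to the same answer; only the speed differs.

```python
from mpmath import mp, findroot, sin, pi

mp.dps = 30
# sin(x) has exactly one root (x = pi) bracketed in [3, 4], with
# sin(3) > 0 and sin(4) < 0.  All bracketing solvers must agree.
for solver in ("illinois", "pegasus", "anderson"):
    root = findroot(sin, (3, 4), solver=solver)
    assert 3 < root < 4                      # the bracket was never left
    assert abs(root - pi) < mp.mpf("1e-20")  # all converge to pi
```

Under the stated assumption (exactly one sign change in the starting interval), swapping 'illinois' for 'anderson' can only change iteration counts, not the returned root.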
Yes, I changed that for consistency, but it only triggers on specific values, which my benchmarks don't hit. I didn't expect it to be a significant source of the runtime for the ranges where this optimization is noticeable.
It turns out that two of these were already in the unittests (the first one and the third one, although the latter is commented out for "potentially taking hours"). I added the middle one, and I'm running the commented-out ones to verify. Edit: Pleasant surprise, they only took 8 minutes each (at the default 53-bit precision of the tests). Both passed.
On my box these tests time out. You might try uncommenting them to see if they pass on CI. Though, I think it's better to have a separate test with a "slow" mark.
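One possible shape for such gating (a sketch using stdlib unittest and a made-up MPMATH_SLOW_TESTS environment variable; this is not mpmath's actual test layout):

```python
import os
import unittest

from mpmath import mp, zetazero

# Hypothetical sketch: gate expensive regression cases behind an
# environment variable so CI jobs opt in explicitly instead of
# timing out by default.
@unittest.skipUnless(os.environ.get("MPMATH_SLOW_TESTS"), "slow test")
class SlowZetazeroTests(unittest.TestCase):
    def test_close_pair(self):
        mp.dps = 15
        # Two unusually close zeros from the discussion above.
        z1 = zetazero(1048449114)
        z2 = zetazero(1048449115)
        self.assertLess(z1.imag, z2.imag)
```

By default the class is skipped, so the slow cases stay in the tree without burdening every CI run.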
---
I think I'm missing the module that causes tests to time out. In any case, 8 minutes is far too slow to run as part of CI; I was just happy that they ran faster than the rumored "hours".
Taken from mpmath#959
@ariasdereyna, is this a valid assumption about your code that calls findroot(), or are we missing something?
---
The short answer is yes. The program first determines an interval (g_r, g_s] containing m zeros; we also know that our zero is the k-th zero of this interval. After we separate the m zeros in the interval by locating the adequate number of sign changes, we determine an interval (a,b) ⊂ (g_r, g_s] that contains only the zero we are computing. We also have a sign change, Z(a) Z(b) < 0. At this moment is when we compute the zero to high precision. Note that in the process of separating the m zeros, we may have to increase the precision at which we are computing, because the zeros can be very near each other. Frequently we end up computing to a higher precision than the one asked for. This explains why, to compute a zero, the program is faster if we use a higher precision from the start. I do not recall why I preferred the Illinois method to Anderson; theoretically both are good. Certainly I used Anderson in many trials.
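The final high-precision step described here can be reproduced standalone with mpmath's public functions (a sketch; the real zetazero code chooses the interval far more carefully): once an interval (a, b) with Z(a)·Z(b) < 0 containing a single zero is known, the refinement is an ordinary bracketed root-find on Z.

```python
from mpmath import mp, siegelz, findroot, zetazero

mp.dps = 25
# (13, 15) brackets the first zeta zero, t ~ 14.1347, and Z changes
# sign across it; refining is then just a bracketed root-find on Z.
a, b = 13, 15
assert siegelz(a) * siegelz(b) < 0
t = findroot(siegelz, (a, b), solver="anderson")
assert abs(t - zetazero(1).imag) < mp.mpf("1e-15")
```

The precision remark above is visible here too: if the separation phase had to tighten (a, b) at elevated precision, the final findroot inherits that precision, which is why starting at higher precision can be faster overall.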
So, correctness shouldn't be an issue, right? I tried some of your latest "uneasy cases":
Benchmark hidden because not significant (1): zetazero(1048449115) @113

So, we have some speedup here as well.
I understand that. Though, I see no performance regressions so far. @d0sboots, could you please test the other posted cases?

```python
# bench.py
import pyperf
import mpmath
from mpmath import zetazero

runner = pyperf.Runner()
#for prec in [53, 100, 200, 400]:
#    with mpmath.mp.workprec(prec):
#        for x in [9, 99, 999, 9999]:
#            runner.bench_func(f"zetazero({x}) @{prec}", zetazero, x)
prec = 113
with mpmath.mp.workprec(prec):
    for x in [1048449114, 1048449115, 3570918901,
              3570918902,
              # 8637740722917, 8637740722918,
              ]:
        runner.bench_func(f"zetazero({x}) @{prec}", zetazero, x)
```
---
Here are all tests for your latest "uneasy cases". Original benchmark on my system:

Benchmark hidden because not significant (1): zetazero(99) @53

I think we can revert this small patch later, but for now I'll merge this. It definitely improves things for small indexes.

```python
from time import time
from mpmath import zetazero

for n in [1048449114,
          1048449115,
          3570918901,
          3570918902,
          8637740722917,
          8637740722918,
          #1000000000000000
          ]:
    start = time()
    r = zetazero(n)
    end = time()
    print(r, " in ", end - start)
```
This closes #958. I tried a variety of optimizations; some improved only specific areas of the parameter space, others were small wins in some cases but might be considered risky. This was unambiguously better without any risk of altering the existing behavior.