@thouis (Contributor) commented May 21, 2012

Add numpy.core.multiarray.trace_data_allocations() function for
controlling tracing malloc/free/realloc for numpy data. This
functionality is exposed by #defining NPY_TRACE_DATA_MALLOC to 1 in
ndarraytypes.h (the default with this commit). When enabled,
PyDataMem_NEW/FREE/RENEW become actual functions, defined in
multiarraymodule.c, which call back into Python if callbacks are
registered with trace_data_allocations().

Also added tests, including handling Python errors during a
callback, which I think should probably log errors but not actually
raise an exception.

@thouis (Contributor Author) commented May 21, 2012

Open question:

  • Which namespace should hold the trace_data_allocations() function?

The callback mechanism is pretty fast: running numpy.test() with the tracing callbacks set to None is not noticeably slower than before this functionality. This should be quantified/verified, though.

@teoliphant (Member)

This looks like a pretty useful facility. I don't remember seeing it advertised on the numpy-discussion list, though (but I might have missed it).

The trace_data_allocations function should probably live either in numpy.core or in a separate numpy.tools namespace. But this is another good question for the mailing list.

@thouis (Contributor Author) commented May 21, 2012

The brief discussion is here: http://mail.scipy.org/pipermail/numpy-discussion/2012-May/062373.html

Sorry if it was a bit rushed, I thought having something that worked in place to see what it would involve would be helpful.

Member

Probably slightly better to use PyErr_WriteUnraisable here?

Contributor Author

I'm not sure about that, as I think the information that should be printed is exactly what PyErr_PrintEx(0) produces. Is there some other reason to prefer PyErr_WriteUnraisable?

Member

Just that PyErr_WriteUnraisable is defined by the Python API as the official thing to do in exactly this situation :-). I guess it might have trivial improvements too, like printing to sys.stderr even if that's reassigned.

Contributor Author

It looks like one major difference is that PyErr_PrintEx calls into sys.excepthook. Perhaps that's what's wanted, to allow, e.g., pdb to be started on an exception in the tracing callbacks.

Member

But if that's a good idea, then it's also a good idea for sys.excepthook to be called when there's an exception in a ctypes callback, a cython callback, a __del__ method, a weakrefs callback, etc. And who knows, maybe it is. But this just doesn't seem like a good place to be trying to second-guess and "fix" the Python interpreter's standard behavior.
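The distinction being debated can be seen from pure Python: an exception raised in a finalizer such as __del__ cannot propagate, so CPython routes it through the "unraisable" machinery (exposed as sys.unraisablehook since Python 3.8) rather than sys.excepthook. A small demonstration, using a custom hook to observe the swallowed exception:

```python
import sys

captured = []

def hook(unraisable):
    # unraisable carries exc_type/exc_value/object for the failed callback
    captured.append(unraisable.exc_type.__name__)

sys.unraisablehook = hook

class Noisy:
    def __del__(self):
        raise ValueError("error during finalization")

obj = Noisy()
del obj  # __del__ raises; the exception is swallowed, the hook sees it
```

The caller of `del` never sees the ValueError, which is exactly the behavior PyErr_WriteUnraisable gives C code in the same situation.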

Contributor Author

Good point. In that case, I guess the callback should either generate an exception (and the malloc/free/realloc happens but doesn't return to the point above), or do as you say. I don't know which is better, so for now will go with the latter.

What should happen seems like another question for the mailing list, so I will check there, as well.

Contributor Author

Actually, since most code just treats these functions as macro wrappers for malloc(), callers don't generally check PyErr_Occurred after calling them, so it's almost surely best to just log and continue, to limit the amount of code changed and avoid problems with third-party libraries.

Member

Right, it's simply not possible to throw an exception out of functions like these. That's why the Python interpreter has a function for dealing with exceptions that are "Unraisable" :-).

thouis added 2 commits May 23, 2012 22:25
Making these functions part of the API converts them to static, requiring the following
changes:
- npysort/sort.c.src: #define PyDataMem_NEW/FREE to malloc/free.
- src/_compiled_base.c: changed one pair of calls to use PyArray_malloc/free.

Also moved the trace_data_allocations() python function to numpy.core and
modified it to return the previous callbacks.
@thouis (Contributor Author) commented May 23, 2012

This is still a WIP, especially since some tests fail when I run np.test() under tracing that don't fail when run without tracing:

import numpy as np

total = 0
szs = {}

def inc(ptr, size):
    global total
    szs[ptr] = size
    total += size
    if size > 100:
        print total

def sub(ptr):
    global total
    total -= szs[ptr]

def res(newptr, oldptr, newsize):
    global total
    total -= szs[oldptr]
    szs[newptr] = newsize
    total += newsize

np.core.trace_data_allocations(inc, sub, res)
np.test()
print "END", total

@thouis (Contributor Author) commented May 23, 2012

My theory is that some tests are failing due to refcounts being invalid when the new functions are called, and the gc being triggered or objects otherwise being freed.

One test that regularly fails is test_regression.py:test_refcount_vectorize().

@thouis (Contributor Author) commented May 24, 2012

Some tests are still failing when run under tracing, though not consistently. Valgrind isn't showing me anything in particular within them, but disabling gc completely seems to prevent them from failing.

@thouis (Contributor Author) commented May 24, 2012

scipy.test() also has some errors that only occur while tracing.

I'm going to set this aside for a few days, and perhaps write a simpler version that doesn't call back into Python. Instead, it can expose the malloc/free counts/bytes via a function, and track them in pure C code. Then the same approach as https://github.com/fabianp/memory_profiler can be used to track allocation line-by-line and most of the complexity goes away.

@thouis (Contributor Author) commented Jun 5, 2012

All of the failing tests seem to have been caused by the buffer copy bug, fixed in https://github.com/mwiebe/numpy/tree/nditer_buffer_flag (but not yet pulled into numpy).

I also have a version that implements tracing, with pure C in the allocation functions writing to a dynamically allocated buffer, which must then be fetched proactively by Python. However, I think this version is a little nicer to use from the Python perspective.

@charris (Member) commented Jun 5, 2012

I'd think the PyDataMem_* memory allocations/deallocations should remain completely independent of Python; I suspect that is why they are there.

@thouis (Contributor Author) commented Jun 6, 2012

Thinking more about this, the lower-level version is probably just as easy to use, and avoids the potential issues with GC and python allocation, so I'll close this PR and open a new one when it's ready.

@thouis closed this Jun 6, 2012