
Slicing #11

Merged
mrocklin merged 24 commits into dask:master from mrocklin:slicing
Jan 28, 2015

@mrocklin
Member

Smoothed out version of #9

  1. Rebases out the synchronous commit (unrelated to this work)
  2. Merges in current master
  3. Fixes style issues in docstrings
  4. Fixes errant doctests

Most work done by @nevermindewe

Wesley Emeneker and others added 24 commits January 20, 2015 12:25
… and tests that verify the functionality.

Current state:
 - a reference implementation of 1D slicing exists
 - a helper function constructs a correct dask slice graph for n-dimensional slicing (so far without step, indexing without slicing, or ellipsis); it passes tests for 1D and 2D slicing

Remaining to be written:
 - a front-end function (or perhaps a few functions) that set up the data needed to call the helper _dask_slice() correctly

Remaining to be figured out:
 - how to deal with slice steps in a clear, concise, easy-to-understand way
 - how to deal with indexing and ellipsis (array[0,0:4,...]) or open colon (array[:,0:3])
This is based on @mrocklin's ideas and prototyping for how things should work.
The result is cleaner (so far).

This implementation passes basic tests but does not include any optimizations yet.
Adding comments and doctests revealed some bugs in the previous code that didn't properly handle steps with offsets.
The updated code fixes these errors, but I'm not happy with it.
I think that the code can be significantly simpler and clearer, but I don't understand it well enough yet.
As it stands, steps and starting offsets should be properly handled now.
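One way to picture steps with starting offsets is as a bookkeeping problem: a stepped slice over a 1-D blocked array must enter each block at the right local offset. The sketch below is only an illustration of that idea, not the `_slice_1d()` from this PR, whose signature and return format are not shown here. The name `slice_1d_sketch` and the `{block number: local slice}` result are assumptions, and only positive steps are covered.

```python
def slice_1d_sketch(block_lengths, index):
    """Map one slice over a 1-D blocked array to per-block slices.

    Hypothetical helper for illustration; handles positive steps only.
    """
    # slice.indices() resolves None and negative bounds against the length.
    start, stop, step = index.indices(sum(block_lengths))
    if step <= 0:
        raise NotImplementedError("sketch covers positive steps only")

    result = {}
    offset = 0        # global position of the current block's first element
    next_pos = start  # global position of the next element to select
    for i, length in enumerate(block_lengths):
        block_end = offset + length
        if offset <= next_pos < min(stop, block_end):
            local_start = next_pos - offset
            local_stop = min(stop, block_end) - offset
            result[i] = slice(local_start, local_stop, step)
            # Advance past every element taken from this block so the
            # next block starts at the correct offset.
            taken = (local_stop - local_start + step - 1) // step
            next_pos += taken * step
        offset = block_end
    return result


# Three blocks of five elements each, taking every second element from 3 to 11.
print(slice_1d_sketch([5, 5, 5], slice(3, 12, 2)))
# {0: slice(3, 5, 2), 1: slice(0, 5, 2), 2: slice(1, 2, 2)}
```

Note how the third block starts at local offset 1 rather than 0: the step carries over from the previous block, which is the kind of offset handling the commit message describes.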
These additions should better explain what the function does and why.
Added doctests to give examples of what should be returned and why.
 - non-slice index examples added to doctest
 - Reformatted and updated docstrings
 - tested against Python 3.4 and fixed integer vs. float division problems
 - reformatted to a maximum of 80 columns per line in code and 72 columns per line in comments
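For context on the division fix: in Python 3, `/` always returns a float, so index arithmetic that relied on Python 2's truncating `/` stops producing valid indices. A minimal sketch of the usual remedy, using a hypothetical helper `num_blocks` that is not taken from the PR:

```python
def num_blocks(length, blocksize):
    """Number of blocks needed to cover `length` elements.

    Hypothetical helper: uses // so the result is an int on both
    Python 2 and Python 3 and can be used directly as an index or count.
    """
    return (length + blocksize - 1) // blocksize


# 10 elements in blocks of 4 need three blocks (4 + 4 + 2).
print(num_blocks(10, 4))
```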
Added optimization where if all slices are slice(None,None,None), the original object is returned.
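The check behind this kind of shortcut can be sketched as below. The function name `is_trivial_index` is an assumption for illustration; the PR's actual code is not shown here.

```python
def is_trivial_index(index):
    """Return True if every entry of `index` is slice(None, None, None).

    Hypothetical helper: when this holds, slicing selects everything,
    so the caller can hand back the original object unchanged.
    """
    full = slice(None, None, None)
    return all(isinstance(i, slice) and i == full for i in index)


print(is_trivial_index((slice(None), slice(None, None, None))))  # True
print(is_trivial_index((slice(None), slice(0, 3))))              # False
```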
Conflicts:
	dask/array.py
This should complete dask block slicing.
Many test cases have been added.
Python 2 does not have accumulate() in itertools, but Python 3 does,
  so I added the reference implementation to compatibility.py
Added negative slicing capabilities to _slice_1d().
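The commit message does not say whether this means negative indices, negative steps, or both. For negative start and stop values, the built-in `slice.indices()` already resolves them against a known length, as this small sketch shows. The wrapper name `normalize_bounds` is hypothetical.

```python
def normalize_bounds(index, length):
    """Resolve None and negative bounds of `index` for an axis of `length`.

    Hypothetical wrapper around the built-in slice.indices(); returns a
    (start, stop, step) tuple of concrete integers.
    """
    return index.indices(length)


print(normalize_bounds(slice(-3, None), 10))   # (7, 10, 1)
print(normalize_bounds(slice(-8, -2, 2), 10))  # (2, 8, 2)
```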
mrocklin added a commit that referenced this pull request Jan 28, 2015
@mrocklin mrocklin merged commit 755f3a4 into dask:master Jan 28, 2015
@mrocklin mrocklin deleted the slicing branch January 28, 2015 21:04
