[DataFrame] Sample implement#1954
Conversation
a45fcf8 to
0519e3a
Compare
|
Test FAILed. |
|
Jenkins, retest this please. |
devin-petersohn
left a comment
There was a problem hiding this comment.
The only thing I would add is some extra comments when the logic gets a bit hard to follow. Looks great!
| for sample in samples] | ||
|
|
||
| if axis == 1: | ||
| new_cols = [_deploy_func.remote(lambda df: df.iloc[:, [tup[1]]], |
There was a problem hiding this comment.
Can you add a comment here about the logic. What is tup[0] and tup[1], etc.
|
Test PASSed. |
|
Test FAILed. |
|
Test PASSed. |
devin-petersohn
left a comment
There was a problem hiding this comment.
I didn't put it everywhere, but in a few of the comments, there is only a description of what is happening, but that should be evident from the code itself. I would prefer the comments to say why something is the way that it is.
python/ray/dataframe/dataframe.py
Outdated
|
|
||
| if weights is not None: | ||
|
|
||
| # If a series, align with frame |
There was a problem hiding this comment.
I would prefer for the comments to say why something was done, rather than what. I think that if you say why it's important to align with frame, that would be more helpful in reading the code..
python/ray/dataframe/dataframe.py
Outdated
| if isinstance(weights, pd.Series): | ||
| weights = weights.reindex(self.axes[axis]) | ||
|
|
||
| # Strings acceptable if a dataframe and axis = 0 |
python/ray/dataframe/dataframe.py
Outdated
| raise ValueError("weight vector many not include negative " | ||
| "values") | ||
|
|
||
| # If has nan, set to zero. |
python/ray/dataframe/dataframe.py
Outdated
| # If has nan, set to zero. | ||
| weights = weights.fillna(0) | ||
|
|
||
| # Renormalize if don't sum to 1 |
python/ray/dataframe/dataframe.py
Outdated
| weights = weights.values | ||
|
|
||
| if n is None and frac is None: | ||
| # default to n = 1 if n and frac are None |
python/ray/dataframe/dataframe.py
Outdated
| weights = weights.fillna(0) | ||
|
|
||
| # Renormalize if don't sum to 1 | ||
| if weights.sum() != 1: |
There was a problem hiding this comment.
Sum weights once and use the variable for these two if statements and rebalance
|
Test PASSed. |
python/ray/dataframe/dataframe.py
Outdated
| weights_sum = weights.sum() | ||
| if weights_sum != 1: | ||
| if weights_sum != 0: | ||
| weights = weights / weights.sum() |
There was a problem hiding this comment.
weights.sum() -> weights_sum
|
Test PASSed. |
* 'master' of https://github.com/ray-project/ray: [rllib] Fix broken link in docs (ray-project#1967) [DataFrame] Sample implement (ray-project#1954) [DataFrame] Implement Inter-DataFrame operations (ray-project#1937) remove UniqueIDHasher (ray-project#1957) [rllib] Add DDPG documentation, rename DDPG2 <=> DDPG (ray-project#1946) updates (ray-project#1958) Pin Cython in autoscaler development example. (ray-project#1951) Incorporate C++ Buffer management and Seal global threadpool fix from arrow (ray-project#1950) [XRay] Add consistency check for protocol between node_manager and local_scheduler_client (ray-project#1944) Remove smart_open install. (ray-project#1943) [DataFrame] Fully implement append, concat and join (ray-project#1932) [DataFrame] Fix for __getitem__ string indexing (ray-project#1939) [DataFrame] Implementing write methods (ray-project#1918) [rllib] arr[end] was excluded when end is not None (ray-project#1931) [DataFrame] Implementing API correct groupby with aggregation methods (ray-project#1914)
Implements
sample()function for ray dataframes