Skip to content

[DataFrame] Sample implement#1954

Merged
devin-petersohn merged 7 commits intoray-project:masterfrom
osalpekar:sample_implement
Apr 30, 2018
Merged

[DataFrame] Sample implement#1954
devin-petersohn merged 7 commits intoray-project:masterfrom
osalpekar:sample_implement

Conversation

@osalpekar
Copy link
Copy Markdown
Contributor

Implements sample() function for ray dataframes

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5076/
Test FAILed.

@devin-petersohn
Copy link
Copy Markdown
Member

Jenkins, retest this please.

Copy link
Copy Markdown
Member

@devin-petersohn devin-petersohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing I would add is some extra comments when the logic gets a bit hard to follow. Looks great!

for sample in samples]

if axis == 1:
new_cols = [_deploy_func.remote(lambda df: df.iloc[:, [tup[1]]],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here about the logic. What is tup[0] and tup[1], etc.

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5109/
Test PASSed.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5110/
Test FAILed.

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5111/
Test PASSed.

Copy link
Copy Markdown
Member

@devin-petersohn devin-petersohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't put it everywhere, but in a few of the comments, there is only a description of what is happening, but that should be evident from the code itself. I would prefer the comments to say why something is the way that it is.


if weights is not None:

# If a series, align with frame
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer for the comments to say why something was done, rather than what. I think that if you say why it's important to align with frame, that would be more helpful in reading the code..

if isinstance(weights, pd.Series):
weights = weights.reindex(self.axes[axis])

# Strings acceptable if a dataframe and axis = 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

raise ValueError("weight vector many not include negative "
"values")

# If has nan, set to zero.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

# If has nan, set to zero.
weights = weights.fillna(0)

# Renormalize if don't sum to 1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

weights = weights.values

if n is None and frac is None:
# default to n = 1 if n and frac are None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

weights = weights.fillna(0)

# Renormalize if don't sum to 1
if weights.sum() != 1:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sum weights once and use the variable for these two if statements and rebalance

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5117/
Test PASSed.

weights_sum = weights.sum()
if weights_sum != 1:
if weights_sum != 0:
weights = weights / weights.sum()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weights.sum() -> weights_sum

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5128/
Test PASSed.

@devin-petersohn devin-petersohn merged commit 1231aa0 into ray-project:master Apr 30, 2018
royf added a commit to royf/ray that referenced this pull request Jun 22, 2018
* 'master' of https://github.com/ray-project/ray:
  [rllib] Fix broken link in docs (ray-project#1967)
  [DataFrame] Sample implement (ray-project#1954)
  [DataFrame] Implement Inter-DataFrame operations (ray-project#1937)
  remove UniqueIDHasher (ray-project#1957)
  [rllib] Add DDPG documentation, rename DDPG2 <=> DDPG (ray-project#1946)
  updates (ray-project#1958)
  Pin Cython in autoscaler development example. (ray-project#1951)
  Incorporate C++ Buffer management and Seal global threadpool fix from arrow (ray-project#1950)
  [XRay] Add consistency check for protocol between node_manager and local_scheduler_client (ray-project#1944)
  Remove smart_open install. (ray-project#1943)
  [DataFrame] Fully implement append, concat and join (ray-project#1932)
  [DataFrame] Fix for __getitem__ string indexing (ray-project#1939)
  [DataFrame] Implementing write methods (ray-project#1918)
  [rllib] arr[end] was excluded when end is not None (ray-project#1931)
  [DataFrame] Implementing API correct groupby with aggregation methods (ray-project#1914)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants