ARROW-14659: [R] Remove warning about factor conversion to string in if_else() #11794

stephhazlitt · 2021-11-29T19:49:44Z

This is my first PR contributing (or an attempt to contribute) to {arrow}.

This PR:

• removes warn_types warning that factors are converted to strings, which is no longer true https://github.com/apache/arrow/blob/master/r/R/dplyr-functions.R#L911-L920
• updates the test by removing the warning https://github.com/apache/arrow/blob/master/r/tests/testthat/test-dplyr-funcs-conditional.R#L130 ~~however it does not remove the mutate() in the test as suggested in the TODO, if removed the test fails?~~
• [UPDATE] the test now resets the levels of all factor columns so that it passes, since the Arrow if_else() kernel does not preserve unused factor levels (ARROW-14649)

github-actions · 2021-11-29T19:50:04Z

https://issues.apache.org/jira/browse/ARROW-14659

nealrichardson · 2021-11-29T19:53:54Z

> if removed the test fails?

Fails how?

stephhazlitt · 2021-11-29T19:59:43Z

With the mutate line removed (line 128)

e.g.

```
compare_dplyr_binding(
  .input %>%
    mutate(
      y = if_else(int > 5, fct, factor("a"))
    ) %>%
    collect(),
  tbl
)

── Failure (test-dplyr-funcs-conditional.R:119:3): if_else and ifelse ───────────────────────────
`object` (`actual`) not equal to `expected` (`expected`).

`levels(actual$y)[1:4]`:   "a"             "g" "h" "i"
`levels(expected$y)[1:7]`: "a" "b" "c" "d" "g" "h" "i"

  `actual$y[4:10]`: NA 1 NA 2 3 4 5
`expected$y[4:10]`: NA 1 NA 5 6 7 8
Backtrace:
 1. compare_dplyr_binding(...) test-dplyr-funcs-conditional.R:119:2
 2. expect_equal(via_table, expected, ...) helper-expectation.R:129:4
 3. testthat::expect_equal(...) helper-expectation.R:42:4
```
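To make the failure above easier to read, here is a minimal base R sketch of how two factors can carry the same labels yet differ in both their levels and their integer codes. The data is hypothetical, and `droplevels()` is only a stand-in for any operation that discards unused levels; it is not the Arrow kernel.

```r
# Hypothetical factor: levels "b", "c", "d" are declared but never used.
expected <- factor(
  c("a", "g", "h", "i"),
  levels = c("a", "b", "c", "d", "g", "h", "i")
)

# Stand-in for a result that keeps only the levels that actually occur.
actual <- droplevels(expected)

levels(expected)      # all seven declared levels
levels(actual)        # only the four levels present in the data
as.integer(expected)  # codes index into the seven-level set
as.integer(actual)    # codes index into the four-level set

# The labels agree, but the factors themselves are not identical.
identical(as.character(actual), as.character(expected))
identical(actual, expected)
```

This is why the comparison reports differences in both `levels()` and the underlying values: the codes are positions in the level set, so dropping levels renumbers them.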

ianmcook · 2021-11-29T20:14:20Z

There is a similar test here, in which I solved the same failure by using:

transmute(across(where(is.factor), ~ factor(.x, levels = c(...))))

https://github.com/apache/arrow/pull/11272/files?authenticity_token=86nJx4XEwecmmySQatP4CD6%2BYi62xBNQWT4vtOE0iIzJT6wH0ZJM3imNvGEBkfiPii6v0VXSUmag%2B5%2F8ycn0CQ%3D%3D&file-filters%5B%5D=.R#diff-c63a873ac0b560d2ca7229ac1df2628df8396b0cc51fc247b71a2499c492fe3aR317-R320

I think you can use that approach here too!
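As a rough illustration of the idea behind that approach, the sketch below rebuilds every factor column with one explicit level set before comparing. It is a base R analogue rather than the dplyr `across()` call itself; the helper name `reset_factor_levels`, the data, and the level set are all hypothetical.

```r
# Hypothetical helper: rebuild each factor column with an explicit level set,
# leaving non-factor columns untouched.
reset_factor_levels <- function(df, lvls) {
  is_fct <- vapply(df, is.factor, logical(1))
  df[is_fct] <- lapply(df[is_fct], function(x) factor(x, levels = lvls))
  df
}

# Assumed level set for the illustration.
lvls <- c("a", "g", "h", "i")

expected <- data.frame(
  int = 1:4,
  y = factor(
    c("a", "g", "h", "i"),
    levels = c("a", "b", "c", "d", "g", "h", "i")
  )
)

# Simulate a result whose unused levels were dropped.
actual <- expected
actual$y <- droplevels(actual$y)

# Before the reset the data frames differ; after it they match.
identical(actual, expected)
identical(
  reset_factor_levels(actual, lvls),
  reset_factor_levels(expected, lvls)
)
```

Note that any value not listed in `lvls` would become `NA`, so the level set has to cover every label that occurs in the data.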

stephhazlitt · 2021-11-29T20:44:01Z

Thanks @ianmcook, I replaced the as.character mutate with your approach above and added a comment to reference ARROW-14649. And apologies, I should have added a co-author to that commit.

ursabot · 2021-11-29T23:53:11Z

Benchmark runs are scheduled for baseline = 4913352 and contender = b83e6b0. b83e6b0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed] ursa-i9-9960x
[Finished ⬇️0.8% ⬆️0.13%] ursa-thinkcentre-m75q
Supported benchmarks:
ursa-i9-9960x: langs = Python, R, JavaScript
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

stephhazlitt added 2 commits November 29, 2021 10:05

rm if_else type warning

32482a4

r warning from if_else test

f91d907

github-actions bot added the Component: R label Nov 29, 2021

reset factor levels in test

a13b4a5

nealrichardson reviewed Nov 29, 2021

r/tests/testthat/test-dplyr-funcs-conditional.R (outdated, resolved)

Update r/tests/testthat/test-dplyr-funcs-conditional.R

a4d9378

Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com>

ianmcook reviewed Nov 29, 2021

r/tests/testthat/test-dplyr-funcs-conditional.R (outdated, resolved)

Update r/tests/testthat/test-dplyr-funcs-conditional.R

ae404c9

Co-authored-by: Ian Cook <ianmcook@gmail.com>

ianmcook approved these changes Nov 29, 2021

ianmcook closed this in b83e6b0 Nov 29, 2021

asfimport mentioned this pull request Nov 30, 2021

[R] Remove warning about factor conversion to string in if_else() #30201

Closed
