Several parts of the codebase currently use superseded dplyr syntax, such as mutate_at(), summarise_at(), and other scoped variants. Since dplyr 1.0.0, these functions have been superseded in favor of using across() within verbs like mutate() and summarise(), which is now the recommended approach for applying functions across multiple columns.
Additionally, there are instances where select() and other tidyselect functions are used directly with string variables (e.g., select(df, varname) where varname is a character variable). This is ambiguous: if df happens to contain a column named varname, that column is selected instead of the columns the variable names. The current best practice is to wrap such variables with all_of() or any_of(). all_of() is strict and raises an error if any named column is missing, while any_of() silently skips names that do not exist in the data frame.
Examples of superseded syntax:
mutate_at(vars(starts_with("x")), funs(mean))
summarise_at(vars(matches("score")), mean)
Recommended replacements:
mutate(across(starts_with("x"), mean))
summarise(across(matches("score"), mean))
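The replacements above can be sketched on a small example. This is a minimal illustration rather than code from the codebase: the data frame df and its column names are hypothetical, and the lambda and .names variants are included only to show how extra arguments and output naming are usually handled with across().

```r
library(dplyr)

# Hypothetical toy data; column names are illustrative only.
df <- tibble(
  id = 1:3,
  x1 = c(1, 2, 3),
  x2 = c(10, 20, 30),
  score_a = c(4, 5, 6)
)

# Superseded form: summarise_at(df, vars(matches("score")), mean)
score_means <- df %>%
  summarise(across(matches("score"), mean))

# Superseded form: mutate_at(df, vars(starts_with("x")), funs(mean))
# Each x column is overwritten with its own mean.
x_means <- df %>%
  mutate(across(starts_with("x"), mean))

# Extra arguments are passed through a lambda.
x_means_na <- df %>%
  mutate(across(starts_with("x"), ~ mean(.x, na.rm = TRUE)))

# .names keeps the original columns and adds new ones alongside them.
x_scaled <- df %>%
  mutate(across(starts_with("x"), ~ .x / max(.x), .names = "{.col}_scaled"))
```

Without .names, across() inside mutate() overwrites the selected columns, which matches the default behaviour of mutate_at() with a single function.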
For variable selection:
- Instead of select(df, varname), use select(df, all_of(varname)) when varname is a character vector of column names.
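The difference between the two helpers can be sketched as follows. The data frame and the vectors wanted and maybe are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical toy data; column names are illustrative only.
df <- tibble(id = 1:3, x1 = c(1, 2, 3), score_a = c(4, 5, 6))

# all_of(): every name must exist in the data frame.
wanted <- c("id", "score_a")
strict <- select(df, all_of(wanted))

# any_of(): names that do not exist are skipped without an error.
maybe <- c("id", "not_a_column")
lenient <- select(df, any_of(maybe))

# all_of() with a missing name raises an error; it is caught here
# only so the script keeps running.
strict_result <- tryCatch(
  select(df, all_of(maybe)),
  error = function(e) "error"
)
```

As a rough guide, all_of() suits cases where a missing column indicates a bug, and any_of() suits optional columns, for example when dropping columns that may already be absent.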
Benefits of updating:
- Ensures compatibility with the latest dplyr releases
- Improves code readability and maintainability
- Reduces risk of deprecation warnings or errors in future dplyr versions
Action items:
- Refactor all instances of *_at(), *_if(), and *_all() to use across() within the relevant verbs.
- Update any direct string-based column selection to use all_of() or any_of() as appropriate.
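For the *_if() and *_all() variants, the usual across() equivalents rely on where() and everything(). The sketch below assumes a hypothetical data frame and is meant as a starting point for the refactor, not as a definitive mapping for every call site.

```r
library(dplyr)

# Hypothetical toy data; column names are illustrative only.
df <- tibble(
  name = c("a", "b"),
  height = c(1.5, 2.5),
  weight = c(10, 20)
)

# Superseded form: mutate_if(df, is.numeric, ~ .x * 2)
doubled <- df %>%
  mutate(across(where(is.numeric), ~ .x * 2))

# Superseded form: summarise_all(df, n_distinct)
distinct_counts <- df %>%
  summarise(across(everything(), n_distinct))
```

One difference to check during the refactor: in grouped data frames, everything() inside across() does not include the grouping columns.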
These changes will help keep the codebase current with modern dplyr best practices.