Multiple time measurements #1

@mrdwab

It would be nice for those of us who are lazy to have convenient names for repeated measures in a wide format.

Consider:

r_data_frame(
  n = 5,
  id, 
  race, race, race, 
  age, age, age
)
# Source: local data frame [5 x 7]
# 
#   ID     Race   Race.1   Race.2 Age Age.1 Age.2
#1  1 Hispanic    White Hispanic  30    30    32
#2  2    White Hispanic    White  31    20    30
#3  3    White    White    White  26    23    25
#4  4    White    Black    White  20    30    31
#5  5    Asian    White    White  20    28    24

Generally, the preferred form is for every "time" to be identified explicitly. At the very minimum, Race should become Race.0 so that the naming scheme is balanced with Race.1 and Race.2.

I know I can just do:

r_data_frame(
  n = 5,
  id, 
  Race_1 = race, Race_2 = race, Race_3 = race, 
  Age_1 = age, Age_2 = age, Age_3 = age
)
# Source: local data frame [5 x 7]
# 
#   ID   Race_1   Race_2 Race_3 Age_1 Age_2 Age_3
#1  1    White Hispanic  White    24    30    23
#2  2 Hispanic    White  Black    32    35    32
#3  3    White    White  White    28    21    25
#4  4    White    White  White    33    22    24
#5  5    White    Black  White    31    30    21

But that's a lot of extra typing :-(


I haven't dug into your code (hence raising an issue and not a pull request), but it's possible that the fix might be something as easy as:

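r_data_frame <- function (n, ...) 
{
  out <- r_list(n = n, ...)
  # Number repeated column names (Race -> Race_1, Race_2, ...); unique names are left alone
  temp <- names(out)
  temp <- ave(temp, temp, FUN = function(x) 
    if (length(x) == 1) x else paste(x, seq_along(x), sep = "_"))
  # check.names = FALSE stops data.frame() from applying its own de-duplication first
  out <- setNames(data.frame(out, stringsAsFactors = FALSE, 
                             check.names = FALSE), temp)
  dplyr::tbl_df(out)
}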
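
The renaming step can be tried on its own, without touching the package. The following is only a sketch: `number_repeats` and its `sep` and `start` arguments are hypothetical names made up for illustration, not anything the package provides. It shows both the `_1`, `_2`, `_3` style and the balanced `.0`, `.1`, `.2` style mentioned earlier.

```r
# Hypothetical helper (an assumption for illustration, not part of the package):
# append a running index to every name that occurs more than once.
number_repeats <- function(nms, sep = "_", start = 1) {
  # ave() applies FUN within each group of identical names and
  # returns the results in the original order.
  ave(nms, nms, FUN = function(x) {
    if (length(x) == 1) x else paste(x, seq_along(x) + start - 1, sep = sep)
  })
}

nms <- c("ID", "Race", "Race", "Race", "Age", "Age", "Age")

number_repeats(nms)
# "ID" "Race_1" "Race_2" "Race_3" "Age_1" "Age_2" "Age_3"

number_repeats(nms, sep = ".", start = 0)
# "ID" "Race.0" "Race.1" "Race.2" "Age.0" "Age.1" "Age.2"
```

Names that appear only once (here, ID) are left untouched, so single-measurement columns keep their plain names.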
