Feature: add function for anonymizing data #162

Merged: moralec merged 4 commits into main from feature/anonymization on Jul 2, 2021
@martinctc (Member) commented Jun 30, 2021
Summary

This branch introduces functions for anonymizing data, as per #156. The use case is to make proof-of-concept (POC) artifacts shareable: they are often created from real Workplace Analytics data, which is typically highly confidential.

Changes

The changes made in this PR are:

  1. Added anonymize() / anonymise().
  2. Added jitter_metrics().
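
To illustrate the idea behind item 1, here is a minimal base-R sketch of attribute anonymization. `anonymise_sketch()` and its `prefix` argument are hypothetical names used only for illustration; this is not the implementation added in this PR, which may shuffle and label values differently.

```r
# Hypothetical helper, for illustration only: replace each distinct value
# with a generic label such as "Group_1", "Group_2", ...
anonymise_sketch <- function(x, prefix = "Group_") {
  x <- as.character(x)
  # Shuffle the distinct values so labels do not follow the original order
  lv <- unique(x[!is.na(x)])
  lv <- lv[sample.int(length(lv))]
  out <- paste0(prefix, match(x, lv))
  out[is.na(x)] <- NA_character_  # keep missing values missing
  out
}

set.seed(1)
org <- c("Finance", "HR", "Finance", "IT", NA)
anon <- anonymise_sketch(org)
```

In this sketch, rows that shared a value still share a label, so group-level summaries keep their shape while the real names are hidden.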

Examples

Anonymize the Organization attribute:

library(wpa)   # sq_data, anonymise(), email_sum()
library(dplyr) # mutate(), %>%
sq_data %>%
  mutate(Organization = anonymise(Organization)) %>%
  email_sum(hrvar = "Organization")

Add jitter to a metric:

jittered <- jitter_metrics(sq_data, cols = "Collaboration_hours")
head(
  data.frame(
    original = sq_data$Collaboration_hours,
    jittered = jittered$Collaboration_hours
  )
)

  original jittered
1 18.74210 18.73427
2 15.02403 15.00658
3 14.27897 14.29141
4 12.69034 12.68973
5 10.99079 10.97063
6 18.25287 18.22849
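
The small differences between the two columns above are consistent with adding a little random noise to each value. As a rough sketch of that idea (not the `jitter_metrics()` implementation from this PR), base R's `jitter()` can be applied column by column. `jitter_sketch()` and the toy data frame below are hypothetical.

```r
# Hypothetical helper, for illustration only: add a small amount of random
# noise to the selected numeric columns using base R's jitter().
jitter_sketch <- function(data, cols, ...) {
  stopifnot(all(cols %in% names(data)))
  data[cols] <- lapply(data[cols], jitter, ...)
  data
}

set.seed(1)
toy <- data.frame(
  PersonId = c("A", "B", "C"),
  Collaboration_hours = c(18.7, 15.0, 14.3)
)
toy_jittered <- jitter_sketch(toy, cols = "Collaboration_hours")
```

Only the columns named in `cols` are changed; identifiers and other attributes pass through untouched. The noise level can be tuned through the `factor` or `amount` arguments of `jitter()`.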

Checks

  • All R CMD checks pass.
  • roxygen2::roxygenise() has been run prior to merging to ensure that .Rd and NAMESPACE files are up to date.
  • NEWS.md has been updated.

Notes

This fixes #156.

@martinctc self-assigned this Jun 30, 2021
@martinctc added the enhancement label (New feature or request) Jun 30, 2021
@martinctc marked this pull request as ready for review June 30, 2021 10:50
@moralec merged commit ce4db4c into main Jul 2, 2021
@moralec deleted the feature/anonymization branch July 2, 2021 13:00
Linked issue

Feature request: a function for anonymizing HR attributes