Context
workpatterns_classify_bw classifies individuals by their hourly collaboration habits. This method assigns each individual to one of 6 groups, depending on their number of active hours and the flexibility of their work.
The groups that result from the code are:
- 1 Standard with breaks workday: number of active hours is less than expected hours, with no activity outside working hours
- 2 Standard continuous workday: number of active hours equals expected hours, with no activity outside working hours
- 3 Standard flexible workday: number of active hours is less than or equal to expected hours, with some activity outside working hours
- 4 Long flexible workday: number of active hours exceeds expected hours, with breaks occurring throughout
- 5 Long continuous workday: number of active hours exceeds expected hours, with activity happening in a continuous block (no breaks)
- 6 Always on (13h+): number of active hours greater than or equal to 13
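To make the rules above concrete, here is a minimal sketch of how one person-day could be mapped to a group. The helper classify_persona and its arguments active_outside and has_breaks are hypothetical names used for illustration only and are not part of the package. The order in which the rules are checked (Always on first) is also an assumption.

```r
# Hypothetical helper, not the package's implementation.
# active_hours:   number of hours with any activity
# exp_hours:      expected working hours (e.g. 8 for 0900-1700)
# active_outside: TRUE if there is activity outside working hours (assumed input)
# has_breaks:     TRUE if the active hours are not one continuous block (assumed input)
classify_persona <- function(active_hours, exp_hours, active_outside, has_breaks) {
  if (active_hours >= 13) {
    "6 Always on (13h+)"
  } else if (active_hours > exp_hours && !has_breaks) {
    "5 Long continuous workday"
  } else if (active_hours > exp_hours) {
    "4 Long flexible workday"
  } else if (active_outside) {
    "3 Standard flexible workday"
  } else if (active_hours == exp_hours) {
    "2 Standard continuous workday"
  } else {
    "1 Standard with breaks workday"
  }
}

# Someone working the expected 8 hours, partly outside working hours:
classify_persona(active_hours = 8, exp_hours = 8,
                 active_outside = TRUE, has_breaks = TRUE)
```

Under these assumptions, a person with exactly the expected number of active hours and some activity outside working hours falls into group 3, which is the behaviour the bug report below expects.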
Issues identified
There are three issues to address:
- [Improvement] Code is currently very difficult to understand. Variable names are not always meaningful, and there is not enough commentary on the code.
- [Improvement] A customer has highlighted that the classification algorithm should also run when the expected working hours are specified instead of a start and end time.
- [Bug] There is a misclassification problem: individuals who work the expected number of hours flexibly (e.g. 8 hours) are classified as 4 Long flexible workday instead of 3 Standard flexible workday.
Steps to reproduce this last issue:
em_data %>% workpatterns_classify(start_hour = "0900", end_hour = "1700", return="data") %>% filter(Personas == "4 Long flexible workday") %>% View()
Suggested Solution
I would suggest we make the following changes to this code:
- Rename variables to make the code easier to read: change D to exp_hours, and signals_total to active_hours.
- Make exp_hours a parameter that is by default calculated as end_hour - start_hour
- Remove the -1 for the calculation of expected hours and update the classification rules accordingly.
- Make sure return = "data" also returns start_hour, end_hour and exp_hours
- Remove unnecessary variables (start_hours_0)
- Add more commentary to the code.
- Update plot so that the limits of the buckets are clearer (e.g. Standard hours 3-8 hours, extended hours 8-13)
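As a rough sketch of the exp_hours suggestion, the default could be derived from start_hour and end_hour, with an explicit value taking precedence. The helper name calc_exp_hours is hypothetical, and the "old" calculation shown at the end assumes the current code computes end minus start minus 1, which this issue implies but does not show.

```r
# Hypothetical helper: expected hours from "HHMM" strings, unless given directly.
calc_exp_hours <- function(start_hour = "0900", end_hour = "1700", exp_hours = NULL) {
  # An explicitly supplied value wins over the start/end calculation
  if (!is.null(exp_hours)) {
    return(exp_hours)
  }
  # Convert "HHMM" to decimal hours
  to_hours <- function(x) {
    as.numeric(substr(x, 1, 2)) + as.numeric(substr(x, 3, 4)) / 60
  }
  to_hours(end_hour) - to_hours(start_hour)
}

calc_exp_hours("0900", "1700")     # 8
calc_exp_hours(exp_hours = 7.5)    # 7.5

# Why the -1 matters (assumed form of the current calculation):
old_exp_hours <- calc_exp_hours("0900", "1700") - 1   # 7
8 > old_exp_hours                  # TRUE: 8 active hours counted as "Long"
8 > calc_exp_hours("0900", "1700") # FALSE: 8 active hours counted as "Standard"
```

With the -1 removed, someone active for exactly 8 hours on a 0900 to 1700 schedule no longer exceeds the expected hours, so they would land in a Standard group rather than a Long one.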