The modulo operator (%%) is an arithmetic operator in R that returns the remainder after division. It is widely used for checking divisibility, wrapping values into a range, generating repeating sequences, and more. In this guide, we will explore the modulo operator in depth and see how it can be used effectively in data analysis and programming.
What is the Modulo Operator?
The modulo operator, written %% in R, gives the remainder left over after one number is divided by another. It takes two numeric operands: the number being divided (the dividend) and the number dividing it (the divisor).
For example:
7 %% 3
#> [1] 1
Here, when 7 is divided by 3, the quotient is 2 with a remainder of 1. So 7 %% 3 equals 1.
The formal mathematical definition is:
For two numbers a and b with b ≠ 0, a %% b = r, where q = floor(a / b) is the integer quotient and:
a = b * q + r, where 0 ≤ r < b if b > 0, and b < r ≤ 0 if b < 0
Here r is the remainder left over after dividing a by b.
In R, a nonzero modulo result always has the sign of the divisor, not the dividend. This differs from the % operator in languages such as C and Java, and it matters when dealing with negative numbers, as we'll see later.
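The identity above can be checked directly in R. The following is a minimal sketch that assumes R's integer-division operator %/% as the quotient q; the helper name check_identity is made up for illustration.

```r
# Hypothetical helper: verify a == b * q + r with q = a %/% b and r = a %% b
check_identity <- function(a, b) {
  q <- a %/% b   # floored quotient
  r <- a %% b    # remainder
  data.frame(a = a, b = b, q = q, r = r, holds = (b * q + r == a))
}

# All four sign combinations of dividend and divisor
res <- check_identity(c(7, -7, 7, -7), c(3, 3, -3, -3))
print(res)
```

In every row the remainder r is either 0 or has the same sign as the divisor b.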
Why Use the Modulo Operator?
There are several common use cases for the modulo operator in R:
1. Generate Repeating Sequences
Since the modulo result cycles through the values 0 to divisor - 1, we can use it to generate repeating sequences of numbers.
For example, to print the numbers from 1 to 10 repeatedly:
nums <- 1:20
(nums - 1) %% 10 + 1
#> [1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10
Subtracting 1 before taking the modulo and adding 1 afterwards shifts the cycle from 0 to 9 up to 1 to 10, so the sequence starts over after 10.
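A repeating sequence like this is handy for round-robin assignment. Here is a small sketch; the task and worker names are invented for the example.

```r
# Hypothetical data: seven tasks shared among three workers in turn
tasks   <- paste0("task", 1:7)
workers <- c("A", "B", "C")

# Shift to 0-based, take the modulo, shift back to a 1-based index
idx <- (seq_along(tasks) - 1) %% length(workers) + 1
assignment <- workers[idx]

print(data.frame(task = tasks, worker = assignment))
```

The index cycles 1, 2, 3, 1, 2, 3, 1, so the workers are assigned in turn.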
2. Wrap-Around Values
Similar to repeating sequences, the modulo operator can wrap values within a fixed range.
For example:
wraps <- c(-1, 0, 1, 11, 12, 13) %% 10
wraps
#> [1] 9 0 1 1 2 3
This wraps every value into the range 0 to 9. Note that -1 wraps around to 9 rather than staying negative.
Game developers use this technique to wrap character positions within boundaries.
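As a rough illustration of wrapping, the sketch below moves a position around a ring of cells and looks up the next item in a cyclic list. The function names wrap_position and next_day are hypothetical.

```r
# Hypothetical helper: move a 0-based position around a ring of `size` cells
wrap_position <- function(pos, step, size) {
  (pos + step) %% size
}

right_edge <- wrap_position(9, 1, 10)    # stepping right from the last cell
left_edge  <- wrap_position(0, -1, 10)   # stepping left from the first cell

# With R's 1-based indexing, the index after i is ((i - 1) + 1) %% n + 1,
# which simplifies to i %% n + 1
day_names <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
next_day <- function(i) day_names[i %% length(day_names) + 1]

print(c(right_edge, left_edge))
print(next_day(7))
```

Stepping off either edge lands on the opposite side, and the day after Sunday comes out as Monday.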
3. Check Divisibility
Since the modulo of a number by its factor is 0, we can use modulo to check whether a number is divisible by another.
For example to test for even/odd numbers:
nums <- 1:10
nums %% 2 == 0 #even
#> [1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
nums %% 2 != 0 #odd
#> [1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE
Here numbers divisible by 2 have a remainder 0 when divided by 2.
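Divisibility checks combine naturally with logical operators. As one illustration, the Gregorian leap-year rule can be sketched as below; the function name is_leap_year is chosen for this example.

```r
# A year is a leap year if it is divisible by 4 but not by 100,
# or if it is divisible by 400
is_leap_year <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

years <- c(1900, 2000, 2023, 2024)
leap <- is_leap_year(years)
print(setNames(leap, years))
```

Because %% is vectorized, the same function works on a single year or on a whole column of years.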
4. Random Sampling
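Taking random integers modulo a limit gives values between 0 and limit - 1. This is useful for simulations and games.
For example:
sample(100, 6) %% 6
#> [1] 2 4 3 1 5 3
This gives 6 random numbers from 0 to 5. The exact values change from run to run unless a seed is set with set.seed(). In practice, sample(0:5, 6, replace = TRUE) draws from the same range directly.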
There are many other applications like hash functions, image processing, and statistics which rely on the properties of the modulo operator.
Modulo Operation in R
The modulo operator in R works much like the remainder operator in other programming languages, with a few differences.
Here are the key properties and quirks to note when using it:
- The %% operator is used for modulo in R. A single % is not a valid operator.
- It works with integer and floating point numbers.
- A nonzero result has the same sign as the divisor.
- It does not raise an error for a divisor of 0 or for NaN operands: the result is NaN for doubles and NA for integers.
- %% has higher precedence than *, /, + and -, so it is evaluated before them unless parentheses are used.
Now let's see some examples of using modulo in R.
Example 1: Modulo Operation on Scalars
The most basic usage is to find the remainder of two integer scalars.
10 %% 4 # 10 / 4 = 2 rem 2
#> [1] 2
If the division results in no remainder, modulo returns 0.
16 %% 4 # 16 / 4 = 4 rem 0
#> [1] 0
And for floating point numbers:
10.5 %% 3.2 # 10.5 = 3 * 3.2 + 0.9
# The quotient is rounded down to 3; the remainder itself is not rounded
#> [1] 0.9
Note that the remainder takes on the sign of the divisor:
-10 %% 7
#> [1] 4
10 %% -7
#> [1] -4
Taking the modulo by 0 does not give an error. It returns NaN (or NA when both operands are integers):
10 %% 0
#> [1] NaN
Example 2: Vectorized Modulo
One of the biggest advantages of R is its vectorization. Modulo works element-wise on vectors and matrices.
c(10, 11, 20) %% c(2, 3, 5)
#> [1] 0 2 0
When one operand is shorter than the other, it gets recycled:
c(10, 11, 20, 25) %% c(2, 7)
#> [1] 0 4 0 4
This applies the shorter divisor vector repeatedly on the longer dividend.
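To make the recycling explicit, the short divisor can be expanded by hand with rep_len() and compared against the automatic behaviour. This is only a sketch of what R does internally; the variable names are arbitrary.

```r
x <- c(10, 11, 20, 25)
d <- c(2, 7)

# Expand the divisor to the length of x: 2 7 2 7
recycled <- rep_len(d, length(x))

auto   <- x %% d          # R recycles d automatically
manual <- x %% recycled   # same computation with the recycling spelled out

print(identical(auto, manual))
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning, which is usually a sign of a mistake in the inputs.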
Matrices work similarly:
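matrix1 <- matrix(1:9, ncol = 3)
matrix2 <- matrix(rep(c(2, 3, 4), each = 3), ncol = 3) # columns of 2s, 3s and 4s
matrix1 %% matrix2
# [,1] [,2] [,3]
# [1,] 1 1 3
# [2,] 0 2 0
# [3,] 1 0 1
Both matrices must have the same dimensions, and the modulo is applied element by element. Here each column of matrix1 is reduced by a different divisor.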
Example 3: Modulo in Random Sampling
Here is an example of using modulo for random number generation between 0 and 5:
set.seed(10)
rand_nums <- sample(50000, 50)
sample_small <- rand_nums %% 6
table(sample_small)
#> sample_small
#> 0 1 2 3 4 5
#> 9 7 8 8 9 9
We drew a large random sample, took it modulo 6 to map the values into the range 0 to 5, and tabulated the counts to check that the six values occur about equally often.
This technique is useful for simulations, games, and applications where controlled sampling is required.
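One caveat is that the modulo approach is only exactly uniform when the size of the source range is a multiple of the divisor. The sketch below counts how many of the integers 1 to 100 map to each remainder modulo 6, and then draws from the target range directly with sample(); the seed value is arbitrary.

```r
set.seed(42)  # arbitrary seed, used only to make the draw repeatable

# 100 is not a multiple of 6, so the remainders are not equally represented
counts <- table((1:100) %% 6)
print(counts)

# Drawing straight from the target range avoids the imbalance
direct <- sample(0:5, size = 50, replace = TRUE)
print(table(direct))
```

The first table shows 16 source values for remainders 0 and 5 and 17 for the others, a small bias that shrinks as the source range grows relative to the divisor.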
Special Cases and Errors
There are some special cases and pitfalls to be aware of when working with modulo in R:
Division by 0
Taking the modulo by 0 does not throw an error in R. For doubles the result is NaN, and for integers it is NA:
10 %% 0
#> [1] NaN
So check for 0 divisors before using modulo, or handle the NaN results afterwards.
NaN values
Modulo with NaN does not give an error either. The NaN simply propagates to the result:
10 %% NaN
#> [1] NaN
If there is any chance your data contains NaN, filter those values out before taking the modulo or check the result with is.nan().
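In R, a zero divisor or a NaN operand yields NaN for doubles (NA for integers) rather than stopping the script, so the problem cases can be detected after the fact. A minimal sketch with made-up input vectors:

```r
x <- c(10, 7, 5, NaN)
d <- c(3, 0, NaN, 2)

r <- x %% d
print(r)                 # NaN wherever the divisor is 0 or an operand is NaN

ok <- !is.nan(r)         # positions with a defined remainder
clean <- x[ok] %% d[ok]
print(clean)

int_case <- 7L %% 0L     # with integer operands the result is NA instead
print(int_case)
```

Filtering on is.nan() (or is.na(), which also catches the integer case) keeps undefined remainders out of later calculations.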
Rounding with Decimals
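R allows modulo on floating point numbers and does not round the remainder, but binary representation error can cause unexpected results:
10.5 %% 3 # 10.5 = 3 * 3 + 1.5; nothing is rounded
#> [1] 1.5
(0.1 + 0.2) %% 0.3 # mathematically 0, but 0.1 + 0.2 is not exactly 0.3
#> [1] 5.551115e-17
If exact results are important, scale the values to whole numbers before using modulo, or compare remainders against a small tolerance instead of 0.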
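One way to guard against floating point noise is to test the remainder against a small tolerance rather than against exactly 0. The helper below is a hypothetical sketch; the name is_multiple and the default tolerance of 1e-9 are assumptions, and it expects a nonzero y.

```r
# Hypothetical helper: is x a multiple of y, up to a tolerance?
is_multiple <- function(x, y, tol = 1e-9) {
  r <- abs(x %% y)
  # Noise can leave the remainder just above 0 or just below |y|
  pmin(r, abs(y) - r) < tol
}

print(is_multiple(0.1 + 0.2, 0.3))   # tiny remainder is treated as zero
print(is_multiple(10.5, 3))          # genuine remainder of 1.5
```

The tolerance should be chosen relative to the scale of the data; 1e-9 is only a placeholder.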
Order of Operations
Modulo binds more tightly than the other binary arithmetic operators, so be careful:
10 + 1 %% 3 # Gets evaluated as 10 + (1 %% 3)
#> [1] 11
In R, %% has higher precedence than +, - and even * and /, which differs from languages where % shares the precedence of multiplication and division. Use parentheses whenever unsure.
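R's documented operator precedence places unary minus and the : operator above %%, and places %% above *, /, + and -. A few one-liners illustrate how expressions are grouped; the variable names are arbitrary.

```r
p1 <- 10 + 1 %% 3     # 10 + (1 %% 3): %% binds tighter than +
p2 <- 2 * 5 %% 3      # 2 * (5 %% 3): %% binds tighter than *
p3 <- -7 %% 3         # (-7) %% 3: unary minus binds tighter than %%
p4 <- 1:6 %% 2        # (1:6) %% 2: the colon operator binds tighter than %%
p5 <- (10 + 1) %% 3   # parentheses force the addition first

print(list(p1, p2, p3, p4, p5))
```

Writing the parentheses explicitly costs nothing and makes the intent clear to readers coming from other languages.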
Performance with Big Data
The modulo operator works element-wise on vectors and matrices in base R, so no explicit loop is needed even for large numeric data (1e6+ values). The cost still grows with the number of elements.
On big tables, packages like data.table and dplyr make it convenient to add a modulo column efficiently:
library(data.table)
big_data <- data.table(values = sample(1e7))
system.time(big_data[, value_mod_3 := values %% 3])
#> user system elapsed
#> 0.088 0.004 0.093
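Here we add a new mod 3 column without any for-loop. The %% call itself is the same vectorized base R operation; data.table adds the column by reference, which avoids copying the table. Timings will vary by machine.
For very large workloads, one can also look at parallelization with packages like foreach, future, and R's built-in parallel package, although for simple element-wise arithmetic the overhead of splitting the work may outweigh the gain. The principles remain the same.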
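To get a feel for what vectorization buys, the sketch below compares the vectorized operator with an element-by-element vapply() call in base R. The vector length of 1e5 is an arbitrary choice to keep the slow version quick, and no timings are shown because they depend on the machine.

```r
x <- sample(1e5)   # smaller than the 1e7 above so the looped version stays fast

# Vectorized: a single call over the whole vector
t_vec <- system.time(vec <- x %% 3)

# Element by element: one function call per value
t_loop <- system.time(loop <- vapply(x, function(v) v %% 3, numeric(1)))

print(identical(vec, loop))   # both approaches give the same remainders
print(t_vec["elapsed"])
print(t_loop["elapsed"])
```

On most machines the vectorized call should be noticeably faster, which is why loops over individual elements are best avoided for this kind of arithmetic.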
Conclusion
The modulo operator is a simple but effective tool that every R programmer should have in their toolbelt. It is versatile across random sampling, divisibility checks, sequence generation, and more, making it valuable for statistics, machine learning, and general programming.
I hope this guide gives you a comprehensive overview of how modulo works in R and how you can apply it to your own data tasks. Modulo might seem like a small math operator, but it can provide big value if leveraged properly in code.
Let me know in the comments if you have any other interesting use cases of the modulo operator!


