R / R Blog / R News

Growing Objects and Loop Memory Pre-Allocation

by insightr · August 24, 2018

This article is originally published at https://insightr.wordpress.com

By Thiago Milagres

Preallocating Memory

This will be a short post about a simple, but very important concept that can drastically increase the speed of poorly written codes. It is very common to see R loops written as follows:

v = NULL
n = 1e5
for(i in 1:n) v = c(v, i)

This seems like a natural way to write such a task: at each iteration, we increase our vector v to add one more element to it.

However, we will see that if we simply initialize a vector (with NAs, zeros or any other value) with the total length and then run the loop, we can drastically increase the speed of our algorithm. To illustrate this point, we will use the microbenchmark package. First, let’s put both approaches in functions:

f_slow = function(n) {
  v = NULL
  for(i in 1:n) v = c(v, i)
  return(v)
}

f_fast = function(n) {
  v = rep(NA, n)
  for(i in 1:n) v[i] = i
  return(v)
}

Now to compare the speeds we can start with a small value of n, say 100.

library(microbenchmark)
library(ggplot2)
n = 100
mbm = microbenchmark("f_slow" = f_slow(n),
                      "f_fast" = f_fast(n), times = 50
                      )
print(mbm)

## Unit: microseconds
##    expr    min     lq   mean median     uq      max neval
##  f_slow 24.060 24.576 64.690 25.156 26.892 1958.947    50
##  f_fast  5.098  5.419 43.235  5.562  5.790 1882.557    50

We see that even for such a simple loop, preallocating the vector already makes a big difference in the runtime. For bigger values of n, such difference becomes much more relevant:

n = 1e5
mbm = microbenchmark("f_slow" = f_slow(n),
                      "f_fast" = f_fast(n), times = 50
                      )
print(mbm)

## Unit: milliseconds
##    expr      min       lq     mean   median       uq      max neval
##  f_slow 8615.797 8697.359 8736.413 8728.380 8776.894 8879.216    50
##  f_fast    4.683    4.968    5.257    5.100    5.196    7.098    50

Fully understanding why this happens would require understanding how dynamic allocation of memory works, which is not our focus here, but the basic idea is that when the object grows inside a loop, the computer must repeatedly find more space available for it in the memory, which is a costly process that can lead to memory fragmentation. For illustration purposes, let’s plot the object size (in bytes) as a function of i for our two loops. Note how, in the fast case, the object doesn’t increase in size over time.

library(pryr)
n = 1e5
object_size_slow_loop = rep(NA, n)
object_size_fast_loop = rep(NA, n)

v1 = NULL
for(i in 1:n) {
  v1 = c(v1, i)
  object_size_slow_loop[i] = object_size(v1)
}

v2 = rep(NA, n)
for(i in 1:n){
  v2[i] = i
  object_size_fast_loop[i] = object_size(v2)
} 

sizes = cbind.data.frame(object_size_fast_loop, object_size_slow_loop)
print(ggplot(sizes, aes(1:n)) +
  geom_line(aes(y = object_size_fast_loop, colour = "Fast loop")) +
  geom_line(aes(y = object_size_slow_loop, colour = "Slow loop")) +
  ggtitle("Size of the object at each iteration") + xlab("Iteration") + ylab("Size in bytes")
)

Beyond Vectors

In the above examples, we used vectors just because they are simpler to visualize, but note that this principle holds for any other data structure, such as matrices and lists. Let’s see one quick example of each, starting with the matrix case:

n = 1e3

f_slow_matrix = function(n) {
  M = NULL
  for(i in 1:n){
    M = rbind(M, rep(1, n))
  }
  return(M)
}

f_fast_matrix = function(n) {
  M = matrix(NA, nrow = n, ncol = n)
  for(i in 1:n){
    M[i,] = rep(1, n)
  }
  return(M)
}

mbmM = microbenchmark("f_slow_matrix" = f_slow_matrix(n),
                      "f_fast_matrix" = f_fast_matrix(n)
                      ,times = 50
)
print(mbmM)

## Unit: milliseconds
##           expr      min       lq     mean   median       uq       max neval
##  f_slow_matrix 5823.364 5858.490 5911.030 5914.014 5945.253  6040.606    50
##  f_fast_matrix    7.744    9.607   10.788   10.349   11.270    14.078    50

and, finally, an example with lists:

n = 1e4

f_slow_list = function(n) {
  L = list()
  for(i in 1:n){
    L[[paste0('index_',i)]] = i
  }
  return(L)
}

f_fast_list = function(n) {
  L = vector(mode = "list", length = n)
  for(i in 1:n){
    L[[i]] = i
  }
  names(L) = paste0("index_", 1:n)
  return(L)
}

mbmL = microbenchmark("f_slow_list" = f_slow_list(n),
                      "f_fast_list" = f_fast_list(n)
                      ,times = 50
)
print(mbmL)

## Unit: milliseconds
##         expr     min      lq    mean  median      uq      max neval
##  f_slow_list 607.299 616.947 682.738 639.575 656.597 1298.282    50
##  f_fast_list   3.273   3.364   3.967   3.559   3.775    7.704    50

Final Note: When maximum length is not known from the start

It could be the case that the maximum length is not known to the user. However, an upper bound (even a very bad one) could be used, and then after running the loop we can simply delete the part that wasn’t useful. For example, suppose that the true length is 100,000, but all we known is the upper bound 10,000,000. We can preallocate memory using our upper bound, as follows:

n = 1e5
UB = 1e7
f_slow = function(n) {
  v = NULL
  for(i in 1:n) v = c(v, i)
  return(v)
}

f_fast_upper_bound = function(n, UB) {
  v = rep(NA, UB)
  for(i in 1:n) {
    v[i] = i
  }
  v = v[1:n]
  return(v)
}

mbmUB = microbenchmark("f_slow" = f_slow(n),
                      "f_fast_upper_bound" = f_fast_upper_bound(n, 1e7), times = 50
                      )
print(mbmUB)

## Unit: milliseconds
##                expr      min       lq     mean   median       uq      max neval
##              f_slow 7529.350 7903.024 8394.622 8598.460 8828.130 9318.457    50
##  f_fast_upper_bound   38.629   54.643  112.712   128.72  137.910  224.047    50

Thanks for visiting r-craft.org
This article is originally published at https://insightr.wordpress.com
Please visit source website for post related comments.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Growing Objects and Loop Memory Pre-Allocation

You may also like...

Categories

Growing Objects and Loop Memory Pre-Allocation

Preallocating Memory

Beyond Vectors

Final Note: When maximum length is not known from the start

You may also like...

How R gets built on Windows

Login to the cluster, request a node and change to your project directory in a single command

Building a scaleable Deep Learning Serving Environment for Keras models using NVIDIA TensorRT Server and Google Cloud

Categories