
The record function writes the values produced by a specified iterator to a file or connection object. The ireplay function returns an iterator that replays those values. This is useful for iterating concurrently over multiple large matrices or data frames that cannot all be held in memory at the same time. Each large object can be recorded to a file one at a time, and the files can then be replayed concurrently using minimal memory.

Usage

record(iterable, con, ...)

Arguments

iterable

The iterable to record to the file.

con

A file path or open connection.

...

Further arguments passed along to iteror(iterable, ...).

Value

NULL, invisibly.

Details

Originally from the itertools package.
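
The record-and-replay idea can be sketched in base R alone. This is a rough illustration, not the package's implementation: it assumes values are written one after another with serialize() and read back lazily with unserialize(). The helper names record_values and open_replay are hypothetical and do not exist in the package.

```r
# Hypothetical helper: write each value to a binary file, one after another.
record_values <- function(values, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  for (v in values) serialize(v, con)
  invisible(NULL)
}

# Hypothetical helper: return a function that reads one value per call.
# Once the file is exhausted it closes the connection and returns `or`.
open_replay <- function(path) {
  con <- file(path, "rb")
  done <- FALSE
  function(or = NULL) {
    if (done) return(or)
    tryCatch(unserialize(con), error = function(e) {
      # unserialize() signals an error at end of file
      done <<- TRUE
      close(con)
      or
    })
  }
}

path <- tempfile()
chunks <- list(matrix(1:6, 2, 3), matrix(7:12, 2, 3))
record_values(chunks, path)

next_value <- open_replay(path)
first  <- next_value()   # first recorded chunk
second <- next_value()   # second recorded chunk
third  <- next_value()   # file exhausted, so the default `or` is returned
unlink(path)
```

Only one value is held in memory per call, which is why several recorded files can be replayed side by side at low memory cost.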

Examples


library(iterors)  # provides iteror(), record() and ireplay()
suppressMessages(library(foreach))

m1 <- matrix(rnorm(70), 7, 10)
f1 <- tempfile()
record(iteror(m1, by='row', chunkSize=3), f1)

m2 <- matrix(1:50, 10, 5)
f2 <- tempfile()
record(iteror(m2, by='column', chunkSize=3), f2)

# Perform a simple out-of-core matrix multiply
p <- foreach(col=ireplay(f2), .combine='cbind') %:%
       foreach(row=ireplay(f1), .combine='rbind') %do% {
         row %*% col
       }

dimnames(p) <- NULL
print(p)
#>            [,1]       [,2]       [,3]       [,4]       [,5]
#> [1,]  19.939931  37.987451  56.034971   74.08249   92.13001
#> [2,]   4.977286   9.523644  14.070002   18.61636   23.16272
#> [3,]   4.289386  -3.875437 -12.040260  -20.20508  -28.36991
#> [4,] -12.543284 -13.287506 -14.031729  -14.77595  -15.52017
#> [5,] -19.103148 -46.572138 -74.041129 -101.51012 -128.97911
#> [6,]   1.819314   6.919221  12.019127   17.11903   22.21894
#> [7,]   1.047657   5.516499   9.985342   14.45418   18.92303
all.equal(p, m1 %*% m2)
#> [1] TRUE
unlink(c(f1, f2))