The record function writes the values issued by a specified iterator to a
file or connection object. The ireplay function returns an iterator that
replays those values. This is useful for iterating concurrently over
multiple large matrices or data frames that cannot all be held in memory at
the same time. Each large object can be recorded to a file, one at a time,
and the files can then be replayed concurrently using minimal memory.
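The record-then-replay idea can be sketched in base R. This is only an illustration of the general technique, not the package's implementation: record_values and replay_values are hypothetical names, and the sketch assumes that any read error from unserialize means the end of the file has been reached.

```r
# Sketch only: base R, hypothetical helper names (not the package's code).

# Write each value to `path` as its own serialized record.
record_values <- function(values, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  for (v in values) serialize(v, con)
  invisible(path)
}

# Return a function that yields one recorded value per call, and `or`
# once the file is exhausted. Only one value is read into memory at a time.
replay_values <- function(path) {
  con <- file(path, "rb")
  done <- FALSE
  function(or = NULL) {
    if (done) return(or)
    tryCatch(
      unserialize(con),
      # Assumption: any read error is treated as end of file.
      error = function(e) {
        close(con)
        done <<- TRUE
        or
      }
    )
  }
}

f <- tempfile()
record_values(list(1:3, letters[1:2], matrix(1:4, 2)), f)
nxt <- replay_values(f)
first <- nxt()
second <- nxt()
third <- nxt()
after_end <- nxt(or = "done")
unlink(f)
```

Because each call reads a single record from disk, several such readers can be open at once without loading any of the recorded objects in full, which is the property the matrix example relies on.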
Arguments
- iterable
The iterable to record to the file.
- con
A file path or open connection.
- ...
Passed along to iteror(iterable, ...).
Examples
suppressMessages(library(foreach))
library(iterors)  # provides iteror(), record() and ireplay()
m1 <- matrix(rnorm(70), 7, 10)
f1 <- tempfile()
record(iteror(m1, by='row', chunkSize=3), f1)
m2 <- matrix(1:50, 10, 5)
f2 <- tempfile()
record(iteror(m2, by='column', chunkSize=3), f2)
# Perform a simple out-of-core matrix multiply
p <- foreach(col=ireplay(f2), .combine='cbind') %:%
  foreach(row=ireplay(f1), .combine='rbind') %do% {
    row %*% col
  }
dimnames(p) <- NULL
print(p)
#>            [,1]       [,2]       [,3]       [,4]       [,5]
#> [1,]  19.939931  37.987451  56.034971   74.08249   92.13001
#> [2,]   4.977286   9.523644  14.070002   18.61636   23.16272
#> [3,]   4.289386  -3.875437 -12.040260  -20.20508  -28.36991
#> [4,] -12.543284 -13.287506 -14.031729  -14.77595  -15.52017
#> [5,] -19.103148 -46.572138 -74.041129 -101.51012 -128.97911
#> [6,]   1.819314   6.919221  12.019127   17.11903   22.21894
#> [7,]   1.047657   5.516499   9.985342   14.45418   18.92303
all.equal(p, m1 %*% m2)
#> [1] TRUE
unlink(c(f1, f2))