Calculate massive covariance matrix in parallel

Speeds up covariance calculation for large matrices by computing in parallel. By default the result matches cov (method 'pearson'); missing values are not handled.

Usage

fast_cov(x, y = NULL, col_x = NULL, col_y = NULL, df = NA)

Arguments

x: a numeric vector, matrix or data frame; a matrix is highly recommended to maximize the performance
y: NULL (default) or a vector, matrix or data frame with compatible dimensions to x; the default is equivalent to y = x
col_x: integers indicating the subset indices (columns) of x to calculate the covariance, or NULL to include all the columns; default is NULL
col_y: integers indicating the subset indices (columns) of y to calculate the covariance, or NULL to include all the columns; default is NULL
df: a scalar giving the degrees of freedom; the default (NA) is treated as nrow(x) - 1
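
As a rough illustration of what the arguments above mean, the sketch below writes the same computation in base R. The helper `ref_cov` is hypothetical and is not the ravetools implementation. It assumes that `col_x` and `col_y` select columns before the covariance is taken, and that `df` replaces the usual `nrow(x) - 1` divisor.

```r
set.seed(1)
x <- matrix(rnorm(400), nrow = 100)

# Hypothetical base-R reference, for illustration only (not the ravetools source)
ref_cov <- function(x, y = NULL, col_x = NULL, col_y = NULL, df = NA) {
  if (is.null(y)) y <- x                        # y = NULL is equivalent to y = x
  if (!is.null(col_x)) x <- x[, col_x, drop = FALSE]
  if (!is.null(col_y)) y <- y[, col_y, drop = FALSE]
  if (is.na(df)) df <- nrow(x) - 1              # assumed meaning of the NA default
  # Center each column, then scale the cross-product by the degrees of freedom
  xc <- sweep(x, 2, colMeans(x))
  yc <- sweep(y, 2, colMeans(y))
  crossprod(xc, yc) / df
}

# With the default df, the subset call agrees with stats::cov on the same columns
sub_ref <- ref_cov(x, col_x = 1, col_y = 1:2)
sub_cov <- cov(x[, 1, drop = FALSE], x[, 1:2])
all.equal(sub_ref, sub_cov)
```

Under these assumptions, `fast_cov(x, col_x = 1, col_y = 1:2)` should return the same 1 x 2 matrix as `sub_cov`, without first copying the column subsets.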

Value

The covariance matrix of x and y. There is no NA handling: any missing values in the input lead to NA in the resulting covariance matrix.
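
Because missing values are not handled, it can help to check the input before calling `fast_cov`. The following is a minimal sketch using base R only; dropping incomplete rows is just one possible strategy and is an assumption here, not a recommendation from the package. The final call is guarded so that the snippet also runs when ravetools is not installed.

```r
set.seed(1)
x <- matrix(rnorm(400), nrow = 100)
x[3, 2] <- NA

# Columns that contain at least one missing value
na_cols <- which(colSums(is.na(x)) > 0)

# stats::cov with its default use = "everything" propagates NA in the same
# way the documentation describes for fast_cov
base_result <- cov(x)
anyNA(base_result[, na_cols])

# One option: drop incomplete rows before computing the covariance
x_complete <- x[complete.cases(x), , drop = FALSE]
clean_result <- cov(x_complete)

# Same call with fast_cov, only if ravetools is available
if (requireNamespace("ravetools", quietly = TRUE)) {
  clean_fast <- ravetools::fast_cov(x_complete)
}
```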

Examples


# Set n_threads = 2 to comply with CRAN policy. Please don't run this line
ravetools_threads(n_threads = 2L)

x <- matrix(rnorm(400), nrow = 100)

# Call `cov(x)` to compare
fast_cov(x)
#>            [,1]         [,2]        [,3]         [,4]
#> [1,] 1.30987040  0.013002746  0.06263758  0.031019785
#> [2,] 0.01300275  0.905270345  0.02027502 -0.007362583
#> [3,] 0.06263758  0.020275016  0.77420186 -0.147601612
#> [4,] 0.03101978 -0.007362583 -0.14760161  1.287895141

# Calculate covariance of subsets
fast_cov(x, col_x = 1, col_y = 1:2)
#>         [,1]       [,2]
#> [1,] 1.30987 0.01300275

# \donttest{

# Speed comparison; use multiple cores (4, 8, or more)
# to see the difference.

ravetools_threads(n_threads = -1)
x <- matrix(rnorm(100000), nrow = 1000)
microbenchmark::microbenchmark(
  fast_cov = {
    fast_cov(x, col_x = 1:50, col_y = 51:100)
  },
  cov = {
    cov(x[, 1:50], x[, 51:100])
  },
  unit = 'ms', times = 10
)
#> Unit: milliseconds
#>      expr      min       lq     mean   median       uq      max neval
#>  fast_cov 1.527198 1.541936 1.790997 1.564553 1.638705 3.734895    10
#>       cov 5.336953 5.359815 5.422919 5.413061 5.448160 5.640749    10

# }