
Creates forked clusters. If forking fails, switches to an alternative plan (by default, the value of option "dipsaus.cluster.backup", or "sequential" if that option is not set).

Usage

make_forked_clusters(
  workers = future::availableCores(),
  on_failure = getOption("dipsaus.cluster.backup", "sequential"),
  clean = FALSE,
  ...
)

Arguments

workers

positive integer, number of cores to use

on_failure

alternative plan to use if forking fails. This is useful when forked processes are not supported (for example, on 'Windows'); default is getOption("dipsaus.cluster.backup"), or 'sequential' if that option is not set

clean

whether to restore the previous plan on exit. This is useful when make_forked_clusters is called inside a function. See details and examples.

...

additional arguments passed to future::plan

Value

Current future plan

Details

This was originally designed as a wrapper for future::plan(future::multicore, ...). Forked clusters are discouraged when running in 'RStudio' because some pointers in 'RStudio' might be handled incorrectly, causing fork bombs. However, forked processes also have big advantages over other parallel methods: no data transfer is needed, so they are very fast, and many external pointers can be shared across forked processes. Since version 1.14.0, unfortunately, forked 'multicore' processing is disabled by default by the future package, and you usually need to enable it manually. This function provides a simple way to enable it and set the future plan at the same time.
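As a rough illustration of what enabling forked processing manually involves, the sketch below sets the future.fork.enable option and then plans multicore directly with the future package. It is not this function's implementation; the worker count of 2 and the sequential fallback are assumptions chosen for the example.

```r
library(future)

# Remember the current plan so it can be restored afterwards
old_plan <- plan()

# Ask the future package to allow forked processing
options(future.fork.enable = TRUE)

if (supportsMulticore()) {
  # Forking is available (for example, on Linux or macOS)
  plan(multicore, workers = 2)
} else {
  # Assumed fallback for this sketch (for example, on Windows)
  plan(sequential)
}

# PID of the process that evaluated the future
pid <- value(future(Sys.getpid()))

# Restore the plan that was active before
plan(old_plan)
```

With make_forked_clusters, the option and the plan are handled in a single call, and the fallback is controlled by on_failure.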

On Windows, forked processes are not supported. In this situation the plan falls back to sequential, which might not be what you want. In that case, this function lets you specify an alternative strategy through on_failure. You can also always use the alternative strategy by setting the dipsaus.no.fork option to TRUE.

The parameter clean lets you restore the previous plan automatically once your function exits. For example, users might have already set up their own plans; clean=TRUE sets the plan back to the original one when the function exits. To use this feature, make sure this function is called within another function, and collect all results before exiting the outer function.
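The restore-on-exit behaviour described above resembles the usual on.exit idiom with the future package. The sketch below writes that idiom out by hand; my_func_manual is a hypothetical name, multisession is used only so that the example runs on any platform, and this is not necessarily how clean = TRUE is implemented.

```r
library(future)

# Hypothetical function that changes the plan temporarily
my_func_manual <- function() {
  # Save the caller's plan and restore it when this function exits
  old_plan <- plan()
  on.exit(plan(old_plan), add = TRUE)

  # Temporary plan used only inside this function
  plan(multisession, workers = 2)

  fs <- lapply(1:2, function(i) {
    future(Sys.getpid())
  })

  # Collect results before the function exits and the plan is restored
  unlist(value(fs))
}
```

Calling my_func_manual() returns the worker PIDs, and plan() afterwards reports whatever plan was active before the call.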

Examples




if(interactive()){

  # ------ Basic example
  library(future)
  library(dipsaus)

  # sequential
  plan("sequential")

  make_forked_clusters()
  plan()  # multicore, or the backup plan where fork is unavailable (e.g. on Windows)

  Sys.getpid()  # current main session PID
  value(future({Sys.getpid()}))  # sub-process PID, evaluated as multicore

  # ------ When fork is not supported

  # reset to default single core strategy
  plan("sequential")

  # Disable forked process
  options("dipsaus.no.fork" = TRUE)
  options("dipsaus.cluster.backup" = "multisession")

  # Now falls back to multisession
  make_forked_clusters()
  plan()

  # ------ Auto-clean

  # reset plan
  plan("sequential")
  options("dipsaus.no.fork" = FALSE)
  options("dipsaus.cluster.backup" = "multisession")

  # simple case:
  my_func <- function(){
    make_forked_clusters(clean = TRUE)

    fs <- lapply(1:4, function(i){
      future({Sys.getpid()})
    })

    unlist(value(fs))
  }

  my_func()    # The PIDs differ from the main session, so the futures ran in other processes
  plan()       # The plan is sequential again; the previous strategy was restored automatically

  # ------ Auto-clean with lapply_async2
  my_plan <- plan()

  # lapply_async2 version of the previous task
  lapply_async2(1:4, function(i){
    Sys.getpid()
  })

  identical(plan(), my_plan)

}