Use this function to create a partitioned data frame from existing data frames that are already spread across the workers of a cluster.

party_df(cluster, name, auto_rm = FALSE)

Arguments

cluster

A cluster, such as the one returned by `default_cluster()`.

name

Name of the data frame variable, given as a string. The variable must exist on every worker, be a data frame, and have the same column names on each worker.

auto_rm

If `TRUE`, will automatically `rm()` the data frame on the workers when this object is deleted (that is, garbage collected).
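The requirements on `name` can be checked on the workers before calling `party_df()`. The following is a minimal sketch, not part of the original reference: it assumes the multidplyr package is attached, that `default_cluster()` provides two workers, and that `cluster_call()` is used to evaluate an expression on every worker. The variable names `df`, `ok`, `cols` and `party` are illustrative.

```r
library(multidplyr)

# Assumption: the default cluster, as used in the Examples section
cl <- default_cluster()

# Create a data frame called `df` on every worker
cluster_send(cl, df <- data.frame(x = runif(5)))

# Check that `df` exists and is a data frame on every worker
ok <- cluster_call(cl, exists("df") && is.data.frame(df))
stopifnot(all(unlist(ok)))

# Check that every worker reports the same column names
cols <- cluster_call(cl, names(df))
stopifnot(length(unique(cols)) == 1)

# Only now build the partitioned data frame
party <- party_df(cl, "df")
```

Running the checks first gives an early, explicit failure if one worker is missing the variable or has differently named columns.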

Examples

# In a real example, you might spread file names across the workers
# and read them in using data.table::fread()/vroom::vroom()/qs::qread().
cl <- default_cluster()
cluster_send(cl[1], n <- 10)
cluster_send(cl[2], n <- 15)
cluster_send(cl, df <- data.frame(x = runif(n)))

df <- party_df(cl, "df")
df
#> Source: party_df [25 x 1]
#> Shards: 2 [10--15 rows]
#>
#>       x
#>   <dbl>
#> 1 0.753
#> 2 0.726
#> 3 0.945
#> 4 0.233
#> 5 0.934
#> 6 0.270
#> # … with 19 more rows
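The comment in the example mentions spreading file names across the workers and reading the files there. A rough sketch of that workflow follows, under several assumptions: the two temporary CSV files, the worker-side variable names `path` and `df`, and the use of base `read.csv()` in place of the faster readers named above are all illustrative choices, and `cluster_assign_partition()` is assumed to split the vector of file names between the workers.

```r
library(multidplyr)

# Assumption: two small CSV files in a temporary directory stand in for real data
paths <- file.path(tempdir(), c("part1.csv", "part2.csv"))
write.csv(data.frame(x = runif(10)), paths[1], row.names = FALSE)
write.csv(data.frame(x = runif(15)), paths[2], row.names = FALSE)

cl <- default_cluster()

# Give each worker its own subset of the file names, stored as `path`
cluster_assign_partition(cl, path = paths)

# Each worker reads its own file(s) into a local data frame called `df`
cluster_send(cl, df <- do.call(rbind, lapply(path, read.csv)))

# Combine the worker-side data frames into one partitioned data frame
files_df <- party_df(cl, "df")
```

With this approach the data never passes through the main session: only the file names are sent to the workers, and each worker loads its own share.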