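When I run random forest serially, it uses 8 GB of RAM on my system; when I run it in parallel, it uses more than twice as much (18 GB). How can I keep it at 8 GB when running in parallel? Here is the code: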
    install.packages('foreach')
    install.packages('doSMP')
    install.packages('randomForest')
    library('foreach')
    library('doSMP')
    library('randomForest')

    NbrOfCores <- 8
    workers <- startWorkers(NbrOfCores)  # number of cores
    registerDoSMP(workers)
    getDoParName()     # check name of parallel backend
    getDoParVersion()  # check version of parallel backend
    getDoParWorkers()  # check number of workers

    # creating data and setting options for random forests
    # if you run this, please adapt it so it won't crash your system!
    # This amount of data uses up to 18GB of RAM.
    x <- matrix(runif(500000), 100000)  # 100000 rows x 5 columns
    y <- gl(2, 50000)                   # two-level factor, 50000 of each

    # options
    set.seed(1)
    ntree <- 1000
    ntree2 <- ntree / NbrOfCores  # trees grown per worker
    gc()

    # running serialized version of random forests
    system.time(rf1 <- randomForest(x, y, ntree = ntree))
    gc()

    # running parallel version of random forests
    # each worker fits ntree2 trees on the same x and y; combine() merges the forests
    system.time(
      rf2 <- foreach(ntree = rep(ntree2, NbrOfCores), .combine = combine,
                     .packages = "randomForest") %dopar%
        randomForest(x, y, ntree = ntree)
    )