RandomizedSearchCV is super slow: troubleshooting and improving performance

Problem description

I have been working on the script below for random forest classification and am running into problems with the performance of the randomized search: it takes a very long time to complete, and I wonder whether I am doing something wrong or whether I could do something better to make it faster.

Would anybody be able to suggest speed/performance improvements I could make?

Thanks in advance!

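import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# train_features, train_labels, test_features, test_labels and
# available_processor_count are defined earlier in the script (not shown).
forest_start_time = time.time()

model = RandomForestClassifier()
# Candidate values that the randomized search samples from.
param_grid = {
    'bootstrap': [True, False],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [200, 300, 500, 1000]
}

# Sample 10 candidates and score each with 3-fold cross-validation.
bestforest = RandomizedSearchCV(estimator = model,
                                param_distributions = param_grid,
                                cv = 3, n_iter = 10,
                                n_jobs = available_processor_count)

bestforest.fit(train_features, train_labels.ravel())
forest_score = bestforest.score(test_features, test_labels.ravel())
print(forest_score)
forest_end_time = time.time()
# Elapsed time in seconds (end minus start).
forest_duration = forest_end_time - forest_start_time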
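
As a rough aid for reasoning about the runtime, the sketch below counts how much work the search above requests. It uses only the values from the script, plus one assumption: `refit` is left at scikit-learn's default of `True`, so the best candidate is fitted once more on the full training set. The variable names are illustrative.

```python
from math import prod

# Search space copied from the script above.
param_grid = {
    'bootstrap': [True, False],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [200, 300, 500, 1000],
}

n_iter = 10  # candidates sampled by the randomized search
cv = 3       # cross-validation folds per candidate

# Size of the full grid that the randomized search samples from.
total_combinations = prod(len(values) for values in param_grid.values())

# Each sampled candidate is fitted once per fold; assuming the default
# refit=True, the best candidate is fitted once more on all training data.
cv_fits = n_iter * cv
total_fits = cv_fits + 1

# Every fit trains n_estimators trees, so these bound the number of trees built.
min_trees = total_fits * min(param_grid['n_estimators'])
max_trees = total_fits * max(param_grid['n_estimators'])

print(total_combinations, total_fits, min_trees, max_trees)
```

So even with only 10 sampled candidates, the search trains thousands of trees, which is consistent with a long wall-clock time on larger data sets.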

Recommended answer

The most direct ways to speed this up are to reduce the number of features and/or to use more CPU cores by setting n_jobs = -1:

bestforest = RandomizedSearchCV(estimator = model, 
                                param_distributions = param_grid, 
                                cv = 3, n_iter = 10, 
                                n_jobs = -1)
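
The two suggestions can be combined in one place. The following is a minimal sketch, not the answerer's own code: it assumes that "reducing the features" is done with `SelectKBest` (one of several possible selectors), and the helper name `fast_forest_search`, the `n_keep` parameter, and the smaller parameter values are hypothetical choices made for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline


def fast_forest_search(features, labels, n_keep=10, n_jobs=-1, random_state=0):
    """Run a randomized search on a reduced feature set (hypothetical helper)."""
    pipeline = Pipeline([
        # Assumption: keep only the n_keep features with the highest ANOVA F-score.
        ('select', SelectKBest(score_func=f_classif, k=n_keep)),
        ('forest', RandomForestClassifier(random_state=random_state)),
    ])
    # Illustrative, deliberately small search space; the "forest__" prefix
    # routes each parameter to the pipeline step named 'forest'.
    param_distributions = {
        'forest__max_depth': [5, 10, 20],
        'forest__min_samples_leaf': [3, 4, 5],
        'forest__n_estimators': [20, 50],
    }
    search = RandomizedSearchCV(
        estimator=pipeline,
        param_distributions=param_distributions,
        n_iter=4,
        cv=3,
        n_jobs=n_jobs,  # -1 asks joblib to use every available CPU core
        random_state=random_state,
    )
    # fit() returns the fitted search object itself.
    return search.fit(features, labels)
```

With the variables from the question, a call might look like `fast_forest_search(train_features, train_labels.ravel())`. Putting the selector inside the pipeline means the feature selection is re-fitted on each training fold, so the cross-validation scores are not influenced by the held-out fold.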

