Here is an example that creates two data sets:

    from sklearn.linear_model import LogisticRegression
    from sklearn.datasets import make_classification

    # data set 1
    X1, y1 = make_classification(n_classes=2, n_features=5, random_state=1)
    # data set 2 (same number of features as data set 1)
    X2, y2 = make_classification(n_classes=2, n_features=5, random_state=2)
I want to use a LogisticRegression estimator with the same parameter values to fit a classifier on each data set:

    lr = LogisticRegression()

    clf1 = lr.fit(X1, y1)
    clf2 = lr.fit(X2, y2)

    print("Classifier for data set 1: ")
    print("  - intercept: ", clf1.intercept_)
    print("  - coef_: ", clf1.coef_)

    print("Classifier for data set 2: ")
    print("  - intercept: ", clf2.intercept_)
    print("  - coef_: ", clf2.coef_)
The problem is that the two classifiers are identical:

    Classifier for data set 1:
      - intercept:  [ 0.05191729]
      - coef_:  [[ 0.06704494  0.00137751 -0.12453698 -0.05999127  0.05798146]]
    Classifier for data set 2:
      - intercept:  [ 0.05191729]
      - coef_:  [[ 0.06704494  0.00137751 -0.12453698 -0.05999127  0.05798146]]
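For this simple example, I could avoid the problem with something like:

    lr1 = LogisticRegression()
    lr2 = LogisticRegression()

    clf1 = lr1.fit(X1, y1)
    clf2 = lr2.fit(X2, y2)

However, the question remains: how do I copy/clone an estimator together with its specific parameter values?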
Solution

Use sklearn.base.clone, which builds a new, unfitted estimator with the same parameters:

    from sklearn.base import clone

    lr1 = LogisticRegression()
    lr2 = clone(lr1)
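To see why the original code printed the same numbers twice, and that clone fixes it, here is a minimal sketch. It relies on the fact that fit() returns the estimator itself, so in the question clf1 and clf2 were two names for the same lr object. The value C=0.5 and the variable names same_params, distinct_objects, and different_coefs are assumptions chosen only for illustration; they are not part of the original post.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X1, y1 = make_classification(n_classes=2, n_features=5, random_state=1)
X2, y2 = make_classification(n_classes=2, n_features=5, random_state=2)

# C=0.5 is an arbitrary non-default value, used to show that clone keeps it.
lr1 = LogisticRegression(C=0.5)
lr2 = clone(lr1)  # new, unfitted estimator with the same parameters

# fit() returns the estimator itself, so clf1 is lr1 and clf2 is lr2.
clf1 = lr1.fit(X1, y1)
clf2 = lr2.fit(X2, y2)

same_params = lr1.get_params() == lr2.get_params()
distinct_objects = clf1 is not clf2
different_coefs = not np.allclose(clf1.coef_, clf2.coef_)

print("same parameters:       ", same_params)
print("distinct objects:      ", distinct_objects)
print("different coefficients:", different_coefs)
```

Because clone copies only the constructor parameters and not any fitted attributes, each clone starts from a clean state and can be fitted on its own data set without affecting the original.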