sklearn GridSearchCV with Pipeline

I am trying to build a pipeline that first applies RandomizedPCA to my training data and then fits a ridge regression model. Here is my code:

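# Imports assume the older scikit-learn release this code was written for
import numpy as np
from sklearn.decomposition import RandomizedPCA
from sklearn.grid_search import GridSearchCV
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

pca = RandomizedPCA(1000, whiten=True)
rgn = Ridge()
pca_ridge = Pipeline([('pca', pca),
                      ('ridge', rgn)])
parameters = {'ridge__alpha': 10 ** np.linspace(-5, 2, 3)}  # 1e-5, ~0.0316, 100, matching the output below
grid_search = GridSearchCV(pca_ridge, parameters, cv=2, n_jobs=1, scoring='mean_squared_error')
grid_search.fit(train_x, train_y[:, 1:])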

I know about RidgeCV, but I want to try out Pipeline and GridSearchCV.

I want the grid search to report RMSE, but sklearn does not seem to support that, so I am using MSE instead. However, the scores it reports are negative:

In [41]: grid_search.grid_scores_
Out[41]: 
[mean: -0.02665, std: 0.00007, params: {'ridge__alpha': 1.0000000000000001e-05},
 mean: -0.02658, std: 0.00009, params: {'ridge__alpha': 0.031622776601683791},
 mean: -0.02626, std: 0.00008, params: {'ridge__alpha': 100.0}]

Obviously this is impossible for a mean squared error. What am I doing wrong here?
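
For context, a likely explanation is that scikit-learn scorers follow a "greater is better" convention, so error metrics such as MSE are reported with the sign flipped. The following is only a minimal sketch of how RMSE could be recovered from those scores. It assumes a newer scikit-learn API than the code above (`sklearn.model_selection`, `PCA` with `svd_solver="randomized"`, the scoring name `"neg_mean_squared_error"`, and `cv_results_` in place of `grid_scores_`), and it uses small synthetic data as a stand-in for `train_x` and `train_y`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (hypothetical; not the data from the question).
rng = np.random.RandomState(0)
X = rng.randn(200, 20)
y = X @ rng.randn(20) + 0.1 * rng.randn(200)

pipe = Pipeline([
    ("pca", PCA(n_components=10, whiten=True,
                svd_solver="randomized", random_state=0)),
    ("ridge", Ridge()),
])
param_grid = {"ridge__alpha": 10 ** np.linspace(-5, 2, 3)}

# The scorer returns the negated MSE so that larger values mean better models.
search = GridSearchCV(pipe, param_grid, cv=2, scoring="neg_mean_squared_error")
search.fit(X, y)

neg_mse = search.cv_results_["mean_test_score"]  # values are <= 0
rmse = np.sqrt(-neg_mse)  # flip the sign back, then take the square root

for alpha, value in zip(param_grid["ridge__alpha"], rmse):
    print(f"alpha={alpha:g}  RMSE={value:.4f}")
```

Note that this takes the square root of the fold-averaged MSE, which is not the same as averaging the per-fold RMSE values. If the per-fold RMSE matters, a custom scorer built with `sklearn.metrics.make_scorer(..., greater_is_better=False)` is one option, and its reported scores would again be negated.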
