Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Question

28 views · May 21, 2023

Anonymous · August 12, 2022


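When I try to predict a single sample from my data, I get a reshape error, even though the inputs to my model have the same number of rows. What is the problem? I found similar questions, but none of them came with an explanation.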

import pandas as pd
from sklearn.linear_model import LinearRegression
import numpy as np
x = np.array([2.0 , 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])
lr = LinearRegression()
lr.fit(x,y)
print(lr.predict(2.4))

The error message is:

"if it contains a single sample.".format(array))

ValueError: Expected 2D array, got scalar array instead:

array=2.4.

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.


2 Answers

Anonymous · Answer 1 · 2022-08-12T20:57:58+00:00

The error is basically telling you to convert the flat feature array into a column array. reshape(-1, 1) does the job; [:, None] works as well.

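The second dimension of the feature array X must match the second dimension of whatever is passed to predict(). Since X is coerced into a 2D array, the array passed to predict() should also be 2D.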

x = np.array([2.0 , 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])
X = x[:, None]         # X.ndim should be 2
lr = LinearRegression()
lr.fit(X, y)
prediction = lr.predict([[2.4]])
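As a quick check of the shapes involved, here is a minimal sketch that reuses the data from the question. The names `X_reshape`, `X_none`, `new_values`, `preds` and `single` are only illustrative and do not come from the original post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])

# Two equivalent ways to turn the flat array into a single-feature column.
X_reshape = x.reshape(-1, 1)
X_none = x[:, None]
print(x.shape, X_reshape.shape, X_none.shape)  # (9,) (9, 1) (9, 1)

lr = LinearRegression().fit(X_reshape, y)

# predict() expects the same layout: one row per sample, one column per feature.
new_values = np.array([2.4, 3.0])               # illustrative inputs
preds = lr.predict(new_values.reshape(-1, 1))   # two samples, one feature
single = lr.predict(np.array([[2.4]]))          # one sample, one feature
print(preds.shape, single.shape)                # (2,) (1,)
```

Either spelling gives the same (n_samples, 1) array, so picking one is mostly a matter of taste.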

If the input is a pandas column, use double brackets ([[]]) to get a 2D feature array.

df = pd.DataFrame({'feature': x, 'target': y})
lr = LinearRegression()
lr.fit(df['feature'], df['target'])            # <---- error
lr.fit(df[['feature']], df['target'])          # <---- OK
#        ^^         ^^                           <---- double brackets
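The difference between single and double brackets can be seen by inspecting the dimensions directly. This is a small sketch using the same data as above; `series`, `frame` and `new_sample` are illustrative names, and `Series.to_frame()` is shown as one more option that the answer itself does not mention.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

x = np.array([2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])
df = pd.DataFrame({'feature': x, 'target': y})

series = df['feature']     # single brackets: 1D Series
frame = df[['feature']]    # double brackets: 2D DataFrame with one column
print(series.ndim, series.shape)  # 1 (9,)
print(frame.ndim, frame.shape)    # 2 (9, 1)

# to_frame() also turns a Series into a one-column DataFrame.
print(series.to_frame().shape)    # (9, 1)

lr = LinearRegression().fit(frame, df['target'])

# Predict with a DataFrame that has the same column name as the training data.
new_sample = pd.DataFrame({'feature': [2.4]})
pred = lr.predict(new_sample)
print(pred.shape)                 # (1,)
```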

Why should `X` be 2D?

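If we look at the source code of fit() for any model in scikit-learn, one of the first things it does is validate the input via the validate_data() method, which calls check_array() to validate X. Among other things, check_array() checks whether X is 2D. It matters that X is 2D because LinearRegression().fit() ultimately calls scipy.linalg.lstsq to solve the least-squares problem, and lstsq needs X to be 2D in order to perform matrix multiplication.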

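For classifiers, the second dimension is needed to obtain the number of features, which is essential for getting model coefficients of the correct shape.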
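The validation step can be tried on its own. The sketch below calls `sklearn.utils.check_array` directly, assuming its default settings (which require a 2D input), and then shows how the second dimension ends up in the fitted model. The `rejected` flag is just an illustrative variable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.utils import check_array

x = np.array([2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])

# With default settings, check_array() rejects a 1D array.
try:
    check_array(x)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)  # "Expected 2D array, got 1D array instead: ..."

# The reshaped column passes validation unchanged in shape.
X = check_array(x.reshape(-1, 1))
print(X.shape)  # (9, 1)

# After fitting, the number of columns determines the coefficient shape.
lr = LinearRegression().fit(X, y)
print(lr.n_features_in_, lr.coef_.shape)  # 1 (1,)
```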

Anonymous · Answer 2 · 2022-08-12T20:57:58+00:00

You need to reshape your X into a 2D array rather than a 1D array. Fitting a model requires a 2D array, i.e. one of shape (n_samples, n_features).

x = np.array([2.0 , 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])
lr = LinearRegression()
lr.fit(x.reshape(-1, 1), y)
print(lr.predict([[2.4]]))
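The other half of the error message, reshape(1, -1), applies when a 1D array holds one sample with several features. The sketch below illustrates that case; the second feature column and its values are made up for the example and are not part of the original question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical two-feature training data (the second column is invented).
X = np.array([[2.0, 4.0], [2.4, 4.0], [1.5, 4.0], [3.5, 6.0], [3.7, 6.0]])
y = np.array([196, 221, 136, 255, 267])
lr = LinearRegression().fit(X, y)

sample = np.array([2.4, 4.0])   # one sample, two features, but 1D
row = sample.reshape(1, -1)     # one row, two columns
print(row.shape)                # (1, 2)

pred = lr.predict(row)
print(pred.shape)               # (1,)

# reshape(-1, 1) would be wrong here: it produces two samples with one
# feature each, which does not match the two features seen during fit().
try:
    lr.predict(sample.reshape(-1, 1))
    mismatch = False
except ValueError:
    mismatch = True
```

In short, -1 goes in the dimension you want NumPy to infer: rows when there is one feature, columns when there is one sample.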
