feature_selection.SelectKBest Usage
Selects the k features with the highest scores according to a scoring function.
class sklearn.feature_selection.SelectKBest(score_func=<function f_classif>, k=10)
Parameters
score_func : callable
Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues) or a single array with scores (a custom-scorer sketch follows this parameter list).
k : int or "all", optional, default=10
Number of top features to select. The "all" option bypasses selection and keeps every feature.
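Any callable with the (X, y) signature can be supplied as score_func; built-ins such as f_classif, f_regression, chi2 and mutual_info_regression already follow it. Below is a minimal sketch with an assumed custom scorer (abs_corr_score is a hypothetical name, not part of scikit-learn) that returns a single score array, together with k="all":

import numpy as np
from sklearn.feature_selection import SelectKBest

def abs_corr_score(X, y):
    # hypothetical scorer: absolute Pearson correlation of each feature with y
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    return np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# k="all" keeps every feature (useful when only the scores are needed);
# an integer k keeps the k highest-scoring features
selector = SelectKBest(score_func=abs_corr_score, k="all")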
Attributes
scores_
array-like of shape (n_features,)
Scores of features.
pvalues_
array-like of shape (n_features,)
p-values of feature scores, None if score_func
returned only scores.
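A minimal sketch (with made-up data) of reading both attributes after fitting; f_regression returns p-values, so pvalues_ is populated here:

import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

X = np.array([[1., 0.], [2., 1.], [3., 0.], [4., 1.], [5., 0.], [6., 1.]])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.1, 5.8])  # roughly linear in column 0

sel = SelectKBest(score_func=f_regression, k=1).fit(X, y)
print(sel.scores_)   # F-score per feature; column 0 dominates
print(sel.pvalues_)  # corresponding p-values (None for scorers that return only scores)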
Methods

| Method | Description |
|---|---|
| fit(self, X, y) | Run score function on (X, y) and get the appropriate features. |
| fit_transform(self, X[, y]) | Fit to data, then transform it. |
| get_params(self[, deep]) | Get parameters for this estimator. |
| get_support(self[, indices]) | Get a mask, or integer index, of the features selected. |
| inverse_transform(self, X) | Reverse the transformation operation. |
| set_params(self, **params) | Set the parameters of this estimator. |
| transform(self, X) | Reduce X to the selected features. |
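The sketch below (on an assumed synthetic classification problem, not part of the original example) exercises the main methods: fit_transform, get_support, transform via fit_transform, and inverse_transform.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=200, n_features=8, n_informative=3, random_state=0)

selector = SelectKBest(score_func=f_classif, k=3)
X_new = selector.fit_transform(X, y)            # fit, then reduce X to the 3 best columns

print(selector.get_support())                   # boolean mask of selected features
print(selector.get_support(indices=True))       # integer indices instead of a mask
print(X_new.shape)                              # (200, 3)
print(selector.inverse_transform(X_new).shape)  # back to (200, 8); dropped columns become zeros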
Example
import pandas as pd
# load_boston ships with scikit-learn < 1.2; the loader was removed in version 1.2
from sklearn.datasets import load_boston

# Load the Boston housing dataset into a DataFrame
boston = load_boston()
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data["MEDV"] = boston.target

X = data.drop("MEDV", axis=1)  # drop the target column to get the feature matrix
y = data["MEDV"]               # target variable
data.head()
The following ranks the input variables by the F-statistic of a univariate linear regression against the target and keeps the five with the best predictive power. As the output shows, LSTAT receives the highest score, i.e. it is the strongest single predictor:
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_regression

# Apply SelectKBest to extract the top 5 features by univariate F-score
bestfeatures = SelectKBest(score_func=f_regression, k=5)
fit = bestfeatures.fit(X, y)

dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)
# Concatenate the two DataFrames for better visualization
featureScores = pd.concat([dfcolumns, dfscores], axis=1)
featureScores.columns = ['Specs', 'Score']  # name the DataFrame columns
print(featureScores.nlargest(5, 'Score'))   # print the 5 best features
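As a follow-up (an assumed continuation of the example above, not in the original), the fitted selector's get_support mask can be used to keep only the five selected columns of the original DataFrame:

selected_cols = X.columns[fit.get_support()]  # names of the 5 selected features
X_selected = X[selected_cols]
print(selected_cols.tolist())
print(X_selected.shape)  # (506, 5) for the Boston data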