StackingClassifier
StackingClassifier(classifiers, meta_classifier, use_probas=False, average_probas=False, verbose=0, use_features_in_secondary=False, store_train_meta_features=False, use_clones=True)
A Stacking classifier for scikit-learn estimators for classification.
Parameters
-
classifiers
: array-like, shape = [n_classifiers]
A list of classifiers. Invoking the fit method on the StackingClassifier will fit clones of these original classifiers, which are stored in the attribute self.clfs_.
-
meta_classifier
: object
The meta-classifier to be fitted on the ensemble of classifiers.
-
use_probas
: bool (default: False)
If True, trains the meta-classifier on predicted probabilities instead of class labels.
-
average_probas
: bool (default: False)
If True, averages the probabilities to form the meta-features (only relevant when use_probas=True).
-
verbose
: int, optional (default=0)
Controls the verbosity of the building process.
  - verbose=0 (default): Prints nothing
  - verbose=1: Prints the number & name of the classifier being fitted
  - verbose=2: Prints info about the parameters of the classifier being fitted
  - verbose>2: Changes the verbose param of the underlying classifier to self.verbose - 2
-
use_features_in_secondary
: bool (default: False)
If True, the meta-classifier is trained on both the predictions of the original classifiers and the original dataset. If False, the meta-classifier is trained only on the predictions of the original classifiers.
-
store_train_meta_features
: bool (default: False)
If True, the meta-features computed from the training data used for fitting the meta-classifier are stored in the self.train_meta_features_ array, which can be accessed after calling fit.
-
use_clones
: bool (default: True)
If True (default), clones the classifiers for stacking classification; otherwise uses the original ones, which will be refitted on the dataset upon calling the fit method. Hence, if use_clones=True, the original input classifiers remain unmodified when the StackingClassifier's fit method is called. Setting use_clones=False is recommended if you are working with estimators that support the scikit-learn fit/predict API but are not compatible with scikit-learn's clone function.
Attributes
-
clfs_
: list, shape=[n_classifiers]
Fitted classifiers (clones of the original classifiers)
-
meta_clf_
: estimator
Fitted meta-classifier (clone of the original meta-estimator)
-
train_meta_features_
: numpy array, shape = [n_samples, n_classifiers]
Meta-features for the training data, where n_samples is the number of samples in the training data and n_classifiers is the number of classifiers. Available only if store_train_meta_features=True.
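The shape of the meta-feature array depends on use_probas, average_probas, and use_features_in_secondary. The NumPy sketch below illustrates the idea with made-up first-level outputs. It assumes that probabilities are concatenated column-wise and that the original features are placed before the meta-features; both are assumptions for illustration and not a statement about the library internals.

```python
import numpy as np

n_samples, n_classifiers, n_classes, n_features = 4, 3, 2, 5
rng = np.random.default_rng(0)

# Hypothetical first-level outputs: one label vector and one probability
# matrix per classifier.
label_preds = [rng.integers(0, n_classes, size=n_samples)
               for _ in range(n_classifiers)]
raw = rng.random((n_classifiers, n_samples, n_classes))
probas = raw / raw.sum(axis=2, keepdims=True)  # each row sums to 1

# use_probas=False: one column of predicted labels per classifier.
meta_labels = np.column_stack(label_preds)

# use_probas=True, average_probas=False: probability columns side by side.
meta_probas = np.concatenate(list(probas), axis=1)

# use_probas=True, average_probas=True: element-wise mean over classifiers.
meta_avg = probas.mean(axis=0)

# use_features_in_secondary=True: original features plus meta-features
# (column order assumed).
X = rng.random((n_samples, n_features))
meta_with_features = np.hstack((X, meta_labels))

print(meta_labels.shape, meta_probas.shape, meta_avg.shape)  # (4, 3) (4, 6) (4, 2)
print(meta_with_features.shape)                              # (4, 8)
```

Only the label-based case yields the [n_samples, n_classifiers] shape listed above.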
Examples
For usage examples, please see http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/
Methods
fit(X, y, sample_weight=None)
Fit ensemble classifiers and the meta-classifier.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and n_features is the number of features.
-
y
: array-like, shape = [n_samples] or [n_samples, n_outputs]
Target values.
-
sample_weight
: array-like, shape = [n_samples], optional
Sample weights passed as sample_weight to each classifier in the classifiers list as well as the meta_classifier. Raises an error if any classifier does not support sample_weight in its fit() method.
Returns
self
: object
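The two-stage fit described above can be sketched by hand with plain scikit-learn estimators. This is a simplified illustration of the idea (label-based meta-features, no cross-validation) and not the library's implementation; the estimators and dataset are assumptions. It also shows why an estimator whose fit() lacks a sample_weight argument leads to an error; the exact exception raised by the StackingClassifier itself is not specified here.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
sample_weight = np.ones(len(y))

# Stage 1: fit a clone of each base classifier, forwarding sample_weight.
base = [GaussianNB(), LogisticRegression(max_iter=1000)]
fitted = [clone(clf).fit(X, y, sample_weight=sample_weight) for clf in base]

# Stage 2: fit the meta-classifier on the base classifiers' predictions.
meta_features = np.column_stack([clf.predict(X) for clf in fitted])
meta_clf = clone(LogisticRegression(max_iter=1000))
meta_clf.fit(meta_features, y, sample_weight=sample_weight)
print(meta_features.shape)

# An estimator whose fit() has no sample_weight argument cannot take part.
try:
    KNeighborsClassifier().fit(X, y, sample_weight=sample_weight)
    unsupported = False
except TypeError:
    unsupported = True
print(unsupported)
```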
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
-
X
: numpy array of shape [n_samples, n_features]
Training set.
-
y
: numpy array of shape [n_samples]
Target values.
Returns
-
X_new
: numpy array of shape [n_samples, n_features_new]
Transformed array.
get_params(deep=True)
Return estimator parameter names for GridSearch support.
predict(X)
Predict target values for X.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Input vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
-
labels
: array-like, shape = [n_samples] or [n_samples, n_outputs]
Predicted class labels.
predict_meta_features(X)
Get meta-features of test data.
Parameters
-
X
: numpy array, shape = [n_samples, n_features]
Test vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
-
meta-features
: numpy array, shape = [n_samples, n_classifiers]
Returns the meta-features for test data.
predict_proba(X)
Predict class probabilities for X.
Parameters
-
X
: {array-like, sparse matrix}, shape = [n_samples, n_features]
Input vectors, where n_samples is the number of samples and n_features is the number of features.
Returns
-
proba
: array-like, shape = [n_samples, n_classes], or a list of n_outputs such arrays if n_outputs > 1
Probability for each class per sample.
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric because it requires the entire label set of each sample to be predicted correctly.
Parameters
-
X
: array-like, shape = (n_samples, n_features)
Test samples.
-
y
: array-like, shape = (n_samples) or (n_samples, n_outputs)
True labels for X.
-
sample_weight
: array-like, shape = [n_samples], optional
Sample weights.
Returns
-
score
: float
Mean accuracy of self.predict(X) with respect to y.
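The quantities described for score can be reproduced with a few lines of NumPy. The arrays below are made-up values for illustration only.

```python
import numpy as np

# Single-label case: fraction of correctly predicted samples.
y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 1])
accuracy = np.mean(y_pred == y_true)
print(accuracy)  # 0.8

# With sample weights: weighted fraction of correct predictions (4 of 6).
sample_weight = np.array([1.0, 1.0, 2.0, 1.0, 1.0])
weighted_accuracy = np.average(y_pred == y_true, weights=sample_weight)

# Multi-label case (subset accuracy): a sample counts as correct only if
# every one of its labels matches. Here 2 of 3 samples match fully.
Y_true = np.array([[1, 0], [0, 1], [1, 1]])
Y_pred = np.array([[1, 0], [0, 0], [1, 1]])
subset_accuracy = np.mean(np.all(Y_pred == Y_true, axis=1))
```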
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it's possible to update each component of a nested object.
Returns
self
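The <component>__<parameter> naming can be illustrated with a scikit-learn pipeline, the nested object mentioned above. The step names scale and clf are arbitrary labels chosen for this sketch. For a StackingClassifier, the available nested names can be listed with get_params().keys().

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update parameters of individual components via <component>__<parameter>.
returned = pipe.set_params(clf__C=0.1, scale__with_mean=False)

print(returned is pipe)                  # set_params returns the estimator itself
print(pipe.named_steps["clf"].C)         # 0.1
print(pipe.get_params()["scale__with_mean"])  # False
```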