pyexplainer package
Submodules
pyexplainer.pyexplainer_pyexplainer module
pyexplainer.rulefit module
We use the RuleFit implementation as provided by the following url: https://raw.githubusercontent.com/christophM/rulefit/master/rulefit/rulefit.py
Linear model of tree-based decision rules
This module implements the RuleFit algorithm.
The module structure is the following:
RuleCondition
implements a binary feature transformation
Rule
implements a Rule composed of RuleConditions
RuleEnsemble
implements an ensemble of Rules
RuleFit
implements the RuleFit algorithm
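For orientation, here is a minimal usage sketch (not part of the original module docs): it fits a RuleFit model on synthetic regression data and inspects the extracted rules. The data, feature names, and parameter choices are illustrative assumptions.

    import numpy as np
    from pyexplainer.rulefit import RuleFit

    # Synthetic regression data (illustrative only)
    rng = np.random.RandomState(0)
    X = rng.normal(size=(200, 5))
    y = X[:, 0] + 2.0 * (X[:, 1] > 0) + rng.normal(scale=0.1, size=200)

    rf = RuleFit(rfmode='regress', max_rules=200, random_state=0)
    rf.fit(X, y, feature_names=['f0', 'f1', 'f2', 'f3', 'f4'])

    print(rf.predict(X)[:5])       # model predictions
    print(rf.get_rules().head())   # extracted rules with 'coef' and 'support'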
- class pyexplainer.rulefit.FriedScale(winsorizer=None)[source]
Bases: object
Performs scaling of linear variables according to Friedman et al. 2005 Sec 5
Each variable is first winsorized (l -> l*), then standardised as 0.4 * l* / std(l*).
Warning: this class should not be used directly.
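As a rough illustration of that scaling (a hypothetical standalone sketch, not the class's actual interface):

    import numpy as np

    def friedman_scale(x, trim_quantile=0.025):
        # Winsorize: clip x at the lower/upper trim quantiles (l -> l*)
        lo, hi = np.quantile(x, [trim_quantile, 1.0 - trim_quantile])
        x_win = np.clip(x, lo, hi)
        # Standardise so linear terms get roughly the same scale as rules
        return 0.4 * x_win / np.std(x_win)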
- class pyexplainer.rulefit.Rule(rule_conditions, prediction_value)[source]
Bases: object
Class for binary Rules built from a list of RuleConditions
Warning: this class should not be used directly.
- class pyexplainer.rulefit.RuleCondition(feature_index, threshold, operator, support, feature_name=None)[source]
Bases: object
Class for binary rule condition
Warning: this class should not be used directly.
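Conceptually (a hypothetical illustration, not this module's API), a RuleCondition is a single binary threshold test on one feature, and a Rule fires only when all of its conditions hold:

    import numpy as np

    X = np.array([[1.0, 5.0],
                  [3.0, 2.0],
                  [4.0, 7.0]])

    cond1 = X[:, 0] <= 3.0              # RuleCondition: feature 0 <= 3.0
    cond2 = X[:, 1] > 4.0               # RuleCondition: feature 1 > 4.0
    rule = (cond1 & cond2).astype(int)  # Rule: conjunction of the conditions
    print(rule)                         # [1 0 0]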
- class pyexplainer.rulefit.RuleEnsemble(tree_list, feature_names=None)[source]
Bases: object
Ensemble of binary decision rules
This class implements an ensemble of decision rules extracted from an ensemble of decision trees.
- Parameters
tree_list (List or array of DecisionTreeClassifier or DecisionTreeRegressor) – Trees from which the rules are created
feature_names (List of strings, optional (default=None)) – Names of the features
- rules
The ensemble of rules extracted from the trees
- Type
List of Rule
- transform(X, coefs=None)[source]
Transform dataset.
- Parameters
X (array-like matrix, shape=(n_samples, n_features)) – Input data to be transformed by the rules.
coefs (array-like, optional) – If supplied, makes the prediction slightly more efficient by setting rules with zero coefficients to zero without calling Rule.transform().
- Returns
X_transformed – Transformed dataset. Each column represents one rule.
- Return type
array-like matrix, shape=(n_samples, n_out)
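A short usage sketch, assuming the fitted rf from the example at the top of this module:

    # rule_ensemble holds the rules extracted during fitting; transform
    # turns X into one binary column per rule.
    X_rules = rf.rule_ensemble.transform(X)
    print(X_rules.shape)   # (n_samples, n_rules)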
- class pyexplainer.rulefit.RuleFit(tree_size=4, sample_fract='default', max_rules=2000, memory_par=0.01, tree_generator=None, rfmode='regress', lin_trim_quantile=0.025, lin_standardise=True, exp_rand_tree_size=True, model_type='rl', Cs=None, cv=3, tol=0.0001, max_iter=None, n_jobs=None, random_state=None)[source]
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
RuleFit class
- Parameters
tree_size – Number of terminal nodes in generated trees. If exp_rand_tree_size=True, this will be the mean number of terminal nodes.
sample_fract – Fraction of randomly chosen training observations used to produce each tree. FP 2004 (Sec. 2)
max_rules – Approximate total number of rules generated for fitting. Note that the actual number of rules will usually be lower than this due to duplicates.
memory_par – Scale multiplier (shrinkage factor) applied to each new tree when sequentially induced. FP 2004 (Sec. 2)
rfmode – 'regress' for regression or 'classify' for binary classification.
lin_standardise – If True, the linear terms will be standardised as per Friedman Sec 3.2 by multiplying the winsorised variable by 0.4/stdev.
lin_trim_quantile – If lin_standardise is True, this quantile will be used to trim linear terms before standardisation.
exp_rand_tree_size – If True, each boosted tree will have a different maximum number of terminal nodes based on an exponential distribution about tree_size. (Friedman Sec 3.3)
model_type – 'r': rules only; 'l': linear terms only; 'rl': both rules and linear terms.
random_state – Integer to initialise random objects and provide repeatability.
tree_generator – Optional: this object will be used as provided to generate the rules. This will override almost all the other properties above. Must be GradientBoostingRegressor or GradientBoostingClassifier, optional (default=None).
tol – The tolerance for the optimization for LassoCV or LogisticRegressionCV: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol.
max_iter – The maximum number of iterations for LassoCV or LogisticRegressionCV.
n_jobs – Number of CPUs to use during the cross validation in LassoCV or LogisticRegressionCV. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
- rule_ensemble
The rule ensemble
- Type
RuleEnsemble
- feature_names
The names of the features (columns)
- Type
list of strings, optional (default=None)
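A minimal sketch of the classification mode, using synthetic 0/1 labels (data, labels, and feature names are illustrative assumptions):

    import numpy as np
    from pyexplainer.rulefit import RuleFit

    rng = np.random.RandomState(1)
    X = rng.normal(size=(300, 4))
    y_bin = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic binary labels

    clf = RuleFit(rfmode='classify', tree_size=4, max_rules=500, random_state=0)
    clf.fit(X, y_bin, feature_names=['f0', 'f1', 'f2', 'f3'])
    print(clf.predict(X)[:5])   # predicted class labels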
- get_feature_importance(exclude_zero_coef=False, subregion=None, scaled=False)[source]
Returns feature importances for the input features to the RuleFit model.
- Parameters
exclude_zero_coef – If True, returns only the rules with an estimated coefficient not equal to zero.
subregion – If None (default), returns global importances (FP 2004 eq. 28/29); else returns importances over a subregion of inputs (FP 2004 eq. 30/31/32).
scaled – If True, will scale the importances to have a max of 100.
- Returns
return_df – DataFrame with feature names and feature importances (FP 2004 eq. 35)
- Return type
pandas.DataFrame
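For example, assuming the fitted rf from earlier:

    # Global importances, rescaled so the largest is 100
    imp = rf.get_feature_importance(exclude_zero_coef=True, scaled=True)
    print(imp.head())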
- get_rules(exclude_zero_coef=False, subregion=None)[source]
Return the estimated rules
- Parameters
exclude_zero_coef – If True, returns only the rules with an estimated coefficient not equal to zero.
subregion – If None (default), returns global importances (FP 2004 eq. 28/29); else returns importances over a subregion of inputs (FP 2004 eq. 30/31/32).
- Returns
rules – DataFrame with the rules. Column ‘rule’ describes the rule, ‘coef’ holds the coefficients, and ‘support’ holds the support of the rule in the training data set (X).
- Return type
pandas.DataFrame
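For example, to inspect the most widely supported non-zero rules of the fitted rf from earlier:

    rules = rf.get_rules(exclude_zero_coef=True)
    # 'support' is the rule's support in the training data, per the
    # description above
    print(rules.sort_values('support', ascending=False).head(10))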