Predictor Interface

The PredictorInterface defines the skeleton around Lightwood's basic functionality.

class api.predictor.PredictorInterface

Abstraction of a Lightwood predictor. The PredictorInterface encompasses how Lightwood interacts with the full ML pipeline.

Internally, the PredictorInterface class is expected to implement five methods:

  • learn: An end-to-end procedure specifying how to pre-process, featurize, and train the model(s) of interest. The expected input is raw, unprocessed data. No explicit output is returned; instead, the Predictor object “hosts” the trained model.

  • adjust: Specifies how new data is incorporated to update pre-existing model(s).

  • predict: Deploys the chosen best model, and evaluates the given data to provide target estimates.

  • predict_proba: Deploys the chosen best model and enables the user to analyze how the model makes estimates. This is available only if the underlying models expose “predict_proba” as a method (thus, only for classification).

  • save: Saves the Predictor object for further use.

A PredictorInterface is created via J{ai}son’s custom code generation. The predictor for each problem inherits from this class, with routines pre-populated to produce the expected results for that problem type.
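The inheritance pattern described above can be illustrated with a minimal sketch. This is not Lightwood's source: `SketchPredictorInterface`, `IncompletePredictor`, `ConstantPredictor`, and `can_instantiate` are hypothetical names, and using the standard-library `abc` module to enforce the five methods is an assumption made for illustration. Plain Python lists stand in for DataFrames so the example has no dependencies.

```python
from abc import ABC, abstractmethod


class SketchPredictorInterface(ABC):
    """Hypothetical stand-in for the interface; not Lightwood's source."""

    @abstractmethod
    def learn(self, data): ...

    @abstractmethod
    def adjust(self, data): ...

    @abstractmethod
    def predict(self, data): ...

    @abstractmethod
    def predict_proba(self, data): ...

    @abstractmethod
    def save(self, file_path): ...


class IncompletePredictor(SketchPredictorInterface):
    """Implements only one of the five methods, so it cannot be instantiated."""

    def learn(self, data):
        self.value = sum(data) / len(data)


class ConstantPredictor(SketchPredictorInterface):
    """Toy subclass that fills in all five methods."""

    def learn(self, data):
        self.value = sum(data) / len(data)

    def adjust(self, data):
        # Simplistic update: blend the old mean with the mean of the new data.
        self.value = (self.value + sum(data) / len(data)) / 2

    def predict(self, data):
        # One prediction per input item.
        return [self.value for _ in data]

    def predict_proba(self, data):
        raise NotImplementedError("only meaningful for classification")

    def save(self, file_path):
        with open(file_path, "w") as fh:
            fh.write(repr(self.value))


def can_instantiate(cls):
    """Return True if cls() succeeds, False if abstract methods are missing."""
    try:
        cls()
        return True
    except TypeError:
        return False
```

In this sketch, `can_instantiate(IncompletePredictor)` is false because four abstract methods are still missing, while `ConstantPredictor` can be created and used.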

adjust(data)

Adjusts a previously trained model on new data. Follows the same process as learn, except that adjust expects the best model to have already been trained.

Warning

Not tested yet; this is an experimental feature.

Parameters

data (DataFrame) – New data used to adjust a previously trained model.

Return type

None

Returns

Adjusts best-fit model
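As a rough illustration of the learn/adjust relationship, the sketch below uses a hypothetical toy model, `RunningMeanPredictor`, that predicts the mean of all target values it has seen. It is an assumption-laden example rather than Lightwood's behavior: it only shows that adjust updates existing trained state and refuses to run when no model has been trained.

```python
class RunningMeanPredictor:
    """Hypothetical toy model: predicts the mean of all target values seen."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def learn(self, targets):
        # Train from scratch, discarding any previous state.
        self.total = float(sum(targets))
        self.count = len(targets)

    def adjust(self, targets):
        # adjust() assumes learn() has already produced a trained model.
        if self.count == 0:
            raise RuntimeError("call learn() before adjust()")
        # Fold the new observations into the existing state.
        self.total += float(sum(targets))
        self.count += len(targets)

    @property
    def mean(self):
        return self.total / self.count


predictor = RunningMeanPredictor()
predictor.learn([1.0, 2.0, 3.0])  # mean of the training targets is 2.0
predictor.adjust([6.0])           # mean over all four values is 3.0
```

Both methods return nothing; the updated state lives on the predictor object, mirroring the None return type documented here.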

learn(data)

Trains the model starting from raw data. Raw data is pre-processed and cleaned accordingly. Once each column is assigned a particular type (e.g. numerical, categorical), the respective feature encoder converts it into a representation usable for training ML models. All requested ML models are then compiled and fit on the training data.

Parameters

data (DataFrame) – Data used in training the model(s).

Return type

None

Returns

Provides the best-fit model.
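The stages described for learn (clean, encode, fit every candidate, keep the best) can be sketched with toy components. Everything here is hypothetical: `clean`, `encode`, `GlobalMean`, `GroupMean`, and `ToyPredictor` are illustrative names, rows are plain dictionaries rather than a DataFrame, and choosing the best model by training error is a simplification, not a statement of how Lightwood ranks models.

```python
from statistics import mean


def clean(rows, target):
    """Pre-process: drop rows whose target value is missing."""
    return [r for r in rows if r.get(target) is not None]


def encode(rows, column):
    """Featurize: map each category in `column` to an integer code."""
    codes = {v: i for i, v in enumerate(sorted({r[column] for r in rows}))}
    return [codes[r[column]] for r in rows], codes


class GlobalMean:
    """Candidate model 1: always predicts the overall target mean."""

    def fit(self, X, y):
        self.value = mean(y)

    def predict(self, X):
        return [self.value for _ in X]


class GroupMean:
    """Candidate model 2: predicts the target mean of each encoded group."""

    def fit(self, X, y):
        self.fallback = mean(y)
        self.means = {c: mean(t for x, t in zip(X, y) if x == c) for c in set(X)}

    def predict(self, X):
        return [self.means.get(x, self.fallback) for x in X]


class ToyPredictor:
    def learn(self, rows, column, target):
        rows = clean(rows, target)
        X, self.codes = encode(rows, column)
        y = [r[target] for r in rows]
        best, best_err = None, float("inf")
        for model in (GlobalMean(), GroupMean()):
            model.fit(X, y)
            # Mean absolute error on the training data (a simplification).
            err = mean(abs(p - t) for p, t in zip(model.predict(X), y))
            if err < best_err:
                best, best_err = model, err
        # Nothing is returned; the predictor hosts the winning model.
        self.best_model = best


rows = [
    {"city": "A", "price": 10.0},
    {"city": "A", "price": 12.0},
    {"city": "B", "price": 30.0},
    {"city": "B", "price": None},  # dropped during cleaning
]
predictor = ToyPredictor()
predictor.learn(rows, column="city", target="price")
```

On this data the per-group model fits the training targets more closely than the global mean, so it is the one stored on the predictor.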

predict(data)

Takes raw data and returns predicted values from the trained model.

Parameters

data (DataFrame) – Data of shape (n_samples, n_columns) that the model(s) will evaluate to provide the target prediction.

Return type

DataFrame

Returns

A dataframe of predictions of the same length as the input.

predict_proba(data)

Takes raw data and returns a confidence/explainability metric for gauging the model’s predictive abilities.

Parameters

data (DataFrame) – Data that the model(s) will evaluate; the output provides some measure of predictive strength (e.g. how “confident” the model is).

Return type

DataFrame

Returns

A dataframe of confidence metrics, one row per datapoint in the input, with shape (n_samples, n_classes).
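To make the (n_samples, n_classes) output shape concrete, here is a minimal sketch that turns per-class scores into per-class probabilities with a softmax. Softmax is a swapped-in, generic technique chosen for illustration; whether any Lightwood model computes its confidence this way is not stated in this page. The functions and the `classes` list are hypothetical, and nested lists stand in for DataFrames.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(score_rows):
    """Return one probability row per sample: shape (n_samples, n_classes)."""
    return [softmax(row) for row in score_rows]


def predict(score_rows, classes):
    """Pick the most probable class for each sample."""
    labels = []
    for probs in predict_proba(score_rows):
        best_index = max(range(len(probs)), key=probs.__getitem__)
        labels.append(classes[best_index])
    return labels


classes = ["cat", "dog", "bird"]
scores = [[2.0, 1.0, 0.1], [0.0, 0.0, 3.0]]  # two samples, three classes
probabilities = predict_proba(scores)
labels = predict(scores, classes)
```

Each row of `probabilities` has one entry per class, while `labels` has one entry per sample, which matches the difference between the return shapes documented for predict_proba and predict.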

save(file_path)

With a provided file path, saves the Predictor instance for later use.

Parameters

file_path (str) – Location to store your Predictor instance.

Return type

None

Returns

Saves Predictor instance.
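A save-and-restore round trip might look like the sketch below. It is an assumption-based illustration: `TinyPredictor` and `load` are hypothetical, and the standard-library `pickle` module is used here only because it is widely available, not because this page says Lightwood serializes predictors with it. For simplicity the sketch persists the instance's state dictionary rather than the object itself.

```python
import os
import pickle
import tempfile


class TinyPredictor:
    """Hypothetical predictor holding a single learned value."""

    def __init__(self):
        self.value = None

    def learn(self, targets):
        self.value = sum(targets) / len(targets)

    def save(self, file_path):
        # Persist the trained state so it can be restored later.
        with open(file_path, "wb") as fh:
            pickle.dump(self.__dict__, fh)


def load(file_path):
    """Rebuild a predictor from a saved state (hypothetical helper)."""
    # Only unpickle files that come from a trusted source.
    with open(file_path, "rb") as fh:
        state = pickle.load(fh)
    restored = TinyPredictor()
    restored.__dict__.update(state)
    return restored


predictor = TinyPredictor()
predictor.learn([2.0, 4.0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "predictor.pkl")
    predictor.save(path)
    restored = load(path)
```

After the round trip, `restored` carries the same learned value as `predictor` without having to call learn again.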