PCA

Functionality for working with Principal Component Analysis (PCA) modeling.

class friendly_mvda.models.pca_model.FittedPcaModel(model: PcaModel, model_id: int)

Bases: object

property DmodX: DataFrame: Distance to the model in X space

property Hotellings_T2: DataFrame: Hotelling’s T2 from the model
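As an illustration of what a Hotelling's T2 value summarizes, the following minimal sketch computes it from a plain NumPy score matrix. It assumes the common textbook definition (sum over components of the squared score divided by the score variance). It does not use friendly_mvda, and the package may normalize differently; the function name `hotellings_t2` is hypothetical.

```python
import numpy as np

def hotellings_t2(scores: np.ndarray) -> np.ndarray:
    """Return one T2 value per observation (row) of a score matrix."""
    # Variance of each score column (assumption: sample variance, ddof=1)
    variances = scores.var(axis=0, ddof=1)
    # Sum the variance-normalized squared scores over the components
    return np.sum(scores**2 / variances, axis=1)

# Build example scores from mean-centered random data via SVD
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :2] * s[:2]          # scores for the first two components

t2 = hotellings_t2(scores)
```

Observations with a large T2 lie far from the model center in the score space.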

property Q2: DataFrame: Cumulative fraction of the variation predicted by the model, according to cross-validation, for all components.

property R2: DataFrame: Cumulative fraction of the variation explained by the model, for all components.
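To make "cumulative explained fraction" concrete, here is a minimal NumPy sketch that derives it from the singular values of mean-centered data. It is an illustration under stated assumptions, not the computation friendly_mvda performs; the function name `cumulative_r2` is hypothetical.

```python
import numpy as np

def cumulative_r2(X: np.ndarray, n_components: int) -> np.ndarray:
    """Cumulative fraction of variation explained by the first components."""
    Xc = X - X.mean(axis=0)                     # mean-center each variable
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values only
    explained = s**2 / np.sum(s**2)             # fraction per component
    return np.cumsum(explained)[:n_components]  # running total

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
r2 = cumulative_r2(X, n_components=4)
```

The values never decrease as components are added, and they reach 1 once every possible component is included.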

property get_transform_information: tuple[list[str], list[float], list[str], list[list[float]]]

Get the information used for the transformation of the data.

returns:

tuple: (scale_type, scale_weights, transform_type, transform_constants)

scale_type: list of scaling types used for each variable

scale_weights: list of scaling weights used for each variable

transform_type: list of transformation types used for each variable

transform_constants: list of transformation constants used for each variable

property get_transformed_used_data: DataFrame

Get the data used for the model after transformation and scaling.

example:

>>> from friendly_mvda.models.pca_model import PcaModel, FittedPcaModel
>>> pca_model = PcaModel.open_model("pca_model.usp")
>>> pca_model.change_scaling("uv")
>>> fitted_model = pca_model.fit(components=3)
>>> transformed_data = fitted_model.get_transformed_used_data  # documented as a property, so no call parentheses

Here the transformed and scaled data used in the PCA model is retrieved. With uv scaling and no transformation, the returned data is centered around 0 and has unit variance.
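The effect of uv (unit variance) scaling described above can be sketched with plain NumPy. This is a minimal illustration, assuming the sample standard deviation (ddof=1) is used; whether friendly_mvda uses the same convention is not stated here, and the function name `uv_scale` is hypothetical.

```python
import numpy as np

def uv_scale(X: np.ndarray) -> np.ndarray:
    """Center each column at 0 and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)   # assumption: sample standard deviation
    return (X - mean) / std

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, scale=3.0, size=(100, 3))
Z = uv_scale(X)

print(Z.mean(axis=0))          # close to 0 for every column
print(Z.std(axis=0, ddof=1))   # close to 1 for every column
```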

property loadings: DataFrame: Loadings from the model

property loadings_correlation_scaled: DataFrame: Loadings (correlation scaled) from the model

property number_of_components: int: Get the number of components used

classmethod open_model(path: str, model_id: int = 1) → FittedPcaModel: Open an existing model.

predict(data: DataFrame) → DataFrame

Use the model to get predictions for new observations.

Variables are identified by name, so the columns of data must be named after the variables in the model.

Parameters

data : The observations to get predictions for.

Returns

score : The prediction of the observations.
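The idea of identifying variables by name can be illustrated with pandas and NumPy alone. The sketch below projects new observations onto loadings after aligning the columns by name, so column order in the input does not matter. It assumes uv scaling and a plain SVD-based PCA; it is not the friendly_mvda implementation, and `project`, `p1`, `t1` and similar names are hypothetical.

```python
import numpy as np
import pandas as pd

# Fit a small two-component PCA on uv-scaled training data
rng = np.random.default_rng(2)
train = pd.DataFrame(rng.normal(size=(30, 3)), columns=["a", "b", "c"])
mean, std = train.mean(), train.std(ddof=1)
_, _, vt = np.linalg.svd(((train - mean) / std).to_numpy(), full_matrices=False)
loadings = pd.DataFrame(vt[:2].T, index=train.columns, columns=["p1", "p2"])

def project(new_data: pd.DataFrame) -> pd.DataFrame:
    """Compute scores for new observations, matching variables by name."""
    aligned = new_data[loadings.index]   # reorder columns to the model's order
    scaled = (aligned - mean) / std      # apply the training scaling
    values = scaled.to_numpy() @ loadings.to_numpy()
    return pd.DataFrame(values, index=new_data.index, columns=["t1", "t2"])

# Columns arrive in a different order but are matched by name
new_obs = train.iloc[:5][["c", "a", "b"]]
scores = project(new_obs)
```

Because alignment is done by column name, shuffled columns give the same scores as columns in the original order.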

refit(components: int = -1) → None: Refit the model with a different number of components. The previously fitted model will no longer work.

property residuals: DataFrame: Residuals from the model

save(path: str) → None

Save the model.

parameters:

path: the path to the saved file.

property scores: DataFrame: Scores from the model

set_limit_confidence_level(confidence_level: float) → None: Set the confidence level of the limits

transform_data_with_model(data: DataFrame) → DataFrame: Transform the data with the model.

class friendly_mvda.models.pca_model.PcaModel(project: Project)

Bases: object

change_missing_tolerance(tolerance: float) → None: Change the tolerance for when data will be excluded because of missing values

change_scaling(scal_type: str, variable: str | None = None) → None: Change the scaling of the model.

fit(components: int = -1) → FittedPcaModel

Fit the model.

parameters:

components: the number of components, -1 is auto fit

returns:

FittedPcaModel: the fitted PCA model.

example:

>>> from friendly_mvda.models.pca_model import PcaModel
>>> model = PcaModel(project)
>>> model.remove_variables(["pH"])
>>> model.remove_observation([46, 92])
>>> model.change_scaling("uv")
>>> model.change_missing_tolerance(30)
>>> fitted_model = model.fit()

Here the model is modified before fitting: the variable “pH” is removed, observations 46 and 92 are removed, the scaling is changed to uv, and the missing tolerance is changed to 30 %. Finally, the model is fitted using auto fit for the number of components.

classmethod open_model(path: str, model_id: int = 1) → PcaModel

Open an existing model.

parameters:

path: the path to the saved file.

model_id: the index of the model. Can be used if the file contains multiple models.

remove_observation(observations: list[int]) → None: Remove observations from the model

remove_variables(variables: list[str]) → None: Remove variables from the model

save(path: str) → None

Save the model.

parameters:

path: the path to the saved file.