PCA
Functionality for working with Principal Component Analysis (PCA) modeling.
- class friendly_mvda.models.pca_model.FittedPcaModel(model: PcaModel, model_id: int)
Bases: object
- property DmodX: DataFrame
Distance to the model in X space
- property Hotellings_T2: DataFrame
Hotelling’s T2 from the model
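For orientation, Hotelling's T2 for an observation is commonly defined as the sum, over the components, of the squared score divided by the variance of that component's scores. The sketch below implements that textbook definition in plain Python. The function name `hotellings_t2` is hypothetical, and friendly_mvda may compute the statistic differently.

```python
from statistics import variance


def hotellings_t2(scores):
    """Textbook Hotelling's T2 for each observation.

    scores: list of rows, one row per observation and one value per component.
    Hypothetical helper for illustration; not part of friendly_mvda.
    """
    n_components = len(scores[0])
    # Sample variance (n - 1 denominator) of each score column.
    col_var = [variance([row[a] for row in scores]) for a in range(n_components)]
    # Sum of squared scores, each weighted by the inverse variance of its component.
    return [
        sum(row[a] ** 2 / col_var[a] for a in range(n_components))
        for row in scores
    ]
```

Observations with a large value lie far from the model center in the score space.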
- property Q2: DataFrame
Cumulative predicted fraction of the variation, according to cross-validation. Returns data for all components.
- property R2: DataFrame
Cumulative explained fraction of the variation. Returns data for all components.
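To illustrate what "cumulative" means here, the sketch below accumulates the sum of squares explained by each component and divides by the total sum of squares. The function name `cumulative_r2` and the input numbers are made up for illustration. This is the generic definition, not necessarily the exact computation behind the R2 property.

```python
from itertools import accumulate


def cumulative_r2(explained_ss, total_ss):
    """Cumulative explained fraction after each component.

    explained_ss: sum of squares explained by component 1, 2, ... (illustrative input).
    total_ss: total sum of squares of the scaled data.
    """
    # accumulate() yields the running sum after each component.
    return [running / total_ss for running in accumulate(explained_ss)]


# Hypothetical values: three components explaining 50, 30 and 10 units out of 100.
r2_cum = cumulative_r2([50.0, 30.0, 10.0], 100.0)
```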
- property get_transform_information: tuple[list[str], list[float], list[str], list[list[float]]]
Get the information used for the transformation of the data.
returns:
tuple: (scale_type, scale_weights, transform_type, transform_constants)
scale_type: list of scaling types used for each variable
scale_weights: list of scaling weights used for each variable
transform_type: list of transformation types used for each variable
transform_constants: list of transformation constants used for each variable
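Because the tuple holds four parallel lists, it can be convenient to regroup it into one record per variable. The sketch below does that in plain Python. The helper name `describe_transform_info` is hypothetical, and it assumes that the lists follow the same order as a list of variable names supplied by the caller, which the documentation above does not state.

```python
def describe_transform_info(info, variable_names):
    """Regroup (scale_type, scale_weights, transform_type, transform_constants)
    into one dict per variable.

    Hypothetical helper; assumes every list is ordered like variable_names.
    """
    scale_type, scale_weights, transform_type, transform_constants = info
    lists = (scale_type, scale_weights, transform_type, transform_constants)
    # Fail loudly instead of silently truncating mismatched lists.
    if any(len(values) != len(variable_names) for values in lists):
        raise ValueError("every list must have one entry per variable")
    return [
        {
            "variable": name,
            "scale_type": s_type,
            "scale_weight": s_weight,
            "transform_type": t_type,
            "transform_constants": t_consts,
        }
        for name, s_type, s_weight, t_type, t_consts in zip(variable_names, *lists)
    ]
```

With a fitted model, `info` would be the value of `fitted_model.get_transform_information`.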
- property get_transformed_used_data: DataFrame
Get the data used for the model after transformation and scaling
example:
>>> from friendly_mvda.models.pca_model import PcaModel, FittedPcaModel
>>> pca_model = PcaModel.open_model("pca_model.usp")
>>> pca_model.change_scaling("uv")
>>> fitted_model = pca_model.fit(components=3)
>>> # get_transformed_used_data is documented as a property, so it is read without parentheses
>>> transformed_data = fitted_model.get_transformed_used_data
Here the transformed and scaled data used in the PCA model is retrieved. With uv scaling and no transformation, the returned data is centered around 0 and has unit variance.
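The effect of uv (unit variance) scaling on a single variable can be sketched in plain Python: subtract the mean, then divide by the standard deviation. The function name `uv_scale` is hypothetical. Using the sample standard deviation (n - 1 denominator) is an assumption, and the library may handle weights or missing values differently.

```python
from statistics import mean, stdev


def uv_scale(column):
    """Center a column at 0 and scale it to unit variance.

    Hypothetical illustration of uv scaling; not the friendly_mvda implementation.
    """
    m = mean(column)
    s = stdev(column)  # sample standard deviation, n - 1 denominator (assumed)
    return [(x - m) / s for x in column]


# Made-up values for one variable.
scaled = uv_scale([2.0, 4.0, 6.0, 8.0])
```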
- property loadings: DataFrame
Loadings from the model
- property loadings_correlation_scaled: DataFrame
Loadings (correlation scaled) from the model
- property number_of_components: int
Get the number of components used
- classmethod open_model(path: str, model_id: int = 1) → FittedPcaModel
Open an existing model.
- predict(data: DataFrame) → DataFrame
Use the model to predict new observations. Variables are identified by their names.
Parameters
data : The observations to predict.
Returns
score : The predictions for the observations.
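In generic PCA terms, predicting the scores of a new observation means scaling it with the parameters of the training data and projecting it onto the loadings. The sketch below shows that idea in plain Python. The function name `project_observation` and its arguments are hypothetical, and the steps inside `predict` itself are not described in this documentation.

```python
def project_observation(x, means, stds, loadings):
    """Project one observation onto PCA loadings.

    x, means, stds: one value per variable, all in the same variable order.
    loadings: one row per variable, one column per component.
    Hypothetical illustration of the generic technique.
    """
    # Scale the observation with the training mean and standard deviation.
    scaled = [(xi - m) / s for xi, m, s in zip(x, means, stds)]
    n_components = len(loadings[0])
    # Score for component a = dot product of the scaled row with loading column a.
    return [
        sum(scaled[k] * loadings[k][a] for k in range(len(scaled)))
        for a in range(n_components)
    ]
```

The variable order matters in this sketch, which is why matching variables by name, as `predict` does, is the safer interface.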
- refit(components: int = -1) → None
Refit the model with a different number of components. The previously fitted model will no longer work.
- property residuals: DataFrame
Residuals from the model
- save(path: str) → None
Save the model.
parameters:
path: the path to the saved file.
- property scores: DataFrame
Scores from the model
- set_limit_confidence_level(confidence_level: float) → None
Set the confidence level of the limits
- transform_data_with_model(data: DataFrame) → DataFrame
Transform the data with the model.
- class friendly_mvda.models.pca_model.PcaModel(project: Project)
Bases: object
- change_missing_tolerance(tolerance: float) → None
Change the tolerance for when data will be excluded because of missing values.
- change_scaling(scal_type: str, variable: str | None = None) → None
Change the scaling of the model.
- fit(components: int = -1) → FittedPcaModel
Fit the model.
parameters:
components: the number of components, -1 is auto fit
returns:
FittedPcaModel: the fitted PCA model.
example:
>>> from friendly_mvda.models.pca_model import PcaModel
>>> model = PcaModel(project)
>>> model.remove_variables(["pH"])
>>> model.remove_observation([46, 92])
>>> model.change_scaling("uv")
>>> model.change_missing_tolerance(30)
>>> fitted_model = model.fit()
Here the model is modified before fitting: the variable “pH” is removed, observations 46 and 92 are removed, the scaling is changed to uv, and the missing tolerance is set to 30 %. Finally, the model is fitted with the number of components chosen automatically.
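The configure, fit and save steps can be wrapped in a small helper, sketched below. The helper name `fit_and_save` is hypothetical. It only calls methods documented on this page (`change_scaling`, `fit`, and `save` on the fitted model) and assumes that `model` is a `PcaModel` or an object with the same methods.

```python
def fit_and_save(model, path, scaling="uv", components=-1):
    """Apply a scaling, fit the model, and save the fitted result.

    model: assumed to be a PcaModel (or any object with the same methods).
    path: where the fitted model is saved.
    components: number of components, -1 for auto fit as documented for fit().
    Hypothetical convenience wrapper; not part of friendly_mvda.
    """
    model.change_scaling(scaling)
    fitted = model.fit(components=components)
    fitted.save(path)
    return fitted
```

A call such as `fit_and_save(PcaModel.open_model("pca_model.usp"), "fitted.usp", components=3)` would then mirror the workflow shown above, where the output file name is made up.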
- classmethod open_model(path: str, model_id: int = 1) → PcaModel
Open an existing model.
parameters:
path: the path to the saved file.
model_id: the index of the model. Can be used if the file contains multiple models.
- remove_observation(observations: list[int]) → None
Remove observations from the model.
- remove_variables(variables: list[str]) → None
Remove variables from the model.
- save(path: str) → None
Save the model.
parameters:
path: the path to the saved file.