pyxpcm.pcm
- class pyxpcm.pcm(K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')[source]
Profile Classification Model class constructor
Consume and return xarray objects.

- __init__(K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')[source]
Create the PCM instance
- Parameters:
- K: int
The number of classes, or clusters, in the classification model.
- features: dict()
The vertical axis to use for each feature, e.g. {'temperature': np.arange(-2000, 0, 1)}
- scaling: int (default: 1)
Define the scaling method:
0: No scaling
1: Center on sample mean and scale by sample std
2: Center on sample mean only
- reduction: int (default: 1)
Define the dimensionality reduction method:
0: No reduction
1: Reduction using :class:`sklearn.decomposition.PCA`
- maxvar: float (default: 15)
Maximum feature variance to preserve in the reduced dataset using sklearn.decomposition.PCA, in %.
- classif: str (default: 'gmm')
Define the classification method. The only method available as of now is a Gaussian Mixture Model. See sklearn.mixture.GaussianMixture for more details.
- covariance_type: str (default: 'full')
Define the type of covariance matrix shape to be used in the default GMM classifier. It can be 'full' (default), 'tied', 'diag' or 'spherical'.
- verb: boolean (default: False)
More verbose output
- timeit: boolean (default: False)
Register time of operation for performance evaluation
- timeit_verb: boolean (default: False)
Print time of operation during execution
- chunk_size: 'auto' or int
Sampling chunk size of the array of features after pre-processing
- backend: str
Statistics library backend: 'sklearn' (default) or 'dask_ml'
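The scaling and reduction options correspond to standard preprocessing transforms. Below is a minimal numpy sketch of what these options do conceptually, not pyxpcm's internal implementation; the array X is a hypothetical (sample, feature) matrix:

```python
import numpy as np

def scale(X, scaling=1):
    """Apply the PCM scaling option to a (sample, feature) array."""
    if scaling == 0:                          # 0: no scaling
        return X
    mean = X.mean(axis=0)
    if scaling == 2:                          # 2: center on sample mean only
        return X - mean
    return (X - mean) / X.std(axis=0)         # 1: center and scale by sample std

def reduce(X, maxvar=99.9):
    """Keep enough principal components to preserve maxvar % of the variance."""
    Xc = X - X.mean(axis=0)
    # SVD-based PCA: squared singular values are proportional to component variances
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s**2 / (len(X) - 1)
    cum = np.cumsum(var) / var.sum() * 100    # cumulative explained variance in %
    n = int(np.searchsorted(cum, maxvar) + 1) # smallest n reaching maxvar %
    return Xc @ Vt[:n].T                      # projected, reduced dataset

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * [1, 2, 3, 4, 5]
Xs = scale(X, scaling=1)
Xr = reduce(X, maxvar=99.9)
```

In pyxpcm these two steps are applied per feature during pre-processing, before the classifier is fitted.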
Methods

- __init__(K, features[, scaling, reduction, ...]): Create the PCM instance
- bic(ds[, features, dim]): Compute the Bayesian information criterion for the current model on the input dataset
- display([deep]): Display detailed parameters of the PCM. This is not a get_params method because it does not return a dictionary. Set the boolean option 'deep' to True to display all properties.
- fit(ds[, features, dim]): Estimate PCM parameters
- fit_predict(ds[, features, dim, inplace, name]): Estimate PCM parameters and predict classes
- predict(ds[, features, dim, inplace, name]): Predict labels for profile samples
- predict_proba(ds[, features, dim, inplace, ...]): Predict the posterior probability of each component given the data
- preprocessing(ds[, features, dim, action, mask]): Dataset pre-processing of feature(s)
- preprocessing_this(da[, dim, feature_name, ...]): Pre-process data before anything
- ravel(da[, dim, feature_name]): Extract from an N-d array a 2-d array X(feature, sample) and the vertical dimension z
- score(ds[, features, dim]): Compute the per-sample average log-likelihood of the given data
- to_netcdf(ncfile, **ka): Save a PCM to a netcdf file
- unravel(ds, sampling_dims, X): Create a DataArray from a numpy array and sampling dimensions
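Since the default classifier is sklearn.mixture.GaussianMixture, the fit / predict / predict_proba / score / bic methods above mirror the scikit-learn estimator API applied to the pre-processed feature array. A minimal sketch of the equivalent direct scikit-learn calls (illustrative only; pyxpcm wraps these with the xarray pre-processing, and the data here is synthetic):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical pre-processed feature array: 200 profiles, 3 reduced features,
# drawn from two well-separated clusters
X = np.vstack([rng.normal(-2, 0.5, (100, 3)), rng.normal(2, 0.5, (100, 3))])

K = 2
gmm = GaussianMixture(n_components=K, covariance_type='full', random_state=0).fit(X)

labels = gmm.predict(X)       # class label per profile (cf. pcm.predict)
proba = gmm.predict_proba(X)  # posterior probability per class (cf. pcm.predict_proba)
ll = gmm.score(X)             # per-sample average log-likelihood (cf. pcm.score)
bic = gmm.bic(X)              # Bayesian information criterion (cf. pcm.bic)
```

pyxpcm's added value over these raw calls is the dataset handling: it ravels N-d xarray variables into the X(feature, sample) array, fits or predicts, then unravels the labels back onto the original sampling dimensions.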
Attributes

- F
Return the number of features
- K
Return the number of classes
- backend
Return the name of the statistics backend
- features
Return the features definition dictionary
- fitstats
Estimator fit properties
- plot
Access plotting functions
- stat
Access statistics functions
- timeit
Return a pandas.DataFrame with the execution time of each method called on this instance