"Parameters" are reported for non-parametric distribution (GaussianKDE); it is just a copy of the data #470

@npatki

Environment Details

  • Copulas version: 0.12.2
  • Python version: 3.11
  • Operating System: Linux

Error Description

As first described in #469, it seems that whenever Copulas is asked to print parameters for a fitted GaussianKDE distribution, it just prints out a copy of the data that was fitted.

In the code below, the final column (column z) is fitted to a GaussianKDE distribution.

from copulas.datasets import sample_trivariate_xyz
from copulas.multivariate import GaussianMultivariate

data = sample_trivariate_xyz()
dist = GaussianMultivariate()
dist.fit(data)
parameters = dist.to_dict()
univariates = parameters['univariates']
print(univariates[2])
{'dataset': [0.638689008563623, 1.058121237066397, 0.3725063445214631, 0.687369594994837, -0.8810681732344304, -0.7121672205062004, 5.050261904362624, ...
  'type': 'copulas.univariate.gaussian_kde.GaussianKDE'

The data seems to be just the exact values in column z.
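One way to confirm this rather than eyeball it is to compare the reported 'dataset' against the column value by value. The helper below is a minimal sketch: `dataset_matches_column` is a hypothetical name, not part of Copulas, and the toy values are made up so the block runs without the library installed.

```python
import math


def dataset_matches_column(univariate_params, column_values, rel_tol=1e-12):
    """Return True if the reported 'dataset' equals the column, value by value.

    Hypothetical helper for illustration; not a Copulas function.
    """
    reported = univariate_params.get('dataset')
    if reported is None:
        return False
    column_values = list(column_values)
    if len(reported) != len(column_values):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(reported, column_values)
    )


# Toy stand-ins for `univariates[2]` and column z (made-up values).
toy_params = {
    'dataset': [0.5, 1.25, -0.75],
    'type': 'copulas.univariate.gaussian_kde.GaussianKDE',
}
toy_column = [0.5, 1.25, -0.75]
print(dataset_matches_column(toy_params, toy_column))
```

With the objects from the snippet above, the call would be `dataset_matches_column(univariates[2], data['z'])`. This assumes no sample size was set on the univariate; if one were, the fit step would resample and the values would not be expected to match the column.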

Expected Behavior

It's unexpected that the entire column's data would be reported at this step.

I would expect that when printing out the distribution, it would only show the 'type' of distribution and nothing else.

print(univariates[2])
{ 'type': 'copulas.univariate.gaussian_kde.GaussianKDE' }

It seems like the "parameters" are set to the data in the fit portion:

def _fit(self, X):
    if self._sample_size:
        X = gaussian_kde(X, bw_method=self.bw_method, weights=self.weights).resample(
            self._sample_size
        )
    self._params = {'dataset': X.tolist()}
    self._model = self._get_model()

Ideally, the _params assigned to the GaussianKDE should be None, since GaussianKDE is a non-parametric distribution. Whatever info we need to save the state of the GaussianKDE should be saved under a different name and not exposed as parameters.
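As a rough illustration of that separation, the sketch below keeps the fitted data under a different attribute and reports only the type. Everything here is hypothetical: `NonParametricSketch` and `_fitted_state` are made-up names, the class is not the Copulas `GaussianKDE`, and it does not build an actual KDE model.

```python
class NonParametricSketch:
    """Hypothetical stand-in for a non-parametric univariate (not the Copulas class)."""

    def __init__(self):
        self._params = None        # nothing to report as parameters
        self._fitted_state = None  # hypothetical name for the internal state

    def fit(self, values):
        # Keep what is needed to rebuild the model, but not under `_params`.
        self._fitted_state = {'dataset': list(values)}

    def to_dict(self):
        # Only the distribution type is reported; the data is not exposed.
        return {'type': type(self).__name__}


sketch = NonParametricSketch()
sketch.fit([0.5, 1.25, -0.75])
reported = sketch.to_dict()
print(reported)
```

If the reported dictionary is also what gets used to restore a fitted model, the internal state would still need to be serialized somewhere; this sketch leaves that question open.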

Labels: bug (There is an error in the code that needs to be fixed)