twinify.dpvi package

class twinify.dpvi.dpvi_model.DPVIModel(model: Callable, guide: Optional[Callable] = None, clipping_threshold: float = 1.0, num_epochs: int = 1000, subsample_ratio: float = 0.01)
DefaultAutoGuideType

alias of numpyro.infer.autoguide.AutoDiagonalNormal

fit(data: pandas.core.frame.DataFrame, rng: chacha.defs.ChaChaState, epsilon: float, delta: float, silent: bool = False, verbose: bool = False) -> twinify.dpvi.dpvi_result.DPVIResult

Compute the parameter posterior (approximation) for a given data set, hyperparameters and privacy bounds.

Parameters
  • data – A pandas.DataFrame containing (sensitive) data.

  • rng – A seeded state for the d3p.random secure random number generator.

  • epsilon – Privacy bound ε.

  • delta – Privacy bound δ.

  • kwargs – Optional (model specific) hyperparameters.
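This page does not describe how clipping_threshold, epsilon and delta interact during inference. As a rough illustration only, the following is a minimal sketch of the generic "clip each per-example gradient, sum, add Gaussian noise" step used by differentially private gradient methods. It is not taken from twinify: the helper names clip_to_norm and noisy_clipped_sum and the noise_scale argument are hypothetical, and in practice the noise scale would be derived from epsilon and delta.

```python
import math
import random
from typing import List, Sequence


def clip_to_norm(grad: Sequence[float], clipping_threshold: float) -> List[float]:
    """Rescale grad so that its L2 norm is at most clipping_threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clipping_threshold / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def noisy_clipped_sum(
    per_example_grads: Sequence[Sequence[float]],
    clipping_threshold: float,
    noise_scale: float,  # hypothetical: would be calibrated from epsilon and delta
    rng: random.Random,
) -> List[float]:
    """Sum clipped per-example gradients and perturb each coordinate."""
    total = [0.0] * len(per_example_grads[0])
    for grad in per_example_grads:
        clipped = clip_to_norm(grad, clipping_threshold)
        total = [t + c for t, c in zip(total, clipped)]
    # Noise standard deviation is proportional to the clipping threshold,
    # because the threshold bounds each record's contribution to the sum.
    sigma = noise_scale * clipping_threshold
    return [t + rng.gauss(0.0, sigma) for t in total]


grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
print(noisy_clipped_sum(grads, clipping_threshold=1.0, noise_scale=1.0, rng=random.Random(0)))
```

Clipping bounds how much any single record can influence an update, which is what lets a fixed amount of noise yield a privacy guarantee. The secure d3p.random generator mentioned above is replaced here by Python's random module purely to keep the sketch self-contained.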

class twinify.dpvi.dpvi_result.DPVIResult(model: Callable, guide: Callable, parameters: Dict[str, numpy.typing.ArrayLike], privacy_parameters: twinify.dpvi.PrivacyLevel, final_elbo: float, data_description: twinify.dataframe_data.DataDescription)
property final_elbo: float

The final ELBO achieved by the inference (on the training data).

generate(rng: chacha.defs.ChaChaState, num_parameter_samples: int, num_data_per_parameter_sample: int = 1, single_dataframe: bool = True) -> Union[Iterable[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame]

Draws samples from the parameter posterior (approximation) and generates the given number of data points per parameter sample.

By default, returns a single data frame sampled from the posterior predictive distribution, i.e., for each data record a parameter value is first drawn from the parameter posterior distribution, then the data record is sampled from the model conditioned on that parameter value. In this case, num_parameter_samples determines the number of data records in the returned data frame.

This behavior can be customized to sample more than one data record per parameter sample by setting argument num_data_per_parameter_sample to a value larger than 1, in which case the total number of records returned is num_parameter_samples * num_data_per_parameter_sample.

Setting single_dataframe = False causes the method to return an iterable collection of data frames, each of which contains all data records sampled for a single parameter sample, i.e., in this case the method returns num_parameter_samples data frames, each containing num_data_per_parameter_sample records.

Each of the data frames “looks” like the original data this InferenceResult was obtained from, i.e., it has identical column names and categorical labels (if any).

Parameters
  • rng – A seeded state for the d3p.random secure random number generator.

  • num_parameter_samples – How often to sample from the parameter posterior approximation.

  • num_data_per_parameter_sample – How many data points to generate for each parameter sample.

  • single_dataframe – Whether to combine data samples into a single data frame or return separate data frames.
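The interplay of num_parameter_samples, num_data_per_parameter_sample and single_dataframe can be illustrated with a toy stand-in. The sketch below does not call twinify: toy_generate is a hypothetical function, the Gaussian "posterior" and "model" are made up, and lists of dicts stand in for pandas data frames. It only mirrors the two-step sampling and the record counts described for generate.

```python
import random
from typing import Dict, List, Union

Record = Dict[str, float]
Frame = List[Record]  # stand-in for a pandas.DataFrame


def toy_generate(
    rng: random.Random,
    num_parameter_samples: int,
    num_data_per_parameter_sample: int = 1,
    single_dataframe: bool = True,
) -> Union[Frame, List[Frame]]:
    """Toy posterior predictive sampler mirroring the documented arguments."""
    frames: List[Frame] = []
    for _ in range(num_parameter_samples):
        # Step 1: draw one parameter value from a made-up posterior approximation.
        mu = rng.gauss(0.0, 1.0)
        # Step 2: draw records from a made-up model conditioned on that value.
        frame = [{"x": rng.gauss(mu, 0.5)} for _ in range(num_data_per_parameter_sample)]
        frames.append(frame)
    if single_dataframe:
        # Concatenate the per-parameter frames into a single frame.
        return [record for frame in frames for record in frame]
    return frames


combined = toy_generate(random.Random(0), num_parameter_samples=4, num_data_per_parameter_sample=3)
separate = toy_generate(random.Random(0), 4, 3, single_dataframe=False)
print(len(combined))                    # 12 records in one frame
print(len(separate), len(separate[0]))  # 4 frames of 3 records each
```

With the same seed, the combined frame is simply the concatenation of the separate frames, so the single_dataframe flag changes only how the records are grouped, not which records are drawn.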

property privacy_level: float

The privacy parameters: epsilon, delta and standard deviation of noise applied during inference.