jax_privacy.auditing.CanaryScoreAuditor

class jax_privacy.auditing.CanaryScoreAuditor(in_canary_scores, out_canary_scores)[source]

Bases: object

Class for auditing privacy based on attack scores.

To use this library, create a CanaryScoreAuditor from the attack scores of held-in and held-out canaries. An attack score can be any value for which held-in canaries are expected to score higher, for example the log-likelihood or the likelihood ratio relative to the pretrained model. The auditor can then compute privacy metrics, including various epsilon lower bounds from the literature, the maximum TPR at a given FPR, and the area under the receiver operating characteristic (ROC) curve.

Example Usage:
>>> import numpy as np
>>> from jax_privacy.auditing import CanaryScoreAuditor
>>> out_canary_scores = np.arange(100)
>>> in_canary_scores = np.arange(100) + 9.5
>>> auditor = CanaryScoreAuditor(in_canary_scores, out_canary_scores)
>>> tpr_at_low_fpr = auditor.tpr_at_given_fpr(0.01)
>>> float(round(tpr_at_low_fpr, 2))
0.11
>>> auroc = auditor.attack_auroc()
>>> float(auroc)
0.595

Initializes the CanaryScoreAuditor.

IMPORTANT: We consider decision rules that classify examples that score higher than the threshold as “in”. If held-in canaries are expected to have lower scores than held-out canaries, then negate the score before constructing the auditor.

Parameters:
  • in_canary_scores (Union[_Buffer, _SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], complex, bytes, str, _NestedSequence[complex | bytes | str]]) – Attack scores of held-in canaries.

  • out_canary_scores (Union[_Buffer, _SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], complex, bytes, str, _NestedSequence[complex | bytes | str]]) – Attack scores of held-out canaries.
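
The sign convention above can be illustrated with a minimal sketch. It assumes the attack produces per-canary losses, where a lower loss suggests the canary was trained on. The loss values and variable names are hypothetical, and plain lists stand in for arrays.

```python
# Hypothetical per-canary losses: lower loss suggests membership.
in_canary_losses = [0.21, 0.35, 0.18, 0.40]
out_canary_losses = [0.90, 1.10, 0.75, 1.30]

# Negate so that held-in canaries receive the higher scores,
# as the threshold rule "score > threshold means in" expects.
in_canary_scores = [-loss for loss in in_canary_losses]
out_canary_scores = [-loss for loss in out_canary_losses]

# The negated scores would then be passed to the constructor:
# auditor = CanaryScoreAuditor(in_canary_scores, out_canary_scores)
```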

Methods

__init__

Initializes the CanaryScoreAuditor.

attack_auroc

Computes the area under the ROC curve from the attack scores.

epsilon_clopper_pearson

Finds epsilon lower bound from scores of held-in/held-out canaries.

epsilon_from_gdp

Calculates an estimate for epsilon with GDP.

epsilon_one_run

Computes lower bound on epsilon for a single round of auditing.

epsilon_one_run_fdp

Computes lower bound on epsilon for a single round of auditing.

epsilon_raw_counts

Estimates epsilon from raw count statistics of seen/unseen canaries.

max_accuracy

Computes the maximum accuracy achievable by a threshold-based classifier.

tpr_at_given_fpr

Computes maximum TPR at a given FPR.

epsilon_clopper_pearson(significance, delta=0, one_sided=True, *, threshold_strategy=<jax_privacy.auditing.Bonferroni object>)[source]

Finds epsilon lower bound from scores of held-in/held-out canaries.

Described in https://arxiv.org/pdf/2101.04535.

Parameters:
  • significance (float) – Allowed probability of failure (one minus confidence).

  • delta (float) – Approximate DP delta.

  • one_sided (bool) – Whether to use only TPR/FPR (vs. max of TPR/FPR and TNR/FNR).

  • threshold_strategy (ThresholdStrategy) – How to select the threshold to use for the epsilon estimate.

Return type:

float

Returns:

Optimal epsilon lower bound.
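
To make the idea concrete, here is a rough standard-library sketch of a Clopper-Pearson style bound at a single, fixed threshold. It is not the library's implementation. It assumes the counts of true and false positives at that threshold are already known, splits the failure probability evenly between the two intervals, and uses the inequality TPR <= exp(epsilon) * FPR + delta implied by (epsilon, delta)-DP. All function names are hypothetical, and the search over thresholds governed by threshold_strategy is not modeled.

```python
import math


def binom_cdf(k, n, p):
    """P[Binomial(n, p) <= k]."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson_lower(k, n, alpha, iters=60):
    """Lower confidence bound on p after k successes in n trials."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        # P[Binomial(n, mid) >= k] grows with mid; find where it reaches alpha.
        if 1.0 - binom_cdf(k - 1, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo


def clopper_pearson_upper(k, n, alpha, iters=60):
    """Upper confidence bound on p after k successes in n trials."""
    if k == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        # P[Binomial(n, mid) <= k] shrinks with mid; find where it reaches alpha.
        if binom_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def epsilon_lower_bound(tp, n_in, fp, n_out, significance, delta=0.0):
    """Sketch: epsilon >= ln((TPR_lo - delta) / FPR_hi) at one fixed threshold."""
    # Assumption: split the failure probability across the two bounds.
    tpr_lo = clopper_pearson_lower(tp, n_in, significance / 2)
    fpr_hi = clopper_pearson_upper(fp, n_out, significance / 2)
    if tpr_lo - delta <= 0.0:
        return 0.0
    return max(0.0, math.log((tpr_lo - delta) / fpr_hi))
```

When the attack is no better than chance (for example 50 true positives and 50 false positives out of 100 each), the sketch returns 0, since the lower bound on TPR falls below the upper bound on FPR.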

epsilon_raw_counts(min_count=50, delta=0, one_sided=True, *, bootstrap_params=None)[source]

Estimates epsilon from raw count statistics of seen/unseen canaries.

min_count is the minimum number of FP (or FN, if not one_sided) required to consider a threshold. If min_count is too high relative to the number of canaries, the estimate will be biased towards zero. If it is too low, the estimate will have high variance.

Parameters:
  • min_count (int) – Only consider thresholds with at least this many TP/FP (TN/FN).

  • delta (float) – Approximate DP delta.

  • one_sided (bool) – Whether to use only TPR/FPR (vs. max of TPR/FPR and TNR/FNR).

  • bootstrap_params (BootstrapParams | None) – If provided, compute and return bootstrapped quantiles of the estimate. Note that this should not be interpreted as a formal confidence interval on the true epsilon, merely a confidence interval of the estimate.

Return type:

float | ndarray

Returns:

Epsilon estimate ln(TPR/FPR).
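
The role of min_count can be seen in a small pure-Python sketch of the one-sided estimate. This is an illustration under stated assumptions, not the library's code: it tries every observed score as a threshold, requires at least min_count true positives and false positives, and treats delta by subtracting it from the TPR. The function name is hypothetical.

```python
import math


def epsilon_raw_counts_sketch(in_scores, out_scores, min_count=50, delta=0.0):
    """Largest ln((TPR - delta) / FPR) over thresholds with enough TP and FP."""
    n_in, n_out = len(in_scores), len(out_scores)
    best = 0.0
    for t in sorted(set(in_scores) | set(out_scores)):
        tp = sum(s > t for s in in_scores)   # held-in canaries classified "in"
        fp = sum(s > t for s in out_scores)  # held-out canaries classified "in"
        if tp < min_count or fp < min_count:
            continue  # too few counts for a stable ratio at this threshold
        tpr, fpr = tp / n_in, fp / n_out
        if tpr - delta > 0.0:
            best = max(best, math.log((tpr - delta) / fpr))
    return best


# Same synthetic scores as in the class-level example.
out_scores = list(range(100))
in_scores = [s + 9.5 for s in range(100)]
eps_strict = epsilon_raw_counts_sketch(in_scores, out_scores, min_count=50)
eps_loose = epsilon_raw_counts_sketch(in_scores, out_scores, min_count=5)
```

On this data the sketch gives ln(1.2) with min_count=50 and ln(3) with min_count=5. A smaller min_count admits thresholds in the tail, where the ratio is larger but rests on only a handful of false positives.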

tpr_at_given_fpr(fpr, *, bootstrap_params=None)[source]

Computes maximum TPR at a given FPR.

Parameters:
  • fpr (Union[_Buffer, _SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], complex, bytes, str, _NestedSequence[complex | bytes | str]]) – The desired false positive rate. May be a scalar, or an array of independent FPR values, in which case an array of the same shape is returned with the TPR at each FPR.

  • bootstrap_params (BootstrapParams | None) – If provided, compute and return bootstrapped quantiles of the TPR. fpr must be a scalar in this case.

Return type:

ndarray | float

Returns:

The maximum true positive rate at the given false positive rate, allowing classifiers that randomize between two thresholds.
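
One way to read "randomize between two thresholds" is that the achievable (FPR, TPR) pairs form the upper concave hull of the empirical ROC points, and the maximum TPR at a given FPR is found by linear interpolation along that hull. The sketch below follows that reading using only the standard library. The function names are hypothetical and it is not claimed to match the library's implementation.

```python
def roc_upper_hull(in_scores, out_scores):
    """Vertices of the upper concave hull of the empirical ROC curve."""
    n_in, n_out = len(in_scores), len(out_scores)
    points = {(0.0, 0.0), (1.0, 1.0)}  # classify nothing / everything as "in"
    for t in set(in_scores) | set(out_scores):
        fpr = sum(s > t for s in out_scores) / n_out
        tpr = sum(s > t for s in in_scores) / n_in
        points.add((fpr, tpr))
    hull = []
    for p in sorted(points):
        # Pop the last vertex while it lies on or below the segment hull[-2] -> p.
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross < 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def tpr_at_fpr(in_scores, out_scores, fpr):
    """Interpolate along the hull; mixing two thresholds reaches any such point."""
    hull = roc_upper_hull(in_scores, out_scores)
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x1 > x0 and x0 <= fpr <= x1:
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("fpr must lie in [0, 1]")


# Same synthetic scores as in the class-level example.
out_scores = list(range(100))
in_scores = [s + 9.5 for s in range(100)]
tpr = tpr_at_fpr(in_scores, out_scores, 0.01)
```

On the synthetic scores from the class-level example this yields a TPR of 0.11 at an FPR of 0.01, which agrees with the value shown there.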

attack_auroc(*, bootstrap_params=None)[source]

Computes the area under the ROC curve from the attack scores.

Parameters:
  • bootstrap_params (BootstrapParams | None) – If provided, compute and return bootstrapped quantiles of the AUROC.

Return type:

float | ndarray

Returns:

The area under the ROC curve from the attack scores, allowing classifiers that randomize between two thresholds.
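
The effect of allowing randomized classifiers can be seen by comparing two estimates on the same scores. The sketch below is standard-library only, uses hypothetical function names, and is not the library's implementation. It computes the plain pairwise-ranking AUROC and the area under the upper concave hull of the ROC points.

```python
def pairwise_auroc(in_scores, out_scores):
    """Fraction of (in, out) pairs ranked correctly; ties count as half."""
    wins = sum((a > b) + 0.5 * (a == b) for a in in_scores for b in out_scores)
    return wins / (len(in_scores) * len(out_scores))


def hull_auroc(in_scores, out_scores):
    """Area under the upper concave hull of the empirical ROC curve."""
    n_in, n_out = len(in_scores), len(out_scores)
    points = {(0.0, 0.0), (1.0, 1.0)}  # classify nothing / everything as "in"
    for t in set(in_scores) | set(out_scores):
        fpr = sum(s > t for s in out_scores) / n_out
        tpr = sum(s > t for s in in_scores) / n_in
        points.add((fpr, tpr))
    hull = []
    for p in sorted(points):
        # Keep only vertices that make a clockwise turn (upper hull).
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            if (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax) < 0:
                break
            hull.pop()
        hull.append(p)
    # Trapezoid rule over consecutive hull vertices.
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


# Same synthetic scores as in the class-level example.
out_scores = list(range(100))
in_scores = [s + 9.5 for s in range(100)]
plain = pairwise_auroc(in_scores, out_scores)
hull = hull_auroc(in_scores, out_scores)
```

On these scores the pairwise estimate is 0.5905, while the hull area is 0.595, the value shown in the class-level example. The gap comes from the hull interpolating linearly between ROC points rather than following the staircase.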

max_accuracy(*, prevalence=None, significance=None)[source]

Computes the maximum accuracy achievable by a threshold-based classifier.

Parameters:
  • prevalence (float | None) – The prevalence of the positive class. If not provided, the prevalence is taken to be the proportion of in-canary examples to the total.

  • significance (float | None) – If provided, compute and return the high probability upper bound on the maximum accuracy with this allowable probability of failure (one minus confidence).

Return type:

float

Returns:

The maximum accuracy.
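
A minimal sketch of the point estimate, assuming accuracy is weighted as prevalence * TPR + (1 - prevalence) * TNR and that every observed score is tried as a threshold. The function name is hypothetical, and the high-probability upper bound selected by significance is not modeled.

```python
def max_accuracy_sketch(in_scores, out_scores, prevalence=None):
    """Best accuracy of any rule of the form "score > t means in"."""
    n_in, n_out = len(in_scores), len(out_scores)
    if prevalence is None:
        # Default described above: share of in-canaries among all canaries.
        prevalence = n_in / (n_in + n_out)
    # Trivial classifiers: label everything "in" or everything "out".
    best = max(prevalence, 1.0 - prevalence)
    for t in set(in_scores) | set(out_scores):
        tpr = sum(s > t for s in in_scores) / n_in
        tnr = sum(s <= t for s in out_scores) / n_out
        best = max(best, prevalence * tpr + (1.0 - prevalence) * tnr)
    return best


# Same synthetic scores as in the class-level example.
out_scores = list(range(100))
in_scores = [s + 9.5 for s in range(100)]
acc_balanced = max_accuracy_sketch(in_scores, out_scores)
acc_skewed = max_accuracy_sketch(in_scores, out_scores, prevalence=0.9)
```

On this data the sketch gives 0.55 at the default prevalence of 0.5 and 0.91 at a prevalence of 0.9. In the skewed case most of the accuracy comes from labeling nearly everything "in".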

epsilon_from_gdp(significance, delta, eps_tol=1e-06)[source]

Calculates an estimate for epsilon with GDP.

This is the method used in https://arxiv.org/pdf/2302.07956 and described in https://arxiv.org/pdf/2406.04827.

Parameters:
  • significance (float) – Allowed probability of failure (one minus confidence).

  • delta (float) – Approximate DP delta. Must be in (0, 1].

  • eps_tol (float) – The tolerance for epsilon (the privacy parameter). Defaults to 1e-6.

Return type:

float

Returns:

The estimated epsilon.
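
The two steps behind a GDP-based estimate can be sketched as follows, under stated assumptions. First, upper confidence bounds on FPR and FNR (assumed to be computed elsewhere) give a lower bound on the Gaussian DP parameter mu through the Gaussian trade-off curve. Second, mu is converted to epsilon at the requested delta using the standard mu-GDP to (epsilon, delta) conversion, solved by bisection to within eps_tol. The function names are hypothetical and this is not the library's implementation.

```python
import math
from statistics import NormalDist


def phi(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def mu_lower_bound(fpr_hi, fnr_hi):
    """mu >= Phi^{-1}(1 - FPR) - Phi^{-1}(FNR) for a mu-GDP mechanism."""
    inv = NormalDist().inv_cdf  # arguments must lie strictly inside (0, 1)
    return max(0.0, inv(1.0 - fpr_hi) - inv(fnr_hi))


def gdp_delta(mu, eps):
    """delta(eps) such that mu-GDP implies (eps, delta(eps))-DP."""
    return phi(-eps / mu + mu / 2.0) - math.exp(eps) * phi(-eps / mu - mu / 2.0)


def gdp_epsilon(mu, delta, eps_tol=1e-6):
    """Smallest eps (up to eps_tol) with gdp_delta(mu, eps) <= delta."""
    if mu <= 0.0 or gdp_delta(mu, 0.0) <= delta:
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow the bracket; the cap keeps math.exp from overflowing.
    while gdp_delta(mu, hi) > delta and hi < 500.0:
        hi *= 2.0
    while hi - lo > eps_tol:
        mid = (lo + hi) / 2.0
        if gdp_delta(mu, mid) > delta:
            lo = mid
        else:
            hi = mid
    return lo


mu = mu_lower_bound(0.05, 0.05)
eps = gdp_epsilon(mu, 1e-5)
```

Because gdp_delta decreases in eps for a fixed mu, bisection is enough. A larger mu, meaning a more successful attack, maps to a larger epsilon at the same delta.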

epsilon_one_run(significance, delta, one_sided=True, *, threshold_strategy=<jax_privacy.auditing.Bonferroni object>)[source]

Computes lower bound on epsilon for a single round of auditing.

This is an implementation of the method from Steinke et al. 2024, “Privacy Auditing in One (1) Training Run”: https://arxiv.org/abs/2305.08846.

Currently only one-sided hypotheses are supported ($k_- = 0$).

Parameters:
  • significance (float) – Allowed probability of failure (one minus confidence).

  • delta (float) – Approximate DP delta.

  • one_sided (bool) – Whether to consider only hypotheses with $k_- = 0$. Must be True.

  • threshold_strategy (ThresholdStrategy) – How to select the threshold to use for the epsilon estimate.

Return type:

float

Returns:

The estimated epsilon lower bound.
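
For intuition, the pure-DP case (delta = 0) of the one-run bound can be sketched in a few lines. Under epsilon-DP, the number of correct guesses out of r guesses is stochastically dominated by a Binomial(r, e^epsilon / (e^epsilon + 1)) variable, so any epsilon under which the observed count would be too unlikely can be rejected. The sketch assumes the number of guesses r and the number of correct guesses v are already known. The function names and the eps_max cap are hypothetical, and neither the delta > 0 case nor the threshold selection is modeled.

```python
import math


def binom_tail(v, r, p):
    """P[Binomial(r, p) >= v]."""
    return sum(math.comb(r, i) * p**i * (1 - p) ** (r - i) for i in range(v, r + 1))


def one_run_epsilon_pure_dp(v, r, significance, eps_max=20.0, tol=1e-9):
    """Largest eps (capped at eps_max) that the observed guesses reject."""
    # If even eps = 0 (success probability 1/2) explains the data, report 0.
    if binom_tail(v, r, 0.5) >= significance:
        return 0.0
    lo, hi = 0.0, eps_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        p = 1.0 / (1.0 + math.exp(-mid))  # equals e^eps / (e^eps + 1)
        if binom_tail(v, r, p) < significance:
            lo = mid  # v correct guesses would be too unlikely under eps = mid
        else:
            hi = mid
    return lo


eps_perfect = one_run_epsilon_pure_dp(v=100, r=100, significance=0.05)
eps_chance = one_run_epsilon_pure_dp(v=50, r=100, significance=0.05)
```

With all 100 guesses correct, the bound solves p^100 = 0.05 for p and returns ln(p / (1 - p)). With 50 of 100 correct, the guesses are consistent with chance and the sketch returns 0.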

epsilon_one_run_fdp(significance, delta, one_sided=True, *, threshold_strategy=<jax_privacy.auditing.Bonferroni object>)[source]

Computes lower bound on epsilon for a single round of auditing.

This is an implementation of the method from Mahloujifar et al. 2024, “Auditing f-Differential Privacy in One Run”: https://arxiv.org/pdf/2410.22235.

Currently only one-sided hypotheses are supported ($k_- = 0$).

Parameters:
  • significance (float) – Allowed probability of failure (one minus confidence).

  • delta (float) – Approximate DP delta.

  • one_sided (bool) – Whether to consider only hypotheses with $k_- = 0$. Must be True.

  • threshold_strategy (ThresholdStrategy) – How to select the threshold to use for the epsilon estimate.

Return type:

float

Returns:

The estimated epsilon lower bound.