jax_privacy.matrix_factorization.toeplitz.optimize_coefs_for_amplifications
- jax_privacy.matrix_factorization.toeplitz.optimize_coefs_for_amplifications(n, *, dataset_size, expected_batch_size, epsilon, delta, max_optimizer_steps=250, reduction_fn=<function mean>)
Select num_bands (and coefs) to minimize loss subject to a privacy target.
Following Theorem 4 of https://arxiv.org/abs/2306.08153, this function (approximately) minimizes the loss (the per-query squared errors reduced to a scalar by reduction_fn), assuming privacy amplification under block-cyclic Poisson sampling (Algorithm 2 of the same paper). A smaller number of bands allows more benefit from amplification, while a larger number of bands allows more benefit from correlated noise.
Notes
- This function only optimizes over numbers of bands that evenly divide n, as this is generally preferable. Hence, it is recommended to choose n so that it has well-spaced factors; powers of 2 are particularly useful.
- This function delegates to optimize_banded_toeplitz to actually optimize for the coefficients at a given number of bands. Hence, column normalization is not directly supported, but the final returned strategy can always be used with column normalization.
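To see why the choice of n matters, the sketch below lists the band counts that evenly divide a given n. The helper name candidate_band_counts is hypothetical and not part of jax_privacy; the library's actual search may restrict these candidates further.

```python
def candidate_band_counts(n: int) -> list[int]:
    """Return every positive integer that evenly divides n, in ascending order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            # Record the paired divisor n // d, unless d is the exact square root.
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]


# A power of 2 gives evenly (geometrically) spaced candidates.
print(candidate_band_counts(1024))
# A prime n leaves only the two extremes: 1 band or n bands.
print(candidate_band_counts(997))
```

With n = 1024 there are eleven candidates (1, 2, 4, ..., 1024), whereas a prime n such as 997 offers no intermediate trade-off between amplification and correlated noise.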
- Parameters:
  - n (int) – The number of iterations that defines the workload.
  - dataset_size (int) – The size of the dataset.
  - expected_batch_size (int) – The target batch size (for example, if we were Poisson sampling from the whole dataset, the sampling probability would be expected_batch_size / dataset_size).
  - epsilon (float) – The privacy target is (epsilon, delta)-DP.
  - delta (float) – The privacy target is (epsilon, delta)-DP.
  - max_optimizer_steps (int) – The maximum number of LBFGS iterations, passed to optimize_banded_toeplitz.
  - reduction_fn (Callable[[Array], Array]) – A function that converts per-query squared errors to a scalar. Use jnp.mean to optimize mean squared error, jnp.max to optimize max squared error, or lambda v: v[-1] to optimize last-iterate squared error.
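The following sketch illustrates two of the parameter descriptions above: the sampling probability implied by expected_batch_size and dataset_size, and the effect of the three suggested reduction_fn choices. All numbers are hypothetical, and the plain-Python reductions are only stand-ins for jnp.mean and jnp.max, which the real function would apply to JAX arrays.

```python
# Hypothetical dataset and batch sizes, chosen only for illustration.
dataset_size = 60_000
expected_batch_size = 600

# Sampling probability if we were Poisson sampling from the whole dataset.
sampling_probability = expected_batch_size / dataset_size
print(sampling_probability)

# Hypothetical per-query squared errors for a 4-iteration workload.
per_query_sq_err = [4.0, 2.5, 1.5, 1.0]


def mean_reduction(v):
    """Stand-in for jnp.mean: mean squared error."""
    return sum(v) / len(v)


def max_reduction(v):
    """Stand-in for jnp.max: max squared error."""
    return max(v)


def last_reduction(v):
    """Same as lambda v: v[-1]: last-iterate squared error."""
    return v[-1]


for name, fn in [("mean", mean_reduction),
                 ("max", max_reduction),
                 ("last", last_reduction)]:
    print(name, fn(per_query_sq_err))
```

The three reductions rank the same error vector differently, so the optimized number of bands and coefficients can differ depending on which objective is passed.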
- Returns:
  A tuple (coefs, stddev) where
  - coefs are the coefficients of a banded Toeplitz strategy; the number of bands chosen is simply the length of the returned coefficients.
  - stddev is the standard deviation of the uncorrelated noise Z required to achieve the privacy target (that is, passing this stddev to streaming_matrix_to_single_machine_privatizer in distributed_noise_generation should achieve the (epsilon, delta)-DP guarantee).
- Return type:
  tuple
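A call might look like the following sketch, which uses only the signature documented above. The argument values and the wrapper name select_strategy are hypothetical, and the imports are deferred into the function so the snippet can be read without jax_privacy installed.

```python
# Hypothetical settings, chosen only for illustration.
N_ITERATIONS = 1024  # a power of 2, so many band counts evenly divide it
PRIVACY_KWARGS = dict(
    dataset_size=60_000,
    expected_batch_size=600,
    epsilon=1.0,
    delta=1e-6,
    max_optimizer_steps=250,
)


def select_strategy(n=N_ITERATIONS):
    """Sketch: pick coefficients and noise scale for an (epsilon, delta) target."""
    # Deferred imports: both packages must be installed to actually run this.
    import jax.numpy as jnp
    from jax_privacy.matrix_factorization import toeplitz

    coefs, stddev = toeplitz.optimize_coefs_for_amplifications(
        n, reduction_fn=jnp.mean, **PRIVACY_KWARGS
    )
    # The number of bands chosen is the length of the returned coefficients.
    num_bands = len(coefs)
    return coefs, stddev, num_bands
```

Calling select_strategy() would return the coefficients, the noise standard deviation, and the implied number of bands; swapping jnp.mean for jnp.max or lambda v: v[-1] changes the objective as described under reduction_fn.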