06-27

PRECISE: PRivacy-loss-Efficient and Consistent Inference based on poSterior quantilEs

Ruyu Zhou, Fang Liu
[stat.ME]

Differential Privacy (DP) is a mathematical framework for releasing information with formal privacy guarantees. While numerous DP procedures have been developed for statistical analysis and machine learning, valid statistical inference methods that offer high utility under DP constraints remain limited. We formalize this gap by introducing the notion of valid Privacy-Preserving Interval Estimation (PPIE) and propose a new PPIE approach, PRECISE, for constructing privacy-preserving posterior intervals, with the goal of offering a better privacy-utility tradeoff than existing DP inferential methods. PRECISE is a general-purpose, model-agnostic method that generates intervals using quantile estimates obtained from a sanitized posterior histogram with DP guarantees. We explicitly characterize the global sensitivity of the histogram formed from posterior samples of the parameter of interest, enabling its sanitization with formal DP guarantees. We also analyze the sources of error in the mean squared error (MSE) of the histogram-based private quantile estimator and prove its consistency for the true posterior quantiles, along with its rate of convergence, as the sample size or privacy loss increases. We conduct extensive experiments to compare the utility of PRECISE with that of common existing privacy-preserving inferential approaches across a wide range of inferential tasks, data types and sizes, DP types, and privacy loss levels. The results demonstrate a significant advantage for PRECISE, which attains nominal coverage with substantially narrower intervals than the existing methods, which are prone to either under-coverage or impractically wide intervals.
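
The general idea of reading interval endpoints off a sanitized posterior histogram can be illustrated with a minimal sketch. This is not the PRECISE procedure itself: the Laplace mechanism, the fixed bounds and bin count, the clamping of negative noisy counts, and the bin-midpoint quantile rule are all assumptions made for illustration, and the function name `private_interval_from_histogram` is hypothetical. In particular, `sensitivity` is taken as a user-supplied input here, whereas the paper derives the global sensitivity of the posterior histogram explicitly.

```python
import numpy as np


def private_interval_from_histogram(samples, lower, upper, n_bins, epsilon,
                                    sensitivity, level=0.95, rng=None):
    """Illustrative sketch: interval from a Laplace-noised histogram.

    `sensitivity` is an assumed input (not derived here); `lower` and
    `upper` are assumed data-independent bounds on the parameter.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Histogram of posterior samples over fixed, data-independent bins.
    edges = np.linspace(lower, upper, n_bins + 1)
    counts, _ = np.histogram(np.clip(samples, lower, upper), bins=edges)

    # Sanitize: add Laplace noise with scale sensitivity / epsilon
    # (assumed mechanism), then clamp negative counts to zero.
    noisy = counts + rng.laplace(scale=sensitivity / epsilon, size=n_bins)
    noisy = np.clip(noisy, 0.0, None)
    if noisy.sum() == 0:
        noisy = np.ones(n_bins)  # fall back to a uniform histogram

    # Empirical CDF of the sanitized histogram.
    cdf = np.cumsum(noisy) / noisy.sum()

    def quantile(q):
        # First bin whose cumulative mass reaches q; return its midpoint.
        k = min(int(np.searchsorted(cdf, q)), n_bins - 1)
        return 0.5 * (edges[k] + edges[k + 1])

    alpha = 1.0 - level
    return quantile(alpha / 2), quantile(1 - alpha / 2)
```

Because the quantiles are computed only from the noisy counts, they are post-processing of the sanitized histogram and incur no additional privacy loss beyond that of the histogram release, provided the sensitivity supplied is correct for the setting at hand.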