
Adversary's prior knowledge #10

Open
lipikaramaswamy opened this issue May 11, 2022 · 0 comments
Hello,

I have a question about the mechanism proposed in your paper.

In the design of the membership inference attack, the adversary is assumed to have a reference dataset drawn from the same distribution as the target model's training data. Accordingly, in the implementation, for any dataset available in SDGym (e.g., adult, insurance), one sample is used as the adversary's prior information and a second sample is used as the training set for the generative model that produces the synthetic data, with the sizes of the two samples set by the config parameters sizeRawT and sizeRawA.
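
Concretely, I read this as a split along the lines of the sketch below. This is a minimal illustration only, not the repo's actual code: I'm assuming the two samples are drawn disjointly, `load_raw_dataset` is a hypothetical loader, and `size_raw_t` / `size_raw_a` stand in for the `sizeRawT` / `sizeRawA` config parameters:

```python
import numpy as np
import pandas as pd

def split_raw_data(raw: pd.DataFrame, size_raw_t: int, size_raw_a: int,
                   seed: int = 0) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Draw two disjoint uniform samples from the raw data:
    raw_t (size sizeRawT) trains the target generative model,
    raw_a (size sizeRawA) is the adversary's reference dataset."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(raw))
    raw_t = raw.iloc[idx[:size_raw_t]]
    raw_a = raw.iloc[idx[size_raw_t:size_raw_t + size_raw_a]]
    return raw_t, raw_a

# Hypothetical usage with one of the datasets:
# raw = load_raw_dataset("adult")  # placeholder loader, not from this repo
# raw_t, raw_a = split_raw_data(raw, size_raw_t=1000, size_raw_a=1000)
```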

In practice, however, when building a generative model it's beneficial to train on the entire available dataset in order to better learn the underlying distribution. It therefore seems that the proposed mechanism hinges on either a) using generative models that do not necessarily require large training sets, such as those listed in the configs (BayesNet, PrivBayes), or b) having datasets large enough that GANs or large language models still have sufficient data for stable training after the samples are drawn. I'd love to hear your thoughts on this.

Would you also be able to share the config parameters used to generate results for CTGAN and PATEGAN in your paper? Thanks!
