Need help please
How can we design a machine learning model that is provably immune to all possible adversarial attacks without sacrificing accuracy or efficiency?
8 Replies
:MudWhat:
uh you can try running it locally
or you can do it without the machine learning and basically just build a normal chatbot
I see what you mean, but I was referring to theoretical robustness
as in, whether it’s possible to design a model that is provably immune to adversarial perturbations under any distribution
i feel like you don't understand the question you're asking
you can make it provably immune to specific attacks
which specific attacks are you trying to mitigate?
Actually, I do understand the question; it's a theoretical one
I’m not referring to robustness against a specific class of attacks like FGSM or PGD
I mean true, provable immunity to all possible adversarial perturbations under any data distribution, without sacrificing model accuracy or efficiency
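(for concreteness, here's a minimal sketch of what one of those specific attacks looks like — FGSM as a one-step L-infinity perturbation; the linear model, random data, and eps value are placeholder assumptions, not anything from this thread)
```python
# Minimal FGSM sketch in PyTorch, just to show what a "specific attack class" means.
# The toy model, random data, and eps below are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=0.03):
    # One-step Fast Gradient Sign Method: push each input feature by +/- eps
    # along the sign of the loss gradient (an L-infinity bounded perturbation).
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

# Toy usage: an untrained linear classifier on random "images"
model = nn.Linear(784, 10)
x = torch.rand(8, 784)
y = torch.randint(0, 10, (8,))
x_adv = fgsm_perturb(model, x, y)
print((x_adv - x).abs().max())  # perturbation stays within eps
```
certified defenses (e.g. randomized smoothing or interval bound propagation) give provable guarantees against attacks like this within a fixed norm ball, which is a much weaker claim than immunity to every possible perturbation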
As far as we know, that kind of universal immunity is mathematically impossible unless you make extremely strong assumptions about the threat model or the data manifold
don't you also need to make assumptions about what accuracy even is? how are you going to measure how accurate a model is?
When I said “without sacrificing accuracy,” I meant it in the conventional empirical sense: maintaining comparable performance on clean, in-distribution data
Even if we fix that definition, achieving provable immunity to all adversarial perturbations still seems theoretically impossible
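(toy illustration of why, not from anyone above — if two differently labelled points sit closer than 2*eps under L-infinity, their perturbation balls overlap, so no classifier can be robustly correct on both; the numbers here are made up)
```python
# Toy counterexample sketch (made-up numbers): two clean points with different
# labels that are closer than 2*eps. Whatever label a classifier assigns to the
# midpoint is an adversarial example for one of them, so provable immunity to
# all eps-perturbations plus perfect clean accuracy needs a separation/margin
# assumption on the data.
import numpy as np

eps = 0.5
x_a = np.array([0.0])   # labelled class 0
x_b = np.array([0.8])   # labelled class 1, only 0.8 away (< 2*eps)

midpoint = (x_a + x_b) / 2
print(np.abs(midpoint - x_a).max() <= eps)  # True: a valid perturbation of x_a
print(np.abs(midpoint - x_b).max() <= eps)  # True: and also of x_b
```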
i would agree
also this probably isn't the right server to ask about this, this isn't an ML server
True, thanks