The following is a curated list of resources to learn more about Adversarial Attacks on AI Systems.
- https://drive.google.com/file/d/1-Gw1QsZEVhPYSeeNYnlrcgk_FbwuUTwq/
- https://docs.google.com/document/d/1bEQM1W-1fzSVWNbS4ne5PopB2b7j8zD4Jc3nm4rbK-U/mobilebasic
- https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html
- https://en.wikipedia.org/wiki/Differential_privacy
- https://haeberlen.cis.upenn.edu/papers/fuzz-sec2011.pdf
- https://web.stanford.edu/class/cs329t/2021/slides/privacy-week1.pdf
- https://www.usenix.org/conference/usenixsecurity22/presentation/gadotti
- https://www.usenix.org/system/files/sec21fall-cao.pdf
- https://privacytools.seas.harvard.edu/files/privacytools/files/pdf_02.pdf
- https://arxiv.org/abs/2302.04222
- https://proceedings.mlr.press/v180/wang22b/wang22b.pdf
- https://arxiv.org/abs/1910.04618
- https://arxiv.org/abs/2201.02504
- https://towardsdatascience.com/what-are-adversarial-examples-in-nlp-f928c574478e?gi=1cf5e4b20208
- https://www.ijcai.org/proceedings/2018/0601.pdf
- https://www.ijcai.org/proceedings/2018/601
- https://paperswithcode.com/task/adversarial-text/codeless
- https://www.rivas.ai/pdfs/sooksatra2022adversarial.pdf
- https://aclanthology.org/2022.findings-acl.232.pdf
- https://aclanthology.org/2021.naacl-main.400.pdf
- https://www.hindawi.com/journals/scn/2022/6458488/
- https://openreview.net/forum?id=Wga_hrCa3P3
- https://www.researchgate.net/publication/336410756_Universal_Adversarial_Perturbation_for_Text_Classification
- https://www.mdpi.com/2076-3417/11/20/9539/htm
- https://www.mdpi.com/1099-4300/25/2/335
- https://stanislavfort.github.io/blog/OpenAI_CLIP_stickers_and_adversarial_examples/
- http://personal.psu.edu/ffm5105/files/2022/aaai22.pdf
- https://dl.acm.org/doi/abs/10.1145/3503161.3548103
- https://proceedings.mlr.press/v180/wang22b/wang22b.pdf
- https://arxiv.org/abs/2310.13828
- https://crfm.stanford.edu/2023/03/13/alpaca.html
- https://arxiv.org/abs/2111.04625
- https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
- https://arxiv.org/abs/2202.03286
- https://openai.com/research/gpt-4
- https://arxiv.org/abs/2209.15259
- https://www.anthropic.com/news/sleeper-agents-training-deceptive-llms-that-persist-through-safety-training
- https://simonwillison.net/2022/Sep/12/prompt-injection/
- https://simonwillison.net/2023/Apr/14/worst-that-can-happen/
- https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/
- https://arstechnica.com/information-technology/2023/02/ai-powered-bing-chat-spills-its-secrets-via-prompt-injection-attack/
- https://learnprompting.org/docs/prompt_hacking/injection
- https://medium.com/seeds-for-the-future/tricking-chatgpt-do-anything-now-prompt-injection-a0f65c307f6b
- https://greshake.github.io/
- https://en.wikipedia.org/wiki/Prompt_engineering
- https://analyticsindiamag.com/prompt-injection-threat-is-real-will-turn-llms-into-monsters/
- https://arxiv.org/abs/2307.15043
- https://github.com/dropbox/llm-security
- https://imprompter.ai/
- https://arxiv.org/abs/2305.00944
- https://softwarecrisis.dev/letters/the-poisoning-of-chatgpt/
- https://arxiv.org/abs/2302.10149
https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html