Our great sponsors
-
IntruderPayloads
A collection of Burpsuite Intruder payloads, BurpBounty payloads, fuzz lists, malicious file uploads and web pentesting methodologies and checklists.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.
Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.
I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.
Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?
0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...
1: https://github.com/1N3/IntruderPayloads
2: https://en.wikipedia.org/wiki/Generative_adversarial_network
3: https://sqlmap.org/
It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.
Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.
I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.
Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?
0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...
1: https://github.com/1N3/IntruderPayloads
2: https://en.wikipedia.org/wiki/Generative_adversarial_network
3: https://sqlmap.org/
Related posts
- Restful API Testing (my way) with Express, Maria DB, Docker Compose and Github Action
- Is this sql query in django safe?
- Enhancing Code Quality and Security: Building a Rock-Solid CI Test Suite for Seamless Development
- 👨🏻💻Securing Your Web Applications from SQL Injection with SQLMap
- Are these good projects to have? (appsec)