AI-powered Bing Chat spills its secrets via prompt injection attack

Our great sponsors

InfluxDB - Power Real-Time Data Analytics at Scale

WorkOS - The modern identity platform for B2B SaaS

SaaSHub - Software Alternatives and Reviews

Our great sponsors

IntruderPayloads

2 3,526 1.8 BlitzBasic

A collection of Burpsuite Intruder payloads, BurpBounty payloads, fuzz lists, malicious file uploads and web pentesting methodologies and checklists.

It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.
Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.
I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.
Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?
0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...
1: https://github.com/1N3/IntruderPayloads
2: https://en.wikipedia.org/wiki/Generative_adversarial_network
3: https://sqlmap.org/

SQLMap

40 30,560 8.7 Python

Automatic SQL injection and database takeover tool

It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.
Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.
I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.
Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?
0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...
1: https://github.com/1N3/IntruderPayloads
2: https://en.wikipedia.org/wiki/Generative_adversarial_network
3: https://sqlmap.org/

InfluxDB

www.influxdata.com sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
dicectf-2023-challenges

1 62 10.0 C

All challenges from DiceCTF 2023

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project