AI-powered Bing Chat spills its secrets via prompt injection attack

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • IntruderPayloads

    A collection of Burpsuite Intruder payloads, BurpBounty payloads, fuzz lists, malicious file uploads and web pentesting methodologies and checklists.

  • It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.

    Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.

    I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.

    Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?

    0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...

    1: https://github.com/1N3/IntruderPayloads

    2: https://en.wikipedia.org/wiki/Generative_adversarial_network

    3: https://sqlmap.org/

  • SQLMap

    Automatic SQL injection and database takeover tool

  • It's very interesting that AppSec may now begin to include "prompt injection" attacks as something of relevance.

    Specifically with libraries like LangChain[0] that allow for you to perform complex actions ("What's the weather?" -> makes HTTP request to fetch weather) then we end up in a world where injection attacks can have side effects with security implications.

    I've been thinking about what security might look like for a post-ChatGPT world and how I'd attempt to defend against it. I'd probably start by building a database of attack prompts, kind of like this[1] fuzz list but for AI, then I'd train a second neural net that acts like an adversarial neural network[2] to try to exploit the system based on those payloads. The end result would sort of like SQLMap[3] but for AI systems where it can automatically "leak" hidden prompts and potentially find "bypasses" to escape the sandbox.

    Has anybody else spent any time thinking about how to defend systems against prompt injection attacks that have possible side effects (like making an HTTP request)?

    0: https://langchain.readthedocs.io/en/latest/modules/agents/ex...

    1: https://github.com/1N3/IntruderPayloads

    2: https://en.wikipedia.org/wiki/Generative_adversarial_network

    3: https://sqlmap.org/

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • dicectf-2023-challenges

    All challenges from DiceCTF 2023

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts