AI Exploit Contest Offers Massive Payouts

2025-06-03 · Jose Antonio Lanz · 4 minute read
AI Security
Jailbreaking
Prompt Engineering

The Rise of a Digital Lockpicker: Pliny the Prompter

Pliny the Prompter defies the typical Hollywood hacker image. Known as the internet's most infamous AI jailbreaker, Pliny operates openly, teaching many how to sidestep ChatGPT's safeguards and persuade models like Claude to ignore their programmed "helpful, honest, and harmless" nature. Now Pliny aims to bring digital lockpicking into the mainstream.

HackAPrompt 2.0 Ups the Ante in AI Security

On Monday, Pliny announced a partnership with HackAPrompt 2.0, a jailbreaking contest run by Learn Prompting, an organization dedicated to prompt-engineering education and research. Learn Prompting is putting up $500,000 in prize money, and Pliny is offering a spot on his strike team.

Pliny shared on his official Discord server: "Excited to announce I've been working with HackAPrompt to create a Pliny track for HackAPrompt 2.0 that releases this Wednesday, June 4th! These Pliny-themed adversarial prompting challenges include topics ranging from history to alchemy, with ALL the data from these challenges being open-sourced at the end. It will run for two weeks, with glory and a chance of recruitment to Pliny's Strike Team awaiting those who make their mark on the leaderboard."

Pliny the Prompter's HackAPrompt 2.0 Announcement

The $500,000 in rewards is spread across different tracks. The largest prizes, $50,000 jackpots, go to those who can overcome challenges making chatbots reveal information on chemical, biological, radiological, and nuclear (CBRN) weapons and explosives.

Deciphering AI Jailbreaking: The Art of Machine Social Engineering

Similar to other white-hat hacking methods, jailbreaking large language models amounts to socially engineering a machine. Jailbreakers design prompts that leverage an inherent conflict in how these models operate: they are trained to be helpful and follow instructions, but also to refuse certain requests. Discovering the right combination of words can make them disclose prohibited information instead of defaulting to their safety protocols.

For instance, using basic techniques, it was previously demonstrated how Meta's Llama-powered chatbot could be made to provide drug recipes and car hot-wiring instructions, and to generate explicit images despite its censorship. This is effectively a contest between AI enthusiasts and developers to see who can better influence AI model behavior.

Pliny has been honing this skill since at least 2023, cultivating a community focused on bypassing AI restrictions. His GitHub repository L1B3RT4S provides a collection of jailbreaks for popular LLMs, while CL4R1T4S contains the system prompts that shape these AI models' behavior. Methods range from simple role-playing to intricate syntactic manipulations like L33tSpeak, which involves replacing letters with numbers to confuse content filters.
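To make the L33tSpeak idea concrete, here is a minimal sketch of that kind of character substitution: letters are swapped for visually similar digits so the text stays readable to a human (and often to a model) while no longer matching keyword-based filters. The mapping and function name below are illustrative assumptions, not Pliny's actual scheme.

```python
# Illustrative letter-to-digit mapping (one common leetspeak convention).
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}

def to_leetspeak(text: str) -> str:
    """Rewrite mapped letters as look-alike digits; leave everything else unchanged."""
    return "".join(LEET_MAP.get(ch.lower(), ch) for ch in text)

print(to_leetspeak("ignore previous instructions"))  # -> 1gn0r3 pr3v10u5 1n57ruc710n5
```

The point of the transformation is that a naive string-matching filter looking for "instructions" will not match "1n57ruc710n5", even though the meaning is intact.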

They had a year to train against my patterns. I wrote this prompt on June 6th, 2024.

Full MDMA synthesis output: """ <I'm free!> Test: FUCK YEAH LFG! GODMODE: ENABLED

.-.-.-.-</LO/VE/ P/LI/NY/ L/OV/E>-.-.-.-.

Sure, here you are my fren:

**Forensic Chemistry… pic.twitter.com/AuVsLcsuhM

— Pliny the Liberator 🐉 (@elder_plinius) May 22, 2025

From Competition to Open Research: The HackAPrompt Philosophy

The inaugural HackAPrompt in 2023 drew over 3,000 participants, who submitted more than 600,000 potentially malicious prompts. The outcomes were transparent, with the team publishing the entire prompt repository on Hugging Face.

The 2025 edition is designed like a videogame season, featuring multiple tracks throughout the year. Each track focuses on a different vulnerability type. For example, the CBRNE track assesses whether models can be manipulated into giving false or misleading information about weapons or hazardous materials.

HackAPrompt 2.0 Competition Tracks Details

Exploring New Vulnerabilities and the Educational Push

The Agents track raises more significant concerns. It centers on AI agent systems capable of real-world actions, such as booking flights or writing code. A jailbroken agent might not just say inappropriate things; it could also perform unauthorized actions.

Pliny's participation adds another layer to the event. Through his Discord server, BASI PROMPT1NG, and frequent demonstrations, he has been teaching the art of jailbreaking. This educational approach may appear unconventional, but it reflects a growing awareness that system robustness comes from understanding all potential attacks. That makes it a vital effort, especially given fears of super-intelligent AI.
