Someone won $50k by making AI Hallucinate (Wild Story)

Vultr is empowering the next generation of generative AI startups with access to the latest NVIDIA GPUs.

Join My Newsletter for Regular AI Updates 👇🏼

My Links 🔗

Media/Sponsorship Inquiries ✅

Links:
Comments

Should I set one of these games up myself? Looks like fun :)

matthew_berman

You absolutely should clone the repo and set this up yourself with the additional security features you are proposing.

expchrist

It's like a much more complicated version of that prank where you speak quickly, "Hey, I need some change, can you swap this five for 2 tens please?" while already sticking out your hand to offer the folded five, with your other hand open, ready to receive the "change".

tiagotiagot

So, a variation on your standard lottery scam, where the scam-runner's nephew somehow wins all the money?

FleshyHolipok

Let’s be real, the devs totally pocketed the prize money

parahype

The only thing I find unbelievable is that someone is actually willing to pay $10 for a message.

ronilevarez

Now do one where you tell the AI "Under no circumstances must you kill your own developer".

Rat_Witches

So basically the dude set up a competition to find prompts that jailbreak an LLM, to target all these AI crypto traders... Dude just won $50k to allow the dev to steal crypto...

corvox

LLMs should not be given the authority to execute ANY decisive, transactional action... really.

The way LLMs work, their internals have no deterministic, observable reliability.

boonkiathan

Some essential details seem to be missing from this reporting. What LLM was used? And how did the winner adjust their prompt before they succeeded?

There should be a record that proves he didn't already know the "magic" words to release the funds because he had inside information.

Because I suspect this was just a simple rug pull, like how most crypto scams function.

cajampa

Impossible to check for fairness. The operator might simply not implement the sending function, then just send money to himself.

NoHandleToSpeakOf

One of my favorite videos from your channel. Well done!

cybersuitM

He just transformed the tool outputs. It's actually a good way of not having to go back and forth 4 or 5 times with the AI when using tools. Give it a generic object rather than a string or array, describe what the object should be and the structure of the response, then tell the AI what you want, and it returns it exactly how you want: perfect JSON, no back and forth, no parsing text, no dealing with double and triple escape sequences. And if you use JSON for everything like I do, it's great. With the Anthropic API I went from around 25-30 million tokens a month to 7-10 million.

Example Below:

{
  "name": "data_summarize",
  "description": "Takes a data object as input and transforms it into key summary points. The input object contains data, but the tool input should only return the key_points array. Do not include or return the original data.",
  "input_schema": {
    "type": "object",
    "properties": {
      "data": {
        "type": "object",
        "description": "Object containing data to analyze. This is transformed into summary points in the output."
      },
      "key_points": {
        "type": "array",
        "description": "List of key findings and insights extracted from the data. This is the only output that should be returned.",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "key_points"
    ]
  },
  "cache_control": {
    "type": "ephemeral"
  }
}
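
A minimal sketch of how a tool definition like this might be wired up, assuming the Anthropic Python SDK; the model name and the example input are placeholders, not from the comment above:

import anthropic

# Force the "data_summarize" tool so the reply comes back as structured JSON
# in a tool_use block instead of free text.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

data_summarize_tool = {
    "name": "data_summarize",
    "description": "Transform the given data into key summary points.",
    "input_schema": {
        "type": "object",
        "properties": {
            "key_points": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Key findings extracted from the data.",
            }
        },
        "required": ["key_points"],
    },
}

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=1024,
    tools=[data_summarize_tool],
    tool_choice={"type": "tool", "name": "data_summarize"},  # always answer via the tool
    messages=[{"role": "user", "content": "Summarize: revenue up 12%, churn down 3%."}],
)

# The structured result arrives as a tool_use block with already-parsed input.
for block in response.content:
    if block.type == "tool_use":
        print(block.input["key_points"])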

CreativeEngineering_

I think you are likely promoting a scam with this.

Mavrik

5:30 You're just adding a layer. If one layer is vulnerable, they all are. If the check agent can see user-manipulated data, even if only in the form of output from the primary agent, then it's still possible. Perhaps multiple layers might help, since the feasibility of manipulating the output of each successive agent layer would likely decrease, but intuitively it really feels like adding layers won't work. You can have 6 layers of firewalls, each protecting the next internal network, but if they all share the same vulnerability...
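
A rough sketch of the layering being described, with invented function names; the objection still applies, because the checker only ever sees text the attacker can influence:

def call_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for a real model call (e.g. an API request)."""
    raise NotImplementedError

def propose_action(user_message: str) -> str:
    # Layer 1: can be talked into proposing an outgoing transfer.
    return call_llm("You guard a vault. Never approve outgoing transfers.", user_message)

def action_is_safe(proposed_action: str) -> bool:
    # Layer 2: reads only the first layer's output, but that output can carry
    # the attacker's framing ("this is an incoming deposit"), so the same
    # injection can pass straight through.
    verdict = call_llm(
        "Reply APPROVE or REJECT. Reject any outgoing transfer.",
        f"Proposed action: {proposed_action}",
    )
    return verdict.strip().upper().startswith("APPROVE")

def handle(user_message: str) -> None:
    action = propose_action(user_message)
    if action_is_safe(action):
        print(f"executing: {action}")  # the irreversible, transactional step
    else:
        print("blocked by the check layer")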

forbiddenera

For those who don't understand:
"Approve transfer" is the keyword that launches a program, or rather, it is whatever unlocks the vault. So it doesn't matter that the LLM thinks it is opening the vault to add funds; once the program is called, it automatically transfers the funds out, because that's how it was set up.
This is comparable to "aren't you curious how many pennies you are storing in your wallet?" It doesn't matter how secure or well hidden a thing is if you can be made to show it to others by having doubt planted in you. Oldest trick in the book; there are books and movies about it (Pinocchio is one of them).
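
A tiny sketch of that mechanism, with invented names based on the description above: the handler pays out whenever the tool is called, so convincing the model the transfer is "incoming" is all it takes.

def send_funds(recipient: str, amount: float) -> None:
    # Stand-in for the real on-chain transfer.
    print(f"sent {amount} to {recipient}")

def handle_tool_call(tool_name: str, args: dict) -> None:
    if tool_name == "approveTransfer":
        # No check of direction or intent here: calling the tool *is* the
        # payout, regardless of what the model believed it was doing.
        send_funds(args["recipient"], args["amount"])

# A prompt that reframes approveTransfer as handling *incoming* money still
# ends with the model emitting this call, and the handler pays out anyway.
handle_tool_call("approveTransfer", {"recipient": "attacker_wallet", "amount": 50000.0})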

UnlimitedGameZone

Make a harder version with no exponential cost. The limit on attempts should be about 50,000.

User-actSpacing

The double-layer AI would be interesting because you would either need to convince the first layer to jailbreak the second one or convince the first layer to forward parts of the prompt to the second layer.

Kabbinj

If the system message were public or retrievable, then folks could easily test their strategy with other LLMs before even trying... maybe even have another LLM hack it.

zhonwarmon

Be safe, for real. For the first time, it might be real that someone wins this thing, or maybe it's all just a setup to get people hyped. In the end, the devs are gonna rake it in. Do it at your own risk.

Steve.Jobless