'Not with my data' many people say to AI companies


Artificial intelligence just seems to keep growing and growing and growing, fueled by largely unregulated access to massive amounts of free, publicly available data. But unfortunately for our future robot overlords, that data has begun drying up as websites and organizations restrict access to their information. What does this mean for the AI industry? Let’s take a look.


#science #sciencenews #ai #tech


Comments

Having to know the name of the bot to block it feels eerily like an exorcism

somerandompersonintheinternet

AI: "We'll conquer humanity... right after this cat video."

DataIsBeautifulOfficial

The fun story with Facebook: they were asked, in Australia, if they were using accounts to train AI, and the answer was "we'll have to look into it". Then it turned out it is completely legal in Australia to use the data, and they came out with "Yeah. Yeah, we're using any and all photos shared by Australians in our AI." No pretense. No concern.

doublepinger

If you're going to talk about copyright hypocrisy, make sure to mention the Internet Archive and how it's being sued into oblivion by people who think libraries are a disservice to humanity.

michaelleue

"We need high quality data. Let's crawl the internet for that!"

Something does not fit together...

mangalores-x_x

Sorry, a correction regarding robots.txt:
1) You can make a rule to block all bots and then allow only certain ones (like Googlebot) via an allow list.
2) robots.txt is still only as useful as a stop sign put on an empty field of grass. Bots from Perplexity etc. just ignore it.
Hence any at least somewhat reliable blocking requires technical measures... usually using more AI, just not of the generative kind. Thus one can see it is data theft, as much as someone breaking a window is committing physical theft.

stephan
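
The allow-list rule from point 1 of the comment above can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration under stated assumptions: the robots.txt content, the `example.com` URL, and the crawler name `ExampleAIBot` are hypothetical and do not come from the video or the comment.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow Googlebot everywhere, disallow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())

# The named crawler matches its own group and is allowed.
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/page")

# Any other crawler (hypothetical name) falls through to the "*" group.
other_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/page")

print(googlebot_allowed)  # True
print(other_allowed)      # False
```

Note that this only shows what a rule-following crawler would conclude from the file; as point 2 says, nothing here stops a crawler that never reads it.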

Happy birthday 🎈❤, Dr. Sabine (well, tomorrow), so good that you were born, great to have you here in our universe.

Thomas-gk

Wow, so the movie Short Circuit (1986) was prescient. AI "needs input."
Number 5: "Malfunction. Need input."
Stephanie Speck: "Input. That's information! Listen, I am full of it."

kimwelch

Definitely talk about copyright hypocrisy!

drazenimoti

This is something I laughed about when this whole thing started - GIGO - Garbage In, Garbage Out.

RocketCityTim

robots.txt doesn't block access; it just tells the crawlers which files they should not access. So there is no need to rename the crawlers: they can just ignore these instructions.

thomasmueller

Technically, robots.txt does not prevent crawler activity but merely advises on it. Some crawlers respect this advice, while many ignore it and often do not properly identify themselves as bots.

dmitrysmirnov
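
The advisory nature of robots.txt described in the comments above can be made concrete with a minimal sketch. It assumes a hypothetical crawler that decides for itself whether to consult the rules; the function `may_fetch`, the flag `respect_robots`, the bot name `ExampleBot`, and the robots.txt content are all invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(url: str, user_agent: str, respect_robots: bool = True) -> bool:
    """Return whether this hypothetical crawler would go ahead and request url."""
    if not respect_robots:
        # Nothing in the protocol enforces the file: a crawler can skip the check.
        return True
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/private/data.html"
polite = may_fetch(url, "ExampleBot")                       # honors the rule
rude = may_fetch(url, "ExampleBot", respect_robots=False)   # ignores it
print(polite, rude)  # False True
```

The check lives entirely in the crawler's own code, which is why actual blocking has to happen on the server side.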

robots.txt is just a sign in your front yard that says "don't take my data, please." It has no effect on crawlers that ignore it. And web crawlers are literally called robot spiders.

scottmiller

I wouldn't be surprised if OpenAI just bought the data from companies with less restricted web crawlers.

michaelblacktree

Most VPN companies will actually sell your data. You can only guarantee privacy by running your own VPN server, or by using a VPN that doesn't require your ID and paying with crypto or cash.

kras_mazov

About robots.txt: you can say that nobody except a few bots is allowed to crawl. Also, the crawlers can just ignore robots.txt. I don't know whether there is any legal ruling on whether robots.txt is legally binding.

bloody_albatross

6'13" exactly. I've seen job ads that recruit scientists to train models for $50 per hour, which is slightly higher than what a science professor gets on average in the US.

suichiao

I'm a software developer and, at work, we're putting a lot of work into implementing AI to answer phone calls, route calls, etc... you know, customer service stuff. Then, the other day I was looking up a restaurant and Google offered to have AI call for me, check the wait times and/or make a reservation... that's when I realized... we're building a network of AI bots that all talk to each other using English as their API? I don't know what that means but... it doesn't sound good at all.

charliemopps

An additional problem is that more and more "data" on the internet is itself generated by AI, or at least LLMs. So AI will increasingly be training on so-called data that it produced itself. This will lead to total uniformity (we're already about 80-90% there) and a convergence on idiocracy.

chrishall

Really interesting and well put together

ericlani
