The New 'Claude 3.5 Sonnet' Actually SHOCKED The Industry! - Beats GPT-4o


Claude 3.5 Sonnet Revealed!

Links From Today's Video:

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

TheAIGRID


Comments

It's really remarkable. It's huge. It shows that when it comes to AI, literally everything can change in the blink of an eye.

daniely

This thing is so neutered it's not even funny. Can't ask anything even remotely controversial.
Asked for places that would be safe in case of nuclear war, and it told me that I should talk to a therapist and to practice relaxation techniques...

eugenes

This video is so shocking that I was shocked by it. Truly some great YouTube shocking material

darkhollow

Claude Opus is already good enough for my use cases. If Sonnet can increase the message limit, then that's a very real quality of life improvement.

carlkim

13min vid🥶 I'm stunned, shocked, pretty pretty pretty shocked

riptechnoblademinecraftkin

Thanks for a great description. I am amazed at the very human responses I get from 3.5. It even interpreted my multiple-sentence poetic analogy in perfect detail, understanding how each phrase was analogous to our topic. Truly amazing

tunahelpa

Sonnet 3.5 with 62% is at the level of a good amateur programmer

sephirothcloud

Wow, the car door notification chiming in the background is really actually shocking

charleslpayne

This thing passed the test "write 10 sentences that end with the word 'orange'".

anta-zjbw

Anthropic needs to work on their API pricing and the message limits in chat. It's annoying. GPT-4o is pumping out work 24/7 for relatively decent pricing, and it's much cheaper than Claude.

OscarTheStrategist

I've found from my tests that GPT-4o is still definitely the best for math questions. It gets them right more often, shows more of its work and shows it better, and the web UI for Claude doesn't seem to support LaTeX as well. For creative writing, I was expecting Claude 3.5 to be better, since Claude 3 Opus is very human, but I've noticed that when it comes to sounding human and creativity, Claude 3 Opus is still to this day better than GPT-4o and Claude 3.5 Sonnet. So I've found that this release is of course great because it's free, but if you're expecting a major super duper improvement or anything, it's not there. ChatGPT is still probably better for most situations simply because it has more features. However, the magnetic capabilities shown in many of their demo videos could change this and make Claude 3.5 better, but I don't have access to it yet, only the text model :(

pigeon_official

I'm absolutely shocked the video ended mid sentence.

fractal

Most relevant for RL are the HumanEval and GPQA benchmarks. Actually dope asf. Looks like the D riding is finally ending and labs are trying new ideas. You'd be surprised what type of performance gains you can get exploiting LOTS of test-time compute per prompt (emulates larger model output), filtering, and coupling with something like LiPO. Still a lot of easy hobblings out there as the kids say lol.

Cheers to them. Almost 60% on GPQA zero-shot is extremely impressive. I do hope companies include more revealing benchmarks. Considering error. HumanEval and a few other benchmarks used to promote model releases are almost completely saturated and damn near meaningless.

alexanderbrown-dgsy

Just tried it with a specialized prompt of mine and follow-up questions that no model could solve properly yet, not even GPT-4o. And every time I did this before, the result was that my worries about AI taking over humanity were eased. Let me tell you, I am worried now. This is no joke anymore. This is getting creepy. It begins now. And it's only June. And this is only their Sonnet model. Boy oh boy are we in for a ride.

peterkonrad

Subscribed to Pro instantly. The Artifacts feature is so useful for the makers among us. Upgrade of the year, in my view.

koen.mortier_fitchen

This entire experience is blowing my mind.

HexylvaniaFilms

I am critical of benchmarks these days, as benchmark data can accidentally be leaked into the training data of the models. One might better wait for the chatbot arena leaderboard to get a first hint of how good the model might be.

The model might be very good; however, one should also be careful when interpreting graphs with no tick marks, especially when an undefined quantity like 'intelligence' is presented on the y-axis. 😁

OmicronChannel

Really wasn't expecting this news. Indeed it is shocking.

elon--musk

I'm so shocked that I didn't even watch the video.

ZipZapTesla

Very very, pretty pretty and really really cool

robertonery
