Can OpenAI's o1 solve complex medical problems?

preview_player
Показать описание
First thoughts and preliminary insights into OpenAI's GPT o1 Strawberry in the medical domain, with some expected and unexpected findings. We have a "bake off" between o1 and Doc to demonstrate how o1 fares with tricky medical scenarios

Disclaimer - obviously don't use AI to diagnose or treat your medical problems, if you are unwell please seek a medical professional (AI isn't good enough just yet :)).

👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)

00:00 start + highlights
1:28 intro, what is GPTo1
5:18 what is "reasoning" in o1
12:38 Benchmarks- o1's successes and failures
24:07 O1 and doctor bake off!
24:21 The pregnancy acid test for LLMs
26:23 clinical coding
30:06 Tricky patient scenarios
32:25 opioid dose conversions
Рекомендации по теме
Комментарии
Автор

A tip: start a new chat for each of the questions. It will likely respond better then, as it uses all of the previous questions as context, and quite heavily so.

satioOeinas
Автор

I created a simple website, with anthropic API keys, took me a couple of hours. You enter a patients information, their history, and symptoms, it returns possible diagnosis(s) and a patient specific treatment plan.

My cousin who is in med school stress tested it and she was like omg how did you make this, its amazing, and I was like its just a wrapper haha

arnavprakash
Автор

Great talk. Interesting to see how AI is helping the wider medtech industry.

Just a small tip. Always try to use fresh sessions when asking unrelated questions. Us humans have a remarkable ability to ignore the past and move on to the next problem in the set, but LLMs will analyse the entire history prior to marking them as irrelevant (even with the initial message indicating that it's a quiz). As a result, accuracy and precision drops the deeper you go into the conversation.

bombala
Автор

Great video! I’m also excited for o1. I gave it 350 records to sort and analyze, and it did the same work in 20 seconds what would have taken me 3 hours in Excel. Very impressive

jd_real
Автор

The full model should be arriving next month, would be interesting to give it even harder tests.

ShpanMan
Автор

The problem with ARC puzzle Is that substantially It Is a visually reasoning task. When you translate It into a matrix you are not testing the same thing as for humans. I think LLM will only get better at this task improving the vision capabilities, not only the reasoning ones. And with this I won 1 million dollars :-)

LucaCrisciOfficial
Автор

Also, many people were accomplishing this "reasoning" by using RAG processes and making multiple api calls to both hold the model's hand through the reasoning process and also as a way to confirm results. Supposedly, much of this won't be necessary, if it delivers on its promises.

I'd like to see the model come back with requests for clarification or additional information.

sevilnatas
Автор

It is nice to see real experts tests this model and not relay on OpenAI internal testing or random tech youtuber.

chickendinner
Автор

Still waiting for full version of gpt o1

AlfarrisiMuammar
Автор

I don't get what the apparent error was in the last case (?). It stated it as "approximate" after all 🤔

djayjp
Автор

Minor pedantic correction: it isn't OpenAI GPT o1, it's just OpenAI o1. Sam doesn't like the name GPT. The o1 series is a fresh start without the GPT.

human_shaped
Автор

Slight correction, it's not called GPT o1 just Open AI o1. But great content and very scary.

EamonnMooney
Автор

wait till next year, this is just the start.

michaelhartjen
Автор

audio wouldnt work for this models, unless people want to wait for an answer.

andreaskrbyravn
Автор

Proprietary and Open Source is not supposed to exist at the same time. The origin of OpenAI was as a non-profit and supposed to be both open source, safe and benefiting society. Seems that is all out the window now.

sevilnatas
Автор

Well, I don't know what to say, other than I am glad I have no children. I feel sorry for those who do. Looks like a cold and dark "future" for them

xXstevilleXx
Автор

it will aid the depop movement greatly. get your heads out of the sand

standingbear