NEW OpenAI Reinforcement Fine-Tuning! (12 Days of OpenAI)

Today OpenAI showcased their new Reinforcement Fine-Tuning feature!

⌚Chapters:
0:00 - OpenAI Reinforcement Fine-Tuning
2:16 - Free AI Community
3:07 - Recap of Reinforcement Learning Demo
11:00 - Understand Reinforcement Fine-Tuning
16:35 - Our Free AI Community

#chatgpt #openai
Comments

Thanks for sharing your knowledge on this Drake. Excellent information.

MJFUYT

Your voice and easy explanation are easier to follow than the actual OpenAI videos!!

paragbharadia

I think the key here is the reward model learned from graders? Since these are autonomous agents, it is likely that a powerful model like o1 is used as a grader with access to the ground truth, so it can reason and guide o1-mini (or whichever smaller model we are ReFT-ing) by generating a positive, negative, or potentially fractional reward. I wonder how this kind of solution differs from other agentic solutions that self-criticize and "think".

deepak_babu
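
To make the "fractional reward" idea in the comment above concrete, here is a minimal sketch of a grader, assuming the model returns a ranked list of candidate answers and the grader knows the ground truth. The function name `grade_ranked_answer` and the reciprocal-rank scoring rule are illustrative assumptions, not OpenAI's actual grader.

```python
from typing import Sequence


def grade_ranked_answer(ranked_candidates: Sequence[str], ground_truth: str) -> float:
    """Hypothetical grader: return a reward in [0, 1].

    1.0 if the correct answer is ranked first, 1/rank if it appears
    lower in the list, and 0.0 if it is missing entirely.
    """
    target = ground_truth.strip().lower()
    for position, candidate in enumerate(ranked_candidates, start=1):
        if candidate.strip().lower() == target:
            return 1.0 / position  # partial credit for lower ranks
    return 0.0  # correct answer not proposed at all


if __name__ == "__main__":
    # Correct answer ranked second, so the reward is fractional.
    print(grade_ranked_answer(["answer_a", "answer_b", "answer_c"], "answer_b"))
```

A score like this could then serve as the reward signal during fine-tuning: outputs that rank the right answer higher are reinforced more strongly than those that merely include it.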

You are doing great 🥰🥰 watching you from the beginning 👏👏🥰🥰🙏🙏

MrBoxsoumendu

Exciting times! This is going to be huge.

ProductiveDude