Pandas DataFrame Agent... the future of data analysis?

Let's dive into the Pandas DataFrame Agent from the LangChain library to see how we can integrate analytical capabilities into LLM apps. We use the OpenAI API to ask questions about an Excel/CSV dataset and experiment with the possibilities and limitations of this LangChain Toolkit.

🔗 Links

👋🏻 About Me
Hey there, my name is @daveebbelaar and I work as a freelance data scientist and run a company called Datalumina. You've stumbled upon my YouTube channel, where I give away all my secrets when it comes to working with data. I'm not here to sell you any data course — everything you need is right here on YouTube. Making videos is my passion, and I've been doing it for 18 years.

Comments

About the calculations: have you tried this prompt?
- "Act as an expert mathematician. <prompt where AI needs to do calculations>. Explain this step by step." (Those last words are sometimes required.)
I've read about this workaround for making the AI self-correct before it responds. Happy to watch you update and review with the new stuff. Nice content, sir!

RanaGustico

Every day I am more impressed by the potential of LLMs with LangChain. I am a fan of knowledge; thank you for your content.

camilocampos

Hi Dave, thank you very much for the excellent explanation. Would you please do a video where you run into the token limit of the LLM? I would like to see how to overcome it. Thanks in advance!

joseluisbeltramone

@daveebbelaar any plans to update this for LangChain 0.1.0? Maybe in the members' area?

MikeRhodesIdeas

Thank you. Nice video. Do you know if you can summarize text within a cell in the DataFrame? Say you have a dataset that includes blog posts and you want a new column with a two-line summary of each. Ideas?

DK-dpkk

How do I put this sort of application on a website, so that I can upload my own data, enter a prompt, and have the result displayed on the site?

tommyharlim

Great video. The two-DataFrame part was interesting. I was hoping I could pass in a summary DataFrame and a raw DataFrame, tell the LLM what is in each one, and then ask it to write an article using both: "Write an article on this month's results (which are in the summary DataFrame), and don't forget to mention some interesting related facts from the raw DataFrame." This would require it to join the DataFrames together.

Do you think this is possible yet? I see lots of "ChatGPT with your database" examples, but I'm curious how it can work with multiple tables of data.

AwB
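
One way to approach the question above, without relying on the agent to work out the join on its own, is to merge the two tables in pandas first and hand over a single DataFrame. The sketch below is only an illustration: the column names (`month`, `product`, `revenue`, `total_revenue`) and all values are hypothetical, and it does not show whether the agent can then write a good article from the result.

```python
import pandas as pd

# Hypothetical raw table: one row per product per month (values invented).
raw = pd.DataFrame(
    {
        "month": ["2023-05", "2023-05", "2023-06", "2023-06"],
        "product": ["A", "B", "A", "B"],
        "revenue": [100, 150, 120, 90],
    }
)

# Hypothetical summary table: one row per month.
summary = pd.DataFrame(
    {
        "month": ["2023-05", "2023-06"],
        "total_revenue": [250, 210],
    }
)

# Left-join the monthly totals onto every raw row. validate="many_to_one"
# raises an error if the summary table has duplicate months.
combined = raw.merge(summary, on="month", how="left", validate="many_to_one")

# A derived column that relates each raw row to its monthly summary.
combined["share_of_month"] = combined["revenue"] / combined["total_revenue"]

print(combined)
```

The merged table keeps every raw row and repeats the monthly total next to it, so a question about "this month's results" and a question about individual rows can both be answered from one DataFrame.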

I think using the memory component from LangChain will help overcome the memory-management bottleneck for operations that require more than one step.

kumargaurav

It's interesting to play with. I've tried it out multiple times, but I do see its limitations. Sometimes it also outputs wrong answers. What, in your opinion, would it take for it to be production-ready?

prateekkeshari

I have been looking for a chain or agent that can work with tools and with your own files as well, but I couldn't find one. Is this even possible?

waddaa

Would it be more accurate if you added the Wolfram OpenAI plugin to the mix?

onangarodney

I am building a Streamlit app with the Pandas DataFrame Agent, and for the life of me, I cannot get the chatbot to keep any memory of the chat context. Is there a tutorial where you cover this?

JT-Works

Nice, but did you try that with the chat models (ChatOpenAI) and gpt-3.5-turbo, which is much cheaper? I think the Pandas DataFrame agent will not work properly, though!

HazemAzim

Just an idea, a video using the new function feature would be great. ;-)

Canna_Science_and_Technology

Awesome video. Can you do this with Node.js?

xanderklein

Can this work on big DataFrames, say 1 million rows of data?

madhuful

I've actually looked at this dataset before, and one thing I noticed was that the agent made another error at 11:30. It found the median salary using the salary column and not the salary_in_usd column. For example, the Head of Machine Learning role only had a single person, who lived in India, so converting 6,000,000 Indian rupees ends up being only 76k USD, far from what the results show. While the agent is very powerful, it's clearly not perfect, and you have to make sure the questions you provide are specific enough and double-check the code it produces. Regardless, great video, and it's definitely a tool I'll look to use in later projects!

quickandsmart
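
The pitfall described in the comment above can be checked by hand in pandas. The sketch below is a minimal illustration on a toy table, not the dataset from the video: only the column names `salary` and `salary_in_usd` and the 6,000,000 rupees / 76k USD pair come from the comment, while `job_title`, `salary_currency`, and the other rows are assumptions made up for the example.

```python
import pandas as pd

# Toy data, invented for illustration. The first row mirrors the case in
# the comment: a salary stored in Indian rupees next to its USD value.
df = pd.DataFrame(
    {
        "job_title": [
            "Head of Machine Learning",
            "Data Scientist",
            "Data Scientist",
        ],
        "salary": [6_000_000, 100_000, 120_000],
        "salary_currency": ["INR", "USD", "USD"],
        "salary_in_usd": [76_000, 100_000, 120_000],
    }
)

# Median of the raw column mixes currencies, so the ranking is misleading.
raw_median = df.groupby("job_title")["salary"].median()

# Median of the converted column compares like with like.
usd_median = df.groupby("job_title")["salary_in_usd"].median()

comparison = pd.DataFrame(
    {"median_salary_raw": raw_median, "median_salary_usd": usd_median}
)
print(comparison.sort_values("median_salary_raw", ascending=False))
```

Running both aggregations side by side makes the discrepancy visible: the role that tops the raw ranking is not the one that tops the USD ranking. Naming the exact column in the prompt, and reading the code the agent generates, are the two checks the comment recommends.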

My suggestion for YouTube: make the video shorter. Your voice is great as a background track, but add more info to the video, which adds value to the viewing time. 😊

gamerwager

Hi Dave, please, can I use an open-source model for this instead of OpenAI?

ibqmehq

I note that you again use text-davinci, which OpenAI claims is just a slower and more expensive way of getting what GPT-3.5 gives you for a fraction of the price.
Have you found differently in real use?

johnbrisbin