Synthetic DATA Generation using LANGCHAIN 🦜️🔗

In this video, I will show you how to create synthetic data using LangChain and OpenAI models.

Synthetic data refers to artificially generated data that imitates the characteristics of real data without containing any information from actual individuals or entities. It is typically created through mathematical models, algorithms, or other data generation techniques. Synthetic data can be used for a variety of purposes, including testing, research, and training machine learning models, while preserving privacy and security.
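
As a rough illustration of the "algorithms" route described above, here is a minimal sketch that produces synthetic billing-style records with Python's standard library only. It is not the LangChain and OpenAI approach shown in the video. The `BillingRecord` schema, its field names, the code lists, and the value ranges are all assumptions made up for this example.

```python
import random
from dataclasses import dataclass, asdict

# Hypothetical schema for illustration only; the field names are
# assumptions and are not taken from the video.
@dataclass
class BillingRecord:
    patient_id: int
    procedure_code: str
    total_charge: float
    insurance_status: str

# Made-up lookup values; a real dataset would define its own.
PROCEDURE_CODES = ["PROC-001", "PROC-002", "PROC-003"]
INSURANCE_STATUSES = ["paid", "pending", "denied", "uninsured"]


def generate_records(n, seed=0):
    """Return n synthetic records drawn from simple random distributions."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    records = []
    for i in range(n):
        records.append(
            BillingRecord(
                patient_id=100000 + i,  # sequential, so unique by construction
                procedure_code=rng.choice(PROCEDURE_CODES),
                total_charge=round(rng.uniform(50.0, 5000.0), 2),
                insurance_status=rng.choice(INSURANCE_STATUSES),
            )
        )
    return records


def is_valid(record):
    """Check that a record respects the lookup lists and the charge range."""
    return (
        record.procedure_code in PROCEDURE_CODES
        and record.insurance_status in INSURANCE_STATUSES
        and 50.0 <= record.total_charge <= 5000.0
    )


if __name__ == "__main__":
    for record in generate_records(3):
        print(asdict(record))
```

No record here is derived from a real person, which is the privacy property the description refers to. A validation step such as `is_valid` is worth keeping whichever generator is used, since generated values are not guaranteed to respect lookup lists or range constraints.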

Happy Learning 😎


#langchain #llm #synthetic #syntheticdata #datasciencebasics
Comments
Author

This is amazing! Could you please make a more comprehensive version of this that uses real data as an example? (It doesn't have to be medical, just something that lets us see the full procedure.)

seththunder
Author

Thanks! That's a very practical use case. Can you make a full-scale video?

hadikhantec
Author

I'm getting this error: Cannot generate a JsonSchema for ({'type': 'with-info', 'function': <bound method BaseModel.validate of <class '__main__.MedicalBilling'>>})

saivihari
Author

Great tutorial! Is there any open-source implementation available of this approach?

nasiksami
Author

I saw your video about fine-tuning Llama 2 on your own data. Could you please make a similar video on fine-tuning Zephyr or Mistral 7B on Google Colab using Abhishek Thakur's AutoTrain, and then show how to use that fine-tuned model?

devyanshrastogi
Author

Hello,
How can I generate data when there are two tables with a primary key (PK) and foreign key (FK) relationship? Is the model capable of generating data that preserves such a relation?

teja
Author

Good content, very helpful. Could you advise?

If we check the statistical correlation between the real and synthetic data, would it be above 90%?

prashantt
Author

Interesting video 👍 I'm curious: if a field is a lookup value with only 4 possible values, are the generated values still valid? Also, if a field is produced by some algorithm, for example a bank account number, does it still pass the check constraint for that field when generated from the few-shot examples? And can this also be done with an open-source LLM?

henkhbit
Author

It's not working when I use AzureChatOpenAI instead of ChatOpenAI. Any idea why?

Shubhknsha
Author

Hi, good video. Can we use LangChain for multi-table data generation with referential integrity?

shitaldhakne-bp
Author

Could you let me know which versions of openai and LangChain were used in this video?

sebiraj
Author

Could you suggest which other open-source models we can use to generate synthetic data?

ankitjain
Author

Which framework is best for enterprise applications, Haystack or LangChain?

orlandocastellanos
Author

I built an application using the same code, but I'm getting an output parser error when passing sample data to the LangChain library.

harshadahadawale
Author

Can we generate a larger dataset (>1000) using this?

gamevint
Author

Please share the approach for synthetic data generation using Azure OpenAI, as I have an Azure OpenAI key.

sivaprasadatla
Author

I am unable to create two-level nested JSON using this example. Can anyone help?
