🤬 How the #@%$! Do You Use an LLM in a SaaS Platform?


In this video, I’ll walk you through how I’m using LLMs in a SaaS platform that I recently launched called Learntail. Learntail is an easy-to-use AI-powered quiz-generating tool! 🤖

🔖 Chapters:
0:00 Intro
2:43 Langchain in Learntail
11:32 Outro

#arjancodes #softwaredesign #python
Comments

I am super excited about this series! There is no other content like this that goes beyond toy LLM applications and actually talks about the technical complexity involved in creating an LLM SaaS application. Can't wait for the next video :)

bassimeledath

I can't wait for this series. This is new; I don't think there is a single company that has launched a product and then explained how it was developed. You are definitely giving value to the developer community. Thanks

DanielRodriguez-luuu

Very cool thought process. I'm on the same journey, only a much more novice programmer. I'd love to see a video on how you handle connecting to the API and how you handle timeouts and failed attempts. I haven't even learned how the async thing works yet, so it looked like magic :) I'd also love to learn how to use pydantic in this context.

One thought: why try to generate the entire quiz in one pass? When humans write exams, it's very iterative and requires awareness of the skill level of the student, the purpose of the course, and how to craft questions that match the skill level, mitigate surface-level clues (i.e., long text = correct answer), and randomize the question order and answer order. A quiz seems more suited to function calling and non-GPT steps, repeated a number of times until the quiz is upgraded to a high-quality product.

Also, creating a quiz might benefit from a grading rubric that is fleshed out first. When a student signs up for the "101" course on a topic, they are implicitly identifying that they know little to nothing about the topic and need a broad overview, and the exams reflect that. Normally, a teacher defines the learning objectives on day 1. So maybe as part of the student onboarding when making an exam, you have them describe their intentions and learning objectives and then select a boilerplate rubric you can pass to the LLM: the student is a novice, the student wants a broad overview of the topic, we need to focus on terms and definitions, terms and definitions are suitable for multiple-choice questions, and to help differentiate student performance, we need 5-10 questions that are more sophisticated but still fall within the rubric...
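The iterative, rubric-driven approach described above can be sketched as a simple generate-critique loop. This is an illustrative control-flow skeleton only, not Learntail's actual code: `generate` and `critique` are hypothetical stand-ins for LLM calls (e.g. via function calling), passed in as plain callables so the loop can run without an API key.

```python
def refine_quiz(generate, critique, rubric, max_rounds=3):
    """Draft a quiz, then repeatedly ask a critic to score it against
    the rubric, revising until it passes or we run out of rounds.

    generate(rubric, feedback) -> quiz  (an LLM call in practice)
    critique(quiz, rubric) -> (passed: bool, feedback: str)
    """
    quiz = generate(rubric, feedback=None)
    for _ in range(max_rounds):
        passed, feedback = critique(quiz, rubric)
        if passed:
            break
        # Feed the critic's notes back into the next draft.
        quiz = generate(rubric, feedback=feedback)
    return quiz
```

The point of the indirection is that each round is a cheap, focused call rather than one giant "write the whole quiz" prompt.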

EmilioGagliardi

Some of the best development content on YouTube! Congrats!!

Malbao

Arjan, thanks for sharing this. This kind of content is really precious, since one can find high-level insights and struggles on how to build a SaaS app, which is especially interesting to me as a former SaaS CTO. Btw, I like the lights in the background; they look really Dutch-cozy and remind me of good times when I lived in the Netherlands :)

edward

Loved the video.

Are you not worried about the costs associated with OpenAI? Are you exploring Local LLMs? Would love you to do a video on Local LLMs if you are. 

Would it ever make sense to train a small model yourself to extract the relevant parts of the body from an HTML page? That way you would be able to use just that model for extracting text and (hopefully) control the response time.

BTW the duration of the video is just right. 10-12 mins.

aseemasthana

I love the concept of you explaining to us how to build out a proper project ♥ thanks man!

obsidiansiriusblackheart

Two thoughts: 1) I hit all the same problems as you regarding web scraping; it's good to hear I'm not the only one. I actually started a service that simply returns a best guess at the body text for a URL and handles YouTube transcripts too. 2) If you're looking to create programming tutorials, unless it's a recent feature, why not ask for the quiz directly from the LLM and skip the middleman? It likely already has programming knowledge for many of the features you're trying to train it on. It may even be able to look up newer content, as it does with Bing in Plus, if that's added to the API at some point in the future.

IanWootten

Exciting series indeed! :) I'm also working with LLMs & LangChain atm, and these are the things that would be really interesting to see worked on:
- LLM/agent evaluation and Python testing (as mentioned, especially for temperature > 0) -- helps ensure accuracy
- "risky" agents like Python/SQL agents
- optimizing for context windows (as mentioned, concurrent/parallel LLM calls / agent steps) -- helps with scalability
- extending the base LangChain agent to create a custom one that can handle more complex use cases
- fine-tuning GPT or smaller local models
- human-in-the-loop / user-LLM interaction loops to drive better results
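The first bullet — testing model output when temperature > 0 — is usually approached by sampling the same prompt several times and collecting failures instead of stopping at the first one. A minimal sketch, assuming the quiz comes back as JSON; `sample` is a hypothetical callable wrapping the actual model call:

```python
import json

def check_quiz_json(raw: str) -> None:
    """Hard assertions on one sampled completion."""
    quiz = json.loads(raw)  # must be valid JSON at all
    assert quiz["questions"], "quiz has no questions"
    for q in quiz["questions"]:
        assert q["answer"] in q["options"], "answer not among the options"

def run_prompt_eval(sample, n=5):
    """Call the (stochastic) model n times and collect failures,
    since with temperature > 0 every run can differ."""
    failures = []
    for i in range(n):
        try:
            check_quiz_json(sample())
        except (AssertionError, KeyError, json.JSONDecodeError) as e:
            failures.append((i, str(e)))
    return failures
```

A test suite can then assert that the failure rate stays below some threshold rather than demanding perfection on every sample.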

zactamzhermin

Cost. How do you see the conversation about token usage between the devs and higher-ups going when building these products?

d_b_

++ for the idea of a prompt testing library. But wouldn't that library need an LLM itself to perform the tests? 😅 I don't see how you could "hard-code" the checking process without it.
Do you see things differently?
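For what it's worth, a fair amount of the checking *can* be hard-coded without an LLM, as long as the output has structure: question counts, option uniqueness, and answer consistency are all plain assertions. A sketch of such a validator — the quiz schema here is an assumption for illustration, not Learntail's actual format:

```python
def validate_quiz(quiz: dict) -> list[str]:
    """Return a list of problems; an empty list means the quiz passed."""
    errors = []
    questions = quiz.get("questions", [])
    if not questions:
        errors.append("quiz contains no questions")
    for i, q in enumerate(questions):
        options = q.get("options", [])
        if len(options) < 2:
            errors.append(f"question {i}: fewer than 2 options")
        if len(set(options)) != len(options):
            errors.append(f"question {i}: duplicate options")
        if q.get("answer") not in options:
            errors.append(f"question {i}: answer is not one of the options")
    return errors
```

What rule-based checks can't judge is whether the questions are actually *good* — that part probably does need an LLM (or a human) in the loop.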

selimrbd

Hey Arjan, at 10:40 you mentioned a prompt testing library. This sounds awesome and incredibly useful for the community at large. I think something like the AutoGen library may be able to do it.

You may be able to create a prompt-generator agent that defines the prompt, and a critic agent that gauges how well your prompt adheres to a certain set of conditions you set up.

You may even have another agent, a quiz-taker agent, that acts as the student for a given subject and provides feedback on whether the generated prompt/quiz produces outputs that drive meaningful understanding of the given context.

Scaling the above could blow up, but constraints can be set. AutoGen also allows for human-in-the-loop participation.

I hope to see something like the prompt testing library integrated seamlessly in the future, so we can gain the rewards it provides over a single-prompt AI conversation.

DisloyalASP

Hi Arjan. Consider using the readability package to get the contents of a URL. It will do the heavy lifting for you, and you won't have to wrangle with BS4 as much.

realmac

For testing the LLM output, maybe not exactly applicable to your case, but have you checked the RAGAS library or ChainForge?

nourelmawass

Just a thought: you could use GPT to summarize the output from BeautifulSoup. I'd imagine that's a nice way to clean up your data and at the same time provide better input for your next step, because generally I guess you want quiz questions that relate to the 'main' ideas of the sample text. The downside is indeed that it's slow, although with GPT-4 you might even be able to do this in a single prompt? Or set the instruction as a system message before the actual data arrives?
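Setting the instruction as a system message, as suggested here, looks roughly like this with the OpenAI chat API. The prompt wording and model name are illustrative assumptions, and the API call itself is kept in a separate function since it needs an API key:

```python
def build_messages(page_text: str) -> list[dict]:
    """Put the instruction in a system message, ahead of the scraped text."""
    return [
        {"role": "system",
         "content": "Summarize the main ideas of the following page text "
                    "in a few sentences, ignoring navigation, ads, and "
                    "other boilerplate."},
        {"role": "user", "content": page_text},
    ]

def summarize(page_text: str) -> str:
    # Requires `pip install openai` and OPENAI_API_KEY in the environment.
    from openai import OpenAI
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",          # model choice is an assumption
        messages=build_messages(page_text),
    )
    return resp.choices[0].message.content
```

Keeping the instruction in the system role means the scraped text never has to be mixed into the prompt wording itself.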

Zaklamp

Arjan, it would be cool to see a video from you about DDD in Python development. I already posted this video request in the Discord server, but I'm just doubling my chances here 😂

Sheihesinusslon

Thank you. Very nice content.
By the way, are you using the conversational (chat) or completions endpoints? GPT-3.5-turbo (chat) is, imho, pretty dumb as far as following instructions goes. What you could ask GPT-4 to do at once, you have to break into simpler steps for 3.5.
Another one: how much mileage does LangChain give you? I am a bit skeptical of the over-complexity it brings...

ivanherreros

Definitely excited about this - and will play around for sure.
I've built an MVP of an idea I had around communication (sadly I'm a data scientist, so learning JS and spinning it all up was hard for me :P), using Python + ChatGPT in the backend, and I definitely feel a lot of your pain (particularly around "good" vs "bad" responses, or easy, medium, and hard questions in your case).
I've always thought that data is going to be king, so my second MVP page focused on creating a data feedback loop that I thought was still fun and engaging for the user, but that helped label data for me to train a model in the future (or fine-tune an LLM).
P.S. I left it running on an AWS compute instance for 3 weeks and already burnt through my year's free-trial credit 😢 so it's currently offline :P
Excited to see more!... Interested in any help?

alexjenkins

Thanks for sharing this! I found it very interesting, as we are working with similar challenges and possibilities at our company. Good video!

Arrowtake

This is exactly what happened to me. I'm still struggling to fix speed issues, thinking I'm stupid or something, but the reality is that it's not easy, especially in my case as a solo dev.

picassoofai