How Large Language Models Work

Large language models, or LLMs, are a type of generative pre-trained transformer (GPT) that can create human-like text and code. There's a lot of talk about GPTs and LLMs lately, but they've actually been around for years! In this video, Martin Keen briefly explains what an LLM is, how LLMs relate to foundation models, and then covers how they work and how they can be used to address various business problems.

#llm #gpt #gpt3 #largelanguagemodel #watsonx #GenerativeAI #Foundationmodels
Comments

I don't know what is more impressive, LLMs or this guy's ability to write backwards perfectly.

mindofpaul

Very nice explanation, short and to the point without getting bogged down in detail that is often misunderstood. I will share this with others.

dennisash

Nicely done! You explain everything very clearly. This video is concise and informative. I will share with others as an excellent foundational resource for understanding LLMs.

surfercouple

Martin Keen is as awesome as he is natural. I love his talks, and somehow I owe to him my understanding of complicated subjects in AI.

saikatnextd

Great video presentation! Martin Keen delivers a superbly layman-friendly elucidation of what is otherwise very 'high-tech talk' to people like me who do not come from a tech-based professional background. This type of content is highly appreciated, and in fact motivates further learning on these subjects. Thank you IBM, Mr. Keen & team. Cheers to you all from Sri Lanka.

DilshanBoange

Really, really enjoyed this primer. Thank you, and great voice and enthusiasm!

KageManTV

Hey, nice job!!! Yeah, I'd like to see more of these kinds of subjects now and in the future as well!!!

Pontie

Very nice and crisp explanation. Love it. Thanks!

decryptifi

Great presentation, feels like a personal assistant. Great!

rappresent

tbh, I just love his voice and I'm ready to listen to all his videos 🤗

evgenii.panaite

IBM, big thanks to you for all these videos! These videos are really helpful.

dmitriyartemyev

I've liked and subscribed and done it again a thousand times in my mind

peterprogress

Large language models like GPT-3 work by using deep learning techniques, specifically a type of neural network called a transformer. Here's an overview of how they work:

1. **Data Collection**: Large language models are trained on vast amounts of text data from the internet, books, articles, and other sources. This data is used to teach the model about language patterns, grammar, syntax, semantics, and context.

2. **Tokenization**: The text data is tokenized, which means breaking it down into smaller units such as words, subwords, or characters. Each token is assigned a numerical representation.

3. **Training**: The model is trained using a process called self-supervised learning. During training, the model learns to predict the next word or token in a sequence based on the preceding context, adjusting its internal parameters (weights and biases) through backpropagation to minimize prediction errors (see the end-to-end sketch after this list).

4. **Transformer Architecture**: Large language models like GPT-3 use a transformer architecture, which is highly effective for handling sequential data like language. Transformers include attention mechanisms that allow the model to focus on relevant parts of the input sequence while generating output (a minimal attention sketch follows this list).

5. **Fine-Tuning**: After pre-training on a large dataset, language models can be fine-tuned on specific tasks or domains. This process involves additional training on a smaller dataset related to the target task, which helps the model specialize in that area.

6. **Inference**: Once trained, the language model can generate text by predicting the most likely next tokens given an input prompt. It uses the learned patterns and context from training to generate coherent and contextually relevant responses.

7. **Continual Learning**: Some language models support continual learning, which means they can be updated with new data over time to improve their performance and adapt to changing language patterns.
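
Here is a minimal sketch of the scaled dot-product attention that step 4 refers to, written in plain NumPy. The function name, the 4-token/8-dimension shapes, and the random inputs are illustrative assumptions, not code from GPT-3 or any real library:

```python
# Toy scaled dot-product attention: each query position scores every
# key position, softmaxes the scores, and takes a weighted average
# of the value vectors. Shapes here are illustrative only.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (tokens, tokens)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # (tokens, d_v)

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

Real transformers run many such attention heads in parallel and stack dozens of layers of them; the core weighted-average idea stays the same.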

Overall, large language models combine sophisticated neural network architectures, extensive training data, and advanced training techniques to understand and generate human-like text.
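
To make steps 2, 3, and 6 concrete, here is a hedged end-to-end sketch in PyTorch: a naive whitespace tokenizer, a tiny causal transformer trained with the next-token objective, and greedy generation. Everything here (the toy corpus, TinyLM, the hyperparameters) is an illustrative assumption, not how GPT-3 itself is tokenized, built, or trained:

```python
# Toy next-token language model -- a sketch of steps 2, 3, and 6,
# not GPT-3's actual tokenizer, architecture, or training setup.
import torch
import torch.nn as nn

text = "to be or not to be that is the question"

# Step 2: tokenization -- naive whitespace tokens mapped to integer ids
# (real LLMs use subword tokenizers such as BPE).
vocab = {w: i for i, w in enumerate(sorted(set(text.split())))}
ids = torch.tensor([vocab[w] for w in text.split()])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, d_model=32, max_len=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, nhead=4,
                                           dim_feedforward=64,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x):
        n = x.size(1)
        # Causal mask: each position may only attend to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(n)
        h = self.embed(x) + self.pos(torch.arange(n))
        return self.head(self.encoder(h, mask=mask))

torch.manual_seed(0)
model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)  # shifted by one token

# Step 3: training -- minimize cross-entropy of the next token
# (backpropagation adjusts the weights and biases).
for _ in range(200):
    loss = nn.functional.cross_entropy(model(x).transpose(1, 2), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Step 6: inference -- greedily extend a prompt one token at a time.
model.eval()
inv = {i: w for w, i in vocab.items()}
seq = ids[:2].unsqueeze(0)  # prompt: "to be"
with torch.no_grad():
    for _ in range(5):
        next_id = model(seq)[0, -1].argmax()
        seq = torch.cat([seq, next_id.view(1, 1)], dim=1)
print(" ".join(inv[int(t)] for t in seq[0]))
```

Real LLMs differ mainly in scale and detail: subword tokenizers instead of whitespace splitting, billions of parameters instead of a few thousand, decoder-only stacks, and sampling strategies such as temperature or top-p instead of greedy argmax.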

ChatGPt

Thank you for posting this video. What other architectures are available apart from the transformer?

SatishDevana

Can subsequent SFT and RLHF with different, additional, or reduced content change the character of a GPT model, improve it, or degrade it?

amparoconsuelo

Lol. I only knew Martin Keen from Brulosophy. This is sort of mind-blowing.

NicholasDWilson

Slight correction: the term 'large' does not refer to large data; to be precise, it is the number of parameters that is large.

CyberEnlightener

Very nice explanation. Are these foundation models proprietary? How many foundation models exist?

vainukulkarni

I got a remote job offer. The duty is AI training for an LLM.
Shall I go for it? What do you think?

korgond

Imagine a world where Wikipedia no longer needs human contributors. You just upload the source material, and an algorithm writes the articles and all the sub-pages, listing everything it knows about a certain fictional character because it read the entire book series in half a second. Imagine having a conversation with the world's most eminent Star Wars expert.

kevnar