Language Model Evaluation and Perplexity

Transcript:
In this video, I'll show you how to evaluate a language model. The metric for this is called perplexity, and I will explain what it is. First, you'll divide the text corpus into train, validation, and test data; then you will dive into the concept of perplexity, an important metric used to evaluate language models. So, how can you tell how well your language model is performing? Recall from the previous videos that a language model assigns a probability to each sentence. The model was trained on the corpus, so for the training sentences it may assign very high probabilities. You should therefore first split the corpus so that some testing and validation data are not used during training. As you may have done in other machine learning projects, you'll create the following splits of training, validation, and test sets. The training set is used to train your model. The validation set is used for things like tuning hyperparameters, and the test set is held out for the end, where you test the model once and get a score that reflects how well it performs on unseen data.
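A minimal sketch (not from the video) of these two steps in Python: a helper that shuffles a corpus and splits it into train/validation/test sets, and a perplexity function computed from per-word log probabilities. The split ratios, the seed, and the numbers in the usage example at the end are illustrative assumptions.

import math
import random

def train_val_test_split(sentences, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle the corpus and split it into train/validation/test sets.
    The 80/10/10 default ratios here are an assumed, common choice."""
    sentences = sentences[:]  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(sentences)
    n = len(sentences)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = sentences[:n_test]
    val = sentences[n_test:n_test + n_val]
    train = sentences[n_test + n_val:]
    return train, val, test

def perplexity(word_log_probs, total_words):
    """Perplexity = exp(-(1/N) * sum of natural-log word probabilities),
    i.e. the inverse probability of the test set normalized by the number
    of words N. Lower perplexity means the model finds the data more
    predictable."""
    return math.exp(-sum(word_log_probs) / total_words)

# Toy usage: a model assigning log P = -2.0 to each of 100 test words
# has perplexity e^2 ≈ 7.39.
print(perplexity([-2.0] * 100, total_words=100))

Note that perplexity is computed on the held-out test set, not the training data, for exactly the reason the transcript gives: the model may assign inflated probabilities to sentences it was trained on.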