Zipf's Law

Do most words in a corpus occur with average frequency? Absolutely not! This video discusses a surprising regularity about word frequencies in corpora. And at the end, we'll make a trip to Hogwarts and see if Zipf's Law also applies in the world of wizards.
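
For readers who want to try the idea before opening the word lists: Zipf's Law says that a word's frequency is roughly inversely proportional to its frequency rank, so the product rank × frequency should stay roughly constant across a large corpus. The following is only a minimal sketch, not the method used in the video. The function name `rank_frequency` and the simple regular-expression tokenizer are assumptions made for illustration.

```python
from collections import Counter
import re

def rank_frequency(text):
    """Return (rank, word, count, rank * count) rows, most frequent word first.

    Hypothetical helper: the tokenizer is a crude assumption that keeps
    lowercase letters and apostrophes only.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    rows = []
    # most_common() sorts by count, highest first; rank starts at 1
    for rank, (word, count) in enumerate(counts.most_common(), start=1):
        rows.append((rank, word, count, rank * count))
    return rows

# Toy input only; the regularity shows up in large corpora, not in one sentence.
sample = "the cat and the dog and the bird"
for row in rank_frequency(sample):
    print(row)
```

On a real corpus, pass the full text to the function and compare the last value of each row across ranks to see how far the product drifts.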

If you want to follow along, here are the word list files:

Request the Potter corpus:

Vsauce on Zipf:
Comments
Author

I LOVE that you've linked to Michael Stevens' video. I'm playing around with predictive language models and I'm really happy you're talking about WORD TOKENS in this video!

LeonaDarkwind
Author

Hello :) I watched your abralin talk live on Wednesday. I study generative syntax, and I was very inspired by your discussion of negative evidence in the Q&A session! Thank you for all the wonderful videos!

yingyusu
Author

I've never seen the normal distribution explained so clearly and in such an easy-to-understand way.

Mustafghan
Author

Thanks for that extensive video! It added great value to my master's thesis. Even though I'm dealing with distributions in geographical data, it was a great and easy way to understand Zipf's law.

BassmanTh
Author

Martin, Zipf's law makes me wonder about the value of MI scores. Not that they aren't meaningful, but when you review collocation results for a word, you find that MI seems to have nothing to do with absolute frequency; mutual attraction just continues to exert its pull regardless of frequency. Collocation is a function of context, and it's the frequency of contexts that varies, analogous to the way certain climatic circumstances can promote the health of, say, vegetation and insects. Plug "miserable" into COCA and you get "creature" at rank 15 with an MI of 7.38 after a long line of MIs in the 3.0 range, because "miserable creature" is a construction that occurs on certain rhetorical occasions. Am I overthinking this?

TheRealGnolti
Author

I'm doing a project on this same thing. Would there be any chance for me to get in contact with you for a possible interview? Awesome video, by the way.

thinaradesilva
Author

Excellent video. You teach excellently, your students must be happy with you.

shamsuddeenhassanmuhammad
Author

Hi Martin, thank you for the wonderful and very helpful video. I am applying Zipf's law to my task of creating a dictionary of words that are specific to a particular category. However, I wonder whether I could use the curve to determine a threshold for the most significant words in the dictionary. For instance, could I use the intercept to determine this?

cidiladamourasemedo
Author

Have you ever tried plotting the product "position × n"? It would be interesting to see how much it varies. (If it was in the video, I missed it.)

Pakanahymni
Author

Hi, thank you for your wonderful videos.
Does this law hold true for words uttered or written by non-native speakers of a language? Or for words uttered by children before they have mastered the language?

carolynknight
Author

But what if you make a language with "aaa" before every word? Does Zipf's law apply then?

Temerold_se
Author

Thank you for the video :D I'm trying to install the newest version of AntConc on a Mac, but it can't be opened because "Apple cannot check it for malicious software." Also, when I forced it to open, there was no way to open files in it. Is there any way to fix these problems?

Melnish
Author

Hi, thank you. I will follow all of your videos.

duck