Cosine Similarity - Natural Language Processing - Socratica

Introducing Socratica COURSES
Cosine Similarity is a way to compare two pieces of text (docs) to see how similar they are stylistically. This is a useful technique from Natural Language Processing, a growing subfield of AI & Machine Learning. In this lesson, we review how to use the bag of words technique to turn a piece of text into a vector, then show how the 'cosine similarity' measure is a useful way to compare two docs. As a concrete application, we compare 10 different classic novels from different authors and time periods to see how well the cosine similarity measure performs.
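As a rough illustration of the two steps described above, here is a minimal Python sketch (not the code from the video). It turns each text into a bag-of-words count vector and then computes the cosine of the angle between two such vectors, that is, the dot product divided by the product of the vector lengths. The function names `bag_of_words` and `cosine_similarity`, the simple regular-expression tokenizer, and the sample sentences are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, pull out the words, and count each one."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    # Words missing from b count as 0, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty texts
    return dot / (norm_a * norm_b)


# Hypothetical sample texts, standing in for whole documents.
doc1 = bag_of_words("the cat sat on the mat")
doc2 = bag_of_words("the cat lay on the rug")
doc3 = bag_of_words("stock prices fell sharply today")

print(cosine_similarity(doc1, doc2))  # close to 0.75: several shared words
print(cosine_similarity(doc1, doc3))  # 0.0: no words in common
```

Because word counts are never negative, the score always falls between 0 (no shared words) and 1 (identical word proportions), and it does not depend on how long either text is.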
You can jump to sections of the video here:
0:00 Intro
0:48 Prerequisites
1:43 The Big Idea
3:39 Cosine Similarity
4:42 Example Setup
5:47 The Books
6:51 Building a Feature Vector
8:56 Writing the Functions
10:08 Computing Cosine Similarities
11:30 No Stop Words
12:50 Analysis
14:00 No Nouns
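The "No Stop Words" step in the outline above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from the video: the tiny `STOP_WORDS` set, the `feature_vector` helper, and the two sample sentences are hypothetical. The idea is that very common words such as "the" appear in almost every text, so dropping them before counting keeps them from inflating the similarity score.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "on", "is", "it"})


def feature_vector(text, stop_words=frozenset()):
    """Count the words in the text, skipping any listed in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences that share only common function words.
text1 = "the king rode to the castle in the rain"
text2 = "the ship sailed to the island in the fog"

with_stops = cosine_similarity(feature_vector(text1), feature_vector(text2))
without_stops = cosine_similarity(
    feature_vector(text1, STOP_WORDS), feature_vector(text2, STOP_WORDS)
)

print(with_stops)     # fairly high, driven by "the", "to" and "in"
print(without_stops)  # 0.0 once the shared stop words are removed
```

The same filtering idea could be applied to other word classes, but removing nouns would need part-of-speech tagging, which this sketch does not attempt.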
WATCH NEXT:
Bag of Words
Use Mathematica for Free
BTW, Socratica offers a pro course, 'Mathematica Essentials,' providing key concepts for mastering Wolfram products:
Thank you to our VIP Patreon Members who helped make this video possible!
José Juan Francisco Castillo Rivera
KW
M Andrews
Jim Woodworth
Marcos Silveira
Christopher Kemsley
Eric Eccleston
Jeremy Shimanek
Michael Shebanow
Alvin Khaled
Kevin B
John Krawiec
Umar Khan
Tracy Karin Prell
Thank you kind friends!
----------
We recommend the following (affiliate links):
The Wolfram Language
The Mythical Man Month - Essays on Software Engineering & Project Management
Innumeracy: Mathematical Illiteracy and Its Consequences
Mindset by Carol Dweck
How to Be a Great Student (our first book!)
----------
If you find our work at Socratica valuable, please consider becoming our Patron on Patreon!
If you would prefer to make a one-time donation, you can also use Socratica PayPal.
----------
Written & Produced by Michael Harrison & Kimberly Hatch Harrison
Edited by Megi Shuke
About our Instructors:
Michael earned his BS in Math from Caltech, and did his graduate work in Math at UC Berkeley and University of Washington, specializing in Number Theory. A self-taught programmer, Michael taught both Math and Computer Programming at the college level. He applied this knowledge as a financial analyst (quant) and as a programmer at Google.
Kimberly earned her BS in Biology and another BS in English at Caltech. She did her graduate work in Molecular Biology at Princeton, specializing in Immunology and Neurobiology. Kimberly spent 16+ years as a research scientist and a dozen years as a biology and chemistry instructor.
Michael and Kimberly Harrison co-founded Socratica.
Their mission? To create the education of the future.
----------
#cosinesimilarity #AI #naturallanguageprocessing