Tidy Tuesday screencast: predicting wine ratings

I analyze a dataset of WineEnthusiast ratings as an example of statistical modeling and machine learning in R, performed without looking at the data in advance. This includes fitting a linear regression to predict wine ratings based on price, country, and taster, and then using tidytext and glmnet to fit a sparse regression based on text descriptions.

Comments

I did not understand half of the things you talked about, but I really appreciate the video and the effort behind it. Thank you.

calemadrian

Thanks David, I learn a lot watching your screencast. This one was revealing, I want to practice that kind of text analysis.

deradelo

Best way to learn! It's a bit humbling to watch David in action, but inspiring and very insightful!

baruchschwartz

At 1:17:50, the description is "Aromas of pumpkin, squash and corn chips are stale and not inviting. There's an acceptable mouthfeel to this weird, unbalanced Chardonnay along with flavors of spiced squash, mealy apple and sautéed root vegetables." "not" is a stop word, so it has been filtered out, but removing it changes the meaning of this part of the statement. "not" appears 23322 times across all descriptions, so I think it's a mistake to filter it out.

xiaoranmo
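
The point in the comment above can be illustrated with a small sketch. It is written in plain Python rather than the R/tidytext pipeline used in the screencast, and the stop-word list, the negation list, and the helper names (`tokenize`, `drop_stop_words`, `merge_negations`) are all hypothetical choices made for illustration. One possible workaround, assumed here, is to join a negation to the word that follows it before stop words are removed:

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only);
# a real analysis would use a full lexicon.
STOP_WORDS = {"a", "an", "and", "are", "of", "the", "this", "to", "not"}

# Hypothetical set of negation words to protect from removal.
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def drop_stop_words(tokens):
    """Remove every token that appears in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def merge_negations(tokens):
    """Join a negation with the token after it: 'not', 'inviting' -> 'not_inviting'."""
    merged = []
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATIONS and i + 1 < len(tokens):
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


text = "Aromas of pumpkin, squash and corn chips are stale and not inviting."

# Plain filtering: "not" disappears and "inviting" is left looking positive.
plain = drop_stop_words(tokenize(text))

# Negation-aware filtering: "not_inviting" survives as a single feature.
negation_aware = drop_stop_words(merge_negations(tokenize(text)))
```

With plain filtering the word "inviting" remains as a feature on its own, while the negation-aware variant keeps "not_inviting" instead, so a model can learn a separate coefficient for the negated term.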

Another amazing screencast as always. Thank you David for taking your time to teach us mere mortals! Hahaha

victorgaluppo

Really enjoying Tidy Tuesday screencasts. Thank you David.
If you could have RStudio showing your environment variables (rather than your Git files) as you develop your code, that would help this particular R neophyte quite a bit.

gflocktube

Regex for the last occurrence of a year, using a negative lookahead: (\\d{4})(?!.*\\d{4}). Better: match only years beginning with 19 or 20, using a non-capturing group. Even better: apply a function that extracts all the years and takes the maximum.

soylentpink
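
The three suggestions in the comment above can be sketched as follows. This is a Python illustration, not the R code from the screencast: the first pattern is the commenter's (with the R string escaping `\\d` written as `\d` in a Python raw string), while the 19/20 pattern, its word boundaries, and the function names are assumptions, since the comment does not spell them out.

```python
import re

# Pattern from the comment: a run of four digits that is not followed,
# anywhere later in the string, by another run of four digits.
LAST_FOUR_DIGITS = re.compile(r"(\d{4})(?!.*\d{4})")

# Assumed variant: only four-digit numbers starting with 19 or 20,
# using a non-capturing group; the \b word boundaries are an extra
# assumption to avoid matching inside longer numbers.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def last_four_digits(title):
    """Return the last four-digit run in the title as an int, or None."""
    match = LAST_FOUR_DIGITS.search(title)
    return int(match.group(1)) if match else None


def max_year(title):
    """Return the largest 19xx/20xx year found in the title, or None."""
    years = [int(y) for y in YEAR.findall(title)]
    return max(years) if years else None
```

The two approaches differ when the vintage is not the last number in the title. For a hypothetical title such as "Vintage 2013 Lot 1950", `last_four_digits` returns 1950, while `max_year` returns 2013.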

Hi David, super nice screencast! Really learned a lot from it. Just one question: is there a particular reason you use a general R Markdown document and not an R Notebook, or is it just personal preference? Thanks a lot!

Rycon

Found your blog which led to #rstats on Twitter leading to this. Learned a lot! You have a new subscriber.

frederickcorpuz

Is there any possibility of making new videos on atmospheric data (NetCDF, HDF files), such as plotting spatial maps?

kunalbali