Model Optimization | Optimize your Model | Introduction to Text Analytics with R Part 12

This video concludes our Introduction to Text Analytics with R and covers optimizing your model for the best generalizability on new/unseen data. It includes:
– Discussion of the sensitivity/specificity tradeoff of our optimized model (see the small R sketch after this list).
– Potential next steps regarding feature engineering and algorithm selection for additional gains in effectiveness.
– For those who are interested, a collection of resources for further study to broaden and deepen their text analytics skills.
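
For concreteness, here is a minimal R sketch, not taken from the video, showing how the sensitivity and specificity discussed above fall out of a 2x2 confusion matrix via caret's confusionMatrix(); the predicted/actual labels are made up purely for illustration:

library(caret)

# Made-up labels, purely for illustration.
actual    <- factor(c("spam", "ham", "ham", "spam", "ham", "ham", "spam", "ham"))
predicted <- factor(c("spam", "ham", "spam", "spam", "ham", "ham", "ham",  "ham"))

# caret reports both metrics relative to the declared positive class.
cm <- confusionMatrix(data = predicted, reference = actual, positive = "spam")
cm$byClass[c("Sensitivity", "Specificity")]

# The same two numbers by hand from the 2x2 table:
tab <- table(predicted, actual)
tab["spam", "spam"] / sum(actual == "spam")   # sensitivity = TP / (TP + FN)
tab["ham",  "ham"]  / sum(actual == "ham")    # specificity = TN / (TN + FP)

Tightening the threshold for calling a message spam generally trades sensitivity for specificity, which is the tradeoff the video discusses.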

The data and R code used in this series are available here:

Table of Contents:
0:00 Introduction
17:18 Book recommendations
20:55 NLP task view
22:04 Information retrieval
23:47 Python

#modeloptimization #textanalytics #datascience
Comments

Thanks Dave for the fantastic introduction and the list of resources at the end - particularly for IR!

shobhamourya

This course has been fantastic, pitched at the right level and moving at a great pace for someone like me who is not new to R but is new to text analytics and machine learning in general. I find Dave's delivery particularly engaging. Keep up the great content!

sambaker

Thank you! You helped me understand and be able to use these at work when my uni professor failed to teach us this stuff.

meagtessmann

Thank you very much for the awesome series!!

mehmetkaya

Thanks, Dave! These tutorials were great. I'm very much looking forward to learning more about NLP, now!

minc

Hey Dave, just watched the conclusion to the series. Excellent stuff. "Chomping at the bit, now"

terrybrooks

Excellent learning, thanks a ton Dave. Feeling inspired by this series to learn and apply more of these techniques.

debashismukhetjee

Great series, would love to have a series on analyzing big data using loops and vectorized operations.

junaideffendi

I'd like to thank you for your work in this series; you are an awesome teacher! You helped me a lot, as I was having a hard time with NLP at work. Thanks again!

rafaelsilva

Amazing series Dave, loved every bit!

adityamehta

Thank you so much Dave. This was very useful and informative :)

TheShekhar

Thank you for this awesome introduction. I wish there was the same for time series analysis.

toastersman

great ... I'm going to watch the whole thing again .... it is intense and thrilling :-)

fayburns

Great course. Thanks. The loaded rc.cv.3 and 4 don't seem to fit the video anymore. They work if you calculate them yourself, though.

datafakts

Thank you Dave for the series of lectures. Great delivery. I am trying to determine the sentiment of news based on headlines from a particular domain, i.e. whether a headline is positive, negative, or neutral for that domain. How should I go about domain-specific sentiment analysis? Any suggestions would help me a lot.

shaleensrivastava
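
Regarding the domain-specific sentiment question above: the video series doesn't cover this, but one common starting point is lexicon-based scoring against a small, hand-curated domain lexicon before moving to a supervised model trained on labeled headlines. A minimal sketch; the headlines, lexicon words, and scores below are all made up for illustration:

library(dplyr)
library(tidytext)

# Made-up headlines and a tiny hand-built domain lexicon.
headlines <- tibble(
  id   = 1:3,
  text = c("Regulator approves new drug, shares surge",
           "Company recalls product after safety concerns",
           "Quarterly results in line with expectations")
)

domain_lexicon <- tibble(
  word  = c("approves", "surge", "recalls", "concerns"),
  score = c(1, 1, -1, -1)
)

headline_sentiment <- headlines %>%
  unnest_tokens(word, text) %>%                 # one row per word, lower-cased
  inner_join(domain_lexicon, by = "word") %>%   # keep only lexicon words
  group_by(id) %>%
  summarise(score = sum(score)) %>%
  right_join(headlines, by = "id") %>%          # bring back headlines with no lexicon hits
  mutate(score = coalesce(score, 0),
         label = case_when(score > 0 ~ "positive",
                           score < 0 ~ "negative",
                           TRUE      ~ "neutral"))

If you can label a few hundred headlines from your own domain, the bag-of-words/tf-idf + caret pipeline from this series should transfer directly and will usually beat a generic lexicon.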

Great work.. I have joined some paid courses on text mining, but they don't even come close to this..

hemantnaikgoa

Hello Dave, thank you for presenting text analytics in such a lucid manner. I have two questions that I would like to ask:
1) Have you worked on linear programming for aggregate planning? I am a supply chain consultant and I was looking for some help material on this.
2) When are you and your Data Science Dojo team coming to India? :)

vikrantnag

At 9:55, I'm not sure whether the calculation behind the confusionMatrix function has been done correctly for specificity. I tried to calculate it manually and the result is about 0.96, but all of the other metrics, like accuracy and sensitivity, are the same when I compute them manually. So is this a problem with the confusionMatrix function, or have I missed a concept here?

nureyna
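
On the specificity question above, one thing worth checking (a guess, not a confirmed diagnosis) is which class confusionMatrix() is treating as "positive": caret reports sensitivity and specificity relative to the declared positive class, and by default it uses the first factor level, so a manual calculation that assumes the other class is positive will come out different. A small self-contained sketch with made-up labels:

library(caret)

# Made-up labels; "ham" is the first factor level, so it is the default positive class.
actuals <- factor(c("spam", "spam", "ham", "ham", "ham", "ham", "ham"))
preds   <- factor(c("spam", "ham",  "ham", "ham", "ham", "spam", "ham"))

# With "spam" declared as the positive class:
confusionMatrix(preds, actuals, positive = "spam")$byClass[c("Sensitivity", "Specificity")]

# With "ham" as the positive class, the two numbers swap:
confusionMatrix(preds, actuals, positive = "ham")$byClass[c("Sensitivity", "Specificity")]

If the manual number still disagrees after checking the positive class, it is worth confirming that the manual 2x2 table isn't transposed (predicted in rows vs. columns).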

This is an excellent tutorial... thank you so much... Can you please suggest the best way to store the train.tokens.idf data? Will the RData format work? Since the IDF will be used for testing in later stages, I want to store it for reuse again and again.

Also, should the column names used in train.tokens.tfidf be the same when testing or validating the results, and will that give better results?

Thanks in advance

punithac
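
On the storage question above: this isn't from the video, but a common pattern is to persist the training IDF vector with saveRDS()/readRDS() (an .RData file via save()/load() works as well) and to force test/validation data into exactly the training columns before applying those weights. That also answers the column-name question: yes, the test matrix should use the same columns, in the same order, as train.tokens.tfidf. A toy sketch with made-up data and a simplified IDF definition (the series may use a different log base):

# Toy training term-frequency matrix: 3 documents x 4 terms.
train.tf <- matrix(c(1, 0, 2, 0,
                     0, 1, 1, 1,
                     1, 1, 0, 0),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(NULL, c("free", "win", "call", "meeting")))

# Simplified IDF: log of (total docs / docs containing the term).
train.idf <- log(nrow(train.tf) / colSums(train.tf > 0))

# Persist for reuse; save()/load() with an .RData file would also work.
saveRDS(train.idf, "train_idf.rds")

# --- Later, at test/validation time ---
train.idf <- readRDS("train_idf.rds")

# A new document's term counts; note it contains a term unseen in training.
new.tf <- c(free = 2, call = 1, lottery = 1)

# Align to the training vocabulary: same columns, same order, unseen terms
# dropped, missing training terms filled with zero.
aligned <- setNames(numeric(length(train.idf)), names(train.idf))
hits <- intersect(names(new.tf), names(aligned))
aligned[hits] <- new.tf[hits]

# Apply the *training* IDF weights; never recompute IDF on test data.
new.tfidf <- aligned * train.idf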

Hi David,
Now that we have a 97% accurate model, how can we apply it to data that doesn't already have a label?
Let's say a new SMS came in and we wanted to predict whether it was spam or ham. What would we do?

jackcornwall
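
On the last question above: the general pattern (a sketch, not the video's exact code) is to push the new SMS through the same preprocessing used for training (same tokenization and n-grams, same vocabulary and column order, the training IDF weights, and the same SVD projection if one was used) and then call predict() on the fitted model; no label is needed at prediction time. The toy below uses rpart directly only to stay self-contained, but predict() is called the same way on a caret::train() model:

library(rpart)

# Toy training features (think of them as a few tf-idf columns) plus labels.
train.df <- data.frame(
  label = factor(c("spam", "spam", "ham", "ham", "ham", "spam")),
  free  = c(2, 1, 0, 0, 0, 3),
  win   = c(1, 2, 0, 0, 1, 1),
  meet  = c(0, 0, 1, 2, 1, 0)
)

# The series fits models through caret::train(); plain rpart keeps the sketch small.
fit <- rpart(label ~ ., data = train.df, method = "class",
             control = rpart.control(minsplit = 2, cp = 0))

# A new, unlabeled SMS arrives: build its features with the SAME columns, in the
# SAME order, as the training data (words never seen in training are dropped).
new.sms <- data.frame(free = 2, win = 1, meet = 0)

predict(fit, newdata = new.sms, type = "class")   # returns "spam" or "ham"; no label needed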