Pickling and Scaling - Practical Machine Learning Tutorial with Python p.6

In the previous Machine Learning with Python tutorial we finished up making a forecast of stock prices using regression, and then visualizing the forecast with Matplotlib. In this tutorial, we'll talk about some next steps.

I remember the first time I was trying to learn about machine learning: most examples only covered the training and testing part and skipped prediction entirely. Of the tutorials that did cover training, testing, and predicting, I did not find a single one that explained saving the algorithm. In examples, the data is generally small, so the training, testing, and prediction process is relatively fast. In the real world, however, data is likely to be larger and to take much longer to process. Since no one really talked about this important stage, I definitely wanted to include some information on processing time and saving your algorithm.

While our machine learning classifier takes a few seconds to train, there may be cases where it takes hours or even days to train a classifier. Imagine needing to do that every day you wanted to forecast prices, or whatever. This is not necessary, as we can just save the classifier using the Pickle module.
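
As a rough sketch of that idea, assuming a scikit-learn `LinearRegression` like the one used for the forecast: the model is trained once, written to disk with `pickle`, and later loaded back without retraining. The synthetic data and the file name `linearregression.pickle` are placeholders chosen for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the stock features and labels (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0

# Train once; on real data this is the expensive step.
clf = LinearRegression()
clf.fit(X, y)

# Save the fitted model. The file name and location are arbitrary choices.
model_path = os.path.join(tempfile.gettempdir(), "linearregression.pickle")
with open(model_path, "wb") as f:
    pickle.dump(clf, f)

# Later (for example in another script): load it instead of retraining.
with open(model_path, "rb") as f:
    loaded_clf = pickle.load(f)

# The loaded model should predict exactly like the original one.
same_predictions = np.allclose(clf.predict(X), loaded_clf.predict(X))
print(same_predictions)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.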

Comments

90,000+ views on the first video, 10,000 here. Glad we are all sticking with it! Thanks Sentdex!!!

j.hanleysmith

I knew zero Python a month ago... Then I came across your channel. Now I know Python and am doing well in ML too... Keep up the good work bro!!

ravikiran

After I become an expert in machine learning I will definitely remember you. THANKS SENTDEX

rohitborra

Really helpful. I really appreciate the way you explain almost everything, each line of code that you write.. :-)

singer

I have to admit, I will have to go through your last session a few more times, but keep it up. All in all, great work.
You should seriously think about teaching.

MrAsardi

There is an easier way to save a classifier. It includes some internal optimisation that doesn't occur when just using pickle, and it seems to be the official way now. You can do it like so:

*from sklearn.externals import joblib*
*joblib.dump(clf, 'linearregression.pickle')*

On the other end, you can load your classifier like so:
*clf =

Nezopu
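
The `sklearn.externals.joblib` import in the comment above comes from older scikit-learn releases; it was later deprecated and removed, and joblib is now imported as its own package. A minimal sketch of the same save and load steps under that assumption (the data and the file name are illustrative, not taken from the tutorial):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny synthetic dataset: y = 3x + 1 (illustrative only).
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 1.0
clf = LinearRegression().fit(X, y)

# The file name is an arbitrary choice.
model_path = os.path.join(tempfile.gettempdir(), "clftrained.joblib")
joblib.dump(clf, model_path)          # saving
loaded_clf = joblib.load(model_path)  # loading

prediction = loaded_clf.predict(np.array([[25.0]]))
```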

BEST platform to learn machine learning with Python :)

rohitpingale

You can use cPickle instead of pickle. It works the same way but it's faster (IIRC it's coded in C). Nice tutorials by the way, you're number 1!

mfasco
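
Note that `cPickle` exists only on Python 2. On Python 3 the standard `pickle` module already uses its C implementation when it is available, so no separate import is needed. A small sketch of an import that works on both (the dictionary is just sample data):

```python
try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle  # Python 3: uses the C accelerator automatically

# Round-trip a small object through pickle to show the API is the same.
data = {"model": "demo", "coef": [1.0, 2.0]}
restored = pickle.loads(pickle.dumps(data))
```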

I like the cheeky acknowledgement of how consistent your sign-off is haha

crispychicken

from sklearn.externals import joblib

joblib.dump(clf, 'clftrained.pickle') #for saving
clf = #for loading

eastwoodsamuel

Hi, thanks for the great videos. A quick question: the prediction is only being made up to the current date and not into the future. I saw a few others post the same problem on the last video. Any chance of a fix?

sidgoyal

OK, for the problem of predicting future days, see the solution below (tested):
1. Remove (comment out) this line: df.dropna(inplace=True)
After the shift, dropna cuts the last forecast_out rows from the dataframe. For example, if 'Goog' has 5000 business days in the stock market, df only has 4900 days left after dropna. last_date then ends up as the 4900th day, so the prediction starts from the 4901st day.
2. Add: y = y[:-forecast_out]
With step 1 done, len(X) is 4900 rows, and the train and test data must have the same len(), so y has to be cut down to the same 4900 days, which is what [:-forecast_out] does.
After training, the new prediction will start from last_date = the last row of dataframe df, which is the last business day at the time you run this code.

MatthewTorontoParis
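
A sketch of the two steps in the comment above, assuming the variable names it uses (`df`, `forecast_out`, `X`, `y`, `last_date`) plus a hypothetical `price` column and `X_lately` array standing in for the real data. Feature scaling is left out to keep the example short.

```python
import numpy as np
import pandas as pd

forecast_out = 5

# Synthetic price series on business days (illustrative only).
dates = pd.bdate_range("2020-01-01", periods=50)
df = pd.DataFrame({"price": np.linspace(100.0, 149.0, 50)}, index=dates)

# The label is the price forecast_out rows ahead; the last rows become NaN.
df["label"] = df["price"].shift(-forecast_out)

X = df.drop(columns=["label"]).to_numpy()
X_lately = X[-forecast_out:]  # most recent rows, which have no label yet
X = X[:-forecast_out]

# Step 1: no df.dropna(inplace=True), so df keeps every row.
# Step 2: trim y so it lines up with X.
y = df["label"].to_numpy()[:-forecast_out]

# last_date is now the true last day in the data.
last_date = df.index[-1]
```

With the dataframe left intact, forecast dates generated after `last_date` begin right after the most recent row rather than `forecast_out` rows earlier.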

Hi,


Great tutorial, everything works fine except zooming in on the plot. Mine looks very pixelated when zoomed.
What file viewer are you using?


thanks

manafu

Hi, so, about the predicted dates. Do you know how to change the code to get business days instead of all days of the month? The days just go in sequence, including weekends and holidays, which makes the data confusing to display and compare. Thanks, your presentation is awesome!!!

jscomputadores
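
One possible approach to the question above, sketched with pandas: build the forecast dates with `pd.bdate_range`, which skips Saturdays and Sundays. The `last_date` value and `forecast_out` size here are made up for illustration.

```python
import pandas as pd

forecast_out = 3
last_date = pd.Timestamp("2020-01-03")  # a Friday, chosen for illustration

# Start on the next business day and generate forecast_out business days.
forecast_dates = pd.bdate_range(
    start=last_date + pd.offsets.BDay(1), periods=forecast_out
)
date_strings = [d.strftime("%Y-%m-%d") for d in forecast_dates]
```

By default this only skips weekends; public holidays would need a custom business-day calendar.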

thanks for sharing the tip to train a classifier in cloud using pickle :)

ashkat

Thank you for your video. Is it possible to use the saved data with new data for adaptive learning, without the need to re-train the whole data again?

sabotaged

Right now it's 2020 but the predictions only go up to 2018. Is this because of Quandl's API limitations, or am I missing something about setting the datetime to predict now += future?

jwalk

Hi, really a very good training, thx. Looking forward to the next part for customising linear regression ;-)

jordig

Don't we have to scale 'y' as well, or does scaling 'X' scale 'y' as well?
Please clarify
Thanks

northstar
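
On the question above: `preprocessing.scale(X)` standardizes the feature columns only and does not touch `y`. For ordinary least-squares regression, leaving `y` unscaled is generally fine, since predictions then come out in the original units. A small sketch on synthetic data (all values are illustrative):

```python
import numpy as np
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression

# Synthetic features and labels (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 2))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + 7.0

# Scaling changes X only: each column gets zero mean and unit variance.
X_scaled = preprocessing.scale(X)

# A linear regression with an intercept fits the same predictions either way.
pred_raw = LinearRegression().fit(X, y).predict(X)
pred_scaled = LinearRegression().fit(X_scaled, y).predict(X_scaled)
same_fit = np.allclose(pred_raw, pred_scaled)
```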

Thanks bro your vids are amazing! Keep up the good work.

nynom