Using More Data - Deep Learning with Neural Networks and TensorFlow part 8

Welcome to part eight of the Deep Learning with Neural Networks and TensorFlow tutorials. In the last tutorial, we applied a deep neural network to our own dataset, but we didn't get very useful results. We're wondering what might happen if we significantly increase the size of the dataset. Before, we were using ~10,000 samples; how about we try 1.6 million samples?

Comments

Hey sentdex,

Firstly, thank you for the videos. I've found these machine learning tutorials to be quite informative and easy to get through.

I was a little surprised by this video. It's the first one in the series where so much code was written outside of the video. While I understand a lot of the code may not be as important, most of my understanding comes from writing it out as you do in the video; this helps with context and with grasping the ideas behind it. Unfortunately, this video left me lacking in information and made me feel as though I'd missed a video.

Just a bit of constructive criticism. Thanks again!

Jakesters

I have just recently decided to learn deep learning, and these tutorials are still some of the best I have found. The code is redundant enough that it works today with very little updating.

MrJastus

You can keep track of the step by creating a global_step variable. This way you can also keep track of the epoch: when the total steps you set for the training are finished, you can just increment a num_epoch variable and print out the epoch you just finished.
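
A minimal, self-contained sketch of that pattern in TensorFlow 1.x (the toy loss and variable names are placeholders, not the tutorial's code):

import tensorflow as tf

# toy loss standing in for the tutorial's cost; the pattern is the same
w = tf.Variable(5.0)
cost = tf.square(w)

# a non-trainable counter; passing it to minimize() makes the optimizer
# increment it on every step, and tf.train.Saver() stores it in checkpoints
global_step = tf.Variable(0, trainable=False, name='global_step')
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(cost, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(train_op)
    step = sess.run(global_step)  # 10 after ten training steps
    # with real data: epoch = step * batch_size // total_training_samples
    print('finished at step', step)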

anmol

Great tutorial. I would like to make just one contribution. In TensorFlow 1.0:

tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(forecast, y)) should be:
tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=forecast, labels=y))

And sess.run(tf.initialize_all_variables()) should be:
sess.run(tf.global_variables_initializer())
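
For context, a minimal self-contained sketch of the updated TF 1.x calls (the toy tensors are placeholders, not the tutorial's network):

import tensorflow as tf

# toy logits and one-hot labels so the updated calls run on their own
forecast = tf.constant([[2.0, 1.0], [0.5, 3.0]])
y = tf.constant([[1.0, 0.0], [0.0, 1.0]])

# TF 1.x wants the logits= and labels= keyword arguments spelled out
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=forecast, labels=y))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # replaces tf.initialize_all_variables()
    print(sess.run(cost))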

danielantoniodasilva

Sorry for reviving this, but where do the 2638 nodes in the input layer come from? I get an error, but my code runs with 49% accuracy when I change it to 32.

jppradoleal

Hi, in the preprocessing script, when I run it I seem to be getting a train_set.csv file with only negative-sentiment [1, 0] tweet samples (around 13k of them).

Also, in the main script where you've trained the neural network, why did you put 2638 as the number of inputs allowed into the layer at a time? (Shouldn't it be the size of an individual training sample?) I'm getting a matrix multiplication error because of that.

I'm also getting the following error while saving/restoring the TF variable:
Expected to restore a tensor of type float, got a tensor of type int32 instead: tensor_name = Variable
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names,

Any help on the above issues is much appreciated, thanks :)

ishanmadan

I downloaded model.ckpt and the pickle file, and I'm getting an error message when trying to restore the variables, running only the function test_neural_network():

"NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for model.ckpt

 [[Node: save/RestoreV2_5 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_5/tensor_names,
"

Were those files created for a GPU? I only have a CPU at the moment.

kbinkows

I ran 10 and then 15 epochs; both times I got 49.3% accuracy with a loss of 1.0. Am I doing something wrong?

jeamzfilms

If you are facing the error "'charmap' codec can't encode character '\x9a' in position 10: character maps to <undefined>", add the marked line below to the preprocessing script:

tweet = line.split(', ')[-1]
outline = str(initial_polarity) + ':::' + tweet
outline = str(outline.encode("utf-8"))  # added line: str() of the utf-8 bytes escapes non-ASCII characters, so the write never hits the charmap codec
outfile.write(outline)

NiteshKumarChaudhary

InvalidArgumentError: Matrix size-incompatible: In[0]: [32, 230], In[1]: [2638, 500]
I'm getting the above error while running the neural net program. Any help?

anandsinghyt

Hi, thanks for your videos! I get an error when I run the preprocessing_data.py code: 'ascii' codec can't encode characters in position 12-14: ordinal not in range(128)
I tried different solutions, but none worked. Do you know what the problem is?

sergiorinaudo

Some tweets do contain ::: inside them. To get around that use:

tweet = ' '.join(line.split(':::')[1:]) # in create_lexicon

and

# in convert_to_vec
parts = line.split(':::')
label = parts[0]
tweet = ' '.join(parts[1:])
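
To illustrate the edge case this handles, a toy line (not from the dataset):

line = '1:::nice day ::: or is it'
parts = line.split(':::')
label = parts[0]             # '1'
tweet = ' '.join(parts[1:])  # 'nice day   or is it' - any ':::' inside the tweet becomes whitespace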

faneltoamna

I always get this error when running the preprocessing script: "'charmap' codec can't encode character '\x9a' in position 10: character maps to <undefined>". I copied the script from the pythonprogramming.net site, so it's not an error of mine. The generated train_set_shuffled.csv works, but the lexicon is just 2 KB instead of your 41 KB. I tried different encodings for opening the files, but none of them does the job. Please help, I'm stuck...

BTW, while training the network (with your lexicon), my GPU usage basically stays at 1-2%, like normal desktop use, but the VRAM usage is almost 10 GB. Also, it doesn't finish all the batches for each epoch, but stops at 1221 / 50000 and then just moves on to the next epoch.

ColinRies

I am saving and restoring the model in the same way (with path fixes applied), but the testing phase seems to return random results for the same input string. I have a slightly modified version of the problem using an LSTM cell and 3 classes instead of 2, so the output confidence randomly selects one of the three classes for the same input on consecutive runs. Is the model saved properly? If not, how do I handle it? I mean, shouldn't we be able to get the saved values of the weights and biases from our trained model?

aniketdhar

Amazing tutorial! I am using Python 3.6 and kept seeing the following error:
'charmap' codec can't encode character '\x9a' in position 10: character maps to <undefined>
Resolved by using codecs to open the Sentiment140 CSV files initially.
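
A minimal sketch of that approach (the file names and field handling are assumptions; the point is the explicit encodings):

import codecs

# open the Sentiment140 csv with an explicit encoding instead of the platform default,
# so Windows never falls back to the 'charmap' codec; 'latin-1' accepts every byte,
# and the output is written as utf-8
with codecs.open('training.1600000.processed.noemoticon.csv', 'r', 'latin-1') as infile, \
     codecs.open('train_set.csv', 'w', 'utf-8') as outfile:
    for line in infile:
        initial_polarity = line.split(',')[0].replace('"', '')  # first field: 0 or 4
        tweet = line.split(',')[-1]
        outfile.write(str(initial_polarity) + ':::' + tweet)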

yuema

First, I would like to say your tutorials are awesome!! :) I really enjoy watching them!
I have a question about the function create_lexicon. Is there a reason why you do "content += ' ' + tweet"? The string gets freaking long ;) But I think the lexicon is also generated correctly if you do "content = tweet". Am I wrong?

temblabub

If someone is having trouble restoring the 'model.ckpt' file, try this instead: saver.restore(sess, './model.ckpt')
source:

BTW, the error shown is:
NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for model.ckpt
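
A minimal save/restore round trip showing that fix in TF 1.x (the variable is a placeholder, not the tutorial's model):

import tensorflow as tf

v = tf.Variable(42.0, name='v')  # stand-in for the network's weights
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, './model.ckpt')  # the './' keeps the path relative to the working directory

with tf.Session() as sess:
    saver.restore(sess, './model.ckpt')  # use the same relative path when restoring
    print(sess.run(v))  # 42.0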

glongoria

While testing the trained model with model.ckpt and lexicon.pickle, I got the following error when loading the pickle:
lexicon = pickle.load(f)
_pickle.UnpicklingError: invalid load key, '\xe2'.
I don't get it, especially as it seems to be related to the way the pickle was created.

chiviado

Hey Sentdex!

First of all, thank you so much, your tutorials are great!

I just have a little problem while running this one: I get this InvalidArgumentError that I can't get rid of:
'InvalidArgumentError (see above for traceback): Matrix size-incompatible: In[0]: [32, 0], In[1]: [2638, 500]'

As I understand it, the matrix the second layer passes to the output layer is not the right shape, but I can't manage to fix it.

Thank you so much for your help!
I can paste the whole 30-line error description if needed...

annelaure

Have you tried joblib from scikit-learn instead of pickle? It allows for further compression (although I think the read/write takes more memory) and can result in a significantly smaller file size.
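
A minimal sketch of that alternative (the file name and data are placeholders):

import joblib  # older scikit-learn versions expose this as: from sklearn.externals import joblib

lexicon = ['some', 'example', 'words'] * 1000  # stand-in for the tutorial's lexicon

# compress=3 trades a little write speed for a noticeably smaller file than plain pickle
joblib.dump(lexicon, 'lexicon.joblib', compress=3)
lexicon_loaded = joblib.load('lexicon.joblib')
assert lexicon_loaded == lexicon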

DanielTompkinsGuitar