Anomaly Detection with Autoencoders using TensorFlow

#datascience #machinelearning #neuralnetworks

An autoencoder is a neural network that learns to copy its input to its output.

An autoencoder can be divided into two parts: the encoder and the decoder. The encoder is a mapping from the input space into a lower-dimensional latent space (the bottleneck layer), and the decoder maps that latent representation back to the input space.

In this video we will see how we can build an autoencoder from scratch using TensorFlow and Python.
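
A minimal sketch of the kind of model described above, using the Keras functional API; the input width of 140 matches the ECG5000 data used later in the video, and the layer sizes here are illustrative rather than the exact ones from the video:

import tensorflow as tf
from tensorflow.keras import layers

# encoder: map the input into a lower-dimensional latent (bottleneck) space
inputs = tf.keras.Input(shape=(140,))
encoded = layers.Dense(32, activation="relu")(inputs)
encoded = layers.Dense(8, activation="relu")(encoded)

# decoder: map the latent representation back to the input space
decoded = layers.Dense(32, activation="relu")(encoded)
decoded = layers.Dense(140, activation="sigmoid")(decoded)

autoencoder = tf.keras.Model(inputs, decoded)
encoder = tf.keras.Model(inputs, encoded)  # the encoder half, usable on its own
autoencoder.compile(optimizer="adam", loss="mae")
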
Comments

Pure gold. Not many channels show the actual implementation; the majority focus on the theory.

vipinamar

Thanks, great video! If the plotting doesn't work for anyone, rounding the values in the tensor helps:

# convert the loss tensor to a NumPy array and round each value to 4 decimal places
train_loss = train_loss.numpy()
train_loss = [round(x, 4) for x in train_loss]

colinvanlieshout

With the target column present, the whole thing becomes very easy; in fact, various other approaches could then be used as well. I was expecting an unsupervised approach using autoencoders for anomaly detection.

subhankarbhattacharya

Another warning: early stopping is not actually used (at 21:00); the training instead runs to the maximum number of epochs set in the code (50).

EricLebigot
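
A minimal sketch of how early stopping could be wired in with Keras, assuming the autoencoder, train_data, and val_data variables from the video (the exact names are assumptions):

from tensorflow.keras.callbacks import EarlyStopping

# stop once the validation loss has not improved for 5 consecutive epochs,
# and roll back to the weights from the best epoch seen
early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

autoencoder.fit(
    train_data, train_data,  # an autoencoder learns to reconstruct its input
    epochs=50,
    validation_data=(val_data, val_data),
    callbacks=[early_stop],
)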

@AIEngineeringLife If we know what is an anomaly and what isn't, and we are training only on normal data, then what is the specific need for an unsupervised approach? Since we know the labels, couldn't we go for a supervised approach too?

I am working on a use case in the cybersecurity domain where I don't know exactly which data points are anomalous, so I cannot create labels (which is the real need for unsupervised approaches). In that case, how can I segregate normal and anomalous data in order to train only on normal data? Any inputs on how to proceed? One approach I can think of is getting some anomalies labelled by domain experts, but that won't cover all possible types of anomalies in the overall data.

SanjogMehta

I think viewers should be cautioned about two things: (1) indeed, as some have noted, including anomalies in the validation set while training on normal behavior encourages the autoencoder to reconstruct anomalies correctly, which we want to avoid since we later measure the reconstruction error; (2) it is generally a mistake to mix the training and test data and split them again later (as is done at the very beginning), because the test data may have been selected to better test the generalization ability of the model. For example, the training data can contain multiple ECGs from some individuals while the test data contains different individuals; by mixing the two sets, you strongly help the model memorize individuals instead of learning to generalize to new people.

EricLebigot

Thanks sir, I really needed this for my battery storage anomaly detection project at work. 💪🏾🙏🏾

EnochAppiah

Big fan of yours, sir.
One point: even though the performance is over 90% on both the anomaly and normal classes, in real life the distribution of normal to anomaly is highly imbalanced, like 99:1 or even 99.5:0.5, so I think it would be better if you could incorporate this into the anomaly detection series. Thanks for your videos, really appreciate them!

ravitanwar

Hi, is there a GitHub link for this project?

hbk

Thank you for the detailed video! Could you please elaborate on the idea of having a validation dataset with both normal and abnormal records in it while training the model? To my understanding, validation data is not used by Keras to tune any parameters. So while training on normal records only, we would be comparing the training loss to a validation loss calculated on both normal and abnormal records, making the two incomparable and it difficult to judge whether the model is, e.g., overfitting. If the purpose is to get the best possible model of normal behaviour, would it not be beneficial to have a validation set consisting of normal records only?

anafotina
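
A minimal sketch of the normal-only validation split suggested above, assuming normal_data holds only the normal records (the variable names are illustrative):

from sklearn.model_selection import train_test_split

# hold out 20% of the normal records for validation, so that training and
# validation losses are computed on the same kind of (normal) data
train_data, val_data = train_test_split(normal_data, test_size=0.2, random_state=42)

# then train with: autoencoder.fit(train_data, train_data, validation_data=(val_data, val_data), ...)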

Thank you so much, sir, for explaining everything clearly with code. God bless you. And if possible, please upload videos on domain adaptation with autoencoders.

poojaanand

Thanks for a great demo.
Anomaly detection problems are generally heavily imbalanced. What should we do to handle the imbalance?

akrsrivastava

How do you deal with categorical features in the input data?

Thanks for the video, really well explained and easy to understand!

soumyajitsarkar
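
One common answer to the categorical-features question above is to one-hot encode those columns before training; a minimal pandas sketch, where the DataFrame df and the column name "protocol" are hypothetical:

import pandas as pd

# one-hot encode the categorical column so that every model input is numeric
df = pd.get_dummies(df, columns=["protocol"], dtype="float32")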

Could we also train a normal neural network (without an autoencoder), evaluate its loss, and label records as anomalies if the loss is greater than the threshold? How does that evaluation differ from autoencoder models? Thanks.

sriadityab
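
For context, a minimal sketch of the reconstruction-error thresholding that the autoencoder approach relies on, assuming a trained autoencoder plus train_data (normal records only) and test_data arrays (names are illustrative):

import numpy as np

# reconstruction error (mean absolute error per record) on the normal training data
reconstructions = autoencoder.predict(train_data)
train_loss = np.mean(np.abs(reconstructions - train_data), axis=1)

# derive the threshold from the normal data, e.g. mean plus one standard deviation
threshold = np.mean(train_loss) + np.std(train_loss)

# flag test records whose reconstruction error exceeds the threshold
test_loss = np.mean(np.abs(autoencoder.predict(test_data) - test_data), axis=1)
is_anomaly = test_loss > threshold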

A small doubt: why did you use sigmoid as the output-layer activation? Wouldn't a linear activation work better, since the outputs are continuous values?

RaviTeja-zklb
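
On the activation question above: sigmoid bounds the reconstruction to (0, 1), so it fits inputs that have been min-max scaled to that range; for unscaled continuous inputs a linear output is the usual choice. A minimal sketch of the two options (the layer width of 140 is illustrative):

from tensorflow.keras import layers

# sigmoid output: reconstruction bounded to (0, 1), for min-max scaled inputs
bounded_output = layers.Dense(140, activation="sigmoid")

# linear output (the default): unbounded reconstruction, for unscaled inputs
unbounded_output = layers.Dense(140)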

Thanks for doing something different from MNIST; everyone does that. Any examples with clustering of the encoder output?

johnsondlamini
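
A minimal sketch of clustering the bottleneck output, assuming an encoder sub-model such as encoder = tf.keras.Model(inputs, encoded) from the sketch near the top (KMeans and the cluster count are illustrative choices):

from sklearn.cluster import KMeans

# compress each record to its latent (bottleneck) representation
latent = encoder.predict(train_data)

# cluster in the latent space rather than on the raw inputs
cluster_labels = KMeans(n_clusters=5, random_state=42).fit_predict(latent)
print(cluster_labels[:10])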

Nice explanation.
1. I have a doubt: if we are using a regular sequential model, can we use model.encode and model.decode functions?
2. Except for separating the good vs. bad records, the target variable is not useful, right? (Can I consider anomaly detection unsupervised learning?)

Can you make a video on anomaly detection for time series data? It could be beneficial for many.

vamsikrishnabhadragiri
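
On question 1 above: a plain Keras Sequential model has no encode/decode methods; those exist only when the author defines them (for example by subclassing tf.keras.Model). One way to get at the encoder half anyway is to slice the trained model; a minimal sketch, where the bottleneck being the layer at index 1 is an assumption:

import tensorflow as tf

# build a sub-model that stops at the bottleneck layer (index 1 is an assumption)
encoder = tf.keras.Model(inputs=autoencoder.input, outputs=autoencoder.layers[1].output)
latent = encoder.predict(test_data)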

How can we find the locations of the anomalies in the dataset instead of TRUE/FALSE values?

rafiaakhter
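
A minimal sketch for the question above, assuming is_anomaly is the boolean array produced by the thresholding step (as sketched earlier):

import numpy as np

# positions of the records flagged as anomalies, instead of a TRUE/FALSE mask
anomaly_indices = np.where(is_anomaly)[0]
print(anomaly_indices)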

Could you show the exact parts of the code where you calculate precision and recall for this?

jiajun
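
Not the exact code from the video, but a minimal scikit-learn sketch of the calculation, assuming test_labels marks anomalies as 1/True and is_anomaly holds the predictions (names are illustrative):

from sklearn.metrics import precision_score, recall_score

print("Precision:", precision_score(test_labels, is_anomaly))
print("Recall:", recall_score(test_labels, is_anomaly))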

Please, what is the Windows version of concatenating the two files ECG5000_TRAIN.txt and ECG5000_TEST.txt?

You used !cat ECG5000_TRAIN.txt ECG5000_TEST.txt > ecg_final. This line of code does not work on Windows. I've tried every command I know and searched the internet, but to no avail. Please help me, sir.

roger_island
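
In a Windows command prompt the closest equivalent is type ECG5000_TRAIN.txt ECG5000_TEST.txt > ecg_final, but a portable option is to do the concatenation in Python itself; a minimal sketch using the file names from the comment (the output name ecg_final follows the original command):

# concatenate the two text files; works the same on Windows and Linux
with open("ecg_final", "w") as out:
    for name in ["ECG5000_TRAIN.txt", "ECG5000_TEST.txt"]:
        with open(name) as f:
            out.write(f.read())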