Sign Language Detection using ACTION RECOGNITION with Python | LSTM Deep Learning Model

Want to take your sign language model a little further?

In this video, you'll learn how to leverage action detection to do so!

You'll use a keypoint detection model to build up a sequence of keypoints, which can then be passed to an action detection model to decode sign language! As part of the model building process you'll use TensorFlow and Keras to build a deep neural network with LSTM layers that handles the sequence of keypoints.

In this video you'll learn how to:
1. Extract MediaPipe Holistic Keypoints
2. Build a Sign Language model using an Action Detection model powered by LSTM layers (sketched below)
3. Predict sign language in real time using video sequences
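
For a rough idea of what the finished network looks like, here's a minimal Keras sketch of an LSTM action-detection model. This is a sketch only: the layer sizes, the 30-frame sequence length, the 1662 keypoint values per frame and the three example signs are illustrative assumptions, not taken from the video.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

actions = ["hello", "thanks", "iloveyou"]  # example sign labels (placeholders)

# 30 frames per sequence, 1662 keypoint values per frame
# (pose 33x4 + face 468x3 + two hands x 21x3 from MediaPipe Holistic)
model = Sequential([
    LSTM(64, return_sequences=True, activation="relu", input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation="relu"),
    LSTM(64, return_sequences=False, activation="relu"),
    Dense(64, activation="relu"),
    Dense(32, activation="relu"),
    Dense(len(actions), activation="softmax"),  # one probability per sign
])
model.compile(optimizer="Adam", loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])
model.summary()
```

Training then comes down to feeding it batches of shape (num_sequences, 30, 1662) with one-hot encoded action labels.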

Get the code:

Chapters
0:00 - Start
0:38 - Gameplan
1:38 - How it Works
2:13 - Tutorial Start
3:53 - 1. Install and Import Dependencies
8:17 - 2. Detect Face, Hand and Pose Landmarks
40:29 - 3. Extract Keypoints
57:35 - 4. Setup Folders for Data Collection
1:06:00 - 5. Collect Keypoint Sequences
1:25:17 - 6. Preprocess Data and Create Labels
1:34:38 - 7. Build and Train an LSTM Deep Learning Model
1:50:11 - 8. Make Sign Language Predictions
1:52:40 - 9. Save Model Weights
1:53:45 - 10. Evaluation using a Confusion Matrix
1:57:40 - 11. Test in Real Time
2:20:46 - BONUS: Improving Performance
2:26:52 - Wrap Up

Oh, and don't forget to connect with me!

Happy coding!
Nick

P.S. Let me know how you go and drop a comment if you need a hand!

Comments

As someone who is following this in 2023, here are some updates; I'll be editing them in as they pop up while I go through the tutorial.
25:42 FACE_CONNECTIONS seems to have been renamed/replaced by FACEMESH_TESSELATION. And since we want just the outline of the face, it's FACEMESH_CONTOURS that we need in this project.
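
For anyone hitting the same rename, here's a minimal sketch of a drawing helper using the newer constants. It assumes a recent mediapipe release where the holistic module re-exports the face-mesh connection sets; if yours doesn't, mp.solutions.face_mesh.FACEMESH_CONTOURS works the same way.

```python
import mediapipe as mp

mp_holistic = mp.solutions.holistic      # holistic keypoint model
mp_drawing = mp.solutions.drawing_utils  # landmark drawing helpers

def draw_landmarks(image, results):
    # FACEMESH_CONTOURS draws just the face outline; FACEMESH_TESSELATION draws the full mesh
    mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
```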

kanchanpatil

I remember some time ago requesting this type of video, and to see that it's finally here brings me joy. Can't wait to do this and show it to my sign language friends.

girishkemba

I cannot thank you enough for all the videos you create. I was a noob in tech, but it's been a year now since I started watching your videos, and I am so proud of you and myself for coming this far. And this project works for me ❤

savi-

Thank you for these clear, practical, straight to the point tutorials! Looking forward to your future videos!

aminberjaouitahmaz

That's amazing! I watched this video more than a month ago, but it seemed difficult for me as a beginner. Then I tried my best to finish Machine Learning, Deep Learning, Python, TensorFlow and some Data Science courses within a month. Now watching this video again is like watching a movie! It's easy to follow! Love it.

study_with_thor

Hi Nicholas, thanks so much!!!! I am creating a model to help deaf people here in my country. Greetings from Guatemala!!!

Stacio

Man, you are so underrated and deserve a lot more! thanks a lot for these awesome learning materials! I have learned a lot from you. Keep inspiring, man :)

yohanessatria

Thank you @Nicholas Renotte, I just passed my capstone project defense utilizing this deep learning model.

engeerdanisme

00:01 This video demonstrates sign language detection using action recognition with Python.
01:40 The video discusses the process of sign language detection using action recognition and LSTM deep learning model.
05:16 MediaPipe Holistic allows us to get key points from face, body, and hands
07:17 Setting up webcam access and rendering frames using OpenCV
11:06 The code captures frames from a webcam and displays them on the screen.
12:46 Setting up MediaPipe Holistic and creating variables for MediaPipe Holistic and MediaPipe Drawing Utilities
16:46 The video explains the process of color conversion in sign language detection.
18:32 The process involves detecting sign language using MediaPipe and a deep learning model.
21:59 The video discusses the different types of landmarks in sign language detection using action recognition.
23:23 The video explains how to detect and visualize different types of landmarks using MediaPipe.
27:05 The video discusses how landmarks in facial and body pose can be connected to each other.
28:37 Implementing sign language detection using LSTM deep learning model in Python
32:18 Landmarks are drawn and rendered in real time using image pass and cv2
33:55 You can customize the formatting of the dots and connections in Sign Language Detection using a landmark drawing spec and a connection drawing spec.
37:32 Updating pose and hand landmarks with different colors and parameters
39:31 Different models in action: left hand, right hand, face, and pose.
43:09 The code demonstrates how to extract landmark values using pose estimation.
45:04 The video explains how to reshape and convert landmarks into a single array.
48:27 Building a neural network and extracting key points using action recognition with Python
50:10 Setting up error handling and placeholder arrays for pose and face landmarks.
53:52 The video explains how to extract key points for sign language detection using LSTM deep learning model in Python.
55:31 Concatenating pose, face, left hand, and right hand keypoints for sign language detection.
59:11 Using LSTM Deep Learning Model to detect sign language actions
1:00:57 Creating folders to store data for different actions and sequences.
1:04:16 Creates a folder structure for sign language detection using action recognition with Python.
1:05:48 Collecting data using MediaPipe loop and capturing snapshots at each point in time.
1:10:14 The code is outputting text to the screen and taking a break at frame 0.
1:11:44 The first block of code prints starting collection in the middle of the screen and pauses.
1:15:10 The code collects key points by looping through actions, sequences, and frames.
1:16:39 Implementing sign language detection using action recognition with an LSTM deep learning model.
1:20:25 Sign language detection using action recognition with Python
1:23:20 Using MediaPipe to collect key points for sign language detection
1:26:55 Creating a dictionary to map labels to numeric ids
1:29:08 Sequences represent feature data and labels represent y data
1:32:26 Data preprocessing and training and testing partitioning are important steps in sign language detection using LSTM deep learning model.
1:34:12 Training LSTM neural network using TensorFlow and Keras.
1:38:14 The model uses LSTM layers for sign language detection.
1:39:55 The next three layers are dense layers using fully connected neurons.
1:43:16 The video discusses the process of formulating a neural network for sign language detection using action recognition and LSTM deep learning model.
1:44:58 Training the model with 2000 epochs
1:48:11 The training accuracy is high at 93.75% after 173 epochs.
1:49:37 The model has three LSTM layers and dense layers, with a small number of parameters to train.
1:53:13 Reloading a deleted model and evaluating its performance using scikit-learn.
1:55:03 Converting y test and y hat values to matrices and then evaluating the model performance using a confusion matrix and accuracy score.
1:58:18 Implementing prediction logic by concatenating data onto sequence and making detections when 30 frames of data are available.
2:00:30 Implement logic to grab the last 30 sets of key points for generating predictions.
2:04:18 Implementing visualization logic and checking result threshold and sentence length
2:06:45 The code checks if the current action matches the last sentence in the string.
2:10:05 Sign language detection using LSTM deep learning model
2:12:39 The video discusses sign language detection using action recognition with Python using an LSTM deep learning model.
2:17:13 The video discusses sign language detection using action recognition with Python
2:19:14 Sign Language Detection using Action Recognition with Python
2:22:29 To ensure accurate action detection, the last frame needs to be included in the sequence.
2:24:04 The code implementation adds stability by checking if the last 10 frames have the same prediction.
Crafted by Merlin AI.
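
To tie the keypoint-extraction and real-time steps in the summary above together, here's a rough sketch of the rolling-window prediction logic. It's a sketch under assumptions: variable names are illustrative, and model and actions are taken to be the trained Keras model and the list of sign labels from the earlier steps.

```python
import numpy as np

def extract_keypoints(results):
    """Flatten MediaPipe Holistic results into one 1662-value vector per frame,
    zero-filling any part (pose/face/hands) that was not detected."""
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])

sequence, sentence, predictions = [], [], []
threshold = 0.5  # minimum probability before a prediction is accepted

def update(results, model, actions):
    """Call once per webcam frame with the MediaPipe Holistic results."""
    global sequence
    sequence = (sequence + [extract_keypoints(results)])[-30:]  # keep last 30 frames
    if len(sequence) == 30:
        res = model.predict(np.expand_dims(sequence, axis=0))[0]
        predictions.append(int(np.argmax(res)))
        # stability check: only accept a sign if the same class was predicted
        # for the last 10 frames and it clears the confidence threshold
        if len(predictions) >= 10 and len(set(predictions[-10:])) == 1 \
                and res[np.argmax(res)] > threshold:
            if not sentence or actions[np.argmax(res)] != sentence[-1]:
                sentence.append(actions[np.argmax(res)])
```

The stability check over the last 10 predictions corresponds to the improvement described at 2:24:04.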

BINARAI-qr

It's so cool to see how happy Nicholas is when everything works in the end. That's the spirit! Amazing video, thanks a lot for your work man!

rainymatch

Can’t lie.. I have learnt a lot from Nicholas

danieladama

Thank you so much! You are my best teacher in my college life!!!!

김미소-zw

Thanks for the amazing tutorials! Absolutely life-saving. Just a reminder that the z value from MediaPipe is with respect to the wrist landmark, not the distance from the camera! I found out pretty late!

Cheese_Academia

Thanks for this amazing tutorial sir, we are working on a project that needed this section, and your videos and explanations are being extremely helpful to me and my team! Thanks a lot.

leafiadias

Nicholas is the best machine learning youtuber, his tutorials are interesting and fun.

malice

Hi, Nicholas! These are great video series to watch and learn! Thank you very much!
Can you please prepare a video applying CV to real-time sign language detection based on a ready-made dataset available on the Internet?
It would be even more interesting if we could see ViT in action recognition as well.

ibrahimalizada

Someone just completed his internship with the help of your code and also got a certificate from an IT company

Rohan_is_discovering

Hey Nicholas, I am working on a similar project. Just wondering: when I test the model using your metric, it does not reflect the same accuracy as the real-time test. I train the model to 80-90% accuracy, but the real-time test barely captures any sign language. Do you have any thoughts?

theethatanuraksoontorn

Hi! I'm impressed by the amazing clarity of your explanations. For one second I thought you must be a trained teacher robot....

Nikos_prinio

Thanks, great videos. It would be great if you could elaborate on the differences between the MediaPipe implementation used here and the others you mentioned; I mean a real comparison of the underlying models/networks and their training.

torstenknodt