How to Do Speech Recognition with Arduino | Digi-Key Electronics

Speech recognition is the process of using computers to recognize and understand human speech. Being able to understand full sentences or questions requires a lot of processing power, as it often relies on the complex algorithms found in natural language processing (NLP).

Most microcontrollers (and Arduino boards) cannot run NLP due to their limited resources. However, we can train a neural network to perform basic keyword spotting, which still has many uses (such as enabling a smart speaker by saying “Alexa” or shouting “stop” to halt a machine).

In this video, we will use Edge Impulse to train a neural network to identify and classify a few custom keywords. We will then deploy this trained model to an Arduino Nano 33 BLE Sense to perform keyword spotting in real time.

To begin, we collect samples of the keywords we wish to identify. These can be collected on any number of recording devices and then edited using Audacity to create 1-second snippets. We recommend collecting at least 50 samples to start.

Next, we run a custom Python script that mixes the samples with random snippets of background noise and combines our custom keywords with keywords from the Google Speech Commands dataset to build a curated dataset.
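
The script itself is Python, but its core mixing step is simple: overlay each 1-second keyword snippet on a random background-noise snippet at reduced volume and clamp the result to the 16-bit PCM range. A minimal sketch of that idea (shown here in C++; the function and buffer names and the 0.3 noise gain are illustrative, not taken from the actual script):

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>

    // Mix a keyword snippet with background noise at reduced volume.
    // All buffers hold n samples of 16-bit PCM audio.
    void mix_with_noise(const int16_t *keyword, const int16_t *noise,
                        int16_t *out, size_t n, float noise_gain = 0.3f) {
        for (size_t i = 0; i < n; i++) {
            int32_t mixed = keyword[i] + (int32_t)(noise_gain * noise[i]);
            // Clamp to the valid int16 range to avoid wrap-around distortion.
            out[i] = (int16_t)std::clamp<int32_t>(mixed, -32768, 32767);
        }
    }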

From there, we upload the curated dataset to Edge Impulse. We use Edge Impulse to extract features from the audio samples in the form of Mel-frequency cepstral coefficients (MFCCs) and then to train a neural network to identify our target keywords. Once done, we can test the model and download it as part of an Arduino library.
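
Once the library is downloaded, its generated metadata header records exactly what the feature-extraction stage expects as input. As a quick sanity check (a sketch, assuming a project named "my_keywords"; Edge Impulse names the header after your own project):

    #include <my_keywords_inferencing.h>  // placeholder: named after your project

    void setup() {
        Serial.begin(115200);
        while (!Serial);
        // Constants generated by Edge Impulse in model_metadata.h:
        Serial.print("Sample rate (Hz): ");
        Serial.println(EI_CLASSIFIER_FREQUENCY);         // e.g. 16000
        Serial.print("Samples per inference: ");
        Serial.println(EI_CLASSIFIER_RAW_SAMPLE_COUNT);  // e.g. 16000 = 1 second
        Serial.print("Number of labels: ");
        Serial.println(EI_CLASSIFIER_LABEL_COUNT);
    }

    void loop() {}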

We load the library into the Arduino IDE and use it to perform inference in real time. The Arduino example code continually captures audio data, extracts features (computes MFCCs), and uses those MFCCs as inputs to the trained model. The model returns what are essentially probabilities that it heard each of our target keywords.

We can compare those output values to thresholds to take action whenever a desired keyword is heard! To start, we’ll blink a simple LED (because who doesn’t love an overly complicated blinky program?).
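
Stripped of the microphone plumbing, the heart of that example looks roughly like this (a sketch based on Edge Impulse's exported Arduino SDK; the header name, the keyword index, and the 0.8 threshold are assumptions you would adapt to your own project):

    #include <my_keywords_inferencing.h>  // placeholder: named after your project

    // Assume the PDM microphone callback fills this with 1 second of audio
    // (see the library's nano_ble33_sense_microphone example).
    static int16_t raw_audio[EI_CLASSIFIER_RAW_SAMPLE_COUNT];

    // Callback the SDK uses to pull audio samples and convert them to float.
    static int get_audio_data(size_t offset, size_t length, float *out_ptr) {
        numpy::int16_to_float(&raw_audio[offset], out_ptr, length);
        return 0;
    }

    // Call pinMode(LED_BUILTIN, OUTPUT) in setup() before using this.
    void classify_and_blink() {
        signal_t signal;
        signal.total_length = EI_CLASSIFIER_RAW_SAMPLE_COUNT;
        signal.get_data = &get_audio_data;

        ei_impulse_result_t result = {0};
        // run_classifier() computes the MFCCs and runs the neural network.
        if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) return;

        // Assumption: label index 1 is our target keyword; 0.8 is a starting
        // threshold to tune against your own test results.
        if (result.classification[1].value > 0.8f) {
            digitalWrite(LED_BUILTIN, HIGH);  // keyword heard: LED on
        } else {
            digitalWrite(LED_BUILTIN, LOW);
        }
    }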

Product Links:

Related Videos:
What is Edge AI?

Intro to TensorFlow Lite Part 1: Wake Word Feature Extraction

Intro to TensorFlow Lite Part 2: Speech Recognition Model Training

Intro to TensorFlow Lite Part 3: Speech Recognition on Raspberry Pi

Getting Started with TensorFlow Lite for Microcontrollers

Related Project Links:

Related Articles:

Getting Started with TensorFlow Lite for Microcontrollers
Comments

I have come to greatly appreciate Digi’s dedication to education, not to mention what an amazing teacher Shawn is! Keep up the good work, and cheers.

VAAYG

With so many "uncertain" samples in the test set, lowering the minimum confidence rating to 0.6 gives much better results.

janjongboom

Thank you! Great and timely content, with great pacing and information density (at 2x).

userou-igze

It'd be nice if you could show how to run an audio-classifying TFLite model on an Arduino Nano / Raspberry Pi Pico *using an analog microphone*. There's no proper video that I could find on the web that does that, or even remotely resembles this concept.

rohanmanchanda

I need some help. While setting things up in Anaconda, I'm getting this error:

dataset-curation.py: error: the following arguments are required: d

I really don't know what this could be, and I would really appreciate any help. Thanks!

eonoire

This is really good content! Thanks very much.

harrytsai

Can the method explained in the video be used to recognize a specific sound rather than specific speech, e.g., a clap?

adhamelrouby

Thank you, dude. Very cool tutorial. Please make an STM32F4 speech recognition example.

resatyigen

Can I use the data curation script on my own dataset and profit from the model I develop?

mri

Hi Shawn, thanks for the great tutorial. I dusted off my BLE, went through every one of your steps, and got it all working in near-flawless fashion, six hours later. I am now in a good place to start experimenting. My keyword was 'shut-down' (only 48 samples), which turns the LED on, and 'go' turns the LED off. This could be a light bulb controller or a TV power switch. A lot of sounds get confused for 'go', so increasing the threshold for 'go' to 85% worked well.
Perhaps it's just my OS, but I had to replace all your '\' with '/' in the curation script command line. Oddly, my feature extraction took less than half the time of yours (123 ms), but I have the same BLE Sense as you.
I want to try expanding the number of wake words. What do you think held you back: processor speed, RAM, Edge Impulse, data...?
Looking forward to your next videos.

bertbrecht

Thank you so much.
I have added the library as per Edge Impulse but am still facing an issue: "The filename or extension is too long."

Error compiling for board Arduino Nano 33 BLE.

ashwinis

I want to use this method with an ESP32. How can I make the program use the audio data coming in via I2S?

sureshtiwari

Hi, thanks for the tutorial. I want to use the method with an ESP32. How can I make the program use the audio data coming in via I2C?

hamishgrant

Good day, is there a guide like this one but using only a Raspberry Pi Pico and its analog channels connected to a microphone?

brayanaquino

I could have told you a much faster way to do the audio samples, but you already did it, so...

nikonissinen

"hand me my patching trowel, boy!"

djtomoy

Is this using TFLite Micro in the backend?

akkutyagi

Why do I not have AppData under my user folder? :(

MilSimVipers

Hi! I want to convert the speech to text and then work with the text in Python. Would this module work for me for this?

imsteven

"I've got 68, which should work for this prototype"

Ckckck, I'm disappointed Shawn

dhupee