Building an OCR Model to Crack Captchas: A Neural Network Tutorial with Keras and TensorFlow


You will also get access to all the technical courses inside the program, also the ones I plan to make in the future! Check out the technical courses below 👇

_____________________________________________________________

In this video 📝 we will create an OCR Model To Read Captchas With Neural Networks In Keras And TensorFlow. We will first go over what a recurrent neural network is and why we are going to use that in this video to create an OCR model. We will talk about the CTC loss and at the end of the video, we will create the model, load in the dataset, preprocess it and train our neural network.

If you enjoyed this video, be sure to press the 👍 button so that I know what content you guys like to see.

_____________________________________________________________


Tags:
#OCR #NeuralNetwork #Captchas #NeuralNetworks #DeepLearning #NeuralNetworksPython #NeuralNetworksTutorial #DeepLearningTutorial #Keras #Tensorflow
Comments

Join My AI Career Program
Enroll in My School and Technical Courses

NicolaiAI

For anyone who is getting poor results:

1. The dataset is small, so a random split may not be representative. For example, the training set might contain a much higher percentage of one digit than another.

2. You can use OpenCV for preprocessing. Morphological transformations that remove noise can improve performance immensely.

3. To avoid overfitting, I found that a Gaussian noise layer can help. It makes the training data harder to memorise and therefore harder to overfit.
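Point 1 can be checked before any training. The following is a minimal sketch in plain Python, assuming the labels are the captcha strings; `split_labels`, `char_share` and `largest_gap` are hypothetical helper names, and the labels are made up for illustration. It compares how often each character appears in the two halves of a split.

```python
import random
from collections import Counter


def split_labels(labels, train_fraction=0.9, seed=0):
    """Shuffle a copy of the labels with a fixed seed and cut it in two."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def char_share(labels):
    """Return each character's share of all characters in the labels."""
    counts = Counter("".join(labels))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


def largest_gap(train, valid):
    """Largest absolute difference in character share between the two sets."""
    a, b = char_share(train), char_share(valid)
    return max((abs(a.get(ch, 0.0) - b.get(ch, 0.0)) for ch in set(a) | set(b)),
               default=0.0)


# Made-up labels, only to show the call pattern.
labels = ["2b827", "3ndxd", "4c8n8", "5mfff", "6n6gg",
          "7wyp4", "8n5p3", "b84xc", "c2pg6", "d22bd"]
train, valid = split_labels(labels, train_fraction=0.8, seed=1)
print(len(train), len(valid), round(largest_gap(train, valid), 3))
```

If the gap is large, trying another seed or a larger validation share is a cheap first step.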

Hope this helps!

axelanderson

Where is the repository link?
I am not able to find it in the description.

adepusairahul

Do you think I can use your code to decode the digits of my water counter?

benoitd

Hey, thank you very much for this beautiful explanation of the code and the ideas behind OCR with an LSTM and a CTC layer.
Can you please verify that the code still works? It was working when I first ran it, but now it doesn't. I think there is a problem in mapping characters to numbers and in mapping numbers back to their original characters. I tried to run it in Google Colab, but when I visualize the data it doesn't show the correct label text.
I would be very thankful if you could verify it and suggest how to fix the mapping of characters to numbers and of numbers back to their original characters.

souhailel-ghayam

Dear Coding Lib! I'm here with the captcha project. It seems that turning shuffle on breaks the split function and produces an incorrect split. I have yet to find a solution and would really appreciate it if you looked into it. If shuffle is off, it works well. Another person pointed out the same bug: the labels end up on the wrong images.
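One way to rule out labels drifting away from their images is to shuffle only the image paths and derive each label from its own path afterwards. This is a hedged sketch, not the code from the video: `label_from_path` and `split_paths` are hypothetical helpers, the paths are made up, and it assumes the file name without its extension is the captcha text.

```python
import os
import random


def label_from_path(path):
    """Assumption: the file name without its extension is the captcha text."""
    return os.path.splitext(os.path.basename(path))[0]


def split_paths(paths, train_fraction=0.9, seed=42):
    """Shuffle the paths once, then derive labels from the shuffled paths."""
    shuffled = sorted(paths)  # same starting order on every run
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, valid = shuffled[:cut], shuffled[cut:]
    return ([(p, label_from_path(p)) for p in train],
            [(p, label_from_path(p)) for p in valid])


# Made-up paths, only to show the call pattern.
paths = ["captchas/2b827.png", "captchas/3ndxd.png",
         "captchas/4c8n8.png", "captchas/5mfff.png"]
train_pairs, valid_pairs = split_paths(paths, train_fraction=0.75)
for path, label in train_pairs + valid_pairs:
    # Every label still belongs to its own image.
    assert label == label_from_path(path)
```

Because each label is computed from the path it travels with, no shuffle order can pair a label with the wrong image.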

GuyJustCool

Hi Nicolai, when I add the parameter num_oov_indices=0 to the StringLookup code, the model training code works, but the visualization step (before creating and training the model) shows labels on the wrong images. So I removed num_oov_indices, and now the training code with early stopping does not work: it stops in the very first epoch. Is there any solution for this?

syedmuzammilahmed

I'm trying to run this code, but I'm getting an error like "InvalidArgumentError: Graph execution error".
Can anyone help with this?

omkarmestry

Hi Nicolai,
I was wondering whether there is a way to feed wider images with text into this kind of network, or to give it a dynamic input size?

alexmoruz

By the way, can I extract text from images? My final project is to extract text from images, but I cannot code. I need help, please.

sule-yg

Can I use this model for license plates?

coconutnut

Is it suitable for text recognition tasks?

chelvanchelvam

Hey, I am not getting accurate results. I checked your GitHub, and for some reason the labels aren't matching the captchas during testing. What would you recommend?

prathamshah

Can you provide the library versions you used?

int-

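I've finally ended up with this working configuration:

import os

# The glob call and the label expression were cut off in this comment. The two
# lines below are an assumed reconstruction: data_dir is a pathlib.Path pointing
# at the image folder, and each label is the file name without its extension.
images = sorted(map(str, data_dir.glob("*.png")))
labels = [os.path.basename(img).rsplit(".", 1)[0] for img in images]
vocab = sorted(set("".join(labels)))
max_length = max(len(label) for label in labels)

char_to_num = StringLookup(vocabulary=vocab, mask_token=None, num_oov_indices=0, oov_token="[UNK]")
num_to_char = StringLookup(vocabulary=char_to_num.get_vocabulary(), invert=True, mask_token=None, num_oov_indices=0,
                           oov_token="[UNK]")

And the rest of the code is like in the video.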
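Whatever configuration is used, a round-trip check catches a mismatched mapping before training. The sketch below uses plain dictionaries as a stand-in for the two StringLookup layers, so it runs without TensorFlow; `build_lookups`, `encode` and `decode` are hypothetical names, and the labels are made up. The same assertion can be applied to the real layers.

```python
def build_lookups(labels):
    """Plain-dict stand-in for the two lookup layers (no OOV slot)."""
    # Sorting keeps the character order identical on every run.
    vocab = sorted(set("".join(labels)))
    char_to_num = {ch: i for i, ch in enumerate(vocab)}
    num_to_char = {i: ch for ch, i in char_to_num.items()}
    return char_to_num, num_to_char


def encode(label, char_to_num):
    """Turn a label string into a list of integer indices."""
    return [char_to_num[ch] for ch in label]


def decode(indices, num_to_char):
    """Turn a list of integer indices back into a label string."""
    return "".join(num_to_char[i] for i in indices)


# Made-up labels, only to show the call pattern.
labels = ["2b827", "3ndxd", "4c8n8"]
char_to_num, num_to_char = build_lookups(labels)
for label in labels:
    # Encoding and then decoding must give back the original label.
    assert decode(encode(label, char_to_num), num_to_char) == label
print(encode("2b827", char_to_num))  # [0, 5, 4, 0, 3]
```

If the round trip fails on the real layers, the two vocabularies are offset from each other, which would explain labels that look shifted in the visualization.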

megistone

Hi Nicolai, thanks for the great explanation. Could you please explain how to measure accuracy?
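Two common measures for this kind of task are exact-match accuracy (the whole captcha must be right) and character error rate (edit distance divided by the number of target characters). The sketch below is one way to compute both in plain Python from already decoded prediction strings; the function names and sample strings are illustrative and do not come from the video.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute or match
            ))
        previous = current
    return previous[-1]


def exact_match_accuracy(predictions, targets):
    """Share of captchas predicted without a single wrong character."""
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


def character_error_rate(predictions, targets):
    """Total edit distance divided by the total number of target characters."""
    errors = sum(edit_distance(p, t) for p, t in zip(predictions, targets))
    return errors / sum(len(t) for t in targets)


# Made-up strings, only to show the call pattern.
targets = ["2b827", "3ndxd", "4c8n8", "5mfff"]
predictions = ["2b827", "3ndxd", "4c8m8", "5mff"]
print(exact_match_accuracy(predictions, targets))  # 0.5
print(character_error_rate(predictions, targets))  # 0.1
```

Exact match is the stricter number; the character error rate shows whether the model is nearly right or far off.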

ehsanroshan

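## Preprocessing

# Mapping characters to integers
# (the layer name was cut off in this comment; StringLookup is assumed here,
# with layers imported from tensorflow.keras)
char_to_num = layers.StringLookup(
    vocabulary=list(characters), mask_token=None
)

# Mapping integers back to original characters
num_to_char = layers.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)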

hsnhsynglk

Hi, how can I handle captchas that have 6 digits in each picture? Your example uses 5 digits. I know I need to change something in the model, but I can't seem to figure it out. :( The error I keep getting is: cannot add tensor to batch. Number of elements does not match. Shapes are: [tensor]: [5] [batch]: [6]

What should I change, and how do I work out what needs changing?
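The shapes in that message ([5] versus [6]) suggest that labels of two different lengths ended up in the same batch. A hedged sketch of two checks in plain Python: list the labels that do not have the expected length, and, if mixed lengths are intended, pad the encoded labels to a common length while keeping the true lengths. `odd_lengths` and `pad_labels` are hypothetical helpers, and `pad_value=10` is an arbitrary index assumed to be outside the digit vocabulary.

```python
def odd_lengths(labels, expected_length):
    """List labels whose length differs from the expected one."""
    return [label for label in labels if len(label) != expected_length]


def pad_labels(encoded_labels, pad_value):
    """Right-pad every encoded label to the longest length.

    Returns the padded labels together with the true lengths.
    """
    max_length = max(len(seq) for seq in encoded_labels)
    lengths = [len(seq) for seq in encoded_labels]
    padded = [seq + [pad_value] * (max_length - len(seq))
              for seq in encoded_labels]
    return padded, lengths


# Made-up labels: two 6-digit captchas and one stray 5-digit one.
labels = ["123456", "654321", "12345"]
print(odd_lengths(labels, 6))  # ['12345']

encoded = [[int(ch) for ch in label] for label in labels]
padded, lengths = pad_labels(encoded, pad_value=10)
print(padded[2], lengths)  # [1, 2, 3, 4, 5, 10] [6, 6, 5]
```

If padding is used, the CTC loss should receive the true lengths (for example through the `label_length` argument of `tf.keras.backend.ctc_batch_cost`) rather than the padded width.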

tricialamjingyi

Hi, I am a beginner in this field. I've watched your video and implemented this code. It's working fine, but I need to test a single captcha image. How can I do that? I tried, but the prediction was not good. Please help me out if you can. 🥺
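For a single image, the usual requirements are to apply exactly the same preprocessing as in training, add a batch dimension of size 1, and then decode the per-timestep output. The sketch below covers only the decoding step, as a greedy CTC decoder in plain Python; it is not the decoder used in the video. `greedy_ctc_decode` is a hypothetical helper, the probabilities are made up, and treating the last class as the blank is an assumption to check against your own model.

```python
def greedy_ctc_decode(step_probs, blank_index):
    """Greedy CTC decoding for one sample.

    Take the most likely class at every timestep, collapse consecutive
    repeats, then drop the blank class.
    """
    best = [max(range(len(p)), key=p.__getitem__) for p in step_probs]
    decoded = []
    previous = None
    for idx in best:
        if idx != previous and idx != blank_index:
            decoded.append(idx)
        previous = idx
    return decoded


# Made-up output for one image: 6 timesteps, classes a=0, b=1, blank=2.
step_probs = [
    [0.80, 0.10, 0.10],  # a
    [0.70, 0.20, 0.10],  # a (repeat, collapsed)
    [0.10, 0.10, 0.80],  # blank (separates the two a's)
    [0.90, 0.05, 0.05],  # a
    [0.10, 0.80, 0.10],  # b
    [0.20, 0.70, 0.10],  # b (repeat, collapsed)
]
num_to_char = {0: "a", 1: "b"}
indices = greedy_ctc_decode(step_probs, blank_index=2)
print("".join(num_to_char[i] for i in indices))  # aab
```

A poor single-image result often comes from preprocessing that differs from training (size, grayscale, scaling, transposition), so that is worth checking before the decoder.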

abhisekseal

Excuse me bro, I have an issue when I run the build_model() function after the CTC loss part.

It happens on line 43: x = layers.Reshape(target_shape=new_shape, name='reshape')(x)

ValueError Traceback (most recent call last)
in <module>()
73
74 # Panggil Functionnya buat bkin model
---> 75 model = build_model()
76 model.summary()
in build_model()
41 # floor division menghasilkan nilai berupa hasil dari pembagian bersisa
42 new_shape = ((img_width // 4), (img_height // 4) * 64)
---> 43 x = layers.Reshape(target_shape = new_shape, name='reshape')(x)
44 x = layers.Dense(64, activation='relu', name='dense1')(x)
45 x = layers.Dropout(0.2)(x)
in __call__(self, *args, **kwargs)
975 if _in_functional_construction_mode(self, inputs, args, kwargs, input_list):
976 return self._functional_construction_call(inputs, args, kwargs,
--> 977 input_list)
978
979 # Maintains info about the `Layer.call` stack.

in _functional_construction_call(self, inputs, args, kwargs, input_list)
1113 # Check input assumptions set after layer building, e.g. input shape.
1114 outputs =
-> 1115 inputs, input_masks, args, kwargs)
1116
1117 if outputs is None:

in _keras_tensor_symbolic_call(self, inputs, input_masks, args, kwargs)
846 return tf.nest.map_structure(keras_tensor.KerasTensor, output_signature)
847 else:
--> 848 return self._infer_output_signature(inputs, args, kwargs, input_masks)
849
850 def _infer_output_signature(self, inputs, args, kwargs, input_masks):
in _infer_output_signature(self, inputs, args, kwargs, input_masks)
886 self._maybe_build(inputs)
887 inputs =
--> 888 outputs = call_fn(inputs, *args, **kwargs)
889
890 self._handle_activity_regularization(inputs, outputs)

in call(self, inputs)
537 # Set the static shape for the result since it might lost during array_ops
538 # reshape, eg, some `None` dim in the result could be inferred.
--> 539
540 return result
541
in compute_output_shape(self, input_shape)
528 output_shape = [input_shape[0]]
529 output_shape += self._fix_unknown_dimension(input_shape[1:],
--> 530 self.target_shape)
531 return tf.TensorShape(output_shape)
532

in _fix_unknown_dimension(self, input_shape, output_shape)
516 output_shape[unknown] = original // known
517 elif original != known:
--> 518 raise ValueError(msg)
519 return output_shape
520

And this is the error message:
ValueError: total size of new array must be unchanged, input_shape = [50, 50, 64], output_shape = [50, 768]
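The numbers in this message can be checked by hand: a Reshape must keep the element count, and 50 × 50 × 64 = 160000 while 50 × 768 = 38400. The sketch below repeats that arithmetic in plain Python. It assumes two 2×2 pooling layers and 64 filters, as in the quoted `new_shape` line; the 200 and 50 pixel sizes are inferred from the reported shapes rather than stated in the comment, and the helper names are hypothetical.

```python
def shape_after_pooling(width, height, pool_layers=2, filters=64):
    """Feature-map shape after `pool_layers` 2x2 max-pooling steps."""
    for _ in range(pool_layers):
        width, height = width // 2, height // 2
    return (width, height, filters)


def reshape_target(width, height, filters=64):
    """Target shape from the traceback: (width // 4, (height // 4) * filters)."""
    return (width // 4, (height // 4) * filters)


def element_count(shape):
    """Number of elements in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# input_shape = [50, 50, 64] is consistent with a 200 x 200 input, while the
# target [50, 768] is consistent with img_width = 200 and img_height = 50.
actual = shape_after_pooling(200, 200)
target = reshape_target(200, 50)
print(actual, element_count(actual))  # (50, 50, 64) 160000
print(target, element_count(target))  # (50, 768) 38400

# With a 200 x 50 input the two counts agree and the Reshape is valid.
fixed = shape_after_pooling(200, 50)
print(element_count(fixed) == element_count(target))  # True
```

Under these assumptions, the likely cause is that the shape passed to the Input layer does not match the `img_width` and `img_height` used to compute `new_shape`.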

hendrywijaya