Python: Renaming PDFs using text inside a document with regex


In this tutorial, we expand on renaming PDFs using regular expressions (regex). This is just one of many examples of using regex; if your requirements differ, you will need a different regular expression.

I have posted a written explanation of the regular expression used in the video on GitHub.

If you have any questions, leave them down below and I'll try to respond (hopefully more quickly this time).

Chapters:
00:00 Intro
00:18 Requests & start
01:28 Reviewing code
07:34 Reviewing the regex expression
11:23 Alt. regex w/o formatting
12:46 Example run with regex
14:11 Using a list of names to rename


Comments

IMPORTANT:
If you do NOT CARE about what comes after the keyword for the positive lookbehind expression, use the following instead:
(?<=Order #: ).+

stephencodes
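
As a rough illustration of the pinned pattern, here is a minimal sketch. The sample text is made up and stands in for whatever was extracted from a PDF page; the variable names are not from the video.

```python
import re

# Hypothetical text, standing in for text extracted from one PDF page.
page_text = "Invoice\nOrder #: 12345-AB\nShip to: Example Co."

# (?<=Order #: ) asserts that "Order #: " sits immediately before the match
# without becoming part of it; .+ then takes the rest of that line
# (by default "." does not cross a newline).
match = re.search(r"(?<=Order #: ).+", page_text)

# re.search returns None when nothing matches, so check before calling .group().
order_id = match.group().strip() if match else None
print(order_id)
```

Checking for None before calling .group() avoids the AttributeError that appears when the keyword is not found in the text.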

Thank you, Steve. You saved me from manually renaming nearly 400 PDFs, and many more in the future. I'm a trial attorney who handles big medical files that are often unorganized. My regex is \d{1,2}(\/|-)\d{1,2}(\/|-)(\d{4}|\d{2}); I then rearrange and pad the pieces and add a random 3-digit string to make the filename unique, so I can sort and group date-related records.

parkourninja
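
The date approach described in this comment could be sketched roughly as follows. The function name, the month/day order, the "20" century prefix for two-digit years, and the output layout are all assumptions, not the commenter's actual code. The quantifier has to be written {1,2} without a space, because Python's re module treats "{1, 2}" as literal text.

```python
import random
import re

# Same idea as the commenter's pattern, with [/-] in place of (\/|-).
DATE_RE = re.compile(r"(\d{1,2})[/-](\d{1,2})[/-](\d{4}|\d{2})")

def date_prefix(text):
    """Return a sortable 'YYYY-MM-DD_NNN' prefix, or None if no date is found."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    month, day, year = m.groups()  # assumption: dates are written month/day/year
    if len(year) == 2:
        year = "20" + year  # assumption: two-digit years belong to the 2000s
    suffix = random.randint(100, 999)  # random 3-digit string to reduce name clashes
    return f"{year}-{int(month):02d}-{int(day):02d}_{suffix}"

print(date_prefix("Date of service: 3/7/2021"))
```

A random suffix makes collisions unlikely but does not rule them out, so checking whether the target name already exists before renaming is still worthwhile.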

Omg the first time my comment is in a video! Thank you so much for this amazing tutorial! When are you going to set up a Patreon?! Or I can pay you back in calculus videos or any higher level math tutoring!

TacosYBurritosP

Bro is just insane, thank you so much for this video man

Lioneriod

Man
Thank you so much, worked like a charm

giamonioz

Hey Stephen - I have thousands of PDFs in a folder. Taking your case as an example, some PDFs have an Order # that we want to rename by, but other PDFs in the same folder have a Product # instead of an Order #. How can I handle both in the same code? Does "or" work in regex?

aayushaggarwal
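
One way to handle both labels is alternation with a capture group, sketched below. The exact label text ("Product #: ") and the function name are assumptions. Python's re module only accepts fixed-width lookbehinds, so (?<=Order #: |Product #: ) is rejected because the two alternatives differ in length; a non-capturing group followed by a capture group sidesteps that.

```python
import re

# (?:Order|Product) matches either label; (\S+) captures the value after it.
ID_RE = re.compile(r"(?:Order|Product) #: (\S+)")

def find_id(text):
    """Return the value after 'Order #: ' or 'Product #: ', or None."""
    m = ID_RE.search(text)
    return m.group(1) if m else None

print(find_id("Order #: A100 shipped"))
print(find_id("Product #: P-77"))
```

Use (.+) instead of (\S+) if the value can contain spaces and runs to the end of the line.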

Hi Stephen, this works like a dream.
But when I try to change the cr_regex line to suit my case, it does not work.
The text in my file is Ｂ／Ｌ番号(1) JBX1A12345. I only want the JBX1A12345, so I tried changing it to cr_regex = r'(?<=Ｂ／Ｌ番号(1) )[A-Z]{4}\d+', but it raises AttributeError: 'NoneType' object has no attribute 'group'.

noctischen
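
Two things in that pattern would each prevent a match. The unescaped parentheses in (1) form a group that matches just "1", so the literal brackets are never matched; and [A-Z]{4} needs four letters, while JBX1A12345 has a digit in fourth position. A minimal sketch of one possible fix follows, assuming the extracted text looks exactly like the sample in the comment (real PDF extraction may insert different spacing).

```python
import re

# Sample text copied from the comment above.
text = "Ｂ／Ｌ番号(1) JBX1A12345"

# re.escape backslash-escapes the parentheses so they are matched literally.
label = re.escape("Ｂ／Ｌ番号(1) ")

# [A-Z0-9]+ accepts any mix of capital letters and digits after the label.
cr_regex = rf"(?<={label})[A-Z0-9]+"

match = re.search(cr_regex, text)
# Guard against None to avoid "'NoneType' object has no attribute 'group'".
bl_number = match.group() if match else None
print(bl_number)
```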

This is awesome man. Nice work. Would it be difficult to edit the code to exclude special characters? It worked perfectly other than instances where I had a "/" in the lookup text.

johnnyb
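
A small helper along these lines could replace characters that are not allowed in file names before renaming. This is a sketch; the function name, the replacement character, and the choice of forbidden characters (the set Windows rejects, which includes "/") are assumptions.

```python
import re

def safe_filename(name, replacement="-"):
    """Replace characters that Windows forbids in file names."""
    # Backslash, slash, colon, asterisk, question mark, double quote,
    # angle brackets and pipe are swapped for the replacement character.
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, name)
    # Windows also dislikes names that end in a space or a dot.
    return cleaned.strip(" .")

print(safe_filename("PO 12/34"))
```

Applying it to the matched text just before building the new path keeps the rest of the script unchanged.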

How would we do this if we want to take different text from the PDF, such as the case number, document number, and name, and save the file under that name?

for example:

using the naming format "C:\...\Case Name\DocumentNumber FilingDate LastName FilingType.pdf."

"C:\...\Leal v. Bedel et al\#026 2022-07-02 Staedter Motion for Extension of Time to File Answer.pdf."

greenlight
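
One possible approach to a multi-field name is to run one regex per field and join the results. The sketch below is only an outline: the labels ("Document No.", "Filed:", "Attorney:", "Title:"), the field names, and the sample text are all hypothetical and would need to match what the real documents contain.

```python
import re

# Hypothetical labels; adjust each pattern to the wording in your own PDFs.
FIELD_PATTERNS = {
    "doc_number": r"Document No\.\s*(\d+)",
    "filing_date": r"Filed:\s*(\d{4}-\d{2}-\d{2})",
    "last_name": r"Attorney:\s*(\w+)",
    "filing_type": r"Title:\s*(.+)",
}

def build_name(text):
    """Return 'DocumentNumber FilingDate LastName FilingType.pdf', or None."""
    values = {}
    for field, pattern in FIELD_PATTERNS.items():
        m = re.search(pattern, text)
        if m is None:
            return None  # skip the file rather than guess a missing field
        values[field] = m.group(1).strip()
    number = int(values["doc_number"])  # pad the number to three digits below
    return (f"#{number:03d} {values['filing_date']} "
            f"{values['last_name']} {values['filing_type']}.pdf")

# Made-up sample text for illustration.
sample = ("Document No. 26\nFiled: 2022-07-02\nAttorney: Staedter\n"
          "Title: Motion for Extension of Time to File Answer")
print(build_name(sample))
```

Returning None when any field is missing lets the calling loop leave that PDF untouched instead of producing a partial name.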
