The Insanity Of Linux's Regular Expressions

Comments

"The plural of Regex is Regrets."

lockonjunkie

As the old saying goes:

"Hey, I know, I'll use Regex to solve this problem." Now you have two problems.

stbuchok

Thank you! I’ve been saying this for 20 years and what I’ve been told is that every regex has to have these subtle differences because “they solve different problems!”

diego

And then Rust thought they should invent their own RE specification, just to mess with me, I'm sure.

swanyriver

I first really learned Regular Expressions from the terrific O'Reilly "Owl" book, Mastering Regular Expressions by Jeffrey E. F. Friedl, second edition (my favorite, though the third edition is equally excellent). I reread it rather often.
Yeah, he goes into a lot of detail, but explains the "standards" are just sort of really strong suggestions.
The great thing about standards is that there are so many!
He suggests picking your favorite tools and learning the idiosyncrasies of how Regular Expressions are used by them. He also recommends using Perl, because it's clearly by far the leader in Regular Expressions.
Perl was designed to replace ALL the various command line tools you mentioned.
The book does explain that the way they work is in a relatively uniform or at least predictable manner, within just a few different "flavors."
The moral of the story, as always, is to carefully test any pattern before actually putting it into production.
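The "test before production" moral is easy to demonstrate: the very same pattern text can match different lines depending on the flavor. A minimal sketch with GNU grep (BRE by default, ERE with -E):

```shell
# Same pattern 'a+', same input, different flavor, different match:
printf 'aaa\na+\n' | grep -x 'a+'      # BRE: '+' is a literal, matches the line "a+"
printf 'aaa\na+\n' | grep -xE 'a+'     # ERE: '+' means one-or-more, matches "aaa"
```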

lorensims

Phew. I was losing my mind with RegEx and VSCode the other day... I gave up. Glad it wasn't my fault.

delicious_seabass

Love the thumbnail, you have the same facial expression my best friend had when I asked him if he was my best friend.

stalker

And in POSIX shell, you have shell expansions and glob patterns to add to the confusion.
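A quick illustration of why the two pattern languages trip people up: `*` means "any string" in a glob but "zero or more of the previous atom" in a regex. A small sketch:

```shell
# Glob: 'a*' means "a followed by anything"
case abc in a*) echo 'glob matches abc';; esac
# Regex: 'a*' means "zero or more a's", so it matches even a line with no a at all
printf 'xyz\n' | grep -c 'a*'    # prints 1: the empty match is enough
```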

ngelf

I use Perl, and yes, being modern it works. But put a '?' (lazy, not a grouping) after the '+' (greedy) in a regular expression that matches globally rather than returning on the first match, and you get different and mostly unexpected results. I expected '?' to be the more proper one.
Example:
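(The original example didn't survive the scrape; below is a minimal sketch of the greedy-vs-lazy difference described above, using grep -P as a stand-in for Perl and assuming GNU grep built with PCRE support.)

```shell
# Greedy '+' grabs as much as it can; lazy '+?' stops at the first closing '>'
echo '<a><b>' | grep -oP '<.+>'     # one match: <a><b>
echo '<a><b>' | grep -oP '<.+?>'    # two matches: <a> then <b>
```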

vilijanac

I am confident in my RE skills in programming languages, but grep on the command line always turns into looking at the documentation for me. I know that egrep/grep -E is junk, and I learned to default to always passing the -P flag for the least painful experience.
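The -E vs -P gap shows up immediately with PCRE shorthands like \d, which POSIX ERE doesn't have (it wants [0-9] or [[:digit:]]). A sketch, assuming GNU grep with PCRE support:

```shell
printf 'abc123\n' | grep -oP '\d+'       # PCRE: prints 123
printf 'abc123\n' | grep -oE '[0-9]+'    # the ERE spelling of the same thing
# With -E, '\d' is NOT a digit class; GNU grep treats the backslash-d as a literal 'd'
```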

ForeverZer

I have been using regexes for maybe 25 years, and it is rare that I can write one without looking half of it up again. I pretty much always use Regex101 (the website) to build and test them, and if I'm coding in C#/Java I use Rider/IntelliJ's help to write them (Rider has become incredibly useful). How do you name a group? Non-capturing back reference? What exactly is in a \w?
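For the last question at least there is a short answer: in PCRE, \w is [A-Za-z0-9_] (letters, digits, underscore, plus possible locale/Unicode extras). A sketch with grep -P, assuming GNU grep built with PCRE:

```shell
# The hyphen is not in \w, so it splits the match in two
printf 'foo_bar-baz9\n' | grep -oP '\w+'
# foo_bar
# baz9
```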

der.Schtefan

I've been using UNIX since 1981 and have been dealing with REGEXs since then. Pretty early on I realized that I needed to think of REGEX as a raw capability that exists in many mutated forms in different tools, but one that is expressed slightly differently in each. I figured out what I needed to do and then consulted the manual for the tool to grind through the sequence of characters that would implement the search I needed. For example, it's always a crapshoot whether the REGEX compiler in a particular tool wants '[' et al. escaped with a backslash or standing naked to get the REGEX behaviour.
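The backslash crapshoot in one place: BRE wants interval braces escaped, ERE wants them bare, and the same pattern text silently means different things. A small sketch with GNU grep:

```shell
printf 'aa\n' | grep 'a\{2\}'      # BRE: escaped braces are the repetition operator
printf 'aa\n' | grep -E 'a{2}'     # ERE: bare braces do the same job
printf 'a{2}\n' | grep 'a{2}'      # BRE: bare braces are just literal characters
```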

In some ways I use REGEXs the same way I solved calculus problems. In calculus I could never remember the slew of worked integrals and generally had more success deriving from first principles (my memorization skills are weak). So I kinda do the same with REGEX -- figure out what I want from first principles and then derive the final jumble of characters to implement the match.

I agree that PERL has the better set of regex functions. I think PERL and the PCRE library would be the way I would present REGEXs in any new tool I created.

ksbs

My thought is to have an external modular regex library that lets you set an environment variable to choose which flavor you want to use across all tools. That would allow for backwards compatibility, as everything would behave normally unless you have something like "default_regex=pre" set. The tools would of course have to be rebuilt with this in mind, but it would bring some consistency to the current chaos.
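No such switch exists today, but the idea can be approximated in one's own shell with a wrapper. `DEFAULT_REGEX` and `g` below are made-up names, a sketch of the proposal rather than any real interface:

```shell
# Hypothetical wrapper: pick a grep flavor from an environment variable
g() {
  case "${DEFAULT_REGEX:-bre}" in
    pcre) grep -P "$@" ;;
    ere)  grep -E "$@" ;;
    *)    grep    "$@" ;;   # default: plain BRE grep
  esac
}
DEFAULT_REGEX=ere
printf 'aaa\n' | g 'a+'     # ERE semantics: prints aaa
```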

Seaoftea

At a previous job, I started getting really into my dot files, solving many of the smaller problems I had (they had a really esoteric and out-of-date... everything) with increasingly elaborate bash scripts, running into all the different commands with different versions of regex, different syntaxes, and even different options with similar names.

I was getting pretty out of control (I once wrote a ridiculously inefficient method to find out if an item was in an array, without using subshells, on an older version of bash and without regex or case statements) and eventually tamped down once I realized how ridiculous it was to stay logged into work over the weekend just to work on my dotfiles.
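For the record, a no-subshell, no-regex, no-case membership test doesn't have to be ridiculous; a sketch (the `contains` name is mine):

```shell
# Return 0 if $1 equals any of the remaining arguments; no subshell, no regex
contains() {
  needle=$1; shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}
contains b a b c && echo 'found'    # prints found
```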

I mean, I now know about a ton of things like shell parameter expansion, the basics of pipes, etc., but I must admit, if anything it made me realize that a lot of what I was doing should probably have just been a Python script at the end of the day, or nothing at all 😅

BeefIngot

3:35 I gave up on simple substitutions with sed on discovering the lack of PCRE several months ago and decided to use straight-up Perl.

yash

This video explains so much! I feel a bit less dumb for randomly failing at regex. I'm gonna check your blog for more info later, thank you!

willft

Unfortunately, PCRE can be tricked into exponential runtime, which is why RE2 was developed. RE2 deliberately omits a lot of features, which is what prevents programmers from accidentally writing matchers that hang on malicious input.

sfllaw

Amen. Agreed, Perl REs should be the default for everything.

vincefinch

I’m amazed at how much I disagree. If you go back to “original Unix”, you had commands like “ed” and “sed” that used what are now called “BRE”. The goal was minimal size because “ed” was the only editor on the boot volume. So if you had a problem at boot that you needed to fix, you had to know how to use “ed” and use it well. “sed”, while not on the boot volume, came from “ed”, and thus it had a simpler regular expression capability. “grep” had BRE and “egrep” had extended regular expressions. “egrep” was the only tool back then with alternations. And, of course, “fgrep” had no regular expressions at all and was nice when grep’ing for dots and other special characters.
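Those historical splits are still visible in today's flags; a quick sketch with GNU grep (grep -E is the old egrep, grep -F the old fgrep):

```shell
# Alternation: originally an egrep/ERE feature (GNU BRE later grew a \| extension)
printf 'cat\ndog\nbird\n' | grep -E 'cat|dog'    # cat and dog
# fgrep / grep -F: no metacharacters at all; handy for literal dots
printf 'a.b\naxb\n' | grep -F 'a.b'              # only the literal a.b
```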

GNU came along and started combining tools into one big mass. This is where the confusion probably started. And, oh, by the way, no: Perl regular expressions are the odd man out. They came late to the game and caused the confusion. “POSIX”, as with all standards, is just for idiots, sorta like NATO and the UN.

So… yea. About the only thing I agree with is one statement that you disagreed with at the beginning of the video: learn your tools.

You also started out the video with a mention of “greedy” and “non-greedy” constructs. “?” and “+” are not either one of those as far as I’m concerned, and if you look in the pages I consulted while trying to verify your video, “greedy” isn’t found. As far as I know, there are various forms of grouping constructs, like parens with special decorations, that denote greedy and non-greedy matching. That’s the only time I’ve seen those two terms used.

Last: Perl was a priceless improvement when it first came out in the late 80s. It was used for scripts that were complex. It was easier to write complex scripts in Perl than in sh or csh. But when bash came out (and probably when ksh came out, but I didn’t use ksh), that advantage was lost and I quit writing Perl scripts, and thus stopped keeping up with its progress, which seemed to stop dead in the early 90s. So that is another reason to ignore the Perl regular expressions. If you seriously need really complex REs, you are probably doing search and replace, not just search, and at that point you throw it into Emacs and use its full power to get done what you need. Also, pipelines of greps can solve the extreme edge cases where complex REs might be needed far more simply than trying to figure out how to do it in one pass.

pedzsan

I have a pretty good understanding of BREs and EREs, and I know pretty well what PCREs can do, even if I have to open "perlre" to figure out how to do it. When I'm grepping, the moment I go for a backslash, I add the -E switch, and when I want a PCRE, I also pretty much want perl itself. After that, if I still haven't solved the problem (and I'm no longer having fun), I reckon I'm using the wrong tool and go write a manual parser in C++ or something.

MCLooyverse