CppCon 2018: Hana Dusíková “Compile Time Regular Expressions”

—
I will present a library that uses a C++20 feature to build regular expressions from compile-time strings. This novel approach avoids the usual disadvantages of other regular expression implementations, such as having to link against a library or paying the run-time cost of parsing the pattern and interpreting an internal finite-state machine.

You will see implementation details of the library and the problems I ran into while writing it. In the last part of the talk, I will compare it with other regular expression engines and show the compiled code in Compiler Explorer.
—
Hana Dusíková, Avast
Senior Researcher

Hana works as a senior researcher at Avast Software. She is responsible for exploring new ideas and optimizing existing ones. She also promotes modern C++ techniques and libraries in internal tech talks and gives talks at local C++ meetups.

She studied computer science at Mendel University and subsequently taught several courses there, including Data Structures, Computability and Complexity, and Formal Languages and Automata.
—

Comments

Good video, enjoyed watching it.
Especially the moment when she revealed compilation time benchmarking

zzzXopHeTzzz

Thank you for sharing this! That's a great use of compile-time evaluation. Can't wait to see it revised for C++20 proper! :)

PROMuffy

wow, shows how strong keep it simple can be, even with complex things like compile time evaluation in c++17. amazing

jhbonarius

Great presentation of a fantastic idea.

JonathanSharman

Wow! Great job on the lib and talk. Just brilliant

bobbymah

I came across this project some time back when I was creating a similar compile-time pattern matching class of about 100 LOC targeting C++14. Since it involved numerical ranges, the regex representation was somewhat large. This library took only around 8 ns, whereas the hand-rolled version took ~8.5 ns, while std::regex and Intel Hyperscan took about 130 ns. Our library version took ~17.5 ns. Made me realize that Avast has some excellent programmers!

HashanGayasri

Fantastic stuff. Great presentation, I really enjoyed it.

jonathanwatmough

35:56 There is no mistake: Plus and Star are correct. The additional case on the next slide tests whether the loop runs 0 times, because inside the loop a "Star..." is applied in every iteration.
37:00 Given that, I think the match for star should just
return match(begin, it, end, list<opt<plus<Star...> >, Ts...>{});

ViktorEngelmann
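
The rewrite suggested in the comment above (star expressed as an optional plus) can be illustrated with a small type-list matcher. This is only a sketch under stated assumptions, not the library's actual code: the namespace `sketch`, the helper `matches`, the alias `a_star_b`, and the two-pointer signature (the comment's `begin` argument is dropped) are all hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {  // hypothetical namespace, not part of any real library

// Pattern building blocks, named after the ones mentioned in the comment.
template <typename... Ts> struct list {};  // sequence of pattern elements
template <char C>         struct ch {};    // one literal character
template <typename... Ts> struct opt {};   // zero or one occurrence
template <typename... Ts> struct plus {};  // one or more occurrences
template <typename... Ts> struct star {};  // zero or more occurrences

// Nothing left in the pattern: succeed only if the input is exhausted.
constexpr bool match(const char* it, const char* end, list<>) {
    return it == end;
}

// Literal character: consume it and continue with the rest of the pattern.
template <char C, typename... Ts>
constexpr bool match(const char* it, const char* end, list<ch<C>, Ts...>) {
    return it != end && *it == C && match(it + 1, end, list<Ts...>{});
}

// Optional: try with the content first, then without it (backtracking).
template <typename... Opt, typename... Ts>
constexpr bool match(const char* it, const char* end, list<opt<Opt...>, Ts...>) {
    return match(it, end, list<Opt..., Ts...>{}) || match(it, end, list<Ts...>{});
}

// Plus: one occurrence followed by a star of the same content.
template <typename... Plus, typename... Ts>
constexpr bool match(const char* it, const char* end, list<plus<Plus...>, Ts...>) {
    return match(it, end, list<Plus..., star<Plus...>, Ts...>{});
}

// Star: the form proposed in the comment, an optional plus.
template <typename... Star, typename... Ts>
constexpr bool match(const char* it, const char* end, list<star<Star...>, Ts...>) {
    return match(it, end, list<opt<plus<Star...>>, Ts...>{});
}

// Convenience wrapper for string literals (N includes the trailing '\0').
template <std::size_t N, typename... Ts>
constexpr bool matches(const char (&s)[N], list<Ts...> re) {
    return match(s, s + N - 1, re);
}

using a_star_b = list<star<ch<'a'>>, ch<'b'>>;  // the pattern a*b

static_assert(matches("aaab", a_star_b{}), "a*b matches aaab");
static_assert(matches("b", a_star_b{}), "star may run zero times");
static_assert(!matches("aaa", a_star_b{}), "the final b is required");
static_assert(!matches("b", list<plus<ch<'a'>>, ch<'b'>>{}), "plus needs one a");

// The same matcher also works on values known only at run time.
inline void run_time_check() {
    assert(matches("ab", a_star_b{}));
}

}  // namespace sketch
```

The sketch only terminates when the content of a star consumes at least one character per iteration; a nested pattern such as a** would recurse without progress.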

About the runtime behavior: I think the examples are somewhat generous, because the regexps are very deterministic, so the downside of backtracking almost never shows up. If you matched a long sequence of 'a' against a*a*b, it would obviously not match, but the algorithm here would spend a lot of time (quadratic) trying all the possible transitions from the first to the second a*. And you can reach any polynomial degree just by adding more a*... If you did a**b, a***b etc., I'm not sure how it would behave: I suspect a**b is exponential, and a***b might be double-exponential. I'm not even sure a**b would terminate, because the outer star might add infinitely many a*'s that generate ɛ...

ViktorEngelmann
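
The quadratic cost described in the comment above can be made concrete by counting attempts in a naive greedy backtracking matcher for a*a*b. This is a hand-written sketch, not the library's matcher; the function names (`match_b`, `match_star`, `match_pattern`, `attempts_on_all_a`) are hypothetical, and a real engine may apply optimizations that this one does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Final 'b': it must be the last character of the input.
// Every call counts as one attempt.
bool match_b(const std::string& s, std::size_t pos, std::size_t& attempts) {
    ++attempts;
    return pos < s.size() && s[pos] == 'b' && pos + 1 == s.size();
}

// One a*: consume as many 'a' as possible, then give them back one by one.
// `last` tells whether the remaining pattern is "b" (true) or "a*b" (false).
bool match_star(const std::string& s, std::size_t pos, bool last,
                std::size_t& attempts) {
    std::size_t max = pos;
    while (max < s.size() && s[max] == 'a') ++max;
    for (std::size_t k = max + 1; k-- > pos;) {  // k runs from max down to pos
        bool ok = last ? match_b(s, k, attempts)
                       : match_star(s, k, true, attempts);
        if (ok) return true;
    }
    return false;
}

// Matches the whole input against a*a*b and reports the number of attempts.
bool match_pattern(const std::string& s, std::size_t& attempts) {
    attempts = 0;
    return match_star(s, 0, false, attempts);
}

// Number of attempts needed to reject an input of n 'a' characters.
std::size_t attempts_on_all_a(std::size_t n) {
    std::size_t attempts = 0;
    bool matched = match_pattern(std::string(n, 'a'), attempts);
    assert(!matched);
    (void)matched;  // silence the unused-variable warning when NDEBUG is set
    return attempts;
}
```

For an input of n 'a' characters, every split point of the first a* is combined with every split point of the second, so `attempts_on_all_a(n)` returns (n+1)(n+2)/2: 66 for n = 10 and 5151 for n = 100. Each additional a* in the pattern adds another nested loop and raises the degree by one.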

Very nice, but surely, if you're going to compile simple parsers like that, you should use something more well-typed than regular expressions? Doesn't C++ have proper parser combinator libraries by now that achieve the performance simply by inlining of the primitive parsers, and can furthermore parse the data right into a suitable type (probably with `std::variant`) and give you compile-time errors when what you're trying to match doesn't have the right shape?
That at least is how this kind of stuff would be done in Haskell, except in very simple, not performance-critical applications where a normal runtime regex engine does the job just fine.

leftaroundabout

Compile time regular expressions - Hana Dusíková - Meeting C++ 2018

A State of Compile Time Regular Expressions - Hana Dusíková - CppCon 2019

CppCast Episode 171: Compile Time Regular Expressions with Hana Dusíková

CppCon 2018: “Multi-Precision Arithmetic for Cryptology in C++, at Run-Time and at Compile-Time”

Episode #46 - with Hana Dusíková

Hana Dusíková - 'Compile Time Regular Expressions with Deterministic Finite Automaton'

CppCon 2018: Fabian Renn-Giles “A Semi Compile/Run-time Map with (Nearly) Zero Overhead Lookup”

Hana Dusíková - A state of compile time regular expressions

Core C++ 2019 :: Hana Dusíková :: Compile Time Regular Expressions

C++Now 2019: Hana Dusíková “Compile Time Regular Expressions with A Deterministic Finite Automaton”...

CppCon 2018: Matthew von Arx “Set it and forget it!”

Compile Time Regular Expressions - Hana Dusíková

CppCon 2017: Hana Dusikova “Regular Expressions Redefined in C++”

CppCon 2018: Pablo Halpern “Using Compile-time Code Generation to build an LLVM IR Pattern Matcher”...

CppCon 2018: Mark Elendt “Patterns and Techniques Used in the Houdini 3D Graphics Application ”

CppCon 2018: Juan Manuel Martinez Caamaño “Easy::Jit: A Just-in-Time compilation library for C++”...

CppCon 2018: Anastasiia Kazakova “Debug C++ Without Running”

CPPP 2019 - A State of Compile Time Regular Expressions - Hana Dusíková

Beyond the Horizon of C++ - Hana Dusíková - Meeting C++ Secret Lightning Talks

CppCon 2018: Brian Ruth “std::basic_string: for more than just text”

CppCon 2018: Nir Friedman “Understanding Optimizers: Helping the Compiler Help You”

C++ Cryptozoology - A Compendium of Cryptic Characters :: #2 - Adi Shavit [ CppCon 2018 ]

CppCon 2018: Tony Wasserka “Teaching Old Compilers New Tricks: Transpiling C++17 to C++11”