Adding adding to my language (Compiler Pt.4)


Chapters
========
0:00 Intro
0:45 Math Grammar
6:00 Updating AST in Parser
8:53 The Problem with Pointers
11:44 Explaining an Arena Allocator
15:47 Implementing an Arena Allocator
24:20 Using the Arena Allocator
33:30 Parsing Math Expressions
43:15 Debugging
45:41 Restructuring the Grammar
54:08 Updating the Code Generation
56:27 Addition in Assembly
1:01:11 Surprise Functionality
1:04:53 Refactoring

FAQ
===
CLion Theme: One Dark theme
Keyboard: Keychron V6, Gateron G Pro V2 Brown Switches

Comments
========

I'm so glad that you are still uploading! So many of these series online are only like one or two videos before the creator stops uploading. Excited to see where this goes!

DarkPlaysThings

“Premature optimization is the root of all evil”
“We need an arena allocator”
😂

mattmurphy

Instead of parsing the math this way, you should take a look at something called the 'shunting yard algorithm'.

I've used it in my own compiler, and it saved me a bunch of headaches.

Great videos btw, can't wait to see the result!
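
For anyone who hasn't seen it, here is a minimal sketch of the shunting yard idea in C++. It only converts infix text to reverse Polish notation, and it assumes well-formed input made of non-negative integers, + - * /, and parentheses. The names precedence and to_rpn are made up for this illustration and are not taken from the video's code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stack>
#include <string>
#include <vector>

// Binding strength of each supported operator (higher binds tighter).
// Anything else, including '(', gets -1.
int precedence(char op) {
    switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        default: return -1;
    }
}

// Converts an infix expression into reverse Polish notation,
// one token per vector entry. All operators are left-associative.
std::vector<std::string> to_rpn(const std::string& src) {
    std::vector<std::string> output;
    std::stack<char> ops;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            // Numbers go straight to the output.
            const std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            output.push_back(src.substr(start, i - start));
            continue;
        }
        if (c == '(') {
            ops.push('(');
        } else if (c == ')') {
            // Flush operators back to the matching '(' and drop it.
            while (!ops.empty() && ops.top() != '(') {
                output.push_back(std::string(1, ops.top()));
                ops.pop();
            }
            assert(!ops.empty() && "unbalanced ')'");
            ops.pop();
        } else {
            const char op = static_cast<char>(c);
            assert(precedence(op) > 0 && "unexpected character");
            // Pop everything that binds at least as tightly as the new operator.
            while (!ops.empty() && precedence(ops.top()) >= precedence(op)) {
                output.push_back(std::string(1, ops.top()));
                ops.pop();
            }
            ops.push(op);
        }
        ++i;
    }
    while (!ops.empty()) {
        output.push_back(std::string(1, ops.top()));
        ops.pop();
    }
    return output;
}
```

Calling to_rpn("1 + 2 * 3") should yield the tokens 1 2 3 * +, so the multiplication is applied before the addition. A real compiler would more likely push AST nodes than strings, but the stack discipline is the same.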

larmkaart

-- Mom, can we get Tsoding,
-- No, we already have Tsoding at home.
Tsoding at home:
Adding adding to my language

I commented this on part 2 as well, but you can put the commands on one line if you separate them with a semicolon, like this: ./out; echo $?
The reason ./out && echo $? didn't work is that the right-hand side of && in bash only executes if the left-hand side exits with code 0.

wedarobi

It's so exciting to see the whole process. I really appreciate that there are no cuts, which lets us see every stage, from confusion to accomplishment. I love this format and am looking forward to the next episodes.

damianodinatale

So, about approaching operator precedence in your compiler. In our "programming languages and compilers" class we were shown how you should do it, but it was never explained why. Luckily I had enough experience with parsers after that to understand it myself.

Instead of writing prec = 0 or something while describing your grammar, use this way of describing it:
BinExpr -> ExprPrec1 + ExprPrec1 | ExprPrec1 | // add subtractions here
ExprPrec1 -> ExprPrec2 * ExprPrec2 | ExprPrec2 | // add division here
ExprPrec2 -> Expr | // add any higher precedence operators here

It's not self-explanatory why you would do that and add so many additional non-terminals (which get kind of ugly when you have 7+ of them), but just drawing an AST for this would probably be enough to get the idea.
This way, higher-precedence operators end up lower in the resulting AST. That means that walking the tree from the leaves to the root evaluates everything in the correct order, without any smart precedence-resolution algorithm.
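
To make that concrete, here is a rough C++ sketch with one parse function per precedence level. It is only an illustration: Node, Parser, show, and eval are hypothetical names, it handles just integer literals with + and *, and it uses repetition ('+' ExprPrec1)* so chains like 1 + 2 + 3 work, which goes slightly beyond the grammar written above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct Node {
    char op = 0;                  // '+' or '*'; 0 marks an integer leaf
    int value = 0;                // only meaningful for leaves
    std::unique_ptr<Node> lhs;
    std::unique_ptr<Node> rhs;
};

class Parser {
public:
    explicit Parser(std::string text) : src(std::move(text)) {}

    // BinExpr -> ExprPrec1 ('+' ExprPrec1)*   (lowest precedence, nearest the root)
    std::unique_ptr<Node> parse_add() {
        auto left = parse_mul();
        while (eat('+')) left = binary('+', std::move(left), parse_mul());
        return left;
    }

private:
    // ExprPrec1 -> ExprPrec2 ('*' ExprPrec2)*
    std::unique_ptr<Node> parse_mul() {
        auto left = parse_primary();
        while (eat('*')) left = binary('*', std::move(left), parse_primary());
        return left;
    }

    // ExprPrec2 -> int_lit   (highest precedence, always a leaf)
    std::unique_ptr<Node> parse_primary() {
        skip_spaces();
        assert(pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])));
        auto node = std::make_unique<Node>();
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos]))) {
            node->value = node->value * 10 + (src[pos++] - '0');
        }
        return node;
    }

    static std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> lhs,
                                        std::unique_ptr<Node> rhs) {
        auto node = std::make_unique<Node>();
        node->op = op;
        node->lhs = std::move(lhs);
        node->rhs = std::move(rhs);
        return node;
    }

    void skip_spaces() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }

    // Consumes c if it is the next non-space character.
    bool eat(char c) {
        skip_spaces();
        if (pos < src.size() && src[pos] == c) { ++pos; return true; }
        return false;
    }

    std::string src;
    std::size_t pos = 0;
};

// Fully parenthesized rendering, so the tree shape is visible.
std::string show(const Node& n) {
    if (n.op == 0) return std::to_string(n.value);
    return "(" + show(*n.lhs) + " " + n.op + " " + show(*n.rhs) + ")";
}

// Leaves-to-root evaluation; no precedence logic needed at this point.
int eval(const Node& n) {
    if (n.op == 0) return n.value;
    const int l = eval(*n.lhs);
    const int r = eval(*n.rhs);
    return n.op == '+' ? l + r : l * r;
}
```

For the input 1 + 2 * 3, show renders the tree as (1 + (2 * 3)): the multiplication sits below the addition, so a plain bottom-up walk gives 7.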

MrVladoCC

I'm typically a lurker on many vids and never comment... But just wanted to say how much I'm enjoying the content! Thanks and keep it up!

CoreyWysong

This is a great series. I'm 10+ years into programming, and I wanted to learn about interpreters and compilers by just building them, since it's hard to retain that much info by just reading. I followed another series where the dude built an interpreter in Python, and now I have the bones of my language working well. But his videos only implemented an interpreter, so I'm really glad you're getting into compilation, since it's apparently just interpretation with one extra step lol.

I'm gonna try to use your videos to make my language compile to LLVM bitcode!

patrickshepherd

I'm enjoying your content,
I'm enjoying the "thinking out loud" as you problem solve

I'm enjoying just how good a programmer you are

quick thoughts (on the parsing):
1. parse the whole text through linear passes first (not recursively)
2. use indexes to sections in the string (no need for pointer magic)
3. collect the keywords and variables in a vector of indexes, e.g. a type like {index int, length int} (then you can easily check whether an identifier has already been declared)
4. after collecting all identifiers and expressions, apply simple rules to check that the syntax rules are obeyed
(e.g. a math expression has an lhs and an rhs, expression precedence, identifiers preceded by let, type system obeyed, etc.)

I think the biggest benefits are:
1. easy to reason algorithmically
2. no need for allocator magic (which looks really cool btw)
3. parsing becomes a linguistics problem not a data structure problem
4. parse through function calls that check if expected expression is found at expected location

I say this because your project already has so many different types (hence the allocator magic when they recurse),
when there's no need to split anything off from the file/string, which already has all that information encoded.
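
A rough C++ sketch of the index-based idea, under some assumptions of mine: Span, Kind, scan, text, and all_identifiers_declared are invented names, only let and exit are treated as keywords, and the declaration check is deliberately naive (a name counts as declared as soon as it follows let).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

enum class Kind { Ident, IntLit, Symbol };

// A token is just a window into the source text: no copies, no pointers.
struct Span {
    Kind kind;
    std::size_t index;
    std::size_t length;
};

// One linear pass over the text, recording where each token sits.
std::vector<Span> scan(std::string_view src) {
    std::vector<Span> spans;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        if (std::isalpha(c)) {
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
            spans.push_back({Kind::Ident, start, i - start});
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            spans.push_back({Kind::IntLit, start, i - start});
        } else {
            ++i;
            spans.push_back({Kind::Symbol, start, 1});
        }
    }
    return spans;
}

// Recovers the token text on demand from the original string.
std::string_view text(std::string_view src, const Span& s) {
    assert(s.index + s.length <= src.size());
    return src.substr(s.index, s.length);
}

// Second linear pass: every identifier must have appeared after a 'let' first.
bool all_identifiers_declared(std::string_view src, const std::vector<Span>& spans) {
    std::vector<std::string_view> declared;
    for (std::size_t i = 0; i < spans.size(); ++i) {
        if (spans[i].kind != Kind::Ident) continue;
        const std::string_view word = text(src, spans[i]);
        if (word == "exit") continue;  // keyword, not a variable
        if (word == "let") {
            if (i + 1 < spans.size() && spans[i + 1].kind == Kind::Ident) {
                declared.push_back(text(src, spans[i + 1]));
                ++i;  // skip the name that was just declared
            }
            continue;
        }
        if (std::find(declared.begin(), declared.end(), word) == declared.end()) return false;
    }
    return true;
}
```

With the source let x = 5; exit(x + 1); the scan produces twelve spans and the check passes, while exit(y); fails because y never follows a let. Whether this scales better than a recursive AST is a separate question; the sketch only shows that the bookkeeping can live in plain indexes.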

judahmatende

Really great series! I can't wait to have all the steps clear before attempting my own compiler in Rust!

Furetto

Hey, awesome work there buddy! Just so you know, in formal grammars you can enforce precedence in operations like this:

[Prog] -> [Stmt]*
[Stmt] -> exit([Expr]); | let ident = [Expr]
[Expr] -> [BinSumExpr]
[BinSumExpr] -> [BinSumExpr] + [BinMultExpr] | [BinMultExpr]
[BinMultExpr] -> [BinMultExpr] * [Term] | [Term]
[Term] -> int_lit | ident

This way, you're defining the grammar rules to reflect the desired operator precedence without the need for additional variables ("prec"). It will be especially helpful when you deal with more complex operations like subtraction, division, parentheses, or exponentiation... just follow the same rule!

morel

I genuinely get excited to watch these videos from this series, thank you for making em!

Also first comment 😎 GET REKT

Kfoo-djmd

23:17
Just a quick note:
I usually use char pointers (char*) for pointers that I do arithmetic with, since a char is exactly one byte. std::byte is just a fancy feature that makes you type more :)
Also, it's safer to use reinterpret_cast for pointers, as it explicitly says you are reinterpreting the pointer instead of casting the value of the variable, which is a pointer anyway.
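
A small C++17 sketch of what that looks like side by side. The helper names are made up for illustration. Note that going from void* to a typed pointer compiles with static_cast, while going from std::byte* (or char*) to an unrelated pointer type such as int* needs reinterpret_cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Both types have size 1, so pointer arithmetic steps one byte either way.
static_assert(sizeof(char) == 1 && sizeof(std::byte) == 1,
              "char and std::byte are both one byte wide");

// Arithmetic on void* is not allowed, so convert to a byte-sized type first.
void* advance_with_char(void* p, std::size_t n) {
    return static_cast<char*>(p) + n;
}

void* advance_with_byte(void* p, std::size_t n) {
    return static_cast<std::byte*>(p) + n;
}

// std::byte* -> int* is a conversion between unrelated pointer types,
// which static_cast rejects, so reinterpret_cast spells out the intent.
// The caller is assumed to pass an offset that is suitably aligned for int.
int* int_slot_at(std::byte* buffer, std::size_t offset) {
    std::byte* where = buffer + offset;
    assert(reinterpret_cast<std::uintptr_t>(where) % alignof(int) == 0);
    return reinterpret_cast<int*>(where);
}
```

Either byte-sized type gives the same addresses; the choice is mostly about intent, since std::byte has no character or arithmetic meaning attached to it.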

lesscomplex

This is amazing, please don't stop this series!

banana

Great series so far! Just wanted to warn you that your ArenaAllocator might be unsafe since you're not checking whether the buffer has enough space left in it to allocate the requested type and could go out of bounds. I haven't used C++ myself in a decade so I'm not 100% sure that it'll cause an issue, but it made me nervous lol
Looking forward to part 5!
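
For illustration, here is one way a bounds check could look in C++17. This is a guess at the general shape, not the code from the video: the class layout, the member names m_size, m_buffer, and m_offset, and the choice to return nullptr when the arena is full are all assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

class ArenaAllocator {
public:
    explicit ArenaAllocator(std::size_t bytes)
        : m_size(bytes),
          m_buffer(static_cast<std::byte*>(std::malloc(bytes))),
          m_offset(m_buffer) {
        assert(bytes > 0);
        if (m_buffer == nullptr) throw std::bad_alloc();
    }

    // The arena owns its buffer, so copying is disabled.
    ArenaAllocator(const ArenaAllocator&) = delete;
    ArenaAllocator& operator=(const ArenaAllocator&) = delete;

    ~ArenaAllocator() { std::free(m_buffer); }

    // Returns aligned, uninitialized storage for one T,
    // or nullptr if the remaining space is too small.
    template <typename T>
    T* alloc() {
        void* ptr = m_offset;
        std::size_t space = m_size - static_cast<std::size_t>(m_offset - m_buffer);
        // std::align moves ptr up to the next alignof(T) boundary and returns
        // nullptr when sizeof(T) bytes no longer fit in the space that is left.
        if (std::align(alignof(T), sizeof(T), ptr, space) == nullptr) return nullptr;
        m_offset = static_cast<std::byte*>(ptr) + sizeof(T);
        return static_cast<T*>(ptr);
    }

private:
    std::size_t m_size;    // total capacity in bytes
    std::byte* m_buffer;   // start of the block
    std::byte* m_offset;   // first free byte
};
```

With a 16-byte arena, two alloc<double>() calls succeed and a third returns nullptr instead of walking past the end of the buffer. Throwing or growing the arena would be equally reasonable ways to handle the full case.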

DeathBean

curious to see how you're gonna handle functions and variable scopes

matthias

Loving this series, can't wait to see what comes next!

coolben

Great vid, just one nitpick: mb would be millibits

Akronymus_

Tip: When parsing expressions TDLR, if you perform the parsing in order of operator precedence, you get order of precedence for free.
For instance, when parsing a term, first look to see if it's an integer literal, then look to see if it's a "MulOp" (that is, multiplication or division), then check to see if it's an "AddOp" (addition or subtraction), and generate your AST nodes as you discover each. In this case, expression nodes with left/right always come out in the right order. For a far better explanation than I can give in a YouTube comment, see Jack Crenshaw's "Let's Build a Compiler" series, the parts on expression parsing.

ChapmanWorldOnTube