these compression algorithms could halve our image file sizes (but we don't use them) #SoMEpi

an explanation of the source coding theorem, arithmetic coding, and asymmetric numeral systems
this was my entry into #SoMEpi. this video can get pretty confusing, so don't worry if it takes some rewatches to understand. if i had more time i would've made it better....but anyway i hadn't seen many videos on this so i hope it is a helpful introduction

00:00 intro
01:07 what's wrong with huffman
02:46 prove the source coding theorem
05:35 entropy and information theory
06:59 everything is a number
07:50 arithmetic coding
11:38 asymmetric numeral systems

the music is debussy, satie, and schumann
Comments
Author

TO CLARIFY: don't be afraid to use these algorithms! the patents for arithmetic coding are all expired, and the only justifiably patentable material in Microsoft's patent is its specific implementations of adaptive-probability rANS (which JPEG XL actually doesn't use). Microsoft has also given anyone permission to use it in open source, royalty-free codecs.

at around 7:20, i put "NH(X) = log_2 X" ... H(X) is the entropy of a symbol, not of the code! X on the left is a symbol, and X on the right is the code ... sorry! let me know of any other corrections. and thank you for watching!!

I saw some mixed reactions to the music. I made a poll about this for future reference, but I don't know how to link it here, so please check it out in my channel

JentacularGent
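
For readers puzzling over the 7:20 correction above, here is a minimal sketch of the relationship it describes: H(X) is the entropy of one symbol, and a message of N symbols needs roughly N * H(X) bits, which is about log_2 of the number that encodes the whole message. The four-symbol distribution and the message length are made-up assumptions for illustration, not values from the video.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H(X), in bits per symbol, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source: four symbols with these probabilities (an assumption).
probs = [0.5, 0.25, 0.125, 0.125]

h = entropy_bits(probs)   # bits per symbol
n = 1000                  # hypothetical message length in symbols
ideal_bits = n * h        # approximate size of the best possible code, in bits

print(h, ideal_bits)
```

With these assumed probabilities each symbol carries 1.75 bits on average, so no lossless code can do better than about 1750 bits for 1000 symbols in the long run.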
Author

0:47 intellectual property law is a completely fair and balanced system with no exploits whatsoever

leeroyjenkins
Author

Patents suffer from the problem of trivial extension. The typical thing I have noticed is that you can take a well known thing, not change it at all, and use it in a context it hasn't really been used before. Then patent that.

I can't patent Huffman Coding? What about Huffman Coding... in space?

timseguine
Author

If there were enough algorithms to do this with, I could imagine making a series of "Algorithms you can't use"

purplenanite
Author

jarek duda: check out this really cool compression algorithm I made! it’s free and unpatented and twice as good as Huffman coding.
Microsoft: check out this really cool compression algorithm I totally didn’t steal from jarek duda! it’s proprietary and patented and twice as good as Huffman coding.
patent office: yeah, we don’t have it on file. Here’s your patent, Microsoft
jarek duda:

LuveelVoom
Author

This is a neat video, and an interesting subject, however, it was a bit too fast paced for me. A few things could be done differently:
- Sometimes, you list goals, but do not put those in written format, which makes things harder to follow.
- Having a synthetic representation of the algorithm as a whole could also help a lot, seeing the states of the variables evolve in real time is neat, but insufficient.
- Getting the meaning of abstract notations takes some time, using a more explicit naming convention could help.

jeanf
Author

It's actually being used in JPEG XL.
The problem is so few programs support it despite its backing from every major web browser, photography company, open source organisation, and cybersecurity group.
Cuts the size of PNGs in half in most cases while still retaining lossless quality.

safebox
Author

6:13 Took me a second to figure out how clever this was.

Assuming you're playing a game of hangman in which the word to be guessed is an actual, English word, it could be plane, plant, plank, or plans. In that case, saying that the last letter is a vowel is worth two bits of information (as it divides the space of valid possibilities by 2^2). Saying that the last letter is a letter is worth zero bits, given assumptions.

Manabender
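
The hangman arithmetic in the comment above can be checked with a short sketch. It assumes the four candidate words are equally likely, so a clue is worth log_2(candidates before / candidates after) bits; the function name `information_bits` is made up for this example.

```python
import math

def information_bits(candidates, clue):
    """Bits gained from a clue, assuming all candidates are equally likely.

    Assumes at least one candidate survives the clue.
    """
    remaining = [word for word in candidates if clue(word)]
    return math.log2(len(candidates) / len(remaining))

words = ["plane", "plant", "plank", "plans"]

# "The last letter is a vowel" leaves only "plane": 4 -> 1 candidates.
vowel_bits = information_bits(words, lambda w: w[-1] in "aeiou")

# "The last letter is a letter" rules nothing out: 4 -> 4 candidates.
letter_bits = information_bits(words, lambda w: w[-1].isalpha())

print(vowel_bits, letter_bits)
```

The first clue is worth 2 bits and the second is worth 0 bits, matching the comment.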
Author

I can't tell you how far above my head this video is....but it approaches infinity.

denislamarche
Author

My data compression professor last semester made really good points about how Huffman still has a place. It’s just a substitution thing and you can simply read the encoded text if you have the scheme. It’s a lot more performant when it comes to decoding.

Anon-dolo
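
To illustrate the "substitution" point in the comment above, here is a rough sketch of encoding and decoding with a fixed prefix code of the kind Huffman coding produces. The code table is hand-picked for the example rather than built by the Huffman algorithm, and the bits are kept as a text string for readability.

```python
# Hypothetical prefix code (no codeword is a prefix of another).
code = {"a": "0", "b": "10", "c": "110", "d": "111"}

def encode(text, code):
    """Substitute each symbol with its codeword."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Read bits left to right, emitting a symbol whenever a codeword matches."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

bits = encode("abacad", code)
print(bits, decode(bits, code))
```

Decoding needs only the table and a left-to-right scan, with no arithmetic on a running state, which is one reason it tends to be fast.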
Author

I'm of the opinion that software shouldn't be patentable, but also that copyright should expire after no more than 28 years, although 14 would be preferable. Years ago, back when I was in middle school, I remember playing with a compression program written in QBasic which used arithmetic coding. I thought it was pretty cool, but at the time didn't know how to use it. That was significantly longer than 17 years ago now, and even if the idea were capable of being patented, it definitely should no longer be as 17 years is the limit for a patent. Of course, I also think that our patent system needs to be fixed in more ways than one. People with technical knowledge should be the ones to review patents in technical fields, and the law, which gives us the criterion that patents should be novel, should actually be applied to prevent patent squatting of the kind MS, and many other corporations acting in bad faith, have done.

anon_y_mousse
Author

1:25 "hello, how are you? fine, thank you"

wikiPika
Author

We need another edition of Math You Can't Use by Ben Klemens just to cover this BS. Well, I mean we probably need a reconsideration of the patent regime first. Software - being just maths - should *not* be granted patents at all. If the whole idea behind patents was to foster innovation then "we've been sending completely unnecessary amounts of data over the internet because patent trolls" is definitely evidence that it doesn't work that way.

Xankillr
Author

So what I got from this video is that we need patent law reforms and also Microsoft needs to get its shit kicked in REALLY hard?

Chalkadoo
Author

Thank you so much for sharing this! I was implementing LZMA earlier this year (for fun, and to learn how it works), and when I got to LZMA's range encoding component I realized how genius this approach is. It took me days to understand, but your video would've helped so much if it had been around haha

arduano
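
For anyone curious about the idea behind range encoding mentioned above, here is a toy sketch of the interval narrowing that arithmetic coding is built on. It uses exact fractions and a fixed, made-up probability table, so it is an idealized illustration only and not how LZMA's range coder (which works with fixed-width integers and adaptive probabilities) is actually implemented.

```python
from fractions import Fraction

def build_ranges(probs):
    """Give each symbol a slice of [0, 1) whose width is its probability."""
    ranges, low = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def encode(msg, probs):
    """Narrow [0, 1) once per symbol; return (low, width) of the final interval."""
    ranges = build_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in msg:
        s_low, s_high = ranges[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, width  # any number in [low, low + width) identifies msg

def decode(value, length, probs):
    """Find which slice holds the value, emit that symbol, rescale, repeat."""
    ranges = build_ranges(probs)
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in ranges.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)

# Hypothetical symbol probabilities (an assumption for this example).
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

low, width = encode("abca", probs)
print(low, width, decode(low, 4, probs))
```

The final interval has width 1/64, so about 6 bits are enough to name a number inside it, which equals the sum of each symbol's information content (1 + 2 + 2 + 1 bits) under the assumed probabilities.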
Author

1:23 Azumanga Daioh reference 「I wish I were a bird.」

DumToasty
Author

Typical fucking Microsoft, trying to patent something in the public domain.

God how I hate that company.

kristianTV
Author

This is way above my paygrade. I need more readings before I watch this video.

dumdum
Author

I remember seeing a demo for fractal encoding instead of JPG in the early 1990s, in the days when 14.4kbps modems were still very expensive, so I only had 2400 baud dialup. The demo was a real-estate website with pictures of houses. Interlaced JPG images took a few minutes to download and view, whereas the new fractal format was <10% the size and took only seconds to download for the same image quality. BUT the catch was that on low-end PCs the image took almost as long to render as the JPG took to download

dl
Author

I recently used an adaptive arithmetic encoder and an LSTM for a custom compression algo. The arithmetic encoder was cool to work with

sniper