Paper Review Call 019 - UMAP

Paper Review Call 21: UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

COMPLIANCE STATEMENT:
* THIS IS NOT A MICROSOFT YOUTUBE CHANNEL
* WE ARE NOT SPOKESPEOPLE FOR MICROSOFT, I.E. THESE ARE OUR OWN OPINIONS
* THERE IS NO MICROSOFT-CONFIDENTIAL INFO IN THIS VIDEO; IT'S AN EDUCATIONAL VIDEO ON MACHINE LEARNING AND ALL INFO IS IN THE PUBLIC DOMAIN!
* EVERYONE IN THE VIDEO HAS GIVEN THEIR EXPLICIT PERMISSION FOR PUBLICATION ON YOUTUBE AND WAS AWARE IT WAS CREATED FOR THIS PURPOSE
* I WORKED FOR MICROSOFT AT THE TIME THIS CONTENT WAS CREATED

Interested in dimensionality reduction? TSNE is so last century, these days it’s all about UMAP!

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.
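In practice, both methods are used through the same `fit_transform` interface. A minimal sketch using scikit-learn's t-SNE for the comparison (the umap-learn equivalent is shown in a comment, assuming that package is installed; the toy data and parameter values are illustrative, not from the paper):

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy data: 100 points in 20 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))

# t-SNE: visualization-oriented, effectively limited to 2-3 output dimensions.
emb_tsne = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
print(emb_tsne.shape)  # one 2-D point per input sample

# UMAP follows the same fit_transform interface (requires umap-learn),
# and unlike t-SNE it has no computational restriction on n_components:
# import umap
# emb_umap = umap.UMAP(n_neighbors=15, min_dist=0.1,
#                      n_components=10).fit_transform(X)
```

The drop-in interface is what makes UMAP easy to slot into an existing scikit-learn pipeline as a general-purpose reducer rather than only a visualization tool.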

Mihaela Curmei is going to expertly dissect this paper in detail as well as offer some practical engineering advice around dimensionality reduction in general and explain the pros and cons of TSNE vs UMAP. Having just done a dry run with Mihaela, I can promise you that this is NOT ONE TO MISS!


@article{mcinnes2018umap,
title={Umap: Uniform manifold approximation and projection for dimension reduction},
author={McInnes, Leland and Healy, John and Melville, James},
journal={arXiv preprint arXiv:1802.03426},
year={2018}
}
Comments

This is the best overview of UMAP that I've found! I really appreciated the comparisons between UMAP and other dimensionality reduction methods (namely, t-SNE), and the visuals made the math more digestible. I found myself having a lot of "aha!" moments when ideas finally linked together like puzzle pieces. Also, a lot of the questions being asked were either ones that I also had, or they were ones that helped me identify gaps in my understanding of the topic, so I'm glad those were included. Mihaela did a fantastic job on this presentation!

azure-hawk

TIL about PRC (thanks reddit!). Thanks for sharing on YouTube!
Great questions in the interview part! One of the best I've heard recently. Like other call-based webinars (looking at you, Intel), I'm not really happy about the call audio quality (I wish Mihaela also had an HQ audio channel), but nevertheless this was really interesting to listen to while waiting for my training runs to finish ;)

untitledipynb

Really high-level content, thank you all for your work!

sumailsumailov

Sad that no one (in any video or paper) explains why the log2 target was chosen for calibrating the sigma value (the normalization factor); essentially, the choice of log2 was empirical. This is the heart of UMAP, right? ^o^
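For context on the sigma the comment refers to: for each point, UMAP binary-searches a bandwidth sigma so that the summed fuzzy membership strengths of its k nearest neighbours hit the target log2(k). A minimal NumPy sketch of that calibration (function name and search details are illustrative, not the reference umap-learn implementation):

```python
import numpy as np

def smooth_knn_sigma(dists, k, n_iter=64, tol=1e-5):
    """Binary-search sigma so that sum_i exp(-(d_i - rho) / sigma) = log2(k).

    dists: sorted distances from one point to its k nearest neighbours.
    rho is the distance to the nearest neighbour (local connectivity),
    so that neighbour always gets membership strength 1.
    """
    target = np.log2(k)
    rho = dists[0]
    lo, hi = 0.0, np.inf
    sigma = 1.0
    for _ in range(n_iter):
        val = np.sum(np.exp(-np.maximum(dists - rho, 0.0) / sigma))
        if abs(val - target) < tol:
            break
        if val > target:            # sigma too large: memberships too strong
            hi = sigma
            sigma = (lo + hi) / 2.0
        else:                       # sigma too small: memberships too weak
            lo = sigma
            sigma = sigma * 2.0 if hi == np.inf else (lo + hi) / 2.0
    return sigma

sigma = smooth_knn_sigma(np.array([0.5, 1.0, 1.5, 2.0]), k=4)
```

The sum is monotone in sigma (from 1 up to k), so a root matching any target in that range exists; log2(k) as the particular target is, as the comment says, an empirical choice.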

cristianpadron

You were right, this is not one to miss!

Bookerer

To answer the question at 1:57:00 (paraphrased: "What is the s-o-t-a for global methods"):

I'm not sure which ones are the best; there seems to be a lot going on in that domain at the moment. Some keywords to put into your favourite paper search: AtSNE, HLLE, MLDL, TopoAE...
This repository is certainly a good starting point, as it references many other papers and has a nice comparison.

timmothey

Haha, Siraj is everyone's favourite... loved the call!

macbro

I think Mapper is a pretty nice global nonlinear “dimension-reduction” though its output is a graph 😂

changjonathan

*wince* Those questions around 21min. Let her actually give the presentation guys.

hamnonox