Feature Hashing for Scalable Machine Learning - Nick Pentreath

"Feature hashing is a powerful technique for handling high-dimensional features in machine learning. It is fast, simple, memory-efficient, and well suited to online learning scenarios. While an approximation, it has surprisingly low accuracy tradeoffs in many machine learning problems. Feature hashing has been made somewhat popular by libraries such as Vowpal Wabbit and scikit-learn. In Spark MLlib, it is mostly used for text features; however, its use cases extend more broadly. Many Spark users are not familiar with the ways in which feature hashing might be applied to their problems. In this talk, I will cover the basics of feature hashing, and how to use it for all feature types in machine learning. I will also introduce a more flexible and powerful feature hashing transformer for use within Spark ML pipelines. Finally, I will explore the performance and scalability tradeoffs of feature hashing on various datasets.
Session hashtag: #EUds15"
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
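
The abstract above mentions applying feature hashing to all feature types, not only text. A minimal sketch of that idea in plain Python follows. It is not the Spark ML transformer introduced in the talk. The function name `hash_features`, the `num_buckets` parameter, the use of CRC32 as the hash, and the rule "hash `name=value` for categorical features, hash `name` for numeric ones" are all assumptions made for illustration.

```python
import zlib


def hash_features(features, num_buckets=16):
    """Map a dict of raw features to a fixed-length vector (hypothetical helper).

    Assumption: string values are treated as categorical, everything else
    as numeric. CRC32 is a stand-in hash picked because it is in the
    standard library; real libraries may use a different hash function.
    """
    vec = [0.0] * num_buckets
    for name, value in features.items():
        if isinstance(value, str):
            # Categorical: hash "name=value" and contribute an indicator of 1.0
            key, weight = f"{name}={value}", 1.0
        else:
            # Numeric: hash only the name and contribute the value itself
            key, weight = name, float(value)
        h = zlib.crc32(key.encode("utf-8"))
        index = h % num_buckets
        # Signed hashing: use the top bit of the hash to pick +1 or -1,
        # so that colliding features tend to cancel rather than pile up
        sign = 1.0 if (h >> 31) & 1 == 0 else -1.0
        vec[index] += sign * weight
    return vec


row = {"country": "ZA", "device": "mobile", "clicks": 3.0}
print(hash_features(row, num_buckets=16))
```

The output length depends only on `num_buckets`, never on how many distinct feature names or category values appear in the data, which is what makes the approach memory-efficient and usable in online learning. The cost is that distinct features can collide in the same bucket.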