SBTB 2015, SF Scala @Nitro: Marek Kolodziej, Scala, FP and Spark - the Perfect Combo for ML

-----

While FP and Scala have already become mainstays of middleware, web development and big data stacks (Akka, Play, Kafka, Spark), they have little presence in the machine learning and NLP communities. For instance, the emerging deep learning toolkits are mostly Python-based (Pylearn2, Theano, etc.). The same goes for general-purpose machine learning (Python's scikit-learn, countless R libraries). Performance seekers dissatisfied with slow scripting languages write typed Cython code or contorted C++ libraries bound to scripting language wrappers, or resort to exotic solutions such as Lua. Some even dispense with all abstraction and write incomprehensible CUDA kernels.

There has to be a better way. As a machine learning engineer, I want to write strongly typed functional code. Math has no place for side effects, and I don't want to waste time running a simulation for hours, only to find that I made a typo in my "stringly-typed" script. Unbeknownst to most, Scala's machine learning and NLP ecosystem is growing rapidly, from numeric processing (Spire, Breeze) to big data machine learning (MLlib, Mahout) to GPU-based text parsing (Puck) to general-purpose probabilistic programming (FACTORIE).

In this talk, I'll give a quick overview of Scala's machine learning ecosystem and show how easy it is to reuse existing components to build a new, scalable algorithm implementation. If you'd like to see how to write vectorized linear regression that runs native BLAS code, built on an SGD/Adagrad implementation written from scratch and capable of running at scale on petabytes of data using Spark, this talk is for you.

Marek Kolodziej is a Senior Research Engineer at Nitro.
