Scaling time-series forecasting models to cope with the multi-verse with Ray

At JPMorgan Chase we need robust forecasts of market prices and other time series to provide a high-quality financial service to our clients. However, forecasting time series in financial markets is difficult because of the richness of the systems involved and the low signal-to-noise ratio. Furthermore, the underlying drivers (e.g. inflation) are often non-stationary. In this work, we present the concept of probabilistic forecasting at scale powered by Ray, and we apply this technique to improve time-series models for multiple use cases relevant to the finance industry.

When forecasting time series, one must consider the question: "Is the future likely to resemble this exact version of history?" Or can we take random subsets of the past and develop a probabilistic model that forecasts the future from multiple slices of past data (i.e. a multi-verse approach)? This kind of sampling (called back-testing) helps remove the influence of outlier data points while remaining representative of history, but it requires far more processing. This is why we scale our project with Ray. More specifically, probabilistic forecasting and non-stationarity necessitate large-scale compute and distributed ML model development:

- Probabilistic forecasting requires a distribution of outcomes.
- Non-stationarity is addressed through a back-testing framework: instead of training a single global ML model, a sequence of models is trained periodically over time to account for changing macro- and micro-economic dynamics.

Potential use cases for this work are forecasting stock, commodity and energy prices, interest rates, exchange rates and a large variety of tradable assets. The forecasts could help traders and investors make better-informed decisions about buying and selling assets, modeling other assets that depend on these time-series forecasts, and managing risk. These use cases can be expanded to cover different forecast horizons. A short-term model might be appropriate for day trading, a medium-term model might inform asset selection in a portfolio, and a long-term forecast might help with stock valuation or asset construction decisions (e.g. wind farms needed by 2050).
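
To make the two requirements above concrete, here is a minimal Python sketch of the general idea, not the JPMorgan Chase platform's actual code. It assumes a toy "model" (the mean of the training sample) and a synthetic drifting series; the function names `fit_mean_model`, `multiverse_forecast`, `walk_forward_backtest` and `make_drifting_series`, and all parameter values, are hypothetical and chosen only for illustration. For each periodic retraining window it fits one toy model per random subset of the past and summarizes the resulting forecasts as quantiles.

```python
import random
import statistics


def fit_mean_model(history):
    """Toy stand-in for a real model: forecast the next value as the sample mean."""
    return statistics.fmean(history)


def multiverse_forecast(history, n_universes=200, sample_frac=0.7, seed=0):
    """Fit one toy model per random subset of the past and return every forecast."""
    rng = random.Random(seed)
    k = max(1, int(len(history) * sample_frac))
    # Each subset is drawn without replacement; temporal order is ignored,
    # which is acceptable only because the toy model is a plain mean.
    return [fit_mean_model(rng.sample(history, k)) for _ in range(n_universes)]


def walk_forward_backtest(series, train_window=50, step=10,
                          n_universes=200, sample_frac=0.7, seed=0):
    """Retrain on a sliding window every `step` points and forecast the next point."""
    results = []
    for end in range(train_window, len(series), step):
        window = series[end - train_window:end]
        forecasts = multiverse_forecast(window, n_universes, sample_frac, seed + end)
        deciles = statistics.quantiles(forecasts, n=10)  # nine cut points
        results.append({
            "t": end,                       # index of the value being forecast
            "p10": deciles[0],
            "p50": statistics.median(forecasts),
            "p90": deciles[-1],
            "actual": series[end],
        })
    return results


def make_drifting_series(n=200, seed=1):
    """Synthetic non-stationary data: a linear drift plus Gaussian noise."""
    rng = random.Random(seed)
    return [0.05 * t + rng.gauss(0.0, 1.0) for t in range(n)]


if __name__ == "__main__":
    for row in walk_forward_backtest(make_drifting_series())[:3]:
        print(row)
```

Every window, and every subset within a window, is fitted independently of the others. That independence is what makes this kind of workload a natural candidate for distribution, for example as Ray remote tasks; the mapping onto Ray is not shown here.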

Our research team has created a platform for large-scale, ML-based time-series forecasting for the above use cases. The backbone of the platform is Ray and its ability to efficiently distribute large-scale computations in the cloud using Kubernetes. This JPMorgan Chase platform is used internally to conduct probabilistic regression, feature engineering, feature selection, hyper-parameter optimization, and the computation of probabilistic analysis metrics, with scaling powered by Ray.

Distributed time-series forecasting has the potential to drive significant improvements in efficiency, profitability, and sustainability, thanks to the power of Ray.

Authors:

Peyman Tavallali and Savinay Narendra are Vice Presidents and Applied AI ML Leads at JPMorgan Chase's Machine Learning Center of Excellence.

Berowne D Hlavaty is an Executive Director in JPMorgan Chase's Big Data & AI Strategies Research team.

About Anyscale
---
Anyscale is the AI Application Platform for developing, running, and scaling AI.

If you're interested in a managed Ray service, check out:

About Ray
---
Ray is the most popular open source framework for scaling and productionizing AI workloads. From Generative AI and LLMs to computer vision, Ray powers the world’s most ambitious AI workloads.

#llm #machinelearning #ray #deeplearning #distributedsystems #python #genai