Developing ML-enabled Data Pipelines on Databricks using IDE & CI/CD at Runtastic
Data & ML projects bring many new complexities beyond the traditional software development lifecycle. Unlike traditional software projects, they cannot be left alone once they have been successfully delivered and deployed; they must be continuously monitored to check that model performance still satisfies all requirements. New data with different statistical characteristics can arrive at any time and break our pipelines or affect model performance.
These qualities of data & ML projects make continuous testing and monitoring of our models and pipelines a necessity. In this talk we will show how CI/CD Templates can simplify these tasks: bootstrapping a new data project within a minute, setting up a CI/CD pipeline using GitHub Actions, and implementing integration tests on Databricks. All this is possible because of the conventions introduced by CI/CD Templates, which help automate the deployment and testing of abstract data pipelines and ML models.
Runtastic uses the CI/CD Templates to automate the deployment of its Databricks pipelines. During this webinar, Emanuele Viglianisi, Data Engineer at Runtastic, will show how Runtastic uses the CI/CD Templates in day-to-day development to run, test and deploy pipelines directly from the PyCharm IDE to Databricks. Emanuele will present the challenges Runtastic has faced and how they solved them by integrating the CI/CD Templates into their workflow.
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.