Dockerize Data Science Project

Chapters:
0:00 Intro
0:46 Docker Concepts
2:40 Why data scientists should use docker?
4:20 Tutorial on dockerizing data science project
15:34 Running your data science project with docker
22:34 Outro

** Intro **
- Docker is a standard tool used by companies large and small
- It is already a standard in the developer world for building apps
- It can just as easily be used to build an end-to-end data science project
- Many data scientists, especially those new to the field, struggle to set up an environment with Jupyter Notebook, Python, and conda.
- Most work is done in a local environment, which takes time to set up. Only after struggling with installing dependencies are they finally able to run Jupyter Notebook and build things. Once the environment is built, they struggle to share the model and transformation code with their peers.
- Docker has some very useful features for everything from data exploration and modeling to deployment
- I will provide a full project with Docker files, notebooks, models, APIs, .gitignore files, etc.

- **What is a container?**: You can think of containers as lightweight virtual machines with their own CPU, memory, and network resources
- A container is software that packages up code and all its dependencies
- so the application runs quickly and reliably from one computing environment to another.
- **Docker Image:** To start a container you need a Docker image. This image is your blueprint for the container. You can get prebuilt images like MySQ
- **Dockerfile:** A plain-text file of instructions for building the image (it is not YAML; Docker Compose files are). You can think of it as the DNA of the image, with the full set of instructions for making the image in one place.
- **Docker Hub**: A cloud-based registry, like GitHub for your Docker images, for building, hosting, and sharing images
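
To make the Dockerfile idea concrete, here is a minimal sketch in Python that only renders the text of a Dockerfile for a notebook-based project. The function name `render_dockerfile`, the base image `python:3.11-slim`, the `/app` working directory, and the presence of a `requirements.txt` file are assumptions for illustration, not the files used in the video.

```python
def render_dockerfile(base_image="python:3.11-slim", workdir="/app", port=8888):
    """Return the text of a minimal Dockerfile for a Jupyter-based project.

    All defaults are illustrative assumptions, not the video's actual setup.
    """
    lines = [
        f"FROM {base_image}",              # prebuilt image used as the starting point
        f"WORKDIR {workdir}",              # directory inside the image for the project
        "COPY requirements.txt .",         # copy the dependency list first (better layer caching)
        "RUN pip install --no-cache-dir -r requirements.txt",
        "COPY . .",                        # copy notebooks, code, and models
        f"EXPOSE {port}",                  # document the port Jupyter listens on
        'CMD ["jupyter", "notebook", "--ip=0.0.0.0", '
        f'"--port={port}", "--no-browser", "--allow-root"]',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the generated Dockerfile so it can be saved next to the project.
    print(render_dockerfile(), end="")
```

Saving the printed text as `Dockerfile` and running `docker build -t ds-project .` in the same folder would build an image from it; the tag `ds-project` is also a made-up name.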

# Why Data Scientists should use it

Advantages

- Easier to develop
- Easier to Version Control
- Easier to share with co-workers
- Easier to deploy

- Environment issues: "it works on my local computer but not on yours"
- "I can’t install the package"
- "What packages are you using?"
- cloud platform uses
- Volume Mounts
- The good part is you get all the predefined images
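
As a rough illustration of the volume mounts mentioned above, the sketch below builds a `docker run` command that maps a project folder on the host into the container, so notebooks edited inside the container are saved on the host. The helper `docker_run_command`, the image tag `ds-project:latest`, and the paths are hypothetical; only the `--rm`, `-p`, and `-v` flags are standard Docker CLI options.

```python
import shlex


def docker_run_command(image, host_dir, container_dir="/app", port=8888):
    """Build the argument list for running a project container.

    The image name and directories are placeholders chosen for this example.
    """
    return [
        "docker", "run",
        "--rm",                              # remove the container when it exits
        "-p", f"{port}:{port}",              # publish the Jupyter port to the host
        "-v", f"{host_dir}:{container_dir}",  # mount the host folder into the container
        image,
    ]


if __name__ == "__main__":
    cmd = docker_run_command("ds-project:latest", "/home/me/ds-project")
    # Show the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` if you prefer to launch the container from Python rather than from a shell.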


#docker #datascience #jupyter #python #pythonprogramming #fastapi #dataengineering #linearregression #apis
Comments

Amazing video, thank you for your explanation!

nachoeigu