Jupyter Notebooks vs Python Scripts | When to Use Which?

Jupyter Notebooks vs. Python Scripts? Yes, you read that right! For the first time on the channel, I’ll discuss Jupyter Notebooks and make a comparison to traditional caveman methods, like scripts.

🎓 Courses:

👍 If you enjoyed this content, give this video a like. If you want to watch more of my upcoming videos, consider subscribing to my channel!

Social channels:

👀 Code reviewers:
- Yoriz
- Ryan Laursen
- Dale Hagglund

🔖 Chapters:
0:00 Intro
0:40 What are Jupyter notebooks?
1:36 Jupyter notebooks in VSCode
10:25 Scripts + Notebooks
11:58 Final thoughts

#arjancodes #softwaredesign #python

DISCLAIMER - The links in this description might be affiliate links. If you purchase a product or service through one of those links, I may receive a small commission. There is no additional charge to you. Thanks for supporting my channel so I can continue to provide you with free content each week!
Comments
Author

VSCode has this mode called “Python Interactive Window” where you have Jupyter-esque code blocks. The blocks are separated by a special comment (“# %%”) so the end result is still a script you can version control, unit test, debug, etc. It’s available through the Python extension.

CastToVoid
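
To make the comment above concrete, here is a minimal sketch of a plain .py file split into cells with the `# %%` marker. The data values and the `mean` helper are made up for illustration; only the `# %%` comment convention comes from the comment itself.

```python
# %% Load data
# Each "# %%" comment starts a new cell in the interactive window.
# The sample values below are hypothetical.
readings = [2.5, 3.1, 4.8, 3.9]


# %% Define helpers
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# %% Explore
average = mean(readings)
print(f"average reading: {average:.2f}")
```

Because the markers are ordinary comments, the same file still runs top to bottom as a normal script, so it can be diffed, imported, and tested like any other module.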
Author

For the Jupyter issue with regard to imports being present or missing due to editing "errors", I have a rule that I reload and run all features after finishing changes in a cell. This helps to ensure that I did not create a side effect, add, or remove something needed elsewhere.

I can still run into issues, but the restart typically shows me the errors of my ways.

digiryde
Author

Currently trying to refactor a Jupyter notebook I got from our data science team - complete nightmare.

Alticroo
Author

I'm a computational chemist, so I use Jupyter notebooks daily for exploratory data analysis. Especially when analysing convergence, some parameters have to be adjusted on the fly every time, and here notebooks are awesome for seeing the immediate effect of your choices without painfully loading the huge amount of data again.

konokonoth
Author

I've been using Jupyter for EDA and building pipelines. However, transitioning that pipeline into a standalone script has always been a bit of a journey for me. I would absolutely love to see a video on how to effectively make that transition from a Jupyter notebook to a full-fledged Python script, especially when it comes to keeping checks (maybe asserts to ensure data looks as expected?) for the exploratory nature of Jupyter while ensuring the robustness and maintainability of a script. Thanks for all the content you produce, and keep up the great work!

d_b_
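
One possible shape for the "keep the checks" idea raised above is a small validation function that the script calls before doing any real work. This is only a sketch: the column names `city` and `duration_seconds` and the sample rows are hypothetical, not taken from the video.

```python
def validate_rows(rows: list[dict]) -> list[dict]:
    """Fail fast if the rows do not look the way the notebook assumed."""
    assert rows, "expected at least one row"
    required = {"city", "duration_seconds"}  # hypothetical column names
    for i, row in enumerate(rows):
        missing = required - row.keys()
        assert not missing, f"row {i} is missing columns: {sorted(missing)}"
        assert row["duration_seconds"] >= 0, f"row {i} has a negative duration"
    return rows


def main() -> None:
    # Hypothetical sample data standing in for the real pipeline input.
    rows = [
        {"city": "Utrecht", "duration_seconds": 120},
        {"city": "Delft", "duration_seconds": 45},
    ]
    checked = validate_rows(rows)
    print(f"{len(checked)} rows passed validation")


if __name__ == "__main__":
    main()
```

Note that Python skips `assert` statements when run with the `-O` flag, so for a production script it may be safer to raise explicit exceptions such as `ValueError` instead.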
Author

Great video! Coming from data science, I definitely see the value of exploratory data analysis with Jupyter notebooks. For your question, one annoying difficulty with Jupyter notebook files is version control. If you write a .py file and a coworker runs the file to see what it does, then there is no change to the .py file. Hence the version control software will not note the file as changed. But if the same scenario happens with a Jupyter notebook file, then the file changes! This is pretty annoying, especially if your coworkers are used to simply writing git add .

TMQuest
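
The file changes described above come from the outputs and execution counts that a notebook stores next to its code. A minimal sketch of stripping them before committing, using only the standard library, could look like this. The `strip_outputs` helper and the example notebook are hypothetical; dedicated tools such as nbstripout exist for real projects.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook in the version 4 JSON layout, for illustration.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        }
    ],
}

cleaned = strip_outputs(example)
print(json.dumps(cleaned["cells"][0], indent=1))
```

With outputs stripped, running a notebook and saving it no longer changes what version control sees, as long as the code cells themselves are untouched.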
Author

I use notebooks for reports. Of course I run into the same issues you mentioned. This is why I try to define functions in a separate module. However, in many cases I use notebooks in an IPython shell which is convenient to explore code snippets.

christiansiegl
Author

I've used Jupyter notebooks a lot working in earth science modeling. I also manage the frontend (Java) and backend (Python) code for a website that processes data based on a user's request. My thought process when using a notebook is so different than when I'm working on code that is part of the site backend. For instance, "testing" when working on the backend becomes more like "validation" when using notebooks for earth science modeling. With the backend code I may be using unit tests, while when I'm using notebooks I may be generating a plot or map to ensure that the data are being modified how I expect.

To address the issues that can arise from running cells out of order, I am a stickler about using the "Restart and run all" command.

Working in earth science modeling, Jupyter has been a big boon in regards to repeatability, reusability, and transparency.

Thanks for another great video!

brandonhouse
Author

VScode actually supports doing annotated jupyter-style blocks in .py files. The advantage of this is that instead of a blob of json (jupyter is json with a lot of the output saved in there as literals) you're working in plain text and can therefore version control your file.

Luminon
Author

I would have expected you to touch on two other aspects:

- version control challenges with Jupyter notebooks; and, related,
- breaking the code-data separation paradigm in these notebooks, which can also be a security/privacy risk.

lancu
Author

I find jupyter useful for the reasons you have outlined.

Plus they are a good way to prepare presentations where you want to show graphs and the like. Not to actually run the code, just to be able to see it with markdown providing reasonable headings, comments etc.

Being able to access the presentation through a browser is also useful - you can demo through e.g. an iPad. The alternative of exporting the graphs/tables etc. and then importing to PowerPoint is a pain. In the past I might have used XL to do something similar.

I also find them useful for developing new code (where it’s not obvious what data manipulations are required up front) then once happy with the results, re-write the algorithm as a script.

On the downside, version control is a pain in the arse. Merging always seems to go wrong with git. Also, they do seem to glitch in strange ways sometimes losing code or requiring a re-write.

IterativeTheoryRocks
Author

This was a nice video. Too often I see the more formal programmers, who don't have any experience with exploratory data analysis, dismiss notebooks upfront, without any nuance. Yeah, sure, I'll just run a script again and again, redoing the calculations and plots again and again, super efficient. Notebooks are a great way of combining text, images, code and output, and have their downsides, of course, as everything on this planet. I've faced all the problems you mentioned in the video, and I'm now aware of the code smells. One golden rule I found was that, before "checking in" any notebook, or giving it to someone else, always restart the kernel and run all cells. If it doesn't run to the end, except in some very specific cases, you have a problem that needs fixing. Ideally, restart and run everything every once in a while, like every 1-2 hours. In the end, I consolidate some useful behavior into functions or classes and move them to a module that I can import in future notebooks, and which is properly unit tested and documented.

PraecorLoth
Author

I used to use Jupyter notebooks for data exploration and especially if there were intermediate results that took a long time to calculate. Eventually I gave up on them because it was too easy to save data off to csv, xlsx, or even into a sqlite database (usually via diskcache) and then read them back in each time I re-ran.

parswarr
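
The save-and-reload approach from the comment above can be sketched with the standard library alone. This example swaps in `pickle` for the csv, xlsx, and sqlite/diskcache storage the commenter mentions; the `cached` helper and `slow_sum` function are hypothetical names used only for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def cached(path: Path, compute):
    """Load a pickled result from path, or compute and store it first."""
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result


def slow_sum() -> int:
    # Stand-in for an expensive intermediate calculation.
    return sum(i * i for i in range(1_000))


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "slow_sum.pkl"
    first = cached(cache_file, slow_sum)   # computes and writes the file
    second = cached(cache_file, slow_sum)  # reads the file, skips the computation
    print(first == second)
```

In a real script the cache file would live in a persistent location rather than a temporary directory, and it would need to be deleted whenever the inputs or the computation change. Pickle files should only be loaded from sources you trust.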
Author

I love Jupyter notebooks and use them almost every day. The biggest issues I've found are version control and debugging. Although VScode has some limited debugging features for Jupyter, it's definitely not as smooth as with .py files.

charabango
Author

Just a note - a square latlon will vary from an area perspective, with the largest at the equator and smallest at the poles. There are (lossy) projections to translate latlon to local distance in meters using the Azimuthal equidistant projection

craigmcconomy
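
To illustrate the point above about lat/lon squares shrinking toward the poles, here is a rough sketch that computes the area of a one-degree cell at a few latitudes. It assumes a perfectly spherical Earth with a mean radius of 6371 km and does not implement the azimuthal equidistant projection the commenter mentions; `cell_area_km2` is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def cell_area_km2(lat_south_deg: float, size_deg: float = 1.0) -> float:
    """Area of a size_deg x size_deg lat/lon cell whose southern edge is at lat_south_deg."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + size_deg)
    d_lon = math.radians(size_deg)
    # Area of a spherical "rectangle": R^2 * delta_lon * (sin(lat_n) - sin(lat_s))
    return EARTH_RADIUS_KM**2 * d_lon * (math.sin(lat_n) - math.sin(lat_s))


for lat in (0, 45, 89):
    print(f"{lat:>2} deg: {cell_area_km2(lat):,.0f} km^2")
```

The printed areas decrease from the equator toward the pole, which is why counts per lat/lon square are not directly comparable across latitudes.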
Author

Does anyone know how to add the runtimer at the lower left corner when running a cell in the notebook?
Thanks in advance!

brisingreye
Author

Combining scripts with notebook is very useful for me in some situations!

diegol_
Author

I usually build pipelines in notebooks and transition them to regular Python scripts. You just have to be aware of any changes to predefined variables. I do find notebooks to be much slower, on the order of tens of minutes.

jerin
Author

This was extremely helpful, thank you!

JClishe
Author

The function at 5:14 looks simple, but why did it take 4.5 minutes? How big was the UFO data?

pietraderdetective