Python for Finance: Are stock returns normally distributed?

Today we investigate whether stock returns are normally distributed! First, I show the difference between simple returns and log returns, highlighting the main reason log returns are preferred for financial analysis.
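
A minimal sketch of the simple-versus-log-return difference, using made-up prices rather than the CBA data from the video. The variable names are illustrative assumptions, not the notebook's own.

```python
import math

# Hypothetical closing prices, chosen only for illustration.
prices = [100.0, 110.0, 99.0, 108.9]

# Simple return: percentage change from one period to the next.
simple_returns = [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]

# Log return: natural log of the price ratio.
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Log returns are additive: their sum is the log of the total price ratio.
recovered_final_price = prices[0] * math.exp(sum(log_returns))

# Simple returns must be compounded (multiplied), not summed.
compounded_final_price = prices[0] * math.prod(1 + r for r in simple_returns)

# Summing simple returns instead gives a different, incorrect final price.
naive_final_price = prices[0] * (1 + sum(simple_returns))

print(recovered_final_price, compounded_final_price, naive_final_price)
```

Here the summed log returns and the compounded simple returns both recover the last price, while the summed simple returns do not.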

Then we explore whether the log returns of CBA (Commonwealth Bank of Australia, an ASX-listed stock) are normally distributed. We do this with several visual and statistical techniques: Quantile-Quantile plots, box plots and histograms, plus hypothesis testing with the Kolmogorov-Smirnov and Shapiro-Wilk tests.

00:00 Intro
00:27 Why use log returns?
02:18 Simple returns
07:28 Log returns
10:44 Are log returns normally distributed?
16:08 Quantile-Quantile Plots
18:00 Box Plots
19:20 Kolmogorov-Smirnov test
22:10 Shapiro-Wilk test

As a high-level programming language, Python is a great tool for financial data analysis, offering quick implementation and well-documented API data sources, statistical modules and other frameworks related to the financial industry. We will be using Jupyter Lab as an interactive, browser-based editor for this series because it is easy to use and because presenting code in a live notebook suits a tutorial series.

This is the third video of many on the topic of Python for Finance. The series will include general techniques used for financial analysis and act as an introduction for more in-depth tutorials that we may explore later (such as time series modelling, building financial dashboards, machine learning etc.).

★ ★ Code Available on GitHub ★ ★

★ ★ QuantPy GitHub ★ ★

★ ★ Discord Community ★ ★

★ ★ Support our Patreon Community ★ ★
Get access to Jupyter Notebooks that can run in the browser without downloading Python.

★ ★ ThetaData API ★ ★
ThetaData's API provides both realtime and historical options data for end-of-day, and intraday trades and quotes. Use coupon 'QPY1' to receive 20% off on your first month.

★ ★ Online Quant Tutorials ★ ★

★ ★ Contact Us ★ ★

Disclaimer: All ideas, opinions, recommendations and/or forecasts, expressed or implied in this content, are for informational and educational purposes only and should not be construed as financial product advice or an inducement or instruction to invest, trade, and/or speculate in the markets. Any action or refraining from action; investments, trades, and/or speculations made in light of the ideas, opinions, and/or forecasts, expressed or implied in this content, are committed at your own risk and consequence, financial or otherwise. As an affiliate of ThetaData, QuantPy Pty Ltd is compensated for any purchases made through the link provided in this description.
Comments

Hi, thanks for the video! I just want to add that stats.kstest compares against the standard normal distribution (mean = 0, std = 1) by default. Your stock returns have a different mean and std, so you might want to pass them via args; the returns might then fit the normal distribution better.

For example: ks_stat, p_value = kstest(log_returns, 'norm', args=(mean, std))

weirdcuteai
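
A minimal sketch of the point made in this comment, using synthetic normally distributed data as a stand-in for the CBA log returns (the seed, mean and volatility are assumptions for illustration). One caveat worth hedging: when the mean and std are estimated from the same sample being tested, the standard KS p-value is only approximate and tends to be too lenient.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for daily log returns (NOT real CBA data):
# normally distributed with a small mean and 1.5% daily volatility.
rng = np.random.default_rng(seed=42)
log_returns = rng.normal(loc=0.0005, scale=0.015, size=1000)

# Default call: compares against the standard normal N(0, 1), so even
# truly normal returns are rejected because of the scale mismatch.
ks_default = stats.kstest(log_returns, "norm")

# Passing the sample mean and standard deviation compares against a
# normal distribution with matching location and scale.
mean, std = log_returns.mean(), log_returns.std(ddof=1)
ks_fitted = stats.kstest(log_returns, "norm", args=(mean, std))

print(f"default N(0,1): stat={ks_default.statistic:.3f}, p={ks_default.pvalue:.3g}")
print(f"fitted normal:  stat={ks_fitted.statistic:.3f}, p={ks_fitted.pvalue:.3g}")
```

Standardising the returns first (subtracting the mean and dividing by the std) and then testing against the default 'norm' is an equivalent way to do the same comparison.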

Thank you for the great content! I would like to add that there is one more test for normality, the Jarque-Bera test, which should be familiar to students who have done econometrics. I therefore suppose that you completed a non-economics major. Still, I think it is nice to know other normality tests as well!

vladislavpyatnitskiy
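
For readers who have not met the Jarque-Bera test, here is a rough pure-Python sketch of the statistic, which combines sample skewness and kurtosis. The two samples are synthetic assumptions, not the CBA returns, and the helper name `jarque_bera` is chosen here for illustration; in practice `scipy.stats.jarque_bera` would normally be used instead.

```python
import math
import random


def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic chi-squared(2) p-value."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments (dividing by n), as in the usual definition of the test.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # With 2 degrees of freedom, the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value


random.seed(0)
# Synthetic normal "returns" with 1% volatility.
normal_sample = [random.gauss(0, 0.01) for _ in range(2000)]
# Synthetic heavy-tailed "returns": occasional high-volatility days.
heavy_sample = [
    random.gauss(0, 0.05 if random.random() < 0.05 else 0.01) for _ in range(2000)
]

jb_normal, p_normal = jarque_bera(normal_sample)
jb_heavy, p_heavy = jarque_bera(heavy_sample)
print(f"normal sample: JB={jb_normal:.2f}, p={p_normal:.3g}")
print(f"heavy-tailed sample: JB={jb_heavy:.2f}, p={p_heavy:.3g}")
```

A large statistic (small p-value) is evidence against normality; the heavy-tailed sample should produce a far larger value than the normal one.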

Great video! Just a bit of nit-picking though: at 13:00 when you subtract the mean and divide by standard deviation, you say that will 'normalise' the best and worst case scenarios when they are instead being 'standardised'

sahilharidas
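
A small sketch of the distinction raised here, on hypothetical return values. "Normalising" is used loosely in practice; min-max rescaling is shown as one common meaning, which is an assumption about what the term would otherwise imply.

```python
import statistics

# Hypothetical daily log returns, for illustration only.
log_returns = [0.012, -0.008, 0.003, -0.021, 0.015, 0.001]

mean = statistics.mean(log_returns)
std = statistics.stdev(log_returns)  # sample standard deviation

# Standardising (z-scores): subtract the mean, divide by the standard deviation.
standardised = [(r - mean) / std for r in log_returns]

# Min-max normalising: rescale the values onto the [0, 1] interval.
lowest, highest = min(log_returns), max(log_returns)
normalised = [(r - lowest) / (highest - lowest) for r in log_returns]

print(standardised)
print(normalised)
```

The standardised values have mean 0 and standard deviation 1, while the min-max values simply run from 0 to 1.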

Before doing the KS/SW tests, do I need to standardize the log returns?

yeuyang

For those running into issues plotting the diagrams, replace

df.Close.plot().update_layout(autosize=False, width=500, height=300).show(renderer="colab")

with

df.Close.plot().update_layout(autosize=False, width=500, height=300).show()

Same for
log_returns.plot(kind='hist').update_layout(autosize=False, width=500, height=300).show(renderer="colab")
and
log_returns.plot(kind = 'box').update_layout(autosize=False, width=350, height=500).show(renderer="colab")

MA-ojzk

Great explanation! I have a question. If arithmetic average of simple returns does not yield a correct measure of mean and we need to use geometric mean instead, what do we do with standard deviation? How will we compute standard deviation in a manner that deals with the fact that simple returns are not additive?

DimitarDobrinov

Wow, great class! Thanks! I was thinking that the mean of the simple returns was a good way to get a first idea of returns over time, but it can overshoot a lot! I will be able to estimate better now! And I learned what the heck a KS test is about, hehe. Thanks a lot!

Just a tip ... I think that you could have just added 1 to simple_returns instead of doing a list comprehension; it's cleaner and more efficient (in cell 57 of the notebook).

joaopedrorocha
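
A sketch of what this tip might look like. The notebook's cell 57 is not reproduced here, so the array below is a hypothetical stand-in for `simple_returns`; the same broadcasting works on a pandas Series.

```python
import numpy as np

# Hypothetical simple returns; the notebook's actual data is not reproduced here.
simple_returns = np.array([0.10, -0.10, 0.10])

# List-comprehension version: builds the growth factors one element at a time.
growth_factors_loop = np.array([1 + r for r in simple_returns])

# Vectorised version: the scalar 1 is broadcast across the whole array.
growth_factors_vec = 1 + simple_returns

# Both give the same growth factors; compounding them yields the total return.
total_return = growth_factors_vec.prod() - 1
print(growth_factors_vec, total_return)
```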

Hello,

I got the same figure when adding log or simple return to the beginning price to get the ending price.

My code:

For log_return:

data.Price[0] * * len('log_return'))


This code did not give me the price at the end of the timeseries

For simple_return :

data.Price[0] * (1 + data['simple_return'].mean()) ** len('simple_return') .

Is there something I am doing wrong?

Kindly respond.

omololaomotalade
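
One possible cause, judging only from the code quoted in this comment: `len('simple_return')` measures the string itself (13 characters), not the number of return observations. A second issue is that the arithmetic mean of simple returns does not compound back to the final price. The sketch below uses hypothetical prices in place of `data.Price`.

```python
import math

# Hypothetical price series standing in for data.Price.
prices = [50.0, 55.0, 49.5, 54.45]
simple_returns = [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# len() of a string counts its characters, not the rows of a column.
string_length = len("simple_return")  # 13, regardless of the data

n = len(simple_returns)  # the actual number of return observations

# Mean log return, scaled by n and exponentiated, recovers the final price.
mean_log_return = sum(log_returns) / n
final_from_log = prices[0] * math.exp(mean_log_return * n)

# The arithmetic mean of simple returns overshoots when compounded...
arithmetic_mean = sum(simple_returns) / n
final_from_arithmetic = prices[0] * (1 + arithmetic_mean) ** n

# ...whereas the geometric mean return compounds to the final price.
geometric_mean = math.prod(1 + r for r in simple_returns) ** (1 / n) - 1
final_from_geometric = prices[0] * (1 + geometric_mean) ** n

print(final_from_log, final_from_arithmetic, final_from_geometric)
```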

Stock returns and log returns can be modeled more accurately with the t-distribution.

ncheymbamalu
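
A rough way to check this kind of claim on a given sample is to fit both distributions by maximum likelihood and compare log-likelihoods. The data below is synthetic and deliberately drawn from a Student's t, so it only illustrates the mechanics and proves nothing about real stock returns.

```python
import numpy as np
from scipy import stats

# Synthetic heavy-tailed "returns" drawn from a Student's t with 3 degrees
# of freedom (an assumption for illustration, not real market data).
rng = np.random.default_rng(seed=0)
returns = 0.01 * rng.standard_t(df=3, size=5000)

# Maximum-likelihood fits of both candidate distributions.
norm_loc, norm_scale = stats.norm.fit(returns)
t_df, t_loc, t_scale = stats.t.fit(returns)

# Total log-likelihood of the sample under each fitted distribution;
# a higher value means a better fit to this particular sample.
loglik_norm = stats.norm.logpdf(returns, norm_loc, norm_scale).sum()
loglik_t = stats.t.logpdf(returns, t_df, t_loc, t_scale).sum()

print(f"normal: {loglik_norm:.1f}, Student's t: {loglik_t:.1f} (df={t_df:.2f})")
```

The t-distribution has one extra parameter, so a fairer comparison would penalise it, for example with AIC.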

In a lot of engineering projects we ignore >3 sigma events. Is it the same for finance projections? Let's say I want to make an ARIMA projection, or a Monte Carlo simulation. Should I keep these rare events as useful data?

vladk
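
For context on how rare 3 sigma events would be if returns really were normal, here is a short calculation. The 252 trading days per year and the 10-year horizon are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Probability that a normally distributed return lands more than
# 3 standard deviations from its mean (both tails combined).
p_beyond_3_sigma = 2 * (1 - NormalDist().cdf(3))

# Expected number of such days over 10 years of 252 trading days each.
trading_days = 252 * 10
expected_events = p_beyond_3_sigma * trading_days

print(f"P(|z| > 3) = {p_beyond_3_sigma:.5f}")
print(f"Expected 3 sigma days in {trading_days} days: {expected_events:.1f}")
```

Counting how many days in an actual return series exceed 3 sample standard deviations, and comparing that count with this benchmark, is one simple way to see how heavy the tails are.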

So what's the solution to this? Almost all of theoretical finance assumes a normal distribution (Markowitz theory, CAPM, Black-Scholes). What do quants do in the real world when returns are not normally distributed?

steez

In which IDE are you running your code?

carlosarrieta

Doesn't this basically show that log returns are normally distributed and therefore the stock returns have a lognormal distribution?

jure

I would appreciate it if you would move your picture outside of the script... It is hard enough to see the blurred script because of its small scale, and your photo makes it worse...

ooddy

For those running into issues with the Yahoo Finance reader in pandas-datareader, here's the fix:

import yfinance as yf
yf.pdr_override()

MA-ojzk