Python and Pandas for Sentiment Analysis and Investing 4 - Data manipulation


In this video, we learn how to access specific data from our dataset.

The dataset comes from sentiment analysis and contains about 600 stocks, mostly S&P 500 stocks.

Pandas is used to work with our data quickly and efficiently. The idea of Pandas is to act as a sort of framework for quickly analyzing and modeling data.
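
As a rough illustration of what "accessing specific data" can look like in Pandas, here is a minimal sketch that filters one stock out of a combined table and computes a rolling mean. The rows are made up, and the column names (`time`, `type`, `value`) are assumptions based on the code quoted in the comments, not a confirmed description of the real file. It uses `Series.rolling(...).mean()`, which is the spelling in current pandas releases; older releases such as the one used in the video era exposed `pd.rolling_mean` instead.

```python
import pandas as pd

# Made-up rows standing in for the sentiment dataset. The real file is not
# shown here, so the column names 'time', 'type' and 'value' are assumptions.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2014-01-01", "2014-01-01", "2014-01-02", "2014-01-02", "2014-01-03"]
        ),
        "type": ["bac", "aapl", "bac", "aapl", "bac"],
        "value": [1.0, 3.0, 2.0, 4.0, 3.0],
    }
).set_index("time")


def single_stock(frame, stock_name):
    """Return only the rows whose 'type' column matches the given ticker."""
    return frame[frame["type"] == stock_name.lower()]


bac = single_stock(df, "BAC")

# Rolling mean over a 2-row window. The first row has no full window,
# so its rolling mean is NaN.
bac_ma = bac["value"].rolling(window=2).mean()

print(bac)
print(bac_ma)
```

If the filtered frame comes back empty (for example because the ticker is spelled differently in the file), any later plotting call has nothing numeric to draw, so checking `len(bac)` before plotting is a cheap sanity test.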

Sentiment Analysis data:

Python Module downloads:
(Get all of the listed dependencies, or at least the major ones like NumPy, Dateutils, Matplotlib, )

Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6

Comments

Pretty cool stuff. When I plot using your code, though, I get the same graph, but it doesn't let me slide it around like you could. Also, you get the labels "Price" and "500MA", but I only get "close" for the upper graph and "None" for the lower graph.

Any idea on what could cause those differences?

tomm

Thank you for the tutorials! Where can I find the data for this video? Thank you.

gregdeng

Why does the rolling mean plot range from -3 to 3? Shouldn't it be on the order of the stock price?

GodAmoungMen

Hello, I managed to generate the 2 graphs, but I'm not able to zoom in or out or move them around like you were doing at the 5:50 mark. Any ideas? I've used sharex as well. Still no luck.

JakeWong

Hey there, I can't seem to get the code in your video to work.  I keep getting an error:
TypeError: Empty 'Series': no numeric data to plot

I'm running Python 2.7.9, pandas 0.16.0 and matplotlib version 1.4.3.  

Here is my code if you have a second to take a look:


import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

style.use('ggplot')

def single_stock(stock_name):
    df = pd.read_csv('stocks_sentdex_dates_full.csv',
                     index_col='time', parse_dates=True)
    
    df = df[df.type == stock_name.lower()]

    _500MA = pd.rolling_mean(df['value'], 500)

 
    ax1 = plt.subplot(2, 1, 1)
   
    plt.legend()
    
    ax2 = plt.subplot(2, 1, 2, sharex = ax1)
    _500MA.plot(label='500MA')
    plt.legend()

    plt.show()


single_stock('bac')

clockwerkz

Hello,

I really appreciate all of your series on data analysis, they have been incredibly helpful.

I was wondering if you could recommend a good laptop for programming. I have an i7-5930K PC with 32 GB of DDR4 RAM running Windows 7 Pro 64-bit, which handles large data files handily. I want to purchase a good laptop that I can do some data analysis with while chilling outdoors on the weekends. One of the difficulties I'm running into is finding a new model with good specs that DOES NOT have Windows 10 installed, or where Windows 10 can be removed. One of the models that I had my eye on would not allow Windows 10 to be removed or replaced.

I'd like to be able to run Windows 7 (I really don't like any of the versions after 7) with an i7 series CPU, a minimum of 16 GB RAM, an SSD main drive, and a 1 TB HDD for storage.

I would use this laptop for writing and editing scripts with smaller chunks of data (like 2-3 years of 8 minute intra-day Emini data).

Any suggestions would be appreciated.
Thank you,
Ana

quantza

Hey, thanks for the tutorials! I'm getting an error running your data: 'AssertionError: Index length did not match values'. Is the data you posted the same as in your video? Thank you.

mrozener