Cleaning up Tweets: How to use the Twitter API v1.1 with Python to stream tweets

This is the 3rd video in the Twitter streaming tweets mini-series, showing you how to split up the tweet data and save only what you want.

Comments

If you know what they are, then you can simply run a replace function. So let's say \u521f is a smiley: you can just do replace("\\u521f", ':)') or similar. You can also find all instances of Unicode escapes with a regex; something like r'\\u[0-9a-fA-F]{4}' would be a start to finding them, then just remove them, etc.

sentdex
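
A minimal sketch of the replace-then-strip idea from the comment above, assuming the tweet text still holds literal \uXXXX escape sequences (a backslash, the letter u, four hex digits) as they appear in the raw stream data. The names KNOWN_ESCAPES, replace_known, and strip_escapes are made up for illustration, and the smiley mapping is only an example.

```python
import re

# Matches a literal backslash-u escape such as \u2014 in raw (undecoded) text.
UNICODE_ESCAPE = re.compile(r'\\u[0-9a-fA-F]{4}')

# Hypothetical lookup table: escapes you recognize and what to show instead.
KNOWN_ESCAPES = {'\\u263a': ':)'}


def replace_known(text, mapping=KNOWN_ESCAPES):
    """Swap each recognized escape for its plain-text substitute."""
    for escape, substitute in mapping.items():
        text = text.replace(escape, substitute)
    return text


def strip_escapes(text):
    """Drop any remaining \\uXXXX escapes and collapse leftover whitespace."""
    return ' '.join(UNICODE_ESCAPE.sub('', text).split())


raw = 'nice day \\u263a \\ud83d\\ude02 \\u2014 bye'
cleaned = strip_escapes(replace_known(raw))
print(cleaned)
```

Running the known replacements first matters: once the generic pattern has removed every escape, there is nothing left to map to a substitute.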

Because of non-ASCII characters, I had to UTF-8 encode the tweet first. The problem is that, when I encode it as UTF-8, emoticons get saved in MySQL like this:
\xf0\x9f\x90\x8a --> ðŸŠ
How can I solve it?

ehsanbadakhshan
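
A small illustration of what the comment above describes, not a diagnosis of the commenter's setup: the four bytes shown are a valid UTF-8 encoding of a single emoji, and garbled characters like those appear when the same bytes are read back as a single-byte encoding such as cp1252. On the MySQL side, the legacy `utf8` character set stores at most three bytes per character, so four-byte characters like this one need `utf8mb4` on both the column and the connection.

```python
# The byte sequence quoted in the comment.
emoji_bytes = b'\xf0\x9f\x90\x8a'

# Read as UTF-8, the four bytes are one character (U+1F40A).
as_utf8 = emoji_bytes.decode('utf-8')

# Read as cp1252, each byte becomes its own character; byte 0x90 has no
# cp1252 mapping, so errors='replace' turns it into U+FFFD.
as_cp1252 = emoji_bytes.decode('cp1252', errors='replace')

# Encoding the real character again gives back the original bytes.
round_trip = as_utf8.encode('utf-8')

print(len(as_utf8), len(as_cp1252))
```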

Nice video. We can even use the json module to read the attributes we want from "data".
Sample code goes this way:
import json
:
:
    def on_data(self, data):
        try:
            # data is the raw JSON string for one streamed message
            jsonData = json.loads(data)
            # pick out only the fields we want to keep
            createdAt = jsonData['created_at']
            text = jsonData['text']
            saveThis = createdAt + " :: " + text
            :
            saveFile.write(saveThis)

Hope this helps. Once again, great videos :-) Appreciate it !!

satishchandrad
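
The json approach in the comment above can be sketched as a standalone function. This is a hedged illustration, not the video's code: extract_fields and the sample message are hypothetical, and it assumes each streamed message is a JSON object whose tweet fields are named `created_at` and `text`, as in the comment.

```python
import json


def extract_fields(data):
    """Return (created_at, text) from one raw JSON message, or None."""
    try:
        tweet = json.loads(data)
        return tweet['created_at'], tweet['text']
    except (ValueError, KeyError, TypeError):
        # Not valid JSON, or a message without these keys.
        return None


sample = '{"created_at": "Mon Jan 01 00:00:00 +0000 2018", "text": "hello \\"car\\" fans"}'
fields = extract_fields(sample)
line = ' :: '.join(fields) + '\n'
print(line, end='')
```

Returning None for messages that are not tweets lets the caller skip them instead of crashing the stream handler.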

Thank you for the video :-)
Can you explain how to get videos and posts for a given hashtag (e.g. #metoo)?

akashmondal

Hi, thanks for the great videos. I was wondering if you could give me some advice on the location side of things. Firstly, could you show me how to filter the tweets by location in terms of coordinates, e.g. if I only want tweets within 50 miles of London? And is it possible to limit the tweets downloaded to just those with lat/long coordinates? Thanks very much for your help.

robleanumber
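
One possible way to approach the location question above, sketched under assumptions: the v1.1 streaming `locations` filter takes bounding boxes given as longitude/latitude pairs with the south-west corner first, and a geotagged tweet carries a `coordinates` field holding a GeoJSON point ordered [longitude, latitude]. The helper names and the London centre point are illustrative, not taken from the video.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def bounding_box(lat, lon, radius_miles):
    """Return [west, south, east, north] around a point, in degrees."""
    dlat = math.degrees(radius_miles / EARTH_RADIUS_MILES)
    dlon = math.degrees(
        radius_miles / (EARTH_RADIUS_MILES * math.cos(math.radians(lat))))
    return [lon - dlon, lat - dlat, lon + dlon, lat + dlat]


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(tweet, lat, lon, radius_miles):
    """True only for tweets that carry a point and lie inside the radius."""
    point = tweet.get('coordinates')
    if not point:
        return False  # no exact lat/long on this tweet
    tweet_lon, tweet_lat = point['coordinates']  # GeoJSON order: lon, lat
    return distance_miles(lat, lon, tweet_lat, tweet_lon) <= radius_miles


LONDON = (51.5074, -0.1278)  # approximate centre, for illustration
box = bounding_box(LONDON[0], LONDON[1], 50)
print(box)
```

The box is what you would pass to a location filter; because a box is larger than the circle it encloses, within_radius is applied afterwards to each received tweet, and it also drops tweets that have no exact coordinates.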

Thanks very much.
Could you tell me what I should change in the code in order to collect only the posts I receive in my account?

LORYPUNTO

Great tutorial! What do [1] and [0] represent? What if I want to add more splits, such as retweet count and a few more?

etyty
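
To illustrate the [1] and [0] question above: `str.split` returns a list, so [1] picks the piece right after the first separator and [0] picks the piece before the first separator. The sketch below assumes the tweet is pulled out of the raw JSON string by splitting on marker substrings; the exact markers, the sample data, and the helper name between are assumptions for illustration.

```python
def between(data, start, end):
    """Return the text after the first `start` marker and before the next `end`."""
    after_start = data.split(start)[1]  # [1]: the piece right after the first marker
    return after_start.split(end)[0]    # [0]: the piece before the end marker


data = ('{"created_at":"Mon Jan 01 00:00:00 +0000 2018","id":1,'
        '"text":"hello world","source":"web","retweet_count":3,"lang":"en"}')

tweet_text = between(data, ',"text":"', '","source')
retweets = between(data, '"retweet_count":', ',')
print(tweet_text, retweets)
```

Each extra field is just another pair of markers, although this breaks if the field order changes or the tweet text itself contains a marker.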

Hi Harrison, all your videos are great. By far the best material for learning Python. I have a question about this subject. How can I retrieve tweets beginning with the symbol $ followed by one, two, three or four capital letters? Should I do it with regular expressions, or can it be done directly by the filter? Thanks for your help.

alemazzuca
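
A hedged sketch of the regular-expression route asked about above, applied to tweet text that has already been received. It says nothing about what the streaming filter itself supports. The names CASHTAG, find_cashtags, and starts_with_cashtag are illustrative.

```python
import re

# "$" followed by one to four capital letters, not glued to a preceding
# word character and not followed by further letters or digits.
CASHTAG = re.compile(r'(?<!\w)\$[A-Z]{1,4}\b')


def find_cashtags(text):
    """Return every $SYMBOL-style token found anywhere in the text."""
    return CASHTAG.findall(text)


def starts_with_cashtag(text):
    """True if the text begins with a $SYMBOL-style token."""
    return CASHTAG.match(text) is not None


found = find_cashtags('Buying $AAPL and $TSLA, not $GOOGL or $abc')
print(found)
```

The `\b` at the end is what rejects five-letter symbols: without it, `$GOOGL` would match as `$GOOG`.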

Hello Sentdex, big thank you for sharing this tutorial.
I have a question: when you used + '::' + to save the timestamp and text, they occupy the same column in the CSV file. How do you put them in separate columns?

rohitdalvi
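
One way to get separate columns, sketched with Python's standard `csv` module instead of joining the fields by hand. This is an illustration, not the video's code: append_row is a hypothetical helper, and the in-memory buffer stands in for a real file.

```python
import csv
import io


def append_row(file_obj, created_at, text):
    """Write one tweet as a two-column CSV row: timestamp, then text."""
    csv.writer(file_obj).writerow([created_at, text])


# StringIO stands in for a real file, e.g. one opened with
# open('tweets.csv', 'a', newline='') (hypothetical filename).
buffer = io.StringIO()
append_row(buffer, 'Mon Jan 01 00:00:00 +0000 2018', 'hello, world')
csv_output = buffer.getvalue()
print(csv_output, end='')
```

The writer quotes any field that contains a comma, so a comma inside the tweet text does not spill into a third column when the file is opened in a spreadsheet.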

Thank you!! I got the basic concept; from tomorrow I'm going to practice!! This was the video I was looking for!!

sinistergeek

Hi. With the part 2 version of the .csv file, the data is split into relevant columns. However, with the split in the final part 3 version, the data (time and tweet) appear in a single column. Is there any way to have them split into 2 separate columns, i.e. time in column A and tweet in column B?

sunnydays

Nice video, very helpful. I am facing an issue: the response from Twitter is too slow. The first tweet appears after 15 minutes, and sometimes even after 20 minutes. Is it really this bad? I'm not sure where the issue is; I'd appreciate any help to speed it up.

infinitykumar

How does this work? I mean, the program takes the tweets that contain the keyword 'car', but I am curious about which period they come from. What is the maximum period from which I can download tweets with this code?

For example, I want to make an app that analyzes tweets using my keyword. But for that I want all the tweets ever posted on Twitter that contain my keyword. Is there a chance I can get all of them?

wizzard

Great tutorial. It was a great help getting started with my twitter ticker on my raspberry pi.
I'm struggling a bit with the contents of the actual tweets now. The tweets contain data like \ud83d\ude02 \u2014 and escape characters for URLs. This doesn't look very nice when displayed on my ticker. I figured out it's smileys and special characters in Unicode, but I can't seem to find a way to filter these or, even better, print a substitute for them. Any tips to get me started?
Thanks!

pieterbervoets
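
A possible starting point for the ticker question above, assuming the escapes come from reading the raw JSON string directly: decoding the JSON turns \uXXXX sequences (including surrogate pairs for emoji) and escaped slashes into real characters, after which anything non-ASCII can be swapped for a placeholder. The function names and the placeholder character are illustrative choices.

```python
import json


def decode_text(raw_json_string):
    """Decode a JSON string literal, resolving \\uXXXX and \\/ escapes."""
    return json.loads(raw_json_string)


def to_ascii(text, placeholder='?'):
    """Replace every non-ASCII character with a placeholder."""
    return ''.join(ch if ord(ch) < 128 else placeholder for ch in text)


raw = '"lol \\ud83d\\ude02 \\u2014 see http:\\/\\/t.co\\/abc"'
decoded = decode_text(raw)
ticker_text = to_ascii(decoded)
print(ticker_text)
```

In practice the whole message would be decoded once with `json.loads`, and the `text` field would then already contain real characters rather than escapes.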

Hi Sentdex. Thank you for sharing this; it has been very helpful to me. I have a question: how can I get tweets from, for example, 1 month ago? I'm trying to collect tweets but I can't figure out how to do it... Thanks bro. I hope you can help me with that.

DanielMelo

You are awesome. I just want to know: can I use it to extract tweets from a specific username? For example, if I want to stay updated on all of Elon Musk's latest tweets, how can I do that? Please share. Thanks.

Rohitsingh

Hey Sentdex, I'm a newbie using Python and I got an error when I followed along.
It says "expected an indented block", with a red block on the tweet text (below #print data).
Can you help me solve this? Thanks btw.

arieazhar
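
The error in the comment above is Python's IndentationError. One common way to trigger it, and only a guess at what happened here, is commenting out the sole statement inside a block, or indenting the following line less than the block requires. The sketch compiles two hypothetical snippets to show the difference; none of this is the commenter's actual code.

```python
# Guess at the situation: the only statement under `try:` was commented out.
broken = (
    "def on_data(data):\n"
    "    try:\n"
    "        #print(data)\n"
    "    except Exception:\n"
    "        pass\n"
)

# Same code with a real statement left inside the block.
fixed = broken.replace("#print(data)", "print(data)")


def indentation_error(source):
    """Return the IndentationError message for `source`, or None if it compiles."""
    try:
        compile(source, '<example>', 'exec')
    except IndentationError as err:
        return err.msg
    return None


broken_msg = indentation_error(broken)
fixed_msg = indentation_error(fixed)
print(broken_msg)
```

A comment does not count as a statement, so every block needs at least one real line (even just `pass`) indented one level deeper than the line that opens it.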

Hey Sentdex!
Thanks a lot for the tutorial, it helped a lot. I was wondering what I should look into so that I can display this tweet data live on a website rather than storing it in a database.

rhenness

Hi, is there a possibility to extract all the tweets of a particular user? I used follow instead of track. When I tried with the username, it gave 406 (authorisation failure), and when I tried with the user ID, it simply ran and gave no output.

uniqueraj
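
A hedged sketch related to the question above. In the v1.1 streaming API the `follow` parameter expects numeric user IDs rather than usernames, and a stream only delivers tweets posted after it connects, so a quiet account produces no output until it posts. Since a followed stream can also include other people's replies and retweets, the code below keeps only messages authored by the target account. It assumes the v1.1 tweet layout with a nested `user` object carrying `id_str`; the IDs and messages are made up.

```python
import json


def is_authored_by(tweet, user_id):
    """True if the tweet's author matches the given numeric user ID."""
    user = tweet.get('user') or {}
    return user.get('id_str') == str(user_id)


def keep_authored(raw_messages, user_id):
    """Return the text of every message written by `user_id`."""
    kept = []
    for raw in raw_messages:
        tweet = json.loads(raw)
        if is_authored_by(tweet, user_id):
            kept.append(tweet['text'])
    return kept


# Made-up messages standing in for data received from the stream.
messages = [
    '{"text": "original post", "user": {"id_str": "12345", "screen_name": "example_user"}}',
    '{"text": "@example_user nice!", "user": {"id_str": "67890", "screen_name": "someone_else"}}',
]
own_posts = keep_authored(messages, 12345)
print(own_posts)
```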

I have a question. I am working on a tweet classification project. I have collected enough tweets for the test data. Now I am trying to create the training data. I unfollowed all the users I followed for the test tweets and started to follow new users (let's say users that post only political tweets). When I run the code, it collects tweets from the users I already unfollowed. Is there a solution for that? Thanks in advance.

merdinni