Sean P. Rogers - Introduction to Machine Learning for Text Analysis and Classification with Python

Machine learning allows humans to create a model that acts as an extension of its creator's mind and classifies data into predetermined categories. Manually tagging thousands of rows of data is often cumbersome and time-consuming. Forming a human-machine relationship to classify data can save researchers time and speed up analysis and classification on projects that would otherwise take an untenable number of working hours.
This tutorial will teach participants how to use Python for machine learning and text classification, creating a human-machine relationship to process and classify textual datasets. Learn how to use the Natural Language Toolkit (NLTK) to explore data. Use pandas, a Python library with extensive data manipulation functionality, to clean and reshape a DataFrame (a table in pandas). Participants will also learn how to engineer textual features and build machine learning classification pipelines with scikit-learn (a popular open-source machine learning library). Examples of projects that can be undertaken using these methods include identifying a behavioral health component in police incident narratives, identifying hate speech on Facebook, and identifying wildlife trafficking posts on Twitter. Participants should be familiar with basic Python data types and methods of manipulating strings.
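As a rough illustration of the workflow described above, the following minimal sketch cleans a small DataFrame with pandas and trains a scikit-learn text classification pipeline. It is not taken from the tutorial itself: the example rows, the column names `text` and `label`, the label values, and the choice of TF-IDF features with logistic regression are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical hand-labeled rows (invented for this sketch); a real project
# would load thousands of manually tagged rows from a file instead.
df = pd.DataFrame({
    "text": [
        "  Selling rare ivory carvings, message me for prices ",
        "Exotic parrot chicks for sale, no paperwork needed",
        "Tiger skin rug available, ships worldwide",
        "Lovely walk in the park with my dog today",
        "Our team won the football match last night",
        "Trying a new pasta recipe for dinner",
        None,  # a missing value, to show basic cleaning
    ],
    "label": [
        "trafficking", "trafficking", "trafficking",
        "other", "other", "other", "other",
    ],
})

# Clean the DataFrame: drop missing rows, normalize case and whitespace,
# and remove duplicate texts.
df = df.dropna(subset=["text", "label"])
df["text"] = df["text"].str.lower().str.strip()
df = df.drop_duplicates(subset="text")

# Feature engineering and classification in a single pipeline:
# TF-IDF turns each text into a weighted word-count vector, and
# logistic regression learns to map those vectors to labels.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", LogisticRegression()),
])
pipeline.fit(df["text"], df["label"])

# Classify new, unlabeled text with the trained pipeline.
new_posts = ["rare ivory for sale", "dinner in the park with my team"]
predictions = pipeline.predict(new_posts)
print(list(zip(new_posts, predictions)))
```

With only a handful of training rows the predictions are not meaningful; in practice the labeled data would be split into training and test sets so the model can be evaluated before it is trusted to tag the remaining rows.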
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.