Python Tutorial: Introduction to string manipulation


---

Welcome to this course!

My name is Eugenia and I will guide you on your journey to master regular expressions.

In this course, you will learn how to manipulate strings to find and replace specific substrings.

You will also explore different approaches for string formatting, such as interpolating a string in a template.

Last, you will dive into basic and advanced regular expressions to master how to find complex patterns in a string.

As a data scientist, you may encounter strings when cleaning a dataset to prepare it for text mining or sentiment analysis. Sometimes, you will need to process text to feed an algorithm that determines whether an email is spam.

You may also need to parse and extract specific data from a website to build a database. Learning to manipulate strings and master regular expressions will allow you to perform these tasks faster and more efficiently.

The first step of our journey is strings, a data type used to represent textual data.

Python recognizes any sequence of characters inside quotes as a single string object.

As shown on the slide, either single or double quotes can be used. You should use the same quote type to open and close the string.

If a quote is part of the string, as seen in the code, we need to use the other quote type to enclose the string. Otherwise, Python recognizes the second quote as a closing one.
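The slide's code is not part of this transcript, so here is a minimal sketch of the quoting rules just described. The string contents and variable names are hypothetical, chosen only for illustration.

```python
# Hypothetical strings; either quote type can delimit a string,
# as long as the opening and closing quotes match.
single_quoted = 'This is a string'
double_quoted = "This is also a string"

# The apostrophe is part of the text, so the string is enclosed
# in the other quote type (double quotes).
with_apostrophe = "It's a string"

print(single_quoted)
print(double_quoted)
print(with_apostrophe)
```

Writing the last string with single quotes on the outside would end the string at the apostrophe and raise a syntax error.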

Python has built-in functions to handle strings.

Suppose we define the following string.

We can get the number of characters in the string by applying the function len(), which returns eleven as shown in the output.
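As a rough illustration, assuming the string on the slide is something like "Awesome day" (an eleven-character stand-in, since the slide itself is not in the transcript):

```python
# Assumed example string; it has 11 characters, counting the space.
my_string = "Awesome day"

# len() returns the number of characters in the string.
length = len(my_string)
print(length)  # 11
```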

The function str() returns the string representation of an object as seen in the code.
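A small sketch of str(), using a hypothetical integer as the object to convert:

```python
# Hypothetical object to convert: an integer.
number = 123

# str() returns the string representation of the object.
as_text = str(number)

print(as_text)        # 123
print(type(as_text))  # <class 'str'>
```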

Suppose now we have the two strings shown on the slide, and we want to concatenate them. Concatenating means obtaining a new string that contains both of the original strings.

Applying the plus operator to join both strings, with a space added between them, generates the output seen in the code.
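The two strings below are assumed stand-ins for the ones on the slide. The sketch joins them with the plus operator and an explicit space:

```python
# Assumed example strings (not taken from the transcript).
first = "Awesome day"
second = "for biking"

# The + operator builds a new string; the " " in the middle
# adds the separating space.
combined = first + " " + second
print(combined)  # Awesome day for biking
```

Both original strings are left unchanged; concatenation always produces a new string object.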

Individual characters of a string can be accessed directly using an index, that is, the position of that character within the string.

Let's work with the following example.

To get the fourth character of the string, we specify the string name followed by the position inside square brackets.

In Python, string indexing is zero-based, meaning that the first character has index zero, as shown on the slide.

For the fourth character, we specify index three, which gives the following output.

We can also indicate indices with negative numbers. If we specify index minus one, we get the last character of the string as shown in the output.
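Indexing can be sketched as follows, again assuming "Awesome day" as the example string:

```python
# Assumed example string.
my_string = "Awesome day"

# Indexing is zero-based, so index 3 is the fourth character.
fourth = my_string[3]
print(fourth)  # s

# Negative indices count from the end; -1 is the last character.
last = my_string[-1]
print(last)  # y
```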

With the bracket notation, Python allows you to access a specific part or sequence of characters within the original string.

To do so, we specify the starting and ending positions inside square brackets, separated by a colon, as you see on the slide.

The ending position is excluded from the resulting output.

Omitting the first or second index results in the slice starting at the beginning or going until the end of the string, respectively, as shown in the output.
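A short sketch of slicing, with an assumed example string and slice bounds chosen for illustration:

```python
# Assumed example string.
my_string = "Awesome day"

# Characters at positions 0, 1 and 2; position 3 is excluded.
start_to_three = my_string[0:3]
print(start_to_three)  # Awe

# Omitting the first index starts the slice at the beginning.
first_five = my_string[:5]
print(first_five)  # Aweso

# Omitting the second index runs the slice to the end.
from_five = my_string[5:]
print(from_five)  # me day
```

Because the ending position is excluded, my_string[:5] and my_string[5:] split the string into two parts with no overlap.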

String slicing also accepts a third value, the step, which sets the distance between retrieved characters. A step of n retrieves every n-th character, skipping n minus one characters in between.

In the example, the specified indices return the following output. They are the characters retrieved between positions zero and six, skipping two characters between each retrieved one.
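The exact indices on the slide are not in the transcript. Assuming a step of three (which skips two characters between each retrieved one) and the same stand-in string, the slice could look like this:

```python
# Assumed example string and an assumed step of 3.
my_string = "Awesome day"

# Start at 0, stop before 6, take every third character:
# positions 0 and 3 are retrieved; 1, 2, 4 and 5 are skipped.
stepped = my_string[0:6:3]
print(stepped)  # As
```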

Interestingly, omitting the first and second indices and designating a minus one step returns a reversed string as shown in the output.
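Reversal with a step of minus one can be sketched as follows, again with an assumed example string:

```python
# Assumed example string.
my_string = "Awesome day"

# No start or stop, step of -1: walk the whole string backwards.
reversed_string = my_string[::-1]
print(reversed_string)  # yad emosewA
```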

Now you are ready to start manipulating strings on your own.
