How can I clean text in Python in a way similar to using dplyr in R?
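
To make the question concrete, here is a minimal sketch of the kind of step-by-step pipeline I have in mind, emulating R's `%>%` style with plain Python. The `pipe` helper, the individual cleaning functions, and the sample strings are all hypothetical names and data made up for illustration; they are not part of dplyr or any Python library.

```python
import re
from functools import reduce


def pipe(value, *steps):
    """Pass `value` through each step in order, roughly like R's %>%.

    Hypothetical helper written for this sketch, not a library function.
    """
    return reduce(lambda acc, step: step(acc), steps, value)


def lowercase(texts):
    # Lowercase every string in the list
    return [t.lower() for t in texts]


def strip_punctuation(texts):
    # Replace anything that is not a word character or whitespace with a space
    return [re.sub(r"[^\w\s]", " ", t) for t in texts]


def collapse_whitespace(texts):
    # Trim the ends and squeeze runs of whitespace down to single spaces
    return [" ".join(t.split()) for t in texts]


def drop_empty(texts):
    # Remove strings that are empty after cleaning
    return [t for t in texts if t]


# Made-up sample data for illustration
raw = ["  Hello, World!  ", "DPLYR-style   pipes?", "   "]

cleaned = pipe(raw, lowercase, strip_punctuation, collapse_whitespace, drop_empty)
print(cleaned)  # ['hello world', 'dplyr style pipes']
```

Each step takes a list of strings and returns a new list, so steps can be reordered or dropped without touching the others, which is the property I like about dplyr chains. If the text lives in a pandas DataFrame column, I assume the same steps could be written as chained `Series.str` calls such as `.str.lower()`, `.str.replace(...)`, and `.str.strip()`.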