Bioinformatics in Python: DNA Toolkit. Part 2: Transcription, Reverse Complement

🚀 [DESCRIPTION]:

In this lesson, we continue enhancing our DNA Toolkit. You'll add two more crucial functions: transcription and reverse_complement. We'll also dive into restructuring and re-formatting output, implementing nucleotide colored output for better readability, and exploring the power of docstrings for clean code documentation.
#DNAToolkit #Bioinformatics #PythonProgramming #Transcription #ReverseComplement #Docstrings #PythonColors #CodingTutorial

🔹🔹🔹🔹🔹

💻 [GITHUB REPO]:

🔹🔹🔹🔹🔹

🔗 [VIDEO LINKS]:

Amazing Python lessons by Corey Schafer:

Python colorized output:

colored() function code:

🔹🔹🔹🔹🔹

✨ [CONNECT WITH US]!

Stay updated and join our growing community of bio-coders and rebel thinkers!

🔹🔹🔹🔹🔹

🔬 [JOIN THE REBEL SCIENCE COMMUNITY]!

Engage with us, ask questions, and share your ideas!

🔹🔹🔹🔹🔹

💖 [SUPPORT REBELSCIENCE]!

Love what we do? Your support helps us create more open-source bioinformatics tools and content! You can support us with a one-time donation or a subscription.

Forum subscriptions via Stripe come with many benefits, including a dedicated community, community-only projects and discussions (like Rosalind and other related courses), and opportunities for collaboration on different projects.

Any donations will be used for new content creation and to cover server and community management expenses. Thank you for being a part of rebelScience!

🔹🔹🔹🔹🔹

Comments

Dear Rebel,
please note that you have to use the complement only, not the reverse complement, to pair it against the original DNA string. Otherwise you will not get the expected result: "A" should be matched with "T" and "G" with "C", and a quick look at your output shows that this is not the case. I think you have to remove [::-1] from the reverse_complement function to get the expected result.

dlearhasan

One of the users left a very good comment here which, for some odd reason, is no longer visible. He brought up a very good point. At 7:20 we print out the following:

[5] + DNA String + Reverse Complement:
5' AAAATCGGCGTTTGGCCCCTTTTGCCCC 3'
   ||||||||||||||||||||||||||||
3' GGGGCAAAAGGGGCCAAACGCCGATTTT 5'

The 3' -> 5' label on the bottom strand is confusing: because we have already reversed that strand, it actually reads 5' -> 3'. So the correct output should be:



[5] + DNA String + Complement + Reverse Complement:
5' AAAATCGGCGTTTGGCCCCTTTTGCCCC 3'
   ||||||||||||||||||||||||||||
3' TTTTAGCCGCAAACCGGGGAAAACGGGG 5' (complement)
5' GGGGCAAAAGGGGCCAAACGCCGATTTT 3' (reverse complement)

So thanks to the person who brought it up. An amazing example of collaborative work. If anyone finds something wrong/confusing, please feel free to leave a comment here. I will also bring this up in our next video and we will correct the output.
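
A minimal sketch of how the corrected display could be produced, assuming plain (uncolored) output. The helper names complement and format_strands are hypothetical and are not taken from the video's code:

```python
# Hypothetical helpers for illustration; not the toolkit's actual code.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[nuc] for nuc in seq)


def reverse_complement(seq):
    """Return the complement reversed, so that it reads 5' -> 3'."""
    return complement(seq)[::-1]


def format_strands(seq):
    """Build the paired-strand display plus the reverse complement line."""
    lines = [
        f"5' {seq} 3'",
        "   " + "|" * len(seq),  # one bond marker per base pair
        f"3' {complement(seq)} 5' (complement)",
        f"5' {reverse_complement(seq)} 3' (reverse complement)",
    ]
    return "\n".join(lines)


print(format_strands("AAAATCGGCGTTTGGCCCCTTTTGCCCC"))
```

Only the complement line is drawn under the bond markers, because it is the only strand that pairs base by base with the top line.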

rebelScience

Awesome Video!..Not many great bioinfo vids on YT. Thanks

TheCuriousLifeOfCode

At 7:27, I think you don't need to reverse the bottom strand of the DNA helix, because the output shows AAA connected to TAT, ACG connected to ATG, and so on...

matic

Great series.

If anyone wants the transcription function to swap out the other nucleotides as well and make an RNA transcript of the randomly generated sequence, this will work in the DNAToolkit file:

(Add "import random" to whichever files require it.)

def transcription(seq):
    """DNA->RNA Transcription Product: the RNA complement of seq"""
    # Park the original C's as lower case so they are not mixed up
    # with the C's produced from G further down
    seq = seq.replace('C', 'c')
    # Order matters: A must be replaced before T is turned into A
    char_to_replace = {'A': 'U',
                       'T': 'A',
                       'G': 'C',
                       'c': 'G'}
    # Iterate over all key-value pairs in dictionary
    for key, value in char_to_replace.items():
        # Replace key character with value character in string
        seq = seq.replace(key, value)
    return seq

And this will work in the bio_seq file later on (lesson 8):

def transcription(self):
    """DNA->RNA Transcription Product: the RNA complement of self.seq"""
    # Work on a local copy so self.seq keeps the original DNA string
    seq = self.seq.replace('C', 'c')
    char_to_replace = {'A': 'U',
                       'T': 'A',
                       'G': 'C',
                       'c': 'G'}  # replace to "G"
    # Iterate over all key-value pairs in dictionary
    for key, value in char_to_replace.items():
        # Replace key character with value character in string
        seq = seq.replace(key, value)
    return seq

It is KEY that the first line after the docstring converts cytosine to lower case, so that "c" rather than "C" is passed to the "replace to G" line. If this is not done, then ALL "C"s, both the original ones and the ones produced from "G", are converted to "G". These functions work with the nucleotide-colouring function in "Utilities.py".
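
As a possible alternative to the lower-case placeholder, here is a sketch that uses Python's built-in str.maketrans and str.translate, which map every character in a single pass. The names TEMPLATE_TO_RNA and transcribe_template are hypothetical, and the sketch assumes the input is the template strand:

```python
# Assumption: seq is the template strand, read left to right.
# The table maps every base in one pass, so no placeholder is needed.
TEMPLATE_TO_RNA = str.maketrans({"A": "U", "T": "A", "G": "C", "C": "G"})


def transcribe_template(seq):
    """Return the RNA base that pairs with each base of a template strand."""
    return seq.translate(TEMPLATE_TO_RNA)


print(transcribe_template("ATGC"))  # UACG
```

Because no character is ever rewritten twice, the order of the entries in the table does not matter here.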

If there is anything wrong with this explanation or it breaks a communication convention let me know. I am quite new to the IT side of this.

billal

Docstrings are cool! Thanks rebelCoder.

amitrupani

Thank you for this series, it was really easy to follow along!

regx_

Hi rebel, should we add [::-1]? I get the result without this part, and I do not know how this code works for you. Could you explain, please?

mohammedamertaha

Thanks for doing these videos, they're really useful!

lucianoinso

I just want to share some notes I have. There is probably a more efficient way for this to be done, but I figured out a way to do transcription and make a reverse complement for a given DNA sequence.





Transcription for a given sequence:

main:
# DNA Toolset/Code testing file
from DNAToolkit import *

# This line is where to put the sequence to be transcribed
givenDNAStr = "AGTC"
DNAStr = validateSeq(givenDNAStr)

print(transcription(DNAStr))

DNAToolkit:
Nucleotides = ["A", "C", "G", "T"]

# Check the sequence to make sure it is a DNA string
def validateSeq(dna_Seq):
    tmpseq = dna_Seq.upper()
    for nuc in tmpseq:
        if nuc not in Nucleotides:
            return False
    return tmpseq

def transcription(seq):
    # DNA to RNA, a process known as transcription
    return seq.replace("T", "U")





For a reverse complement of a given DNA sequence:

main:

# DNA Toolset/Code testing file
from DNAToolkit import *

# insert a desired sequence for complement:
givenDNAStr = 'AGTC'

DNAStr = validateSeq(givenDNAStr)

print(f' DNA sequence: {DNAStr}\n')

print(f'Complementary DNA sequence: {reverse_complement(DNAStr)}')

DNAToolkit:
Nucleotides = ["A", "C", "G", "T"]
DNA_ReverseComplement = {'A': 'T', 'T': 'A', 'G': 'C', 'C': "G"}

# I am not sure what this line of code is doing, but the results won't run in the terminal without it
def validateSeq(dna_Seq):
    tmpseq = dna_Seq.upper()
    for nuc in tmpseq:
        if nuc not in Nucleotides:
            return False
    return tmpseq

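def reverse_complement(seq):
    # Swap each nucleotide for its complement, then reverse the new string
    return ''.join([DNA_ReverseComplement[nuc] for nuc in seq])[::-1]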
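
On the validateSeq step that the notes above are unsure about: the function upper-cases the input and returns it, or returns False as soon as it finds a character that is not A, C, G or T. That is why the later calls need its result. A small sketch using the same names as the comment, with made-up inputs:

```python
Nucleotides = ["A", "C", "G", "T"]


def validateSeq(dna_Seq):
    """Return the upper-cased sequence, or False if it is not a DNA string."""
    tmpseq = dna_Seq.upper()
    for nuc in tmpseq:
        if nuc not in Nucleotides:
            return False
    return tmpseq


print(validateSeq("agtc"))  # AGTC
print(validateSeq("AGXC"))  # False

# Check the result before passing it on: False is not a string,
# so string functions such as replace() would fail on it.
checked = validateSeq("AGXC")
if checked is False:
    print("Not a valid DNA string")
```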

LumTheAlien

At 7:40, I think you have a mistake. A-A is wrong.

manhtuan

great videos, thanks for all your work

stengah

Thanks, very well explained, regards from Colombia.

cristiancamilocanizalessil

Does anyone know why, when I try to color the sequences, the terminal shows them on two or more lines?

riccardo

Hi, please help me.
How do I show the colored output in Spyder? I cannot find the terminal.

PhanCanhTrinhPlus

Amazing content. How can I get the horizontal neon-colored current-line theme?

hyperionspring

Good video, but I think you've made a mistake.
When you take the reverse complement, the direction changes: the result also reads 5' to 3'.

techlife

The transcription tool is wrong, bro. If the input is a template sequence, the RNA is its complement, not just the same sequence with T replaced by U. If it is a coding sequence, you are right.
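
The coding/template distinction raised here can be sketched as follows, assuming both strands are written 5' -> 3'. The function names transcribe_coding and transcribe_template and the example sequence are hypothetical:

```python
def reverse_complement(seq):
    """Complement every base, then reverse so the result reads 5' -> 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def transcribe_coding(coding):
    """mRNA matches the coding strand, with U in place of T."""
    return coding.replace("T", "U")


def transcribe_template(template):
    """mRNA is the reverse complement of the template, with U in place of T."""
    return reverse_complement(template).replace("T", "U")


coding = "ATGGCC"
template = reverse_complement(coding)  # GGCCAT, also written 5' -> 3'

print(transcribe_coding(coding))      # AUGGCC
print(transcribe_template(template))  # AUGGCC
```

Both calls give the same mRNA, so replacing T with U is enough whenever the input is the coding strand.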

angsumandas

Should the Anaconda prompt show the colours? I use Spyder, and when I run the script the Anaconda screen only shows the colour codes next to each letter, as the output did in the video. Could someone help me? Thank you for the amazing work with these videos.
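
One possible workaround for consoles that print the raw colour codes is a sketch like the one below. It uses standard ANSI escape sequences, but the function name colorize, the use_color parameter and the colour chosen for each base are assumptions, not the colored() function from the video. It falls back to plain text when the output is not an interactive terminal:

```python
import sys

# Standard ANSI escape codes; the colour chosen per base is an assumption.
ANSI = {
    "A": "\033[92m",  # green
    "C": "\033[94m",  # blue
    "G": "\033[93m",  # yellow
    "T": "\033[91m",  # red
    "U": "\033[91m",  # red
}
RESET = "\033[0m"


def colorize(seq, use_color=None):
    """Wrap each nucleotide in an ANSI colour code, or return seq unchanged."""
    if use_color is None:
        # Only emit escape codes when writing to an interactive terminal
        use_color = sys.stdout.isatty()
    if not use_color:
        return seq
    return "".join(ANSI.get(nuc, RESET) + nuc for nuc in seq) + RESET


print(colorize("AGTC"))
```

Passing use_color=False forces plain output when a console still reports itself as a terminal but does not interpret the codes.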

juanmaruizrobles

Hi Rebel!
First of all, thanks for your amazing videos.
Unfortunately, I still get that garbage even in my terminal. What should I do?
And could you tell me which modules I should learn for bioinformatics?

ghazal