DNA Writer: Storing Information in DNA Exercise

DNA Writer: Translate text into a DNA code (and back again) using a simple lookup table.

I created this little DNA Writer webpage after seeing the article on scientists recording one of Shakespeare’s sonnets on DNA. I was inspired to put together something similar as an assignment for my middle-school science class to demonstrate how DNA records information. With the website to do quick translations for me, I’ll give each student the translation table and a simple message in DNA code and have them figure out the message.

Update: I’ve adapted the code to add a two- to five-letter sequence of non-coding DNA to the beginning and end of the message code. There are also start and stop codes.

The DNA sequence (or RNA in this figure) can be broken down into groups of three nucleotides called codons. Each codon codes for a specific amino acid, so the order of codons gives the sequence of amino acids in the proteins created by the DNA strand. Image by TransControl via Wikipedia.

The DNA Writer code uses a simple look-up table where each letter in the English alphabet is assigned a unique three-letter nucleotide code. The three letters are chosen from the letters of the DNA bases (A, G, C, and T), similar to the way codons are organized in mRNA. Any unknown characters or punctuation are ignored.
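The look-up idea can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the table here is hypothetical (each character is simply paired with the next codon in a fixed enumeration order), and treating the space as an encodable character is an assumption.

```python
from itertools import product

BASES = "AGCT"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # assumption: space is encodable too

# Hypothetical table: pair each character with the next codon in a fixed
# enumeration order (AAA, AAG, AAC, ...). DNA Writer's real table differs.
CODONS = ["".join(p) for p in product(BASES, repeat=3)]
ENCODE = dict(zip(ALPHABET, CODONS))
DECODE = {codon: char for char, codon in ENCODE.items()}


def text_to_dna(text):
    """Translate text to a DNA string, skipping characters not in the table."""
    return "".join(ENCODE[c] for c in text.upper() if c in ENCODE)


def dna_to_text(dna):
    """Read the sequence three bases at a time and look each codon up."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(DECODE.get(codon, "?") for codon in codons)


message = text_to_dna("Hello, World!")
print(message)
print(dna_to_text(message))
```

Because punctuation is dropped on the way in, the round trip returns the message in capitals without the comma and exclamation mark.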

Also, with a little tweaking, I think I can adapt this assignment to show how random mutations can be introduced into DNA sequences when they are copied (replication). Maybe break the class into groups of four, give the first student a message as a nucleotide sequence, and have them copy it and pass it on to the next student, and so on. If I structure this as a race between the groups, then someone’s bound to introduce some errors, so when they translate the final code back into English they should see how the random mutations affected their message.
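For a preview of what the class might see, the copying chain can be simulated. The sketch below is an assumption-laden toy: the function names and the per-base error rate are made up for illustration, and it only models substitutions (one base swapped for another), not dropped or extra letters, which students copying by hand would probably also produce.

```python
import random

BASES = "AGCT"


def copy_with_errors(sequence, error_rate, rng):
    """Copy a sequence base by base, occasionally substituting a wrong base."""
    copied = []
    for base in sequence:
        if rng.random() < error_rate:
            # Point mutation: swap in one of the other bases.
            base = rng.choice([b for b in BASES if b != base])
        copied.append(base)
    return "".join(copied)


def pass_along(sequence, students, error_rate, seed=None):
    """Simulate a message being hand-copied by several students in turn."""
    rng = random.Random(seed)
    for _ in range(students):
        sequence = copy_with_errors(sequence, error_rate, rng)
    return sequence


original = "ATGGCTTGGTAA"
final = pass_along(original, students=4, error_rate=0.05, seed=42)
changed = sum(a != b for a, b in zip(original, final))
print(original, final, f"{changed} base(s) changed")
```

Translating `final` back with the class's look-up table shows which letters of the message a single changed base can alter.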

UPDATE: Non-Coding (junk) DNA: I’ve updated the code so that you have the option of adding a short (2-5 character) string of non-coding DNA to the beginning and end of each sequence.
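One way the non-coding ends and the start and stop codes could fit together is sketched below. Everything here is an assumption rather than the site's implementation: the markers ATG and TAA are borrowed from the dnaWriterA table, the function names are hypothetical, and the sketch assumes the message itself never contains the stop code in frame.

```python
import random

START, STOP = "ATG", "TAA"  # assumed markers, borrowed from the dnaWriterA table
BASES = "AGCT"


def random_junk(rng, min_len=2, max_len=5):
    """Return 2-5 random bases that cannot create an early start code."""
    while True:
        length = rng.randint(min_len, max_len)
        junk = "".join(rng.choice(BASES) for _ in range(length))
        # Reject junk if START would show up before its intended position.
        if (junk + START).find(START) == len(junk):
            return junk


def wrap(coding, rng):
    """Surround a coding sequence with start/stop codes and non-coding ends."""
    return random_junk(rng) + START + coding + STOP + random_junk(rng)


def unwrap(sequence):
    """Recover the bases between the start code and the first in-frame stop."""
    begin = sequence.find(START)
    if begin == -1:
        raise ValueError("no start code found")
    begin += len(START)
    for i in range(begin, len(sequence) - 2, 3):
        if sequence[i:i + 3] == STOP:
            return sequence[begin:i]
    raise ValueError("no in-frame stop code found")


wrapped = wrap("GCTTGG", random.Random(3))
print(wrapped, "->", unwrap(wrapped))
```

The junk is what makes the student exercise harder: the reader has to find the start code first to know where to begin counting in threes.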

A more personalized and printer friendly format for output.

UPDATE 2: Personalized and Printable output: Since I’m using the DNA writer to give each student a personalized message, I’ve created a button that gives “Printer Friendly Output” which will produce an individualized page with the code, the translation table, and some information on how it works, so I can print off individualized assignments more easily.

UPDATE 3: You can now get a color coded version of the sequence.

Ravenclaw’s DNA sequence color coded, and translated back to English.

Update 4: Now you can embed the nucleobase color patterns into other websites.

Update 5: Closer to the standard lettering

DNA Writer A: https://earthsciweb.org/js/bio/dna-writerA/

In constructing the codon-to-English conversion table I had to decide whether I wanted to go with the standard coding (e.g. letting GCT, which codes for alanine, represent A) or make up a random encoding.

I opted for the random approach for a number of reasons, but the primary one was that multiple codons can code for the same amino acid: GCT, GCC, GCA, and GCG all code for alanine. This would not necessarily be a problem, except that if we respect all of the multiple encodings, we run out of codons to represent things like numbers and punctuation. A secondary reason is that U is used to represent the 21st amino acid, selenocysteine, but its codon doubles as a stop codon (Croat, 2012), and its addition to the protein chain depends on more than a single codon in the sequence.
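The counting behind "we run out of codons" can be checked quickly. The sketch below uses the number of codons each amino acid has in the standard genetic code; the tally of symbols needed (letters, digits, space, period, start, and stop) is an assumption based on the characters this post mentions.

```python
BASES = "AGCT"
total_codons = len(BASES) ** 3  # 4 bases, 3 positions: 64 codons

# Number of codons per amino acid in the standard genetic code.
SYNONYMS = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}
STOP_CODONS = 3

# Keeping every standard assignment leaves nothing for digits or punctuation.
spare_if_standard = total_codons - sum(SYNONYMS.values()) - STOP_CODONS

# One codon per symbol: 26 letters, 10 digits, space and period, start and stop.
symbols_needed = 26 + 10 + 2 + 2
spare_if_unique = total_codons - symbols_needed

print(spare_if_standard, spare_if_unique)
```

With one codon per symbol there is room to spare, which is why a made-up table can cover digits and punctuation while the full standard table cannot.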

I’ve created a hybrid option, dnaWriterA, which respects the standard lettering as much as possible (based on the inverse DNA codon table on Wikipedia). In the table below, the bolded sequences are the ones that have been reassigned.

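| Letter/code | Amino acid | Codon(s) |
|---|---|---|
| start | | ATG |
| stop | | TAA |
| space (" ") | | GCA |
| . | | GGA |
| A | Ala | GCT GCC GCA GCG |
| B | Asn or Asp | AAC |
| C | Cys | TGT TGC |
| D | Asp | GAT GAC |
| E | Glu | GAA GAG |
| F | Phe | TTT TTC |
| G | Gly | GGT GGC GGA GGG |
| H | His | CAT CAC |
| I | Ile | ATT ATC ATA |
| J | | TTG |
| K | Lys | AAA |
| L | Leu | CTT CTC CTA CTG TTA TTG |
| M | Met | ATG |
| N | Asn | AAT AAC |
| O | | AGG |
| P | Pro | CCT CCC CCA CCG |
| Q | Gln | CAA CAG |
| R | Arg | CGT CGC CGA CGG AGA AGG |
| S | Ser | TCT TCC TCA TCG AGT AGC |
| T | Thr | ACT ACC ACA ACG |
| U | | AGA |
| V | Val | GTT GTC GTA GTG |
| W | Trp | TGG |
| X | | AGC |
| Y | Tyr | TAT TAC |
| Z | Gln or Glu | CAA CAG GAA GAG |
| 0 | | AGT |
| 1 | | GCG |
| 2 | | GGG |
| 3 | | CTG |
| 4 | | CCG |
| 5 | | CGG |
| 6 | | TCG |
| 7 | | ACG |
| 8 | | GTG |
| 9 | | GAG |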
Codons mapping to letters/codes used in the dnaWriterA version. The bolded sequences are the ones that have been reassigned.
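With several codons per letter, encoding becomes a choice while decoding stays a plain look-up. The sketch below illustrates that with a handful of rows copied from the table above (ones whose codons are not shared with another row). Picking a codon at random is an assumption for illustration; the published code may choose differently.

```python
import random

# A few conflict-free rows copied from the table above; the full table has more.
CODONS_FOR = {
    "C": ["TGT", "TGC"],
    "D": ["GAT", "GAC"],
    "F": ["TTT", "TTC"],
    "H": ["CAT", "CAC"],
    "I": ["ATT", "ATC", "ATA"],
    "K": ["AAA"],
    "W": ["TGG"],
    "Y": ["TAT", "TAC"],
}
# Invert the table: every codon points back to exactly one letter.
LETTER_FOR = {
    codon: letter for letter, codons in CODONS_FOR.items() for codon in codons
}


def encode(text, rng):
    """Pick one of the available codons for each letter (assumed: at random)."""
    return "".join(
        rng.choice(CODONS_FOR[c]) for c in text.upper() if c in CODONS_FOR
    )


def decode(dna):
    """Decode three bases at a time; unknown codons become '?'."""
    usable = len(dna) - len(dna) % 3
    return "".join(LETTER_FOR.get(dna[i:i + 3], "?") for i in range(0, usable, 3))


for seed in range(3):
    dna = encode("chick", random.Random(seed))
    print(dna, "->", decode(dna))
```

Different runs can produce different DNA strings for the same word, yet they all decode to the same text, much as synonymous codons give the same amino acid.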

I’ve also posted the code to GitHub: https://github.com/lurbano/dnaWriterA with instructions on how to adapt the sequence.

Flipped Teaching

Mrs. D. recommended this nice little article on “flipped teaching”, where students get lessons from videos (usually at home) and spend their time in class working on problems and getting help from peers and their teacher. Sounds a lot like Montessori: in middle school, for example, you get a short lesson at the beginning of the week and spend the rest of the time working on projects and assignments.

Pushing the video out of the classroom can, potentially, be a useful step, especially for those students who can work independently. I’ve been trying it a little with the Khan Academy videos, but I need to organize it a bit more.

Influence Explorer: Data on Campaign Contributions by Politician and by Major Contributors

Influence Explorer is an excellent resource for assessing data about money in politics.

The website Influence Explorer has a lot of easily accessible data about the contributions of companies and prominent people to lawmakers. As a resource for civics research it’s really nice, but the time series data also makes it a useful resource for math, algebra and pre-calculus in particular.

DNA Hard Drives

Adam Cole has an excellent NPR article on some fascinating researchers who are storing data — text files, web pages, sonnets — on DNA.

This should be an interesting introduction for middle-schoolers to the idea of DNA as a means of storing and transferring information. The question I hope to get is, “How did they do that?”

Converting text into a DNA sequence. From Goldman et al. (2013).

Atom Builder

This app lets you drag and drop electrons, protons, and neutrons to create atoms with different charges, elements, and atomic masses. You can also enter the element symbol, charge and atomic mass and it will build the atom for you.

Note, however, it only does the first 20 elements.

Wiggle Matching: Sorting out the Global Warming Curve

To figure out whether the climate is actually warming, we need to remove from the global temperature curve all the wiggles caused by other things, like volcanic eruptions and El Niño/La Niña events. The resulting trend is quite striking.

I’m teaching pre-Calculus using a graphical approach, and my students’ latest project is to model the trends in the rising carbon dioxide record in a similar way. They’re matching curves (exponential, parabolic, sinusoidal) to the data and subtracting them till they get down to the background noise.

Carbon dioxide concentration (ppm) measured at the Mauna Loa observatory in Hawaii shows exponential growth and a periodic annual variation.
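The fit-and-subtract routine from the pre-calculus project can be sketched in plain Python. To keep it short, a straight line stands in for the exponential or parabolic trend, and the series is synthetic: the baseline, growth rate, and seasonal amplitude below are made-up numbers, not Mauna Loa measurements.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def fit_sinusoid(xs, ys, period):
    """Project the data onto sin and cos at a known period."""
    n = len(xs)
    w = 2 * math.pi / period
    amp_sin = 2 / n * sum(y * math.sin(w * x) for x, y in zip(xs, ys))
    amp_cos = 2 / n * sum(y * math.cos(w * x) for x, y in zip(xs, ys))
    return amp_sin, amp_cos


# Synthetic monthly series (made-up values): linear rise plus an annual cycle.
months = list(range(240))
data = [320 + 0.1 * t + 3 * math.sin(2 * math.pi * t / 12) for t in months]

# Step 1: match a trend and subtract it.
slope, intercept = fit_line(months, data)
detrended = [y - (slope * t + intercept) for t, y in zip(months, data)]

# Step 2: match the annual wiggle and subtract that too.
amp_sin, amp_cos = fit_sinusoid(months, detrended, period=12)
w = 2 * math.pi / 12
residual = [
    d - (amp_sin * math.sin(w * t) + amp_cos * math.cos(w * t))
    for t, d in zip(months, detrended)
]

print(f"slope={slope:.3f}, seasonal amplitude={amp_sin:.2f}")
print(f"largest leftover={max(abs(r) for r in residual):.2f}")
```

Because the two curves are fitted one after the other rather than together, a little of the seasonal signal leaks into the trend, so the leftover is small but not exactly zero. With real data, what remains after each subtraction is what the next curve, or the background noise, has to explain.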