I created this little DNA Writer webpage after seeing the article on scientists recording one of Shakespeare’s sonnets on DNA. I was inspired to put together something similar as an assignment for my middle-school science class to demonstrate how DNA records information. With the website to do quick translations for me, I’ll give each student the translation table and a simple message in DNA code and have them figure out the message.
Update: I’ve adapted the code to add a two- to five-letter sequence of non-coding DNA to the beginning and end of the message code. There are also start and stop codes.
The DNA Writer code uses a simple look-up table where each letter in the English alphabet is assigned a unique three-letter nucleotide code. The three letters are chosen from the letters of the DNA bases – AGCT – similar to the way codons are organized in mRNA. Any unknown characters or punctuation are ignored.
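The look-up idea can be sketched in a few lines of Python. This is only an illustration, not the DNA Writer's own code: the assignment of codons to letters here is hypothetical (letters are simply paired with codons in a fixed order), and including the space character is an assumption.

```python
from itertools import product

BASES = "AGCT"

# All 64 possible three-letter codons, in a fixed order.
CODONS = ["".join(p) for p in product(BASES, repeat=3)]

# Hypothetical assignment for illustration; the real DNA Writer table differs.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
ENCODE = dict(zip(ALPHABET, CODONS))
DECODE = {codon: ch for ch, codon in ENCODE.items()}


def to_dna(message):
    """Encode a message, ignoring characters that are not in the table."""
    return "".join(ENCODE[ch] for ch in message.upper() if ch in ENCODE)


def from_dna(sequence):
    """Decode a sequence by reading it three bases at a time."""
    codons = (sequence[i:i + 3] for i in range(0, len(sequence), 3))
    return "".join(DECODE.get(codon, "?") for codon in codons)


encoded = to_dna("Hi!")
print(encoded)            # six bases: the "!" is ignored
print(from_dna(encoded))  # the message comes back in upper case
```

Because every character maps to exactly one codon, decoding is just the same table read in reverse.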
Also, with a little tweaking, I think I can adapt this assignment to show how random mutations can be introduced into DNA sequences during replication. Maybe break the class into groups of four, give the first student a message as a nucleotide sequence, have them copy it and pass it on to the next student, and so on. If I structure this as a race between the groups, someone’s bound to introduce some errors, so when they translate the final code back into English they should see how the random mutations affected their message.
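The copy-and-pass-on game can also be simulated, which might help in picking a message length that is likely to pick up an error or two. The sketch below is an assumption-laden model, not part of the DNA Writer: the function names and the 5% per-base error rate are made up for illustration, and only substitutions (not dropped or inserted letters) are modeled.

```python
import random

BASES = "AGCT"


def copy_with_errors(sequence, error_rate, rng):
    """Copy a sequence, swapping each base for a different one with probability error_rate."""
    copied = []
    for base in sequence:
        if rng.random() < error_rate:
            # A point mutation: substitute one of the other three bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def relay(sequence, students=4, error_rate=0.05, rng=None):
    """Pass the sequence through a chain of copyists, one after another."""
    rng = rng or random.Random()
    for _ in range(students):
        sequence = copy_with_errors(sequence, error_rate, rng)
    return sequence


original = "GCTTGTGATGAATTT"
final = relay(original, rng=random.Random(1))
changed = [i for i, (a, b) in enumerate(zip(original, final)) if a != b]
print(original)
print(final)
print("positions that differ:", changed)
```

Translating both the original and the final sequence back into English shows which mutations change a letter of the message.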
UPDATE: Non-Coding (junk) DNA: I’ve updated the code so that you have the option of adding a short (2-5 character) string of non-coding DNA to the beginning and end of each sequence.
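As a rough sketch of how the padding and the start and stop codes might fit together: the start (ATG) and stop (TAA) codons below are the ones listed in the dnaWriterA table, but the function names are hypothetical, and the rule that the filler never contains the start codon is a choice made for this example so that the message can be found again unambiguously. It also assumes no message codon is TAA.

```python
import random

BASES = "AGCT"
START, STOP = "ATG", "TAA"  # from the dnaWriterA table; assumed to apply here


def junk(rng, min_len=2, max_len=5):
    """Return 2-5 random bases of non-coding filler that never contains START."""
    while True:
        filler = "".join(rng.choice(BASES) for _ in range(rng.randint(min_len, max_len)))
        if START not in filler:
            return filler


def frame(coding, rng=None):
    """Wrap a coding sequence as: junk + start + message + stop + junk."""
    rng = rng or random.Random()
    return junk(rng) + START + coding + STOP + junk(rng)


def extract(framed):
    """Find the first start codon, then read codons until the stop codon."""
    begin = framed.find(START)
    if begin == -1:
        raise ValueError("no start codon found")
    codons = []
    for i in range(begin + 3, len(framed) - 2, 3):
        codon = framed[i:i + 3]
        if codon == STOP:
            return "".join(codons)
        codons.append(codon)
    raise ValueError("no stop codon found")


framed = frame("GCTTGT", random.Random(0))
print(framed)
print(extract(framed))
```

Students decoding by hand follow the same steps: skip ahead to the start code, read three letters at a time, and stop at the stop code.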
UPDATE 2: Personalized and Printable output: Since I’m using the DNA Writer to give each student a personalized message, I’ve created a “Printer Friendly Output” button that produces an individualized page with the code, the translation table, and some information on how it works, so I can print off assignments more easily.
UPDATE 3: You can now get a color-coded version of the sequence.
Update 4: Now you can embed the nucleobase color patterns into other websites.
Update 5: Closer to the standard lettering
DNA Writer A: https://earthsciweb.org/js/bio/dna-writerA/
In constructing the codon-to-English conversion table, I had to decide whether to go with the standard coding (e.g. letting GCT, which codes for alanine, represent A) or make up a random encoding.
I opted for the random approach for a number of reasons, but the primary one was that multiple codons can code for the same amino acid. GCT, GCC, GCA, and GCG all code for alanine. This would not necessarily be a problem, except that if we respect all of the multiple encodings, we run out of codons to represent things like numbers and punctuation. A secondary reason is that U is used to represent the 21st amino acid, selenocysteine, but its codon is also a stop codon (Croat, 2012), and its addition to the protein chain depends on more than a single codon in the sequence.
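A quick count makes the constraint concrete. This is just the arithmetic, sketched in Python; the tally of symbols to encode (26 letters, 10 digits, space, period, start and stop) is an assumption about what a message needs.

```python
from itertools import product

# Every three-letter combination of the four DNA bases.
codons = ["".join(p) for p in product("ACGT", repeat=3)]
total = len(codons)  # 4 ** 3 = 64

# Assumed symbol set: 26 letters, 10 digits, space, period, start, stop.
symbols_needed = 26 + 10 + 2 + 2

# In the standard genetic code, 61 codons specify amino acids and 3 are stop codons.
standard_in_use = 61 + 3

print(total, symbols_needed, total - standard_in_use)  # 64 40 0
```

With one codon per symbol there is room to spare (40 of 64), but keeping every standard synonym leaves no codons free for digits or punctuation.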
I’ve created a hybrid option, dnaWriterA, which respects the standard lettering as much as possible (based on the inverse DNA codon table on Wikipedia). In the table below, the bolded sequences are the ones that have been reassigned.
Letter/code | Amino acid | Codons
start | | ATG
stop | | TAA
space (" ") | | GCA
. | | GGA
A | Ala | GCT, GCC, GCA, GCG
B | Asn or Asp | AAC
C | Cys | TGT, TGC
D | Asp | GAT, GAC
E | Glu | GAA, GAG
F | Phe | TTT, TTC
G | Gly | GGT, GGC, GGA, GGG
H | His | CAT, CAC
I | Ile | ATT, ATC, ATA
J | | TTG
K | Lys | AAA
L | Leu | CTT, CTC, CTA, CTG, TTA, TTG
M | Met | ATG
N | Asn | AAT, AAC
O | | AGG
P | Pro | CCT, CCC, CCA, CCG
Q | Gln | CAA, CAG
R | Arg | CGT, CGC, CGA, CGG, AGA, AGG
S | Ser | TCT, TCC, TCA, TCG, AGT, AGC
T | Thr | ACT, ACC, ACA, ACG
U | | AGA
V | Val | GTT, GTC, GTA, GTG
W | Trp | TGG
X | | AGC
Y | Tyr | TAT, TAC
Z | Gln or Glu | CAA, CAG, GAA, GAG
0 | | AGT
1 | | GCG
2 | | GGG
3 | | CTG
4 | | CCG
5 | | CGG
6 | | TCG
7 | | ACG
8 | | GTG
9 | | GAG
I’ve also posted the code to GitHub, with instructions on how to adapt the sequence: https://github.com/lurbano/dnaWriterA