Get your students to crack the genetic code for themselves.
In 1958, Crick postulated the central dogma of molecular biology: that the flow of information goes from DNA to RNA to protein. But the question remained: how did the four-letter alphabet of nucleotides in DNA (A, C, T and G) or its equivalent in RNA (A, C, U and G) encode the 20-letter alphabet of amino acids that build our proteins? What was the genetic code?
In 1961, Marshall W Nirenberg and Johann H Matthaei deciphered the first word of the code, revealing that the RNA sequence UUU encodes the amino acid phenylalanine. Subsequently, Har Gobind Khorana showed that the repeating nucleotide sequence UCUCUCUCUCUC encodes a strand of amino acids reading serine-leucine-serine-leucine. By 1965, largely due to the work of Nirenberg and Khorana, the genetic code had been completely cracked. This work revealed that each group of three nucleotides (known as a codon) encodes a specific amino acid, and that the order of the codons determines the order of amino acids in the resulting protein and, consequently, its chemical and biological properties.
Nirenberg and Khorana compared short sequences of the nucleic acid RNA and the resulting amino acid sequences (peptides). To do this, they followed the protocol that Nirenberg developed with Matthaei.
This involved artificially synthesising a specific sequence of RNA nucleotides and mixing it with extracts of Escherichia coli bacteria that contained ribosomes and other cellular machinery necessary for protein synthesis. The scientists then prepared 20 samples of the resulting mixture; to each sample, they added one radioactively labelled amino acid and 19 unlabelled amino acids, then allowed protein synthesis to occur. Each of the 20 samples contained a different radioactively labelled amino acid. If the resulting peptide was radioactive, it indicated that the radioactively labelled amino acid was included, confirming that the RNA nucleotide sequence coded for this amino acid at some point.
By repeating this experiment with different RNA sequences, more and more information could be gathered about the genetic code. After simple sequences such as UUUUUU and AAAAAA had been tested, further teams of scientists took up the challenge, analysing more complex RNA sequences and eventually allowing all 64 codons to be decoded.
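The figure of 64 follows from simple counting, which also suggests why a triplet is the shortest word that could work: four letters give only 4 one-letter words and 16 two-letter words, too few for 20 amino acids. A minimal Python sketch of that arithmetic (the variable names are illustrative and not taken from the original work):

```python
from itertools import product

RNA_LETTERS = "ACGU"   # the four RNA nucleotides
N_AMINO_ACIDS = 20     # amino acids that need to be encoded

# Count the distinct words available at each word length
for length in (1, 2, 3):
    n_words = len(RNA_LETTERS) ** length
    enough = n_words >= N_AMINO_ACIDS
    print(f"length {length}: {n_words} words, enough for 20 amino acids: {enough}")

# Enumerate every possible triplet (codon)
codons = ["".join(letters) for letters in product(RNA_LETTERS, repeat=3)]
print(len(codons))
```

Only the three-letter case offers at least 20 words, and the surplus leaves room for several codons to share one amino acid.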
The genetic code itself is a crucial element of biology lessons, providing a molecular explanation of the actions of genes (for example, in mutation, evolution and gene expression). Furthermore, the way in which Nirenberg and Khorana cracked the genetic code – by comparing short sequences of RNA with the resulting amino acid sequences – can be re-run as an inquiry-based teaching activity at school. Using the sequences provided by the teacher, the students work in teams to:
The activity thus offers a model for teaching the nature of scientific knowledge: a provisional consensus constructed by the community, with conclusions of varying strength based on partial evidence.
This activity is suitable for 14- to 18-year-old students working in teams of 3–4, and takes about two hours, divided into four steps plus a final discussion. It is designed as an introduction to molecular biology, before you explain anything about the genetic code or the central dogma of molecular biology.
Students are asked to crack a code composed of different sequences of letters (A, C, T, G) using the messages that those sequences encode (e.g. AspHisTrp…). In each of the first three steps, each team is given a different set of letter sequences and corresponding messages. At each step, they will need to re-evaluate their conclusions from the previous steps, and modify their solution to the code.
Explain that all the groups will be working to crack the same code, using different examples. Do not tell your students about the biological nature of the sequences (DNA and amino acids); they should focus on finding patterns and relationships.
Nirenberg and Khorana used RNA sequences to crack the code; in contrast, this activity uses DNA sequences (sense codons, 5′ to 3′). The crux of the activity is the existence of the code rather than the details of transcription and translation, which can be addressed in subsequent lessons.
After each step, you may ask one student from each team to join a different team. (This mimics the dynamics of how scientific knowledge is acquired and shared, for example at conferences or through publications.)
Otherwise, teams may exchange information only when they are told to do so. (If one team gets stuck and discouraged, it can be more motivating to ask another team to help them rather than the teacher.)
Allow at least 10–15 minutes for your students to discuss each step. When all the teams feel that they have obtained all the possible information from their sequences, move on to the next step.
Sequence | Message | Students discover that… |
---|---|---|
ATGTTAGGTAGTAAAGATGCT | MetLeuGlySerLysAspAla | The code is based on triplets and each triplet represents one of the three-letter elements, e.g. Met. |
ATGCATGAAGCTATTTATGAT | MetHisGluAlaIleTyrAsp | |
ATGGGTAGTGATGAAGCTTAT | MetGlySerAspGluAlaTyr | |
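The reasoning that students follow with the table above can be sketched in Python: cut each sequence into triplets, pair each triplet with the element at the same position in the message, and check that no triplet is ever given two different meanings. The function names (`chunks`, `learn_code`) are illustrative only and are not part of the activity.

```python
def chunks(text, size=3):
    """Split text into consecutive pieces of `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def learn_code(examples):
    """Build a triplet -> element table from (sequence, message) pairs."""
    table = {}
    for sequence, message in examples:
        triplets, elements = chunks(sequence), chunks(message)
        if len(triplets) != len(elements):
            raise ValueError("sequence and message lengths do not match")
        for triplet, element in zip(triplets, elements):
            # A consistent code never assigns two meanings to one triplet
            if table.setdefault(triplet, element) != element:
                raise ValueError(f"{triplet} has conflicting meanings")
    return table


# The three sequence/message pairs from the table above
step1 = [
    ("ATGTTAGGTAGTAAAGATGCT", "MetLeuGlySerLysAspAla"),
    ("ATGCATGAAGCTATTTATGAT", "MetHisGluAlaIleTyrAsp"),
    ("ATGGGTAGTGATGAAGCTTAT", "MetGlySerAspGluAlaTyr"),
]

code = learn_code(step1)
for triplet, element in sorted(code.items()):
    print(triplet, element)
```

Because no conflict is raised, the triplet hypothesis is consistent with all three examples. That is the kind of provisional conclusion the teams are expected to reach at this step.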
Sequence | Message | Students discover that… |
---|---|---|
ATGGTTTCGTACACTGCGTCA | MetValSerTyrThrAlaSer | Some elements can be encoded by more than one triplet, e.g. Ser. |
ATGCCGTACACATGTGTCACA | MetProTyrThrCysValThr | |
ATGACGAGTGCGTTGTGCGAT | MetThrSerAlaLeuCysAsp | |
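The redundancy in the table above can be found mechanically by grouping triplets under the element they encode. The following is a minimal Python sketch; the helper name `triplets_per_element` is hypothetical.

```python
from collections import defaultdict

# The three sequence/message pairs from the table above
step2 = [
    ("ATGGTTTCGTACACTGCGTCA", "MetValSerTyrThrAlaSer"),
    ("ATGCCGTACACATGTGTCACA", "MetProTyrThrCysValThr"),
    ("ATGACGAGTGCGTTGTGCGAT", "MetThrSerAlaLeuCysAsp"),
]


def triplets_per_element(examples):
    """Map each message element to the set of triplets seen encoding it."""
    seen = defaultdict(set)
    for sequence, message in examples:
        # Triplets and three-letter elements line up at the same offsets
        for i in range(0, len(sequence), 3):
            seen[message[i:i + 3]].add(sequence[i:i + 3])
    return seen


seen = triplets_per_element(step2)
redundant = {el: sorted(ts) for el, ts in seen.items() if len(ts) > 1}
for element in sorted(redundant):
    print(element, redundant[element])
```

With these three sequences, the elements that appear with more than one triplet are Cys, Ser, Thr and Val.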
Sequence | Message | Students discover that… |
---|---|---|
TGTCATGCATCCGTCATCACTGAC | – | The ATG triplet determines the beginning of the message and the TGA triplet its end. |
TGCGTGACTATGGACACAGTCGT | MetAspThrVal | |
ATGTGTCGATGACTGATCATG | MetCysArg | |
ATGTGCGTACACATTTGAGTC | MetCysValHisIle | |
ATGCTGTACACATGATGCACAGT | MetLeuTyrThr | |
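The start and stop rule can be tried out with a short Python sketch, applied here to the four sequences above that list a message. It assumes a reading rule that the article does not spell out: reading begins at the first ATG, proceeds three letters at a time, ends at a TGA in that reading frame, and ignores leftover letters at the end. The `CODE` table is a partial excerpt of the standard genetic code covering only the triplets needed here, and `read_message` is a hypothetical helper name.

```python
# Partial excerpt of the standard genetic code (an assumption of this sketch):
# only the triplets that occur in the four example sequences are listed.
CODE = {
    "ATG": "Met", "GAC": "Asp", "ACA": "Thr", "GTC": "Val",
    "TGT": "Cys", "CGA": "Arg", "TGC": "Cys", "GTA": "Val",
    "CAC": "His", "ATT": "Ile", "CTG": "Leu", "TAC": "Tyr",
}


def read_message(sequence, start="ATG", stop="TGA"):
    """Read triplets from the first start triplet up to a stop triplet."""
    begin = sequence.find(start)
    if begin == -1:
        return ""  # no start triplet, so no message
    message = []
    # Step through complete triplets only, in the frame set by the start
    for i in range(begin, len(sequence) - 2, 3):
        triplet = sequence[i:i + 3]
        if triplet == stop:
            break
        message.append(CODE.get(triplet, "???"))  # ??? = not in partial table
    return "".join(message)


for seq in (
    "TGCGTGACTATGGACACAGTCGT",
    "ATGTGTCGATGACTGATCATG",
    "ATGTGCGTACACATTTGAGTC",
    "ATGCTGTACACATGATGCACAGT",
):
    print(seq, read_message(seq))
```

In the first of these sequences, a TGA appears before the ATG and is ignored, which illustrates that the end signal only counts once reading has begun.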
Ask your students to consider the following questions:
After the discussion, explain to your students that the sequences were DNA and amino acid sequences, and that they have just reproduced a key experiment in molecular biology. Your students should now be motivated to learn more about the genetic code and the central dogma of molecular biology, including how similar their activity was to the way in which the genetic code was really cracked.
You could recap the activity, reminding your students what they discovered for themselves:
(Note that the activity could generate the misconception that proteins are usually composed of six or seven amino acids, so this may need to be addressed.)
Explain that the way your students have been working (in collaborative and/or competitive teams, with team membership changing and information being shared between teams) reflects the way that scientists work in real life.
To make the activity easier, you could give your students more sequences in each step (e.g. the sequence sets for two teams). Alternatively, you could leave out step 3, and simply explain the role of the start and stop codons after the activity.
Pedagogic reflections on the activity described in this article are part of the work of the language and context in science education (llenguatge i contextos en educació científica, LICEC) research group at the Autonomous University of Barcelona (reference 2014SGR1492), financed by the Spanish Ministry of Economics and Competitiveness (reference EDU2015-66643-C2-1-P).
This article offers a strategy that helps teachers to explore, simply and accessibly, one of the most challenging aspects of science teaching: helping their students to appreciate and understand how science actually works. Acquiring knowledge requires scientists to ask good questions, design and carry out good experiments, and work together to address uncertainty. This is exactly what the students need to do in this activity to crack the genetic code.
I anticipate that teachers of disciplines other than biology (particularly maths and chemistry) would also find this article useful. It would also be a very good activity to use during a science fair.
Betina Lopes, Portugal