3.4.1 DNA, genes and chromosomes — Atlas

DNA is information you can read. AQA expects you to know what it's made of, how it spells out the proteins a cell can produce, how it's packed into chromosomes, and how it copies itself before a cell divides. It is also a topic where AQA expects exact wording on a handful of definitions.

Read this topic as an analogy.

Picture a library with a paired-shelf system. Every book on shelf A has a matching book on shelf B. They are not photocopies of each other. Over the years, different editions of some recipes have been printed, so two copies of the same volume may carry slightly different versions of recipe 47. But the books are the same books: recipe 47 sits on page 209 in both, and the dish it cooks is in some sense the same dish, even if shelf A uses a different ratio of salt to butter than shelf B does.

A cell carries that paired-shelf system. The 46 books (23 matched pairs) are the chromosomes. The recipes inside the books are the genes. The variant editions are the alleles. The page numbers are the loci. The kitchen attached to the library can cook any recipe in the library; the full menu it could prepare (everything from the simplest stock to dishes nobody has ordered in years) is the proteome. The full inventory of recipes across the whole library is the genome. The proteome is what the kitchen can cook; the genome is what is written down.

Within each book, every recipe is written as a sequence of three-word instructions ("chop the onion", "add the salt"). A three-word instruction is a codon. There are far more possible three-word instructions than there are distinct steps, so different instructions can mean the same step, the same way different codons can specify the same amino acid. Recipe books also have stretches of publisher's notes and acknowledgements bound in among the cooking instructions; the kitchen tears those pages out before cooking (introns), keeping only the instruction pages (exons).

Books are long and shelves are short, so each book is wound tightly around small spools to take up less room (histones). One section of pages wound onto one spool is a nucleosome. When the library is busy, books are unwound enough to read individual recipes. When the books need to be carried somewhere safe, they are wound tighter still, and what was an open book becomes a tightly bundled package (a condensed chromosome at cell division).

A separate kind of pamphlet sits on a stand by the entrance: bound in one piece, no spools, no shelf-pair. That's prokaryotic DNA. Small versions of the same pamphlet sit inside the kitchen's tool shed and the greenhouse out the back: the mitochondrial and chloroplast DNAs, both circular and both spool-free.

Once a year, every book in the library has to be duplicated. The process runs in five named steps. First, a spine-unbinder loosens the cover and lets the pages separate (DNA helicase). Second, the partner-line bonds that hold the matching halves of each page together are snipped apart (hydrogen bonds break). Third, an alignment guide reads each old line and places its matching partner line nearby (complementary base pairing: free nucleotides line up against the template). Fourth, a binder fixes those new partner lines into a new spine (DNA polymerase joins adjacent nucleotides). Fifth, that new spine is sealed with glue between the page sections (phosphodiester bonds form between sugar and phosphate). The result: two new books, each with half its pages original and half newly printed. That's semi-conservative replication.

Mapping back to formal vocabulary. The books are chromosomes, the recipes are genes, the variant editions are alleles, the page numbers are loci, and the three-word instructions are codons. The kitchen's full menu is the proteome; the library's inventory is the genome. The spools are histones; one section wound onto one spool is a nucleosome. The five-step duplication is semi-conservative replication: DNA helicase, hydrogen bond separation, complementary base pairing between free nucleotides and the template, DNA polymerase joining adjacent nucleotides, and phosphodiester bonds sealing the new strand. Sex chromosomes (XX, XY) don't have a clean library equivalent; use the formal terms directly for the 23rd pair.

DNA is a polymer of nucleotides held together by complementary base pairs.

DNA is a polynucleotide. Many nucleotide monomers join end to end into a long polymer chain. Each nucleotide has three components: a deoxyribose sugar, a phosphate group, and one of four nitrogenous bases. Adjacent nucleotides within a strand link by phosphodiester bonds, creating a continuous sugar-phosphate backbone with the bases projecting inward.

What a nucleotide contains

Three components, every time: a deoxyribose sugar (a five-carbon pentose), a phosphate group, and one nitrogenous base. The four DNA bases are adenine, thymine, guanine, and cytosine. Phosphodiester bonds link adjacent nucleotides within a strand by joining the phosphate group of one to the sugar of the next.

The bases pair complementarily

Two strands wind together into a double helix. Adenine pairs only with thymine, held by two hydrogen bonds. Guanine pairs only with cytosine, held by three hydrogen bonds. Every pair is one purine (adenine or guanine, double ring) opposite one pyrimidine (thymine or cytosine, single ring), which keeps the helix diameter constant. Chargaff's rules follow: %A = %T, %G = %C.
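The pairing rules are mechanical enough to check in a few lines. The sketch below is illustrative only: the function names are made up for this example, single-letter codes are used purely to keep the code short, and the partner strand is built base by base without modelling strand direction.

```python
# Illustrative sketch: the base-pairing rules as lookup tables.
# Single-letter codes are used only to keep the code short.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # bonds in the pair each base forms


def complement_strand(strand: str) -> str:
    """Return the base-by-base complementary partner of a strand."""
    return "".join(PAIRS[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)


def base_percentages(strand: str) -> dict:
    """Percentage of each base across BOTH strands of the double helix."""
    duplex = strand + complement_strand(strand)
    return {base: 100 * duplex.count(base) / len(duplex) for base in "ATGC"}


strand = "ATGGCA"
print(complement_strand(strand))  # TACCGT
print(hydrogen_bonds(strand))     # 15
print(base_percentages(strand))   # %A equals %T, %G equals %C
```

Because every adenine on one strand faces a thymine on the other (and every guanine a cytosine), the percentages across the whole molecule have to match in pairs, which is all Chargaff's rules say.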

Write adenine, thymine, guanine, and cytosine in full. Don't use single-letter abbreviations (A, T, G, C). Letter codes are GCSE shorthand; at A-Level the full base names are required for credit.

Genes carry information in a triplet code that is degenerate and non-overlapping.

A gene is a sequence of DNA bases coding for the amino acid sequence of one specific polypeptide. The cell reads that sequence in triplets, and each triplet of bases (a codon) specifies one amino acid.

Properties of the genetic code

Each codon is three adjacent bases. Four bases taken three at a time gives 4³ = 64 possible codons, but only 20 standard amino acids exist. The code is therefore degenerate: most amino acids have more than one codon, usually differing only in the third base. It is non-overlapping: each base belongs to one triplet only. Start codons mark where translation begins; stop codons mark where it ends.
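The counting argument and the two properties can be sketched in Python. This is a rough illustration, not a full codon table: `read_triplets` and `CODON_EXCERPT` are names invented here, and the excerpt lists only a handful of triplets from the standard genetic code, written as DNA triplets in single-letter shorthand.

```python
from itertools import product

BASES = ["adenine", "thymine", "guanine", "cytosine"]

# Every ordered choice of three bases: 4 x 4 x 4 possibilities.
all_codons = list(product(BASES, repeat=3))
print(len(all_codons))  # 64


def read_triplets(sequence: str) -> list:
    """Split a base sequence into non-overlapping triplets.

    Each base is used in exactly one triplet; leftover bases that do not
    complete a final triplet are ignored.
    """
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


# Small excerpt of the standard code (assumption: DNA triplets, letters for brevity).
CODON_EXCERPT = {
    "GCT": "alanine", "GCC": "alanine", "GCA": "alanine", "GCG": "alanine",
    "ATG": "methionine",
    "TGG": "tryptophan",
}

print(read_triplets("ATGGCATGG"))  # ['ATG', 'GCA', 'TGG']
print([CODON_EXCERPT[t] for t in read_triplets("ATGGCATGG")])
# ['methionine', 'alanine', 'tryptophan']
```

The four alanine entries differ only in their third base, which is the degeneracy described above; stepping through the sequence three bases at a time, with no base reused, is what non-overlapping means.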

Exons code; introns do not

Eukaryotic genes alternate exons (coding regions) with introns (non-coding intervening sequences). The pre-mRNA transcript carries both. Before translation, introns are excised and exons are spliced together, so the mature mRNA carries only exon-derived sequence. Non-coding multiple repeats are a separate feature: for AQA they sit between genes or at the telomeres, not inside introns or between exons.
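Splicing amounts to "keep these segments, discard what lies between them". A minimal sketch, assuming a made-up transcript and hand-picked exon positions (neither comes from a real gene); uppercase marks exon sequence and lowercase marks intron sequence so the result is easy to check by eye.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon segments in order, discarding everything between them.

    `exons` is a list of (start, end) index pairs, end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical transcript: uppercase = exon, lowercase = intron.
# RNA carries uracil (U) in place of thymine.
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUCUAA"
exon_positions = [(0, 6), (16, 22)]

mature_mrna = splice(pre_mrna, exon_positions)
print(mature_mrna)  # AUGGCCUUCUAA
```

The mature sequence contains no lowercase characters, mirroring the point that only exon-derived sequence reaches translation.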

Write that non-coding multiple repeats sit "between genes" or "at the telomeres". Don't write "in introns" or "between exons". Those are different non-coding features. For introns themselves, write that they do not code for "amino acids" (plural); the singular "amino acid" is rejected.

Eukaryotic DNA is wound around histones into chromosomes; prokaryotic DNA is circular and free.

In eukaryotic cells, DNA is linear and associated with histone proteins. Histones are small, positively charged proteins; the negatively charged phosphate groups of DNA are attracted to them, so the DNA winds tightly around histone cores. The basic structural unit is the nucleosome: a length of DNA wound around one histone core. Nucleosomes coil further into chromatin, which compacts at cell division into the discrete chromosomes visible under a light microscope.

Three sources of DNA inside a cell.

DNA type	Location	Form	Histone association
Eukaryotic nuclear DNA	Inside the nucleus	Linear, multiple chromosomes	Wound around histones
Prokaryotic DNA	In the nucleoid region of the cytoplasm (no nuclear membrane)	Short, circular, single molecule	Not associated with histones
Organelle DNA (mitochondria, chloroplasts)	Inside the organelle	Short, circular	Not associated with histones

"Give two ways" questions on this contrast need both points stated across the cell types. Two prokaryotic-side features paired together (for example "circular" plus "as plasmids") cap the answer at one mark.

Pair contrasts across cell types. "Circular" plus "not associated with histones" earns two marks. "Circular" plus "as plasmids" (both inside the prokaryotic side) caps at one. AQA's max-cap rule fails any pair drawn from the same side of the contrast.

Homologous chromosomes carry the same genes at the same loci; alleles can differ.

Diploid organisms carry chromosomes in matched pairs. Human somatic cells have 46 chromosomes arranged as 23 pairs. The 23rd pair determines sex (XX in females, XY in males). The X and Y differ substantially in size and gene content; they pair only at a small region of homology during meiosis I.

Homologous chromosome definition (the mark-scheme one)

Homologous chromosomes carry the same genes at the same loci in the same order. They are not identical: the two members may carry different alleles at one or more loci. AQA rejects "same alleles" (which implies genetic identity), "identical chromosomes" (same reason), and "one from each parent" alone (true but not a definition of homology). Sister chromatids are something different: two identical strands of one chromosome, joined at the centromere.

Define homologous chromosomes as "same genes at same loci". Don't write "same alleles", "identical chromosomes", or "one from each parent" alone. All three are rejected. And don't confuse homologous chromosomes with sister chromatids: chromatids are two identical strands of one chromosome, not two separate chromosomes.

Locus and allele

A locus is the position on a chromosome where a particular gene is found. An allele is one of the alternative base-sequence variants of that gene that can occupy the locus. A diploid organism has two alleles for each autosomal gene, one on each chromosome of the homologous pair. An organism with two identical alleles at a locus is homozygous for that gene; one with two different alleles is heterozygous.
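One way to hold these terms apart is to model each chromosome of a homologous pair as a mapping from locus to allele. The sketch below is a toy model under stated assumptions: the locus names and allele letters are invented, and real chromosomes carry far more than three genes.

```python
# Hypothetical homologous pair: each chromosome maps locus -> allele.
maternal = {"locus_1": "B", "locus_2": "r", "locus_3": "T"}
paternal = {"locus_1": "b", "locus_2": "r", "locus_3": "T"}


def same_genes_same_loci(chrom_a: dict, chrom_b: dict) -> bool:
    """Homologous test: same loci in the same order; alleles may differ."""
    return list(chrom_a) == list(chrom_b)


def zygosity(chrom_a: dict, chrom_b: dict) -> dict:
    """Classify each locus as homozygous or heterozygous."""
    return {
        locus: "homozygous" if chrom_a[locus] == chrom_b[locus] else "heterozygous"
        for locus in chrom_a
    }


print(same_genes_same_loci(maternal, paternal))  # True
print(maternal == paternal)                      # False: not identical
print(zygosity(maternal, paternal))
# {'locus_1': 'heterozygous', 'locus_2': 'homozygous', 'locus_3': 'homozygous'}
```

The first check passes while the second fails, which is the distinction the definition turns on: the loci match, but the alleles occupying them need not.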

DNA replicates semi-conservatively through a five-event chain.

DNA replication is semi-conservative: each new DNA molecule contains one parental strand and one newly synthesised strand. The mechanism is a five-event chain in fixed order. Each event scores its own mark on the AQA mark scheme, and the events do not merge.

1. DNA helicase acts on the double helix.
2. DNA helicase breaks the hydrogen bonds between complementary bases on the two strands, separating the strands and exposing the bases.
3. Free nucleotides in the surrounding nucleoplasm align opposite the exposed bases of each template strand by complementary base pairing: adenine with thymine, guanine with cytosine.
4. DNA polymerase joins adjacent nucleotides on each new strand.
5. The joining forms phosphodiester bonds between the sugar and phosphate groups of adjacent nucleotides.

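Write that DNA helicase "breaks" the hydrogen bonds. Don't write "hydrolyses hydrogen bonds". Hydrolysis means splitting a covalent bond by adding water; hydrogen bonds are weak attractions between complementary bases, not covalent bonds, so hydrolysis is the wrong process. The credited verb is "break" (or "separate").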

Pitfall — Replication steps 3 and 4 are independent marks

Don't merge steps 3 and 4 into "DNA polymerase catalyses complementary base pairing". That scores one mark instead of two.

The two events are separate. Complementary base pairing positions free nucleotides opposite the template strand. DNA polymerase then joins those positioned nucleotides together. The bases hydrogen-bond to the template; the nucleotides are joined by polymerase through phosphodiester bonds. Keep the two steps distinct in the answer. Examiner reports flag step 3 and step 4 as the consistently dropped pair on this five-mark question.
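Keeping alignment and joining as separate functions makes the distinction concrete. This is a simplified sketch, not a model of the real enzyme chemistry: the function names are invented, bases are single letters for brevity, and strand direction is ignored.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def separate_strands(double_helix: tuple) -> tuple:
    """Helicase acts and hydrogen bonds break: two template strands."""
    strand_1, strand_2 = double_helix
    return strand_1, strand_2


def align_free_nucleotides(template: str) -> list:
    """Complementary base pairing: free nucleotides line up on the template.

    Returns a list of separate nucleotides that are not yet joined.
    """
    return [PAIRS[base] for base in template]


def join_nucleotides(aligned: list) -> str:
    """DNA polymerase joins neighbours, forming phosphodiester bonds."""
    return "".join(aligned)


def replicate(double_helix: tuple) -> list:
    """Return two daughter molecules, each (parental strand, new strand)."""
    daughters = []
    for template in separate_strands(double_helix):
        new_strand = join_nucleotides(align_free_nucleotides(template))
        daughters.append((template, new_strand))
    return daughters


parent = ("ATGC", "TACG")
for old, new in replicate(parent):
    print("parental:", old, "| new:", new)
# parental: ATGC | new: TACG
# parental: TACG | new: ATGC
```

`align_free_nucleotides` returns a list of loose items and `join_nucleotides` turns that list into one continuous strand, so the pairing and the joining are visibly two different jobs. Each daughter keeps one parental strand, which is the semi-conservative outcome.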

The genome is all the DNA in a cell; the proteome is the proteins a cell can produce.

Two definitional terms with strict mark-scheme phrasings. They are often examined as a paired sub-question and score independently: one mark per definition.

Genome (the mark-scheme definition)

The complete set of DNA, or equivalently the complete set of genes, in a cell (or in an organism). "In a cell" is the load-bearing scope. AQA rejects "all the DNA in a species" or "in a population": the genome is per-cell or per-organism, never per-species. "Genetic information", "genetic code", and "genetic constitution" are also rejected; none specifies DNA or genes precisely enough.

Write "in a cell" for the genome. Don't write "in a species" or "in a population". Both are rejected. "Genetic information", "genetic code", and "genetic constitution" are also explicit rejects; use "DNA" or "genes" directly.

Proteome (the mark-scheme definition)

The full range of proteins a cell can produce. "Can produce" is the load-bearing phrase. AQA rejects "all proteins in a cell" (without "can produce") because it describes only the proteins currently expressed, not the cell's full protein-making capacity.

Write "can produce" for the proteome. Don't write "all proteins in a cell" without it. That phrasing describes only currently expressed proteins, not the cell's full capacity to produce them.

Key terms

genes
genome
homologous
hydrogen bonds
nucleotides
DNA polymerase
DNA helicase
complementary
semi-conservative
adenine
same loci
can produce