The applications of Mendelian genetics, chromosomal abnormalities, and multifactorial inheritance to medical practice are quite evident. Physicians work mostly with patients and families. However, as important as they may be, genes affect populations, and in the long run their effects in populations have a far more important impact on medicine than the relatively few families each physician may serve. It is important that certain polymorphisms are maintained so that the species may survive, even at the expense of individuals. Genetic polymorphisms often are detrimental to the homozygote, but they allow others of the species to survive. Before medical intervention was possible, populations that lacked the sickle cell anemia allele could not survive in the malaria regions of West Africa. Those that had the sickle cell anemia allele survived, and the gene remains in the population at high frequency today, even though the homozygous recessive phenotype was at a severe disadvantage in the past. The high rate of thalassemia in people of Mediterranean origin, the high rate of sickle cell anemia in people of West African descent, the high rate of cystic fibrosis in people from Western Europe, and the high rate of Tay-Sachs disease in ethnic groups from Eastern Europe may all owe their origin to environmental factors that cause changes in gene frequencies in large populations by giving some advantage to heterozygotes who carry a deleterious allele. Although one may never use the calculations of population genetics in medical practice, the underlying principles should be understood.

Population genetics is also the most widely misused area of human genetics, sometimes bordering on "vigilante genetics," a term coined by Newton Morton. Persons have mistakenly applied population genetics to "prove" race superiority for intelligence and aptitudes, and have misused it in eugenics. As an educated and, I hope, a respected member of your community you must be alert to "vigilante genetics."

Population genetics is concerned with gene and genotype frequencies, the factors that tend to keep them constant, and the factors that tend to change them in populations. It is largely concerned with the study of polymorphisms. It directly impacts counseling, forensic medicine, and genetic screening.



Consider a population of 1000 individuals, all typed with the simplest test at the MN blood group locus. In its simplest form this locus can be reduced to a codominant system with two alleles, M and N. (In reality it is considerably more complex than this, but this simple form will suffice for our examples.) Every individual in the population will be either M (having two M alleles), MN (heterozygous), or N (having two N alleles). Suppose the blood typing results were as follows: 300 M individuals, 600 MN individuals, and 100 N individuals. You probably want to ask, "What is the gene frequency of the M allele in the above population of 1000 individuals?" I'm glad you're interested!

1000 individuals each have two alleles at the MN locus: 1000 x 2 = 2000 genes

Each M individual has 2 M alleles: 300 x 2 = 600 M alleles

Each MN individual has 1 M allele: 600 x 1 = 600 M alleles

There is a total of 1200 M genes in a population of 2000 genes. The gene frequency of the M allele is 1200/2000 = 0.6

I'll bet you want to know, "What is the gene frequency of the N allele?" Well, I'll show you how to find out.

Each MN individual has 1 N allele: 600 x 1 = 600 N alleles

Each N individual has 2 N alleles: 100 x 2 = 200 N alleles

Again, there is a total of 2000 genes in the population for the MN locus. The gene frequency of the N allele is 800/2000 = 0.4

Notice that when there are only two alleles in the population, their gene frequencies must add to 1. If they don't, you've done something wrong. This counting method of calculating the gene frequency must be used whenever the heterozygote can be detected.

Gene frequency = (2 x homozygotes + heterozygotes) / (2 x population)

Gene frequency for one allele = 1 - gene frequency of the other allele

These two general formulas assume nothing of the population, only that it is a single interbreeding group. All other methods make some assumptions of the population in order to simplify calculations.
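For those who would rather let a computer do the counting, the gene counting method can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `allele_frequencies` and its argument names are invented for this example and are not part of the course material.

```python
def allele_frequencies(n_hom1, n_het, n_hom2):
    """Gene counting: return the frequencies of allele 1 and allele 2.

    n_hom1 and n_hom2 are the numbers of the two kinds of homozygotes,
    n_het is the number of heterozygotes.
    """
    total_genes = 2 * (n_hom1 + n_het + n_hom2)  # two alleles per person
    freq1 = (2 * n_hom1 + n_het) / total_genes
    freq2 = (2 * n_hom2 + n_het) / total_genes
    return freq1, freq2


# The MN example from the text: 300 M, 600 MN, 100 N individuals.
freq_M, freq_N = allele_frequencies(300, 600, 100)
print(freq_M, freq_N)  # 0.6 and 0.4, as calculated by hand above
```

Because the heterozygotes are counted directly, nothing about mating patterns or equilibrium has to be assumed here.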


For many human autosomal recessive traits the heterozygote cannot be distinguished from the normal homozygote. When this occurs the Hardy-Weinberg equilibrium is assumed to apply. These authors, Hardy in England and Weinberg in Germany, used different approaches but came to the same conclusions in 1908. They made several assumptions of the population:

  1. Large population
  2. Random mating
  3. No effect of recurrent mutation
  4. No selection against any phenotype
  5. No migration in or out of the population
  6. Autosomal locus

Under these assumptions, Hardy and Weinberg found that the gene frequency and the genotype frequency in the population do not change from generation to generation. Furthermore, if the frequency of the dominant allele A in the founding population was p, and the frequency of the recessive allele a in the founding population was q, then after one generation of random mating the genotype frequencies would remain fixed and would be in the ratio:

$p^2$ (AA) : $2pq$ (Aa) : $q^2$ (aa)

Since there are only two alleles in the population,

$p + q = 1$ and $p^2 + 2pq + q^2 = 1$

If you want to see evidence that this is true, see Figure 20. If, on the other hand, you believe everything you read, and only want to study what will be covered on the examination, continue on.

Consider a population of two types of individuals, 50% AA and 50% aa (p = q = 0.5). Then with random mating one should have the following:

| Mother | Father | Frequency | Offspring AA | Offspring Aa | Offspring aa |
|---|---|---|---|---|---|
| AA | AA | 0.5 x 0.5 | 0.25 | | |
| AA | aa | 0.5 x 0.5 | | 0.25 | |
| aa | AA | 0.5 x 0.5 | | 0.25 | |
| aa | aa | 0.5 x 0.5 | | | 0.25 |
| Total | | | 0.25 | 0.50 | 0.25 |

Even though there were no heterozygotes in the founding population, after one generation of random mating, the genotype frequencies are in the ratio of $p^2$ (AA), $2pq$ (Aa), and $q^2$ (aa), and the gene frequencies remain p = 0.5 and q = 0.5. Both remain fixed in those frequencies for future generations. Using a table similar to the above, the student can easily calculate the frequencies in the next generation.

Figure 20.
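The mating table in Figure 20 can also be worked through by machine. The sketch below is one possible way to do it, not a required method: it lists every mother x father combination, weights it by the product of the parental genotype frequencies (random mating), and distributes the offspring by Mendel's rules. The names `P_TRANSMIT_A` and `next_generation` are made up for this illustration.

```python
from itertools import product

# Probability that a parent of each genotype passes on the A allele.
P_TRANSMIT_A = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}


def next_generation(freqs):
    """Offspring genotype frequencies after one round of random mating.

    freqs maps each genotype ("AA", "Aa", "aa") to its frequency.
    """
    offspring = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    for mother, father in product(freqs, repeat=2):
        pair_freq = freqs[mother] * freqs[father]  # random mating
        a_m = P_TRANSMIT_A[mother]
        a_f = P_TRANSMIT_A[father]
        offspring["AA"] += pair_freq * a_m * a_f
        offspring["Aa"] += pair_freq * (a_m * (1 - a_f) + (1 - a_m) * a_f)
        offspring["aa"] += pair_freq * (1 - a_m) * (1 - a_f)
    return offspring


founders = {"AA": 0.5, "Aa": 0.0, "aa": 0.5}
gen1 = next_generation(founders)  # 0.25 AA, 0.50 Aa, 0.25 aa
gen2 = next_generation(gen1)      # unchanged from gen1
print(gen1)
print(gen2)
```

Running a second generation is the same exercise suggested to the student above: the frequencies stay at 0.25, 0.50, and 0.25.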

Hopefully, someone will ask the question, "Is there any evidence that the human population meets the requirements of Hardy-Weinberg equilibrium, or is this just a mental exercise?" Of course there is evidence! Consider the following:

In my experience, one may use several criteria for selecting a person to mate with, but one usually doesn't select a mate based on blood types at the MN blood group locus. Therefore, we might assume that this locus would be a good test of random mating. All of the other Hardy-Weinberg criteria also seem to be met. Mutations at this autosomal locus are rare. We know of no selective advantage or disadvantage in the present environment. And migration wouldn't be much of a factor if we took the sample at one short interval of time. This locus should provide a good test.

We have already seen that gene frequencies and genotype frequencies for this locus can be determined without using assumptions of Hardy-Weinberg equilibrium. Let's see if a real population sample is distributed as $p^2$ (M), $2pq$ (MN), $q^2$ (N).

In 1975, Race and Sanger reported the typing results from 1279 individuals in London. They were not collecting these data for the purpose of testing for Hardy-Weinberg equilibrium, so they could not be accused of typing individuals until a certain distribution was achieved, a question that has always remained about Mendel's original studies. Race and Sanger found 363 persons were M, 634 were MN, and 282 were N. Using our original method of calculating gene frequencies, the frequency of the M allele (p) would be:

$p = \frac{(2 \times 363) + 634}{2 \times 1279} = 0.53167$

The frequency of the N allele (q) would be:

$q = \frac{(2 \times 282) + 634}{2 \times 1279} = 0.46833$

If the population were in Hardy-Weinberg equilibrium, then the number of M individuals should be $p^2 \times 1279$, the number of MN individuals should be $2pq \times 1279$, and the number of N individuals should be $q^2 \times 1279$, or

| | M | MN | N |
|---|---|---|---|
| Observed | 363 | 634 | 282 |
| Expected | 361.54 | 636.93 | 280.53 |

For the MN blood group locus there can be little doubt that the conditions for Hardy-Weinberg equilibrium are met in the human population, at least the population in London where the sample was taken. The observed frequencies closely approximate what would be expected if the population were in Hardy-Weinberg equilibrium.
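The comparison of observed and expected numbers can be reproduced with a short Python sketch. The function name `hardy_weinberg_expected` is invented for this example. The chi-square statistic at the end is an addition that does not appear in the discussion above; it is included only as one common way of putting a number on "closely approximate," using the standard 5% critical value of 3.84 for one degree of freedom.

```python
def hardy_weinberg_expected(n_hom1, n_het, n_hom2):
    """Expected genotype counts if the sample were in Hardy-Weinberg equilibrium."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # gene counting, no assumptions
    q = 1 - p
    return p * p * n, 2 * p * q * n, q * q * n


# Race and Sanger's London sample: 363 M, 634 MN, 282 N.
observed = (363, 634, 282)
expected = hardy_weinberg_expected(*observed)

# Extra step (an assumption of this sketch, not from the text above):
# chi-square goodness of fit, compared with 3.84 (1 degree of freedom, 5% level).
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(e, 2) for e in expected])
print(chi_square, chi_square < 3.84)
```

The expected counts agree with the table to within rounding of p and q, and the chi-square value is far below 3.84.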

This gives us the assurance that we can use Hardy-Weinberg as a method when the heterozygote cannot be detected. An example of the use of the Hardy-Weinberg principle in medical genetics is given below.

Suppose there is an autosomal recessive disease where the frequency of affected individuals in the population is 1/10,000. If the population is in Hardy-Weinberg equilibrium, this frequency would equal $q^2$. The gene frequency of the recessive allele (q) would then be the square root of $q^2$, or the square root of 1/10,000, which equals 1/100. The carrier (heterozygote) frequency (2pq) is usually approximated as 2q since p (0.99) is so close to 1. The carrier frequency is then 1/50.

For an autosomal recessive disease with a population frequency of 1/10,000, the carrier frequency is 1/50. Put another way, on average, as many as 3 or 4 first year medical students at UIC are carriers of such a disease.
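As a quick check on how good the 2q approximation is, here is a minimal Python sketch. It assumes Hardy-Weinberg equilibrium, exactly as the text does; the function name `carrier_frequency` is hypothetical.

```python
import math


def carrier_frequency(disease_frequency):
    """Return (q, exact carrier frequency 2pq, approximation 2q).

    Assumes an autosomal recessive disease in Hardy-Weinberg equilibrium,
    so that the disease frequency equals q squared.
    """
    q = math.sqrt(disease_frequency)
    p = 1 - q
    return q, 2 * p * q, 2 * q


q, exact, approx = carrier_frequency(1 / 10_000)
print(q)       # about 0.01, or 1/100
print(exact)   # about 0.0198
print(approx)  # about 0.02, or 1/50
```

The exact value (about 1 in 50.5) and the approximation (1 in 50) differ by only about one percent, which is why the shortcut is so widely used for rare diseases.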

From time to time, certain groups have suggested that the way to eliminate a deleterious disease from the population is to not allow affected individuals to mate. The above example should provide some evidence that this will have little effect on gene frequencies in the population. Although the frequency of the disease is only 1/10,000, (we should have one affected first year medical student at UIC every 50 years) the carrier frequency is 1/50 (we should have 3 or 4 carriers at UIC in every incoming class). These phenotypically normal carriers will keep the gene in the population.

If, by chance, a student in the first year class has a sibling with an autosomal recessive disease that is present at birth, the student would have a 2/3 chance of being a carrier. If that student were to have a child with an unrelated partner selected at random from the general population, and the disease frequency in the general population is 1/10,000, the probability of their child being affected is:

2/3 (prob. of student being a carrier) x 1/50 (prob. of a random individual being a carrier) x 1/4 (prob. of two carriers producing an affected child) = 2/3 x 1/50 x 1/4 = 1/300

Compare that to the probability that two unrelated individuals, with no history of the disease in their families would have an affected child, when the carrier frequency is 1/50:

1/50 x 1/50 x 1/4 = 1/10,000
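The two risk calculations can be checked with exact fractions. This is a small sketch using Python's standard `fractions` module; the function name `affected_child_risk` is made up for the illustration.

```python
from fractions import Fraction


def affected_child_risk(carrier_prob_1, carrier_prob_2):
    """Risk of an affected child for an autosomal recessive disease.

    Both parents must be carriers, and two carriers have a 1/4 chance
    of producing an affected child.
    """
    return carrier_prob_1 * carrier_prob_2 * Fraction(1, 4)


# Student with an affected sibling (2/3) and a random partner (1/50).
sibling_case = affected_child_risk(Fraction(2, 3), Fraction(1, 50))

# Two unrelated people with no family history (1/50 each).
general_case = affected_child_risk(Fraction(1, 50), Fraction(1, 50))

print(sibling_case)                 # 1/300
print(general_case)                 # 1/10000
print(sibling_case / general_case)  # 100/3
```

Having an affected sibling raises the risk by a factor of 100/3, or roughly 33 times the risk in the general population.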

Since it is a stated goal of medicine to do what is best for the patient, what happens to genes in populations when exceptions to Hardy-Weinberg occur?


Although mutation rates are usually very low, geneticists have long been concerned about environmental factors that will lead to even slight increases. There are two general types of mutation: a mutation that changes a gene that makes a functional product into a gene that makes a nonfunctional product (forward mutation), and a mutation that changes a gene that makes a nonfunctional product into a gene that makes a functional product (reverse mutation). Several events can lead to a forward mutation (a base change, base insertion, base deletion, etc.), but a reverse mutation must correct the specific change that produced the original forward mutation. For example, if a single base deletion caused the original forward mutation, then that base must be re-inserted in exactly the same place for a reverse mutation to occur. In general, forward mutations occur at a frequency that is at least 10 times that of reverse mutations. A method of estimating forward mutation rates is given in Gelehrter, Collins, and Ginsburg, 2nd ed., Chapter 4. Students will be well advised to read this chapter carefully.

If µ is the forward mutation rate from a functional to a nonfunctional allele, and v is the reverse mutation rate from a nonfunctional allele to a functional allele at the same locus, an equilibrium will be established between these two mutation rates that determines q, the gene frequency of the nonfunctional allele.

At equilibrium, q = µ/(µ+v)

If v is truly one tenth the frequency of µ, then we can assign the value 1 for v and 10 as the value for µ. The above equation reduces to

$q_{equil} = 10/(10+1) = 10/11 \approx 0.909$

Gene frequencies for nonfunctional alleles tend to increase in the population because of recurrent mutation. They will not entirely eliminate functional alleles but they tend to replace them, and can, if no other factors are involved, reach very high frequencies.

As a possible human example of the effects of recurrent mutation consider the following. In the ABO blood group system, there are two functional alleles, A and B. Alleles A and B control transferase enzymes that attach the proper sugar molecule (N-acetylgalactosamine for A, galactose for B) to a common precursor substance. Most likely, B was the result of a rare mutation of the A allele. O is a nonfunctional allele that recognizes no substrate, and no sugar molecule is transferred, leaving the precursor unchanged. In the ABO system, O is now the most frequent allele. If there is no selective advantage, O should continue to increase at the expense of A and B.

The derivations of the equations used to calculate the effects of recurrent mutation are shown in Figure 21. Again, if you are interested only in studying for possible test questions, this material is not required.

Recurrent Mutation

Assume a population of N individuals with two alleles at a locus, D with a frequency of p and d with a frequency of q. At generation 0 there will be 2Np D alleles, or 2N(1-q) D alleles, and 2Nq d alleles. Assume D mutates to d at a frequency of µ and that d mutates to D at a frequency of v. Assume that µ is 10 times as frequent as v. Then at generation 1 the number of d alleles ($2Nq_1$) would be:

$2Nq_1 = 2Nq$ (from gen. 0) $+\ 2N(1-q)\mu$ (mutations from D to d) $-\ 2Nqv$ (mutations from d to D)

This reduces to:

$q_1 = q + (1-q)\mu - qv$

so the change in q is $\Delta q = q_1 - q = q + (1-q)\mu - qv - q$

At equilibrium the change in q = 0, so at equilibrium $0 = q + (1-q)\mu - qv - q$, or $qv = (1-q)\mu$, or $qv = \mu - q\mu$

This reduces to q (at equilibrium) $= \mu/(\mu + v)$

Figure 21.
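The recursion in Figure 21 can be run forward generation by generation to watch q creep toward its equilibrium. The Python sketch below does this. The mutation rates are purely hypothetical values chosen so the loop finishes quickly (real mutation rates are much lower and the approach to equilibrium correspondingly slower), and the function names are invented for this example.

```python
def next_q(q, mu, v):
    """One generation of recurrent mutation: q1 = q + (1 - q)*mu - q*v."""
    return q + (1 - q) * mu - q * v


def equilibrium_q(mu, v):
    """Predicted equilibrium frequency of the nonfunctional allele."""
    return mu / (mu + v)


# Hypothetical rates for illustration only; forward rate is 10x the reverse rate.
MU = 1e-4
V = 1e-5

q = 0.0  # start with no nonfunctional alleles at all
for _ in range(200_000):
    q = next_q(q, MU, V)

print(q)                     # approaches 10/11
print(equilibrium_q(MU, V))  # 10/11, about 0.909
```

Only the ratio of the two rates sets the equilibrium; their absolute size sets how many generations it takes to get there.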


One factor assumed in the discussion of recurrent mutation was that the nonfunctional allele and the functional allele have the same selective advantage. This may be true of the ABO blood group system, but it is not usually true of autosomal recessive diseases. The disease state, by definition, is always a deleterious phenotype. In autosomal recessive diseases the phenotype is almost always the result of nonfunctional alleles in the homozygous state. If left untreated the recessive phenotype for a disease would be less fit than the heterozygote or normal homozygote. How does selection against the homozygous recessive individual affect gene frequencies in the population?

Fitness, to a geneticist, is not the same as fitness to a movie director or a sports columnist. Fitness is not measured by physical attributes; it is measured by the number of offspring produced in the next generation that survive and reproduce. In a hunting-gathering society, the most fit person may have been the nearsighted male who could not go on the hunt because he would stumble and make too much noise. If he were left behind to gather fruit and berries with the women, he may have become the most fit person in the tribe. Grandchildren, great-grandchildren, etc., are the best measures of the fitness of an individual. This has always been my favorite explanation of why so many of us are nearsighted, and why society changed from hunting-gathering to agriculture. It's all population genetics!

The most fit phenotype in the population is assigned a fitness of 1. If there are two equally fit phenotypes, each is assigned a fitness of 1. Those less fit must be assigned a fitness of less than 1. The difference between 1 and the fitness value is called the selection coefficient. The relationship between fitness, w, and the selection coefficient, s, is given by the equation, w = 1-s. The textbook uses f as the symbol for fitness, although historically most geneticists reserve f as the symbol for the inbreeding coefficient and use w as the symbol for fitness.

The effect of selection against the recessive phenotype is that, no matter how small the selection coefficient, as long as s is not 0, recessive alleles will be lost at each generation until no more remain in the population. Selection tends to remove nonfunctional recessive alleles from the population; recurrent mutation tends to create nonfunctional recessive alleles in the population. The derivations of the effects of selection against the recessive phenotype are shown in Figure 22. Again, the material in Figure 22 will not be examined in this course.

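Selection Against the Recessive Phenotype

| | DD | Dd | dd | Total |
|---|---|---|---|---|
| Generation 0 | $p^2$ | $2pq$ | $q^2$ | 1 |
| Fitness | 1 | 1 | $1-s$ | |
| Generation 1 | $p^2$ | $2pq$ | $(1-s)q^2$ | $1 - sq^2$ |

The frequency of the d allele in generation 1, $q_1$, = (2 x homozygotes + heterozygotes) / (2 x total)

$q_1 = \frac{2(1-s)q^2 + 2pq}{2(1 - sq^2)}$, and $\Delta q$, the change in q, $= q_1 - q$

$\Delta q = \frac{(1-s)q^2 + (1-q)q}{1 - sq^2} - q$, which reduces to $\Delta q = \frac{-spq^2}{1 - sq^2}$

As long as s is not 0 and both alleles are still present, $\Delta q = 0$ only when q = 0. There will be no equilibrium until the recessive allele is eliminated.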

Figure 22.
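To get a feel for how slowly selection against the recessive phenotype actually works, the Generation 1 formula from Figure 22 can be applied repeatedly. The Python sketch below is illustrative only; the function names are invented, and the starting frequency of 0.01 is simply the value used in the earlier carrier example.

```python
def select_against_recessive(q, s):
    """One generation of selection against dd: returns q1 from q."""
    p = 1 - q
    mean_fitness = 1 - s * q * q  # the Total column for Generation 1
    return ((1 - s) * q * q + p * q) / mean_fitness


def generations_of_selection(q0, s, generations):
    """Apply the selection step repeatedly, starting from q0."""
    q = q0
    for _ in range(generations):
        q = select_against_recessive(q, s)
    return q


# Even with s = 1 (affected individuals never reproduce), q falls slowly.
q_after_100 = generations_of_selection(0.01, 1.0, 100)
print(q_after_100)  # about 0.005: only halved after 100 generations
```

With s = 1 the recursion simplifies to q1 = q/(1 + q), so starting from q = 1/100 it takes 100 generations just to halve the gene frequency. This is the same point made earlier about preventing affected individuals from mating: the carriers keep the gene in the population.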


Since mutation tends to increase nonfunctional alleles in the population, and selection against the recessive phenotype tends to remove them, is there a point where these two will reach an equilibrium where gene frequencies remain stable from generation to generation? Again, if µ is the mutation rate, and s is the selection coefficient, an equilibrium will be reached when

$\mu = sq^2$

If the fitness of the homozygous recessive individual is 0, that is, the individual with that phenotype cannot reproduce, then s equals 1 and the above equation reduces to

$\mu = q^2$

The disease frequency cannot go lower than the recurrent mutation rate, even if affected individuals cannot reproduce.

The derivations of these equations are shown in Figure 23.

Balance Between Selection Against the Recessive Phenotype and Recurrent Mutation

For mutation, the change in q $= \mu - q\mu - qv$. For selection, the change in q $= -spq^2/(1 - sq^2)$. If they balance at an equilibrium, the net effect is that they should sum to 0.

$\mu - q\mu - qv - \frac{spq^2}{1 - sq^2} = 0$

To simplify calculations, we will get rid of second order terms: $qv$ is only 1/10 of $q\mu$ and can be eliminated. Similarly, $sq^2$ is very small in the denominator when compared to 1, and can be eliminated. This reduces the equation to

$\mu - q\mu - spq^2 = 0$ to first order magnitude.

This reduces to $\mu - q\mu = s(1-q)q^2$, or $\mu(1-q) = (1-q)sq^2$

At equilibrium, $\mu = sq^2$ to first order magnitude.

Figure 23.
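The balance in Figure 23 can be checked numerically. The sketch below solves µ = sq² for q and then confirms that, at that q, the gain from mutation and the loss from selection very nearly cancel. The mutation rate used is a hypothetical round number, reverse mutation is ignored as in the simplification above, and the function names are made up for the example.

```python
import math


def balance_q(mu, s):
    """Equilibrium gene frequency from mu = s * q**2."""
    return math.sqrt(mu / s)


def net_change(q, mu, s):
    """Change in q from forward mutation plus selection against dd.

    Reverse mutation (q*v) is dropped, as in the first-order derivation.
    """
    p = 1 - q
    gain_from_mutation = mu * p
    loss_from_selection = s * p * q * q / (1 - s * q * q)
    return gain_from_mutation - loss_from_selection


MU = 1e-5  # hypothetical forward mutation rate, for illustration only

q_lethal = balance_q(MU, 1.0)  # affected individuals never reproduce
print(q_lethal)                # about 0.00316
print(q_lethal ** 2)           # disease frequency equals mu, about 1e-5
print(net_change(q_lethal, MU, 1.0))  # essentially zero
```

The leftover change is of the order of µ squared, which is the size of the terms that were dropped.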


Some genes exist at a rather high frequency in the population because the heterozygote is more fit than either homozygote. The best documented example of this is sickle cell anemia in West Africa. In West Africans there are three major genotypes at the sickle cell locus, each producing a different phenotype: AA, or normal individuals; AS, or heterozygous individuals (often called carriers); and SS, individuals who will have sickle cell anemia. Without medical intervention, SS individuals will have a fitness less than 1. In the falciparum malaria environment of West Africa, AA and AS individuals get malaria, but AS individuals usually have much milder cases of the disease and usually survive, while AA individuals are less likely to do so. The heterozygote is the most fit phenotype of the three. If the selection coefficient against the homozygous normal AA individual is t, the selection coefficient against the homozygous SS individual is s, p is the frequency of the A allele, and q is the frequency of the S allele, then an equilibrium will be reached in which

p = s/(s + t) and q = t/(s + t). The gene frequencies at equilibrium are determined only by the relative sizes of the selection coefficients, not by their absolute magnitudes.

The derivations of these formulas are shown in Figure 24. Again, you are not responsible for knowing how to derive these formulas.

Balanced Polymorphism

| | DD | Dd | dd | Total |
|---|---|---|---|---|
| Generation 0 | $p^2$ | $2pq$ | $q^2$ | 1 |
| Fitness | $1-t$ | 1 | $1-s$ | |
| Generation 1 | $p^2(1-t)$ | $2pq$ | $q^2(1-s)$ | $1 - tp^2 - sq^2$ |

The frequency of the d allele at generation 1, $q_1 = \frac{2pq + 2q^2(1-s)}{2(1 - tp^2 - sq^2)}$

Again the change in q, $\Delta q$, $= q_1 - q$, and at equilibrium, $\Delta q = 0$

$0 = \frac{pq + (1-s)q^2}{1 - tp^2 - sq^2} - q$. Substituting (1-q) for p, this equation will reduce to:

$0 = -spq + tp^2$, or $sq = tp$

When (1-q) is substituted for p or (1-p) is substituted for q, this reduces to:

$q = t/(s + t)$ and $p = s/(s + t)$.

Figure 24.
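The equilibrium in Figure 24 is a stable one, and that is easy to see by iterating the Generation 1 formula from different starting points. The Python sketch below does this. The selection coefficients are hypothetical numbers picked for illustration, not measured values for sickle cell anemia, and the function names are invented.

```python
def next_q_balanced(q, s, t):
    """One generation with heterozygote advantage.

    t is the selection coefficient against DD, s against dd.
    """
    p = 1 - q
    mean_fitness = 1 - t * p * p - s * q * q
    return (p * q + (1 - s) * q * q) / mean_fitness


def iterate_balanced(q0, s, t, generations=2000):
    """Apply the selection step repeatedly, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_q_balanced(q, s, t)
    return q


# Hypothetical coefficients: predicted equilibrium q = t/(s + t) = 0.2.
print(iterate_balanced(0.01, s=0.8, t=0.2))  # rises toward 0.2
print(iterate_balanced(0.90, s=0.8, t=0.2))  # falls toward 0.2

# Halving both coefficients leaves the equilibrium where it was.
print(iterate_balanced(0.01, s=0.4, t=0.1))  # also approaches 0.2
```

Whether the allele starts out rare or common, q settles at t/(s + t), and only the relative sizes of s and t matter.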


Assortative mating in humans may occur to a limited degree for traits such as intelligence. In some studies, married couples have higher correlation coefficients for intelligence than do siblings. In modern western culture, we tend to marry someone who is about our own intelligence, although this is probably an oversimplification. If intelligence were controlled by a single genetic locus with two alleles, S for smart and D for dumb, then three phenotypes would be possible, SS for smart persons, SD for persons with average intelligence, and DD for persons who are mentally challenged. Of course, we know that intelligence is a multifactorial trait and not a single gene trait, but it is interesting to see what would happen if it were a single gene trait with assortative mating where smart persons were only allowed to mate with smart persons, average persons with average persons, and mentally challenged only with mentally challenged. Strangely enough the gene frequencies do not change, only the genotype frequencies. The results are shown in Figure 25.

Assortative Mating
Suppose a population started out as all heterozygotes, but heterozygotes mate only with other heterozygotes, homozygous dominant with homozygous dominant, and homozygous recessive with homozygous recessive. Then, over time, the following would result:
At each generation, half of the heterozygotes are lost with no change in gene frequency.

Figure 25.
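The halving of the heterozygotes can be followed generation by generation with a short Python sketch. It assumes complete assortative mating by genotype, as described in Figure 25: homozygotes breed true, and heterozygote x heterozygote matings give the usual 1/4 : 1/2 : 1/4 offspring. The function names are made up for this example, and exact fractions are used so nothing is lost to rounding.

```python
from fractions import Fraction


def assortative_generation(f_dom, f_het, f_rec):
    """Genotype frequencies after one generation of complete assortative mating."""
    # Het x het matings return 1/4 to each homozygote class and keep 1/2 as hets.
    return f_dom + f_het / 4, f_het / 2, f_rec + f_het / 4


def allele_frequency(f_dom, f_het, f_rec):
    """Frequency of the allele carried twice by the first homozygote class."""
    return f_dom + f_het / 2


freqs = (Fraction(0), Fraction(1), Fraction(0))  # all heterozygotes
history = [freqs]
for _ in range(5):
    freqs = assortative_generation(*freqs)
    history.append(freqs)

for generation, f in enumerate(history):
    print(generation, f[0], f[1], f[2], allele_frequency(*f))
```

The heterozygote column runs 1, 1/2, 1/4, 1/8, and so on, while the last column, the gene frequency, stays at 1/2 throughout.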

Two different populations result, one smart, the other mentally challenged. Average gets lost. In this idealized model, assortative mating eventually results in two species being formed from one.


Gene frequencies in small isolate populations do not reflect those of the larger founding population from which they were derived because of two factors, founder effect and random genetic drift. Founder effect occurs when the population grew from a few founding individuals. A few individuals cannot represent all of the genomes of the founding population. As we discussed before, each of us is carrying from 1 to 8 mutant genes in the heterozygous state, even though we are normal. When the founding population is small, intermarriage must result even though steps are taken to avoid it. The mutations carried by the founders are in higher frequency than they would be in the general population from which the founders came. Island populations founded by pirates or shipwreck, that were isolated for several generations tend to have different gene and genotype frequencies because of founder effect. Similarly, religious isolates, where marriage outside the religion is forbidden, also have founder effects.

Even if the founders of small isolate populations had exactly the same genotypes and gene frequencies of the original parent population, gene and genotype frequencies would change because of random genetic drift. Random genetic drift occurs because a small population cannot maintain randomness. Consider a population with 10 individuals with only two alleles at a locus, D with a frequency of 0.5 and d with a frequency of 0.5. By chance alone one would expect to find 10 D and 10 d gametes being passed to the next generation. But one may find 11 D and only 9 d gametes. The next generation, one could find 10 and 10 again, or could find 12 and 8. But suppose after drifting to 12 D and 8 d, by chance a really skewed sampling occurred and one got 15 D and 5 d. It would be difficult, if not impossible to get back to the original 10 D and 10 d. Sampling errors in small populations are always going to occur if given enough opportunities. These errors assure that random genetic drift will always occur. Isolate populations never have the same gene and genotype frequencies as their founding populations.
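Random genetic drift is easy to imitate with a random number generator. The sketch below is a bare-bones simulation under simple assumptions that go beyond the text: each generation the 2N gametes are drawn independently, each being D with probability equal to the current frequency of D, and population size stays constant. The function name `simulate_drift` and the seed value are arbitrary choices for the illustration.

```python
import random


def simulate_drift(n_individuals, p0, generations, seed):
    """Follow the frequency of allele D under random sampling of gametes.

    Returns the list of D frequencies, starting with p0.
    """
    rng = random.Random(seed)      # seeded so a run can be repeated
    n_gametes = 2 * n_individuals  # two gametes form each individual
    p = p0
    history = [p]
    for _generation in range(generations):
        # Draw each gamete at random: it is D with probability p.
        count_D = sum(1 for _gamete in range(n_gametes) if rng.random() < p)
        p = count_D / n_gametes
        history.append(p)
    return history


# The population from the text: 10 individuals, D and d both at 0.5.
trajectory = simulate_drift(n_individuals=10, p0=0.5, generations=50, seed=1)
print(trajectory)
```

Different seeds give different paths, which is the whole point. Once a run reaches 0 or 1 it stays there, because an allele that has been lost cannot be sampled again.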


It is obvious that the major difference between autosomal loci and X-linked loci in populations is that the males (usually half the population) have only one X. Males cannot have the distribution $p^2$, $2pq$, and $q^2$: because they have only one X, they carry either the normal allele (frequency p) or the recessive allele (frequency q). In males, gene and genotype frequencies are the same. Thus, the genotype frequencies in the male and female can never be the same. In addition, there can be no heterozygote x heterozygote mating class since there are no male heterozygotes, and as of this date two females cannot mate with each other and produce a child. X-linked traits can reach stable gene frequencies in males and females, but cannot reach Hardy-Weinberg equilibrium.
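The practical consequence for an X-linked recessive trait can be put into numbers. The sketch below rests on an assumption that should be stated plainly: gene frequencies have become stable and equal in the two sexes, so that females can be treated as $p^2$, $2pq$, and $q^2$ while males are simply p and q. The function name and the value q = 0.01 are hypothetical.

```python
def x_linked_frequencies(q):
    """Phenotype frequencies for an X-linked recessive allele of frequency q.

    Assumes stable gene frequencies that are the same in males and females.
    """
    p = 1 - q
    return {
        "affected_males": q,          # one X: genotype frequency = gene frequency
        "carrier_females": 2 * p * q,
        "affected_females": q * q,
    }


result = x_linked_frequencies(0.01)
print(result)
print(result["affected_males"] / result["affected_females"])  # about 100
```

With q = 1/100, about 1 male in 100 is affected but only about 1 female in 10,000, which is why X-linked recessive diseases are seen so much more often in males.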


Contact Dr. Robert Tissot with questions about the content of these pages.

Contact Dr. Elliot Kaufman, Course Director with questions about the functionality of these pages.