NCSI Talks

   

Dominant Recessive Sampling Learning Scenario



Learning Scenario - Dominant Recessive Sampling

Basic Model:

Description

This is a simple system model that demonstrates how a recessive allele is passed on to one's offspring. The allele each parent passes on is determined by a random number, and these alleles in turn determine the expression (or lack thereof) of the recessive trait in their kid. If, by random chance, both parents pass on the recessive allele, their kid will exhibit that recessive trait. This can be tied to the inheritance of recessive genetic disorders in real life. When a recessive allele is very rare, the trait is rarely expressed, since a dominant allele, when present, will always "hide" the recessive one. There are no inputs for this model, but it may be run multiple times to gain a better understanding of the concepts surrounding allelic patterns.

Background Information

For any one trait, the offspring of two parents possesses two alleles: one from the mother and one from the father. The dominant allele is always the one expressed, if present. If the parents both pass on the dominant allele or they pass on one dominant allele and one recessive allele, then the offspring will display the dominant trait. If, on the other hand, both parents pass on the recessive allele, the offspring will express the recessive trait. This model demonstrates the probability of one kid inheriting both recessive alleles and thereby expressing that trait. The ratio of the number of those who express the trait to the total population estimates the frequency of the recessive phenotype (and therefore of the homozygous recessive genotype) in a population, and with it the chance that offspring in the population will inherit a recessive disorder.

Science/Math

The fundamental principle behind this model is HAVE = HAD + CHANGE. For each run of the simulation, the following calculations and processes happen:

  1. The dominant or recessive allele is assigned to each parent based on a random number between 0 and 1, where any number less than 0.5 represents dominant and any number greater than 0.5 represents recessive.
  2. The trait exhibited by the kid of the two parents is determined by the alleles inherited. If both alleles inherited are recessive, the kid will exhibit the recessive trait. If not, the kid will exhibit the dominant trait.
  3. The kids who exhibit the recessive trait are assigned a number of 1, and those who do not are assigned a number of 0. These numbers are recorded in the "Recessive?" column.
  4. The number of recessive kids is totaled and recorded in the SUM column, based on the sample size dictated in the COUNT column.
  5. The ratio of recessive kids to total population is calculated and recorded in the RATIO column.

The CHANGE in this model is dictated by preset equations in the spreadsheet. While the model starts with a set of random labels based on random numbers, through computations, the numbers are eventually transformed into the phenotypic frequency.
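
The five steps above can be sketched outside the spreadsheet as well. The following Python fragment is only an illustration of the described process, not the spreadsheet's actual formulas; the function names (simulate_kids, summarize) and the sample sizes in the loop are assumptions chosen for this sketch.

```python
import random

def simulate_kids(count, rng):
    """Return one 0/1 flag per kid (1 = kid exhibits the recessive trait)."""
    flags = []
    for _ in range(count):
        # Step 1: a random number in [0, 1) picks the allele each parent passes on.
        mom_recessive = rng.random() > 0.5
        dad_recessive = rng.random() > 0.5
        # Steps 2-3: the kid is flagged 1 only if both inherited alleles are recessive.
        flags.append(1 if (mom_recessive and dad_recessive) else 0)
    return flags

def summarize(count, rng):
    """Steps 4-5: total the flags (SUM) and divide by the sample size (RATIO)."""
    flags = simulate_kids(count, rng)
    total = sum(flags)
    return total, total / count

if __name__ == "__main__":
    rng = random.Random()  # unseeded, so every run differs, like recalculating the sheet
    for count in (5, 50, 500, 5000):  # assumed sample sizes for this sketch
        total, ratio = summarize(count, rng)
        print(f"COUNT={count:5d}  SUM={total:5d}  RATIO={ratio:.3f}")
```

Running the script several times plays the same role as refreshing the random numbers in the spreadsheet.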

Teaching Strategies

An effective way of introducing this model is to relate the topic to a trait that the students are already familiar with. Ask students to roll their tongue (a dominant trait) and study the class percentage of people who cannot. The number of people who cannot roll their tongue should be less than the number who can. Have the students answer these questions in order to start with a better idea of genetic inheritance:

  1. What do you think will happen if a kid inherits two dominant alleles from his or her parents? One dominant, one recessive? Two recessive? Explain.
  2. If a kid inherits one dominant and one recessive allele, what happens to the recessive allele? Is it still there?
  3. What does the ratio of those who express the recessive trait to those who do not represent?
  4. One dominant trait is the ability to roll your tongue. How many people can roll their tongue in the classroom? What percent of the class cannot roll their tongue? Is this larger or smaller than the expected ratio?

Have students write down their hypotheses for each of these questions so that they can be tested when using the model.

Implementation:

How to use the model

While there are no parameters that can be changed, this model allows the user to refresh the random variables and thereby come up with new dominant or recessive alleles in the parents. To do this, simply press [Ctrl] [=]. Immediately, new random values will be calculated and will change the dominant or recessive alleles of the parents, which will in turn determine the dominant or recessive expression in their kid. Note: Under Excel > Preferences > Calculation, select calculate sheets "Manually", check the box marked "Limit Iteration", and set "Maximum Iterations" = 1. For more information on Excel, reference the Excel tutorial at: http://shodor.org/tutorials/excel/IntroToExcel

Implementation

  1. Explore dominant versus recessive alleles and the role they play in genetics
  2. Interpret graphs of differing sample sizes and understand the importance of the Law of Large Numbers

Objective 1

This objective encompasses the overall concept that the model attempts to teach the user. When two parents contribute dominant or recessive alleles to their offspring, the recessive trait will be "hidden" whenever a dominant allele is present. The trait that is expressed is known as an organism's "phenotype," while the set of alleles is known as its "genotype." Have students explore genotypes by looking at the alleles contributed by the parents and phenotypes by looking at the trait expressed by the kid. Ask the following questions to guide their discovery:

  1. Look at the characteristics of the parents' offspring by looking at the "Kid" column on the Excel sheet. How is each kid's dominant or recessive trait related to the dominant or recessive alleles of their parents?
  2. For the kids who have the recessive trait, what alleles did their parents pass on? Why do you think this is?
  3. Study the column labeled RATIO. What is the percentage of kids who have the recessive trait? Why is the number so low? What is the approximate number of kids who have the recessive trait?
  4. Someone's "genotype" is the genetic makeup of their alleles, while their "phenotype" is the trait expressed. Which is the "Kid" column recording?
  5. The parents of the kids each randomly contribute one of two alleles from their genotype to their kids. Do parents always express the gene that they pass on? Explain.
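
To support the questions above, the relationship between inherited alleles and the expressed trait can be enumerated directly. This is a minimal sketch; the labels "D" (dominant allele) and "r" (recessive allele) and the function names are assumptions made for illustration and do not appear in the spreadsheet.

```python
import itertools

def phenotype(allele_from_mom, allele_from_dad):
    """Return the expressed trait; 'D' = dominant allele, 'r' = recessive allele."""
    # The recessive trait shows only when no dominant allele is present.
    if allele_from_mom == "r" and allele_from_dad == "r":
        return "recessive"
    return "dominant"

def enumerate_outcomes():
    """Map every (mom, dad) allele combination to the kid's phenotype."""
    return {(m, d): phenotype(m, d) for m, d in itertools.product("Dr", repeat=2)}

if __name__ == "__main__":
    for (mom, dad), trait in enumerate_outcomes().items():
        print(f"mom passes {mom}, dad passes {dad} -> kid shows the {trait} trait")
```

Only one of the four combinations produces the recessive trait. If each parent is equally likely to pass on either allele, as in this model, about one kid in four is expected to show it, which is why the RATIO column stays low.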

Objective 2

This model uses samples of differing sizes in order to calculate the percentage of people in the population that express the recessive phenotype. The model uses samples of 5, 50, 500, and 5000 to calculate the ratio with increasing accuracy. The Law of Large Numbers states that as the number of data points increases, the observed average tends toward the expected value, so predictions become more accurate. Therefore, the ratios computed from larger samples should be closer to the actual frequency of the recessive phenotype. Students should study the patterns in the outputs with differing numbers of samples. Ask the following questions to guide their discovery:

  1. Each of the four ratios is an estimate of the frequency of the recessive phenotype in the population. Run the trial several times, noting the change for each estimate in successive runs. Which estimate had the least variance?
  2. What is the count for the ratio that has the least variance? Why would an increased number of kids lead to less variance in determining this frequency?
  3. The Law of Large Numbers states that with an increased number of data points comes an increased accuracy in predictions. Does this make sense in relation to the model?
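
The effect of sample size on the variance of the estimate can also be checked numerically. The sketch below repeats the sampling many times for each sample size and reports the mean and variance of the resulting ratios; the number of repetitions (200) and the function names are assumptions for this illustration, not part of the spreadsheet.

```python
import random
import statistics

def recessive_ratio(count, rng):
    """Fraction of `count` simulated kids who receive two recessive alleles."""
    hits = 0
    for _ in range(count):
        mom_recessive = rng.random() > 0.5
        dad_recessive = rng.random() > 0.5
        if mom_recessive and dad_recessive:
            hits += 1
    return hits / count

def spread_of_estimates(count, runs, rng):
    """Repeat the experiment `runs` times; return (mean, variance) of the ratios."""
    ratios = [recessive_ratio(count, rng) for _ in range(runs)]
    return statistics.mean(ratios), statistics.pvariance(ratios)

if __name__ == "__main__":
    rng = random.Random()
    for count in (5, 50, 500, 5000):
        mean, variance = spread_of_estimates(count, runs=200, rng=rng)
        print(f"COUNT={count:5d}  mean ratio={mean:.3f}  variance={variance:.6f}")
```

The variance should shrink as the count grows, while the mean ratio stays near the same value.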

Extensions:

  1. Explore the mathematical equations for populations at Hardy Weinberg equilibrium
  2. Extend the idea of alleles to chromosomes and karyotypes with an activity
  3. Understand the application of random numbers to the real world

Extension 1

The RATIO column in the model calculates the percentage of people who express the recessive phenotype in a population (the population of kids). At Hardy Weinberg Equilibrium, populations have set relationships for the frequency of each allele. The equations p^2 + 2pq + q^2 = 1 and p + q = 1 govern these relationships, with p being the frequency of the dominant allele and q being the frequency of the recessive allele. Have students research the (hypothetical) requirements for Hardy Weinberg Equilibrium and calculate the frequency of the dominant allele after running the simulation once.

  1. What are the five requirements for Hardy Weinberg Equilibrium? What does it mean to say that a population is in equilibrium?
  2. What variable does the probability calculated in Objective 2 question 2 represent in the equation p^2 + 2pq + q^2 = 1? How would you use this equation and p + q = 1 to find the ratio of kids who have both the recessive and dominant alleles?
  3. What does it mean to be homozygous? To be heterozygous? Does the ratio display the percentage of individuals who are homozygous dominant, homozygous recessive, or heterozygous?
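
For teachers who want to check students' answers, the Hardy Weinberg arithmetic can be sketched as follows. It assumes that the value in the RATIO column is treated as q^2, the frequency of the homozygous recessive genotype, and that the population is at equilibrium; the function name is hypothetical.

```python
import math

def hardy_weinberg_from_recessive_ratio(ratio):
    """Derive allele and genotype frequencies from the recessive phenotype ratio (q^2)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie between 0 and 1")
    q = math.sqrt(ratio)   # recessive allele frequency, since ratio = q^2
    p = 1.0 - q            # dominant allele frequency, since p + q = 1
    return {
        "p": p,
        "q": q,
        "homozygous_dominant": p * p,
        "heterozygous": 2 * p * q,
        "homozygous_recessive": q * q,
    }

if __name__ == "__main__":
    # Example: a RATIO of 0.25 gives q = 0.5, p = 0.5, and 2pq = 0.5.
    for name, value in hardy_weinberg_from_recessive_ratio(0.25).items():
        print(f"{name}: {value:.3f}")
```

The three genotype frequencies always sum to 1, which mirrors the equation p^2 + 2pq + q^2 = 1.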

Extension 2

The Dominant Recessive Sampling model deals with genes that specifically code for a certain trait. The following activity should help in teaching students the overall importance of genes as well as understanding the process by which genetic disorders are found. When someone has their chromosomes karyotyped, the chromosomes are set out in size order. If there is an abnormality in one of the 46 human chromosomes (23 in haploid cells) or there is the wrong number of chromosomes, scientists can detect genetic disorders. The following activity simulates that process:

  1. Prepare and cut out 23 chromosomes, simulating the chromosomes in a human gamete
  2. Make a mutation on one or more of the chromosomes, such as a deletion, duplication, etc.
  3. Have the students order the chromosomes in size order, from largest to smallest
  4. Ask the students to find the chromosome with the mutation and explain what happened in relation to cell reproduction (meiosis in this case)
  5. Make sure students understand the connection between this activity and the genes that were calculated in the Dominant Recessive Sampling model. Ask the following questions after completion of the activity:
    • Where are the genes located on each chromosome? Are there any mutations to any of the genes? Explain.
    • What is the difference between a gene and allele? How are these related to chromosomes?
    • How can karyotyping be used to find genetic disorders?

Extension 3

This extension deals with the random number aspect of the Dominant Recessive Sampling model. Since the numbers produced by a computer's random number generator are typically pseudorandom rather than truly random, the resulting ratio could be affected by bias in the program that is generating the numbers. Give students the links listed under Supplemental Materials to research this idea and supply the HotBits Genuine Random number generator for comparison. Ask the following questions:

  1. What is a pseudorandom number? Would you say that pseudorandom numbers determine the genes of the Mom and Dad in the Dominant Recessive Sampling model?
  2. Why can a computer not output truly random numbers? Explain.
  3. Compare the random number generator below to the HotBits number generator. What is the difference between the two? Are the HotBits numbers truly random?

Supplemental Materials:

  1. Fourmilab Hotbits
  2. Random Number Generator
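
As an additional aid for Extension 3, the idea of a pseudorandom number can be illustrated with a small linear congruential generator. This is a textbook-style sketch only: it is not the generator used by Excel or by either resource listed above, the class name is hypothetical, and the constants are commonly quoted textbook values chosen as an assumption for this example.

```python
class LinearCongruentialGenerator:
    """Pseudorandom generator using the rule state -> (a * state + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        # Assumed illustrative constants; real software may use other algorithms.
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def random(self):
        """Advance the state and return a value in [0, 1)."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

if __name__ == "__main__":
    first = LinearCongruentialGenerator(seed=2024)
    second = LinearCongruentialGenerator(seed=2024)
    # The same seed reproduces the same "random" sequence every time.
    print([round(first.random(), 4) for _ in range(5)])
    print([round(second.random(), 4) for _ in range(5)])
```

Because every value is computed from the previous one by a fixed rule, the sequence is fully determined by the seed. That determinism is what separates pseudorandom numbers from numbers drawn from a physical source.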

Related Models

  • Cystic Fibrosis

    The Cystic Fibrosis model provides a more in-depth understanding of dominant and recessive alleles and how some recessive traits can be deadly. Cystic fibrosis is a recessively inherited trait. This relatively simple simulation allows students to see the offspring resulting from two parents who have or do not have the allele. When the simulation is run, it displays a new set of parents, their genotypes, and whether their offspring inherit the trait. This model will provide methods to calculate the chance of passing on a trait.

  • Problematic Patterns in Random Noise

    The Law of Large Numbers came into play with the Monte Carlo integrals when multiple data points allowed for a better approximation of pi. The Problematic Patterns in Random Noise model expands upon the Law of Large Numbers and allows students to approximate a trend in a set of data points through regression. With a greater number of data points, the regression becomes more accurate and the coefficient of determination becomes closer to 1.0. Monte Carlo integrals focused on having a large number of random numbers, but this model will show applications for the Law of Large Numbers in relation to studying trends.

  • Pseudo Random Numbers

    The random numbers in the Monte Carlo Integrals model used x-values that were randomly generated by Excel. The Pseudo Random Numbers Model explains the flaws in generating random numbers on a computer and how the numbers generated were not truly random. This model will be a good complement to Extension 3 and will allow students to better understand the flaws in the model.

  • DNA Sequence Analysis

    This model allows students to explore genes more in depth by studying the different nucleotide sequences on DNA strands. Users are given strands of DNA sequences and the model calculates the number of the specified nucleotide (C, G, A, T). The process moves in a step-by-step fashion, just as many biological processes do to inspect the strand for errors. The Cystic Fibrosis model focused only on the traits that come from genes, but this one will expand that into the actual sequences that make up and code for each individual allele.