I often want to recall how this stuff works, but I keep forgetting. Here's a short summary of computing the degree of genetic relatedness of two relatives.

Preliminaries

Human cells are diploid, meaning that each cell has two copies of each chromosome. The two copies contain the same genes at the same loci, but possibly different versions of those genes, called alleles. One copy comes from the father, the other from the mother.

Sex (sperm/egg) cells are produced by the process of meiosis. Since each sex cell contains only one copy of each chromosome, the information from both copies (the father's and the mother's) is mixed by the process of recombination to create it. For the discussion below we can assume that about half of the genes inherited from the father and half of those inherited from the mother make their way into it.

At conception, when a male's sperm cell fuses with a female's egg cell, the matching chromosomes pair up and we once again have a normal diploid cell, from which the embryo develops.

Basics of relatedness

So what is genetic relatedness? Roughly defined, it's the percentage of genes that two people share by common descent.

Let's consider the happy family of Mom, Dad, Sonny and Junior (parents + 2 children, who aren't identical twins).

By the process described in the previous section, it should be clear that when one of Dad's sperm cells fuses with one of Mom's eggs, the newly created embryo will have half of Dad's genes and half of Mom's genes. Therefore the relatedness of a child to either of his parents is 50%.

What about Sonny and Junior? How are they related? Sonny's genes are half Mom's and half Dad's, and so are Junior's. But they're not the same halves!

Think of it this way: Dad has some gene G. The chance that it made it to Sonny is 50%. So is the chance that it made it to Junior. The chance that it made it to both is therefore 50% x 50% = 25%, so on average about 25% of Sonny's and Junior's genes are shared through their father [1]. But that's not all, there's also Mom. By the same reasoning, 25% of Sonny's and Junior's genes are shared through their mother. In total, 50% of Sonny's and Junior's genes are shared. Therefore, the relatedness of two full siblings is 50%.
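
The counting argument above can be checked with a few lines of Python. This is only a sketch under the same simplifying assumption used in the text (each of the four outcomes for gene G is equally likely); the variable names are my own.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a pair (G reached Sonny, G reached Junior).
outcomes = list(product([True, False], repeat=2))

# Assumption from the text: all four outcomes are equally likely.
p_each = Fraction(1, len(outcomes))

# Probability that a given gene of one parent ended up in both children.
p_both = sum(p_each for in_sonny, in_junior in outcomes if in_sonny and in_junior)

shared_via_dad = p_both
shared_via_mom = p_both  # same reasoning applies to Mom
full_sibling_relatedness = shared_via_dad + shared_via_mom

print(p_both)                    # 1/4
print(full_sibling_relatedness)  # 1/2
```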

Some more calculations

From here let's leave the detailed analysis alone and just focus on the resulting maths, which is quite simple. What about half siblings? Say Sonny's and Junior's fathers are different people. They still share 25% of their genes through Mom, but no genes through their fathers, so they're only 25% related.

Uncles: Sonny's children will be 25% related to Junior. This is because Sonny and Junior are 50% related, and Sonny's genes are further 50% "diluted" by his wife's genes in his children, so it's 25% total.

Cousins: Since we saw that Junior is 25% related to his nephews, his children are 12.5% related to their cousins, because Junior's genes are mixed with his wife's genes 50-50 in them.
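
The three results above are just repeated halving, which can be written out as a short sketch. The names below are mine, and the 50% "dilution" per generation is the same approximation used throughout this post.

```python
from fractions import Fraction

HALF = Fraction(1, 2)       # each generation passes on about half of a parent's genes
VIA_ONE_PARENT = Fraction(1, 4)  # share of genes two siblings have in common through one parent

full_siblings = VIA_ONE_PARENT + VIA_ONE_PARENT  # shared through Mom and through Dad
half_siblings = VIA_ONE_PARENT                   # shared through Mom only

# Sonny's genes are diluted by half in his children.
uncle_and_nephew = full_siblings * HALF

# Junior's genes are diluted by half in his own children as well.
first_cousins = uncle_and_nephew * HALF

print(full_siblings, half_siblings, uncle_and_nephew, first_cousins)  # 1/2 1/4 1/4 1/8
```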

A method

Here's a computation method suggested by Richard Dawkins in "The Selfish Gene" (pp. 91-92):

First identify all the common ancestors of A and B. For instance, the common ancestors of a pair of first cousins are their shared grandfather and grandmother. Once you have found a common ancestor, it is of course logically true that all his ancestors are common to A and B as well. However, we ignore all but the most recent common ancestors. In this sense, first cousins have only two common ancestors. If B is a lineal descendant of A, for instance his great grandson, then A himself is the 'common ancestor' we are looking for.

Having located the common ancestor(s) of A and B, count the generation distance as follows. Starting at A, climb up the family tree until you hit a common ancestor, and then climb down again to B. The total number of steps up the tree and then down again is the generation distance. For instance, if A is B's uncle, the generation distance is 3. The common ancestor is A's father (say) and B's grandfather. Starting at A you have to climb up one generation in order to hit the common ancestor. Then to get down to B you have to descend two generations on the other side. Therefore the generation distance is 1 + 2 = 3.

Having found the generation distance between A and B via a particular common ancestor, calculate that part of their relatedness for which that ancestor is responsible. To do this, multiply 1/2 by itself once for each step of the generation distance. If the generation distance is 3, this means calculate 1/2 x 1/2 x 1/2 or (1/2)^3. If the generation distance via a particular ancestor is equal to g steps, the portion of relatedness due to that ancestor is (1/2)^g.

But this is only part of the relatedness between A and B. If they have more than one common ancestor we have to add on the equivalent figure for each ancestor. It is usually the case that the generation distance is the same for all common ancestors of a pair of individuals. Therefore, having worked out the relatedness between A and B due to any one of the ancestors, all you have to do in practice is to multiply by the number of ancestors. First cousins, for instance, have two common ancestors, and the generation distance via each one is 4. Therefore their relatedness is 2 x (1/2)^4 = 1/8. If A is B's great-grandchild, the generation distance is 3 and the number of common 'ancestors' is 1 (B himself), so the relatedness is 1 x (1/2)^3 = 1/8.
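
The arithmetic of the quoted method is easy to sketch in code: add up (1/2)^g over the most recent common ancestors. The function name and its input (a list with one generation distance per common ancestor) are my own choices for illustration, not something from the book.

```python
from fractions import Fraction

def relatedness_from_distances(generation_distances):
    """Sum (1/2)^g over the most recent common ancestors.

    `generation_distances` holds one generation distance g per common
    ancestor, as counted by hand from the family tree.
    """
    return sum((Fraction(1, 2) ** g for g in generation_distances), Fraction(0))

# First cousins: two common ancestors (grandfather, grandmother), distance 4 each.
print(relatedness_from_distances([4, 4]))  # 1/8

# Great-grandchild: one 'common ancestor' (B himself), distance 3.
print(relatedness_from_distances([3]))     # 1/8

# Uncle and nephew: two common ancestors, distance 1 + 2 = 3 each.
print(relatedness_from_distances([3, 3]))  # 1/4
```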

Uncommon relations

Consider the following case: A and B are married, and A's brother C is married to B's sister D. How related are their children?

The generation distance between the children of A and B, and the children of C and D is 4 (like normal first cousins). But instead of having just 2 common ancestors, they have 4 (the parents of A and C, and the parents of B and D). So, according to Dawkins' method the relatedness is 4 x (1/2)^4 = 1/4.
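
For cases like this it can help to let a program do the climbing. Below is a minimal sketch that applies the method to a small family tree. The pedigree, the names in it (GF1, GM1 and so on) and the function names are all made up for illustration, and the code assumes there is no inbreeding, so every ancestor is reached by exactly one path.

```python
from fractions import Fraction

# Hypothetical pedigree: child -> (father, mother). Founders have no entry.
PARENTS = {
    "A": ("GF1", "GM1"), "C": ("GF1", "GM1"),  # A and C are brothers
    "B": ("GF2", "GM2"), "D": ("GF2", "GM2"),  # B and D are sisters
    "child_AB": ("A", "B"),
    "child_CD": ("C", "D"),
}

def ancestor_depths(person, parents):
    """Map every ancestor of `person` (and the person, at depth 0) to the
    number of generations climbed to reach it. Assumes no inbreeding."""
    depths = {person: 0}
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for parent in parents.get(current, ()):
            if parent not in depths:
                depths[parent] = depths[current] + 1
                frontier.append(parent)
    return depths

def pedigree_relatedness(a, b, parents):
    """Sum (1/2)^g over the most recent common ancestors of a and b."""
    up_a = ancestor_depths(a, parents)
    up_b = ancestor_depths(b, parents)
    common = set(up_a) & set(up_b)
    # Ignore any common ancestor that is itself an ancestor of another one.
    older = set()
    for c in common:
        older |= set(ancestor_depths(c, parents)) - {c}
    recent = common - older
    return sum((Fraction(1, 2) ** (up_a[c] + up_b[c]) for c in recent), Fraction(0))

print(pedigree_relatedness("child_AB", "child_CD", PARENTS))  # 1/4
print(pedigree_relatedness("A", "C", PARENTS))                # 1/2
print(pedigree_relatedness("A", "child_CD", PARENTS))         # 1/4
```

Including the person at depth 0 is what makes lineal descent work: for a parent and child, the parent is the single most recent 'common ancestor', at a generation distance of 1.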

[1] There are 4 options for gene G. Either it made it into both Sonny and Junior, or it made it into Junior only, or into Sonny only, or into neither. Assuming all four outcomes are equally likely, the chance of each occurrence is 25%.