- Thread starter angelshershey

are you new to our breed? have you owned several newfs and shown or worked and gotten familiar with lines and what they throw? have you studied movement and structure, and have you also studied health issues? do you have a mentor to help you who has been long involved in the breed? we find so many out there "jumping" into the breeding game without knowing hardly anything.

i don't know if this helps, but this is a yearly post made to newf-L, written by the randalls:

I know I promised a total of three posts, but I decided I needed

another one before giving you my thoughts on implications. Even if

you don't want to wade through posts 1 through 3, the implications

post will be readable.

The other thing I want to discuss has to do with the probabilities

involved in breeding tests. This is another situation in which there

is a serious misconceptualization of the problem that is perpetuated

in Padgett's book (Control of Canine Genetic Diseases).

The classic breeding test is done when we have a dog that we may

suspect is a carrier for a recessive trait. It may be from a

breeding of known carriers or known carrier by clear. It may simply

be that a recessive is fairly frequent in the breed and you may want

to establish whether a particular promising stud dog is a carrier

before using him extensively.

The easiest breeding test to understand is one in which we take the

dog we are interested in testing (called the test-mate by Padgett)

and breed it to an affected (homozygous recessive) bitch (called the

"foil" by Padgett). [Note: occasionally breeding tests are done

with females, but it is less frequent since the test may represent a

substantial portion of the reproductive capacity of the bitch.

Clearly this is not the case with males.] The foil can only

contribute the recessive allele to any puppies produced (say "a").

If the test dog is homozygous dominant (AA, not a carrier) he can

only contribute the dominant allele and all of the puppies will be

unaffected. If he is a carrier (Aa) he will contribute both alleles

equally and, in the long run, 50% of the puppies will be affected.

The test then involves breeding to an affected bitch & then

evaluating whether or not any affected puppies are produced. If any

(at least one) affected puppy is produced, we know the test dog is a

carrier. However, if no affected puppies are produced, either 1) the

dog is a carrier, but happened to contribute the dominant allele to

all of the puppies or 2) the dog is not a carrier. If the test

breeding results in any affecteds, we have a certain answer. If not,

we have an ambiguous situation.

Now (and this is important) let's assume for the time being that the

test dog *is* a carrier. Now suppose that our test breeding results

in 2 normal puppies. How likely would that be? Well, we could have 4

possible outcomes (using N for normal and A for affected) NN, NA, AN,

and AA. If the test dog is a carrier then these four litters have

equal likelihood and the probability of two normals (NN) is 25%. We

have just calculated the probability of getting two normal puppies in

a litter of two, under the assumption that the test dog is a carrier.

This is referred to as a "conditional" probability & is usually

indicated by prob(A|B) or the probability of A given B. In this case

the probability of the *litter given that the test dog is a carrier*.

Clearly, as the number of normal puppies without an affected

increases the probability of the litter decreases. In general that

probability is .5^p (.5 raised to the p power) where p is the number

of normal puppies.

Number of puppies    Prob of litter
1                    .5
2                    .25
3                    .125
4                    .0625
5                    .03125
6                    .015625
7                    .0078125
etc., etc.

Since the probability of getting all normal puppies becomes

smaller than 1 in a hundred at 7 puppies, some people suggest that 7

normal puppies is what we should have before we consider the test dog

to be homozygous dominant (i.e. not a carrier). Please note here that

we have said nothing at all about the probability that the test dog

is a carrier. All these probabilities were calculated under the

assumption that the dog *is* a carrier and they are all probabilities

about getting particular results from our breeding test.
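
As a minimal sketch of the calculation above (the function names are illustrative, not from the post or from Padgett), the litter probability .5^p and the "7 puppies" rule of thumb can be checked like this. Note that it only computes the probability of the litter given a carrier, nothing more:

```python
def prob_all_normal_given_carrier(puppies: int) -> float:
    """P(every puppy is normal | test dog is a carrier, foil is affected)."""
    return 0.5 ** puppies


def litter_size_for_threshold(threshold: float) -> int:
    """Smallest number of puppies whose all-normal probability is below threshold."""
    puppies = 1
    while prob_all_normal_given_carrier(puppies) >= threshold:
        puppies += 1
    return puppies


# Reproduce the table: one row per litter size.
for p in range(1, 8):
    print(p, prob_all_normal_given_carrier(p))

# First litter size at which the all-normal probability drops below 1 in 100.
print(litter_size_for_threshold(0.01))
```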

Unfortunately, this is not really what people want to know. What

they want to know is "what is the probability that the dog is a

carrier?", not the probability of the litter given that he is a

carrier. Now the common way out of this, and the one taken by

Padgett, is to calculate the probability of the litter and then

simply say that it is the probability that the dog is a carrier. If

we point out that the numbers are in error (see below) we're told

that it's an approximate calculation. I would like to point out,

that it is not an approximation of the probability that the dog is a

carrier, it is the answer (in fact, an exact one) to an entirely

different question, i.e. P(test results|test dog is a carrier),

*not*, P(test dog is a carrier|test results). Pointing out the

confusion between p(A|B) and p(B|A) is not exactly new or novel. In

fact, the problem was solved well over 200 years ago (published in

1763). I will give you a formula relating the important

probabilities in question later on. Right now I want to do a "common

sense" derivation, since common sense seems to be so popular.

An answer to the real question we want to ask (probability that the

test dog is a carrier) could be addressed in the following way:

Suppose we could examine all of the three normal puppy results. In

what proportion of these would the test dog be a carrier and in what

proportion would he be homozygous dominant? Now, let's assume for

the time being that we have a population of 2000 dogs in which half

are carriers and half are not. We subject them to breeding tests &

have 3 puppy results each time. In .125 of the tests on the 1000

carriers (.125*1000 = 125) we will get a "negative test", i.e. all

three puppies will be normal. In the tests of the 1000 dogs that are

not carriers, we will

always get a negative test. So there will be 1125 negative tests and

of them 125 will be carriers. Therefore the proportion of negative

tests in which the test dog is actually a carrier is 125/1125=.111.

Now .111 is not too different from .125, the answer that Padgett

gives. Actually he gives .875 (or 87.5%) as the probability that the

dog is homozygous rather than .125 that he is a carrier. Note the title of table

5.2, pg 57: "Probability that an unknown animal is homozygous

dominant (AA) for a given autosomal recessive trait when bred to an

animal affected (aa) with that trait." There is no doubt that he is

taking .875 as the probability that the test dog is homozygous or

.125 that the test dog is a carrier. .125 is actually the

probability of the litter when the dog is a carrier, not vice-versa.
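
The head count above can be written out as a short sketch. The function name and its arguments are illustrative assumptions. The population of 1000 carriers and 1000 clears is the one used in the example, and exact fractions are used so no rounding creeps in:

```python
from fractions import Fraction


def carrier_share_of_negative_tests(carriers: int, clears: int, puppies: int) -> Fraction:
    """Among all-normal ("negative") tests, the fraction that came from carriers."""
    # A carrier bred to an affected foil gives an all-normal litter with prob .5^puppies.
    carrier_negatives = Fraction(carriers) * Fraction(1, 2) ** puppies
    # A clear (AA) dog bred to an affected foil always gives an all-normal litter.
    clear_negatives = Fraction(clears)
    return carrier_negatives / (carrier_negatives + clear_negatives)


share = carrier_share_of_negative_tests(1000, 1000, 3)
print(share, float(share))  # 125 carriers out of 1125 negative tests
```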

The generalization of the solution above for a population of 1000

homozygous dominants and 1000 heterozygotes or carriers, is called

Bayes' Theorem. This allows us to calculate the probability that the

dog is a carrier given the test results and the "a priori"

probability that the dog is a carrier. This latter term is simply

the proportion of the population from which the test dog is drawn

that are, in fact, carriers (50% in our example above).

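Pr(A|B) = Pr(B|A) x Pr(A) / (Pr(B|A) x Pr(A) + Pr(B|not A) x Pr(not A))

In the particular case of the breeding test, a dog that is not a

carrier always gives an all-normal litter, so Pr(test|not carrier) = 1

and we can simplify this to

Pr(carrier|test) = Pr(test|carrier) x Pr(carrier) / (Pr(test|carrier) x Pr(carrier) + (1 - Pr(carrier)))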

This can, of course, be easily programmed in something like EXCEL.

Padgett actually does mention this & gives a table (table 5.2a) for a

priori probability of 50%. He calls it a "more accurate method of

calculating the probabilities". Unfortunately, it is not a more

accurate method, it answers a different question.
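
Instead of a spreadsheet, the simplified formula can be sketched in a few lines of Python. The function name is hypothetical, and the priors other than 50% are illustrative assumptions rather than figures from the post or the book. They are only there to show how far the answer can drift from the litter probability:

```python
def posterior_carrier(prior: float, puppies: int) -> float:
    """Pr(carrier | all-normal litter of given size), mate affected (aa).

    prior is the a priori probability that the test dog is a carrier.
    """
    likelihood = 0.5 ** puppies  # Pr(test | carrier)
    # Pr(test | not carrier) is 1, so the second term is just (1 - prior).
    return likelihood * prior / (likelihood * prior + (1.0 - prior))


# Compare the litter probability with the posterior for three assumed priors.
for prior in (0.1, 0.5, 0.9):
    for puppies in (3, 7):
        print(prior, puppies, 0.5 ** puppies, round(posterior_carrier(prior, puppies), 4))
```

With a 50% prior and 3 normal puppies this gives the .111 from the head count. With other priors the posterior and the litter probability can differ considerably.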

Now, you may accept my argument that the common test probabilities

(i.e. the .125 instead of .111 above) are conceptually flawed.

However, if they're always as close as .125 & .111 you might not

care. I have given some examples where the probabilities are rather

far apart in previous posts, I will give an example here of an even

greater discrepancy. Padgett's table 5.4 (page 60) purports to

represent the probability that a dog is a carrier given different

results of breeding him to a random selection of bitches from a

population with different percentages of carriers. This is a

so-called random breeding test, and it is most useful when dealing

with heavily used stud dogs (or matadors as Padgett calls them).

Clearly, the more a dog is bred without producing affected puppies,

the less likely it is that he is a carrier of a recessive allele.

I'm only going to deal with a couple aspects of this table. This is

the portion of the table that looks at multiple breedings to randomly

selected bitches with a single normal puppy in each breeding.

AN ASIDE : Unfortunately, there is a confusing problem with Table 5.4

in the book & so if you have or borrow the book & look at the table

you might have trouble interpreting what I'm going to say about it.

The Title of the Table is "Chance that a dog will be homozygous

dominant (AA) for a given autosomal recessive gene when bred to a

partner population with a *given frequency of that gene* and all

offspring are normal". The heading for one part of the table reads

"Frequency of Carriers in the Population". The frequency of the gene

and the frequency of carriers are two different things (normally

related by the Hardy-Weinberg equilibrium equation). The table is

actually about gene frequencies, not about frequency of carriers. He

misinterprets the table as reflecting frequency of carriers in his

example on page 61. We will use the 5% column in the table which is

equivalent to a 9.5% carrier rate, i.e. carrier rate = 2 x p x q where p is

the frequency of one allele and q the frequency of the other, in this

case 2*.05*.95.
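
For anyone who wants to convert other columns of the table, here is a small sketch of the gene-frequency to carrier-rate conversion. It assumes Hardy-Weinberg proportions, and the function name is illustrative:

```python
def carrier_frequency(q: float) -> float:
    """Carrier (heterozygote) rate 2*p*q for a recessive allele of frequency q.

    Assumes Hardy-Weinberg proportions, with p = 1 - q.
    """
    p = 1.0 - q
    return 2.0 * p * q


print(carrier_frequency(0.05))  # the 5% gene-frequency column
```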

OK, back to the table. Here are the numbers, what he calls the

probability that the test dog is Homozygous (i.e. not a carrier)

given n single puppy litters with all normal offspring:

Litters (n)    Prob. homozygous
1              2%
5              11%
10             22%
20             38%
30             52%
40             62%
50             70%
60             77%
80             82%
100            91%
150            97%

Where do these numbers come from? Well, the probability of a

single puppy litter being affected, *given that he is a carrier*, is

.5 * .5 * .095, i.e. there is a 50% chance of getting the recessive

allele from him (if he is a carrier), there is a 9.5% chance that the

randomly selected bitch is a carrier and, if she is, a 50% chance of

getting the recessive allele from her. The probability a single puppy

litter would be affected is 2.375%. The probability that 2 single

puppy litters in a row would be unaffected is

(1-.02375)*(1-.02375) & the probability that this would not

happen is 1 - (1-.02375)^2 . Using this technique for the remainder

of the table gives us his entries. Note: these are based on

probabilities of test results given that he is a carrier, not the

probability that he is a carrier given test results.
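
A rough sketch of that technique follows. It takes .5 * .5 * .095 as the chance that a carrier sire gets an affected pup from a randomly selected bitch and computes 1 - (1 - .02375)^n. The function name is made up for illustration, and the match to the quoted table is only approximate: most entries agree to within about a percentage point.

```python
def padgett_style_entry(litters: int, carrier_rate: float = 0.095) -> float:
    """1 - Pr(n single-puppy litters all normal | sire is a carrier).

    This is the quantity the table appears to report. It is NOT
    Pr(sire is clear | test results).
    """
    p_affected = 0.5 * 0.5 * carrier_rate  # .02375 per puppy
    return 1.0 - (1.0 - p_affected) ** litters


for n in (1, 5, 10, 20, 30, 40, 50, 60, 80, 100, 150):
    print(n, f"{padgett_style_entry(n):.1%}")
```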

Now we have the question of why we would have a potential matador,

presumably drawn from the same population as the bitches (i.e. a 9.5%

chance of being a carrier), do one breeding & get a single unaffected

puppy (tending to indicate that he is not a carrier) and decide that

he has a 98% chance of being a carrier (2% chance of being clear).

We're still saying that there is a 48% chance of being a carrier

after 30 of these breedings without an affected puppy. If he is

drawn from the same population as the bitch, the probability (from

Bayes' theorem above) that he is a carrier, given the 30 breedings is

less than 6%. With a single puppy it would be slightly less than

.095, not .98.
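
The Bayes version of the random breeding test can be sketched as below. The assumptions are mine and are simplifications: the sire's prior equals the 9.5% carrier rate of the bitches, a carrier sire gets a normal pup with probability 1 - .02375 per litter, and a clear sire always gets normal pups. The function name is hypothetical.

```python
def posterior_carrier_random_test(prior: float, litters: int,
                                  carrier_rate: float = 0.095) -> float:
    """Pr(sire is a carrier | n single-puppy litters, all normal)."""
    p_normal_if_carrier = 1.0 - 0.5 * 0.5 * carrier_rate
    likelihood = p_normal_if_carrier ** litters  # Pr(test | carrier)
    # Pr(test | clear) is taken as 1 (assumption: affected bitches are ignored).
    return likelihood * prior / (likelihood * prior + (1.0 - prior))


for n in (1, 10, 30, 100):
    print(n, round(posterior_carrier_random_test(0.095, n), 4))
```

Under these assumptions the posterior starts at the prior and only falls as normal litters accumulate. After 30 litters it is below 6%, nowhere near the 48% implied by the table.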

These are not trivial differences & under realistic assumptions the

probabilities can easily be off by over 100 fold. Clearly enough in

error that we might want to consider them a convenient fiction,

rather than a serious calculation.

Pat R

Part 3~~~~~~~~~~~~~~~~~~~~

In the three previous posts I have tried to point out some of the problems with probability calculations with respect to dog breeding and pedigree analysis as described in G. Padgett's book, Control of Canine Genetic Diseases. These range from methods that are conceptually flawed, but may give you a number close to what you are looking for, to methods that don't even come close. Most of the actual calculations in these circumstances are trivial, but knowing when to apply the different techniques is not, nor is understanding the shortcomings of the different calculations or miscalculations. I'm sure that the writing in these posts was too condensed to provide a good basis for probability theory as it applies to dog breeding. However, I do think I justified pretty much everything I said and tried to show how the calculations work.

I have nothing against George Padgett; I think he makes some very good general points about dog breeding and I agree with a lot of his views. However, I think that the casual misuse of probability concepts without providing a firm basis of understanding is a recipe for disaster. I am extraordinarily doubtful that anyone could learn to deal intelligently with pedigrees and probabilities from his book, no matter how diligently they study it (even if all we had to deal with were simple recessives). Readers will go out into the world armed with a hodgepodge of fuzzy concepts that have essentially no chance of corresponding to reality. Rather than saying a calculation is something it is not, someone of Padgett's stature in the dog world has a responsibility to accurately portray the truth.

In some cases he is simply mistaken. In other cases, sloppy language leaves the reader in doubt about what is meant. For example, his transition between discussion of what he calls the product rule to a discussion of the "sum rule" reads as follows (page 74):

"The possibility of getting a dominant or a recessive gene from either parent is independent of getting a dominant or recessive gene from the opposite parent. Therefore, all four possible outcomes have an equal probability (25 percent), which is the product of the respective risks."

Pretty reasonable. However, he goes on to say:

"If we are interested in a generalized outcome and the probabilities associated with it, we use the sum rule. The outcomes must be independent and mutually exclusive in order to use this rule."

Now I defy anyone to explain to me what he means by "a generalized outcome and the probabilities associated with it". Apparently it is not terribly clear, because he follows with a misapplication of the sum rule to illustrate it (covered in an earlier post).

We are also often told we are xx% sure that such and such is the case without defining exactly what is meant by "xx% sure". I would usually take that to mean that if, when faced with the situation (breeding test results etc.), I make such and such an inference, I will be correct xx% of the time. I don't think there is any place in the entire book where xx% sure or xx% certain can be interpreted in that way. A similar statement was made about a statistical probability in a Newf Tide article that Padgett co-authored with U. Mostosky & B. Jenness on forelimb anomaly, i.e. "The p value derived from the chi square test allows us to predict that this trait is autosomal recessive with a greater than 95% chance of being accurate." This is a classic sophomoric misinterpretation of statistical findings, but (apparently under the rubric of simplification) Padgett doesn't see anything wrong with putting it in the article. I have to assume that this interpretation is considered to be all right because it is written for the layman, which is to my mind a major disservice to the reader. Instead of oversimplifying to the extent of falsehood, I think that we should follow Einstein's advice to "make things as simple as possible, but no simpler".

Does all of this do any harm above and beyond condemning readers to a self-satisfied state of pseudo-knowledge? I'm not sure, maybe it's a good thing. It certainly lets people feel good about themselves. I do think there is one area in which it can do a great deal of harm. There are a number of places throughout chapter 6 on the "Interpretation and use of pedigrees to determine genetic status of given dogs" where inappropriate techniques are advocated that will grossly overestimate the probability that frequently used dogs are carriers of all sorts of things. This is clearly a good recipe for the so-called witch-hunt that breeders fear and which represents a major stumbling block to things like open registries, which Padgett strongly advocates.

We may be better off simply armed with some "like begets like" type aphorisms and a good eye for dogs than with a montage of marginal information about how to perform probability calculations that we are unlikely to apply correctly. At least the aphorisms don't give the impression that they should be taken terribly seriously and they don't pretend to produce real numbers.

Pat R