Population Genetics forum: topic

Relationship between significant deviations from HW and inbreeding coefficient

Penny Nelson

Tuesday, 24 Feb 2009 04:38 UTC

Hi.

Today I am struggling with the relationship between locus-by-locus deviations from HWE and the overall inbreeding coefficient (the mean across all loci). For example, my data show no significant deviation from HWE at any of my 8 microsatellite loci for one population, yet the inbreeding coefficient in this same population is fairly high (higher than in other populations that show significant deviations from HWE at one or more loci). Shouldn’t significant deviations from HWE at individual loci result in inbreeding coefficients that are clearly one way or the other (i.e. very negative or very positive), depending on the direction of the deviation?

I don’t understand.

Any help would be greatly appreciated.

  • Replies

    • I haven’t looked at this a lot, but tests of deviation from HWE have a reputation for having low power, so this could just be a manifestation of that.

      If your estimate of the inbreeding coefficient is calculated using all loci, then it has more power anyway (because it’s using more data), so this could easily happen.

      There’s a much more general issue about significance tests, which I blogged about last year. In practice, this means that whether your results are a problem depends on what you intend to do with the data.
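
To make the power point concrete, here is a minimal sketch in Python. It assumes a single biallelic locus (a simplification, since microsatellites usually have many alleles), made-up genotype counts, and the uncorrected estimate F = 1 - Ho/He. The function name `locus_stats` and the counts are hypothetical. The same F of 0.2 is non-significant with 30 individuals and significant with 120.

```python
# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CRIT_1DF_05 = 3.841


def locus_stats(n_AA, n_Aa, n_aa):
    """Return (F, chi-square) for one biallelic locus from genotype counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    h_exp = 2 * p * q                    # expected heterozygosity under HWE
    h_obs = n_Aa / n                     # observed heterozygosity
    f = 1.0 - h_obs / h_exp              # inbreeding coefficient estimate
    expected = (n * p * p, n * h_exp, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return f, chi2


# Hypothetical counts: same genotype proportions, different sample sizes.
for counts in [(9, 12, 9), (36, 48, 36)]:
    f, chi2 = locus_stats(*counts)
    verdict = "significant" if chi2 > CRIT_1DF_05 else "not significant"
    print(f"N={sum(counts)}  F={f:.2f}  chi2={chi2:.2f}  {verdict}")
```

For a biallelic locus the chi-square statistic works out to N times F squared, so the test result depends on sample size as much as on the size of the deviation.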

    • I strongly recommend Bob’s blog from last year about the meaning and value of statistical significance. Statistically significant deviations are not necessarily large or important deviations. Statistical significance confounds variability and sample size with the size of the deviation from HWE. The misuse of statistical significance is widespread, especially among journal editors, so I can’t say “don’t do significance tests”. You probably wouldn’t get published. But generally you should do the following analysis any time you are drawn into the p-value trap: find a meaningful measure of the effect you are studying (a measure whose absolute magnitude is interpretable), measure this using your data set, and then calculate a confidence interval for it (often by bootstrap techniques). This will give you not only an idea of the precision of your result but will also tell you HOW different the value is from some null value. The latter is the really important scientific question.
      I think the widespread misuse of p-values is the root of much evil in pop gen. It excuses researchers from thinking about the meaning of their measures. They are content as long as they can generate p-values from them. The result is tolerance of nonsensical or uninterpretable measures, like Gst or Fst as measures of differentiation. For an example see my Table 2 in Molecular Ecology 17, 4015-4026. According to Gst, Species B in that table would be classified as more highly differentiated between populations than Species C, at whatever significance level we desired (this just depends on sample size), yet the allele frequencies show that Species C is more differentiated than Species B. Reliance on p-values has kept geneticists from thinking about what their measures really mean.
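
As a rough illustration of the confidence-interval approach described above, here is a minimal Python sketch of a percentile bootstrap for a multilocus inbreeding coefficient. Everything in it is an assumption for illustration: the data layout (one tuple of two-allele genotypes per individual), the function names, resampling individuals rather than loci, and the uncorrected estimator F = 1 - sum(Ho) / sum(He), which omits the small-sample corrections that standard software applies.

```python
import random
from collections import Counter


def multilocus_f(individuals):
    """F = 1 - sum(Ho) / sum(He) over loci; None if every locus is monomorphic."""
    n = len(individuals)
    h_obs_sum = 0.0
    h_exp_sum = 0.0
    for locus in range(len(individuals[0])):
        genotypes = [ind[locus] for ind in individuals]
        # Observed heterozygosity: share of individuals with two different alleles.
        h_obs_sum += sum(a != b for a, b in genotypes) / n
        # Expected heterozygosity: 1 minus the sum of squared allele frequencies.
        counts = Counter(allele for g in genotypes for allele in g)
        h_exp_sum += 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if h_exp_sum == 0:
        return None
    return 1.0 - h_obs_sum / h_exp_sum


def bootstrap_ci(individuals, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for F, resampling individuals with replacement."""
    rng = random.Random(seed)
    n = len(individuals)
    estimates = []
    for _ in range(n_boot):
        sample = [individuals[rng.randrange(n)] for _ in range(n)]
        f = multilocus_f(sample)
        if f is not None:                # skip degenerate resamples
            estimates.append(f)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Hypothetical toy data: 30 individuals typed at 2 loci.
toy = ([(("A", "A"), ("A", "A"))] * 9
       + [(("A", "a"), ("A", "a"))] * 12
       + [(("a", "a"), ("a", "a"))] * 9)
print(multilocus_f(toy), bootstrap_ci(toy))
```

The width of the interval tells you how precisely F is estimated, and its position tells you how far F is from zero, which a p-value alone does not.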

    • Thank you both for your thoughtful replies; I really thought no one would respond! I read both suggestions, and both will change the way I view and analyze my data.

      Since I started finding my way through data analysis in this field, I have often come across the problem of trying to get the analysis to show what I can see in the raw data. For example, I have 13 remnant populations of a once continuous redgum community. I want the data to show me something about how different they are genetically, so that I might get a clue as to what distances gene flow might occur over. I have eight microsatellites. There are many common alleles that are shared between populations, but some populations have relatively high frequencies of private alleles. In the multi-locus genotypes, these private alleles are connected to common alleles at other loci. What I want the data to show is that the presence of a private allele in one individual changes how the common alleles (in the same individual but at different loci) should be interpreted: they can no longer be considered as likely to have come from a neighboring population where the allele is also common. And if the individual is homozygous for the private allele, then the chance that any of its alleles at other loci were derived from a gene flow event from another population is negligible.

      I don’t know if I am making any sense here. This confusion probably reveals serious ignorance. The thing that I am feeling about this data is that the rare and private alleles have information but that it is hidden by the high frequency of common alleles and the way in which loci are analyzed independently.

      Thanks in advance to anyone who gets through this mumbo jumbo. Any reading suggestions would be appreciated.
