Molecular Evolution: From wolverine faeces to individual genotypes
Introduction
Using microsatellite genotyping across 11 loci in the wolverine genome, we attempted to determine how many distinct individuals were represented by 27 piles of wolverine faeces, after first extracting DNA from the piles. We also estimated the actual size of the population and analysed the family relationships between the individuals.
Methods
Lab part
DNA extraction. Through an 18-step process, we extracted DNA from one of the faeces samples.
PCR amplification. We then amplified our extracted DNA. We prepared a master mix, with the following ingredients:
| Ingredient | Stock conc. | Final conc. | PCR (1 rx) | Mix (50 rx) |
|---|---|---|---|---|
| PCR buffer | 10 x, 15 mM MgCl2 | 1 x | 1 µl | 50 µl |
| MgCl2 | 25 mM | 1.5 mM | 0 µl | 0 µl |
| dNTP | 20 mM | 0.2 mM | 0.1 µl | 5 µl |
| Primer 1 | 10 µM | 0.32 µM | 0.32 µl | 16 µl |
| Primer 2 | 10 µM | 0.32 µM | 0.32 µl | 16 µl |
| BSA | 20 mg/ml | 0.1 mg/ml | 0.05 µl | 2.5 µl |
| Hotstar | 5 U/µl | 0.025 U/µl | 0.05 µl | 2.5 µl |
| ddH2O | | | 6.16 µl | 308 µl |
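The per-reaction volumes in the table follow from the stock and final concentrations once a reaction volume is fixed. The sketch below recomputes them, assuming a 10 µl reaction; that volume is inferred from the table and not stated in the protocol text, and the variable names are illustrative only.

```python
# Per-reaction volume of an ingredient: V = final_conc / stock_conc * reaction_volume.
REACTION_VOLUME_UL = 10.0  # assumption: inferred from the table, not stated in the text
N_REACTIONS = 50

# (ingredient, stock concentration, final concentration); units match within each row
ingredients = [
    ("PCR buffer", 10.0, 1.0),    # x
    ("dNTP", 20.0, 0.2),          # mM
    ("Primer 1", 10.0, 0.32),     # µM
    ("Primer 2", 10.0, 0.32),     # µM
    ("BSA", 20.0, 0.1),           # mg/ml
    ("Hotstar", 5.0, 0.025),      # U/µl
]
WATER_PER_REACTION_UL = 6.16  # taken directly from the table


def per_reaction_volume(stock, final, reaction_volume=REACTION_VOLUME_UL):
    """Volume of stock solution needed to reach the final concentration."""
    return final / stock * reaction_volume


volumes = {name: per_reaction_volume(s, f) for name, s, f in ingredients}
volumes["ddH2O"] = WATER_PER_REACTION_UL

for name, v in volumes.items():
    print(f"{name:<10} {v:5.2f} µl  x {N_REACTIONS} = {v * N_REACTIONS:6.1f} µl")

mix_per_reaction = sum(volumes.values())
print(f"Master mix per reaction: {mix_per_reaction:.2f} µl")
```

Under this assumption the master mix does not fill the whole reaction volume; the remainder would presumably be the DNA template, which is also an assumption.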
Sequencing. The samples were then centrifuged and denatured, and run through a MegaBACE™ system.
Computer part
Microsatellite scoring. With Genetic Profiler, we were able to identify the positions of the peaks from the sequenced data. All groups collected the data in a common spreadsheet.
Data analysis. The first thing we did with our data was to eliminate duplicate samples from the same individual, using The Excel Microsatellite Toolkit, complemented with manual inspection.
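The duplicate search can be sketched as a pairwise genotype comparison. This is not how the Excel Microsatellite Toolkit is implemented; it is a minimal illustration that assumes each genotype is stored as a dictionary from locus name to an allele pair, with `None` for a failed locus. The sample names, locus names and allele sizes are hypothetical.

```python
from itertools import combinations


def same_individual(g1, g2, max_mismatches=0):
    """Compare two genotypes locus by locus, skipping loci missing in either."""
    mismatches = 0
    compared = 0
    for locus in g1.keys() & g2.keys():
        a, b = g1[locus], g2[locus]
        if a is None or b is None:
            continue
        compared += 1
        if sorted(a) != sorted(b):  # allele order within a locus is irrelevant
            mismatches += 1
    return compared > 0 and mismatches <= max_mismatches


def group_duplicates(genotypes, max_mismatches=0):
    """Merge matching samples into groups with a small union-find."""
    parent = {s: s for s in genotypes}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for s1, s2 in combinations(genotypes, 2):
        if same_individual(genotypes[s1], genotypes[s2], max_mismatches):
            parent[find(s2)] = find(s1)

    groups = {}
    for s in genotypes:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())


# Hypothetical genotypes at two loci, for illustration only
samples = {
    "A": {"loc1": (120, 124), "loc2": (200, 200)},
    "B": {"loc1": (124, 120), "loc2": (200, 200)},
    "C": {"loc1": (120, 126), "loc2": (200, 204)},
}
groups = group_duplicates(samples)
print(groups)
```

Raising `max_mismatches` would tolerate scoring errors such as allelic dropout, at the cost of merging closely related individuals; near matches are where manual inspection remains useful.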
Population size estimation. After that, we made use of a web service, Specrich, to estimate the actual population size from the number of samples we had observed.
Parentage analysis. Lastly, we compared this year's wolverines with four individuals from earlier seasons (again with Microsatellite Toolkit) to see if they were still present, and then used Cervus to establish parentage relations between the old individuals and the new ones.
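A simpler idea than a full parentage program is Mendelian exclusion: a true parent must share at least one allele with the offspring at every locus. The sketch below shows only that check; it is not the method Cervus uses, and the genotypes are hypothetical.

```python
def compatible_parent(parent, offspring):
    """True if parent and offspring share an allele at every locus typed in both."""
    for locus, child_alleles in offspring.items():
        parent_alleles = parent.get(locus)
        if parent_alleles is None or child_alleles is None:
            continue  # locus missing or failed in one of them: no evidence either way
        if not set(parent_alleles) & set(child_alleles):
            return False  # no shared allele: parentage excluded at this locus
    return True


# Hypothetical genotypes, for illustration only
candidate = {"loc1": (120, 124), "loc2": (200, 204)}
cub_ok = {"loc1": (124, 128), "loc2": (204, 204)}
cub_excluded = {"loc1": (126, 128), "loc2": (204, 204)}

print(compatible_parent(candidate, cub_ok))
print(compatible_parent(candidate, cub_excluded))
```

A strict check like this treats a single genotyping error as an exclusion, which is one reason to prefer a method that weighs the evidence across loci.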
Results
Data analysis
The following samples turned out to be duplicates and hence most likely from the same individual:
| Duplicates | |
|---|---|
| 1 | 13 |
| 10 | 14, 15, 23, 25, 26, 27 |
| 17 | 20 |
| 5 | 6, 18 |
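The table can be turned into a count of distinct individuals and into capture frequencies (how many individuals were sampled once, twice, and so on). The sketch assumes that all 27 samples gave usable genotypes and that every sample not listed in the table is unique; neither is stated explicitly above.

```python
from collections import Counter

N_SAMPLES = 27  # piles of faeces analysed

# Duplicate groups from the table: first sample id -> its duplicates
duplicates = {
    1: [13],
    10: [14, 15, 23, 25, 26, 27],
    17: [20],
    5: [6, 18],
}

group_sizes = [1 + len(dups) for dups in duplicates.values()]
samples_in_groups = sum(group_sizes)

# Assumption: every sample outside the table is a distinct individual seen once
singletons = N_SAMPLES - samples_in_groups
distinct_individuals = len(duplicates) + singletons

# capture_freq[i] = number of individuals sampled exactly i times
capture_freq = Counter(group_sizes)
capture_freq[1] = singletons

print("distinct individuals:", distinct_individuals)
print("capture frequencies:", dict(sorted(capture_freq.items())))
```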
Population size estimation
Since the input numbers for the Specrich form are so easily reproduced from the duplication results, I didn't write them down. Since then I have tried several sets of numbers, and I always seem to get a much higher population estimate than the one we got when we ran the form during the lab, which was 33.
Additionally, because only slightly different inputs yield wildly different outputs, I have more or less lost faith in Specrich as a reliable population estimator.
Here is one example of the results:
| K | N(JK) | SE(N(JK)) | T(K) | P(K) |
|---|---|---|---|---|
| 1 | 42 | 5.4772 | 3.8925 | 0.0001 |
| 2 | 55 | 9.4868 | 2.7255 | 0.0064 |
| 3 | 67 | 13.9284 | 1.9447 | 0.0518 |
| 4 | 79 | 19.4936 | 1.4562 | 0.1453 |
| 5 | 92 | 27.2029 | 0.0000 | 1.0000 |
| INTERPOLATED N | 66.5209 | | | |
| STD ERROR OF INTERPOLATED N | 13.7451 | | | |
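The interpolated value appears consistent with a linear interpolation between the estimates of order K = 2 and K = 3, weighted by where a 5% significance threshold falls between P(2) and P(3). That Specrich works this way, and that it uses 0.05, are assumptions; the sketch only checks that the arithmetic matches the table.

```python
ALPHA = 0.05  # assumption: the significance threshold is not stated in the output

# (K, N(JK), P(K)) copied from the table above
rows = [
    (1, 42, 0.0001),
    (2, 55, 0.0064),
    (3, 67, 0.0518),
    (4, 79, 0.1453),
    (5, 92, 1.0000),
]


def interpolated_n(rows, alpha=ALPHA):
    """Interpolate between the last rejected order and the first accepted one."""
    if rows[0][2] > alpha:
        return float(rows[0][1])  # simplification: no interpolation below K = 1
    for (_, n_prev, p_prev), (_, n, p) in zip(rows, rows[1:]):
        if p > alpha:
            weight = (alpha - p_prev) / (p - p_prev)
            return n_prev + weight * (n - n_prev)
    return float(rows[-1][1])


estimate = interpolated_n(rows)
print(f"interpolated N: {estimate:.2f}")
```

The result agrees with the reported value to within the rounding of the table. If this reading is right, it may also be part of why the output is so sensitive to its input: P(3) lies just above 0.05, so a small change could move the selected order and shift the estimate by a whole step.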
Parentage analysis
Comparing the old individuals against the current population reveals that the old wolverines are indeed still in the area:
| id | Parent id | Sex |
|---|---|---|
| 1 | 102 | male |
| 10 | 101 | male |
| 17 | 106 | female |
| 5 | 108 | female |
The Cervus output was contradictory: taken at face value, the output files say that individual 101 is both the father of individual 2 and the mother of individual 16. The latter is impossible, since individual 101 has been sexed as a male in earlier seasons.
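A contradiction of this kind can be caught automatically by checking each assigned parent role against the known sex. The sketch below uses the sexes from the table and the two assignments described above; the data layout and function name are illustrative and not part of Cervus.

```python
# Sex of the individuals from earlier seasons, as listed in the table above
known_sex = {101: "male", 102: "male", 106: "female", 108: "female"}

# (parent id, assigned role, offspring id) as read from the Cervus output
assignments = [
    (101, "father", 2),
    (101, "mother", 16),
]

REQUIRED_SEX = {"father": "male", "mother": "female"}


def sex_conflicts(assignments, known_sex):
    """Return the assignments whose parental role contradicts the known sex."""
    conflicts = []
    for parent, role, offspring in assignments:
        sex = known_sex.get(parent)
        if sex is not None and sex != REQUIRED_SEX[role]:
            conflicts.append((parent, role, offspring))
    return conflicts


conflicts = sex_conflicts(assignments, known_sex)
for parent, role, offspring in conflicts:
    print(f"{parent} cannot be the {role} of {offspring}: sexed as {known_sex[parent]}")
```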
Discussion
Reliability of the results. While the duplicate identifications (both within the latest year and between the latest year and those before) went well and produced reliable results, the population estimation feels unreliable and the parentage analysis is contradictory. Other groups seem to have been getting varying results as well. Since we all started with the same spreadsheet data, the differences are most likely due to manual errors.
Spreadsheet editing. Preparing the spreadsheets by hand was easily the dullest and most time-consuming part of the lab, if not of all four labs. It also felt like a waste of resources, since it can be automated fairly straightforwardly. In particular, Windows and Excel felt like blunt tools for that kind of semi-automatic number crunching, after courses involving script programming in Linux. A simple program to run through and process the big table would have been a big help, and would have made the lab feel less tedious.
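As a rough idea of what such a program could look like, the sketch below reads a spreadsheet exported as CSV into one genotype per sample. The column layout (a sample id followed by two allele columns per locus, with empty cells for failed loci) is hypothetical and would need adjusting to the actual spreadsheet.

```python
import csv
import io


def read_genotypes(handle):
    """Read rows of sample,allele,allele,... into {sample: {locus: (a, b) or None}}."""
    reader = csv.reader(handle)
    header = next(reader)
    loci = header[1::2]  # locus name sits above the first of its two allele columns
    genotypes = {}
    for row in reader:
        sample, calls = row[0], row[1:]
        genotype = {}
        for i, locus in enumerate(loci):
            a, b = calls[2 * i], calls[2 * i + 1]
            # An empty cell means the locus failed for this sample
            genotype[locus] = (int(a), int(b)) if a and b else None
        genotypes[sample] = genotype
    return genotypes


# Hypothetical export with two loci, for illustration only
example = io.StringIO(
    "sample,loc1,,loc2,\n"
    "1,120,124,200,200\n"
    "2,120,126,,\n"
)
genotypes = read_genotypes(example)
print(genotypes)
```

With the table in this form, steps such as the duplicate search or the comparison against earlier seasons become short loops instead of manual spreadsheet edits.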
