Measuring variation in curators' GO annotations through a controlled multi-MOD study.
W. John MacMullen
Graduate School of Library & Information Science, University of Illinois at Urbana-Champaign, 501 E. Daniel St., MC-493, Champaign, IL, 61820, USA
We investigated the origins, nature, and extent of variation in biocurators' Gene Ontology (GO) annotations by conducting a prospective controlled study using a common document collection covering five model organisms, during which 3,500 novel GO annotations were created. We have identified and characterized variation in annotation quantity; in the selection of gene products, terms, and evidence codes; and in curators' manual annotations of the underlying articles. Through interviews and observations, we have also collected substantial contextual data about individual curators' backgrounds, relevant experience (subject matter and curation), and personal workflows. We are testing multi-faceted, organism-independent measures of variation in GO annotations for use in training and evaluation.