DNA methylation is a key mechanism of epigenetic control of gene expression and is implicated in many cancers, but until recently there had been little study of automatic information extraction for DNA methylation. To explore the automatic extraction of information on DNA methylation from the literature, we annotated a corpus of relevant documents using the GENIA event representation.
This corpus was produced in part as a preparatory study for the organization of the BioNLP Shared Task 2011 Epigenetics and Post-translational Modifications (EPI) task. The EPI corpus annotations include a larger and more comprehensive set of annotations for associated events.
The DNA methylation corpus is distributed in the BioNLP Shared Task-flavored standoff format.
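As a rough illustration of what reading this standoff format can look like, the following Python sketch parses text-bound annotations (`T` lines) and event annotations (`E` lines). It is a minimal sketch, not a tool shipped with the corpus: the sample text, the annotation lines, the type names `Protein` and `DNA_methylation`, and the helper `parse_standoff` are all illustrative assumptions, and the archive's actual file layout should be checked before relying on it.

```python
from dataclasses import dataclass, field


@dataclass
class TextBound:
    """A text-bound annotation: an entity or an event trigger."""
    id: str
    type: str
    start: int
    end: int
    text: str


@dataclass
class Event:
    """An event: a typed trigger plus (role, target id) arguments."""
    id: str
    type: str
    trigger: str
    args: list = field(default_factory=list)


def parse_standoff(lines):
    """Parse T (text-bound) and E (event) lines; other line kinds are skipped."""
    entities, events = {}, {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("T"):
            # Assumed layout: ID <TAB> TYPE START END <TAB> TEXT
            ann_id, span, text = line.split("\t", 2)
            ann_type, start, end = span.split(" ")
            entities[ann_id] = TextBound(ann_id, ann_type, int(start), int(end), text)
        elif line.startswith("E"):
            # Assumed layout: ID <TAB> TYPE:TRIGGER ROLE:TARGET ...
            ann_id, body = line.split("\t", 1)
            parts = body.split()
            ev_type, trigger = parts[0].split(":", 1)
            args = [tuple(p.split(":", 1)) for p in parts[1:]]
            events[ann_id] = Event(ann_id, ev_type, trigger, args)
    return entities, events


# Hypothetical example, not taken from the corpus.
TEXT = "p16 gene methylation"
SAMPLE = [
    "T1\tProtein 0 3\tp16",
    "T2\tDNA_methylation 9 20\tmethylation",
    "E1\tDNA_methylation:T2 Theme:T1",
]

entities, events = parse_standoff(SAMPLE)
for ev in events.values():
    trigger = entities[ev.trigger]
    print(ev.id, ev.type, repr(trigger.text), ev.args)
```

Character offsets in this sketch are treated as half-open spans into the document text, so `TEXT[start:end]` recovers the annotated string.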
The DNA methylation corpus is annotated following the GENIA Event corpus annotation guidelines:
Tomoko Ohta, Jin-Dong Kim and Jun’ichi Tsujii, Guidelines for event annotation, University of Tokyo Technical Report, 2007.
Tomoko Ohta, Sampo Pyysalo, Makoto Miwa and Jun’ichi Tsujii, Event Extraction for DNA Methylation, Journal of Biomedical Semantics 2(Suppl 5):S2, 2011. (Open Access)
GENIA DNA methylation corpus version 0.9: DNA_methylation_corpus-0.9.tar.gz