An introduction and description of the corpus is available here.
In recent weeks the committee has been notified of new errors in some aspects of the corpus (e.g. drug labeling errors, charoffset errors, etc.). The corpus has therefore been thoroughly re-evaluated, and the reported errors have been fixed. We apologize for any inconvenience.
A change has also been introduced in the new corpus: the label "interaction" has been replaced by a new label, "pair", which identifies all possible DDI candidate pairs appearing in a single sentence. Thanks to all the participants who sent their observations about inconsistencies or errors; we very much appreciate this valuable information, which helps us improve the DrugDDI corpus.
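As a rough illustration of what "all possible DDI candidate pairs" in a sentence means, the following minimal Python sketch enumerates every pair of drug mentions in one sentence. It assumes that pairs are unordered and that each drug mention has an identifier; the identifiers `e0`, `e1`, `e2` and the function name `candidate_pairs` are hypothetical and are not taken from the corpus.

```python
from itertools import combinations


def candidate_pairs(entity_ids):
    """Return every unordered pair of drug mentions found in one sentence."""
    # Assumption: a pair (a, b) is the same candidate as (b, a),
    # so each pair is listed once, in order of appearance.
    return list(combinations(entity_ids, 2))


# Hypothetical identifiers for three drug mentions in a single sentence.
for first, second in candidate_pairs(["e0", "e1", "e2"]):
    print(first, second)
```

Under this assumption, a sentence with n drug mentions yields n(n-1)/2 candidate pairs, and a sentence with fewer than two mentions yields none.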
The errors and inconsistencies detected in this format have been corrected. We apologize for any inconvenience.
The corpus has been generated in XML format, following the structure described in this document.
Participants are allowed to submit a maximum of 5 runs. Each run can include different sources of information and use different techniques. A submission file must be a txt file that includes all pairs of drugs (at the sentence level).
June 3, 2011 - NEW!! The Test dataset is available for registered participants here.
An example of a test dataset in Unified format can be found here.
An example of a test dataset in MMTx format can be found here.