Identification of research hypotheses and new knowledge from scientific literature
BMC Medical Informatics and Decision Making volume 18, Article number: 46 (2018)
Abstract
Background
Text mining (TM) methods have been used extensively to extract relations and events from the literature. In addition, TM techniques have been used to extract various types or dimensions of interpretative information, known as Meta-Knowledge (MK), from the context of relations and events, e.g., negation, speculation, certainty and knowledge type. However, most existing methods have focussed on the extraction of individual dimensions of MK, without investigating how they can be combined to obtain even richer contextual information. In this paper, we describe a novel, supervised method to extract new MK dimensions that encode Research Hypotheses (an author's intended knowledge gain) and New Knowledge (an author's findings). The method incorporates various features, including a combination of simple MK dimensions.
Methods
We identify previously explored dimensions and then use a random forest to combine these with linguistic features into a classification model. To facilitate evaluation of the model, we have enriched two existing corpora annotated with relations and events, i.e., a subset of the GENIA-MK corpus and the EU-ADR corpus, by adding attributes to encode whether each relation or event corresponds to Research Hypothesis or New Knowledge. In the GENIA-MK corpus, these new attributes complement simpler MK dimensions that had previously been annotated.
Results
We show that our approach is able to assign different types of MK dimensions to relations and events with a high degree of accuracy. Firstly, our method is able to improve upon the previously reported state-of-the-art performance for an existing dimension, i.e., Knowledge Type. Secondly, we also demonstrate high F1-scores in predicting the new dimensions of Research Hypothesis (GENIA: 0.914, EU-ADR: 0.802) and New Knowledge (GENIA: 0.829, EU-ADR: 0.836).
Conclusion
We have presented a novel approach for predicting New Knowledge and Research Hypothesis, which combines simple MK dimensions to achieve high F1-scores. The extraction of such information is valuable for a number of practical TM applications.
Background
The goal of information extraction (IE) is to automatically distil and structure associations from unstructured text, with the aim of making it easier to locate information of interest in huge volumes of text. Within biomedical research articles, the textual context of a particular piece of knowledge often provides clues as to its current status along the 'research journey' timeline. Sentences (1)–(3) below exemplify a number of different points along the research timeline regarding the establishment of an association between Interleukin-17 (IL-17) and psoriasis. The association is firstly introduced in (1) as a hypothesis to be investigated. In (2), which is taken from the same paper [1], the putative association is backed up by initial experimental evidence. Sentence (3) comes from a paper published 10 years later [2], by which time the association is presented as widely accepted knowledge, presumably on the basis of many further positive experimental results.
(1) 'To investigate the role of Interleukin-17 (IL-17) in the pathogenesis of psoriasis...'
(2) 'These findings indicate that up-regulated expression of IL-17 might be involved in the pathogenesis of psoriasis.'
(3) 'IL-17 is a critical factor in the pathogenesis of psoriasis and other inflammatory diseases.'
There is a strong need to identify different types of emerging knowledge, such as those shown in sentences (1)–(2), in a number of different scenarios. It has been shown elsewhere that incorporating this type of information improves the automated curation of biomedical networks and models [3].
In processing sentences (1)–(3) above, a typical IE system would firstly detect that Interleukin-17 and IL-17 are phrases that describe the same gene concept and that psoriasis represents a disease concept. Subsequently, the system would recognise that a specific association exists between these concepts. These associations may be binary relations between concepts, which encode that a specific type of association exists, or they may be events, which encode complex n-ary relations between a trigger word and multiple concepts or other events. Figure 1 shows the specific characteristics of both a relation and an event using the visualisation of the brat rapid annotation tool [4]. The output of the IE system would allow the location of all sentences within a large document collection, regardless of their varied phrasing, that explicitly mention the same association, or those mentioning other related types of associations, e.g., to find different genes that have an association with psoriasis. The structured associations that are extracted may subsequently be used as input to further stages of reasoning or data mining. Many IE systems would consider that sentences (1)–(3) each convey exactly the same information, since most such systems only take into account the key information and not the wider context. Recently, however, there has been a trend towards detecting various aspects of contextual/interpretative information (such as negation or speculation) automatically [5–8].
In this work, we focus on the automatic assignment of two interpretative dimensions to relations and events extracted by text mining tools. Specifically, we aim to determine whether or not each relation and event corresponds to a Research Hypothesis, as in sentence (1), or to New Knowledge, as in sentence (2). To the best of our knowledge, this work represents the first attempt to apply a supervised approach to detect this type of information at such a fine-grained level.
We envisage that the recognition of these two interpretative dimensions is valuable in tasks where the discovery of emerging knowledge is important. To demonstrate the utility and portability of our method, we show that it can be used to enrich instances of both events and relations.
Related work
The task of automatically classifying knowledge contained within scientific literature according to its intended interpretation has long been recognised as an important step towards helping researchers to make sense of the information reported, and to allow important details to be located in an efficient manner. Previous work, focussing either on general scientific text or biomedical text, has aimed to assign interpretative information to continuous textual units, varying in granularity from segments of sentences to complete paragraphs, but most frequently concerning complete sentences. Specific aspects of interpretation addressed have included negation [5], speculation [6–8], general information content/rhetorical intent, e.g., background, methods, results, insights, etc. [9–12] and the distinction between novel information and background knowledge [13, 14].
Despite the demonstrated utility of approaches such as the above, performing such classifications at the level of continuous text spans is not straightforward. For example, a single sentence or clause can introduce multiple types of information (e.g., several interactions or associations), each of which may have a different interpretation, in terms of speculation, negation, research novelty, etc. As can be seen from Fig. 1, events and relations can structure and categorise the potentially complex information that is described in a continuous text span. Following on from the successful development of IE systems that are able to extract both gene-disease relations [15–17] and biomolecular events [18, 19], there has been a growing interest in the task of assigning interpretative information to relations and events. However, given that a single sentence may contain multiple events or relations, the challenge is to determine whether and how the interpretation of each of these structures is affected by the presence of particular words or phrases in the sentence that denote negation or speculation, etc.
IE systems are typically developed by applying supervised or semi-supervised methods to annotated corpora marked up with relations and events. There have been several efforts to manually enrich corpora with interpretative information, such that it is possible to train models to determine automatically how particular types of contextual information in a sentence affect the interpretation of different events and relations. Most work on enriching relations and events has been focussed on one or two specific aspects of interpretation (e.g., negation [20, 21] and/or speculation [22, 23]). Subsequent work has shown that these types of information can be detected automatically [24, 25].
In contrast, work on Meta-Knowledge (MK) captures a wider range of contextual information, integrating and building upon various aspects of the above-mentioned schemes to create a number of separate 'dimensions' of information, which are aimed at capturing subtle differences in the interpretation of relations and events. Domain-specific versions of the MK scheme have been created to enrich complex event structures in two different domain corpora, i.e., the ACE-MK corpus [26], which enriches the general domain news-related events of the ACE2005 corpus [27], and the GENIA-MK corpus [28], which adds MK to the biomolecular interactions captured as events in the GENIA event corpus [22]. Recent work has focussed on the detection of uncertainty around events in the GENIA-MK corpus. Uncertainty was detected using a hybrid approach of rules and machine learning. The authors were able to show that incorporating uncertainty into a pathway modelling task led to an improvement in curator performance [3].
The GENIA-MK annotation scheme defines five distinct core dimensions of MK for events, each of which has a number of possible values, as shown in Fig. 2:
1. Knowledge Type, which categorises the knowledge that the author wishes to express into one of: Observation, Investigation, Analysis, Method, Fact or Other.
2. Knowledge Source, which encodes whether the author presents the knowledge as part of their own work (Current), or whether it is referring to previous work (Other).
3. Polarity, which is set to Positive if the event took place, and to Negative if it is negated, i.e., it did not take place.
4. Manner, which denotes the event's intensity, i.e., High, Low or Neutral.
5. Certainty Level or Uncertainty, which indicates how certain an event is. It may be certain (L3), probable (L2) or possible (L1).
These five dimensions are considered to be independent of one another, in that the value of one dimension does not affect the value of any other dimension. There may, however, be emergent correlations between the dimensions (i.e., an event with the MK value 'Knowledge Source=Other' is more frequently negated), which occur due to the characteristics of the events. Previous work using the GENIA-MK corpus has demonstrated the feasibility of automatically recognising one or more of the MK dimensions [29–31]. In addition to the five core dimensions, Thompson et al. [28] introduced the notion of hyperdimensions (i.e., New Knowledge and Hypothesis), which represent higher level dimensions of information whose values are determined according to specific combinations of values that are assigned to different core MK dimensions. These hyperdimensions are also represented in Fig. 2. We build upon these approaches in our own work to develop novel techniques for the recognition of New Knowledge and Hypothesis, which take into account several of the core MK dimensions described above, as well as other features pertaining to the structure of the event and sentence.
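To make the scheme concrete, the five core dimensions can be pictured as a small record attached to each event. The following is only an illustrative sketch: the class and field names are hypothetical and are not taken from the GENIA-MK distribution, and the intensity dimension is named Manner, as in the GENIA-MK scheme.

```python
from dataclasses import dataclass
from enum import Enum


class KnowledgeType(Enum):
    OBSERVATION = "Observation"
    INVESTIGATION = "Investigation"
    ANALYSIS = "Analysis"
    METHOD = "Method"
    FACT = "Fact"
    OTHER = "Other"


class KnowledgeSource(Enum):
    CURRENT = "Current"  # the authors' own work
    OTHER = "Other"      # previous work


class Polarity(Enum):
    POSITIVE = "Positive"
    NEGATIVE = "Negative"


class Manner(Enum):
    HIGH = "High"
    LOW = "Low"
    NEUTRAL = "Neutral"


class CertaintyLevel(Enum):
    L3 = "certain"
    L2 = "probable"
    L1 = "possible"


@dataclass(frozen=True)
class MetaKnowledge:
    """Hypothetical container for the five core MK values of one event."""
    knowledge_type: KnowledgeType
    knowledge_source: KnowledgeSource
    polarity: Polarity
    manner: Manner
    certainty: CertaintyLevel


# An event reported as a hedged analysis of the authors' own results.
example = MetaKnowledge(
    KnowledgeType.ANALYSIS,
    KnowledgeSource.CURRENT,
    Polarity.POSITIVE,
    Manner.NEUTRAL,
    CertaintyLevel.L2,
)
```

Each field is set independently of the others, which mirrors the independence of the core dimensions described above.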
Methods
Our work took as its starting point the MK hyperdimensions defined by Thompson et al. [28], since we are also interested in identifying relations and events that describe hypotheses or new knowledge. Nevertheless, we found a number of issues with the original work on these hyperdimensions. Firstly, Thompson et al. [28] did not provide clear definitions of 'Hypothesis' and 'New Knowledge'. In response, we have formulated concise definitions for each of them, as shown below. Secondly, by performing an analysis of events that takes into account these definitions, we found that it was not possible to reliably and consistently identify events that describe new knowledge or hypotheses based only on the values of the core MK dimensions. As such, we decided to carry out a new annotation effort to mark up both 'Research Hypothesis' and 'New Knowledge' as independent MK dimensions (i.e., their values do not necessarily have any dependence on the values of other core MK dimensions), and to explore supervised, rather than rule-based methods, to facilitate their automated recognition.
Annotation guidelines
The starting point for our novel annotation effort was our tightened definitions of Research Hypothesis and New Knowledge; our initial definitions were refined throughout the process of annotation. As the definitions and guidelines evolved, we asked the annotators to revisit previously annotated documents in each new round. Our final definitions are presented below:
Research Hypothesis: A relation or event is considered as a Research Hypothesis if it encompasses a statement of the authors' anticipated knowledge gain. This is shown in examples (1) and (2) in Table 1.
New Knowledge: A relation or event is considered as New Knowledge if it corresponds to a novel research outcome resulting from the work the author is describing, as per examples (3) and (4) in Table 1.
Whereas the value assigned to each of the core MK dimensions of Thompson et al. is completely independent of the values assigned to the other core dimensions, our newly introduced dimensions do not maintain this independence. Rather, Research Hypothesis and New Knowledge possess the property of mutual exclusivity, as an event or relation cannot be simultaneously both a Research Hypothesis and New Knowledge. We chose to enrich two different corpora with attributes encoding Research Hypothesis and New Knowledge, i.e., a subset of the biomolecular interactions annotated as events in the GENIA-MK corpus [28], and the biomarker-relevant relations involving genes, diseases and treatments in the EU-ADR corpus [23]. Leveraging the previously-added core MK annotations in the GENIA-MK corpus, we explored how these can contribute to the accurate recognition of New Knowledge and Research Hypothesis. Specifically, we have introduced new approaches for predicting the values of the core Knowledge Type and Knowledge Source dimensions, demonstrating an improvement over the former state of the art for Knowledge Type. We subsequently use supervised methods to automatically detect New Knowledge and Research Hypothesis, incorporating the values of Knowledge Type, Knowledge Source and Uncertainty as features into the trained models.
Corpora
The GENIA-MK corpus consists of 1000 MEDLINE abstracts on the subject of transcription factors in human blood cells, which have been annotated with a range of entities and events that provide detailed, structured information about various types of biomolecular interactions that are described in text. In the GENIA-MK corpus, values for all five core MK dimensions are already manually annotated for all of the 36,000 events. The MK annotation effort also involved the identification of 'clue words', i.e., words or phrases that provide evidence for the assignment of values for particular MK dimensions. For instance, the word 'suggest' would be annotated as a clue both for Uncertainty and Knowledge Type, as it indicates that the information encoded in the event is stated based on a speculative analysis of results.
The EU-ADR corpus consists of three sets of 100 MEDLINE abstracts, each obtained using different PubMed queries aimed at retrieving abstracts that are likely to contain three specific types of relations (i.e., gene-disease, gene-drug and drug-disease), the former two of which can be important in discovering how different types of genetic information influence disease susceptibility and treatment response. The original annotation task involved identifying three types of entities, i.e., targets (proteins, genes and variants), diseases and drugs, together with relationships between these entity types, where these are present. In contrast to the richness of the event representations in the GENIA-MK corpus, each relation annotation in the EU-ADR corpus consists only of links between entities of two specific types. Relations were annotated in 159 of the 300 abstracts selected for inclusion in the corpus.
Annotation of new knowledge and research hypothesis
As an initial step of our work, subsets of GENIA-MK and EU-ADR were manually enriched with additional annotations, which identify those events or relations corresponding to Research Hypotheses or New Knowledge. Since high quality annotations are key to ensuring that accurate supervised models can be trained, we engaged with a number of experts and carried out an exploratory annotation exercise prior to the final annotation effort, in order to ensure the highest possible inter-annotator agreement (IAA).
Initially, we worked with two domain experts, a text mining researcher and a medical professional. They added the novel MK annotations to events that had been automatically detected in sentences from full-text papers. We found, however, that there were some issues with this annotation set-up. Firstly, we found that events denoting Research Hypotheses and New Knowledge were very sparse in full papers. Secondly, we found that isolated sentences often provided insufficient context for annotators to decide accurately whether or not the event described new knowledge or a hypothesis. Finally, we found that errors in the automatically detected events were distracting the annotators' attention from the task at hand. Based on these findings, we decided not to pursue this approach, and instead focussed our annotation efforts on annotating Research Hypotheses and New Knowledge in abstracts containing gold-standard, expert-annotated events and relations, whose quality had previously been verified. Since abstracts also generally contain denser and more consolidated statements of New Knowledge and Research Hypotheses than full papers [32], we also expected that this approach would produce more useful training data.
We then employed two PhD students (both working in disciplines related to biological sciences) to carry out the next round of annotation work. We held regular meetings to discuss new annotations and provided feedback as necessary. A subset of the abstracts was doubly annotated by both annotators, allowing us to evaluate the annotation quality by calculating IAA using Cohen's Kappa [33].
Table 2, which shows IAA at three different points during the annotation process, illustrates a steady increase in IAA as time progressed and as more discussions were held, demonstrating a convergence towards a common understanding of the guidelines by the two annotators. We achieved a final agreement of above 0.8 on most dimensions, indicating a strong level of agreement [34]. Annotation of Research Hypothesis in the EU-ADR corpus achieved slightly lower agreement of 0.761, indicating moderate agreement between the annotators [34]. At the end of the annotation process, the annotators were asked to revisit their earlier annotations to make revisions based on their enhanced understanding of the guidelines. Remaining discrepancies were resolved by the lead author after consultation with both annotators.
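For readers unfamiliar with the agreement statistic, Cohen's Kappa compares the observed agreement between two annotators with the agreement expected by chance from their individual label frequencies. The sketch below is a minimal illustration; the function name and the toy labels are assumptions made for this example and do not come from the annotated corpora.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy example: H = Research Hypothesis, O = anything else.
annotator_1 = ["H", "H", "O", "O"]
annotator_2 = ["H", "O", "O", "O"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the observed agreement is 0.75 and the chance agreement is 0.5, so Kappa is 0.5.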
Each annotator marked up 112 abstracts from the EU-ADR corpus (70 of which were doubly annotated), and 100 abstracts from the GENIA-MK corpus (50 of which were doubly annotated). This resulted in a total of 150 GENIA-MK abstracts and 159 EU-ADR abstracts annotated with New Knowledge and Research Hypothesis. Statistics on the final corpus are shown in Table 3.
Baseline method for new knowledge and research hypothesis
Thompson et al. [28] propose a method for detecting new knowledge and hypothesis based on automatic inferences from core MK values. Their inferences state that an event will be an instance of new knowledge if the Knowledge Source dimension is equal to 'Current', the Uncertainty dimension is equal to 'L3' (equivalent to 'Certain' in our work, see below) and the Knowledge Type dimension is equal to either 'Observation' or 'Analysis'. Similarly, according to their inferences, an event will be an instance of Hypothesis if the Knowledge Type dimension is equal to 'Analysis' and Uncertainty is equal to either 'L2' or 'L1' (which are both equivalent to 'Uncertain' in our work, see below).
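The two inference rules described above are simple enough to write down directly. The following sketch encodes them as Boolean functions over string values; the function names and the string encoding of the MK values are assumptions made for illustration, not part of the original implementation.

```python
def baseline_new_knowledge(knowledge_source, certainty, knowledge_type):
    """Rule for New Knowledge: Current source, L3 certainty, Observation or Analysis."""
    return (
        knowledge_source == "Current"
        and certainty == "L3"
        and knowledge_type in {"Observation", "Analysis"}
    )


def baseline_hypothesis(certainty, knowledge_type):
    """Rule for Hypothesis: Analysis stated with L2 or L1 certainty."""
    return knowledge_type == "Analysis" and certainty in {"L1", "L2"}


# A confident observation from the authors' own work is New Knowledge.
is_new = baseline_new_knowledge("Current", "L3", "Observation")
# A hedged analysis is a Hypothesis under the baseline rules.
is_hyp = baseline_hypothesis("L2", "Analysis")
```

Because the first rule requires L3 and the second requires L1 or L2, no event can satisfy both rules at once.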
We use these automated inferences as a baseline for our techniques. To best reflect the work of Thompson et al. [28], we apply their manually annotated values of Knowledge Type, Uncertainty and Knowledge Source for the GENIA-MK corpus. This allows us to compare our own work with previous efforts, as well as providing a lower bound for the performance of a rule-based system, which we contrast with our supervised learning system, as introduced in the next section.
A supervised method for extracting new knowledge and research hypothesis
We took a supervised approach to annotating events with instances of our target dimensions of New Knowledge and Research Hypothesis. According to the previously mentioned intrinsic links to the core MK dimensions of Knowledge Source, Knowledge Type and Uncertainty, we incorporated the values of these dimensions as features that are used by our classifiers.
Uncertainty
For the Uncertainty dimension, we used an existing system [3]. Adopting their treatment of Uncertainty, we differ from Thompson et al. [28] in that we use only two levels (certain and uncertain), as opposed to their three levels (L3 = certain, L2 = probable and L1 = possible). Since our development of the original MK scheme, we have experimented with and discussed different levels of granularity for this dimension with domain experts, and have concluded that the differences between the two different levels of uncertainty in our original scheme (i.e., L1 and L2) are often too subtle to be of benefit in practical scenarios. Therefore, it was decided to focus instead on the binary distinction between certainty and uncertainty.
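The relationship between the three-level and two-level schemes amounts to a simple lookup. This is a minimal sketch, assuming string labels for both schemes; the names used here are hypothetical.

```python
# Assumed string labels; L1 and L2 both collapse to "Uncertain".
CERTAINTY_TO_BINARY = {"L3": "Certain", "L2": "Uncertain", "L1": "Uncertain"}


def to_binary_uncertainty(level):
    """Map a three-level certainty value (L1, L2, L3) onto the binary scheme."""
    return CERTAINTY_TO_BINARY[level]


collapsed = [to_binary_uncertainty(level) for level in ("L3", "L2", "L1")]
```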
Knowledge source
The Knowledge Source dimension distinguishes events that encode information originating from an author's own work (Knowledge Source = Current), from those describing work from an alternative source (Knowledge Source = Other). Such information is relevant to the identification of New Knowledge, as a relation or event that corresponds to information reported in background literature definitely cannot be classed as New Knowledge. Attribution by citation is a well-established practice in the scientific literature. Citations can be expressed heterogeneously between documents, but are typically expressed homogeneously within a single document, or a collection of similarly-sourced documents. We used regular expressions to identify citations following the work of Miwa et al. [35], in conjunction with a set of clue expressions that aim to detect background knowledge in cases where no citation is given. These include statements such as 'we previously showed…' or 'as seen in our former work'. Whereas Miwa et al. use a supervised learning method to detect Knowledge Source, we found that supervised learning approaches overfitted to the overwhelming majority class (Source = Current) in the GENIA-MK dataset. This meant that we suffered poor performance on unseen data, such as the EU-ADR corpus. To alleviate this, we simply used the regular expression feature as described above as an indicator of Knowledge Source being 'Other'. A list of our regular expressions and clue expressions is made available as part of the Additional files.
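A rule of this kind might look like the sketch below. The patterns are hypothetical stand-ins written for this illustration (a bracketed numeric citation, an 'et al.' mention, and the two clue phrases quoted above); the actual expressions are the ones distributed in the Additional files and are not reproduced here.

```python
import re

# Hypothetical citation patterns, not the authors' actual list.
CITATION_PATTERNS = [
    re.compile(r"\[\d+(?:\s*[,\u2013-]\s*\d+)*\]"),  # e.g. [3], [2, 5], [6-8]
    re.compile(r"\b[A-Z][A-Za-z-]+ et al\."),        # e.g. Miwa et al.
]

# Hypothetical clue expressions, based on the two phrases quoted in the text.
CLUE_PATTERNS = [
    re.compile(r"\bwe (?:have )?previously (?:showed|shown)\b", re.IGNORECASE),
    re.compile(r"\bas seen in our former work\b", re.IGNORECASE),
]


def knowledge_source(sentence):
    """Return 'Other' if a citation or background clue is found, else 'Current'."""
    for pattern in CITATION_PATTERNS + CLUE_PATTERNS:
        if pattern.search(sentence):
            return "Other"
    return "Current"


cited = knowledge_source("IL-17 has been linked to psoriasis [2, 5].")
own_work = knowledge_source("We observed increased expression of IL-17.")
```

Defaulting to 'Current' reflects the class imbalance noted above: the rule only needs to spot the rarer 'Other' cases.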
Knowledge type
For Knowledge Type, we used an implementation of the random forest algorithm [36] from the WEKA library [37]. We used the standard parameters of the random forest in the WEKA implementation. We used ten-fold cross validation for all experiments, and results are reported as the macro-average across the ten folds. We treat the identification of Knowledge Type as a multi-class classification problem and we took a supervised approach to categorising relations and events in the two corpora according to the values of the Knowledge Type dimension. To facilitate this, we used the following seven types of features to generate information about each event from GENIA-MK and relation from EU-ADR:
1. Sentence features, describing the sentence containing the relation or event.
2. Structural features, inspired by the structural differences of events.
3. Participant features, representing the participants in the relation or event.
4. Lexical features, capturing the presence of clue words.
5. Constituency features, corresponding to relationships between a clue and the relation or event, based on the output of a parser.
6. Dependency features, which capture relationships between a clue and the relation or event based on the dependency parse tree.
7. Parse tree features, which pertain to the structure of the dependency parse tree.
These features are further described in Table 4. To generate these features, we made use of the GENIA Tagger [38] to obtain part-of-speech (POS) tags, and the Enju parser [39] to compute syntactic parse trees.
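The ten-fold evaluation protocol mentioned above can be sketched independently of the classifier. The code below is a simplified illustration in plain Python, not the WEKA procedure used in the experiments: it assumes an unshuffled, unstratified split, and the function names are hypothetical.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; fold i holds out every k-th item from i."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    indices = list(range(n_items))
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def average_over_folds(fold_scores):
    """Average a per-fold score (e.g. F1) across all folds."""
    return sum(fold_scores) / len(fold_scores)


folds = list(k_fold_indices(25, k=10))
```

Each item is held out exactly once, so every instance contributes to exactly one test fold.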
Research hypotheses and new knowledge
We followed a similar approach to predicting Research Hypothesis and New Knowledge values to that described above for the recognition of Knowledge Type. We used the same features and also a random forest classifier. We incorporated additional features encoding the Knowledge Source, Knowledge Type and Uncertainty of each relation and event.
Clue lists, developed by the authors, were used for the detection of Knowledge Type, Knowledge Source and Uncertainty. For the detection of New Knowledge and Hypothesis, a combination of clues for Knowledge Type, Knowledge Source and Uncertainty was used. The exact clue lists are available in the Additional files.
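One way to picture the additional features is as a few categorical values appended to the feature vector already used for Knowledge Type. The sketch below uses a one-hot encoding, which is an assumption made for this illustration; the text does not specify how the three MK values were encoded, and all names here are hypothetical.

```python
KNOWLEDGE_SOURCES = ["Current", "Other"]
KNOWLEDGE_TYPES = ["Observation", "Investigation", "Analysis", "Method", "Fact", "Other"]
UNCERTAINTY_VALUES = ["Certain", "Uncertain"]


def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 list over a fixed vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown value: {value!r}")
    return [1 if value == item else 0 for item in vocabulary]


def add_mk_features(base_features, knowledge_source, knowledge_type, uncertainty):
    """Append the three MK dimensions to an existing feature vector."""
    return (
        list(base_features)
        + one_hot(knowledge_source, KNOWLEDGE_SOURCES)
        + one_hot(knowledge_type, KNOWLEDGE_TYPES)
        + one_hot(uncertainty, UNCERTAINTY_VALUES)
    )


# Two placeholder base features followed by ten MK indicator features.
vector = add_mk_features([0.5, 3], "Current", "Analysis", "Uncertain")
```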
Results
In this section, we present our experiments to detect the core Knowledge Type dimension, in which we determine the most appropriate feature subset to use, and also compare our approach to previous work. We then extend this approach to recognise New Knowledge and Research Hypothesis, and evaluate our results in terms of precision, recall and F1-score.
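As a reminder of how these three measures relate, the sketch below computes them for a single target class from gold and predicted labels, using the standard definitions. The function name and the toy labels are assumptions made for illustration.

```python
def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1-score for one class, from paired label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: H = Research Hypothesis, O = anything else.
gold = ["H", "H", "O", "O"]
predicted = ["H", "O", "H", "O"]
scores = precision_recall_f1(gold, predicted, positive="H")
```

Here one of the two predicted hypotheses is correct and one of the two gold hypotheses is found, so all three scores are 0.5.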
Our experiments to predict the correct values for the Knowledge Type dimension were carried out only using the events in the GENIA-MK corpus, given that Knowledge Type is only annotated in this corpus and not in EU-ADR. We performed an analysis of each feature subset to assess its impact on classifier performance, as shown in Table 5. It was established that removing each of the participant, dependency and parse tree features individually leads to a modest increase in F1-score. Nevertheless, in subsequent experiments, we found that removing all three features does not lead to an additional increase in performance. We therefore used all feature subsets except for the participant features in subsequent experiments, as this gave us the best overall score. By observing the isolated performance of each feature subset, we also determined that the lexical and structural features are both significant individual contributors to the final classification score. In Table 6, we compare the performance of our classifier in predicting each Knowledge Type value with the results obtained by the state-of-the-art method developed by Miwa et al. [31]. The results reveal that our approach achieves an increase in F1-score over Miwa et al. [31] by a minimum of 0.063 for the Other value, and a maximum of 0.113 for Method. We also see corresponding performance boosts in terms of precision and recall. Although we observe a small drop in recall for Fact and Method, this is offset by an increase in precision of 0.210 and 0.299, respectively.
To further investigate our improvement over Miwa et al., we swapped our classifier for an SVM, but used all the same features. The results of this are shown in Table 6. This experiment allowed us to compare the performance of our features with the same classification algorithm (SVM) as used by Miwa et al. We note that using the SVM with our features leads to a similar, but slightly worse performance in terms of F1-score than Miwa et al. on all categories except for Analysis. However, we do note an increase in precision for certain categories (Method, Investigation, Analysis) and recall for others (Observation, Analysis). As our features are tuned for performance with a random forest, this experiment demonstrates that different types of classifiers may require different feature sets to attain optimal performance.
To further understand the impact of our feature categories, we analysed the correlation of each feature with each Knowledge Type value. This allowed us to determine the most informative features for each Knowledge Type value, as displayed in Table 7. In addition to this, we calculated the average rank of each feature across all Knowledge Type values. This measure shows us the most globally useful features. The top features according to average rank are displayed in Table 8.
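The average-rank measure can be illustrated as follows. This sketch assumes that features are ranked within each class by the absolute value of their correlation, with rank 1 being the strongest; the text does not state the exact ranking procedure, and the feature names and numbers below are invented for the example.

```python
def average_ranks(correlations):
    """Map each feature to its mean rank across classes.

    `correlations` has the form {class_label: {feature_name: correlation}}.
    Within each class, features are ranked by absolute correlation (1 = strongest).
    """
    ranks = {}
    for per_feature in correlations.values():
        ordered = sorted(per_feature, key=lambda name: abs(per_feature[name]), reverse=True)
        for rank, name in enumerate(ordered, start=1):
            ranks.setdefault(name, []).append(rank)
    return {name: sum(values) / len(values) for name, values in ranks.items()}


# Invented correlations for three hypothetical features and two classes.
toy_correlations = {
    "Fact": {"clue_present": 0.9, "sentence_length": 0.1, "is_nested": -0.5},
    "Method": {"clue_present": 0.2, "sentence_length": 0.1, "is_nested": 0.6},
}
mean_rank = average_ranks(toy_correlations)
```

A feature with a low mean rank is informative across many classes, even if it is not the single best feature for any one of them.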
For the identification of New Knowledge and Research Hypothesis, we firstly performed 10-fold cross validation on each corpus (GENIA-MK and EU-ADR) and for each dimension of interest, yielding the results in Table 9. In our presentation of results, we term the negative class for New Knowledge as "Other Knowledge", as it covers a number of categories that we wish to exclude (e.g., background knowledge, irrelevant knowledge, supporting knowledge, etc.). We were able to classify Knowledge Type for relations in the EU-ADR corpus by setting the event and participant features to sensible static values, e.g., the number of participants in a relation is always two.
Discussion
In Table 5, we observed the effects of each feature subset on the overall classification score for Knowledge Type. We found that the structural, lexical and sentence features had particularly strong contributions. The structural features encoded information about the structure of the event and were particularly useful for identifying events that participate in other events. The lexical features depended on the identification of clue words that appeared in the context of relations and events, which provided important evidence to determine the most appropriate MK values to assign. However, the usefulness of this feature is directly tied to the comprehensiveness of the list of clues associated with each MK value.
In addition to the feature analysis in Table 5, we also provided additional analysis of each specific feature in Tables 7 and 8. In line with the results from Table 5, these tables demonstrate that the structural features were particularly informative for most classes, as well as the lexical, dependency and constituency features. It is interesting to note from Table 7 that no individual feature is particularly strongly correlated with each class label. This supports our ensemble approach and indicates that multiple feature sources are needed to achieve a high classification accuracy. In addition, we can see that the correlations drop fairly quickly for all classes, indicating that not all features are used for every class. Finally, we can see that different features occur in each column (with some repetition), indicating that certain features were more useful for specific classes.
For the classification of New Knowledge and Hypothesis, we incorporated features denoting the existing meta-knowledge values of the event for Knowledge Source, Knowledge Type and Uncertainty. Knowledge Source indicates whether an event is current to the research in question, or whether it describes background work. This may be especially helpful for the detection of new knowledge, since it is clear that any background work cannot be classified as new knowledge. Knowledge Type classifies events as falling into one of six categories, i.e., Fact, Method, Analysis, Investigation, Observation or Other. The Investigation category may have contributed to the classification of Hypothetical events, whereas Observation and Analysis may have helped to contribute to the detection of New Knowledge events. The Fact, Method and Other categories could have helped the system to determine that events did not convey either hyperdimension. Finally, Uncertainty describes whether an author presented their results with confidence in their accuracy, or with some hedging (e.g., use of the words may, possibly, perhaps, etc.). This dimension could have helped to contribute to the classification of hypotheses (where an author states that an event may occur) and new knowledge, where we expect an author to be certain about their results.
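The hedging idea behind the Uncertainty dimension can be illustrated with a simple clue lookup. This is a minimal sketch, not the feature extraction used in the study: the function name, the `Certain`/`Uncertain` labels and the whole-word matching rule are assumptions, and the clue list contains only the three example words given in the text.

```python
import re

# Only these three clue words are named in the text; a real clue
# list would be considerably longer.
HEDGING_CLUES = {"may", "possibly", "perhaps"}


def uncertainty_feature(sentence, clues=HEDGING_CLUES):
    """Return 'Uncertain' if a hedging clue occurs as a whole word, else 'Certain'."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return "Uncertain" if any(tok in clues for tok in tokens) else "Certain"


hedged = uncertainty_feature(
    "Down-regulation of MCP-1 expression by aspirin may result in decreased recruitment."
)
plain = uncertainty_feature("We analyzed the effects of aspirin.")
```

A value like this would be passed to the classifier alongside the Knowledge Source and Knowledge Type values rather than being used as a decision on its own.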
We compared our results to those of Miwa et al. (2012) in Table 6, where we showed a consistent improvement of precision, recall and F1-score across all categories. Their system used support vector machines (SVMs) for classification, with a set of features similar to our lexical and structural features. However, our work used an enhanced set of features as well as a random forest classifier, which is typically robust in high dimensional classification problems [36]. These two factors contributed to our system's improved performance. Our system yielded an average increase in precision of 0.156, but only yielded an average increase in recall of 0.04. This implies that the use of a random forest and additional features mainly helped to ensure that the system returned results which are consistently correct. For both the 'Fact' and 'Method' Knowledge Type values, our system yielded a slight dip in recall compared to previous work. However, this was coupled with an increase in precision of 0.210 and 0.298, respectively.
To understand the relative contributions made by our switches in both feature set and type of classifier, compared to previous work, we analysed the performance of our system when using an SVM with our features instead of a Random Forest. We attained a similar performance to Miwa et al. using our feature set and SVM, although some values were lower than those reported by Miwa et al. This implies that our decision to use a different type of classifier to Miwa et al. (i.e., Random Forest instead of SVM) was the primary reason behind our improved performance. Different feature sets are better suited to different types of classifiers, and our feature set was carefully selected (as documented in Table 5) to be performant with a Random Forest. Miwa et al.'s features were equally selected to perform well with an SVM. We have shown similar results in prior work for a task on detecting metaknowledge for negated bio-events [29], where we showed that tree-based methods, including the Random Forest, outperformed other techniques such as the SVM for detecting the negation dimension of metaknowledge.
We illustrated our results for the identification of the novel dimensions New Knowledge and Research Hypothesis in Table 9. These showed strong performance across both corpora and association types (events and relations). The results for the GENIA-MK corpus (events) outperformed those for the EU-ADR corpus (relations). This was most likely due to the difference in size between the corpora. There are over ten times more annotated events in the subset of GENIA-MK that we annotated than relations in the subset of EU-ADR (6899 events vs. 622 relations). The fact that we annotated all of the 159 abstracts available in the EU-ADR corpus and only 150 abstracts from GENIA-MK indicates that event structures are more densely packed in GENIA-MK than relations in EU-ADR.
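The density claim can be checked directly from the counts quoted in the paragraph above. The short calculation below uses only those four figures; the variable names are illustrative.

```python
# Counts as quoted in the text.
genia_events, genia_abstracts = 6899, 150
euadr_relations, euadr_abstracts = 622, 159

events_per_abstract = genia_events / genia_abstracts        # roughly 46 per abstract
relations_per_abstract = euadr_relations / euadr_abstracts  # roughly 3.9 per abstract
size_ratio = genia_events / euadr_relations                 # roughly 11, i.e. "over ten times"
```

On these figures a GENIA-MK abstract carries on the order of ten times as many annotated structures as an EU-ADR abstract, which is consistent with the size-based explanation for the performance gap.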
In particular, the EU-ADR corpus yielded a poor recall value for Research Hypotheses. There were only 38 examples of relations annotated as Research Hypothesis in the EU-ADR corpus. Our annotators reported that several relations occurring in hypothetical contexts appeared to have been missed by the original annotators of the EU-ADR corpus, which may be the cause of this sparsity. However, adding additional relations to the corpus was beyond the scope of the current work. The precision for the prediction of Research Hypothesis in the EU-ADR corpus was 1.00, indicating that of those relations automatically classified as Research Hypothesis, all were indeed Research Hypotheses (i.e., there were no false positives). It is commonly the case in minority class situations that a classifier will tend towards classifying instances as the majority class (i.e., favouring false negatives over false positives), so this result is expected. We chose not to perform subsampling of the majority class, as the density of Research Hypotheses or New Knowledge in our training data is reflective of the density we would expect in other biomedical abstracts.
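To make the relationship between perfect precision and limited recall concrete, the sketch below computes precision, recall and F1 from confusion counts. The function name and the counts are hypothetical: only the total of 38 positive examples comes from the text, and the split into 25 found and 13 missed is invented purely for illustration, not taken from the paper's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 38 positives in total, 25 found, 13 missed,
# and no false positives (which is what a precision of 1.00 means).
p, r, f1 = precision_recall_f1(tp=25, fp=0, fn=13)
```

With no false positives, precision is 1.0 regardless of how many positives are missed, so the F1-score is driven entirely by recall.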
Our corpus has focussed on identifying Research Hypotheses and New Knowledge in biomedical abstracts. However, it has been shown elsewhere that full texts contain more information than abstracts alone [40]. Whilst our future goal is to additionally facilitate the recognition of New Knowledge and Research Hypothesis in full papers, our decision to focus initially on abstracts was motivated by the findings of our earlier rounds of annotation. These initial annotation efforts revealed that the density of the types of MK that form the focus of the current paper is very low in full papers, and that they are consequently difficult for annotators to reliably identify. Therefore we chose to use abstracts, where the density was higher, since the availability of as many examples as possible of relevant MK was important for the development of our methods. We noted that abstracts fairly consistently mention the main Research Hypotheses and New Knowledge outcomes from a paper. However, further information may be available in the full paper that has not been mentioned in the abstract. To access this information we will need to further adapt our techniques and develop annotated corpora of full papers; this is left for future work.
Error analysis
Finally, we present an analysis of some common errors that our system makes and strategies for overcoming these in future work. In the following sentence, the event centred on "regulation" was marked as Non-Hypothetical by the annotators, but our system recognised it as a Hypothetical event.
To continue our investigation of the cellular events that occur following human CMV (HCMV) infection, we focused on the regulation of cellular activation following viral binding to human monocytes.
Event ID: | E1 |
Trigger: | regulation |
Theme: | activation following viral binding |
Cause: | N/A |
Clue: | focused |
It is likely that this event was marked as a hypothesis by the system because of the words 'investigation' and 'focused' that occur before it. However in this example, the main hypothesis that the annotators have marked is on the event centred on 'occur' preceding the event centred around 'focused'. To overcome this in future work, we could implement a classification strategy that takes into account MK information that has already been assigned to other events that occur in the context of the focussed event. A conditional random field or deep learning model could be used for sequence labelling to accomplish this.
The second error, which concerns the event centred on "effects" in the following sentence, was marked as Hypothetical by our annotators, but was classified as Non-Hypothetical by our system.
MATERIAL AND METHODS: In the present study, we analyzed the effects of CyA, aspirin, and indomethacin …
Event ID: | E2 |
Trigger: | effects |
Theme: | CyA, aspirin, and indomethacin |
Cause: | N/A |
Clue: | present study |
This event is clearly stating the subject of the authors' investigation, and so should be marked as hypothesis. It is likely that our system was confused by the preceding section heading, which led it to believe that this was part of the background or methods, and not a statement of the authors' intended research goals. To overcome this, we could identify these section headings automatically and either exclude them from the text to be analysed, or use them as extra features in our classification scheme.
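One way to identify such headings automatically is a pattern match on the start of the sentence. The following is a minimal sketch under the assumption that structured-abstract headings are upper-case labels terminated by a colon, as in the example above; the pattern and the function name `split_heading` are hypothetical and would need checking against real abstracts.

```python
import re

# Assumed heading shape: upper-case words (at least three characters
# in total) followed by a colon at the start of the sentence.
HEADING = re.compile(r"^\s*([A-Z][A-Z ]+[A-Z]):\s*")


def split_heading(sentence):
    """Return (heading or None, sentence text with any heading removed)."""
    match = HEADING.match(sentence)
    if match is None:
        return None, sentence
    return match.group(1), sentence[match.end():]


heading, body = split_heading(
    "MATERIAL AND METHODS: In the present study, we analyzed the effects of CyA"
)
```

The returned heading could either be discarded, so that only the sentence body is analysed, or kept as a categorical feature for the classifier.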
In our third example error, the event in the sentence below is centred on the phrase "result in decreased". The event was marked as new knowledge by the annotators, but the system was not able to recognise it as such.
Down-regulation of MCP-1 expression by aspirin may result in decreased recruitment of monocytes into the arterial intima below stressed EC.
Event ID: | E3 |
Trigger: | result in decreased |
Theme: | recruitment of monocytes |
Cause: | Down-regulation of MCP-1 expression by aspirin |
Clue: | N/A |
We believe that the cause of this classification error is the unusual event trigger: the majority of events only have a single verb as their trigger. To help the system to better determine cases in which such events denote new knowledge, it would be necessary to further increase our corpus size, such that the training set includes a wider variety of trigger types. A further factor affecting the inability of the system to determine the new knowledge classification may have been the lack of an appropriate new knowledge clue. In this instance, the annotators most likely determined this as an example of new knowledge due to information from the wider context of the discourse. We could improve our classifier by looking for clues in a wider window, or by looking for discourse clues that might indicate that the author is drawing their conclusions.
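The effect of widening the clue window can be sketched as follows. This is an illustrative sketch only: the function name `clues_in_window`, the token-based window and the example sentence are assumptions, and the window sizes used by the system itself are not given in the text.

```python
def clues_in_window(tokens, trigger_index, clues, window):
    """Return clue tokens found within `window` tokens either side of the trigger."""
    start = max(0, trigger_index - window)
    end = min(len(tokens), trigger_index + window + 1)
    return [
        tok
        for i, tok in enumerate(tokens[start:end], start)
        if i != trigger_index and tok.lower() in clues
    ]


# Hypothetical sentence; the trigger "result" is at index 6.
tokens = "these results indicate that aspirin may result in decreased recruitment".split()
narrow = clues_in_window(tokens, 6, {"indicate"}, window=2)
wide = clues_in_window(tokens, 6, {"indicate"}, window=4)
```

Here the narrow window misses the clue while the wider one finds it, at the cost of also admitting clues that belong to neighbouring events.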
The final example below concerns an event (centred on the verb "enhanced"), which was marked as 'other knowledge' by the annotators, but which the system determined to be an example of new knowledge.
Taken together, these data indicate that the unexpected expression of megakaryocytic genes is a specific property of immortalized cells that cannot be explained simply by enhanced expression of Spi-1 and/or Fli-1 genes
Event ID: | E4 |
Trigger: | expression |
Theme: | megakaryocytic genes |
Cause: | N/A |
Clue: | indicate |
Event ID: | E5 |
Trigger: | enhanced |
Theme: | expression of Spi-1 and… |
Cause: | E4 |
Clue: | N/A |
In this instance, the event is somewhat problematic as regards the assignment of MK. Although it is clear both that the sentence is a final statement, and that there is some new knowledge contained within it, the annotators chose not to mark the event with the trigger "enhanced" as new knowledge, indicating that they did not consider it to convey the main aspect of new knowledge in this sentence. Interestingly, however, both annotators agreed with the system that the event centred on the first instance of "expression" should be marked as an instance of new knowledge. The presence of the clue 'indicate' may be affecting the system's classification decision in both cases. A human annotator can distinguish that 'indicate' is most relevant to 'expression', rather than 'enhanced', whereas our system was unable to make this distinction.
Conclusions
We have presented a novel application of text mining techniques for the discovery of Research Hypotheses and New Knowledge at the level of events and relations. This constitutes the first study into the application of supervised methods to assign these interpretative aspects at such a fine-grained level. We firstly showed that by applying a Random Forest classifier using a new feature set, we were able to achieve a better performance than previous efforts in detecting Knowledge Type. We subsequently showed that the core MK dimensions of Knowledge Type, Knowledge Source and Uncertainty could feed into the training of classifiers that can predict whether events and relations represent Research Hypotheses and New Knowledge, with a high degree of accuracy. Our techniques can be incorporated into a system that allows researchers to quickly filter information contained within the abstracts of research articles, as shown in previous literature [3]. Our methods generally favour precision on the positive class (i.e., Research Hypothesis or New Knowledge). Specifically, we achieve a precision of between 0.863 and 1.00 on all of the corpus experiments. This demonstrates that our approach is successful in avoiding the identification of false positives, thus allowing researchers to be confident that instances of Research Hypothesis or New Knowledge identified by our method will usually be correct.
Notes
- Precision: the proportion of results returned by the system which are correct.
- Recall: the proportion of correct results returned by the system as a fraction of all the correct results that should have been found.
- F1-score: the balanced harmonic mean between precision and recall, providing a single overall measure of performance.
Abbreviations
- ADR: Adverse Drug Reaction
- F1: F1 Score (the harmonic mean between Precision and Recall)
- IE: Information Extraction
- IAA: Inter-Annotator Agreement
- MK: Meta-Knowledge
- P: Precision
- R: Recall
- SVM: Support Vector Machine
- TM: Text Mining
References
1. Jiawen L, Dongsheng L, Zhijian T. The expression of interleukin-17, interferon-gamma, and macrophage inflammatory protein-3 alpha mRNA in patients with psoriasis vulgaris. J Huazhong Univ Sci Technol [Med Sci]. 2004; 24(3):294–6. https://doi.org/10.1007/BF02832018.
2. Scharffetter-Kochanek K, Singh K, Tasdogan A, Wlaschek M, Gatzka M, Hainzl A, Peters T. Reduction of CD18 promotes expansion of inflammatory γδ T cells collaborating with CD4 T cells in chronic murine psoriasiform dermatitis. J Immunol. 2013; 191:5477–88. https://doi.org/10.4049/jimmunol.1300976.
3. Zerva C, Batista-Navarro R, Day P, Ananiadou S. Using uncertainty to link and rank evidence from biomedical literature for model curation. Bioinformatics. btx466. https://doi.org/10.1093/bioinformatics/btx466.
4. Stenetorp P, Pyysalo S, Topić G, Ohta T, Ananiadou S, Tsujii J. Brat: a web-based tool for NLP-assisted text annotation. In: Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics: 2012. p. 102–7.
5. Agarwal S, Yu H, Kohane I. BioNØT: A searchable database of biomedical negated sentences. BMC Bioinformatics. 2011; 12(1):420. https://doi.org/10.1186/1471-2105-12-420.
6. Medlock B, Briscoe T. Weakly supervised learning for hedge classification in scientific literature. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics. Prague, Czech Republic: Association for Computational Linguistics: 2007. p. 992–9. http://www.aclweb.org/anthology/P07-1125.
7. Vincze V, Szarvas G, Farkas R, Móra G, Csirik J. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes. BMC Bioinformatics. 2008; 9(11):1–9.
8. Malhotra A, Younesi E, Gurulingappa H, Hofmann-Apitius M. 'HypothesisFinder:' a strategy for the detection of speculative statements in scientific text. PLOS Comput Biol. 2013; 9(7):1–10. https://doi.org/10.1371/journal.pcbi.1003117.
9. Ruch P, Boyer C, Chichester C, Tbahriti I, Geissbühler A, Fabry P, Gobeill J, Pillet V, Rebholz-Schuhmann D, Lovis C, et al. Using argumentation to extract key sentences from biomedical abstracts. Int J Med Inform. 2007; 76(2):195–200.
10. Teufel S, Carletta J, Moens M. An annotation scheme for discourse-level argumentation in research articles. In: Proceedings of the Ninth Conference on European Chapter of the Association for Computational Linguistics. EACL '99. Stroudsburg: Association for Computational Linguistics: 1999. p. 110–7. https://doi.org/10.3115/977035.977051.
11. Mizuta Y, Collier N. Zone identification in biology articles as a basis for information extraction. In: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications. JNLPBA '04. Stroudsburg: Association for Computational Linguistics: 2004. p. 29–35. http://dl.acm.org/citation.cfm?id=1567594.1567600.
12. Burns G, Dasigi P, de Waard A, Hovy EH. Automated detection of discourse segment and experimental types from the text of cancer pathway results sections. Database. 2016; 2016:122. https://doi.org/10.1093/database/baw122.
13. Liakata M, Saha S, Dobnik S, Batchelor C, Rebholz-Schuhmann D. Automatic recognition of conceptualization zones in scientific articles and two life science applications. Bioinformatics. 2012; 28(7):991. https://doi.org/10.1093/bioinformatics/bts071.
14. Simsek D, Buckingham Shum S, Sandor A, De Liddo A, Ferguson R. Xip dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse. In: 1st International Workshop on Discourse-Centric Learning Analytics. Leuven: 2013.
15. Bundschus M, Dejori M, Stetter M, Tresp V, Kriegel HP. Extraction of semantic biomedical relations from text using conditional random fields. BMC Bioinformatics. 2008; 9(1):207.
16. Bravo A, Piñero J, Queralt-Rosinach N, Rautschka M, Furlong LI. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research. BMC Bioinformatics. 2015; 16(1):55.
17. Verspoor KM, Heo EG, Kang KY, Song M. Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts. BMC Med Inf Decis Mak. 2016; 16(1):68.
18. Nedellec C. Learning language in logic-genic interaction extraction challenge. In: Proceedings of the ICML-2005 Workshop on Learning Language in Logic (LLL05): 2005. p. 31–7.
19. Kim JD, Pyysalo S, Ohta T, Bossy R, Nguyen N, Tsujii J. Overview of BioNLP shared task 2011. In: Proceedings of the BioNLP Shared Task 2011 Workshop. Portland: Association for Computational Linguistics: 2011. p. 1–6.
20. Pyysalo S, Ginter F, Heimonen J, Björne J, Boberg J, Järvinen J, Salakoski T. BioInfer: a corpus for information extraction in the biomedical domain. BMC Bioinformatics. 2007; 8(1):50.
21. Sanchez-Graillet O, Poesio M. Negation of protein-protein interactions: analysis and extraction. Bioinformatics. 2007; 23(13):424. https://doi.org/10.1093/bioinformatics/btm184.
22. Kim JD, Ohta T, Tsujii J. Corpus annotation for mining biomedical events from literature. BMC Bioinformatics. 2008; 9(1):1–25.
23. Van Mulligen EM, Fourrier-Reglat A, Gurwitz D, Molokhia M, Nieto A, Trifiro G, Kors JA, Furlong LI. The EU-ADR corpus: annotated drugs, diseases, targets, and their relationships. J Biomed Inform. 2012; 45(5):879–84.
24. Björne J, Ginter F, Salakoski T. University of Turku in the BioNLP'11 shared task. BMC Bioinformatics. 2012; 13(11):4.
25. Kilicoglu H, Bergler S. Biological event composition. BMC Bioinformatics. 2012; 13(11):7.
26. Thompson P, Nawaz R, McNaught J, Ananiadou S. Enriching news events with meta-knowledge information. Lang Resour Eval. 2016:1–30. https://doi.org/10.1007/s10579-016-9344-9.
27. Walker C, Strassel S, Medero J, Maeda K. ACE 2005 multilingual training corpus. Philadelphia: Linguistic Data Consortium; 2006.
28. Thompson P, Nawaz R, McNaught J, Ananiadou S. Enriching a biomedical event corpus with meta-knowledge annotation. BMC Bioinformatics. 2011; 12(1):1–18.
29. Nawaz R, Thompson P, Ananiadou S. Negated BioEvents: Analysis and identification. BMC Bioinformatics. 2013; 14(1):14. https://doi.org/10.1186/1471-2105-14-14.
30. Nawaz R, Thompson P, Ananiadou S. Something old, something new: identifying knowledge source in bio-events. Int J Comput Linguist Appl. 2013; 4(1):129–44.
31. Miwa M, Thompson P, McNaught J, Kell DB, Ananiadou S. Extracting semantically enriched events from biomedical literature. BMC Bioinformatics. 2012; 13:108. https://doi.org/10.1186/1471-2105-13-108.
32. Nawaz R, Thompson P, Ananiadou S. Meta-knowledge annotation at the event level: Comparison between abstracts and full papers. In: Proceedings of the Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM 2012): 2012. p. 24–31.
33. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960; 20(1):37–46. https://doi.org/10.1177/001316446002000104.
34. McHugh ML. Interrater reliability: the kappa statistic. Biochemia medica. 2012; 22(3):276–82.
35. Miwa M, Sætre R, Kim JD, Tsujii J. Event extraction with complex event classification using rich features. J Bioinforma Comput Biol. 2010; 8(01):131–46.
36. Breiman L. Random forests. Machine Learning. 2001; 45(1):5–32.
37. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: An update. SIGKDD Explor Newsl. 2009; 11(1):10–18. https://doi.org/10.1145/1656274.1656278.
38. Tsuruoka Y, Tateishi Y, Kim JD, Ohta T, McNaught J, Ananiadou S, Tsujii J. Developing a robust part-of-speech tagger for biomedical text. Berlin, Heidelberg: Springer; 2005, pp. 382–92. Advances in Informatics: 10th Panhellenic Conference on Informatics, PCI 2005, Volas, Greece, November 11-13, 2005.
39. Miyao Y, Tsujii J. Feature forest models for probabilistic HPSG parsing. Comput Linguist. 2008; 34(1):35–80. https://doi.org/10.1162/coli.2008.34.1.35.
40. Schuemie MJ, Weeber M, Schijvenaars BJA, van Mulligen EM, van der Eijk CC, Jelier R, Mons B, Kors JA. Distribution of information in biomedical abstracts and full-text publications. Bioinformatics. 2004; 20(16):2597–604. https://doi.org/10.1093/bioinformatics/bth291.
Acknowledgements
The authors wish to thank the annotators involved in creating the dataset for this paper, without whom this research would not have been possible. Our thanks also go to the reviewers for their considered feedback on our research.
Funding
The authors of this work were funded by the European Commission (an Open Mining Infrastructure for Text and Data. OpenMinTeD. Grant: 654021), the Medical Research Council (Manchester Molecular Pathology Innovation Centre. MMPathIC Grant: MR/N00583X/1) and the Biotechnology and Biological Sciences Research Council (Enriching Metabolic PATHwaY models with evidence from the literature. EMPATHY. Grant: BB/M006891/1). The funders played no part in either the design of the study or the collection, analysis, and interpretation of data, or in writing the manuscript.
Availability of data and materials
The datasets generated and analysed during the current study are available as Additional files to this paper.
Author information
Contributions
MS ran the main experiments, performed the analysis of the results and participated in authoring the paper. RB helped with the design of the experiments and authoring the paper. PT contributed work on the preparation of the EU-ADR corpus as well as participating in the authorship of the paper. RN contributed to the experimental design, guidelines for the annotators and participated in the authorship of the paper. JM and SA jointly supervised the research and participated in authoring the paper. All authors read and approved the final version of this manuscript prior to publication.
Ethics declarations
Ethics approval and consent to participate
No ethics approval was required for any element of this study.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional files
Additional file 1
The annotation guidelines that were given to annotators for reference. (PDF 830 kb)
Additional file 2
A table providing an in depth description of each feature. (PDF 32 kb)
Additional file 3
Read me documentation explaining the structure of the clue files. (TXT 4 kb)
Additional file 4
The clues used to detect the Analysis component of the Knowledge Type meta-knowledge dimension. (FILE 3 kb)
Additional file 5
The clues used to detect the Fact component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 6
The clues used to detect the Investigation component of the Knowledge Type meta-knowledge dimension. (FILE 2 kb)
Additional file 7
The clues used to detect the Method component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 8
The clues used to detect the Observation component of the Knowledge Type meta-knowledge dimension. (FILE 4 kb)
Additional file 9
The clues used to detect the Other component of the Knowledge Source meta-knowledge dimension. (FILE 1 kb)
Additional file 10
The clues used to detect the Uncertain component of the Certainty Level meta-knowledge dimension. (FILE 4 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Shardlow, M., Batista-Navarro, R., Thompson, P. et al. Identification of research hypotheses and new knowledge from scientific literature. BMC Med Inform Decis Mak 18, 46 (2018). https://doi.org/10.1186/s12911-018-0639-1
Keywords
- Text mining
- Events
- Meta-knowledge
- Hypothesis
- New knowledge
Source: https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-018-0639-1