Distant Supervision: Mike Mintz, Steven Bills, Rion Snow and Dan Jurafsky
The authors propose an approach [called distant supervision] for relation extraction. This algorithm combines the advantages of supervised information extraction and unsupervised information extraction to achieve greater precision.
Apart from this, they also analyze feature performance to better understand the roles of lexical and syntactic features. Some of the key observations from this research are:
1) A combination of syntactic and lexical features offers a substantial improvement in relation extraction precision over either of these feature sets on its own.
To construct the classifier, negative training data is needed. For this, the authors of the paper build a feature vector for an "unrelated" relation during the training phase: they randomly select entity pairs that do not appear in any Freebase relation and extract features for them. Care must be taken when sampling these unrelated pairs, because a skewed distribution can reduce precision.
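The negative-sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_negative_pairs`, the toy knowledge base, and the choice to treat a pair as related in either argument order are all hypothetical.

```python
import random


def sample_negative_pairs(cooccurring_pairs, kb_relations, n, seed=0):
    """Sample up to n entity pairs that appear in no knowledge-base relation.

    cooccurring_pairs: (entity1, entity2) tuples seen together in a sentence.
    kb_relations: dict mapping a relation name to a list of entity pairs
                  (a toy stand-in for Freebase; hypothetical structure).
    """
    # Assumption: a pair counts as related regardless of argument order.
    related = set()
    for pairs in kb_relations.values():
        for a, b in pairs:
            related.add((a, b))
            related.add((b, a))

    # Deduplicate and sort so the sample is reproducible for a given seed.
    unrelated = sorted({p for p in cooccurring_pairs if p not in related})
    rng = random.Random(seed)
    return rng.sample(unrelated, min(n, len(unrelated)))


toy_kb = {
    "place_of_birth": [("Edwin Hubble", "Marshfield")],
    "director": [("Steven Spielberg", "Saving Private Ryan")],
}
toy_pairs = [
    ("Edwin Hubble", "Marshfield"),
    ("Marshfield", "Edwin Hubble"),
    ("Edwin Hubble", "Steven Spielberg"),
    ("Marshfield", "Saving Private Ryan"),
]
negatives = sample_negative_pairs(toy_pairs, toy_kb, n=5)
```

Sampling uniformly from the unrelated pairs is only one possible choice. How many negatives to draw relative to the positive examples is exactly the distribution question the paragraph above warns about.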
Consider the statements "Astronomer Edwin Hubble was born in Marshfield, Missouri" and "Astronomer Edwin Hubble took birth in Marshfield, Missouri". Both these sentences convey the exact same thing. Similarly, consider the statements "The critic wrote a scathing review" and "A scathing review was written by the critic". One statement is in active voice and the other in passive voice.
Even though the sentences in each pair convey the same thing, a different set of features must be conjoined for each of them in order to extract the relation. This is a computationally expensive process.
If, instead, we could identify the correspondence between such sentences beforehand, we could reduce the number of computations by almost half.
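To make the point concrete, here is a simplified sketch of a conjoined lexical feature. It assumes the feature is just the word window to the left of the first entity, the words between the two entities, and the window to the right of the second entity, joined into one string. The function names and the whitespace tokenizer are hypothetical, and the part-of-speech and syntactic components used in the paper are left out.

```python
def find_span(tokens, entity_tokens):
    """Return (start, end) of the first occurrence of entity_tokens in tokens."""
    n = len(entity_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == entity_tokens:
            return i, i + n
    raise ValueError(f"entity {entity_tokens!r} not found")


def lexical_feature(sentence, e1, e2, k=1):
    """Conjoin left window, middle words and right window into one feature.

    Assumes e1 occurs before e2 in the sentence (a simplification).
    """
    # Naive tokenizer: split on whitespace, treating commas as tokens.
    tokens = sentence.replace(",", " ,").split()
    s1, t1 = find_span(tokens, e1.split())
    s2, t2 = find_span(tokens, e2.split())
    left = tokens[max(0, s1 - k):s1]
    middle = tokens[t1:s2]
    right = tokens[t2:t2 + k]
    return "|".join(" ".join(part) for part in (left, middle, right))


f1 = lexical_feature(
    "Astronomer Edwin Hubble was born in Marshfield, Missouri",
    "Edwin Hubble", "Marshfield",
)
f2 = lexical_feature(
    "Astronomer Edwin Hubble took birth in Marshfield, Missouri",
    "Edwin Hubble", "Marshfield",
)
print(f1)
print(f2)
```

Because the whole conjunction must match exactly, the two paraphrases produce two distinct features, even though they express the same place-of-birth relation. Each surface variant therefore has to be extracted and learned separately.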
In the research paper "Answer Extraction as