Publications

Learning to Rank Implicit Entities on Twitter

Hawre Hosseini and Ebrahim Bagheri
Reference:
Links to Publication: [www][pdf]
Abstract:
Linking textual content to entities from the knowledge graph has received increasing attention. In this task, surface form representations of entities, e.g., terms or phrases, are disambiguated and linked to appropriate entities. This allows textual content, e.g., social user-generated content, to be interpreted and reasoned on at a higher semantic level. However, recent research has shown that at least 15% of social user-generated content does not have an explicit surface form representation of the entities that it discusses. In other words, the subject of the content is only implied. For such cases, existing entity linking methods, known as explicit entity linking, cannot perform linking because the entity surface form is missing. In this paper, we investigate how implicit entities within social content can be identified and linked. The contributions of our work include (1) modeling the problem of implicit entity linking as a learning-to-rank problem where knowledge graph entities are ranked based on their relevance to the input tweet, (2) the introduction and systematic classification of appropriate features for identifying implicit entities, (3) extensive evaluation of the proposed approach in comparison with the existing state of the art as well as feature analysis over the proposed features, and (4) a qualitative assessment of the root causes of mislabeled instances in our experiments and a careful discussion of how mislabeled entity links can be addressed as part of future work. In our experiments, we show that our proposed features are able to improve on the state of the art on the standard Precision at 1 (P@1) metric.
Bibtex Entry:
@article{ipm2021a,
  title = {Learning to Rank Implicit Entities on Twitter},
  journal = {Information Processing and Management},
  author = {Hawre Hosseini and Ebrahim Bagheri},
  abstract = {Linking textual content to entities from the knowledge graph has received increasing attention. In this task, surface form representations of entities, e.g., terms or phrases, are disambiguated and linked to appropriate entities. This allows textual content, e.g., social user-generated content, to be interpreted and reasoned on at a higher semantic level. However, recent research has shown that at least 15\% of social user-generated content does not have an explicit surface form representation of the entities that it discusses. In other words, the subject of the content is only implied. For such cases, existing entity linking methods, known as explicit entity linking, cannot perform linking because the entity surface form is missing. In this paper, we investigate how implicit entities within social content can be identified and linked. The contributions of our work include (1) modeling the problem of implicit entity linking as a learning-to-rank problem where knowledge graph entities are ranked based on their relevance to the input tweet, (2) the introduction and systematic classification of appropriate features for identifying implicit entities, (3) extensive evaluation of the proposed approach in comparison with the existing state of the art as well as feature analysis over the proposed features, and (4) a qualitative assessment of the root causes of mislabeled instances in our experiments and a careful discussion of how mislabeled entity links can be addressed as part of future work. In our experiments, we show that our proposed features are able to improve on the state of the art on the standard Precision at 1 (P@1) metric.},
  year = {2021},
  webpdf = {http://ls3.rnet.ryerson.ca/wiki/images/8/82/Ipm2021a.pdf},
  url = {https://www.journals.elsevier.com/information-processing-and-management}
}



