Publications

Lexical Semantic Relatedness for Twitter Analytics

Yue Feng, Hossein Fani, Ebrahim Bagheri, and Jelena Jovanovic
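Reference: Yue Feng, Hossein Fani, Ebrahim Bagheri, and Jelena Jovanovic. Lexical Semantic Relatedness for Twitter Analytics. In 27th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2015, Vietri sul Mare, Italy, November 9-11, 2015, pages 202-209.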
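Links to Publication: [doi] 10.1109/ICTAI.2015.41, [www] http://dx.doi.org/10.1109/ICTAI.2015.41, [pdf] http://ls3.rnet.ryerson.ca/papers/Lexical_Semantic_Relatedness_for_Twitter_Analytics-ictai15.pdf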
Abstract:
Existing work in the semantic relatedness literature has already considered various information sources such as WordNet, Wikipedia and Web search engines to identify the semantic relatedness between two words. We will show that existing semantic relatedness measures might not be directly applicable to microblogging content such as tweets due to i) the informality and short length of microblogging content, which can lead to a shift in the meaning of words when they are used in microblog posts, and ii) the presence of non-dictionary words whose semantics are defined or evolved by the Twitter community. Therefore, we propose the Twitter Space Semantic Relatedness (TSSR) technique, which relies on the latent relation hypothesis to measure the semantic relatedness of words on Twitter. We construct a graph representation of terms in tweets and apply a random walk procedure to produce a stationary distribution for each word, which is the basis for the relatedness calculation. Our experiments examine TSSR from three different perspectives and show that TSSR is better suited for Twitter analytics than the standard semantic relatedness techniques.
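The pipeline outlined in the abstract (term graph, random walk, per-word stationary distribution, relatedness score) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the graph is built, which random walk is used, or how two distributions are compared. The co-occurrence edges, the restart probability of 0.15, the cosine comparison, and the names `build_graph`, `walk_distribution`, and `relatedness` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_graph(tweets):
    """Assumed graph: terms are nodes, edge weight = number of tweets containing both terms."""
    graph = defaultdict(lambda: defaultdict(float))
    for tweet in tweets:
        terms = sorted(set(tweet.lower().split()))
        for a, b in combinations(terms, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def walk_distribution(graph, start, restart=0.15, iterations=100):
    """Assumed walk: random walk with restart at `start`, iterated to an approximate stationary distribution."""
    if start not in graph:
        raise KeyError(start)
    dist = {node: 0.0 for node in graph}
    dist[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in dist.items():
            if mass == 0.0:
                continue
            total = sum(graph[node].values())
            # Spread the non-restarting share of the mass over weighted neighbours.
            for neighbour, weight in graph[node].items():
                nxt[neighbour] += (1.0 - restart) * mass * weight / total
        # The restarting share always returns to the start word.
        nxt[start] += restart
        dist = nxt
    return dist


def relatedness(graph, word1, word2):
    """Assumed score: cosine similarity between the two words' walk distributions."""
    p = walk_distribution(graph, word1)
    q = walk_distribution(graph, word2)
    dot = sum(p[node] * q[node] for node in p)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Hypothetical toy tweets, not data from the paper.
    sample = [
        "lol that game was hilarious",
        "haha that game was hilarious",
        "lol so funny haha",
        "meeting agenda for monday",
        "monday meeting moved again",
    ]
    g = build_graph(sample)
    print(relatedness(g, "lol", "haha"))
    print(relatedness(g, "lol", "meeting"))
```

In this toy setting, words that share tweet contexts end up with overlapping walk distributions, while words from unrelated tweets do not, which is the intuition behind using Twitter itself as the information source for non-dictionary terms.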
BibTeX Entry:
@inproceedings{DBLP:conf/ictai/FengFBJ15,
  author    = {Yue Feng and Hossein Fani and Ebrahim Bagheri and Jelena Jovanovic},
  title     = {Lexical Semantic Relatedness for Twitter Analytics},
  booktitle = {27th {IEEE} International Conference on Tools with Artificial Intelligence, {ICTAI} 2015, Vietri sul Mare, Italy, November 9-11, 2015},
  pages     = {202--209},
  year      = {2015},
  crossref  = {DBLP:conf/ictai/2015},
  url       = {http://dx.doi.org/10.1109/ICTAI.2015.41},
  doi       = {10.1109/ICTAI.2015.41},
  webpdf    = {http://ls3.rnet.ryerson.ca/papers/Lexical_Semantic_Relatedness_for_Twitter_Analytics-ictai15.pdf},
  timestamp = {Wed, 04 May 2016 12:17:10 +0200},
  biburl    = {http://dblp.uni-trier.de/rec/bib/conf/ictai/FengFBJ15},
  bibsource = {dblp computer science bibliography, http://dblp.org},
  abstract  = {Existing work in the semantic relatedness literature has already considered various information sources such as WordNet, Wikipedia and Web search engines to identify the semantic relatedness between two words. We will show that existing semantic relatedness measures might not be directly applicable to microblogging content such as tweets due to i) the informality and short length of microblogging content, which can lead to shift in the meaning of words when used in microblog posts, ii) the presence of non-dictionary words that have their semantics defined/evolved by the Twitter community. Therefore, we propose the Twitter Space Semantic Relatedness (TSSR) technique that relies on the latent relation hypothesis to measure semantic relatedness of words on Twitter. We construct a graph representation of terms in tweets and apply a random walk procedure to produce a stationary distribution for each word, which is the basis for relatedness calculation. Our experiments examine TSSR from three different perspectives and show that TSSR is better suited for Twitter analytics compared to the standard semantic relatedness techniques.}
}



