The abundance of user-generated content on social networks provides the opportunity to build models that can accurately and effectively extract, mine and predict users’ interests, in the hope of enabling more effective user engagement, better delivery of appropriate services and higher user satisfaction. While traditional methods for building user profiles relied on AI-based preference elicitation techniques that users could consider intrusive and undesirable, more recent advances focus on non-intrusive yet accurate ways of determining users’ interests and preferences. In this tutorial, we will cover five important aspects related to the effective mining of user interests:
- The information sources that are used for extracting user interests
- Various types of user interest profiles that have been proposed in the literature
- Techniques that have been adopted or proposed for mining user interests
- The scalability and resource requirements of the state-of-the-art methods
- The evaluation methodologies that are adopted in the literature for validating the appropriateness of the mined user interest profiles
We will also introduce existing challenges, open research questions and exciting opportunities for further work.
SCHEDULE [Presentation Slides]
Session A [30 Minutes]: Background and Introduction to Theory of User Interest Mining
Session B [120 Minutes]: Techniques and Methods in User Interest Mining from Social Networks
Session C [30 Minutes]: Evaluation Methodologies, Future Directions and Open Challenges
MOTIVATION AND OVERVIEW
Mining user interests from user behavioral data is critical for applications such as online advertising. Based on user interests, service providers such as advertisers can significantly reduce service delivery costs by offering the most relevant products (e.g., ads) to their customers. The challenge of accurately and efficiently identifying user interests has received increasing attention in the past several years. Early approaches were based on explicit input from individuals about their own interests. To avoid the extra burden of manually filling in and maintaining interest profiles, most methods in the past two decades have focused on techniques that automatically and unobtrusively determine users’ interests from behavioral data such as browsing history, page visits, clicked links, searches performed and the topics users interact with. With the emergence and growing popularity of social networks such as blogging systems, wikis, social bookmarking, and microblogging services, many users engage extensively with at least some of these applications to express their feelings and views about a wide variety of social events and topics as they happen in real time by commenting, tagging, joining, sharing, liking, and publishing posts. This has made social networks an exciting and unique source of information about users’ interests. For instance, in Twitter data from the first week of March 2019, the rivalry between the two English Premier League soccer clubs Tottenham Hotspur and Arsenal is a topic that attracted a lot of discussion and interest.
Techniques that can automatically detect such topics and model users’ interests towards them from online social networks are highly important and have the potential to improve the quality of applications built on user modeling, such as filtering Twitter streams, news recommendation and retweet prediction, among others. In this tutorial, we comprehensively introduce different strategies proposed in the literature, including our own work, for mining user interests from social networks with respect to the following five perspectives:
- Information Sources: The type of information sources used for extracting user interests from within social networks such as textual content (comments, hashtags), social network structure, and images. Additionally, we will review external background knowledge sources such as semantic web resources and knowledge graphs that have been incorporated by some researchers to enhance the accuracy of user profiles.
- Profile Types: Most work on user interest mining from social networks extracts users’ explicit interests, which are directly observable from user content. However, given the increasingly noticeable free-rider problem, some other techniques focus on passive users and extract their implicit interests by considering the interaction patterns between users and topics. Another line of work is dedicated to predicting users’ future interests instead of modeling their current or past interests. These works primarily focus on predicting whether, and which, users would be interested in future topics on social networks. The accurate identification of users’ future interests on social networks allows one to plan ahead by studying how users will react if certain topics emerge in the future.
- Underlying Techniques: Previous methods have employed different techniques to build user profiles including neural embeddings, collaborative filtering, topic modeling, link prediction, regression, graph-based methods and Semantic Web technologies. We will review the techniques that have been used for identifying user interests and their different architectural variations.
- Scalability and Resource Requirements: Scalability is fundamental to user interest mining in order to accommodate torrents of social content. To this end, we provide a comprehensive overview of the speed-accuracy (efficiency-accuracy) trade-off that existing techniques in the literature face when building user interest profiles. In particular, we present a critical review of which techniques scale to online processing of massive streaming social content and which are limited to offline settings.
- Evaluation Methodology: Intrinsic and extrinsic evaluation are the two main evaluation approaches that have been widely adopted in the literature. Intrinsic evaluation assesses the quality of the constructed user interest profiles based on user studies, while extrinsic evaluation measures the quality of the user interest profiles by looking at their impact on the effectiveness of other applications such as news recommendation and retweet prediction. We will review how each of these evaluation methodologies has been used in the literature.
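To make the profile-type and evaluation perspectives above more concrete, here is a minimal Python sketch. It is not a method from the tutorial. It assumes that hashtag frequency is an acceptable proxy for a user's explicit interests, and that an extrinsic evaluation can be approximated by ranking candidate items by cosine similarity to the profile and measuring precision at k. All function names, posts, candidate items and relevance labels are hypothetical toy values.

```python
from collections import Counter
import math
import re

HASHTAG = re.compile(r"#(\w+)")

def explicit_profile(posts):
    """Explicit interest profile: normalized hashtag frequencies over a user's posts."""
    counts = Counter(tag.lower() for post in posts for tag in HASHTAG.findall(post))
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}

def cosine(a, b):
    """Cosine similarity between two sparse topic-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def precision_at_k(profile, candidates, relevant, k):
    """Extrinsic check: rank items by similarity to the profile, then
    return the fraction of the top-k items the user actually engaged with."""
    ranked = sorted(candidates, key=lambda item: cosine(profile, candidates[item]), reverse=True)
    return sum(1 for item in ranked[:k] if item in relevant) / k

# Hypothetical posts by one user (toy data, not real tweets).
posts = [
    "What a derby! #Arsenal #PremierLeague",
    "Late equaliser in the derby #Tottenham #PremierLeague",
    "Trying a new pasta recipe #cooking",
]
profile = explicit_profile(posts)

# Hypothetical candidate news items described by topic weights.
candidates = {
    "match_report": {"premierleague": 1.0, "arsenal": 0.5},
    "recipe_roundup": {"cooking": 1.0},
    "election_news": {"politics": 1.0},
}
relevant = {"match_report", "recipe_roundup"}  # assumed held-out engagement
score = precision_at_k(profile, candidates, relevant, k=2)
print(profile, score)
```

An implicit or future interest profile would replace `explicit_profile` with a model that also draws on interaction patterns or temporal signals, while the extrinsic evaluation step could stay the same.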
Dr. Fattane Zarrinkalam is a Postdoctoral Fellow at the Laboratory of Systems, Software and Semantics (LS3) at Ryerson University, where she works on projects related to Semantic-enabled Social Network Analysis. During her PhD studies, she focused on the identification of social media users’ interests based on their individual and collective behavior on social networks, especially Twitter. She has published her work in venues such as CIKM, ESWC and ECIR. In addition, she has published papers in premier journals including Information Retrieval and Information Processing and Management. Further, during her PhD, she was involved in two patent applications that were filed with USPTO.
Hossein Fani is a PhD student at the University of New Brunswick and research assistant at the Laboratory of Systems, Software and Semantics (LS3) at Ryerson University, Canada. Hossein has worked in the broad area of Social Network Analytics with special attention to content-based and temporal user community identification. Hossein has extensively published during his PhD studies in venues such as CIKM, ECIR and WSDM. His peer-reviewed journal publications also appear in Wiley’s Computational Intelligence and Springer’s Social Network Analysis and Mining. His PhD work has resulted in a provisional patent with USPTO. He has also reviewed for conferences such as ECIR and NAACL.
Dr. Ebrahim Bagheri is an Associate Professor and the Director of the Laboratory for Systems, Software and Semantics (LS3) at Ryerson University. He also holds a Canada Research Chair (Tier II) in Software and Semantic Computing as well as an NSERC Industrial Research Chair in Social Media Analytics. He has been PI on projects worth over $8M funded by partners such as NSERC, AIF and IBM. Most recently, in 2018, he was the Program Committee co-Chair for the Canadian Conference on Artificial Intelligence, the Industry Program Committee co-Chair at the IEEE/ACM International Conference on ASONAM, and an Area Chair for NAACL-HLT 2019. He also serves on the Program Committee of venues such as RecSys and ICWSM, as well as guest editor for international journals such as Information Systems and Information Processing and Management.