Publications

Learning Event Count Models with Application to Affiliation Ranking

Tam T. Nguyen and Ebrahim Bagheri
Reference:
Links to Publication: [www][pdf]
Abstract:
Event count prediction is a class of problems in time series analysis that has been extensively studied over the years. Its applications range from predicting the number of publications in the scientific community to predicting ATM cash withdrawal transactions in the banking industry. However, in applied data science problems, event count prediction models often run into difficulties on real-world data for two reasons: the data violates the Poisson distribution assumption, i.e., that the rate at which events occur is constant, and the data is relatively sparse, i.e., only a few event count values are greater than zero. Traditional techniques do not work well under these two conditions. To overcome these limitations, some researchers have proposed generic autoregressive (AR) models for event count prediction, which work with non-constant event occurrence rates. Because AR models use only historical event counts for forecasting, they may not be flexible enough to incorporate domain knowledge. Moreover, AR models may not work well with relatively short time series. To overcome these challenges, we propose a machine learning approach to the event count prediction problem. We benchmark our proposed solution on the KDD Cup 2016 dataset by formalizing affiliation ranking as an event count time series prediction problem. We map the time series onto a high-dimensional state space and systematically apply state-of-the-art machine learning algorithms to predict event counts. We then compare our proposed approach against solutions in the KDD Cup 2016 competition and show that our work outperforms the best models in this competition with an NDCG@20 score of 0.7573.
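The abstract reports its result as an NDCG@20 score. As a rough illustration of what that metric measures, here is a minimal sketch of NDCG@k in Python. It assumes the common linear-gain, log2-discount form of DCG; the exact variant used in the KDD Cup 2016 evaluation is not stated on this page, and the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, not taken from the paper.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items.

    Assumes linear gain and a log2 rank discount (one common variant).
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(true_scores, predicted_scores, k=20):
    """NDCG@k: DCG of the predicted ordering divided by the ideal DCG."""
    # Rank items by descending predicted score, then read off their true scores.
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_scores[i] for i in order]
    # The ideal ordering sorts items by their true scores.
    ideal_dcg = dcg_at_k(sorted(true_scores, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

In the affiliation ranking setting, `true_scores` would hold the observed relevance of each affiliation and `predicted_scores` the model's predicted event counts. A perfect ranking yields 1.0, and a lower value means that relevant affiliations were placed further down the list.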
Bibtex Entry:
@inproceedings{cascon2017,
  author = {Tam T. Nguyen and Ebrahim Bagheri},
  title = {Learning Event Count Models with Application to Affiliation Ranking},
  booktitle = {27th Annual International Conference on Computer Science and Software Engineering (CASCON 2017)},
  year = {2017},
  url = {https://www.ibm.com/ibm/cas/cascon/},
  webpdf = {http://ls3.rnet.ryerson.ca/wiki/images/6/6a/CASCON2017.pdf},
  abstract = {Event count prediction is a class of problems in time series analysis that has been extensively studied over the years. Its applications range from predicting the number of publications in the scientific community to predicting ATM cash withdrawal transactions in the banking industry. However, in applied data science problems, event count prediction models often run into difficulties on real-world data for two reasons: the data violates the Poisson distribution assumption, i.e., that the rate at which events occur is constant, and the data is relatively sparse, i.e., only a few event count values are greater than zero. Traditional techniques do not work well under these two conditions. To overcome these limitations, some researchers have proposed generic autoregressive (AR) models for event count prediction, which work with non-constant event occurrence rates. Because AR models use only historical event counts for forecasting, they may not be flexible enough to incorporate domain knowledge. Moreover, AR models may not work well with relatively short time series. To overcome these challenges, we propose a machine learning approach to the event count prediction problem. We benchmark our proposed solution on the KDD Cup 2016 dataset by formalizing affiliation ranking as an event count time series prediction problem. We map the time series onto a high-dimensional state space and systematically apply state-of-the-art machine learning algorithms to predict event counts. We then compare our proposed approach against solutions in the KDD Cup 2016 competition and show that our work outperforms the best models in this competition with an NDCG@20 score of 0.7573.}
}



