Credible Review Detection with Limited Information using Consistency Features

Online reviews offer viewpoints on the strengths and shortcomings of products and services, and they influence potential customers' purchasing decisions. However, the proliferation of non-credible reviews, whether fake (promoting or demoting an item), incompetent (involving irrelevant aspects), or biased, raises the problem of identifying credible ones. Prior work relies on classifiers that harness rich information about items and users, which might not be readily available in several domains, and that provide only limited interpretability as to why a review is deemed non-credible. This paper presents a novel approach that addresses these issues. We use latent topic models that leverage review texts, item ratings, and timestamps to derive consistency features without relying on item or user histories, which are unavailable for "long-tail" items and users. We develop models that compute review credibility scores, provide interpretable evidence for non-credible reviews, and transfer to other domains, which addresses the scarcity of labeled data. Experiments on real-world datasets demonstrate improvements over state-of-the-art baselines.

Publications

  • Subhabrata Mukherjee, Sourav Dutta and Gerhard Weikum.
    Credible Review Detection with Limited Information using Consistency Features
    Proc. of the Machine Learning and Knowledge Discovery in Databases - European Conference (ECML-PKDD). 2016.