Affiliations: [a] Department of Enterprise Engineering, University of Roma, Tor Vergata, Via del Politecnico 1, Roma, Italy
| [b] Amazon, Manhattan Beach, CA, USA
| [c] DISI, University of Trento, Povo (TN), Italy
Note: [1] This work was carried out when the author was employed at QCRI and the University of Trento.
Abstract: In recent years, forums offering community Question Answering (cQA) services have gained popularity on the web, as they offer users a new opportunity to search for and share knowledge. Forums allow users to freely ask questions and expect answers from the community. Although the idea of receiving a direct, targeted response from other users is very attractive, it is not rare to see long threads of comments in which only a small portion are actually valid answers. In many cases users start conversations, ask for other information, and discuss matters that are not central to the original topic. Therefore, finding the desired information in a long list of answers can be very time-consuming. Designing automatic systems to select good answers is not an easy task. In many cases the question and the answer share little textual content, so approaches based on measuring question-answer similarity will often fail. A more intriguing and promising approach is to define valid question-answer templates and use a system to determine whether any of these templates is satisfied by a given question-answer pair. Unfortunately, the manual definition of these templates is extremely complex and requires a domain expert. In this paper, we propose a supervised kernel-based framework that automatically learns, from training question-answer pairs, the syntactic/semantic patterns useful for recognizing good answers. We carry out a detailed experimental evaluation, in which we demonstrate that the proposed framework achieves state-of-the-art results on the Qatar Living datasets released in three different editions of the Community Question Answering Challenge of SemEval.
Keywords: Community Question Answering, Kernel methods, Structured Language Learning