Text Relatedness using Word and Phrase Relatedness

Md Rashadul Hasan Rakib
Dr. Aminul Islam
Dr. Evangelos Milios

Text is composed of words and phrases. In the bag-of-words (BoW) model, phrases are split into individual words, which can lose the inner semantics of the phrases and in turn give inconsistent relatedness scores between two texts. Our objective is to apply phrase relatedness in conjunction with word relatedness to the text relatedness task in order to improve the results. To measure phrase relatedness, we propose an unsupervised function f based on the sum-ratio (SR) technique. To compute word relatedness, we adopt an existing state-of-the-art unsupervised word relatedness method based on Google tri-grams.
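The motivation above can be illustrated with a small sketch. It only shows how splitting a phrase into words may inflate the apparent relatedness of two texts. It does not implement the function f, the SR technique, or the tri-gram word relatedness method. The phrase list `PHRASES`, the helper names, and the Jaccard overlap used as a stand-in score are all assumptions made for illustration.

```python
# Hypothetical phrase list, for illustration only (not from the report).
PHRASES = {("hot", "dog")}


def bow_tokens(text):
    """Bag-of-words view: every phrase is split into single words."""
    return text.lower().split()


def phrase_tokens(text, phrases=PHRASES):
    """Phrase-aware view: known two-word phrases are kept as one token."""
    words = text.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if len(pair) == 2 and pair in phrases:
            tokens.append(" ".join(pair))
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


def jaccard(tokens_a, tokens_b):
    """Stand-in overlap score (an assumption, not the report's measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


text_a = "He ate a hot dog"
text_b = "The dog felt hot"

# BoW shares "hot" and "dog" although the texts are about different things.
bow_score = jaccard(bow_tokens(text_a), bow_tokens(text_b))

# Keeping "hot dog" as one unit removes the spurious overlap.
phrase_score = jaccard(phrase_tokens(text_a), phrase_tokens(text_b))
```

In this toy case the BoW view reports some overlap because "hot" and "dog" occur in both texts, while the phrase-aware view reports none, which is the kind of inconsistency that treating phrases as units is meant to avoid.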

A full report can be found at http://dalspace.library.dal.ca/handle/10222/54044