Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification

Title

Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification

Document Type

Book Chapter

Description

There are a number of quality news sources available on the Internet. Searching through all these sources for facts related to a certain subject would be exhausting for a user. We developed a niche sentence-level search engine called News Fact Finder in order to provide users with factual information relevant to the query. Sentence-level search is based on the intuition that if all the query words are within the same sentence, that result is more relevant than a result containing the query words in remote parts of the text. We therefore use suffix arrays, which excel at exact substring matching, to index our database. Our framework uses a Naive Bayes classifier to classify sentences as facts or opinions. Ranking is performed at the document level, such that a document with many related facts is ranked higher. News Fact Finder performs competitively on a large collection of news documents in providing relevant fact-based results to users. This is a novel approach to quality-based searching, ranking, indexing and categorization of news information.
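
As a rough illustration of the sentence-level search idea in the description above, the following Python sketch builds a suffix array over a small set of sentences and returns the sentences that contain every query word. It is not the authors' implementation: the function names, the naive suffix array construction and the sample sentences are all assumptions made for this example, and the fact/opinion classification and document-level ranking steps are left out.

```python
from bisect import bisect_right


def build_suffix_array(text):
    # Naive construction by sorting all suffixes (assumption: fine for a
    # small sketch, far too slow for a real news collection).
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    # Binary search for the contiguous block of suffixes starting with pattern.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def sentence_search(sentences, query):
    # Return indices of sentences that contain every query word as a substring.
    lowered = [s.lower() for s in sentences]
    text = "\n".join(lowered)
    starts, pos = [], 0
    for s in lowered:  # character offset at which each sentence begins
        starts.append(pos)
        pos += len(s) + 1
    sa = build_suffix_array(text)
    hits = None
    for word in query.lower().split():
        ids = {bisect_right(starts, off) - 1
               for off in find_occurrences(text, sa, word)}
        hits = ids if hits is None else hits & ids
    return sorted(hits or [])


# Hypothetical sample data, invented for this sketch.
sentences = [
    "The bank raised interest rates on Monday.",
    "I think the rates are too high.",
    "The bank opened a new branch.",
]
print(sentence_search(sentences, "bank rates"))
```

Because a suffix array matches exact substrings, a query word such as "rate" would also match inside "rates"; a fuller system would need word-boundary handling on top of this lookup.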

ISBN

978-3-642-37688-7

Publication Date

2013

Publisher

Springer

Keywords

search engine, word list, news article, inverted index, sentence level, news fact finder

Disciplines

Computer Engineering | Engineering

Faculty

Faculty of Applied Science & Technology (FAST)

Terms of Use

Terms of Use for Works posted in SOURCE.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Original Citation

Salmon R., Ribeiro C., Amarala S. (2013) Fact Based Search Engine: News Fact Finder Utilizing Naive Bayes Classification. In: Pasi G., Bordogna G., Jain L. (Eds) Quality Issues in the Management of Web Information. Intelligent Systems Reference Library, vol 50. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37688-7_6
