automatic text summary, statistic based summary, unsupervised learning summarization, summarization techniques, generated summaries, automatic generated summaries
Food-health articles (FHA) contain invaluable information for health promotion. However, extracting this information manually is challenging because of the length and number of articles published each year. Automatic text summarization efficiently identifies useful information across large bodies of text, which in turn speeds up the delivery of useful information from FHA. This research investigates the performance of statistical-based summarization and graph-based unsupervised learning summarization in extracting useful information from FHA related to diabetes, cardiovascular disease and cancer. Various combinations of the introduction, result and conclusion sections of three hundred articles were collected, preprocessed and used to evaluate the two types of summarization technique. Generated summaries were compared to the original abstracts using two measures. The first quantifies the similarity of the generated summary to the abstract. The second gauges how well the generated summary and the article abstract each cover the article sections. Overall, the experiment showed that the automatically generated summaries are not comparable to the human-made abstracts found in FHA. There is room for improvement: the highest similarity between generated and written abstracts was 52-57%, and the sentence scoring used in summarization could be optimized for various domains.
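As a rough illustration of the kind of pipeline the abstract describes, the sketch below scores sentences by word frequency, keeps the top-scoring ones as an extractive summary, and compares that summary to a reference abstract. It is not the method used in the thesis: the function names, the mean-frequency scoring rule, and the choice of cosine similarity over word counts are all assumptions made for illustration, and no stop-word removal or stemming is applied.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Hypothetical statistical summarizer (an assumption, not the thesis method).

    Each sentence is scored by the mean document-level frequency of its
    words; the top-scoring sentences are returned in their original order.
    """
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))
    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        score = sum(freq[w] for w in words) / len(words) if words else 0.0
        scored.append((score, idx))
    # Highest score first; ties are broken by earlier position in the text.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_sentences]
    return " ".join(sentences[idx] for _, idx in sorted(top, key=lambda pair: pair[1]))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between word-count vectors (assumed similarity measure)."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Toy text standing in for article sections; not data from the thesis.
    article = (
        "Fiber lowers glucose. "
        "Fiber and glucose matter for diabetes. "
        "Cats sleep."
    )
    abstract = "Fiber intake affects glucose in diabetes."
    summary = summarize(article, n_sentences=2)
    print(summary)
    print(round(cosine_similarity(summary, abstract), 3))
```

A coverage measure in the sense of the abstract could be approximated with the same helper by comparing the summary, and separately the abstract, against the full text of the article sections, although the thesis may define coverage differently.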
Faculty of Applied Science & Technology
© Ken Suong
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Suong, K. (2019). Automatic extraction of useful information from food-health articles related to diabetes, cardiovascular disease and cancer (Unpublished thesis). Sheridan College, Ontario, Canada.