USING THE POSITIONAL CORRELATION OF KEYWORDS TO BUILD A METHOD FOR RANKING PASSAGES BY THEIR SIMILARITY TO ANOTHER PASSAGE

Phan Hiền

Phan Hiền, Faculty of Business Information Systems, University of Economics Ho Chi Minh City

Abstract

Most current methods that measure the similarity between two passages from a set of keywords pay little attention to the role of positional correlation. In this paper, we introduce a new scoring method that gives considerable weight to how the keywords are arranged within a passage; specifically, we consider the correlation between the positions of the keywords. The method lets us find structural similarities between two passages even when the positions of the keywords, or the distances between them, change.

