United Arab Emirates University (UAEU) - Best University in Abu Dhabi, UAE

New model for identifying reuse and plagiarism of Arabic text created by UAEU student

Mon, 6 June 2022

Pioneering techniques designed to detect whether Arabic text is being reused online – and help to identify plagiarism – have been developed by a graduate student at United Arab Emirates University (UAEU).

A thesis by Leena Mahmoud Ahmed Lulu, a Ph.D. student who conducted her research within the United Arab Emirates University College of Information Technology, outlines how a new method based on document fingerprinting can discover whether original Arabic content on the internet is being reused by others.

She undertook the research after noting that little or no work had been carried out on detecting text reuse – where existing documents are used, partially or wholly, to make new ones – and plagiarism in the Arabic language. Her research paper, which has now been published, also proposes a new web search tool to accompany the detection method, allowing lengthier queries to be entered when assessing whether content is entirely original or has been used before.

“The Arabic language is a rich, morphological language that is among the most widely-used in the world, and on the Web,” said Lulu in her thesis.

“While the local text reuse [meaning only a small part of a document is copied and modified] detection problem has been mostly studied for Western languages, it is still one of the biggest challenges in the Arabic language and the research has remained quite limited. The results of this research can be thought of as rich tools for information analysts, to validate and assess information coming from uncertain sources.

“It is also time for Web users to become ‘fact inspectors’, by providing them with a tool that allows people to quickly check the validity and originality of statements and sources.”

A series of experiments was conducted to see how the “unique features” of the Arabic language affected the detection of text reuse using existing techniques. Lulu’s research paper explained that the most widely used and effective approach is to detect documents that share one or more “fingerprints”, a reliable indicator that they share some reused text.
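The article does not spell out how fingerprints are computed. As a rough illustration of the general idea only, and not of Lulu’s model, the sketch below hashes every run of `k` consecutive words and treats two documents as sharing reused text when at least one hash appears in both. The function names, the window size `k` and the thinning parameter `p` (which keeps only hashes divisible by `p`) are assumptions made for this example.

```python
import hashlib


def fingerprints(text, k=3, p=1):
    """Hash every k-word window; keep hashes divisible by p (p=1 keeps all)."""
    words = text.split()
    selected = set()
    for i in range(len(words) - k + 1):
        window = " ".join(words[i:i + k])
        digest = hashlib.md5(window.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # 64-bit fingerprint
        if value % p == 0:
            selected.add(value)
    return selected


def shares_fingerprint(doc_a, doc_b, k=3):
    """True when the two documents have at least one fingerprint in common."""
    return bool(fingerprints(doc_a, k) & fingerprints(doc_b, k))
```

A larger `p` stores fewer fingerprints per document at the cost of possibly missing short reused passages.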

However, it also pointed out that the linkage between Arabic letters, the right-to-left writing direction of Arabic text, and the flexibility of its word order reduce the efficiency of such techniques – a problem that the new fingerprinting model developed through her research, tailored for the Arabic language, aims to solve. “Our proposed method proved to be more robust for detecting text reuse, particularly when the sentence length increases toward the average sentence length in the Arabic language,” Lulu said.
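The word-order point can be made concrete with a small sketch. It is only an illustration of why flexible word order hurts order-sensitive fingerprints. The article does not describe how Lulu’s model deals with the problem, and the two helper functions and example sentences here are assumptions.

```python
def ordered_shingles(text, k=3):
    """Windows of k consecutive words, keeping their original order."""
    words = text.split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def unordered_shingles(text, k=3):
    """Same windows, but sorted so word order inside a window is ignored."""
    words = text.split()
    return {tuple(sorted(words[i:i + k])) for i in range(len(words) - k + 1)}


# Two valid orderings of "The student wrote the lesson".
verb_first = "كتب الطالب الدرس"     # verb, subject, object
subject_first = "الطالب كتب الدرس"  # subject, verb, object

# Order-sensitive windows find nothing in common between the two sentences.
ordered_match = ordered_shingles(verb_first) & ordered_shingles(subject_first)

# Sorting the words inside each window makes the two sentences match.
unordered_match = unordered_shingles(verb_first) & unordered_shingles(subject_first)
```

Ignoring order inside a window is a trade-off: it catches reordered sentences, but it can also match sentences whose meaning differs because the same words play different roles.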

“The system first creates an initial documents collection obtained from the Web, then applies the detection techniques for finding text reuse with a given input document from this collection.”
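The two stages Lulu describes, building a collection and then checking an input document against it, can be sketched as follows. This is a minimal sketch under stated assumptions: the collection is passed in as a plain dictionary of identifier to text instead of being fetched from the Web, and `fingerprints`, `build_index` and `find_reuse` are hypothetical names, not part of her system.

```python
import hashlib
from collections import Counter, defaultdict


def fingerprints(text, k=3):
    """Hash each k-word window of the text into a 64-bit integer."""
    words = text.split()
    return {
        int.from_bytes(
            hashlib.md5(" ".join(words[i:i + k]).encode("utf-8")).digest()[:8],
            "big",
        )
        for i in range(len(words) - k + 1)
    }


def build_index(collection, k=3):
    """Map each fingerprint to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in collection.items():
        for fp in fingerprints(text, k):
            index[fp].add(doc_id)
    return index


def find_reuse(index, query_text, k=3):
    """Return (doc_id, shared_fingerprint_count) pairs, most shared first."""
    hits = Counter()
    for fp in fingerprints(query_text, k):
        for doc_id in index.get(fp, ()):
            hits[doc_id] += 1
    return hits.most_common()
```

Indexing the collection once means each new input document only needs its own fingerprints computed, which are then looked up directly instead of being compared against every stored document in turn.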

Describing Arabic text reuse detection as “an interesting and challenging area which has not been given the attention it deserves”, Lulu said future research could focus on areas including new approaches that make document fingerprints more targeted and specific, enhancing the method’s effectiveness. She also suggested compiling a thesaurus of “paraphrased” Arabic sayings, which might otherwise go undetected in a text reuse search.

 

Nov 18, 2022