Review Article

A Comparative Study of Some Automatic Arabic Text Diacritization Systems

Table 7

Common mistakes made by the deep-network-based systems.

| Predicted | Correct | Description |
| --- | --- | --- |
| الوَلَدُ الوَاحِــــدِ | الْوَلَدُ الْـوَاحِــــــدُ | (i) The word “الوَاحِــــدِ” /Alwahidi/ has a case-ending error. The correct form is “الْـوَاحِــــــدُ” /Alwahidu/. |
| لِأَنَّهُ تَــــمْـــــلِــــــكْ | لِأَنَّهُ تَـــمَــــلَّكَ | (i) The form “تَــــمْـــــلِــــــكْ” /Tamlik/ is wrong in both its internal diacritics and its case ending. The correct form is “تَـــمَـــــلَّكَ” /Tamallaka/. |
| ضَحِكَ ضَـــحِّـــــكَا أَكْـثَـــــرُ | ضَحِكَ ضَــحِــكًــا أَكْـثَــرَ | (i) The word “ضَـــحِّـــــكَا” /Dahhikan/ has a case-ending error, because nunation is required on the letter kaf “كـ”, and an internal error, because the gemination should not appear. The correct form is “ضَــحِــكًــا” /dahikan/. (ii) The word “أَكْـثَـــــرُ” /Aktharu/ has a case-ending error. In this context, the correct form is “أَكْـثَــرَ” /Akthara/. |
| تَـحْـــرِجُــــتْ | تَحَرَّجْتَ | (i) The whole form “تَـحْـــرِجُــــتْ” /Tahrijut/ is incorrect. The correct form is “تَحَرَّجْتَ” /Taharrajta/. |
| مَاذَا تَــــصِــنِـــــعَانِ | مَاذَا تَصْــــــنَــعَـــانِ | (i) The word “تَــــصِــنِـــــعَانِ” /Tassini’ani/ has an error in its internal diacritics. The correct form is “تَـصْــــنَـــعَانِ” /Tasna’ani/. |
| أَعْلَمُ أَنّــــهُــــــمْ | أَعْلَمُ أَنَّــــهُـــــــمْ | (i) The word “أَنّــــهُــــــمْ” /Ann-hum/ has a wrong internal diacritic on the letter “نـ”: the gemination should be accompanied by the short vowel fat-ha. The correct form is “أَنَّــــهُـــــــمْ” /Annahum/. |
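
The distinction the table draws between case-ending errors and internal-diacritic errors can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed systems: the names `split_units` and `classify_errors` are hypothetical, and the rule used to locate the case ending (the last letter, or the letter before a final bare alif) is a crude assumption that ignores morphology, for example attached pronoun suffixes.

```python
import unicodedata

TATWEEL = "\u0640"  # elongation character used in the table for legibility
ALIF = "\u0627"
# Tanween (064B-064D), short vowels (064E-0650), shadda (0651), sukun (0652).
DIACRITICS = set("\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652")


def split_units(word):
    """Return [(letter, frozenset_of_diacritics), ...], skipping tatweel."""
    units = []
    for ch in unicodedata.normalize("NFC", word):
        if ch == TATWEEL:
            continue
        if ch in DIACRITICS:
            if units:  # attach the mark to the preceding letter
                units[-1][1].add(ch)
        else:
            units.append((ch, set()))
    return [(letter, frozenset(marks)) for letter, marks in units]


def classify_errors(predicted, correct):
    """Return a set drawn from {"internal", "case-ending"} for one word pair."""
    pred, gold = split_units(predicted), split_units(correct)
    if [l for l, _ in pred] != [l for l, _ in gold]:
        raise ValueError("the two words do not share the same letters")
    if not gold:
        return set()
    # Assumed heuristic: the case ending sits on the last letter, or on the
    # letter before a final alif that carries no diacritic (as in tanween fath).
    last = len(gold) - 1
    if last > 0 and gold[last][0] == ALIF and not gold[last][1]:
        last -= 1
    errors = set()
    for i, ((_, p_marks), (_, g_marks)) in enumerate(zip(pred, gold)):
        if p_marks != g_marks:
            errors.add("case-ending" if i == last else "internal")
    return errors


# Word pairs taken from the table (predicted, correct).
print(sorted(classify_errors("أَكْـثَـــــرُ", "أَكْـثَــرَ")))        # ['case-ending']
print(sorted(classify_errors("أَنّــــهُــــــمْ", "أَنَّــــهُـــــــمْ")))  # ['internal']
print(sorted(classify_errors("ضَـــحِّـــــكَا", "ضَــحِــكًــا")))      # ['case-ending', 'internal']
```

Diacritics are compared as sets per letter so that the order in which combining marks are stored (for instance shadda before or after fat-ha) does not affect the result.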