Review Article

A Comparative Study of Some Automatic Arabic Text Diacritization Systems

Algorithm 3

create_Distribution_Of_Letter_N-grams.
(i)Input: list of diacritized sentences sents
(ii)Input: the size of the n-grams n
(iii)Output: dictionary ngrDist, the frequency distribution of letter n-grams
(iv)SET grams EQUAL TO EMPTY LIST
(v)SET words EQUAL TO EMPTY LIST
(vi)SET ngrDist EQUAL TO EMPTY DICTIONARY
(vii)tool := MyToolKit()
(viii)FOR EACH s IN sents
(ix) words.extend(tool.words(s)) // tool.words() splits s into words
(x)FOR EACH w IN words
(xi) grams := grams + n-grams(tool.LettersDiac(w), n) // n-grams() is a method that extracts n-grams
(xii)ngrDist := freqDist(grams)
(xiii)RETURN ngrDist
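
The steps of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: MyToolKit is not specified in the text, so split_words and letters_with_diacritics are hypothetical stand-ins for tool.words() and tool.LettersDiac(). The sketch assumes that words are separated by whitespace and that a "letter" is a base character together with the Unicode combining marks (diacritics) that follow it. collections.Counter takes the place of freqDist.

```python
from collections import Counter
import unicodedata


def split_words(sentence):
    """Hypothetical stand-in for tool.words(): whitespace tokenization."""
    return sentence.split()


def letters_with_diacritics(word):
    """Hypothetical stand-in for tool.LettersDiac(): group each base letter
    with the combining marks (diacritics) that follow it."""
    units = []
    for ch in word:
        if unicodedata.combining(ch) and units:
            units[-1] += ch      # attach the diacritic to the preceding letter
        else:
            units.append(ch)     # start a new letter unit
    return units


def ngrams(units, n):
    """Return all contiguous n-grams of a sequence, each as a tuple."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def create_distribution_of_letter_ngrams(sents, n):
    """Count letter n-grams over every word of the diacritized sentences."""
    dist = Counter()
    for s in sents:
        for w in split_words(s):
            dist.update(ngrams(letters_with_diacritics(w), n))
    return dist
```

Because n-grams are extracted word by word, as in steps (x) and (xi), no n-gram spans a word boundary, and words shorter than n letters contribute nothing to the distribution.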