Review Article
A Comparative Study of Some Automatic Arabic Text Diacritization Systems
Algorithm 3
create_Distribution_Of_Letter_N-grams.
(i) Input: list of diacritized sentences sents,
(ii) the size of the n-grams n
(iii) Output: dictionary ngrDist, the frequency distribution of letter n-grams
(iv) SET grams EQUAL TO EMPTY LIST
(v) SET words EQUAL TO EMPTY LIST
(vi) SET ngrDist EQUAL TO EMPTY DICTIONARY
(vii) tool := MyToolKit()
(viii) FOR EACH s IN sents:
(ix)     words.extend(tool.words(s)) // tool.words() splits s into words
(x) FOR EACH w IN words:
(xi)     grams := grams + n-grams(tool.LettersDiac(w), n) // n-grams() is a method that extracts n-grams
(xii) ngrDist := freqDist(grams)
(xiii) RETURN ngrDist
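Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: MyToolKit is not specified here, so `tool.words()` is approximated by whitespace splitting and `tool.LettersDiac()` by a hypothetical helper, `letters_with_diacritics`, which attaches each Unicode combining mark to the letter before it. `freqDist` is approximated with `collections.Counter`.

```python
from collections import Counter
import unicodedata


def letters_with_diacritics(word):
    """Hypothetical stand-in for tool.LettersDiac(): split a word into units,
    each made of one base letter plus the combining marks that follow it."""
    units = []
    for ch in word:
        # Arabic diacritics (fatha, damma, kasra, shadda, sukun, tanwin)
        # have a non-zero canonical combining class.
        if unicodedata.combining(ch) and units:
            units[-1] += ch
        else:
            units.append(ch)
    return units


def ngrams(units, n):
    """Return all contiguous n-grams of a list of units, as tuples."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def create_distribution_of_letter_ngrams(sents, n):
    """Count letter n-grams (letters with their diacritics) over all sentences.
    N-grams never span a word boundary, as in Algorithm 3."""
    grams = []
    for s in sents:
        # Assumption: whitespace splitting stands in for tool.words().
        for w in s.split():
            grams.extend(ngrams(letters_with_diacritics(w), n))
    return Counter(grams)
```

Calling `create_distribution_of_letter_ngrams(sents, 2)` on a list of diacritized sentences returns a `Counter` keyed by 2-tuples of diacritized letters. Words shorter than `n` units contribute no n-grams. Processing words sentence by sentence gives the same counts as first collecting all words and then looping over them.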