Review Article

A Comparative Study of Some Automatic Arabic Text Diacritization Systems

Algorithm 5

convert_A_Sequence_Of_Embeddings_to_A_Sequence_Of_Indexes
Input: seq, a 2D tensor holding the sequence of embeddings //shape (time_steps, embedding_dim)
Input: emb_Mat, a 2D tensor holding the learned embedding matrix //shape (number of chars, embedding_dim)
Output: t, a 1D tensor of shape (time_steps) whose entry i is the index of the char at time step i
seq_shape := shape(seq)
b_shape := shape(emb_Mat)
//repeat each embedding b_shape[0] times along the last dimension; the result has shape (time_steps, number of chars * embedding_dim)
seq_tiled := tile(seq, [1, b_shape[0]])
//reshape to (time_steps, number of chars, embedding_dim)
seq_tiled := reshape(seq_tiled, [seq_shape[0], b_shape[0], seq_shape[1]])
//elementwise comparison; emb_Mat is broadcast over the time_steps dimension
eq := equal(emb_Mat, seq_tiled)
//reduce the last dimension with logical AND: red[i, j] is True when seq[i] equals row j of emb_Mat
red := reduce_all(eq, -1)
//coordinates (time step, char index) of the True elements
z := where(red)
//keep only the char index column
t := z[:, 1]
RETURN t
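
The steps of Algorithm 5 can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `embeddings_to_indexes`, the variable names, and the toy embedding matrix are assumptions introduced here, and NumPy stands in for whatever tensor library the original system used. The reduction over the last dimension is taken to be a logical AND, since a character matches only when every component of its embedding agrees.

```python
import numpy as np


def embeddings_to_indexes(seq: np.ndarray, emb_mat: np.ndarray) -> np.ndarray:
    """Return, for each embedding in seq, the index of the identical row in emb_mat.

    seq:     array of shape (time_steps, embedding_dim)
    emb_mat: array of shape (num_chars, embedding_dim)
    """
    time_steps, embedding_dim = seq.shape
    num_chars = emb_mat.shape[0]

    # Repeat each embedding num_chars times along the last axis: (T, N * D).
    seq_tiled = np.tile(seq, (1, num_chars))
    # Reshape so each time step holds num_chars copies of its embedding: (T, N, D).
    seq_tiled = seq_tiled.reshape(time_steps, num_chars, embedding_dim)

    # Elementwise comparison; emb_mat (N, D) is broadcast over the time axis.
    eq = np.equal(emb_mat, seq_tiled)
    # Logical AND over the embedding axis: red[i, j] is True when seq[i] == emb_mat[j].
    red = np.all(eq, axis=-1)

    # Coordinates (time step, char index) of the True entries, in time-step order.
    z = np.argwhere(red)
    # Keep only the char index column.
    return z[:, 1]


# Toy example (hypothetical data): three characters with 2-dimensional embeddings.
emb_mat = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
seq = emb_mat[[2, 0, 1, 2]]  # embeddings looked up for the index sequence 2, 0, 1, 2
indexes = embeddings_to_indexes(seq, emb_mat)
```

Because the comparison is an exact equality test, the sketch recovers the indexes only when each row of `seq` is an unmodified copy of a row of `emb_mat`. If the embedding matrix contains duplicate rows, a time step matches more than one character and the result is longer than `time_steps`.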