Evaluation of feature-embedding methods for word spotting in historical Arabic documents
Conference paper, year: 2020

Abstract

Retrieving and indexing historical Arabic documents remains a significant challenge. This paper compares feature representation spaces for word spotting in historical Arabic documents. Our goal is to build embedding spaces using different machine learning methods: i) linear methods, such as principal component analysis and linear discriminant analysis, and ii) non-linear methods, including Siamese and triplet convolutional neural networks. Each word image is then represented by a dense vector, and feature representations are matched using the Euclidean distance. Various representation space models are evaluated on the VML-HD dataset, and the experiments show that the non-linear methods are more effective than the linear ones.
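
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes word images have already been flattened into fixed-length vectors (random stand-in data is used here), fits a plain PCA projection with NumPy as an example of a linear embedding, and ranks gallery words by Euclidean distance to a query. The function names (`fit_pca`, `embed`, `rank_by_euclidean`, `triplet_loss`) are hypothetical, and the triplet loss shown is one common hinge-style variant, which may differ from the loss used in the paper.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit a PCA projection on the rows of X; return (mean, components)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def embed(X, mean, components):
    """Project rows of X into the low-dimensional embedding space."""
    return (X - mean) @ components.T


def rank_by_euclidean(query_vec, gallery_vecs):
    """Return gallery indices sorted by Euclidean distance to the query."""
    dists = np.linalg.norm(gallery_vecs - query_vec, axis=1)
    order = np.argsort(dists)
    return order, dists[order]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances (assumed variant)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


# Stand-in data: 50 "word images", each flattened to 64 values (assumption).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(50, 64))

mean, components = fit_pca(gallery, n_components=8)
gallery_emb = embed(gallery, mean, components)

# Use gallery item 7 as the query; it should be its own nearest neighbour.
query_emb = embed(gallery[7:8], mean, components)[0]
order, dists = rank_by_euclidean(query_emb, gallery_emb)
```

In a learned (Siamese or triplet) setting, only `fit_pca` and `embed` would be replaced by a trained network; the ranking by Euclidean distance stays the same.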

Dates and versions

hal-03094910, version 1 (04-01-2021)

Cite

Abir Fathallah, Mohamed Ibn Khedher, Mounim El Yacoubi, Najoua Essoukri Ben Amara. Evaluation of feature-embedding methods for word spotting in historical arabic documents. SSD 2020: 17th international multi-conference on Systems, Signals and Devices, Jul 2020, Monastir (online), Tunisia. pp.34-39, ⟨10.1109/SSD49366.2020.9364134⟩. ⟨hal-03094910⟩