
Evaluation of feature-embedding methods for word spotting in historical Arabic documents

Abstract : Retrieving and indexing historical Arabic documents remains a significant challenge. This paper compares feature representation spaces for word spotting in historical Arabic documents. Embedding spaces are built with two families of machine learning methods: i) linear methods, such as principal component analysis and linear discriminant analysis, and ii) non-linear methods, including Siamese and triplet convolutional neural networks. Each word image is then represented by a dense vector, and feature representations are matched using the Euclidean distance. The resulting embedding models are evaluated on the VML-HD dataset, and the experiments show that the non-linear methods are more effective than the linear ones.
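The matching step described in the abstract (dense vectors compared by Euclidean distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy three-dimensional embeddings, and the margin value are all assumptions made for illustration, and the hinge-style triplet loss shown is only one common formulation, which the paper may or may not use.

```python
import math


def rank_by_euclidean(query, gallery):
    """Return gallery keys sorted from nearest to farthest from the query.

    `query` is a dense vector; `gallery` maps a word-image identifier
    (hypothetical names here) to its dense embedding vector.
    """
    distances = {key: math.dist(query, vec) for key, vec in gallery.items()}
    return sorted(distances, key=distances.get)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances (assumed form).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings: two images of one word and one image of another word.
gallery = {
    "word_a_1": [0.9, 0.1, 0.0],
    "word_b_1": [0.0, 1.0, 0.2],
    "word_a_2": [1.0, 0.0, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_euclidean(query, gallery)
print(ranking)
```

In this toy setting the two images of the same word rank ahead of the image of the other word, which is the behaviour a word-spotting embedding is meant to produce.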
Document type : Conference papers

https://hal.archives-ouvertes.fr/hal-03094910
Contributor : Mohamed IBN KHEDHER
Submitted on : Monday, January 4, 2021 - 2:40:54 PM
Last modification on : Thursday, April 15, 2021 - 10:17:05 AM

Citation

Abir Fathallah, Mohamed Ibn Khedher, Mounim El yacoubi, Najoua Essoukri Ben Amara. Evaluation of feature-embedding methods for word spotting in historical arabic documents. SSD 2020: 17th international multi-conference on Systems, Signals and Devices, Jul 2020, Monastir (online), Tunisia. pp.34-39, ⟨10.1109/SSD49366.2020.9364134⟩. ⟨hal-03094910⟩
