Publication:
ViBERTgrid BiLSTM-CRF: Multimodal Key Information Extraction from Unstructured Financial Documents

Loading...
Thumbnail Image

Advisor

Journal Title

Journal ISSN

Volume Title

Publisher

Springer Nature Switzerland

Research Projects

Organizational Units

Journal Issue

Abstract

Multimodal key information extraction (KIE) models have been studied extensively on semi-structured documents. However, their investigation on unstructured documents is an emerging research topic. The paper presents an approach to adapt a multimodal transformer (i.e., ViBERTgrid previously explored on semi-structured documents) for unstructured financial documents, by incorporating a BiLSTM-CRF layer. The proposed ViBERTgrid BiLSTM-CRF model demonstrates a significant improvement in performance (up to 2 percentage points) on named entity recognition from unstructured documents in financial domain, while maintaining its KIE performance on semi-structured documents. As an additional contribution, we publicly released token-level annotations for the SROIE dataset in order to pave the way for its use in multimodal sequence labeling models.
Accepted in MIDAS (The 8th Workshop on MIning DAta for financial applicationS) workshop of ECML PKDD 2023 conference

Description

Subject

FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computation and Language (cs.CL), Information Retrieval (cs.IR), Computer Science - Information Retrieval

Citation

Collections

Endorsement

Review

Supplemented By

Referenced By

Related Goal

0

Views

0

Downloads
View PlumX Details