no code implementations • 29 Mar 2024 • Benjamin Townsend, Madison May, Christopher Wells
We introduce RealKIE, a benchmark of five challenging datasets aimed at advancing key information extraction methods, with an emphasis on enterprise applications.
Key Information Extraction • Optical Character Recognition (OCR)
1 code implementation • 16 May 2021 • Benjamin Townsend, Eamon Ito-Fisher, Lily Zhang, Madison May
Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, the model predicts spans, which are post-processed and standardized to convert the information into a database entry.
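The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO tag scheme, the `VENDOR` and `AMOUNT` labels, and the helper names `bio_to_spans`, `normalize_amount`, and `to_record` are all assumptions made for illustration, and the tags are hard-coded in place of real model predictions.

```python
import re
from typing import Dict, List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label, buffer = None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            # Close any open span, then start a new one.
            if label is not None:
                spans.append((label, " ".join(buffer)))
            label, buffer = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continue the current span.
            buffer.append(token)
        else:
            # An "O" tag closes any open span.
            if label is not None:
                spans.append((label, " ".join(buffer)))
            label, buffer = None, []
    if label is not None:
        spans.append((label, " ".join(buffer)))
    return spans


def normalize_amount(text: str) -> float:
    """Standardize a currency string such as '$1,250.00' to a float."""
    return float(re.sub(r"[^0-9.]", "", text))


def to_record(spans: List[Tuple[str, str]]) -> Dict[str, object]:
    """Post-process extracted spans into a flat, database-style record."""
    record: Dict[str, object] = {}
    for label, text in spans:
        if label == "AMOUNT":  # hypothetical label
            record["amount"] = normalize_amount(text)
        else:
            record[label.lower()] = text.strip()
    return record


# Hard-coded tags stand in for the output of a sequence labeling model.
tokens = ["Invoice", "from", "Acme", "Corp", "for", "$1,250.00"]
tags = ["O", "O", "B-VENDOR", "I-VENDOR", "O", "B-AMOUNT"]
spans = bio_to_spans(tokens, tags)
record = to_record(spans)
```

Here the span-extraction step and the standardization step are separate functions, which mirrors the two-stage structure of the pipeline described above.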