CLIEL: Context-Based Information Extraction from Commercial Law Documents

Abstract

The effectiveness of document Information Extraction (IE) is greatly affected by the structure and layout of the documents being considered. In the case of legal documents relating to commercial law, an additional challenge is the many different and varied formats, structures and layouts used. In this paper, we present work on a flexible and scalable IE environment, the CLIEL (Commercial Law Information Extraction based on Layout) environment, for application to commercial law documentation that allows layout rules to be derived and then utilised to support IE. The proposed CLIEL environment operates using NLP (Natural Language Processing) techniques, JAPE (Java Annotation Patterns Engine) rules and some GATE (General Architecture for Text Engineering) modules. The system is fully described and evaluated using a commercial law document corpus. The results demonstrate that considering the layout is beneficial for extracting data point instances from legal document collections.

Publication
In Proceedings - 16th International Conference on Artificial Intelligence and Law (ICAIL2017)
Matias Garcia-Constantino
Lecturer in Computer Science

My research interests include Data Analysis, Internet of Things (IoT), Artificial Intelligence, Human-Computer Interaction and Network Science.