CLIEL: Context-Based Information Extraction from Commercial Law Documents

Matias Garcia-Constantino, Katie Atkinson, Danushka Bollegala, Karl Chapman, Frans Coenen, Claire Roberts, Katy Robson

June 2017

Abstract

The effectiveness of document Information Extraction (IE) is greatly affected by the structure and layout of the documents being considered. In the case of legal documents relating to commercial law, an additional challenge is the many different and varied formats, structures and layouts used. In this paper, we present work on a flexible and scalable IE environment, the CLIEL (Commercial Law Information Extraction based on Layout) environment, for application to commercial law documentation that allows layout rules to be derived and then utilised to support IE. The proposed CLIEL environment operates using NLP (Natural Language Processing) techniques, JAPE (Java Annotation Patterns Engine) rules and some GATE (General Architecture for Text Engineering) modules. The system is fully described and evaluated using a commercial law document corpus. The results demonstrate that considering the layout is beneficial for extracting data point instances from legal document collections.

Type

Conference paper

Publication

In Proceedings of the 16th International Conference on Artificial Intelligence and Law (ICAIL 2017)

Matias Garcia-Constantino

Lecturer in Computer Science

My research interests include Data Analysis, Internet of Things (IoT), Artificial Intelligence, Human-Computer Interaction and Network Science.