SSP Group Meeting

11am, 20 February, 2007
Appleton Tower 4.03
CISA, School of Informatics
University of Edinburgh

Parts that add up to a whole: enforcing coherence in table understanding

Ana Cristina da Costa e Silva

Extracting information from tables that are embedded in unstructured data sources requires working through a large set of division and aggregation decisions at the character, cell, column, line and, finally, table level. These decisions should lead to a table that is:

     - graphically coherent, i.e. the result is a grid-like representation of the sort that is intrinsic to the very definition of a table;

     - and ontologically coherent, i.e. the resulting table presents information that complies with the characteristics that its context prescribes for it.

If either of these two aspects of coherence does not hold, then a mistake was made somewhere in the decision process, and it becomes necessary to locate it and try alternative arrangements that lead to a more coherent overall result. In this talk we will present the table understanding problem and describe different strategies for representing uncertainty across an intricate set of interdependent decisions, in a way that is theoretically accurate but also computationally scalable.
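
As a rough illustration of the two coherence criteria above, the following minimal Python sketch checks a candidate table for a grid-like shape and for cell contents that match what each column is expected to hold. It is not the method presented in the talk: the function names, the list-of-lists table representation, and the per-column checks (a label column followed by a numeric column) are all assumptions made for the example.

```python
from typing import Callable, List

# Assumed representation: a table is a list of lines, each a list of cell strings.
Table = List[List[str]]


def is_graphically_coherent(table: Table) -> bool:
    """True if the table is a non-empty grid, i.e. every line has the same number of cells."""
    if not table or not table[0]:
        return False
    width = len(table[0])
    return all(len(row) == width for row in table)


def is_ontologically_coherent(
    table: Table, column_checks: List[Callable[[str], bool]]
) -> bool:
    """True if every cell passes the (hypothetical) check prescribed for its column."""
    return all(
        len(row) == len(column_checks)
        and all(check(cell) for check, cell in zip(column_checks, row))
        for row in table
    )


def is_number(cell: str) -> bool:
    """Hypothetical check for a numeric column."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def is_label(cell: str) -> bool:
    """Hypothetical check for a label column: non-empty and not a number."""
    return bool(cell.strip()) and not is_number(cell)


# Three candidate outcomes of the division and aggregation decisions.
good = [["Revenue", "120.5"], ["Costs", "80"]]
ragged = [["Revenue", "120.5"], ["Costs"]]          # a cell was lost or merged
shifted = [["Revenue", "120.5"], ["80", "Costs"]]   # cells assigned to the wrong columns

checks = [is_label, is_number]
for name, candidate in [("good", good), ("ragged", ragged), ("shifted", shifted)]:
    print(
        name,
        is_graphically_coherent(candidate),
        is_ontologically_coherent(candidate, checks),
    )
```

The "shifted" candidate shows why both criteria are needed: it forms a perfect grid, yet its contents contradict what the columns are supposed to contain, which signals that an earlier decision should be revisited.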