Learning Table Similarity Measures
Facts
Security and Dependability, Operating, Communication and Distributed Systems
Information Systems, Process and Knowledge Management
DFG Individual Research Grant
Description
Existing table similarity measures build on simple models of table metadata, structure, and content. They are designed mainly for tables with a horizontal layout, where each column represents one attribute and the data values are in rows, and they cannot easily be applied to tables with other structures, such as matrix tables, in which both rows and columns carry attributes and values. Moreover, they rely in different ways on the frequencies of individual words. This is not sufficient to capture the semantics of table elements, because table elements have comparatively little context (compared to words in a document), and that context is difficult to model.
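To make the limitation of word-frequency-based comparison concrete, the following is a minimal illustrative sketch in Python. It is not one of the measures studied in the project: the function names `table_tokens` and `frequency_similarity`, the representation of a table as a list of rows, and the two example tables are all assumptions made for illustration. The sketch compares two tables by the cosine similarity of their word-frequency vectors.

```python
from collections import Counter
from math import sqrt


def table_tokens(table):
    """Flatten a table (a list of rows, each a list of cells) into lowercase word tokens."""
    return [word.lower() for row in table for cell in row for word in str(cell).split()]


def frequency_similarity(table_a, table_b):
    """Cosine similarity between the word-frequency vectors of two tables (hypothetical helper)."""
    freq_a = Counter(table_tokens(table_a))
    freq_b = Counter(table_tokens(table_b))
    # Only words that occur in both tables contribute to the dot product.
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = sqrt(sum(c * c for c in freq_a.values()))
    norm_b = sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two made-up tables about the same topic that share no surface words.
cars = [["car", "price"], ["sedan", "20000"]]
autos = [["automobile", "cost"], ["coupe", "25000"]]

print(frequency_similarity(cars, cars))
print(frequency_similarity(cars, autos))
```

Under these assumptions, a table compared with itself receives the maximum score, while the two topically related tables score zero because they have no words in common. A purely frequency-based comparison cannot see that "car" and "automobile" are related.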
The main objective of this proposal is to research methods that bring more "semantics" to table similarity measures (TSMs). We expect that better TSMs will significantly improve the quality of applications relying on tables, such as table similarity search and table auto-completion. We will approach this problem in two ways: by learning specific word embeddings optimized to yield semantically meaningful comparisons of single tokens within tables, and by designing a neural network architecture that addresses table normalization and table comparison in a single, trainable framework.
Organization entities
Department of Computer Science
Address
Johann von Neumann-Haus, Institutsgebäude, Rudower Chaussee 25, 12489 Berlin
General contact
Tel.: 030 2093-41140
Faculty of Mathematics and Natural Sciences
Address
Johann von Neumann-Haus, Institutsgebäude, Rudower Chaussee 25, 12489 Berlin
Knowledge Management in Bioinformatics
Address
Johann von Neumann-Haus, Institutsgebäude, Rudower Chaussee 25, 12489 Berlin
General contact
Tel.: 030 2093-41280