Software conception for semantic interpretation of spreadsheet data

Authors: Dorodnykh N., Yurin A.

Journal: CEUR Workshop Proceedings: Proc. of 2nd Scientific-Practical Workshop Information Technologies: Algorithms, Models, Systems (ITAMS'2019)

Year: 2019

Abstract: Spreadsheet data are a valuable source of knowledge in data science and business intelligence applications. In most cases, however, spreadsheets are not accompanied by the explicit semantics necessary for machine interpretation of their contents. Information accumulated in spreadsheets is often poorly structured and not standardized. Analyzing such tabular data requires first extracting it and transforming it into a structured representation, and then recovering its implicit semantics. In this paper, we consider a concept of software for the semantic interpretation of spreadsheet data in XLSX format and for generating linked data in the form of RDF triples. We suggest using DBpedia as a global taxonomy of concepts for understanding and conceptualizing the content of tables. A list of the main functions of this software is also provided, and issues of further software development are discussed.
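
To make the last stage of the pipeline described in the abstract more concrete, the following is a minimal sketch of turning already-extracted table rows into RDF triples that point at DBpedia. It is not the authors' implementation. The function names (`dbpedia_resource`, `rows_to_ntriples`), the hand-written column-to-property mapping, and the sample row are all assumptions made for illustration; the software described in the paper would have to infer such mappings, and reading the XLSX file itself is left out. The only conventions relied on are the DBpedia resource and ontology namespaces and the N-Triples line format.

```python
from urllib.parse import quote

# Well-known namespaces; everything else in this sketch is hypothetical.
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def dbpedia_resource(cell_value):
    """Guess a DBpedia resource URI from a cell value (assumption: a naive
    name match, with spaces replaced by underscores and no disambiguation)."""
    name = str(cell_value).strip().replace(" ", "_")
    return DBPEDIA_RESOURCE + quote(name)


def escape_literal(text):
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        str(text)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(rows, subject_column, subject_class, object_properties):
    """Convert table rows (a list of dicts keyed by column header) to N-Triples.

    subject_column    -- column whose cells name the entity each row describes
    subject_class     -- DBpedia ontology class assigned to those entities
    object_properties -- hand-supplied mapping: column header -> ontology property
    """
    triples = []
    for row in rows:
        subject = dbpedia_resource(row[subject_column])
        triples.append(f"<{subject}> <{RDF_TYPE}> <{DBPEDIA_ONTOLOGY}{subject_class}> .")
        label = escape_literal(row[subject_column])
        triples.append(f'<{subject}> <{RDFS_LABEL}> "{label}" .')
        for column, prop in object_properties.items():
            value = row.get(column)
            if value is None or str(value).strip() == "":
                continue  # empty cells produce no triple
            obj = dbpedia_resource(value)
            triples.append(f"<{subject}> <{DBPEDIA_ONTOLOGY}{prop}> <{obj}> .")
    return triples


# Hypothetical rows, as they might look after extraction from a spreadsheet.
sample_rows = [{"City": "Irkutsk", "Country": "Russia"}]
for line in rows_to_ntriples(sample_rows, "City", "City", {"Country": "country"}):
    print(line)
```

Emitting N-Triples as plain strings keeps the sketch free of third-party dependencies; a real tool would more likely use an RDF library and an entity-lookup step to check that a guessed DBpedia resource actually exists.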

