Spreadsheet Data Transformation for Ontology Engineering in Petrochemical Equipment Inspection Tasks

Authors: Dorodnykh N.O., Yurin A.Y.

Journal: Lecture Notes in Networks and Systems: 5th Intern. Scientific Conf. on Intelligent Information Technologies for Industry (IITI 2021, Sochi, 30 September - 4 October 2021)

Volume: 330


Year: 2022

Reporting year: 2021


Abstract: Ontologies remain one of the most effective ways to conceptualize and formalize domain knowledge. Their creation requires automation and improvement, including the use of various information sources. One domain that calls for ontology engineering is the diagnosis and assessment of the technical state of petrochemical equipment and technological complexes. Spreadsheets, in turn, are one of the most accessible and common ways of representing and storing information. They vary greatly and are heterogeneous in layout, style, and content. Spreadsheets are also a valuable source of structured domain knowledge. In this paper, we propose to automate ontology engineering for petrochemical equipment inspection tasks (including diagnosis and assessment of technical states) based on the analysis and transformation of spreadsheet data. For this purpose, we present a new technique that restores the semantics of tabular data and conceptualizes and formalizes tabular content in the form of ontologies. The main steps of the technique are: transforming arbitrary input spreadsheets into a canonical form; obtaining ontology fragments by analyzing and transforming the canonical spreadsheets; aggregating the ontology fragments into a complete ontological model; and generating the ontological model code in the OWL format. The proposed technique is implemented as a software prototype, which was evaluated on ontology engineering tasks for industrial petrochemical equipment. Spreadsheets from industrial safety inspection reports were used as the data source.
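
The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the canonical form or the mapping rules, so everything here is an assumption made for illustration. In particular, the sketch assumes that each table title becomes a class, that each column heading becomes a data property, and that the output uses OWL 2 functional-style syntax. The function names, the sample sheets, and the `http://example.org/inspection` IRI are all hypothetical.

```python
import re
from collections import defaultdict


def to_identifier(label, lower_first=False):
    """Turn a free-text heading into a CamelCase identifier."""
    words = re.findall(r"[A-Za-z0-9]+", label)
    name = "".join(word.capitalize() for word in words)
    return name[:1].lower() + name[1:] if lower_first else name


def canonicalize(title, grid):
    """Step 1 (simplified): flatten a grid whose first row holds column headings.

    Assumption: real spreadsheets need far richer layout analysis than this.
    """
    header, *body = grid
    return [
        {"table": title, "column_heading": heading, "data": value}
        for row in body
        for heading, value in zip(header, row)
        if value not in ("", None)
    ]


def extract_fragment(records):
    """Step 2: derive (class, data property) pairs from canonical records."""
    return {
        (to_identifier(r["table"]),
         to_identifier(r["column_heading"], lower_first=True))
        for r in records
    }


def aggregate(fragments):
    """Step 3: merge fragments into a mapping of property -> set of classes."""
    model = defaultdict(set)
    for fragment in fragments:
        for cls, prop in fragment:
            model[prop].add(cls)
    return dict(model)


def to_owl(model, iri="http://example.org/inspection"):
    """Step 4: serialize the model in OWL 2 functional-style syntax."""
    classes = sorted({cls for domain in model.values() for cls in domain})
    lines = [f"Prefix(:=<{iri}#>)", f"Ontology(<{iri}>"]
    lines += [f"  Declaration(Class(:{cls}))" for cls in classes]
    for prop in sorted(model):
        domain = sorted(model[prop])
        if len(domain) == 1:
            expr = f":{domain[0]}"
        else:
            # A property shared by several classes gets their union as its domain.
            expr = "ObjectUnionOf(" + " ".join(f":{c}" for c in domain) + ")"
        lines.append(f"  Declaration(DataProperty(:{prop}))")
        lines.append(f"  DataPropertyDomain(:{prop} {expr})")
    lines.append(")")
    return "\n".join(lines)


# Hypothetical sample data standing in for inspection-report tables.
sheets = {
    "Pressure vessel": [["Wall thickness", "Defect"], [12.5, "corrosion"]],
    "Pipeline": [["Wall thickness", "Diameter"], [8.0, 219]],
}
fragments = [extract_fragment(canonicalize(t, g)) for t, g in sheets.items()]
owl_text = to_owl(aggregate(fragments))
print(owl_text)
```

A property shared by several tables is given the union of their classes as its domain, because repeating a separate domain axiom per class would be read in OWL as the intersection of those classes.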

