Title: How to handle multiple expertise from several experts: a general text clustering approach
Author: Stéphane Lapalut
Reference: 2nd Knowledge Engineering Forum (KEF'96), Karlsruhe, SFB 501 Bericht 01/96, Kaiserslautern University, Germany, F. Maurer (Ed.), 1996
article (compressed postscript file, 136712 bytes)

Abstract: In the early stages of the knowledge acquisition process, interviews with experts produce a large volume of rich but ill-structured text. Knowledge engineers need tools to help them exploit these texts, especially when dealing with expertise from several experts working in different but overlapping domains. We propose the use of a statistical method, top-down hierarchical classification, together with a new interpretation of its results. The statistical analysis originally proposed by M. Reinert \cite{reinert79, reinert92} gives two kinds of results: first, a segmentation of the texts that reflects their ``semantic contexts'', which we use to bring out the structure of the texts and the relations between them; and second, classes of significant terms belonging to these contexts, which can be related to the experts or to their specialities. In this paper, we briefly describe the method, then explain how its results are exploited in a case study on a corpus of multiple texts, consisting of interviews with road safety experts. We conclude with some research directions for dealing with so-called ``ontologies'' of experts' domains.

Keywords: knowledge acquisition, multiple expertise, text clustering, text structuring, top-down hierarchical classification, statistical method.