BACKGROUND
Companies accumulate millions of text documents of different kinds: invoices, contracts, reports, presentations, etc. An ontology can help process them in many ways.
Problem
The documents are scattered and unorganised, and their content is never analysed, so huge amounts of knowledge are lost. To get an overview of all the information, it is necessary to contact each user who manages those files, one by one, and analyse the content with them.
Benefit
Using AI, a conceptual graph (ontology) describing the different properties found in the texts is generated, delivering an accessible summary of all documents.
METHODOLOGY & RESULTS
Architecture: On-premise development using local databases such as MySQL.
Development language: Java
ML techniques: Unsupervised learning algorithms and NLP techniques.
Results: Depending on the complexity of the texts and their associated vocabulary, the system generates a deeper or shallower conceptual graph that summarises the available concepts and links the most important aspects among them.
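Illustration: The idea of deriving a conceptual graph from raw text without labels can be sketched in a few lines of Java. This is not the project's implementation. It is a minimal sketch that assumes concepts are simply the non-trivial words of each document and that two concepts are linked when they appear together in at least a given number of documents (plain co-occurrence counting, chosen here as a simple unsupervised stand-in). The class name ConceptGraphSketch, the methods extractTerms and buildGraph, the stop-word list and the sample documents are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: terms act as concepts, and an edge links two terms
// that occur together in enough documents.
class ConceptGraphSketch {

    // Illustrative stop-word list (assumption); a real system would use a
    // fuller, language-specific list.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "a", "an", "of", "and", "for", "to", "in", "is", "on", "by", "with"));

    // Lowercases the text, splits on runs of non-letter characters, and keeps
    // tokens longer than two characters that are not stop words.
    static Set<String> extractTerms(String text) {
        Set<String> terms = new TreeSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.length() > 2 && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Builds an undirected weighted graph as an adjacency map. The weight of an
    // edge is the number of documents that contain both terms. Edges lighter
    // than minSharedDocs are pruned, then terms left without edges are dropped.
    static Map<String, Map<String, Integer>> buildGraph(List<String> documents, int minSharedDocs) {
        Map<String, Map<String, Integer>> graph = new TreeMap<>();
        for (String document : documents) {
            List<String> terms = new ArrayList<>(extractTerms(document));
            for (int i = 0; i < terms.size(); i++) {
                for (int j = i + 1; j < terms.size(); j++) {
                    addEdge(graph, terms.get(i), terms.get(j));
                    addEdge(graph, terms.get(j), terms.get(i));
                }
            }
        }
        for (Map<String, Integer> neighbours : graph.values()) {
            neighbours.values().removeIf(weight -> weight < minSharedDocs);
        }
        graph.values().removeIf(Map::isEmpty);
        return graph;
    }

    // Increments the weight of the directed half-edge from -> to.
    static void addEdge(Map<String, Map<String, Integer>> graph, String from, String to) {
        graph.computeIfAbsent(from, key -> new TreeMap<>()).merge(to, 1, Integer::sum);
    }

    public static void main(String[] args) {
        // Made-up sample documents, for illustration only.
        List<String> documents = Arrays.asList(
                "Invoice for the consulting contract with supplier",
                "Contract renewal: supplier invoice attached",
                "Quarterly report on supplier performance");
        System.out.println(buildGraph(documents, 2));
    }
}
```

With these three sample documents and a threshold of 2, only "contract", "invoice" and "supplier" stay connected, because every other pair of terms shares a single document. Raising or lowering the threshold is one simple way to make the resulting graph shallower or deeper; a production system would likely add stemming, multi-word terms and persistence of the graph in the local database.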
