Ontology-based Text Document Clustering

Nazia Nishat:
Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledge during preprocessing in order to improve clustering results and allow for selection between results. We preprocess our input data by applying ontology-based heuristics for feature selection and feature aggregation. Thus, we construct a number of alternative text representations. Based on these representations, we compute multiple clustering results using K-Means. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.

Link: https://www.kde.cs.uni-kassel.de/wp-content/uploads/benz/hotho/pub/Ontology_based_Text_Document_Clustering_2002.pdf
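
For anyone who wants to see the idea in code, here is a minimal sketch of the pipeline the abstract describes: map terms to ontology concepts (feature aggregation), drop terms the ontology does not cover (feature selection), then cluster the resulting vectors with K-Means. The toy ontology, the sample documents, and the function names (aggregate, to_vector, kmeans) are all assumptions made up for illustration; this is not the paper's actual heuristics, and it builds only one representation where the paper builds several.

```python
import math
from collections import Counter

# Hypothetical toy ontology (term -> concept); NOT taken from the paper.
TOY_ONTOLOGY = {
    "beef": "food", "pork": "food", "bread": "food",
    "hotel": "accommodation", "hostel": "accommodation",
    "camping": "accommodation",
}

def aggregate(tokens, ontology):
    """Feature aggregation: count concepts instead of raw terms.
    Terms missing from the ontology are dropped (a crude feature selection)."""
    return Counter(ontology[t] for t in tokens if t in ontology)

def to_vector(counts, features):
    """Turn concept counts into a length-normalised vector over `features`."""
    vec = [counts.get(f, 0) for f in features]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Plain K-Means. Assumption: the first k vectors seed the centroids,
    which keeps the sketch deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every document.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

docs = [
    "beef and pork with bread",
    "hotel or hostel near the beach",
    "fresh bread and beef",
    "camping and hostel prices",
]
features = sorted(set(TOY_ONTOLOGY.values()))
vectors = [to_vector(aggregate(d.split(), TOY_ONTOLOGY), features) for d in docs]
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

Each cluster can then be described by the concepts that dominate it ("food" versus "accommodation" here), which is the kind of explanation the abstract refers to. Choosing a different set of concepts from the ontology would give a different representation and possibly a different clustering.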

s.arman:
Thanks for sharing

khalid:
helpful

lamisha:
Informative post madam
