Real-time optimisation algorithms for solving clustering problems in very large data sets

Project Title:

Real-time optimisation algorithms for solving clustering problems in very large data sets.

Supervisor(s):

A/Prof Adil Baghirov and A/Prof Madhusudan Chetty

Contact person and email address:

A/Prof Adil Baghirov, a.bagirov@federation.edu.au

A brief description of the project:

The rapid development of new technologies in science and communications has led to significant growth in the volume of data. Huge amounts of data are available from sources such as social media, online transactions, network sensors, satellites and astronomical observations. Dealing with massive amounts of data poses a challenge for researchers and practitioners because of the physical limitations of current computational resources. Such data may be available in different forms and, in particular, in the form of data streams. A data stream is a massive sequence of data objects, and in general this sequence may be unbounded. Developing accurate clustering algorithms for such data sets is important because these algorithms recover the cluster structure of a data set more accurately and with the smallest number of clusters. Such outcomes may help to improve the decision-making process.

In many applications clustering is treated as an off-line process; that is, the computational effort available to the algorithm is assumed to be unlimited. However, clustering of data streams is in many applications an on-line process, and algorithms may use only limited computational effort. More specifically, at each iteration the clustering must be completed within a given time frame. The aim of this proposal is to develop real-time, accurate clustering algorithms based on optimisation techniques for solving clustering problems in massive streaming data sets.
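To make the on-line constraint concrete, the following is a minimal sketch of a time-budgeted update over one chunk of a stream. It is an illustration only and not the method proposed in this project: it uses a simple sequential k-means update as a stand-in for an optimisation-based algorithm, and the names nearest, update_within_budget and budget_seconds are hypothetical.

```python
import random
import time


def nearest(centers, x):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(centers[j], x)),
    )


def update_within_budget(centers, counts, chunk, budget_seconds):
    """Apply sequential k-means updates to one chunk until the deadline passes.

    centers and counts are modified in place. Returns the number of points
    that were processed before the time budget ran out.
    """
    deadline = time.monotonic() + budget_seconds
    processed = 0
    for x in chunk:
        # Stop as soon as the time frame for this iteration is exhausted.
        if time.monotonic() >= deadline:
            break
        j = nearest(centers, x)
        counts[j] += 1
        step = 1.0 / counts[j]
        # Move the chosen center towards x; this keeps it equal to the running
        # mean of the points assigned to it so far.
        centers[j] = [c + step * (a - c) for c, a in zip(centers[j], x)]
        processed += 1
    return processed


if __name__ == "__main__":
    # Assumed toy stream: chunks drawn around two points in the plane.
    random.seed(0)
    centers = [[0.0, 0.0], [1.0, 1.0]]
    counts = [1, 1]
    for _ in range(5):
        chunk = [
            [random.gauss(m, 0.5), random.gauss(m, 0.5)]
            for m in random.choices([0.0, 5.0], k=1000)
        ]
        done = update_within_budget(centers, counts, chunk, budget_seconds=0.01)
        print(done, "points processed in this time frame")
```

In this sketch, points that cannot be handled before the deadline are simply skipped. A real-time algorithm would need a more careful policy, for example summarising or sampling the unprocessed part of the chunk.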