Academic publications which I have authored are listed below, with PDF links. Please note that all papers are included for academic research and educational purposes only; their copyright is retained by the publishers.
Feel free to contact me through my university email address (included in the papers) for any clarifications on the contents of the following publications. Program code is also available on request.
The list below is automatically updated from csauthors.net and so will contain reasonably accurate listings for all published work along with DOI links. Additional details on papers (with PDF links) are available in the list further down this page.
In addition, a (beta) .bib file of my published work can be downloaded at this external link.
The first paper below is an initial look at using GP to automatically augment existing datasets with "redundant features", in order to allow the creation of challenging feature selection datasets.
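As a rough illustration only (not the method from the paper), a "redundant feature" can be thought of as a new column computed purely from existing ones, so it adds no new information. The sketch below picks random pairs of original features and combines them with a fixed operator set. A GP system would instead evolve the expressions. The function name `add_redundant_features`, the `OPERATORS` table, and the toy data are all assumptions made for this example.

```python
import random

# Hypothetical operator set; a GP system would evolve whole expression trees instead.
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def add_redundant_features(rows, n_new, seed=0):
    """Append n_new columns, each computed from two distinct original columns."""
    rng = random.Random(seed)
    n_original = len(rows[0])
    # Each recipe records which operator and which two source columns were used.
    recipes = []
    for _ in range(n_new):
        i, j = rng.sample(range(n_original), 2)
        name = rng.choice(sorted(OPERATORS))
        recipes.append((name, i, j))
    augmented = [
        list(row) + [OPERATORS[name](row[i], row[j]) for name, i, j in recipes]
        for row in rows
    ]
    return augmented, recipes


# Toy data: two instances with three original features each.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
augmented, recipes = add_redundant_features(rows, n_new=2)
```

Because every recipe is recorded, a feature selection method run on `augmented` can be checked against the known redundant columns.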
The set of papers below focuses on performing simultaneous clustering and feature reduction (i.e. selection or construction) using EC techniques. Clustering is an inherently difficult problem, as it is generally performed in an unsupervised manner and often the number of clusters (K) is not even known in advance. My work has focused on investigating potential representations, fitness functions, and evaluation metrics for producing good clustering results while using the minimum number of features (which in turn improves interpretability and decreases the complexity of both the solutions and the search space).
Using GP to automatically evolve similarity functions for performing graph-based clustering:
Using GP to perform feature construction (FC) to improve the performance of k-means clustering:
The first paper below is an initial comparison of Particle Swarm Optimisation (PSO) representations for simultaneous clustering and feature reduction, covering both the case where K is known and the case where it is unknown. In particular, it highlights the potential of a medoid-based approach to give better performance than the classically used centroid representation. The second paper below, which is more recent, extends this work significantly using a more advanced three-stage approach: the number of clusters is first estimated; simultaneous feature selection and clustering is then performed using PSO (with a medoid approach that is encouraged to search around the estimated K); and finally a pseudo-local search is applied to fine-tune the clusters produced.
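To make the medoid idea concrete, here is a minimal sketch of how a single particle might be decoded into a feature subset plus a set of medoids, with each instance then assigned to its nearest medoid using only the selected features. The encoding (a 0.5 threshold for feature selection, medoid positions scaled to instance indices), the function names, and the toy data are assumptions for illustration and are not taken from the papers. The PSO update loop, the estimation of K, and the local search stage are omitted.

```python
import math


def decode_particle(position, data, k):
    """Split a particle into a feature mask and k medoid indices (hypothetical encoding)."""
    n_features = len(data[0])
    # First n_features values: a feature is selected when its value exceeds 0.5.
    mask = [v > 0.5 for v in position[:n_features]]
    # Next k values, assumed to lie in [0, 1]: each is scaled to an instance index.
    # Note that this simple scheme does not prevent two medoids from coinciding.
    medoids = [
        min(int(v * len(data)), len(data) - 1)
        for v in position[n_features:n_features + k]
    ]
    return mask, medoids


def masked_distance(a, b, mask):
    """Euclidean distance computed over the selected features only."""
    return math.sqrt(sum((x - y) ** 2 for x, y, keep in zip(a, b, mask) if keep))


def assign_clusters(data, mask, medoids):
    """Label each instance with the position of its nearest medoid."""
    labels = []
    for row in data:
        dists = [masked_distance(row, data[m], mask) for m in medoids]
        labels.append(dists.index(min(dists)))
    return labels


# Toy data: the first two features separate the groups, the third is noise.
data = [
    [0.0, 0.1, 9.0],
    [0.2, 0.0, 1.0],
    [5.0, 5.1, 8.0],
    [5.2, 4.9, 2.0],
]
position = [0.9, 0.8, 0.1, 0.0, 0.6]  # 3 feature values, then 2 medoid values
mask, medoids = decode_particle(position, data, k=2)
labels = assign_clusters(data, mask, medoids)
```

Unlike a centroid, a medoid is always an actual instance in the dataset, so in this encoding each cluster centre is stored as a single index rather than one coordinate per feature.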
The following two papers summarise the research done as part of my Honours project. They discuss the use of high-level image feature extraction directly within a GP program. Two approaches are proposed, using Histogram of Oriented Gradients (HoG) and Speeded Up Robust Features (SURF) respectively. Using GP to automatically optimise high-level feature extraction methods allows these generic methods to be tailored to a problem domain.
The paper below showcases an application of GP to a real-world problem: the automated quantitative analysis of algae in river images. GP was trained on a variety of images from the Hutt River in Wellington, New Zealand, as well as other rivers in the Nelson Region.
My Honours project was titled "Genetic Programming for Image Classification using High-Level Features". The report provides additional details and examples beyond what is included in the conference papers, as well as a broad background on the EC, GP and image analysis domains. I will endeavour to write a more useful discussion of the contents of the report at some stage, but for now the abstract hopefully gives an acceptable summary:
"Image analysis is a key area in the computer vision domain that has many applications. Genetic Programming (GP) has been applied to this area extensively, with positive results. High-level features extracted from methods such as Speeded Up Robust Features (SURF) and Histogram of Orientated Gradients (HoG) are commonly used for object detection using machine learning techniques. However, GP techniques are not often used with these methods, despite being applied extensively to image analysis problems. This work investigates several novel approaches for using GP with high-level features for image classification. These new approaches are applied across a range of datasets, with promising results when compared to a variety of well-known machine learning techniques. Some high-performing GP individuals are analysed to give insight into how GP can effectively be used with high-level features. The use of GP for feature extraction and construction is also investigated, achieving high performance using only a few constructed features."