In my research over the last several years, I have explored numerous variations of this approach. I have experimented with several different clustering algorithms as well as several different techniques for action selection. The goal of this research agenda is to make it easier to program a robot to exhibit a desired behavior given complex sensor inputs.
My first work in this area was to apply clustering as a technique for enabling sonar values to be used as state indices for a robot employing the popular reinforcement learning algorithm Q-Learning. In the research I did on this subject, I employed the Kohonen Self-Organizing Map (SOM) as the clustering algorithm that performed that transformation.
Part of the appeal of the SOM was its ability to learn and adapt the clusters on every cycle as new sonar data arrived. I became intrigued by the possible applicability of this idea to images. To that end, I stopped investigating the Q-Learning context and pursued the idea of enabling a user to directly specify actions for each cluster. I employed the Growing Neural Gas algorithm in that work, as I was intrigued by its potential to adaptively determine an appropriate number of clusters. I followed up by replacing the manual action-selection GUI I built with an imitation learner.
In continuing to work with GNG, I became frustrated with its large number of hyperparameters. I found it tedious and frustrating to try to figure out appropriate values. Furthermore, although I had initially been intrigued by its flexibility in terms of the total number of clusters, I was starting to find this to be a liability. With a variable number of clusters, the cycle time becomes unpredictable. I concluded it was preferable to be able to bound the cycle time based on knowing ahead of time the total number of clusters.
To address these concerns, I invented my own clustering algorithm, Bounded Self-Organizing Clusters (BSOC), and benchmarked its performance against the SOM. The key ideas of BSOC are as follows:
- There is a fixed number of clusters.
- This is the only hyperparameter.
- Each input is added to the set of clusters as it arrives.
- If the maximum number of clusters is exceeded, the two clusters closest to each other according to the distance metric are merged.
- Both the distance and the merge are weighted by the number of "ancestors" that the clusters have.
Motivated by the need to get a better understanding of the dynamics of the BSOC clustering process, I decided to benchmark it against the well-known k-means clustering algorithm. To simplify the analysis, instead of the complete image clustering I had been doing in previous work, I focused instead on the problem of color quantization. I created two videos with my Android phone as test cases for the benchmarking process. As the phone's resolution is 1920x1080, clustering every pixel over 10 frames of video (spaced 4.5 seconds apart) results in over 20 million values to cluster. That proved to be about as much as my k-means implementation could handle before running out of heap space. (Maybe Java wasn't the best language in which to program this.)
Wanting to process more frames, I concluded I needed another benchmark algorithm. I quickly implemented an algorithm I called Random Incremental Clustering (RIC). The concept is very simple:
- Specify a maximum number of clusters k.
- Initialize the number of attempts a to zero.
- As each new input arrives:
- Increase a by one.
- If fewer than the maximum number of clusters is in place, add it to the clusters.
- Otherwise, with probability k/a, replace at random one of the existing clusters with the newly received input.
Creating this algorithm enabled me to benchmark BSOC with close to 200 million pixel inputs. The surprising result was that as the number of clusters increased to 8, the performance of RIC began to closely approximate both BSOC and k-means. (With 2 clusters, RIC performed very poorly.)
I would not want to conclude too much from this result as of yet. This convergence of performance occurred with an extremely large number of inputs for the very simple RGB representation of a pixel's color. This result might not hold when clustering entire images, as I had been doing in prior work. The result might also improve with an approach to randomized clustering that does non-uniform sampling.
I am looking forward to the next stages of this research project, in which I plan to explore further the questions that arose from this latest study. I continue to be fascinated by the possibilities that real-time clustering enables, and I plan to investigate both the techniques for clustering and its applications. If anyone reading this is interested in collaborating or comparing results, definitely let me know!
Here is a summary of the papers and posters I have written on the topic of this post:
- Gabriel J. Ferrer. "Performance Evaluation of a Real-Time Clustering Algorithm." In FLAIRS-31 (Melbourne, FL, May 21-23, 2018) (Poster PDF)
- Gabriel J. Ferrer and Eric Huynh. "Real-Time Imitation Learning of Visual Behavior by a Mobile Robot." In Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference (Marco Island, FL, May 22-24, 2017) (PDF, implementation)
- Gabriel J. Ferrer. "Real-Time Unsupervised Clustering." In 27th Modern Artificial Intelligence and Cognitive Science Conference (MAICS-2016) (Dayton, OH, April 22-23, 2016) (PDF, implementation)
- Gabriel J. Ferrer. “Towards Human-Induced Vision-Guided Robot Behavior.” In Proceedings of the 2014 AAAI Fall Symposium: Knowledge, Skill, and Behavior Transfer in Autonomous Robots (Arlington, VA, November 13-15, 2014). (PDF)
- Gabriel J. Ferrer, "Creating Visual Reactive Robot Behaviors Using Growing Neural Gas," 25th Modern Artificial Intelligence and Cognitive Science Conference (MAICS2014) (Spokane, WA, April 26, 2014) (PDF)
- Gabriel J. Ferrer, "Encoding Robotic Sensor States for Q-Learning Using the Self-Organizing Map," Journal of Computing Sciences in Colleges 25:5 (May 2010) 133-139. (PDF)
No comments:
Post a Comment