Tuesday, May 10, 2016

Event handling with JavaFX ChoiceBox objects

Writing event handlers for many JavaFX components is relatively straightforward. In comparison, writing event handlers for the JavaFX ChoiceBox can be very confusing. Once you get the hang of it, though, it really is not too difficult.

For this example, we will build a JavaFX GUI containing a ChoiceBox and two TextFields. One TextField will allow the user to enter a string, which will then be placed in the ChoiceBox. The other TextField will show the length of the currently displayed String from the ChoiceBox. I'll provide the controller class below; the remainder of the interface is an exercise for the reader.

import javafx.fxml.FXML;
import javafx.scene.control.ChoiceBox;
import javafx.scene.control.TextField;

public class ChoiceBoxDemoController {
 
  @FXML ChoiceBox choices;
  @FXML TextField entries;
  @FXML TextField length;
 
  @FXML
  void initialize() {
    length.setEditable(false);
  
    entries.setOnAction(event -> {
      if (entries.getText().length() > 0) {
        choices.getItems().add(entries.getText());
        entries.setText("");
        int added = choices.getItems().size() - 1;
        choices.getSelectionModel().select(added);
      }
    });
  
    choices.getSelectionModel().selectedItemProperty()
    .addListener((obs, oldV, newV) -> 
    length.setText(Integer.toString(newV.length())));
  }
}

Note that we have created two different event handlers. The handler for the TextField is very straightforward. The setOnAction() method specifies a handler for whenever the user types the Enter key on that TextField. The ChoiceBox is backed by an ObservableArrayList containing its items. This list is accessed using the getItems() method. So to stick an item on the end of the list, I just use its add() method. I also want to update the ChoiceBox to display what I just added. The index will be the last possible index (i.e., size() - 1), and the select() method of the SelectionModel will designate it. Note that calling select() will also trigger the selection event handler described below.

In contrast, setting up the ChoiceBox event handler is more involved. There are several levels of object that we must traverse before actually being able to set it up. Please note that I do not seek here to justify this design, only to explain it! First, we need to get a reference to the SelectionModel object. Next, we get a reference to the SelectedItemProperty. This property is an Observer of the currently selected item. (If you wish to work with the item index, use the SelectedIndexProperty instead.) It is to this property that we can finally attach an event handler.

The handler itself conforms to the ChangeListener interface. The changed() method takes three arguments. The first argument ("obs") is a reference to the SelectedItemProperty to which the handler is attached. I've never found a use for this. The second argument ("oldV") is the value of the selected ChoiceBox item before the event occurs. The third argument ("newV") is the value of the selected ChoiceBox item after the event occurs. This last argument, in particular, tends to be the most useful, and it is what I use here in this example. What I recommend avoiding at all costs is any reference back to the ChoiceBox object itself; I have found the program's behavior to be much more predictable using these arguments.

Friday, May 6, 2016

JavaFX Data Structures Labs

This semester, Mark Goadrich and I have been working to overhaul the Hendrix Data Structures labs to make use of JavaFX. We use Eclipse. For each assignment, students are given a suite of JUnit tests that their code must pass. One of our goals is to get them accustomed to running JUnit tests. In our experience, this makes it easier for them to get used to writing their own JUnit tests in the next course on software development. We also make heavy use of Enum data types.

A major goal of these labs is to provide a context for each data structure. This way, students can see data structures as components of a larger system, rather than just abstractions in isolation.

While these labs are definitely a work in progress at the moment, I'd like to share some of what we have achieved thus far.

We begin by getting students up to speed with creating Eclipse projects and familiar with interfaces. To that end, they implement strategies for the Prisoner's Dilemma.

The next lab involves implementation of stacks. They must implement the data structures themselves. We provide a JavaFX GUI that enables them to visualize the behavior of what they create.

This is followed by a lab where they create a maze data structure. Again, we provide the GUI, but they create the underlying data structure. This sets them up for the next lab, where they implement maze solvers using stacks and queues. This is also the stage where we introduce the implementation of generic data types.

To illustrate and visualize doubly-linked lists, their next task is to implement a text editor. We again provide the GUI. The students implement the doubly-linked list. In creating a cursor to navigate the document, we also provide a gentle introduction to the concept of an iterator.

As no data structures course is complete without sorting algorithms, in the next lab the students implement animated sorting. Again, we provide the GUI and they provide the algorithms.

Having worked with four JavaFX projects up to this point, in the next lab the students are finally responsible for a GUI implementation. We guide them very closely through the process of creating it. After the lab, the students have a project assigned in which they are to create another GUI in a more free-form way.

Another classic data structure is the binary search tree, and in the next lab students create their own, with a simple visualization.

One of the most ubiquitous features of the modern age is the text-predicting device. We introduce students to the use of a trie for text prediction. The GUI shows the predicted text based on the input suffix. In practice, the trie is a more efficient tree structure than the binary search tree when the keys are strings, and we thought it would be valuable to introduce students to this concept.

The classic lookup data structure is of course the hash table. We actually struggled a bit to come up with a context in which a hash table was actually an appealing data structure (as opposed to a tree or trie). That is, every context we devised, most especially with string keys, seemed to involve an application where maintaining a sorted order was beneficial.

What we came up with as our application was the creation of a tic-tac-toe program that learns from experience. The students implement a hash table to store board positions and expected outcomes.

In our final lab, we revisit maze searching in order to introduce heaps.  Students create a heap that is then used in an implementation of A*.

Some more labs we'd like to work on would include:

  • Balanced binary trees
  • Functional programming concepts introduced in Java 8, including
    • lambdas
    • stream computation.
      • With regard to stream computation, it would be cool to come up with something that parallelizes nicely on a multicore.
I'd love to hear from anyone who finds any of this useful, or who has other ideas to share!

Real-time machine learning

In this post, I give an overview of my recent research efforts in the area of real-time machine learning. I assume a basic familiarity with machine learning concepts. I've written a brief and friendly introduction to machine learning for any readers who would find it helpful before proceeding with this post.

Definitions

Arthur Samuel, in his classic 1959 paper "Some studies in machine learning using the game of Checkers", stated the goals of machine learning in the following terms:
...it is necessary to specify methods of problem solution in minute and exact detail, a time-consuming and costly procedure. Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.
In Liu and Layland's 1973 paper "Scheduling algorithms for multiprogramming in a hard-real-time environment", a key characteristic ("Assumption 4", in their description) of a task that is amenable to real-time scheduling is:
Run-time for each task is constant for that task and does not vary with time. Run-time refers to the time which is taken by a processor to execute the task without interruption.
Putting these two ideas together, then, my current focus is on finding appropriate algorithms and data structures for real-time machine learning. Real-time machine learning data structures have these characteristics:
  • By learning from experience, they reduce programmer effort in creating and deploying a software system.
  • They update themselves to incorporate each new experience in constant time.
There are a couple of existing data structures with these characteristics that I'd like to highlight:
  • Q-learning maintains a two-dimensional array as its core data structure. This array is indexed by states and actions. For each state-action pair, an expected-reward ("Q") value is stored. This value is what the system expects to receive for applying the given action in the given state. 
    • Assuming that a mapping from sensor values to states can be performed in constant time, this is a real-time algorithm.
    • By adjusting the Q-values based on experience, the agent's behavior should become more consistent with the system design goals.
  • The Kohonen Self-Organizing Map (SOM) is an unsupervised learning algorithm that maintains a two-dimensional array of clusters. In its traditional formulation, these clusters are formed from multiple passes over the input data. 
    • By following the example of Touzet 1997 ("Neural reinforcement learning for behavior synthesis"), the SOM can be reconfigured to incorporate one input at a time, thus becoming a real-time machine learning algorithm.
    • The specific task Touzet addressed was for the SOM to cluster sonar readings. Each cluster then became a state designation for Q-learning.
One reason I find unsupervised learning algorithms like the SOM intriguing is that the concept of a cluster can be used to model the formation of invariant representations in the neocortex. For each concept we know, we have a sensory representation in the neocortex. For example, for the concept of an "apple", the neocortex has a stored concept of what you expect to perceive through the senses. Unsupervised learning algorithms provide a means of computationally modeling this concept. From this perspective, the application of the SOM to Q-learning could be interpreted as a learning architecture that is modeling a cognitive process.

Bounded Self-Organizing Clusters

In a later paper (Touzet 2006, "Modeling and simulation of elementary robot behaviors using associative memories"), Touzet developed an alternative to Q-learning using two SOMs. Intrigued by the concept, I attempted to replicate it. I got frustrated with the inadequate coverage of the input space I was getting from the SOM, however. 

What do I mean by "inadequate coverage"? The performance metric I have chosen for this purpose is to minimize the Sum-of-Squared-Differences (SSD) between each training input and the reference input for its best-matching cluster. Informally speaking, I'd like every input to be "reasonably close" to the reference input of some cluster.

To address this problem, I devised a novel real-time clustering algorithm, Bounded Self-Organizing Clusters (BSOC). The core of the idea is as follows:
  • Initialize with a fixed maximum number M of clusters.
  • Until M clusters are reached, save every input as a reference input for a new cluster.
  • Once M clusters are in place, as each new input arrives merge the two closest reference inputs into a new reference input, thus creating space for the new arrival.
In my paper "Real-time unsupervised clustering", presented at MAICS-2016, I give a full presentation of the algorithm, as well as an experimental assessment showing it to outperform the SOM on three data sets I assembled. My presentation at the conference is available for viewing as well, for any who would be interested.

My implementation is available via Github. It is an Eclipse project that should be ready to run. I have included several things:

Real-Time Performance

A relevant issue that I did not address in the paper is the actual frame rate of employing this algorithm on physical hardware when employing images as the input representation. I presently have two such implementations: one for the Lego Mindstorms EV3, and the other running on my MacBook Pro Retina.

DeviceResolution# ClustersFrames/second
EV3160x120160.84
EV380x60162.85
EV340x30169.19
EV340x30325.52
MacBook176x144165.37
MacBook88x721612.02
MacBook88x723210.60
MacBook88x72646.02

With a modest number of clusters and a small but usable image size, we can see that real-time processing is achievable with this data structure.

Demonstration of Video Clustering

This demonstration can be replicated by running the VideoView program in the distributed code. This is the second video discussed in the paper. This video demonstrates the following features of BSOC:
  • When driving straight, the cluster assignment for each image in the sequence is relatively stable.
  • The reference inputs are reasonable interpolations of the input video frames from which they were created.

Next Steps

Supervised Learning

This summer, I will work with a Hendrix undergraduate student to use BSOC as the basis for a supervised learning algorithm. BSOC will serve as a back-end for an implementation of the k-nearest-neighbor (kNN) algorithm. The standard kNN algorithm retains every single labeled training input. As the number of training samples grows, the retrieval time increases proportionately, preventing it from working as a real-time machine learning algorithm. With BSOC as a kNN back-end, we hope to determine the performance tradeoffs involved between meeting real-time deadlines and the quality of the learned classifier.

Improved Cluster Matching with Bag-of-Words

As the number of clusters and images sizes increase, time performance degrades rapidly. We plan to investigate using BRIEF descriptors to create image-vocabulary trees following the approach described by Gálvez-López and Tardós in their paper "Bags of Binary Words for Fast Place Recognition in Image Sequences." By indexing the clusters based on the visual words they contain, we hope to be able to ignore large numbers of irrelevant reference inputs when finding the best match. An additional possibility would be to redefine the distance metric itself in terms of the visual words contained in the images.

Behavior learning

I mentioned earlier that this work was inspired by my attempts to replicate the work described in (Touzet 2006). I still haven't completed that project. In particular, I want to adapt the system he describes to use image input instead of sonar input for defining reactive behaviors.

Collaboration?

If anyone is interested in experimenting with these ideas, let me know. I'd be happy to collaborate with others who have synergistic goals. If nothing else, we could probably work together to assemble a diverse suite of data sets for testing and experimentation.

Recent Revision and Update

After the conference, I reviewed my source code, and it had diverged a bit from my pseudocode in the paper. In particular, there was a rather Byzantine scheme in place for determining whether a newly presented input ought to be merged into an existing cluster, or whether two existing clusters ought to be merged. I simplified it as follows:
  • Maintain an array with M+1 elements.
  • When adding a new reference input fills the array, find the two closest clusters, merge them, and eliminate the extra cluster, retaining only M elements in the array.
This implementation was much closer to the pseudocode in the paper and also took fewer lines of code. In addition, it also improved the performance of the BSOC2 algorithm. I'm including revised versions of Figures 8, 9, 10, and 11 below. As it happens, the revisions did not affect the performance of BSOC1 at all. All variations are retained in the Github repository.



Thursday, May 5, 2016

A brief overview of machine learning

This article is a brief introduction to machine learning, for those wishing to have some idea of what the term entails.

Arthur Samuel, in his classic 1959 paper "Some studies in machine learning using the game of Checkers", stated the goals of machine learning in the following terms:
...it is necessary to specify methods of problem solution in minute and exact detail, a time-consuming and costly procedure. Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.
Many decades later, two commonly employed types of machine learning algorithms are supervised and unsupervised learning algorithms:

  • Supervised learning
    • The algorithm is presented with a body of training input. 
      • Each input has a human-designated label. 
    • After the training process is complete, when presented with an unlabeled input the data structure produces a label. 
      • Ideally, that label will be what the human labeler "would have picked" for that input.
  • Unsupervised learning
    • The algorithm is presented with a body of training input.
      • The inputs do not have human-designated labels.
    • A data structure is built that associates inputs that "look similar".
      • This similarity is determined by a user-submitted distance metric.
    • Each group of "similar" inputs is referred to as a cluster.
      • Each cluster is defined in terms of a reference input. This reference input is typically calculated as a weighted average of the inputs belonging to the cluster.
    • After the training process is complete, when presented with a new input the data structure indicates the cluster to which it belongs.
      • This is typically calculated by finding the distance between the new input and the reference input for each cluster.
      • The reference input with the shortest distance to the new input determines the winning cluster.
What type you use depends on the problem to be solved and the data that is available. 

Revisiting our quotation from Arthur Samuel, the goal of machine learning is to "eventually eliminate the need" for "specify[ing] methods of problem solution in minute and exact detail, a time-consuming and costly procedure." 

Seen in this light, if labeled data is available, supervised learning is often preferable. Since the learned data structure yields a symbolic label for each input, the software in which the learner is embedded can be directly programmed to respond in specific ways that reflect the known semantics of each label.

For example, if one wishes to create a spam filter, supervised learning is a great approach. The labels are easily obtained, as the email user assigns labels to spam emails by putting them in a junk folder. In time, the system should improve its ability to predict whether a given email is spam or not.

Now, if the available data has no labels, unsupervised learning is the only choice. It often proves useful to examine the reference inputs for the clusters that are learned as a first step in getting a deeper understanding of the data. This is, in practice, probably the most common use case for unsupervised learning.

If, however, the unsupervised learner is embedded in a larger software system, some additional human input will be needed to assign workable semantics to the clusters that were learned. This concept intrigues me and has been a focus of my work in recent years. One example is a clustering system I created to facilitate building robot behaviors. I presented the work at MAICS-2014. The robot drives around and builds clusters based on the images it sees. The user then assigns a move to each reference input. By examining the reference inputs, the human is able to determine an appropriate semantics for that cluster.

Of course, it would be really cool to have the robot automatically assign the actions to the clusters based on the input from a human pilot. I have made some tentative steps in that direction, but it remains a work in progress.

Some commonly-used supervised learning algorithms include:
  • Decision trees
  • Multi-layer perceptrons 
    • Also called "feed-forward neural networks"
    • "Deep learning" systems fall into this category
  • Support vector machines
  • Naive Bayesian classifiers
  • k-nearest-neighbor
Well-known unsupervised learners include:
  • Kohonen Self-Organizing Map (SOM)
  • k-means clustering
  • Growing Neural Gas (GNG)

Wednesday, May 4, 2016

JavaFX and webcams

There is an excellent webcam library available for Java. Unfortunately, when trying to integrate it with a JavaFX application I got into some kind of threading trouble. I was able to escape this by creating my own thread to deal with the webcam.

My demo is available on github as an Eclipse project. You'll need to download the aforementioned library and set up its three JAR files in your build path to make it work. Beyond that, it should be straightforward.

Here are some highlights from the code I wrote.

The ImageThread class handles interactions with the webcam. It requires a function to actually perform the rendering on the interface, but aside from that it takes care of dealing with the webcam. I separated out the rendering function so that anyone can just borrow this class wholesale for whatever application they are writing. Note that it is pretty easy to add additional image processing to the threaded loop.

The implementation is straightforward. When the thread starts, it opens the webcam and begins counting frames and time. Whenever the webcam has a new image available, it grabs the image and calls the renderer with both the image and the current frame rate.

Be sure that your rendering function invokes Platform.runLater() to avoid threading problems!

package edu.hendrix.webcamdemo;

import java.awt.image.BufferedImage;
import java.util.function.BiConsumer;

import com.github.sarxos.webcam.Webcam;

public class ImageThread extends Thread {
  private boolean quit = false;
  private int frames;
  private BiConsumer<Bufferedimage,Double> renderer;
 
  public ImageThread(BiConsumer<Bufferedimage,Double> renderer) {
    this.renderer = renderer;
  }
 
  public void quit() {quit = true;}
 
  @Override
  public void run() {
    Webcam webcam = Webcam.getDefault();
    webcam.open();
    frames = 0;
    long start = System.currentTimeMillis();
    while (!quit) {
      if (webcam.isImageNew()) {
        BufferedImage img = webcam.getImage();
        frames += 1;
        renderer.accept(img, 1000.0*frames / (System.currentTimeMillis() - start));
      }
    }
    webcam.close();
  }
}

The WebcamDemoController class mediates between the GUI and the ImageThread. This gives a nice example as to how to provide a rendering function for the GUI.

package edu.hendrix.webcamdemo;

import javafx.application.Platform;
import javafx.fxml.FXML;
import javafx.scene.canvas.Canvas;
import javafx.scene.canvas.GraphicsContext;
import javafx.scene.control.TextField;

import java.awt.image.BufferedImage;

public class WebcamDemoController implements Quittable {
  @FXML Canvas image;
 
  @FXML TextField frameRate;
 
  ImageThread renderer;
 
  int frames;
 
  public static void render(BufferedImage img, Canvas canv) {
    double cellWidth = canv.getWidth() / img.getWidth();
    double cellHeight = canv.getHeight() / img.getHeight();
    GraphicsContext g = canv.getGraphicsContext2D();
    for (int x = 0; x < img.getWidth(); x++) {
      for (int y = 0; y < img.getHeight(); y++) {
        g.setFill(ColorChannel.buildColorFrom(img.getRGB(x, y)));
        g.fillRect(x * cellWidth, y * cellHeight, cellWidth, cellHeight);
      }
    }
  }
 
  @FXML
  void initialize() {
    renderer = new ImageThread((img, rate) -> Platform.runLater(() -> {
      render(img, image);
      frameRate.setText(String.format("%4.2f", rate));
    }));
    renderer.start();
    frameRate.setEditable(false);
  }

  @Override
  public void quit() {
    renderer.quit();
  }
}

Finally, I also created an Enum to handle the ARGB conversions. Since I'm interested in image processing, it is helpful to be able to directly access the pertinent color channels.

package edu.hendrix.webcamdemo;

import java.util.EnumMap;

import javafx.scene.paint.Color;

public enum ColorChannel {
  ALPHA {
    @Override
    public int shift() {
      return 24;
    }
  }, RED {
    @Override
    public int shift() {
      return 16;
    }
  }, GREEN {
    @Override
    public int shift() {
      return 8;
    }
  }, BLUE {
    @Override
    public int shift() {
      return 0;
    }
  };
 
  abstract public int shift();
 
  public int extractFrom(int pixel) {
    return (pixel >> shift()) & 0xFF;
  }
 
  public static int buildPixelFrom(EnumMap colors) {
    int pixel = 0;
    for (ColorChannel c: values()) {
      pixel += (colors.containsKey(c) ? colors.get(c) : 0) << c.shift();
    }
    return pixel;
  }
 
  public static javafx.scene.paint.Color buildColorFrom(int argb) {
    return Color.rgb(ColorChannel.RED.extractFrom(argb), 
        ColorChannel.GREEN.extractFrom(argb), 
        ColorChannel.BLUE.extractFrom(argb), 
        ColorChannel.ALPHA.extractFrom(argb) / 255.0);
  }
 
  public static EnumMap<ColorChannel,Integer> buildChannelsFrom(int argb) {
    EnumMap<ColorChannel,Integer> channels = new EnumMap<>(ColorChannel.class);
    for (ColorChannel c: values()) {
      channels.put(c, c.extractFrom(argb));
    }
    return channels;
  }
}

Monday, April 18, 2016

Overflow and comparators

When writing a Comparator for integer-valued types in Java, the following implementation seems straightforward and clever:

private Comparator<Integer> c = (i, j) -> i - j;

When I used this implementation when writing unit tests for a Data Structures assignment (implementing a heap), much to my surprise the results were rather buggy. I was generating random integers and throwing them in the heap, followed by checking the heap property. When I replaced this implementation with the following alternative, everything worked perfectly:

private Comparator<Integer> c = (i, j) -> i < j ? -1 : i > j ? 1 : 0;

So what's the difference? Overflow. Imagine i = 2135909750 and j = -21314406. The subtraction yields -2137743140, which, being negative, is not exactly the desired result. Of course, with roles reversed underflow is also a hazard.

The lesson for me today is that sometimes, when trying to be clever and succinct, don't be too clever!