Tuesday, November 6, 2018

An excellent guide to setting up OpenCV on Android

This post will be short and to the point.

I've been trying to get OpenCV going in Android Studio for quite some time. I finally found an article that clearly described how to get it done. Special thanks to Elvis Chidera for his great work on that tutorial!

Now it is time to start on the actually interesting work.

Wednesday, October 24, 2018

A subtle and interesting race condition


Race conditions are an important and common topic when discussing multithreaded programming. Every instructor benefits from having realistic examples when lecturing on this subject, and arguably the best examples come from real code. In the spirit of recording such an example, both for my benefit and for that of anyone reading this, let me describe a race condition I fixed today.

The program is an Android app that sends commands to an Arduino board to control a robot. When the user is ready to begin controlling the robot, the app starts up a thread to handle the communication. The communication thread maintains a TreeSet data structure (trueFlags) that tracks which sensory conditions are true. On each iteration of its loop, it updates the user interface to display these conditions for the user.

In an Android app, updating the user interface from a separate thread requires specifying a block of code for the UI thread to run when it is ready. In our example, the code looks like this:

    private void updateGUI(final Action currentAction) {
        getActivity().runOnUiThread(new Runnable() {   
            @Override
            public void run() {
                allTrueFlags.setText(trueFlags.toString());

                StringBuilder values = new StringBuilder();
                for (int i : sensorValues) {
                    values.append(i + ", ");
                }
                bytesReceived.setText(values.toString());
                bytesSent.setText(currentAction.toString());
            }
        });
    }

It looks pretty harmless. After all, only the UI data structures are being modified, and only on a thread set aside for that purpose! But it caused our app to crash at seemingly random intervals as the robot wandered around a room.

The trouble was the call to trueFlags.toString(). This TreeSet is constantly being modified in the communication thread. On the rare occasions when the UI thread's iteration over the set (inside toString()) overlapped with one of those modifications, a ConcurrentModificationException was thrown. Often, the robot would run for minutes at a time before this occurred!
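
To see this failure mode in isolation, here is a minimal standalone sketch (my own construction, not code from the app) in which one thread constantly mutates a TreeSet while another iterates over it via toString():

    import java.util.TreeSet;

    public class CMEDemo {
        public static void main(String[] args) {
            TreeSet<Integer> flags = new TreeSet<>();
            // Stand-in for the communication thread: constant mutation.
            Thread mutator = new Thread(() -> {
                int i = 0;
                while (true) {
                    flags.add(i % 100);
                    flags.remove((i + 50) % 100);
                    i++;
                }
            });
            mutator.setDaemon(true);
            mutator.start();
            // Stand-in for the UI thread: toString() iterates over the
            // set and soon throws ConcurrentModificationException.
            while (true) {
                System.out.println(flags.toString());
            }
        }
    }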

It was not difficult to fix this bug, but it took quite a lot of effort to find it! Hopefully, posting this example will help someone else out there identify a similar bug sometime. Here is the fixed code, which avoids the bug by moving the string conversion back into the communication thread:


    private void updateGUI(final String trueFlagString, final Action currentAction) {
        getActivity().runOnUiThread(new Runnable() {
            @Override
            public void run() {
                // trueFlagString was built on the communication thread, so
                // the UI thread never iterates over the live TreeSet.
                allTrueFlags.setText(trueFlagString);

                StringBuilder values = new StringBuilder();
                for (int i : sensorValues) {
                    values.append(i + ", ");
                }
                bytesReceived.setText(values.toString());
                bytesSent.setText(currentAction.toString());
            }
        });
    }
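
The only other change is at the call site: the communication thread now performs the conversion itself, while it has exclusive access to the set. A hypothetical call from the communication thread's loop (my sketch; the original loop is not shown in this post) looks like:

    // In the communication thread's loop: build the String here, where no
    // other thread is modifying trueFlags, and hand the immutable result
    // to the UI thread.
    updateGUI(trueFlags.toString(), currentAction);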

Thursday, August 16, 2018

Probability and Boolean Logic

To understand probability, it is helpful to understand boolean logic. In this post, I will review both and show the connections. This is a first draft of the explanation I have in mind. I welcome suggestions to improve the presentation of these ideas.

A probability is a real number between 0 and 1 inclusive. This number represents uncertain knowledge of an outcome. The sum of the probabilities of all possible outcomes of an event is 1.

A classic example is the event of flipping a coin. There are two possible outcomes: heads and tails. Without performing any experiments, we would estimate the probability of heads as 0.5 and tails as 0.5.

Boolean logic relies on variables that have two possible values (true/false and 1/0 are the most popular; we'll rely on true/false for the rest of this post). These variables can be combined and manipulated using the and, or, and not operators. The not operator has one argument, and returns true if the argument is false and false if the argument is true. The and operator has two arguments, returning true if both are true and false otherwise. The or operator has two arguments, returning false if both are false and true otherwise.

We can extend boolean logic to reason about probabilities. We will give arithmetic definitions of the three boolean operators for use with probabilities:
  • x and y -> x * y (multiplication)
  • x or y -> x + y (addition)
  • not x -> 1 - x (complement)
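
Expressed as Java, the three translations might look like this (a sketch of my own; the caveats at the end of this post explain when the and and or translations are valid):

public class ProbabilityOps {
    // Probability analogues of the boolean operators. and() assumes the
    // events are independent; or() assumes they are mutually exclusive.
    static double and(double x, double y) { return x * y; }
    static double or(double x, double y)  { return x + y; }
    static double not(double x)           { return 1.0 - x; }

    public static void main(String[] args) {
        // Two coin tosses: p(heads and heads) = 0.25
        System.out.println(and(0.5, 0.5));
    }
}
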
So if we do two coin tosses, the probability of getting heads twice is:
  • heads and heads
  • 0.5 * 0.5
  • 0.25
The same reasoning applies to getting tails twice, yielding the same probability (0.25). The probability of getting either heads twice or tails twice is:
  • heads and heads or tails and tails
  • 0.25 + 0.25
  • 0.5
Let's extend this line of reasoning to another example: the rolling of a six-sided die. I'll use fractions to describe the probabilities in order to avoid gratuitous repeating decimals. The probability of any particular value showing up is 1/6. To calculate the probability of a member of a set of values showing up, we can add them up individually, as this is just an or operation. So the probability of a 1 or a 4 is 1/6 + 1/6 = 2/6.

This gets more interesting when we want to compute probabilities such as "anything except a 6". Since that is equivalent to not 6, we calculate 6/6 - 1/6 = 5/6. An alternative is to sum the probabilities of the other five values individually, but using not makes the calculation much simpler.

In general, by using not we can greatly simplify our calculations. Consider calculating the probability of getting 16 or less when rolling three dice. If we view it as "not (17 or 18)", the calculation becomes:
  • p(18) = p(6, 6, 6) = p(6) * p(6) * p(6) = 1/6 * 1/6 * 1/6 = 1/216.
  • p(17) = p(6, 6, 5) + p(6, 5, 6) + p(5, 6, 6) = 3/216.
  • p(16 or less) = not (p(17) or p(18)) = 1 - (1/216 + 3/216) = 1 - 4/216 = 212/216 = 53/54.
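
As a sanity check, a short brute-force enumeration (my addition, not part of the original calculation) counts the outcomes directly:

public class ThreeDice {
    public static void main(String[] args) {
        int atMost16 = 0, total = 0;
        for (int a = 1; a <= 6; a++)
            for (int b = 1; b <= 6; b++)
                for (int c = 1; c <= 6; c++) {
                    total++;
                    if (a + b + c <= 16) atMost16++;
                }
        // Prints 212/216, in agreement with 53/54.
        System.out.println(atMost16 + "/" + total);
    }
}
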
Two important caveats:
  1. To represent and as multiplication, the events must be independent of each other. This applies, for example, to coin flipping, because nothing about the first flip has any causative effect on the outcome of the second flip. 
  2. To represent or as addition, the events must be mutually exclusive. For example, a die roll cannot be both a 1 and a 4 at the same time.


So, to conclude for now: I find it helpful to leverage the correspondence between boolean logic and probability to simplify my calculations and save effort.

Wednesday, May 23, 2018

Real-time Clustering

Clustering algorithms can be conceptualized as techniques for reducing input data in a high-dimensional space into a lower-dimensional space. From the viewpoint of robot programming, clustering is useful for transforming sensor data with a wide range of values into a small number of categories, each of which can be associated with a suitable action.

In my research over the last several years, I have explored numerous variations of this approach. I have experimented with several different clustering algorithms as well as several different techniques for action selection. The goal of this research agenda is to make it easier to program a robot to exhibit a desired behavior given complex sensor inputs.

My first work in this area was to apply clustering as a technique for enabling sonar values to be used as state indices for a robot employing the popular reinforcement learning algorithm Q-Learning. In the research I did on this subject, I employed the Kohonen Self-Organizing Map (SOM) as the clustering algorithm that performed that transformation.

Part of the appeal of the SOM was its ability to learn and adapt the clusters on every cycle as new sonar data arrived. I became intrigued by the possible applicability of this idea to images. To that end, I stopped investigating the Q-Learning context and pursued the idea of enabling a user to directly specify actions for each cluster. I employed the Growing Neural Gas algorithm in that work, as I was intrigued by its potential to adaptively determine an appropriate number of clusters. I followed up by replacing the manual action-selection GUI I built with an imitation learner.

In continuing to work with GNG, I became frustrated with its large number of hyperparameters. I found it tedious and frustrating to try to figure out appropriate values. Furthermore, although I had initially been intrigued by its flexibility in terms of the total number of clusters, I was starting to find this to be a liability. With a variable number of clusters, the cycle time becomes unpredictable. I concluded it was preferable to be able to bound the cycle time based on knowing ahead of time the total number of clusters.

To address these concerns, I invented my own clustering algorithm, Bounded Self-Organizing Clusters (BSOC), and benchmarked its performance against the SOM. The key ideas of BSOC are as follows:

  • There is a fixed number of clusters. 
    • This is the only hyperparameter.
  • Each input is added to the set of clusters as it arrives.
    • If the maximum number of clusters is exceeded, the two clusters closest to each other according to the distance metric are merged. 
    • Both the distance and the merge are weighted by the number of "ancestors" that the clusters have.
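
To make these ideas concrete, here is a rough sketch of how they might fit together in code. This is my own reconstruction from the summary above, not the published implementation; in particular, the exact ancestor-weighting scheme is a guess (see the comments).

import java.util.ArrayList;
import java.util.List;

public class BSOC {
  private static class Cluster {
    double[] center;
    int ancestors = 1;  // inputs merged into this cluster so far
    Cluster(double[] center) { this.center = center; }
  }

  private final int maxClusters;  // the only hyperparameter
  private final List<Cluster> clusters = new ArrayList<>();

  public BSOC(int maxClusters) { this.maxClusters = maxClusters; }

  // Each input becomes a cluster; merging keeps the total bounded.
  public void receive(double[] input) {
    clusters.add(new Cluster(input));
    if (clusters.size() > maxClusters) {
      mergeClosestPair();
    }
  }

  private void mergeClosestPair() {
    int bestI = 0, bestJ = 1;
    double best = Double.MAX_VALUE;
    for (int i = 0; i < clusters.size(); i++) {
      for (int j = i + 1; j < clusters.size(); j++) {
        double d = weightedDistance(clusters.get(i), clusters.get(j));
        if (d < best) { best = d; bestI = i; bestJ = j; }
      }
    }
    Cluster a = clusters.get(bestI), b = clusters.get(bestJ);
    // Merge as an ancestor-weighted average of the two centers.
    for (int d = 0; d < a.center.length; d++) {
      a.center[d] = (a.center[d] * a.ancestors + b.center[d] * b.ancestors)
                    / (a.ancestors + b.ancestors);
    }
    a.ancestors += b.ancestors;
    clusters.remove(bestJ);
  }

  private double weightedDistance(Cluster a, Cluster b) {
    double sum = 0;
    for (int d = 0; d < a.center.length; d++) {
      double diff = a.center[d] - b.center[d];
      sum += diff * diff;
    }
    // The ancestor weighting here is my guess: heavily-merged clusters
    // resist being merged again.
    return Math.sqrt(sum) * a.ancestors * b.ancestors / (double) (a.ancestors + b.ancestors);
  }
}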
My student Eric Huynh and I then employed BSOC to implement an imitation learner, which we assessed with a large number of volunteers.

Motivated by the need to get a better understanding of the dynamics of the BSOC clustering process, I decided to benchmark it against the well-known k-means clustering algorithm. To simplify the analysis, instead of the complete image clustering I had been doing in previous work, I focused instead on the problem of color quantization. I created two videos with my Android phone as test cases for the benchmarking process. As the phone's resolution is 1920x1080, clustering every pixel over 10 frames of video (spaced 4.5 seconds apart) results in over 20 million values to cluster. That proved to be about as much as my k-means implementation could handle before running out of heap space. (Maybe Java wasn't the best language in which to program this.)

Wanting to process more frames, I concluded I needed another benchmark algorithm. I quickly implemented an algorithm I called Random Incremental Clustering (RIC). The concept is very simple:

  • Specify a maximum number of clusters k.
  • Initialize the number of attempts a to zero.
  • As each new input arrives:
    • Increase a by one. 
    • If fewer than the maximum number of clusters are in place, add the input as a new cluster.
    • Otherwise, with probability k/a, replace at random one of the existing clusters with the newly received input.
As the number of inputs received increases, it becomes increasingly rare for a new input to replace one of the veteran cluster centers. But it still happens occasionally. This has the effect of creating a uniform statistical sample of the inputs.
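
This is essentially reservoir sampling over the input stream. Here is a minimal sketch of the procedure described above (the class and method names are my own):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RIC<T> {
  private final int k;                // maximum number of clusters
  private final List<T> clusters = new ArrayList<>();
  private final Random rng = new Random();
  private long attempts = 0;          // a: number of inputs seen so far

  public RIC(int k) { this.k = k; }

  public void receive(T input) {
    attempts++;
    if (clusters.size() < k) {
      clusters.add(input);            // still filling the initial k slots
    } else if (rng.nextDouble() < (double) k / attempts) {
      // With probability k/a, a random veteran cluster is replaced,
      // which yields a uniform sample of the whole input stream.
      clusters.set(rng.nextInt(k), input);
    }
  }

  public List<T> getClusters() { return clusters; }
}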

Creating this algorithm enabled me to benchmark BSOC with close to 200 million pixel inputs. The surprising result was that as the number of clusters increased to 8, the performance of RIC began to closely approximate both BSOC and k-means. (With 2 clusters, RIC performed very poorly.)

I would not want to conclude too much from this result as of yet. This convergence of performance occurred with an extremely large number of inputs for the very simple RGB representation of a pixel's color. This result might not hold when clustering entire images, as I had been doing in prior work. The result might also improve with an approach to randomized clustering that does non-uniform sampling.

I am looking forward to the next stages of this research project, in which I plan to explore further the questions that arose from this latest study. I continue to be fascinated by the possibilities that real-time clustering enables, and I plan to investigate both the techniques for clustering and its applications. If anyone reading this is interested in collaborating or comparing results, definitely let me know!

Here is a summary of the papers and posters I have written on the topic of this post:
  • Gabriel J. Ferrer. "Performance Evaluation of a Real-Time Clustering Algorithm." In FLAIRS-31 (Melbourne, FL, May 21-23, 2018). (Poster PDF)
  • Gabriel J. Ferrer and Eric Huynh. "Real-Time Imitation Learning of Visual Behavior by a Mobile Robot." In Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference (Marco Island, FL, May 22-24, 2017). (PDF, implementation)
  • Gabriel J. Ferrer. "Real-Time Unsupervised Clustering." In 27th Modern Artificial Intelligence and Cognitive Science Conference (MAICS-2016) (Dayton, OH, April 22-23, 2016). (PDF, implementation)
  • Gabriel J. Ferrer. "Towards Human-Induced Vision-Guided Robot Behavior." In Proceedings of the 2014 AAAI Fall Symposium: Knowledge, Skill, and Behavior Transfer in Autonomous Robots (Arlington, VA, November 13-15, 2014). (PDF)
  • Gabriel J. Ferrer. "Creating Visual Reactive Robot Behaviors Using Growing Neural Gas." In 25th Modern Artificial Intelligence and Cognitive Science Conference (MAICS-2014) (Spokane, WA, April 26, 2014). (PDF)
  • Gabriel J. Ferrer. "Encoding Robotic Sensor States for Q-Learning Using the Self-Organizing Map." Journal of Computing Sciences in Colleges 25:5 (May 2010), 133-139. (PDF)

Tuesday, March 13, 2018

Loading image files in JavaFX

Loading image files into JavaFX applications is not too difficult, but there is a lot of confusing and contradictory advice out there. Even worse, SceneBuilder encodes image file references in a manner that is incompatible with how they will be loaded in the Java code. So I recommend referencing the files directly in the Controller.

Image files are easy to load if you place them in the same package as your code. Given that, an example like the following will do the trick:

import javafx.fxml.FXML;
import javafx.scene.image.Image;
import javafx.scene.image.ImageView;
import javafx.scene.layout.Pane;

public class Controller {

  @FXML
  private Pane pane;

  private ImageView img;

  @FXML
  void initialize() {
    // getResourceAsStream() resolves the name relative to this class's
    // package, so Screen.png must live alongside Controller.class.
    img = new ImageView(new Image(Controller.class.getResourceAsStream("Screen.png")));
    pane.getChildren().add(img);
  }
}

Handling Key Events in JavaFX

Handling keyboard events is a tricky issue in any GUI environment, because it is not always clear which component is supposed to receive those events. The usual solution is the concept of focus: the component that has the focus is the one that receives the events.

I have put together a short example as to how to make this work well in JavaFX. In this example, when the mouse hovers over a component, that component receives the focus. It retains the focus until another component requests it.

The two Circle objects change their color when they have the focus and an appropriate key is pressed. The Pane responds to keystrokes by printing them out on the console.

Although it is not part of this example, it is worth mentioning that clicking on a TextField or TextArea will always give that component the focus.

The below example is a Controller. I leave the main() and FXML file as exercises for the reader.


import javafx.fxml.FXML;
import javafx.scene.input.KeyCode;
import javafx.scene.input.KeyEvent;
import javafx.scene.layout.Pane;
import javafx.scene.paint.Color;
import javafx.scene.shape.Circle;

public class Controller {

  @FXML
  Pane pane;

  @FXML
  Circle one;

  @FXML
  Circle two;

  @FXML
  void initialize() {
    // Whichever node the mouse enters grabs the focus...
    one.setOnMouseEntered(event -> one.requestFocus());
    two.setOnMouseEntered(event -> two.requestFocus());

    // ...and the focused node is the one that receives key events.
    one.setOnKeyPressed(event -> decodeColor(event, one));
    two.setOnKeyPressed(event -> decodeColor(event, two));

    pane.setOnMouseEntered(event -> pane.requestFocus());
    pane.setOnKeyPressed(event -> System.out.println("typed " + event.getText()));
  }

  void decodeColor(KeyEvent key, Circle obj) {
    if (key.getCode() == KeyCode.R) {
      obj.setFill(Color.RED);
    }
    if (key.getCode() == KeyCode.B) {
      obj.setFill(Color.BLUE);
    }
    if (key.getCode() == KeyCode.G) {
      obj.setFill(Color.GREEN);
    }
  }
}


Tuesday, February 20, 2018

The JavaFX canvas and mouse clicks

The Canvas class in JavaFX is a very flexible platform for drawing whatever one would like. It is not that difficult to use, but there are a few elements that are a bit confusing.

To take advantage of this example, start by creating an FXML file containing a Canvas and a Button. Then connect it to the CanvasDemoController class below:

package canvas;

import javafx.fxml.FXML;
import javafx.scene.canvas.Canvas;

public class CanvasDemoController {
  @FXML
  Canvas canvas;

  // The model object being displayed on the canvas.
  Drawing d = new Drawing();

  @FXML
  void initialize() {
    canvas.setOnMouseClicked(event -> {
      // The click coordinates come straight from the event parameter.
      d.teleport(event.getX(), event.getY());
      refresh();
    });
  }

  @FXML
  void move() {
    d.move(10, 10);
    refresh();
  }

  void refresh() {
    // Drawings persist until erased, so clear everything and redraw.
    canvas.getGraphicsContext2D()
          .clearRect(0, 0, canvas.getWidth(), canvas.getHeight());
    d.draw(canvas);
  }
}

The class above references the Drawing class, shown below:

package canvas;

import javafx.scene.canvas.Canvas;

public class Drawing {
  private double x, y, radius;
 
  public Drawing() {
    x = y = 0.0;
    radius = 10.0;
  }
 
  public void move(double dx, double dy) {
    x += dx;
    y += dy;
  }
 
  public void teleport(double x, double y) {
    this.x = x;
    this.y = y;
  }
 
  public void draw(Canvas canvas) {
    canvas.getGraphicsContext2D()
          .fillOval(x - radius, y - radius, radius*2, radius*2);
  }
}


The combined example illustrates several useful ideas:
  • Creating graphics on a Canvas is not that difficult, but the GraphicsContext object must be acquired from it (via getGraphicsContext2D()) in order to draw anything. 
  • Drawings on a Canvas persist until erased. Erase and redraw as necessary to animate.
  • Mouse click events are pretty easy to handle, as the coordinates of the mouse click are easily obtained from the event parameter. However, we need to code the event handler explicitly in the controller class.

Thursday, February 1, 2018

JavaFX and TableView

The TableView control in JavaFX is a flexible and powerful interface. Every TableView object is parameterized by a data type that represents a row of the table. The documentation encourages developers to create row objects that represent each column value using ObservableValue types. Following the documentation, it is necessary to create three methods for each of these values: a setter, a standard getter, and a getter for the property object itself.

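For concreteness, here is a sketch (mine, following the documented pattern) of one column's worth of that boilerplate; an age column would need three more methods just like these:

import javafx.beans.property.SimpleStringProperty;
import javafx.beans.property.StringProperty;

public class NameRow {
  private final StringProperty name = new SimpleStringProperty();

  public void setName(String value) { name.set(value); }
  public String getName() { return name.get(); }
  public StringProperty nameProperty() { return name; }
}
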
I find this approach exceedingly verbose, and in this blog post I will demonstrate an alternative that is hinted at in the documentation. I will assume that the reader is comfortable creating a JavaFX interface using SceneBuilder. I will just present the controller to which the FXML file can be attached.

This is a simple application that lets the user enter names and ages, which are then displayed in the table. The table is not editable. First, we define a data type for the table rows:

public class NameAgeRow {
  private String name;
  private int age;
 
  public NameAgeRow(String name, int age) {
    this.name = name;
    this.age = age;
  }
 
  public NameAgeRow(String name, String age) {
    this(name, Integer.parseInt(age));
  }
 
  public String getName() {return name;}
  public int getAge() {return age;}
}

Next, we define the Controller for the application:

import javafx.beans.property.SimpleIntegerProperty;
import javafx.beans.property.SimpleStringProperty;
import javafx.fxml.FXML;
import javafx.scene.control.Button;
import javafx.scene.control.TableColumn;
import javafx.scene.control.TableView;
import javafx.scene.control.TextField;

public class Controller {
  @FXML
  TableView<NameAgeRow> table;
  @FXML
  TableColumn<NameAgeRow,String> names;
  @FXML
  TableColumn<NameAgeRow,Integer> ages;
 
  @FXML
  TextField name;
  @FXML
  TextField age;
  @FXML
  Button add;
 
  @FXML
  void initialize() {
    add.setOnAction(e -> {
      // Add a row built from the two text fields, then clear them.
      table.getItems().add(new NameAgeRow(name.getText(),
                                          age.getText()));
      name.setText("");
      age.setText("");
    });

    // Each column gets a lambda that wraps the row's plain getter in an
    // observable value, so NameAgeRow needs no property boilerplate.
    names.setCellValueFactory(cdf ->
      new SimpleStringProperty(cdf.getValue().getName()));
    ages.setCellValueFactory(cdf ->
      new SimpleIntegerProperty(cdf.getValue().getAge()).asObject());
  }
}

The overall structure is straightforward. For each column, we define a lambda that obtains the value from the NameAgeRow object for the current table row. A real convenience of this approach is that it can be made to work with any user-defined class that represents the row data. I find this more convenient and flexible than what is described in the documentation.