Saturday, February 18, 2017

Pipelines in Rust

A classic Operating Systems assignment is to implement a Unix shell. It's a great assignment for several reasons. Students have to:
  • Get a solid understanding of the concept of a process.
  • Master some of the more subtle aspects of the file system.
  • Make direct system calls with subtle semantics.
Since I am teaching my course in Rust, I needed to figure out how I wanted the students to implement pipelines and I/O redirection. The options are:
  • Rely entirely on the standard library.
  • Interact directly with libc.
The benefit of relying on the standard library is, first of all, that the resulting code is portable across platforms. Second, there is no need for unsafe blocks.  The problem is that the pipe from the standard library only connects a parent and child. The Command object is great, but it doesn't allow for the separation between fork() and execv() that is essential for implementing proper I/O redirection. 

The trouble with the standard libc crate is that all of the function calls are unsafe. As far as I am concerned, this is bad news pedagogically because I do not want my students using unsafe blocks. They are still learning the language, and I want them to learn to operate within the safety parameters it gives. Unsafe blocks, while necessary, are for experts, and my students will not be experts after one semester.

Fortunately, I recently discovered the nix crate. It is a safe alternative to libc that provides everyone's favorite system calls in a Rust-friendly form.  The program below executes a simple three-command pipeline.  Here are the key concepts for understanding the program:

  • The fork() system call creates a new process. The new process is a perfect duplicate of the original process. The only difference between them is that fork() returns a different value if it is in the child process. If it is in the parent process, it returns the child's PID. 
  • A parent and child process are connected using a pipe. A pipe is accessed using two file descriptors: one for the pipe's input, one for the pipe's output. The output of the pipe serves as an input stream for a process, and likewise the input of the pipe serves as an output stream for a process.
  • When a parent and child are connected by a pipe, the parent will typically be receiving data from the child. Therefore, the parent's input will be the pipe's output, and the child's output will be the pipe's input. 
  • When the child is finished generating output, it exits. Then, the parent can finish processing the data it received from the child before it in turn exits as well. It is essential that the child exit first; if the parent exits first, unless other provisions are made both processes end.
  • To redirect standard input from the pipe instead of from the keyboard, we need to close the file descriptor for the keyboard (always file descriptor 0) and replace it with the file descriptor for the output of the pipe. The dup2() system call accomplishes this.
  • Similarly, to redirect standard output into the pipe instead of to the monitor, we need to close the monitor's file descriptor (always file descriptor 1) and replace it with the file descriptor for the input of the pipe. Again, we use dup2() for this purpose.
  • Remember that fork() completely duplicates a process. This includes its open file descriptors. In our case, that includes both ends of the pipe. We need to explicitly close the end of the pipe that the process is not using.
  • The execvp() system call loads a program and executes it. In doing so, the data for the current process is overwritten completely (except for the open file descriptors). The first argument is the name of the program. The PATH environment variable is searched for that program. The second argument is the complete list of command-line arguments, including the program name.
  • These parameters to execvp() need to be formatted as fixed-size arrays of C-style strings (which are very different from Rust strings). The program below shows how this conversion can be done.

If you'd like an alternative explanation of how the program works, feel free to view a 15-minute video in which I explain the details of this program.

extern crate nix;
use nix::unistd::*;
use std::ffi::CString;
use nix::sys::wait::waitpid;

// Add dependency nix = "0.7.0" to Cargo.toml

fn main() {
    println!("Executes ls -l | grep Cargo | wc");

    let ls:   Vec = vec![CString::new("ls").unwrap(),
                         CString::new("-l").unwrap()];
    let grep: Vec = vec![CString::new("grep").unwrap(),
                         CString::new("Cargo").unwrap()];
    let wc:   Vec = vec![CString::new("wc").unwrap()];

    match fork().unwrap() {
        ForkResult::Parent{child} => {
            println!("wc pid is {}", child);
            waitpid(child, Option::None).unwrap();
            println!("Finished! Exiting...");
        },

        ForkResult::Child => {
            let (wc_in, grep_out) = pipe().unwrap();

            match fork().unwrap() {
                ForkResult::Parent{child} => {
                    close(grep_out).unwrap();

                    println!("grep pid is {}", child);
                    dup2(wc_in, 0).unwrap();
                    let array = wc.into_boxed_slice();
                    execvp(&array[0], &*array).unwrap();
                },

                ForkResult::Child => {
                    close(wc_in).unwrap();

                    let (grep_in, ls_out) = pipe().unwrap();

                    match fork().unwrap() {
                        ForkResult::Parent {child} => {
                            close(ls_out).unwrap();

                            println!("ls pid is {}", child);
                            dup2(grep_out, 1).unwrap();
                            dup2(grep_in, 0).unwrap();
                            let array = grep.into_boxed_slice();
                            execvp(&array[0], &*array).unwrap();
                        },

                        ForkResult::Child => {
                            close(grep_in).unwrap();

                            dup2(ls_out, 1).unwrap();
                            let array = ls.into_boxed_slice();
                            execvp(&array[0], &*array).unwrap();
                        }
                    }
                }
            }
        }
    }
}

Threads, locks, and deadlocks in Rust

A very compelling feature of the Rust language is compile-time protection against data races.  This makes it much easier to write correct threaded programs that share mutable state.  The following program illustrates how this works. The program runs eight parallel simulations of a coin-flipping experiment.

extern crate rand;
use std::thread;
use std::sync::{Arc, Mutex};

fn main() {
    let total_flips = Arc::new(Mutex::new(0));
    let completed = Arc::new(Mutex::new(0));
    let runs = 8;
    let target_flips = 10;

    for _ in 0..runs {
        let total_flips = total_flips.clone();
        let completed = completed.clone();
        thread::spawn(move || {
            simulate(target_flips, total_flips);
            let mut completed = completed.lock().unwrap();
            *completed += 1;
        });
    }

    loop {
        let completed = completed.lock().unwrap();
        if *completed == runs {
            let total_flips = total_flips.lock().unwrap();
            println!("Final average: {}", *total_flips / *completed);
            break;
        }
    }
}

fn simulate(target_flips: u64, total_flips: Arc<Mutex<u64>>) {
    let mut consecutive = 0;
    let mut iterations = 0;
    while consecutive < target_flips {
        iterations += 1;
        if rand::random() {
            consecutive += 1;
        } else {
            consecutive = 0;
        }
    }
    println!("iterations: {}", iterations);
    let mut total_flips = total_flips.lock().unwrap();
    *total_flips += iterations;
}

The main() function creates two mutable variables that can be shared among threads. One represents the total flips across all experiments. The other represents the total number of experiments that have completed. They are Arc (atomic reference counted pointers) objects that each contain a Mutex (lockable resource).

As each thread is created, it needs its own reference to these shared variables. To create a new shared reference, we need to clone() each variable. This clone is then moved into the thread as it is created, so that the thread is now the owner of the cloned reference.

To access the shared variables, a thread must acquire a lock. The thread blocks if the lock is already held by another thread. A thread releases a lock it holds when the locked variable goes out of scope. The total_flips variable is locked within the simulate() function, while the completed variable is locked after that function completes. The main thread also locks completed periodically as it checks for completion.

The program works fine as written above. But it would be easy to make a mistake that leads to a deadlock. Consider the following variation.

extern crate rand;
use std::thread;
use std::sync::{Arc, Mutex};

fn main() {
    let total_flips = Arc::new(Mutex::new(0));
    let completed = Arc::new(Mutex::new(0));
    let runs = 8;
    let target_flips = 10;

    for _ in 0..runs {
        let total_flips = total_flips.clone();
        let completed = completed.clone();
        thread::spawn(move || {
            simulate(target_flips, total_flips);
            let mut completed = completed.lock().unwrap();
            *completed += 1;
        });
    }
    
    let completed = completed.lock().unwrap();
    while *completed < runs {}
    let total_flips = total_flips.lock().unwrap();
    println!("Final average: {}", *total_flips / *completed);
}

fn simulate(target_flips: u64, total_flips: Arc<Mutex<u64>>) {
    let mut consecutive = 0;
    let mut iterations = 0;
    while consecutive < target_flips {
        iterations += 1;
        if rand::random() {
            consecutive += 1;
        } else {
            consecutive = 0;
        }
    }
    println!("iterations: {}", iterations);
    let mut total_flips = total_flips.lock().unwrap();
    *total_flips += iterations;
}

In this variation, the main thread retains ownership of the lock on completed. This causes all of the simulating threads to block. Unfortunately, the simulating threads cannot update completed because they are blocked. In the meantime, the main thread is waiting for the simulating threads to complete! This is an example of a deadlock. Although Rust prevents lots of concurrency problems, it can't prevent all of them! When writing multithreaded Rust code, it is always essential to retain locks for the smallest possible scope in order to avoid deadlocks.

I'd like to point out that there are alternatives to locks available in Rust. One intriguing option involves using lock-free data structures. I'd like to explore that library in more depth when time permits.

Strings in Rust

One concept in Rust that those learning the language can find very confusing is the existence of two different types of strings: String and &str. (This post assumes at least a basic understanding of the Rust ownership system.)

The String type represents a block of memory that is owned. Mutation is possible if the variable corresponding to the object is appropriately designated.

The &str type is (generally speaking) borrowed from elsewhere rather than owned.  This can happen in a couple of ways. String literals are all &str objects. Methods like split() return iterators into their source strings. The iterated items are then borrowed references to substrings within the source string.

The program below is intended to illustrate these distinctions.

use std::io;
use std::io::Write;

fn main() {
    match run() {
        Ok(_) => println!("Goodbye."),
        Err(e) => println!("Error: {}", e),
    }
}

fn run() -> io::Result<()> {
    let stdin = io::stdin();
    let mut longest_overall = String::from("");
    loop {
        print!("> ");
        io::stdout().flush()?;
        let mut line = String::new();
        stdin.read_line(&mut line)?;

        if line.trim().len() == 0 {
            break;
        }

        let mut line_longest: &str = "";
        for word in line.split_whitespace() {
            if word.len() > line_longest.len() {
                line_longest = word;
            }
            if word.len() > longest_overall.len() {
                longest_overall = String::from(word);
            }
        }
        println!("Longest word in this line: {}", line_longest);
    }
    println!("Longest overall word: {}", longest_overall);
    Ok(())
}

This program accepts command-line input from the user. Each line of input is split up to find the constituent words. The program reports the longest word from the current line. When the program ends, it also reports the longest overall word.

The variable longest_overall is a String. I chose to make it a String because I need for that variable to maintain long-term ownership of the value it is storing, and that's not possible with a &str.

The variable line is a String. It has to be a String because it gets mutated when input is read from the user.

The variable line_longest is a &str. It borrows a substring from line. Because it has the same lifetime as line, borrowing is all we need. Note that both of their lifetimes end when they go out of scope at the end of each iteration of the loop.

This brings us back to the need for longest_overall to be a String. Ultimately, the longest word overall is simply a substring of line, which goes out of scope at the end of each loop iteration. That makes it absolutely necessary to make a long-term copy of that longest word, in order to preserve it in existence after its source is deallocated.

To sum up, when figuring out which type of Rust string to use, pay attention to the lifetime. Remember that &str objects are always borrowed from somewhere else. If the string is needed beyond the lifetime of the source of the borrow, you need a String.

This example does not illustrate this, but many methods and functions expect a &str. If you have a String, and a &str is demanded, just use the as_str() method to lend that String to the function.

Teaching Operating Systems using Rust

This semester (inspired by the example and compelling rationale of David Evans) I have been teaching Operating Systems using the Rust language. (I have explicitly rejected the possibility of teaching in C.) Rust is a great choice for this course for several reasons:

  • It is sufficiently low-level that one can implement a complete operating system. In particular, it compiles to a binary and does not require a run-time due to its lack of a garbage collector.
  • Concurrency is a major theme in operating systems. In Rust, a program that experiences data races will not compile. This enables the compiler to assist in teaching the students about safe mutable data structures when running concurrent threads.
  • In our curriculum, the programming languages we employ in two or more courses are Python, Java, and Haskell. Something that all three of these languages have in common is a heavy reliance on heap allocation. Most stack-allocated variables in those languages are pointers into the heap. In contrast, Rust explicitly provides for both stack-allocated and heap-allocated data, with arguably a bias towards stack-allocation. 
My overall strategy for the course is twofold:
  • Spend the first half of the semester learning the Rust language and getting familiar with OS concepts. To this end, I'm following the outline from David Evans's class, where the students implement a web server and a Unix shell. To get accustomed to Rust, I began the semester by having students write lots of short command-line programs.
  • Spend the second half of the semester exploring the source code of Redox, an open-source microkernel OS written entirely in Rust. Students will be assigned sections of the source code to present to the class. This will be a way to flesh out the concepts explored in the first half of the course in terms of a concrete implementation.
For a textbook, I am using the freely-available online book Operating Systems: Three Easy Pieces by Remzi and Andrea Arpaci-Dusseau. I first learned about the book from (again) David Evans, and I later learned that my friend and colleague Mark Goodrich had taken their OS course while in graduate school. What I like about this book how it lives up to its name. The "three easy pieces" are virtualization, concurrency, and persistence. It has been a great tool for conceptually organizing my lectures. I constantly make reference to how each topic we explore relates to these three ideas.

I plan to write some additional posts about the course and our experience with Rust as it proceeds.

What's wrong with C?

(The following post is adapted from an answer I wrote in response to this question.)

I started programming in C++ when I was a sophomore in college. Later, I also learned to program in C (without the ++) when working as a student Unix systems administrator.  I wrote somewhere between 10,000 and 20,000 lines of C++ when working on my doctoral dissertation.  Early in my career as a professor, I taught both the C++ and C languages in different courses at different times.

I've since given up entirely on the C and C++ languages. I no longer teach them or write any code in them.

So what do I use now? For quick scripting, I use Python. For building something with a GUI, I usually program in Java. For robot programming, I use both Python and Java, depending on the situation and the available libraries. (I have ambitions to begin using Rust for robot programming as well.)  I like Haskell a lot, but I have not really found a niche for it in the projects I have been pursuing.

My goal in this post is to outline my rationale for giving up on the C family. Some of these comments apply specifically to C, others to both C and C++. None of these critiques are original; all of these points have been made by other people. I am assembling them here to give a concise rationale as to why one long-time C/C++ programmer decided to move on.

What is great about C is that it is well-suited to low-level programming that directly addresses hardware. Unlike Python, Java, any .NET language, or anything with a garbage collector, it does not have a run-time. The code you write in C is translated directly into assembly language, and aside from libraries you explicitly invoke, the only code that runs is the code you write. If you are writing an operating system or code for an embedded device with real-time constraints, it can be an appealing option.  (I now believe that Rust is a strictly superior option in these situations, but that is a subject for a future blog post.)

However, there are numerous problems with C that often make it unproductive or impractical. Problems that also apply to C++ are noted explicitly:
  • Minimal facilities for data abstraction (C). Higher-level languages like Python and Java provide many advanced mechanisms for data abstraction, including classes, objects, polymorphism, lambdas, and so forth. While all of the above can be imitated using C (especially through clever use of function pointers), it is quite painful.
  • Minimal data-structure libraries (C). Other languages include data-structure libraries as part of their standard libraries. While third-party data-structure libraries are available for C, it can be a significant inconvenience in comparison to standard libraries shipped with every implementation.
  • Poor string-handling (C). Using strings in C requires lots of low-level operations. Equivalent work can be done in languages like Python with much less code.
  • Lack of safety (C, C++). In a safe language, like Java or Python, errors in programs are trapped as they happen. C is an unsafe language. That is, a program with an error results in undefined behavior. This can make C programs insanely difficult to debug. In a safe language, the program throws an exception; you trace the exception, find the problem, and fix it. In an unsafe language, all bets are off. Get very familiar with valgrind.
  • Poorly defined semantics (C, C++). According to the C standard, the compiler can generate any code it wants in the presence of undefined behavior. Furthermore, even extremely experienced C programmers can fail to understand situations that result in undefined behavior. As compiler writers discover more opportunities to optimize away code that results in undefined behavior, existing programs that are believed to work suddenly start failing simply by updating the compiler. As bug submitter felix-gcc put it, “Guys, your obligation is not just to implement the C standard. Your obligation is also not to break apps that depend on you.” And this isn’t just one developer; even gcc itself and the Linux kernel frequently execute code with undefined behavior. The fact that this can happen and is so pervasive reveals fatal flaws in the C standard itself. For more details about these problems, I recommend the following:
All of that said, what really brings this issue to be important to me is the fact that I am now my department's Operating Systems instructor. I had to conclude that I did not want to teach that course using the C language. I have decided to teach it using Rust, the details of which I will describe in a series of future blog posts.

Tuesday, May 10, 2016

Event handling with JavaFX ChoiceBox objects

Writing event handlers for many JavaFX components is relatively straightforward. In comparison, writing event handlers for the JavaFX ChoiceBox can be very confusing. Once you get the hang of it, though, it really is not too difficult.

For this example, we will build a JavaFX GUI containing a ChoiceBox and two TextFields. One TextField will allow the user to enter a string, which will then be placed in the ChoiceBox. The other TextField will show the length of the currently displayed String from the ChoiceBox. I'll provide the controller class below; the remainder of the interface is an exercise for the reader.

import javafx.fxml.FXML;
import javafx.scene.control.ChoiceBox;
import javafx.scene.control.TextField;

public class ChoiceBoxDemoController {
 
  @FXML ChoiceBox choices;
  @FXML TextField entries;
  @FXML TextField length;
 
  @FXML
  void initialize() {
    length.setEditable(false);
  
    entries.setOnAction(event -> {
      if (entries.getText().length() > 0) {
        choices.getItems().add(entries.getText());
        entries.setText("");
        int added = choices.getItems().size() - 1;
        choices.getSelectionModel().select(added);
      }
    });
  
    choices.getSelectionModel().selectedItemProperty()
    .addListener((obs, oldV, newV) -> 
    length.setText(Integer.toString(newV.length())));
  }
}

Note that we have created two different event handlers. The handler for the TextField is very straightforward. The setOnAction() method specifies a handler for whenever the user types the Enter key on that TextField. The ChoiceBox is backed by an ObservableArrayList containing its items. This list is accessed using the getItems() method. So to stick an item on the end of the list, I just use its add() method. I also want to update the ChoiceBox to display what I just added. The index will be the last possible index (i.e., size() - 1), and the select() method of the SelectionModel will designate it. Note that calling select() will also trigger the selection event handler described below.

In contrast, setting up the ChoiceBox event handler is more involved. There are several levels of object that we must traverse before actually being able to set it up. Please note that I do not seek here to justify this design, only to explain it! First, we need to get a reference to the SelectionModel object. Next, we get a reference to the SelectedItemProperty. This property is an Observer of the currently selected item. (If you wish to work with the item index, use the SelectedIndexProperty instead.) It is to this property that we can finally attach an event handler.

The handler itself conforms to the ChangeListener interface. The changed() method takes three arguments. The first argument ("obs") is a reference to the SelectedItemProperty to which the handler is attached. I've never found a use for this. The second argument ("oldV") is the value of the selected ChoiceBox item before the event occurs. The third argument ("newV") is the value of the selected ChoiceBox item after the event occurs. This last argument, in particular, tends to be the most useful, and it is what I use here in this example. What I recommend avoiding at all costs is any reference back to the ChoiceBox object itself; I have found the program's behavior to be much more predictable using these arguments.

Friday, May 6, 2016

JavaFX Data Structures Labs

This semester, Mark Goadrich and I have been working to overhaul the Hendrix Data Structures labs to make use of JavaFX. We use Eclipse. For each assignment, students are given a suite of JUnit tests that their code must pass. One of our goals is to get them accustomed to running JUnit tests. In our experience, this makes it easier for them to get used to writing their own JUnit tests in the next course on software development. We also make heavy use of Enum data types.

A major goal of these labs is to provide a context for each data structure. This way, students can see data structures as components of a larger system, rather than just abstractions in isolation.

While these labs are definitely a work in progress at the moment, I'd like to share some of what we have achieved thus far.

We begin by getting students up to speed with creating Eclipse projects and familiar with interfaces. To that end, they implement strategies for the Prisoner's Dilemma.

The next lab involves implementation of stacks. They must implement the data structures themselves. We provide a JavaFX GUI that enables them to visualize the behavior of what they create.

This is followed by a lab where they create a maze data structure. Again, we provide the GUI, but they create the underlying data structure. This sets them up for the next lab, where they implement maze solvers using stacks and queues. This is also the stage where we introduce the implementation of generic data types.

To illustrate and visualize doubly-linked lists, their next task is to implement a text editor. We again provide the GUI. The students implement the doubly-linked list. In creating a cursor to navigate the document, we also provide a gentle introduction to the concept of an iterator.

As no data structures course is complete without sorting algorithms, in the next lab the students implement animated sorting. Again, we provide the GUI and they provide the algorithms.

Having worked with four JavaFX projects up to this point, in the next lab the students are finally responsible for a GUI implementation. We guide them very closely through the process of creating it. After the lab, the students have a project assigned in which they are to create another GUI in a more free-form way.

Another classic data structure is the binary search tree, and in the next lab students create their own, with a simple visualization.

One of the most ubiquitous features of the modern age is the text-predicting device. We introduce students to the use of a trie for text prediction. The GUI shows the predicted text based on the input suffix. In practice, the trie is a more efficient tree structure than the binary search tree when the keys are strings, and we thought it would be valuable to introduce students to this concept.

The classic lookup data structure is of course the hash table. We actually struggled a bit to come up with a context in which a hash table was actually an appealing data structure (as opposed to a tree or trie). That is, every context we devised, most especially with string keys, seemed to involve an application where maintaining a sorted order was beneficial.

What we came up with as our application was the creation of a tic-tac-toe program that learns from experience. The students implement a hash table to store board positions and expected outcomes.

In our final lab, we revisit maze searching in order to introduce heaps.  Students create a heap that is then used in an implementation of A*.

Some more labs we'd like to work on would include:

  • Balanced binary trees
  • Functional programming concepts introduced in Java 8, including
    • lambdas
    • stream computation.
      • With regard to stream computation, it would be cool to come up with something that parallelizes nicely on a multicore.
I'd love to hear from anyone who finds any of this useful, or who has other ideas to share!