24 July 2012

So I write compilers for a living now.

I’m a regular visitor of Stack Overflow, and have a Stack Overflow Careers profile. A month ago, I saw a job listing for Spaceport.io, a YouWeb-funded company that makes a tool for converting Flash applications to run natively on several different platforms. A key component of their technology is a compiler, written in Haskell, from ActionScript 3 to the Spaceport runtime.

Since I enjoy working in Haskell, and compilers and games are two of my favourite things, I readily applied. After phone and Skype interviews, I flew out to California for a day of in-person interviews. Much as I like flying, it’s no fun to spend more time in the air than at your destination. Lesson learned.

Soon afterward I received an offer, accepted, and in a few short weeks found myself moving across the country from southern New Hampshire to the Bay Area. It’s been a great adventure so far, with plenty of good people, good food, and endless opportunities to practice my Chinese. :)

Stack Overflow Careers

When it launched, I believed Careers would soon become the go-to place to hire developers. Unfortunately, none of the job offers I actually received were the right thing for me. Most were a mediocre, uninteresting kind of bad, not the hilariously terribad stuff of great blog posts.

But it was well worth the wait. I got a job on the basis of reputation and merit, not just a résumé. And now I can happily say I make my living off compilers, which is what I want to be doing for the foreseeable future. The coolest part is that I get to apply much of the theory I learned doing language research in college to solving practical problems.

From the company’s side, Careers is a win as well. You get many good developers looking at your job listing, for a downright minuscule fraction of the financial and time cost of a recruiter. When you get an application from a developer, all of the information you could possibly want to know about their skills and history is right there in their public profile.

So, fellow devs, if I’m ever in a position to hire you, it will probably be through Stack Overflow Careers.

Professional Haskell

Since I started using it professionally, my opinions on Haskell have definitely evolved. It’s a well designed language, to be sure, but till coming to Spaceport I hadn’t really had the opportunity to explore a large project. Here are some of the things I’ve learned while using Haskell in production.

Haskell is Scalable

I was skeptical that the language could scale, not only to a large project but to a complex one. It’s not that I was doubtful, exactly. It’s just that languages that perform well for small code don’t necessarily work well for big code, and vice versa. You can’t know unless you test.

After working with a roughly 10,000-line codebase for several weeks, I’m now convinced that Haskell is excellent for coding in the large. With good unit tests in place—we have about four times as much test code as application code—I was able to productively make changes and feature additions starting on day one.

A Haskell codebase seems to remain readily comprehensible well beyond the size where, say, a C++ codebase starts becoming unwieldy; it’s maybe a 2:1 ratio. Functional languages are good at factoring, too, which is basically the essence of abstraction and the key to managing complex code.

Still, you must be diligent about it. Using Haskell does not automatically make you a better programmer, although the people who learn Haskell well do seem to be better programmers for it—if only because Haskell is pretty progressive as languages go.

Haskell is Productive

Proponents of any language often talk about increased productivity. I do feel more productive in Haskell—in terms of the amount of code I have to write in order to add and test a feature—than in most any other language. But I can’t exactly quantify that, so I invite you to try it for yourself.

LOC is funny in that it’s simultaneously an important metric and a completely useless one. Language designers tend to be in agreement, more or less, that syntax matters—that good syntax does wonders for the effectiveness of language as a tool of thought. Naturally, they disagree about what actually constitutes “good syntax”. There might be something more to it, or it may just be a matter of taste that Python looks good to Alice and APL looks good to Bob.

Regardless, Haskell looks good to Jon. It’s pleasant to read—succinct, but not so dense that reading it is a chore. As is the case for any language, tutorial code looks nothing like production code. In Haskell you also see many academic papers where the style is very mathy—one-letter names, excessive abbreviations—and that style doesn’t scale at all beyond 200 lines. You need good names, no matter what.

Static Typing? More. Gimme.

Haskell’s type system is powerful enough that type signatures can tell you a good amount of useful information as to what a polymorphic function does. The “polymorphic” part is important: specific types naturally have much broader interfaces than simple type variables. You can easily say that a function of type (a → [a] → [a]) has only a few possible sane definitions. (Well, okay, countably infinitely many, but still.)

  1. f x xs = x : xs
  2. f x xs = xs ++ [x]
  3. f x xs = intersperse x xs

Whereas you know considerably less about what a function of type (Double → Double → Double) might do, because the set of floating-point numbers, and functions operating on them, is so large. The type system makes the trade-off explicit: the more domain-specific your functions are, the fewer guarantees you have about them.
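
To make that concrete, here is a small property-test sketch of my own (nothing beyond QuickCheck is assumed): by parametricity, any function of type (a → [a] → [a]) can only rearrange, duplicate, or drop its elements, so it must commute with mapping over them. Swap in any of the three definitions above for f and the property still holds.

import Data.List (intersperse)
import Test.QuickCheck

-- One candidate inhabitant of the type; prepending or appending works too.
f :: a -> [a] -> [a]
f = intersperse

-- The "free theorem" for a -> [a] -> [a]: f commutes with map, because it
-- can never inspect or invent elements, only shuffle them around.
prop_free :: Fun Int Int -> Int -> [Int] -> Bool
prop_free (Fun _ g) x xs = map g (f x xs) == f (g x) (map g xs)

main :: IO ()
main = quickCheck prop_free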

In practice what this means is that, say, glancing at the type signature of a poorly-named function can give you a decent clue about what it ought to do. Or if you need a utility function, you can search Hoogle for a plausible type signature and usually find what you need. No other popular language’s API search comes close to matching that kind of utility. Thanks to Hackage, new libraries are very easy to discover and integrate into a project.

Haskell is Reliable

Where Haskell really does shine is in building reliable software. Static typing, unit testing with HUnit, and property testing with QuickCheck let you easily make a lot of guarantees that your code does what it’s supposed to. This is especially valuable for compilers, which are essentially (very large) pure functions, and thus very much amenable to such testing.
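
To give a flavour of what I mean, here is a toy sketch of my own (the three-constructor expression type and all the names below are made up for illustration, not taken from our actual compiler): a constant-folding pass, and a QuickCheck property asserting that it preserves the meaning of every randomly generated expression.

import Test.QuickCheck

-- A toy AST standing in for a real compiler's expression type.
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Show)

instance Arbitrary Expr where
  arbitrary = sized go
    where
      go n
        | n <= 0    = fmap Lit arbitrary
        | otherwise = oneof
            [ fmap Lit arbitrary
            , do a <- go half
                 b <- go half
                 return (Add a b)
            , do a <- go half
                 b <- go half
                 return (Mul a b)
            ]
        where half = n `div` 2

-- The "compiler pass": constant folding.
fold :: Expr -> Expr
fold (Add a b) = combine Add (+) (fold a) (fold b)
fold (Mul a b) = combine Mul (*) (fold a) (fold b)
fold e         = e

combine :: (Expr -> Expr -> Expr) -> (Int -> Int -> Int)
        -> Expr -> Expr -> Expr
combine _ op (Lit x) (Lit y) = Lit (x `op` y)
combine c _  a       b       = c a b

-- Reference semantics for the toy language.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- The property: folding must never change what an expression means.
prop_foldPreservesMeaning :: Expr -> Bool
prop_foldPreservesMeaning e = eval (fold e) == eval e

main :: IO ()
main = quickCheck prop_foldPreservesMeaning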

Testing definitely requires more up-front thought, but the end result is more reliable software—and that’s synonymous with better software, because good software is, for lack of a better term, obedient. Except maybe for AI.

Haskell lets us have a near-complete, robust AS3 compiler in under 10,000 non-blank lines of not-particularly-terse code. And that’s saying something: ActionScript is not what anyone would call a “pretty little thing”. And you don’t even know the half of it.

Conclusions

  • Compilers are fun and I will write them forever.
  • Haskell is good and I will use it for a good long time.
  • The Bay Area is a cool place to be, and I’m glad I moved here.

For those interested in my language work, an update is coming soon. Till next time.

24 May 2012

The Utopian University

Musician Tom Milsom recently wrote on his Tumblr:

I just wanna make shit

I don’t want money

I want to be able to live in a nice place and eat and fund US tours and stuff but apart from that

Nope

Pirate my music if you can’t afford it, cause it all comes round eventually in like goodwill and stuff, I don’t mind

I just want people to be able to enjoy what I do in the best possible environment

This really got me thinking. It’s a sentiment you often hear expressed among creative folks, and yet it never really seems to be addressed. I’ve been wishing for ages that there were some kind of patronage system for creative people—to do for makers what Y Combinator does for tech startups.

Creative people mostly just want the opportunity to create without being bothered by irrelevancies. Art colonies fill this need somewhat—you apply for a fellowship and get space and potentially funding for a summer to work on whatever you want. But this is, by nature, small-scale.

No matter how hard you wish oh wish for something, nobody is going to make it for you. But if you have a problem, and you fix it yourself, other people will come to you and say “Take my money!” or at least that’s the hope.

So what is my problem?

In order to create, I have to work on stuff not immediately related to my creations.

Even as a programmer, who can ostensibly create value out of just caffeine and time, most of my ideas are not easy to monetise. My hobby project right now is the compiler and tools for a new programming language. Whatever money I stand to make from that would be in book sales or donations.

So I, like many, cannot reasonably hope that my creative pursuits will ever be self-sustaining. And that’s fine, though I do envy Tom in that he is successful enough as a musician to continue being a musician.

Now, I could solve this problem by aggressively contacting investors to set up some kind of über-patronage venture fund thing for creators. That is, to my mind, the obvious solution.

But I don’t think it would scale. At all. In fact, I think it would be a spectacular failure.

Sure, the idea is straightforward. You apply with a portfolio. If accepted, you get a small amount of funding (say around $10,000) for a dozen-odd weeks to work on a particular project. That’s enough to ensure you have a place to live, food to eat, and the equipment and materials you need to work. Your patrons get a stake in anything you produce during that time.

But that’s not an attractive proposition for the people who would actually provide the funding. They have no way to force creators to stay on task. Worse, they won’t necessarily have the experience to know what to do with rights in widely varying projects.

And at the end of the season, they won’t have equity in a business. They’ll have rights in an album, or a game—a product with no business around it and no guarantee of growth unless somebody does some aggressive marketing. Which costs more money. That’s why specialised creative industries exist—record companies being the number one example. You make, we market.

Why would you, as an investor, buy a stake in something unless you expected it to be worth at least as much as you paid? This hypothetical patronage group would necessarily become a seed funding platform for creative startups, not just creative individuals in general. Even barring the other potential problems with that, it’s not what I want.

What I really want, quite frankly, is a Utopian University where creators can freely live and work and drink and mate and generally forget about the outside world. Creators create, the University helps market, the public buys, and the dollars come back to fund the whole shebang.

In order to keep everyone motivated, creators would be held accountable for their productivity—if you aren’t making progress on anything, you get the boot, or have to take a leave of absence. Call it an “artistic GPA” requirement.

That is something I think could work—with substantial initial funding, careful planning, and the right people behind it. I have no idea how to make it happen, but it will remain in the back of my mind as a long-term goal. And of course I’d love to hear a better idea.

27 April 2012

C++ Functional Style Tips

To say that C++ is a complex language would be an understatement—I had to spend years using it before properly understanding all of its many gotchas. Because I have plumbed its depths, I cannot recommend it to anyone for any purpose. If you know how to use the language effectively, then by all means do use it—but if you do not already know it, then there are plenty of other, perfectly good languages out there. Personally, I’m a fan of C, Haskell, and Scheme.

However, the language is still undeniably useful, particularly in game development. So I must try to benefit from the power of C++ while avoiding its failings in every way that I can. Following, then, are some brief guidelines that I have found to work well for writing C++ code in a more functional style. And functional style makes it easier to grasp a program by reducing the effort required to understand its possible states—so you can focus more on solving problems than fixing bugs.

For the sake of concision, I’ll assume that using namespace std; and using namespace std::placeholders; are in effect. Obviously in real code you ought to import only what’s necessary, and in as small a scope as possible.

♦ ♦ ♦

1. Use const Obsessively

C++ unfortunately has no reasonable way of enforcing referential transparency. You cannot readily make the compiler complain when a function has externally visible side effects. Indeed, I hope C++2x gets a pure keyword. But in the meantime, at least we have the ability to enforce immutability with our good friend const.

In general, everything should be const unless it truly needs to be modified—whether because an in-place algorithm is more efficient than its non-mutating equivalent, or because of failings in the type system and standard library. As an example of the latter, you cannot use standard algorithms to initialise an immutable container, resulting in objects that are unnecessarily non-const.

Declare member functions const if they are logically non-mutating; that is, if calling the function in and of itself will have no effect on the observable state of the object. “Observable” means that you can deduce said state through the public interface alone.

struct Point {
    double distance(const Point&) const;
    double distance(double, double) const;
    double x, y;
};

When taking parameters by lvalue reference, use const when possible.

    double distance(const Point&) const;

When taking parameters by value, declare the function as usual.

    double distance(double, double) const;

Then make the parameters const in the definition, unless of course they are to be modified.

double Point::distance(const double x, const double y) const {
    …
}

Don’t bother returning by const value—all it does is arbitrarily prevent calling mutating member functions on temporaries—but do return const references where appropriate.

const vector<double>& TreeNode::get_children() const {
    return children;
}

Make local variables const when possible.

double Point::distance(const double x, const double y) const {
    const auto dx = this->x - x;
    const auto dy = this->y - y;
    return sqrt(dx * dx + dy * dy);
}

Be sure to use const qualification for pointers where appropriate.

T* x;             // Mutable pointer to mutable T.
const T* x;       // Mutable pointer to immutable T.
T* const x;       // Immutable pointer to mutable T.
const T* const x; // Immutable pointer to immutable T.

This also applies to smart pointers such as unique_ptr (which you should use a lot) and shared_ptr (which you probably don’t need).


shared_ptr<T> x;             // Mutable shared pointer to mutable T.
shared_ptr<const T> x;       // Mutable shared pointer to immutable T.
const shared_ptr<T> x;       // Immutable shared pointer to mutable T.
const shared_ptr<const T> x; // Immutable shared pointer to immutable T.

Related to const is constexpr, which marks functions and values that can be evaluated at compile time (and must be, in contexts that require constant expressions). This has less to do with functional style and is more a matter of optimisation, so I mention it only for the sake of completeness.

♦ ♦ ♦

2. Use <functional>, <algorithm>, <numeric>, and <iterator>

The C++ standard library has a number of common algorithms abstracted into iterator-based functions. While iterators have some flaws, they can often help you write code that is more readable than the alternative using explicit loops.

bind() lets you perform partial application of function objects.

const auto succ = bind(plus<int>(), _1, 1);
cout << succ(3) << '\n';

Here, bind(plus<int>(), _1, 1) is equivalent to the section (+1) in Haskell.

There are two versions of transform() to be found in <algorithm>. The first is analogous to the familiar functional map, applying a unary function to each element in a range and sending the results to an output iterator.

const vector<int> a{1, 2, 3, 4, 5};
vector<int> b;
transform(a.begin(), a.end(), back_inserter(b),
    bind(plus<int>(), _1, 1));

The second version is analogous to zipWith, which takes two input ranges, applies a binary function to each pair of elements, and sends the results to an output iterator.

const vector<int> a{1, 2, 3, 4, 5};
const vector<int> b{5, 4, 3, 2, 1};
vector<int> c;
transform(a.begin(), a.end(), b.begin(),
    back_inserter(c), plus<int>());

accumulate(), from the <numeric> header, is by default a sum:

const vector<double> values{3.1, 4.1, 5.9, 2.6, 5.3, 5.8};
const auto sum = accumulate(values.begin(), values.end(), 0.0);

With a binary function, it becomes a left fold:

const auto product = accumulate(values.begin(), values.end(),
    1.0, multiplies<double>());

C++11 introduces lambdas to C++, which are largely syntactic sugar for anonymous classes—the members of the class are the (explicit) closure from the lambda’s lexical environment, and the operator() of the class is the lambda body. When you need to use a non-trivial function in a standard algorithm, it’s clearest to use a lambda rather than trying to compose function objects.

const auto paraboloid_product = accumulate(values.begin(), values.end(),
    2.0, [](double x, double y) { return (x - 1.0) * (y + 1.0); });

On the other hand, sometimes it’s better to use a simple for loop. Case in point: the for_each() algorithm, which conveys no more information than an ordinary range-based for, and can become unreadable when the types are non-trivial.

// Do this.
for (auto& p : symbols)
    p.second->method();

// Or this!
for (auto p = symbols.begin(); p != symbols.end(); ++p)
    p->second->method();

// Not this.
for_each(symbols.begin(), symbols.end(),
    [](pair<const string, shared_ptr<Value>>& p)
        { p.second->method(); });

♦ ♦ ♦

3. Use auto and decltype()

The C++11 auto keyword allows you to omit the manifest type of a variable, if a type can be deduced from the variable’s initialiser. Combined with const, this makes C++ variables look an awful lot like static single assignment (SSA) registers.

const auto v = f(x, y);

Just as auto is used to deduce a declaration from an initialiser, decltype() is used to deduce a type from an expression:

decltype(v[0]) w;

Like sizeof(), decltype() does not evaluate its argument, so it’s safe to use side-effectful functions in a decltype() expression. Like const, you ought to use auto and decltype() wherever possible, specifying manifest types only when necessary. auto lets you avoid repetition at the type level, and decltype() lets you express types as contracts in terms of other types.

♦ ♦ ♦

4. Use Higher-Order Macros

One of the reasons I’m fond of functional and concatenative programming is an increased ability to reduce repetition through factoring. A functional style, and in particular a point-free style, is vastly more amenable to refactoring than imperative code. Unfortunately, C++ often makes it unwieldy to do such refactoring, even with functional style, because it is inherently an imperative language.

Despite its limitations and caveats, the C preprocessor is useful for eliminating repetition. Unlike, say, Lisp macros, which are Turing-complete, CPP is only equivalent in power to a pushdown automaton. Still, macros have one powerful feature that I use regularly: they can take other macros as arguments. Thus you can define a sequence like so:

#define FUNCTIONS(M) \
    M(add, +) \
    M(sub, -) \
    M(mul, *) \
    M(div, /)

And apply macros to every term in the sequence:

#define DECLARE(N, O) \
    int N(int, int);

#define DEFINE(N, O) \
    int N(int x, int y) { return x O y; }

FUNCTIONS(DECLARE)
FUNCTIONS(DEFINE)

#undef DEFINE
#undef DECLARE

When C++ fails to provide a mechanism of semantic abstraction for a particular problem, higher-order macros at least allow you to abstract syntactically repetitive code. Judicious use of macros to reduce repetition is a Good Thing, even if it might make you feel a bit dirty.

♦ ♦ ♦

I’m sure there’s more I’m forgetting, but that’s the meat of it. Functional style can help you out, and here are some ways to do it. These aren’t rules, just guidelines, and you should always evaluate each problem individually to determine if the usual patterns really fit. Programming is, after all, a very subtle magic. But if there’s any way to make that magic a little more fun, I’ll take it.

16 April 2012

Frighteningly Ambitious Programming Language Ideas

Use versus Mention

The use–mention distinction—in addition to being one of the rare opportunities to use an en dash—is a distinction between application of a term for its inherent meaning (Brian is alive and has no letters) and examination of the term for its superficial representation (“Brian” has five letters and is not alive). In natural languages, we typically use quotation to differentiate between use and mention.

This distinction is one of those apparently mundane linguistic phenomena that turns out to have rather subtle and interesting philosophical implications. In particular, use without mention is the basis of pronouns and demonstratives—in “He picked up the book and studied it”, it refers to the same thing as the book, without repeated mention of the book. These structures in turn allow self-reference, as in the liar paradox (“This statement is false”) and Quine’s paradox:

“Yields falsehood when preceded by its quotation” yields falsehood when preceded by its quotation.

Reference and self-reference are two things that come up quite often in programming. If you’re a language designer with a mind to fundamentally alter how programming is done, then the ideas that follow are yours for the taking. They are difficult, unattractive, dragonish ideas, and it will be many years before ordinary developers take them seriously.

But first, some background.

♦ ♦ ♦

Reference and Quotation

Every aspect of natural language has a parallel in artificial language. Concrete referential types such as pointers are also references in the abstract sense. Pointers are like pronouns in that they allow you to refer to a thing itself, rather than through its name. Take this C code:

T x;
T *y = &x;

Here, *y refers to the very thing that x does. In addition, it allows a form of lazy evaluation: we can pass around copies of y as much as we want, without creating copies of *y. Deferred evaluation thus has performance benefits, but more fundamentally, pointers allow us to create data structures of a wholly different shape than the memory in which they live—trees and graphs and so forth. More on that later.

So in C, taking the address of a variable is a form of quotation, preventing the evaluation (copying) of that variable until the pointer is dereferenced. In Lisp, the quote special form likewise denotes mention rather than use, but of any expression:

(f x y)         ; Use
(quote (f x y)) ; Mention
'(f x y)        ; (Shorthand for the above.)

Concatenative languages have this as well:

y x f     # Use
[ y x f ] # Mention

This is essentially a generalisation of pointers—deferring evaluation through quotation. Lisp programs can pass around code unevaluated as easily as C programs pass around data uncopied. This largely eliminates the distinction between data and code, and allows still more kinds of laziness. Anonymous functions are similar to quotations, but a bit heavier in that they also have a closure of variables from their (lexical or dynamic) environment.

♦ ♦ ♦

To Infinity

But as it turns out, it’s not even strictly necessary to make a distinction between use and mention. The most visible example of this in practice is probably Haskell, where evaluation is non-strict by default. Every expression is nominally a reference (thunk) to be evaluated only as needed. To support this, most types are implicitly referential (boxed). Non-referential (unboxed) types are available for internal use and low-level optimisations.

In a way, lazy evaluation is rather like garbage collection. GC simulates a machine with infinite memory; lazy evaluation simulates a machine capable of infinite computations. In Haskell, it’s normal practice to construct infinite lists and cull them as needed. That’s backward as far as most languages are concerned, but it also turns out to be a very intuitive way of modelling many kinds of problems.
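
For instance, these are perfectly ordinary Haskell definitions; the infinite structures are only ever evaluated as far as something downstream demands:

-- An infinite list of Fibonacci numbers, defined in terms of itself.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = do
  print (take 10 fibs)                            -- first ten Fibonacci numbers
  print (takeWhile (< 100) [x * x | x <- [1 ..]]) -- squares below 100, culled from an infinite list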

So we can work with infinities of space and time on finite machines. But can we generalise further? Well, yes, but the results look pretty alien. Here be the aforementioned dragons.

♦ ♦ ♦

Dimensionality and Topology

Our computations are essentially two-dimensional, with one dimension of space (memory) and one of time. More time dimensions are possible, but I don’t see the use, considering our universe seems to have only one. I’d appreciate comments from anyone who can offer an application of such a thing.

You can very easily generalise to more memory dimensions, though. 2D memory allows you to construct, say, a matrix where each element is adjacent in memory to all of its neighbours. You just allocate memory in rectangular rather than linear regions. Unfortunately, you lose a sane ordering relation on pointers.

You can also play with the memory topology—cylindrical, spherical, or toroidal memory lets you construct fixed-size circular arrays, where each element is adjacent to the next, and the last is also adjacent to the first. Slightly stranger topologies allow variable-length circular arrays. And beyond familiar shapes and grids, any lattice graph can serve: in a 2D hexagonal topology, for instance, each memory location is adjacent to six others.

Our programming languages, too, tend to be one-dimensional, much like our natural languages; but this, too, is not a requirement. Indeed, two-dimensional languages and higher-dimensional languages have been made, but it’s probably more practical to add 2D features to a predominantly 1D language:

# Probably a good idea:

matrix = 1 2 3
         4 5 6
         7 8 9

# Maybe a good idea:

lispy_boxes = +-------------+
              | * +-------+ |
              |   | + 2 3 | |
              |   +-------+ |
              |   +-------+ |
              |   | + 4 5 | |
              |   +-------+ |
              +-------------+

# DEAR GOD WHY

         (   divisor  ) exponent    n
result = | ---------- |          * sum set(i)
         (  dividend  )            i=0

Furthermore, 1D languages require jumps for flow control, but since any graph can be embedded in three dimensions, languages of three or more dimensions can be written jump-free. An easy way to do this is to allow changing the direction of the instruction pointer rather than its position.

And for you CS theory types looking for something to do a Ph.D. on, it’s a fun problem to figure out whether a language with a strictly planar state diagram can be Turing-complete. Go nuts.

♦ ♦ ♦

Diffing, Visualisation, and Editing

As with memory, languages could be given any number of interesting topologies beyond the usual grid. Tool support, of course, becomes something of an issue. It’s relatively easy to see how a diff would work in a 2D or 3D Cartesian language, but more complex spaces are problematic.

In one dimension, diffing relies on the longest common subsequence problem (LCS). While LCS is NP-hard for an arbitrary number of sequences, it can be solved in polynomial time using dynamic programming when the number of sequences is constant—usually two, the previous and current versions of a file.
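
For reference, the two-sequence case is the textbook dynamic-programming exercise; here is a compact Haskell sketch (illustrative only; a real diff works on lines and adds plenty of heuristics):

import Data.Array

lcs :: Eq a => [a] -> [a] -> [a]
lcs xs ys = reverse (walk n m)
  where
    n = length xs
    m = length ys
    xa = listArray (1, n) xs
    ya = listArray (1, m) ys
    -- table ! (i, j) is the length of the LCS of the first i elements of xs
    -- and the first j elements of ys; the array ties the knot lazily.
    table = array ((0, 0), (n, m))
      [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    cell 0 _ = 0
    cell _ 0 = 0
    cell i j
      | xa ! i == ya ! j = 1 + table ! (i - 1, j - 1)
      | otherwise        = max (table ! (i - 1, j)) (table ! (i, j - 1))
    -- Walk back through the table to recover one longest subsequence.
    walk 0 _ = []
    walk _ 0 = []
    walk i j
      | xa ! i == ya ! j                         = xa ! i : walk (i - 1) (j - 1)
      | table ! (i - 1, j) >= table ! (i, j - 1) = walk (i - 1) j
      | otherwise                                = walk i (j - 1)

main :: IO ()
main = putStrLn (lcs "XMJYAUZ" "MZJAWXU")  -- prints "MJAU"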

In more complex spaces, you have to rely on the maximum common subgraph isomorphism problem (MCS), which is also NP-hard. Luckily, MCS is solvable in polynomial time when the circular ordering of edges around vertices in an embedding is constrained, as it would (almost?) certainly be. But you don’t need to add complexity when you generalise—hell, for some languages, you might only need simple graph difference.

No, the really problematic things are visualisation and editing. How do you display removals and insertions in multiple dimensions and unusual topologies? Even for 2D images there are many equally good ways to visualise differences. How do you let a programmer efficiently and reliably create, alter, and share programs outside the realm of 1D text?

♦ ♦ ♦

And Beyond

Programming is still a fairly new field. Our languages are hovering around a local maximum, but there’s little reason to think we couldn’t come up with something better, more fun, and more productive. We need to be willing to sacrifice some of the things that make existing languages great, because they might have no analogue in a better system.

In other words, we should not imagine aliens in terms of the creatures we know. A green dog with antennae is still a dog. We should instead imagine things that make us legitimately uncomfortable to think about. Things that are unattractive not because they’re peculiar or dangerous, but because they’re deeply unsettling. Think Marvin the Martian versus the blind idiot demon sultan Azathoth. If those aliens exist, then not only does it mean we’re not alone—it means we’ve been wrong this whole time. And that, unfortunately, is how progress works.

28 March 2012

Thinking Broadly

I just read an article called “Overthinking”, which spawned some interesting discussion on Hacker News. I was rather disappointed in the article, because we do overthink, but not in exactly the way the author suggests. So here’s what I think I think—if I haven’t overthought it.

You cannot deliberately solve a problem without thinking about it. Even in the rare case that a solution arises from your subconscious for a problem you didn’t realise you had, your brain has been processing patterns and, as it were, thinking for you. This happens whether you like it or not—and for some reason it happens extraordinarily often in the shower.

Now, whenever we set out to solve a difficult problem, we use our powers of reasoning and intuition to guide us in the direction of a solution. Starting from the origin of a problem, we use thought to propel ourselves through the space of possible solutions in search of local maxima:



Overthinking is not thinking too much, because that’s simply not possible. Rather, to overthink is to doggedly pursue a single direction as far as possible before moving on, like the non-solution above. That direction might eventually lead to a good solution, or even the very best one. But there are solutions in other directions that require much less work to get to.

That’s why “overthinking” is often a waste of time: only when you know the direction to go should you put the blinders on and get to work. That’s essentially why I’m more of a starter than a finisher: I find it much easier to discover paths than to persist in treading them. But again, overthinking is not a measure of thinking too much or too little. It’s conducting a depth-first search when you ought to be searching breadth first.

Children are undoubtedly better than adults at avoiding overthinking, and the reason is simple. Without having had the chance to develop specialised knowledge, their solution spaces are relatively shallow and uniform:



Children don’t know as much as adults, so they give equal weight to most of the directions they could take when solving a problem, and thus are better at appearing observant and intuitive. We generally use intuition to decide our angle, and knowledge to go the distance—polar coordinates, literally. When you have knowledge in a specific area, you often discount alternative ways of thinking about a problem—in other words, if all you have is a hammer, everything looks like a nail.

But that’s not necessarily a bad thing. A lot of the time, your problem is in fact a nail, or something very like one, and your hammer will work just dandy. The idea that knowledge is the problem couldn’t be further from the truth. No, the real enemy is closed-mindedness.

Take this number puzzle from the article:


To an adult, puzzles involving numbers are almost always seen as mathematical. Through years of exposure to math problems involving numbers, we condition ourselves to solve numerical problems mathematically. Our solution space is specialised. We see a numbers problem, and we know with great certainty that it must be a math problem.


However, this cannot really be too mathematical a problem, since it “can be solved by preschool children in five to ten minutes”. To a person without significant mathematical experience, numbers are used primarily for counting. And when numbers and letters are both fairly new to you, you place them in basically the same category: visual figures. By thinking along these lines and keeping an open mind, anyone can arrive at the solution straight away.

I must admit that for this particular problem I have a bit of an unfair advantage. Since I have grapheme→colour synesthesia, I see figures with different features very differently from one another. Thus I can’t help but notice that the number on the right is just the count of enclosed loops in the figures on the left. The experience is hard to describe, but it’s for basically the same reason that a non-synesthete can’t help but notice the same fact in this image:


The author writes: “…to become truly creative you need to be able to liberate your mind from the the shell of knowledge, education and adultification you have accumulated”. This is, I think, too narrow a view—there is nothing wrong with having knowledge and using it, nothing wrong with listening to your education. There is also nothing wrong with being an adult, but I’ll pass.

The key is to keep an open mind, to judge each new thing on its own terms. To relax and let our work tell us how it ought to be solved. And above all, to think broadly before thinking deeply.

12 February 2012

Why Concatenative Programming Matters

Introduction

There doesn’t seem to be a good tutorial out there for concatenative programming, so I figured I’d write one, inspired by the classic “Why Functional Programming Matters” by John Hughes. With any luck it will get more people interested in the topic, and give me a URL to hand people when they ask what the heck I’m so excited about.

Foremost, there seems to be some disagreement over what the term “concatenative” actually means. This Stack Overflow answer by Norman Ramsey even goes so far as to say:

“…it is not useful to use the word ‘concatenative’ to describe programming languages. This area appears to be a private playground for Manfred von Thun. There is no real definition of what constitutes a concatenative language, and there is no mature theory underlying the idea of a concatenative language.”

This is rather harsh, and, well, wrong. Not entirely wrong, mind, just rather wrong. But it’s not surprising that this kind of misinformation gets around, because concatenative programming isn’t all that well known. (I aim to change that.)

Concatenative programming is so called because it uses function composition instead of function application—a non-concatenative language is thus called applicative. That’s the defining difference, and it’s as “real” a definition as I need, because, well, it’s the one that people are using. Bang. Descriptivism.

One of the problems with functional programming is that the oft-touted advantages—immutability, referential transparency, mathematical purity, &c.—don’t immediately seem to apply in the real world. The reason “Why Functional Programming Matters” was necessary in the first place was that functional programming had been mischaracterised as a paradigm of negatives—no mutation, no side-effects—so everybody knew what you couldn’t do, but few people grasped what you could.

There is a similar problem with concatenative programming. Just look at the Wikipedia introduction:

A concatenative programming language is a point-free programming language in which all expressions denote functions and the juxtaposition of expressions denotes function composition. The combination of a compositional semantics with a syntax that mirrors such a semantics makes concatenative languages highly amenable to algebraic manipulation.

This is all true—and all irrelevant to your immediate problem of why you should care. So in the next sections, I will show you:
  • How concatenative programming works
  • How typing in a concatenative language works
  • How concatenative languages are efficient
  • What concatenative languages are not good for
  • Where to go for more information

Now let’s get started!

♦ ♦ ♦

The Basics

In an applicative language, functions are applied to values to get other values. λ-calculus, the basis of functional languages, formalises application as “β-reduction”, which just says that if you have a function (f x = x + 1) then you can substitute a call to that function (f y) with its result (y + 1). In λ-calculus, simple juxtaposition (f x) denotes application, but function composition must be handled with an explicit composition function:
compose := λf. λg. λx. f (g x)
This definition says “the composition (compose) of two functions (f and g) is the result of applying one (f) to the result of the other (g x)”, which is pretty much a literal definition. Note that this function can only be used to compose functions of a single argument—more on this later.
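
For the Haskell-inclined, this is exactly the standard (.) operator:

-- Composition in Haskell; the standard library calls this (.).
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

main :: IO ()
main = print (compose (+ 1) (* 2) 3)  -- (3 * 2) + 1 = 7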

In concatenative languages, composition is implicit: “f g” is the composition of f and g. However, this does not mean that function application becomes explicit—it actually becomes unnecessary. And as it turns out, this peculiar fact makes these languages a whole lot simpler to build, use, and reason about.

The untyped λ-calculus is a relatively simple system. However, it and the many systems derived from it still need three kinds of term—variables, lambdas, and applications—as well as a number of rules about how to correctly replace variable names when applying functions. You have to deal with name binding, closures, and scope. For a supposedly low-level language, it has quite a bit of inherent complexity.

Concatenative languages have a much simpler basis—there are only functions and compositions, and evaluation is just the simplification of functions. It is never necessary to deal with named state—there are no variables. In a sense, concatenative languages are “more functional” than traditional functional languages! And yet, as we will see, they are also easy to implement efficiently.


♦ ♦ ♦

Composition

Let’s say we want to multiply a couple of numbers. In a typical concatenative language, this looks like:
2 3 ×
There are two weird things about this.
First, we use postfix notation (2 3 ×) rather than the prefix notation (× 2 3) that is common in the functional world, or the infix notation (2 × 3) that most languages provide for convenience. There is nothing stopping a concatenative language from having infix operators, but for the sake of consistency, most stick to postfix notation: “f g” means (g ∘ f), i.e., the reverse of function composition. This is actually rather nice notation, because it means that data flows in the order the functions are written in.

Second, we said all terms denote functions—so what the heck are 2 and 3 doing in there? They sure look like values to me! But if you tilt your head a little, you can see them as functions too: values take no arguments and return themselves. If we were to write down the inputs and outputs of these functions, it would look like this:
2 :: () → (int)
3 :: () → (int)
As you may guess, “x :: T” means “x is of type T”, and “T1 → T2” means “a function from type T1 to type T2”. So these functions take no input, and return one integer. We also know the type of the multiplication function, which takes two integers and returns just one:
× :: (int, int) → (int)
Now how do you compose all of these functions together? Remember we said “f g” means the (reverse) composition of f and g, but how can we compose 2 with 3 when their inputs and outputs don’t match up? You can’t pass an integer to a function that takes no arguments.

The solution lies in something called stack polymorphism. Basically, we can give a generic, polymorphic type to these functions that says they’ll take any input, followed by what they actually need. They return the arguments they don’t use, followed by an actual return value:
2 :: ∀A. (A) → (A, int)
3 :: ∀A. (A) → (A, int)
× :: ∀A. (A, int, int) → (A, int)
“∀A.” means “For all A”—in these examples, even if A has commas in it. So now the meaning of the expression “2 3” is clear: it is a function that takes no input and returns both 2 and 3. This works because when we compose two functions, we match up the output of one with the input of the other, so we start with the following definitions:
2 :: ∀A. (A) → (A, int)
3 :: ∀B. (B) → (B, int)
We match up the types:
(A, int) = (B)
By substituting, we get a new polymorphic type for 3 within the expression:
3 :: ∀C. (C, int) → (C, int, int)
This matches the non-polymorphic type:
3 :: ∀A. (A, int) → (A, int, int) = ∀B. (B) → (B, int)
So the final type of the expression becomes:
2 3 :: ∀A. (A) → (A, int, int)
Going through the same process for multiplication, we get:
2 3 :: ∀A. (A) → (A, int, int)
× :: ∀B. (B, int, int) → (B, int)
A = B
2 3 × :: ∀A. (A) → (A, int)
This is correct: the expression “2 3 ×” takes no input and produces one integer. Whew! As a sanity check, note also that the equivalent function “6” has the same type as “2 3 ×”:
6 :: ∀A. (A) → (A, int)
So already, concatenative languages give us something applicative functional languages generally can’t: we can actually return multiple values from a function, not just tuples. And thanks to stack polymorphism, we have a uniform way to compose functions of different types, so the flow of data in our programs doesn’t get obscured, as it were, by the plumbing.
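
If it helps to see the bookkeeping spelled out, here is a toy Haskell encoding of my own devising (not anything standard): the “rest of the stack” A becomes an ordinary type variable, and the stack itself is a nest of pairs.

-- 2 :: ∀A. (A) → (A, int)
two :: a -> (a, Int)
two s = (s, 2)

-- 3 :: ∀A. (A) → (A, int)
three :: a -> (a, Int)
three s = (s, 3)

-- × :: ∀A. (A, int, int) → (A, int)
times :: ((a, Int), Int) -> (a, Int)
times ((s, x), y) = (s, x * y)

-- "2 3 ×" is just composition, reading left to right.
expr :: a -> (a, Int)
expr = times . three . two

main :: IO ()
main = print (expr ())  -- ((), 6)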

♦ ♦ ♦

Cool Stuff

In the above example, we worked from left to right, but because composition is associative, you can actually do it in any order. In math terms, (f ∘ g) ∘ h = f ∘ (g ∘ h). Just as “2 3 ×” contains “2 3”, a function returning two integers, it also contains “3 ×”, a function that returns thrice its argument:
3 × :: (int) → (int)
(From now on I’ll omit the ∀ bit from type signatures to make them easier to read.)

So we can already trivially represent partial function application. But this is actually a huge win in another way. Applicative languages need to have a defined associativity for function application (almost always from left to right), but here we’re free from this restriction. A compiler for a statically typed concatenative language could literally:
  1. Divide the program into arbitrary segments
  2. Compile every segment in parallel
  3. Compose all the segments at the end
This is impossible to do with any other type of language. With concatenative programming, a parallel compiler is a plain old map-reduce!

Because all we have are functions and composition, a concatenative program is a single function—typically an impure one with side effects, but that’s by no means a requirement. (You can conceive of a pure, lazy concatenative language with side-effect management along the lines of Haskell.) Because a program is just a function, you can think about composing programs in the same way.

This is the basic reason Unix pipes are so powerful: they form a rudimentary string-based concatenative programming language. You can send the output of one program to another (|); send, receive, and redirect multiple I/O streams (n<, 2>&1); and more. At the end of the day, a concatenative program is all about the flow of data from start to finish. And that again is why concatenative composition is written in “reverse” order—because it’s actually forward:

+---+
| 2 |
+---+
  |
  |   +---+
  |   | 3 |
  |   +---+
  |     |
  V     V
+---------+
| *       |
+---------+
  |
  V


♦ ♦ ♦

Implementation

So far, I have deliberately stuck to high-level terms that pertain to all concatenative languages, without any details about how they’re actually implemented. One of the very cool things about concatenative languages is that while they are inherently quite functional, they also have a very straightforward and efficient imperative implementation. In fact, concatenative languages are the basis of many things you use every day:
  • The Java Virtual Machine on your PC and mobile phone
  • The CPython bytecode interpreter that powers BitTorrent, Dropbox, and YouTube
  • The PostScript page description language that runs many of the world’s printers
  • The Forth language that started it all, which still enjoys popularity on embedded systems
The type of a concatenative function is formulated so that it takes any number of inputs, uses only the topmost of these, and returns the unused input followed by actual output. These functions are essentially operating on a list-like data structure, one that allows removal and insertion only at one end. And any programmer worth his salt can tell you what that structure is called.

It’s a stack. Yep. Just a stack.

Consider the information flowing between functions in the expression “2 3 × 4 5 × +”, which, if you’re not up on your postfix, is equal to (2 × 3 + 4 × 5):

Function   Output
           ()
2          (2)
3          (2, 3)
×          (6)
4          (6, 4)
5          (6, 4, 5)
×          (6, 20)
+          (26)

Moving from left to right in the expression, whenever we encounter a “value” (remember: a nullary self-returning function), we push its result to the stack. Whenever we encounter an “operator” (a non-nullary function), we pop its arguments, perform the computation, and push its result. Another name for postfix is reverse Polish notation, which achieved great success in the calculator market: it was used on every HP calculator sold between 1968 and 1977, and many thereafter.
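
Here is the whole strategy as a tiny Haskell sketch (illustrative, not how a real VM is organised): every token is a function from stack to stack, and running the program just threads an empty stack through each of them in turn.

type Stack = [Int]

push :: Int -> Stack -> Stack
push n s = n : s

-- Pop two operands, push the result of combining them.
binOp :: (Int -> Int -> Int) -> Stack -> Stack
binOp op (y : x : s) = (x `op` y) : s
binOp _  _           = error "stack underflow"

-- "2 3 × 4 5 × +" as a list of stack functions.
program :: [Stack -> Stack]
program = [push 2, push 3, binOp (*), push 4, push 5, binOp (*), binOp (+)]

-- Evaluation: fold the program over an empty stack.
run :: [Stack -> Stack] -> Stack
run = foldl (flip ($)) []

main :: IO ()
main = print (run program)  -- [26]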

So a concatenative language is a functional language that is not only easy, but trivial to run efficiently, so much so that most language VMs are essentially concatenative. x86 relies heavily on a stack for local state, so even C programs have a little bit of concatenativity in ’em, even though x86 machines are register-based.

Furthermore, it’s straightforward to make some very clever optimisations, which are ultimately based on simple pattern matching and replacement. The Factor compiler uses these principles to produce very efficient code. The JVM and CPython VMs, being stack-based, are also in the business of executing and optimising concatenative languages, so the paradigm is far from unresearched. In fact, a sizable portion of all the compiler optimisation research that has ever been done has involved virtual stack machines.

♦ ♦ ♦

Point-free Expressions

It is considered good functional programming style to write functions in point-free form, omitting the unnecessary mention of variables (points) on which the function operates. For example, “f x y = x + y” can be written as “f = (+)”. It is clearer and less repetitious to say “f is the addition function” than “f returns the sum of its two arguments”.

More importantly, it’s more meaningful to write what a function is versus what it does, and point-free functions are more succinct than so-called “pointful” ones. For all these reasons, point-free style is generally considered a Good Thing™.

However, if functional programmers really believe that point-free style is ideal, they shouldn’t be using applicative languages! Let’s say you want to write a function that tells you the number of elements in a list that satisfy a predicate. In Haskell, for instance:

countWhere :: (a -> Bool) -> [a] -> Int
countWhere predicate list = length (filter predicate list)

It’s pretty simple, even if you’re not so familiar with Haskell. countWhere returns the length of the list you get when you filter out elements of a list that don’t satisfy a predicate. Now we can use it like so:

countWhere (>2) [1, 2, 3, 4, 5] == 3

We can write this a couple of ways in point-free style, omitting predicate and list:

countWhere = (length .) . filter
countWhere = (.) (.) (.) length filter

But the meaning of the weird repeated self-application of the composition operator (.) isn’t necessarily obvious. The expression (.) (.) (.)—equivalently (.) . (.) using infix syntax—represents a function that composes a unary function (length) with a binary function (filter). This type of composition is occasionally written .:, with the type you might expect:

(.:) :: (c -> d) -> (a -> b -> c) -> a -> b -> d
(.:) = (.) . (.)

countWhere = length .: filter

But what are we really doing here? In an applicative language, we have to jump through some hoops in order to get the basic concatenative operators we want, and get them to typecheck. When implementing composition in terms of application, we must explicitly implement every type of composition.

In particular, there is no uniform way to compose functions of different numbers of arguments or results. To even get close to that in Haskell, you have to use the curry and uncurry functions to explicitly wrap things up into tuples. No matter what, you need different combinators to compose functions of different types, because Haskell doesn’t have stack polymorphism for function arguments, and it inherently can’t.
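
For instance, one way to route countWhere through tuples explicitly, just to show the hoops involved:

-- Compose length (unary) with filter (binary) by packing and unpacking
-- a pair by hand.
countWhere :: (a -> Bool) -> [a] -> Int
countWhere = curry (length . uncurry filter)

main :: IO ()
main = print (countWhere (> 2) [1, 2, 3, 4, 5])  -- 3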

Writing point-free expressions demands concatenative semantics, which just aren’t natural in an applicative language. Now contrast a concatenative example:

define countWhere [filter length]
(1, 2, 3, 4, 5) [2 >] countWhere

It’s almost painfully literal: “to count the number of elements in a list that match a predicate is to filter it and take the length”. In a concatenative language, there is no need at all for variables, because all you’re doing is building a machine for data to flow through:

+-----------------+
| (1, 2, 3, 4, 5) |
+-----------------+
    |
    |               +-------+
    |               | [2 >] |
    |               +-------+
    |                 |
+---|-----------------|-----+
|   |    countWhere   |     |
|   V                 V     |
| +-----------------------+ |
| | filter                | |
| +-----------------------+ |
|   |                       |
|   V                       |
| +--------+                |
| | length |                |
| +--------+                |
|   |                       |
+---|-----------------------+
    |
    V

When you’re building a diagram like this, just follow a few simple rules:
  1. Each block is one function
  2. A block takes up as many columns as needed by its inputs or outputs, whichever are more
  3. When adding a block, put it in the rightmost column possible:
    • If it takes no inputs, add a column
    • If there aren’t enough arrows to match the block, the program is ill-typed 

♦ ♦ ♦

Quotations

Notice the use of brackets for the predicate [2 >] in the preceding example? In addition to composition, the feature that completes a concatenative language is quotation, which allows deferred composition of functions. For example, “2 >” is a function that returns whether its argument is greater than 2, and [2 >] is a function that returns “2 >”.

It’s at this point that we go meta. While just composition lets us build descriptions of dataflow machines, quotation lets us build machines that operate on descriptions of other machines. Quotation eliminates the distinction between code and data, in a simple, type-safe manner.

The “filter” machine mentioned earlier takes in the blueprints for a machine that accepts list values and returns Booleans, and filters a list according to the instructions in those blueprints. Here’s the type signature for it:
filter :: (list T, T → bool) → (list T)
There are all kinds of things you can do with this. You can write a function that applies a quotation to some arguments, without knowing what those arguments are:
apply :: ∀A B. (A, A → B) → (B)
You can write a function to compose two quotations into a new one:
compose :: ∀A B C. (A → B, B → C) → (A → C)
And you can write one to convert a function to a quotation that returns it:
quote :: (T) → (() → T)
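
A loose Haskell rendering of these three operations, treating a quotation as a plain function, and cheating by passing a single value where the concatenative versions would take a whole stack (Haskell has no stack polymorphism, after all):

-- Apply a quotation to an argument.
apply :: a -> (a -> b) -> b
apply x f = f x

-- Compose two quotations, left to right.
compose :: (a -> b) -> (b -> c) -> (a -> c)
compose f g = g . f

-- Wrap a value in a quotation that returns it.
quote :: a -> (() -> a)
quote x = \_ -> x

main :: IO ()
main = print (apply 3 (compose (* 2) (+ 1)))  -- (3 * 2) + 1 = 7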

♦ ♦ ♦

The Dark Side

Say you want to convert a math expression into concatenative form:
f x y z = y² + x² − |y|
This has a bit of everything: it mentions one of its inputs multiple times, the order of the variables in the expression doesn’t match the order of the inputs, and one of the inputs is ignored. So we need a function that gives us an extra copy of a value:
dup :: (T) → (T, T)
A function that lets us reorder our arguments:
swap :: (T1, T2) → (T2, T1)
And a function that lets us ignore an argument:
drop :: (T) → ()
From the basic functions we’ve defined so far, we can make some other useful functions, such as one that joins two values into a quotation:
join2 :: (T1, T2) → (() → (T1, T2))
join2 = quote swap quote swap compose
Or a function that rotates its three arguments leftward:
rot3 :: (T1, T2, T3) → (T2, T3, T1)
rot3 = join2 swap quote compose apply
And hey, thanks to quotations, it’s also easy to declare your own control structures:

define true  [[drop apply]]      # Apply the first of two arguments.
define false [[swap drop apply]] # Apply the second of two arguments.
define if    [apply]             # Apply a Boolean to two quotations.

And using them is like—actually, just is—using a regular function:

["2 is still less than 3."]    # "Then" branch.
["Oops, we must be in space."] # "Else" branch.
2 3 <                          # Condition.
if print                       # Print the resulting string.

Those particular definitions for true and false will be familiar to anyone who’s used Booleans in the λ-calculus. A Boolean is a quotation, so it behaves like an ordinary value, but it contains a binary function that chooses one branch and discards another. “If-then-else” is merely the application of that quotation to the particular branches.
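
For comparison, the same trick in Haskell (these are just the standard Church Booleans, nothing specific to concatenative languages):

-- A Boolean is a function that picks one of two alternatives.
true', false' :: a -> a -> a
true'  t _ = t
false' _ f = f

-- "If" is merely application of the Boolean to the two branches.
if' :: (a -> a -> a) -> a -> a -> a
if' b thenBranch elseBranch = b thenBranch elseBranch

main :: IO ()
main = putStrLn (if' (if 2 < 3 then true' else false')
                     "2 is still less than 3."
                     "Oops, we must be in space.")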

Anyway, getting back to the math, we already know the type of our function ((int, int, int) → (int)), we just need to deduce how to get there. If we build a diagram of how the data flows through the expression, we might get this:

   +---+  +---+  +---+
   | x |  | y |  | z |
   +---+  +---+  +---+
     x      y      z
     |      |      |
     |      |      V
     |      |    +------+
     |      |    | drop |
     |      |    +------+
     |      V
     |    +----------+
     |    | dup      |
     |    +----------+
     |      y      y
     |      |      |
     |      |      V
     |      |    +----------+
     |      |    | dup      |
     |      |    +----------+
     |      |      y      y
     |      |      |      |
     |      |      V      V
     |      |    +----------+
     |      |    | *        |
     |      |    +----------+
     |      |     y^2
     |      |      |
     |      V      V
     |    +----------+
     |    | swap     |
     |    +----------+
     |     y^2     y
     |      |      |
     |      |      V
     |      |    +-----+
     |      |    | abs |
     |      |    +-----+
     |      |     |y|
     |      |      |
     V      V      V
   +-----------------+
   | rot3            |
   +-----------------+
    y^2    |y|     x
     |      |      |
     |      |      V
     |      |    +----------+
     |      |    | dup      |
     |      |    +----------+
     |      |      x      x
     |      |      |      |
     |      |      V      V
     |      |    +----------+
     |      |    | *        |
     |      |    +----------+
     |      |     x^2
     |      |      |
     |      V      V
     |    +----------+
     |    | swap     |
     |    +----------+
     |     x^2    |y|
     |      |      |
     |      V      V
     |    +----------+
     |    | -        |
     |    +----------+
     |   x^2-|y|
     |      |
     V      V
   +----------+
   | +        |
   +----------+
y^2+x^2-|y|
     |
     V
So our final function can be written:
f = drop dup dup × swap abs rot3 dup × swap − +
Well…that sucked.

♦ ♦ ♦

A Lighter Note

You’ve just seen one of the major problems with concatenative programming—hey, every kind of language has its strengths and weaknesses, but most language designers will lie to you about the latter. Writing seemingly simple math expressions can be difficult and unintuitive, especially using just the functions we’ve seen so far. To do so exposes all of the underlying complexity of the expression that we’re accustomed to deferring to a compiler.

Factor gets around this by introducing a facility for lexically scoped local variables. Some things are simply more natural to write with named state rather than a bunch of stack-shuffling. However, the vast majority of programs are not dominated by mathematical expressions, so in practice this feature is not used very much:

“Out of 38,088 word and method definitions in the source code of Factor and its development environment at the time of this writing, 310 were defined with named parameters.”—Factor: A Dynamic Stack-based Programming Language

One of the great strengths of concatenative languages, however, is their ability to refactor complex expressions. Because every sequence of terms is just a function, you can directly pull out commonly used code into its own function, without even needing to rewrite anything. There are generally no variables to rename, nor state to manage.

So in practice, a lot of expressions can be refactored using a wide array of handy functions, of the sort that commonly end up in a standard library. With some refactoring, that math expression might look like this:

square = dup ×
f = drop [square] [abs] bi − [square] dip +

Which doesn’t look so bad, and actually reads pretty well: the difference between the square and absolute value of the second argument, plus the square of the first. But even that description shows that our mathematical language has evolved as inherently applicative. It’s better sometimes just to stick to tradition.
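
To see what bi and dip are doing there, here is the same list-as-stack sketch extended with rough Haskell stand-ins for them; these are my own simplified definitions, not Factor’s real implementations:

import Control.Arrow ((>>>))

type Stack = [Int]   -- top of the stack at the head, as before

square, absTop, drop', sub, add :: Stack -> Stack
square (a : s)     = a * a : s
absTop (a : s)     = abs a : s
drop'  (_ : s)     = s
sub    (a : b : s) = b - a : s
add    (a : b : s) = b + a : s

-- bi: apply two quotations to the top value, pushing both results.
bi :: (Stack -> Stack) -> (Stack -> Stack) -> Stack -> Stack
bi p q (a : s) = head (q [a]) : head (p [a]) : s

-- dip: apply a quotation to the stack underneath the top value, then restore it.
dip :: (Stack -> Stack) -> Stack -> Stack
dip p (a : s) = a : p s

-- f = drop [square] [abs] bi − [square] dip +
f :: Stack -> Stack
f = drop' >>> bi square absTop >>> sub >>> dip square >>> add

main :: IO ()
main = print (f [4, 3, 2])   -- x = 2, y = 3, z = 4; prints [10] again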

♦ ♦ ♦

Whither Hence

So you’ve got the gist, and it only took a few dozen mentions of the word “concatenative” to get there. I hope you’re not suffering from semantic satiation.

You’ve seen that concatenative programming is a paradigm like any other, with a real definition and its own pros and cons:

  • Concatenative languages are simple and consistent (“everything is a function”)
  • They are amenable to dataflow programming
  • Stack languages are well studied and have good performance
  • You can easily roll your own control structures
  • Everything is written in point-free style (for better or worse)

If you’re interested in trying out a mature, practical concatenative language, check out Factor and the official blog of the creator, Slava Pestov. Also see Cat for more information on static typing in concatenative languages.

I’ve been idly working on a little concatenative language called Kitten off and on for a while now. It’s dynamically typed and compiles to C, so you can run it just about anywhere. I wanted a language I could use for a site on a shared host where installing compilers was irritating. That shows you the extent of my language geekery—I’d rather spend hours writing a language than twenty minutes figuring out how to install GHC on Bluehost.

Anyway, the implementation is just barely complete enough to play around with. Feel free to browse the source, try it out, and offer feedback. You’ll need GHC and GCC to build it, and I imagine it works best on Linux or Unix, but there shouldn’t be any particularly horrible incompatibilities.

This would also be a good time to mention that I’m working on a more serious language called Magnet, which I mentioned in my last article about how programming is borked. It’s principally concatenative, but also has applicative syntax for convenience, and relies heavily on the twin magics of pattern matching and generalised algebraic data types. Hell, half the reason I wrote this article was to provide background for Magnet. So expect more articles about that.

Edit (20 April 2013) The above information is no longer accurate. Kitten is currently being rewritten; the work that was done on Magnet has been incorporated. Kitten is now statically typed and, until the compiler is complete, interpreted.

And that about does it. As always, feel free to email me at evincarofautumn@gmail.com to talk at length about anything. Happy coding!

25 January 2012

Yep, Programming Is Borked

I recently read “I want to fix programming”, which touches on one of the fundamental problems with programming as we know it. I agree with the author, but I think many of the commenters failed to understand the issue, so I’d like to present it in another way.

I’ve been programming for a long time, mostly because I’m addicted to solving puzzles. I’m very much into programming languages, partly because as a programmer I’m surrounded by them, but also because language is an important part of what makes us human.

So while programming is a craft I enjoy, and I love having the power to create awesome stuff practically ex nihilo, I too am disappointed in what we have to work with. Here’s why.

Process-Oriented versus Result-Oriented Programming

Concatenative, functional, object-oriented, logic—whatever paradigm you choose, they all have the same issue: they require that you describe the how rather than the what. In any language you can name, a program is a description of the process by which you compute what you want, when it could be just a description of what you want. Sure, some languages are more declarative than others, but that scale is orthogonal to the distinction I’m drawing: a language that expresses a process declaratively is still process-oriented, not result-oriented.

That’s the real issue. As programmers, we solve problems using code. Even a fun problem like “a game like this should exist but doesn’t” is valid. So we should be able to express our code in terms of its intended result, not the process by which that result is attained.

The author proposes a style of programming based on constraint satisfaction. From a set of constraints, it’s possible to derive an algorithm that meets them. Of course, we naturally worry about the properties of the algorithm the compiler will derive; is it Θ(n!) when it could be Θ(n log n)? But while that is a reasonable worry, it is not—strictly speaking—relevant.
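
As a toy illustration of that worry, in Haskell and with made-up names: the constraint “an ordered permutation of the input” pins down the result completely, but a naive solver that searches permutations can take factorial time, while spelling out a concrete process (here, insertion sort) does not.

import Data.List (permutations)

-- Result-oriented: state the constraints, search for anything satisfying them.
-- Correct, but a naive Θ(n!) search.
sortByConstraints :: Ord a => [a] -> [a]
sortByConstraints xs = head [p | p <- permutations xs, ordered p]
  where ordered ys = and (zipWith (<=) ys (drop 1 ys))

-- Process-oriented: spell out one particular procedure.
insertionSort :: Ord a => [a] -> [a]
insertionSort = foldr insert []
  where insert x [] = [x]
        insert x (y : ys)
          | x <= y    = x : y : ys
          | otherwise = y : insert x ys

main :: IO ()
main = do
  print (sortByConstraints [3, 1, 2 :: Int])   -- [1,2,3]
  print (insertionSort     [3, 1, 2 :: Int])   -- [1,2,3]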

See, there are many equivalent sets of constraints that define a particular algorithm, and by selecting intelligently from those, we can be confident that the compiler will derive what we intend. Our job should be to intelligently express requirements as formal constraints for the compiler to satisfy.

I know that sounds boring, and I was initially skeptical that useful programs could be made this way. But think about it: every non-trivial program already is implemented this way, just in a slow, error-prone, manual fashion. A programmer takes some requirements, formalises them, then painstakingly translates them into a suitable process. We can avoid one of those steps.

A Realistic Example

Say you’re writing a game. The first thing on your to-do list is to be able to move a character around the screen with the arrow keys. In just about any existing language, you have to construct a bunch of unrelated infrastructure to make this happen:
  • Transforming continuous physics into a discrete representation
  • Dealing with issues of frame rates, rendering, and collision
  • Managing the state and resources of living and dead objects
All of this crap is unrelated to just making something that works.

Often, much of this work has already been done for you by a library author, but the fact remains that someone had to do the busywork. Functional reactive programming, in contrast, gets this right. You can say precisely what you mean:
  • The character consists of:
    • Acceleration of ( ax , ay )
    • Velocity of ( vx , vy )
    • Position at ( x, y )
    • Sprite of size ( w, h )
  • When an arrow key is pressed, change the acceleration:
    • Left: ( ‒1, 0 )
    • Right: ( +1, 0 )
    • Up: ( 0, ‒1 )
    • Down: ( 0, +1 )
  • Integrate over time:
    • Velocity from acceleration
    • Position from velocity
  • The colour of a pixel at ( px , py ) is:
    • If ( px , py ) is within the sprite ( x, y, x + w, y + h ):
      • The colour of the pixel at ( px − x, py − y ) in the sprite
    • Otherwise:
      • The background colour

This is a complete, fully declarative, fully result-oriented description of the program. It contains only those details that are relevant to the problem at hand, and every detail is expressed as a constraint that can be plugged directly into an FRP system—which is really a simple constraint solver—to derive the runtime behaviour.
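
To make that concrete, here is a deliberately tiny Haskell sketch of the “integrate over time” part. It is not any real FRP library, and all the names are mine: a behaviour is just a function of time, and velocity and position are defined directly by what they are rather than by an update loop.

-- A Behavior is a value that varies over time; here, nothing more than a function.
type Time       = Double
type Behavior a = Time -> a

-- Crude fixed-step numerical integration from time 0 to t.
integral :: Behavior Double -> Behavior Double
integral f t = sum [f s * dt | s <- [0, dt .. t]]
  where dt = 0.01

-- Pretend the right arrow key is held for the first two seconds: +1 acceleration.
accelX :: Behavior Double
accelX t = if t < 2 then 1 else 0

velocityX, positionX :: Behavior Double
velocityX = integral accelX
positionX = integral velocityX

main :: IO ()
main = print (positionX 1.0)   -- roughly 0.5, i.e. ½·a·t² with a = 1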

And Whither Hence?

Prolog serves as an example of constraint-based programming, and Haskell FRP libraries are nice, but these are basically just toys compared to what we can do.

There is no reason we cannot make a better, more result-oriented language, or even just make existing process-oriented languages look more result-oriented to get some momentum for the paradigm. Take a look at Swym (“Say What You Mean”), where the etc keyword lets you do things like this:

List.byPairs: [[.1st, .2nd], [.3rd, .4th], etc];
byPairs[1..10] == [[1,2],[3,4],[5,6],[7,8],[9,10]];

Which even does something sane when you give it a different number of elements:

[1,2,3].byPairs == [[1,2]];
[1,2].byPairs == [[1,2]];
[1].byPairs == [];

This is really exciting stuff, but it shouldn’t have to be, and we can do even better. The implementation of a few “magic” operators like etc and each could make even an imperative language look much more result-oriented.
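
For comparison, here is byPairs with the recursion written out by hand in Haskell; this is roughly the boilerplate that something like etc saves you from writing (the translation is mine):

-- Group a list into consecutive pairs, discarding any leftover element.
byPairs :: [a] -> [[a]]
byPairs (x : y : rest) = [x, y] : byPairs rest
byPairs _              = []

main :: IO ()
main = do
  print (byPairs [1 .. 10 :: Int])   -- [[1,2],[3,4],[5,6],[7,8],[9,10]]
  print (byPairs [1, 2, 3 :: Int])   -- [[1,2]]
  print (byPairs [1 :: Int])         -- []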

For a while, I was working on a language called Prog. It was to be an imperative language for generic programming, with dependent typing. Much of the magic was implemented with “radioactive types”: unstable types that could exist only during evaluation, and decayed into ordinary types thereafter. These were used, for instance, to implement various query operators on iterable structures:

which (1..10) > 5 == (6, 7, 8, 9, 10);
every (1..5) ** 2 == (1, 4, 9, 16, 25);
every (1..3) * every (1..3) == ((1,2,3), (2,4,6), (3,6,9));
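
Those read naturally as list operations; a loose Haskell transliteration (my names, and nothing to do with Prog’s actual radioactive-type machinery) might be:

-- "which" filters by a predicate; "every" maps a function over a collection.
which :: (a -> Bool) -> [a] -> [a]
which = filter

every :: (a -> b) -> [a] -> [b]
every = map

main :: IO ()
main = do
  print (which (> 5) [1 .. 10])                    -- [6,7,8,9,10]
  print (every (^ 2) [1 .. 5])                     -- [1,4,9,16,25]
  print [ every (* x) [1 .. 3] | x <- [1 .. 3] ]   -- [[1,2,3],[2,4,6],[3,6,9]]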

After reading that article and the comments, I’m inspired to work on Prog again, if I can find the time. Expect to see more blarticles about it.

A project in the same vein that’s consuming a lot of my energy right now is a tool to help non-programmers make professional-quality interactive media. If you don’t know how to program, you have limited means of creating games and other media applications. All the authoring software that’s out there falls into two categories:
  • Tragically underpowered
  • Needlessly complicated
So, using research into how people best solve problems, I’m making a tool that helps people solve the problem of having an idea and not being able to make it a reality. At the end of the day, that’s a frustration that a lot of programmers face. If as a programmer you feel inspired to contribute, please contact me at evincarofautumn@gmail.com, and I’ll let you know how you can get involved.