Scala and functional programming

I have completely reversed my position on Scala over the space of a few months. There is no reason you should take anything I write on the topic seriously.

§ April 2010

I've spent the last dozen years programming mostly in Java. I now find limitations in the language preventing me from safely managing systems of a certain age and complexity. Most bugs and instability in these systems result from difficult-to-see side effects and poorly managed mutable state. Over many years, I became convinced that I needed a statically typed functional language. I visualized a system of well-defined behaviors that accept many types of data, rather than data objects with behavior attached. Purely functional code should have an entirely predictable result for a given visible input. Much functional code can actually be proved correct for all cases, using mathematical induction. Putting behavior at the center of the system, and data on the outside, seemed to turn the object-oriented model inside-out. I could not see how to reconcile the two approaches.
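As a minimal sketch of the kind of inductive reasoning meant here, consider a recursive sum over an immutable list. The names `PureSum` and `sum` are illustrative only. The function has no hidden inputs, so its result depends entirely on the visible argument, and its shape mirrors an induction over the list: a base case for the empty list and a step for one more element.

```scala
object PureSum {
  // Pure: no mutable state, no I/O; same input always gives the same output.
  def sum(xs: List[Int]): Int = xs match {
    case Nil          => 0                // base case: the empty list
    case head :: tail => head + sum(tail) // inductive step: one more element
  }

  // Property provable by induction on xs (ignoring Int overflow):
  //   sum(xs ++ ys) == sum(xs) + sum(ys)
  // Base: sum(Nil ++ ys) == sum(ys) == 0 + sum(ys).
  // Step: sum((h :: t) ++ ys) == h + sum(t ++ ys) == h + sum(t) + sum(ys).
}
```

The recursion is not tail-recursive, so this is meant to illustrate the proof structure rather than to process very long lists.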

At first, I had a poor impression of Scala as an uncomfortable compromise, still trying to support both an object-oriented programming model and a functional model, with no clear resulting style. Haskell, on the other hand, seemed to take the functional approach as far as it could go, with perfect consistency. If I were going to make the jump, why not go all the way?

My opinions on Scala have reversed drastically over the last few months, mostly as a result of studying Haskell and trying to visualize it in my numerical domain. I spent that time reading through "Real World Haskell" by O'Sullivan et al., and as much advanced online discussion as I could find, particularly on "monads." (I've digested three other Haskell books in the past, each stressing the pure functional part of the language.)

I've always been a big fan of the functional approach, and I would prefer that as much of a system as possible be purely functional: all data should be immutable, and functions should have visible inputs and outputs, with no side-effects. My difficulty is with handling the impure parts of a system, such as I/O, user activity, data stores, and any sources of randomness. Much of my particular domain involves revisiting large chunks of heterogeneous data and making periodic updates. These data need to be altered in place. They also need encapsulation and rigid protocols for interacting with them.

Haskell's solution, monads, does not work for me. Monads put mutable state in an outer layer, where it can be pulled down into purely functional code without loss of encapsulation. Monads are theoretically satisfying, but in practice I find them too invasive. The functional code cannot ignore the existence of those monads. I can understand the Haskell State monad perfectly -- how it is motivated and derived -- but I cannot forget its existence while I concentrate on other matters. Layering monads becomes geometrically complex. I cannot ignore one monad while alternately interacting with another.
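To make the complaint concrete, here is a rough Scala transliteration of the idea behind a state monad. It is a sketch, not Haskell's actual `Control.Monad.State` implementation, and the names `StateSketch`, `next`, and `threeIds` are hypothetical. Notice that even a trivial counter forces the calling code into monadic style: the state never appears as a variable, but every line that touches it has to live inside the `for` block.

```scala
object StateSketch {
  // A state transition: given a state S, produce a result A and the next state.
  final case class State[S, A](run: S => (A, S)) {
    def map[B](f: A => B): State[S, B] =
      State { s =>
        val (a, s1) = run(s)
        (f(a), s1)
      }

    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { s =>
        val (a, s1) = run(s) // run this step first
        f(a).run(s1)         // then thread the new state into the next step
      }
  }

  // Hypothetical counter: return the current value, then increment the state.
  val next: State[Int, Int] = State(n => (n, n + 1))

  // Three draws from the counter, with the state threaded implicitly.
  val threeIds: State[Int, List[Int]] =
    for {
      a <- next
      b <- next
      c <- next
    } yield List(a, b, c)
}
```

Running `StateSketch.threeIds.run(10)` yields `(List(10, 11, 12), 13)`. Nothing is mutated, but the code that uses the counter cannot be written as if the counter were not there.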

I finally concluded that I prefer Scala's approach of mixing functional and object-oriented styles. It is a trade-off of purity against clarity, and for me the compromise wins. The OO model has survived for a reason. I just need to exercise more discipline with it.

I miss some basic features of Haskell, such as algebraic types and pattern matching on function arguments. The conceptual elegance of its approach is unmatched. Scala's versions of these features feel like a weak emulation. I originally blamed these compromises on the Java VM, but now I see they are a consequence of making everything, including functions, an object. This is a price I am now willing to pay. A mutable object is still a valuable thing to have. I can put it aside and not think about it until I need it. I think I can still keep most of my code purely functional.
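For comparison, this is roughly how an algebraic type and a pattern match look in Scala. The `Shape` type and the `area` function are invented for illustration; the Haskell equivalent is shown in the comments. The sum type becomes a sealed trait with case-class subtypes, and the match moves from the function's argument position into an explicit `match` expression.

```scala
object ShapeSketch {
  // Haskell: data Shape = Circle Double | Rect Double Double
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rect(width: Double, height: Double) extends Shape

  // Haskell matches directly on the function's arguments:
  //   area (Circle r) = pi * r * r
  //   area (Rect w h) = w * h
  // Scala needs a named parameter and a match expression.
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
}
```

Because `Shape` is sealed, the compiler can still warn when a match leaves out a case, so much of the safety survives even if the notation is heavier.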

Now that I accept the premise that objects still have a place in my code, the "Programming in Scala" book by Odersky et al. is an exciting read. I'm impressed by its serendipity and power. As I see academics moving away from Scala back to Haskell, I also see increasing excitement among enterprise Java programmers. They feel the potential right away.

(Why not a dynamically typed functional language like Clojure? Dynamically typed libraries have proved too difficult to document and share without exposing and explaining the implementation. Too many bugs are unnecessarily delayed to run time.)

I still think many Java programmers will not make the jump to Scala because they will not see the point of the extra functional abstractions. Many C programmers never got far with C++ or Java because they could not grasp the point of polymorphism. On the other hand, we might increase our chances of attracting programmers of a more mathematical bent, who found the grueling drudgery of previous languages too much.

Footnote: Monads and objects are not mutually exclusive. If you google, you will find examples of monads implemented in Scala. Scala's for-expression treats lists like a monad: the compiler rewrites it into calls to map and flatMap. The problem is that in a purely functional language, monads are the only game in town. You can't really take a short-cut with a mutable object when the occasion demands.
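A small sketch of the footnote's point about lists, with illustrative names (`ForSketch`, `pairs`, `desugared`). The two values below are built differently but are equal, because the `for` expression over lists is rewritten by the compiler into `flatMap` and `map` calls, which are the monadic operations on `List`.

```scala
object ForSketch {
  // A for-expression over two lists.
  val pairs: List[(Int, Char)] =
    for {
      n <- List(1, 2)
      c <- List('a', 'b')
    } yield (n, c)

  // Roughly what the compiler rewrites it to.
  val desugared: List[(Int, Char)] =
    List(1, 2).flatMap(n => List('a', 'b').map(c => (n, c)))
}
```

Both evaluate to `List((1,'a'), (1,'b'), (2,'a'), (2,'b'))`.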

§ January 2010

I have seen a surprisingly fast falloff in interest in Scala in online channels, since the release of 1.0. First-rate developers said it was too complicated for their groups. It was not scaling well to many developers.

It is not at all clear what good style looks like in Scala because it deliberately supports multiple paradigms. Different individuals may write in incompatible styles that do not play well together. Unlike Java, Scala does not show the same effort to leave out non-essential complexity.

After a while, Scala feels like a poor man's substitute for Haskell or OCaml, with not so many benefits to compensate for the compromises. Yes, we can still access our Java code, but it is very awkward to write new Scala code for use from Java. In exchange for access to our legacy Java code, we lose most of the generality and power we wanted from a functional language in the first place.

When we moved to OO from purely procedural code, a great many developers were left behind. They could not handle so much abstraction. I have found that many Java developers even now cannot grasp polymorphism or the proper use of an interface.

Good functional programming -- with generic functions, algebraic types, and pattern matching -- kicks that abstraction up a level. We want to pass behaviors around without also passing along state. Only then can we minimize side-effects and use mostly immutable objects. We can eliminate most of our bug opportunities and thread safety issues. We can actually reason inductively to prove the correctness of our code.
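As a hedged illustration of passing behavior without state, the sketch below uses a generic higher-order function and an immutable value. `Account`, `deposit`, and `applyTwice` are hypothetical names chosen for the example. The behavior being passed is an ordinary function value, and each update returns a new object rather than changing the old one.

```scala
object BehaviorSketch {
  // Generic behavior: works for any type A, carries no state of its own.
  def applyTwice[A](f: A => A)(x: A): A = f(f(x))

  // Immutable data: fields cannot be reassigned after construction.
  final case class Account(owner: String, balance: BigDecimal)

  // Returns a behavior (a function) that produces an updated copy.
  def deposit(amount: BigDecimal): Account => Account =
    acct => acct.copy(balance = acct.balance + amount)
}
```

For example, given `val a = BehaviorSketch.Account("x", BigDecimal(10))`, the call `BehaviorSketch.applyTwice(BehaviorSketch.deposit(BigDecimal(5)))(a)` returns an account with balance 20 while `a` itself still holds 10. Since nothing is shared and mutated, there is nothing for two threads to race on.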

But many productive Java developers will never be able to write that kind of code. If they cannot see the point of an interface, then they will never understand a functor or monad. At some point, developers who want that extra level of productivity and power are going to have to move on without the rest.

So I think I am giving up on a second language to play well with the Java VM. If I see a problem I can solve much more cleanly in Haskell than in Java, then I will do legacy integration the hard way, as I often did with C and Java: out of process, with interaction via data, interprocess communication, or a message server.

There are many more VM alternatives, such as Clojure, but Scala was to me the most promising of the statically typed choices. Type inference solves the verbosity problem without delaying type errors to run time. Dynamically typed libraries seem to be more difficult to document for large teams and systems, and too often require clients to inspect the implementation.
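A brief sketch of what type inference buys, using made-up names (`InferenceSketch`, `samples`, `totals`). Neither value carries a type annotation, yet both are fully typed at compile time, so a mismatch is rejected before the program ever runs.

```scala
object InferenceSketch {
  // No annotations needed: the compiler infers Map[String, List[Int]].
  val samples = Map("a" -> List(1, 2, 3), "b" -> List(4, 5))

  // Inferred as Map[String, Int].
  val totals = samples.map { case (key, values) => key -> values.sum }

  // Still statically checked. Uncommenting the next line is a compile-time
  // type error, not a run-time failure:
  // val wrong: Map[String, String] = totals
}
```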

Java has been the dominant language for long enough now that it begins to feel constraining. It is preventing us from scaling up to more complicated problems. For me those problems have very heterogeneous data types with lots of similar but slightly different operations. It is hard to reuse code as much as I would like. It is hard to guarantee correctness with pervasive side-effects and mutability.

Bill Harlan, 2010
