Scala, a general-purpose, object-oriented, functional language for the
JVM, is the brainchild of Martin Odersky, a professor at Ecole
Polytechnique Fédérale de Lausanne (EPFL). In the first part of a
multi-part interview series, Martin Odersky discusses Scala's history
and origins with Artima's Bill Venners.
Discovering a fascination with compilers
Bill Venners: Let's start at the beginning. How did you first become
involved with programming languages?
Martin Odersky: My favorite subject was always compilers and
programming languages. When I first discovered what a compiler was, as
an undergrad in 1980, I immediately wanted to build one. The only
computer I could remotely afford at the time would have been a
Sinclair ZX80, which had one kilobyte of RAM. I was very close to
giving it a try, but, fortunately, soon after got access to a much
more powerful machine, an Osborne-1. It was the world's first
“portable” (meaning luggable) computer, and it looked remotely like
a sewing machine tilted by 90 degrees. It had a five-inch screen which
displayed 52 tiny characters per line. But it also had a very
impressive 56 usable kilobytes of RAM and two floppy drives of 90K
each.
In those days, I spent some time with another student in my college
named Peter Sollich. We had read about a new language called
Modula-2, which we found very elegant and well-engineered. So the plan
was born to write a Modula-2 compiler for 8-bit Z80 computers. There
was a small problem in that the only language that came with the
Osborne was Microsoft Basic, which was utterly unsuitable for what we
had in mind, because it did not even support procedures with
parameters—all you had was global variables. Other compilers at the
time were too expensive for our means. So we decided to apply the
classic bootstrapping technique. Peter had written a first compiler
for a small subset of Pascal in Z80 assembly language. We then used
this compiler to compile a slightly larger language, and so on, during
several generations, until we could compile all of Modula-2. It could
produce interpreted bytecode as well as Z80 binaries. The bytecode was
the most compact of any system at the time, and the binary version was
the fastest for 8-bit computers. It was a pretty capable system for its
time.
Shortly before we finished our compiler, Borland came out with Turbo
Pascal, and they were considering going into the Modula-2 market as
well. In fact, Borland decided to buy our Modula-2 compiler to be sold
under the name of Turbo Modula-2 for CP/M alongside an IBM PC version
they wanted to develop. We offered to do the IBM PC version for them,
but they told us they had it already covered. Unfortunately that
version took them much longer than planned. By the time it came out,
three or four years later, their implementor team had split from the
company, and it became known as TopSpeed Modula-2. In the absence of
an IBM-PC version, Borland never put any marketing muscle behind
Turbo-Modula-2, so it remained rather obscure.
When we had finished the Modula-2 compiler, Borland offered to hire
both Peter and me on the spot. Peter went to join them. I was very
close to doing the same, but had the problem that I still had a year
of classes and a master's project ahead of me. I was very tempted at
the time to become a college dropout. In the end, I decided to stick
it out at university. During my master's project (which was about
incremental parsing), I discovered that I liked doing research a
lot. So in the end, I gave up on the idea of joining Borland to write
compilers, and went on instead to do a Ph.D. with Niklaus Wirth, the
inventor of Pascal and Modula-2, at ETH Zurich.
Working to improve Java
Bill Venners: How did Scala come about? What is its history?
Martin Odersky: Towards the end of my stay in Zurich, around 1988/89,
I became very fond of functional programming. So I stayed in research
and eventually became a university professor in Karlsruhe, Germany. I
initially
worked on the more theoretical side of programming, on things like
call-by-need lambda calculus. That work was done together with Phil
Wadler, who at the time was at the University of Glasgow. One day,
Phil told me that a
wired-in assistant in his group had heard that there was a new language
coming out, still in alpha stage, called Java. This assistant told
Phil: "Look at this Java thing. It's portable. It has bytecode. It
runs on the web. It has garbage collection. This thing is going to
bury you. What are you going to do about it?" Phil said, well, maybe
he's got a point there.
The answer was that Phil Wadler and I decided to take
some of the ideas from functional programming and move them into the
Java space. That effort became a language called Pizza, which had
three features from functional programming: generics, higher-order
functions, and pattern matching. Pizza's initial distribution was in
1996, a year after Java came out. It was moderately successful in that
it showed that one could implement functional language features on the
JVM platform.
Then we got in contact with Gilad Bracha and David Stoutamire from
the Sun core developer team. They said, "We're really interested in
the generics stuff you've been doing; let's do a new project that
does just that." And that became GJ (Generic Java). So we developed GJ
in 1997/98, and six years later it became the generics in
Java 5, with some additions that we didn't do at the time. In
particular, the wildcards in Java generics were developed later
independently by Gilad Bracha and people at Aarhus University.
Although our generics extensions were put on hold for six years, Sun
developed a much keener interest in the compiler I had written for
GJ. It proved to be more stable and maintainable than their first Java
compiler. So they decided to make the GJ compiler the standard javac
compiler from their 1.3 release on, which came out in 2000.
Designing a language better than Java
Martin Odersky: Now, during the Pizza and GJ experience I sometimes
felt frustrated, because
Java is an existing language with very hard constraints. As a result,
I couldn't do a lot of things the way I would have wanted to do
them—the way I was convinced would be the right way to do them.
So after that time, when essentially the focus of my work was to make
Java better, I decided that it was time to take a step back. I wanted
to start with a clean sheet, and see whether I could design something
that's better than Java. But at the same time I knew that I couldn't
start from scratch. I had to connect to an existing infrastructure,
because otherwise it's just impractical to bootstrap yourself out of
nothing without any libraries, tools, and things like that.
So I decided that even though I wanted to design a language that was
different from Java, it
would always connect to the Java infrastructure—to the JVM and its
libraries. That was the idea. It was a great opportunity for me that
at that time I became a professor at EPFL, which provides an excellent
environment for independent research. I could form a small group of
researchers that could work without having to chase after external
grants all the time.
At first we were pretty radical. We wanted to create something that
built on a very beautiful model of concurrency called the join
calculus. We created an object-oriented version of the join calculus
called Functional Nets and a language called Funnel. After a while,
however, we found out that Funnel, being a very pure language, wasn't
necessarily very practical to use. Funnel was built on a very small
core. A lot of things that people usually take for granted (such as
classes, or pattern matching) were provided only by encodings into
that core. This is a very elegant technique from an academic point of
view. But in practice it does not work so well. Beginners found the
necessary encodings rather difficult, whereas experts found it boring
to have to do them time and time again.
As a result, we decided to start over again and do something that was
sort of midway between the very pure academic language Funnel, and the
very pragmatic but at some points restrictive GJ. We wanted to create
something that would be at the same time practical and useful and more
advanced than what we could achieve with Java. We started working on
this language, which we came to call Scala, in about 2002. The first
public release was in 2003. A relatively large redesign happened in
early 2006. And it's been growing and stabilizing since.
Constraints on improving Java
Bill Venners: You said you found it frustrating at times to have the
constraints of needing to be backwards compatible with Java. Can you
give some specific examples of things you couldn't do when you were
trying to live within those constraints, which you were then able to do
when you changed to doing something that's binary but not source
compatible?
Martin Odersky: In the generics design, there were a lot of very, very
hard constraints. The strongest constraint, the most difficult to cope
with, was that it had to be fully backwards compatible with
ungenerified Java. The story was the collections library had just
shipped with 1.2, and Sun was not prepared to ship a completely new
collections library just because generics came about. So instead it
had to just work completely transparently.
That's why there were a number of fairly ugly things. You always had
to have ungenerified types with generified types, the so-called raw
types. Also, you couldn't change what arrays were doing, so you had
unchecked warnings. Most importantly, you couldn't do a lot of
the things you wanted to do with arrays, like generate an array with a
type parameter T, an array of something where you didn't know the type.
You couldn't do that. Later in Scala we actually found out how to do
that, but that was possible only because we could drop in Scala the
requirement that arrays are covariant.
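As an editorial illustration (not part of the interview), the following minimal Java sketch shows the constraints described above: raw types coexisting with generified types, the unchecked operations that result, and the fact that an array of a type parameter T cannot be created directly. The class and method names are hypothetical, and the code assumes Java 7 or later rather than the Java of the GJ era.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch, not code from GJ or javac.
class RawTypeSketch {
    // Pre-generics style: takes the raw type List, so the compiler cannot
    // check the element type and would only report an unchecked warning.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void legacyAdd(List raw, Object element) {
        raw.add(element);
    }

    // Returns true if using the first element as a String fails at run time.
    static boolean firstElementIsNotAString(List<String> strings) {
        try {
            strings.get(0).length(); // the compiler inserts a cast to String here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // `new T[n]` is rejected by javac ("generic array creation"). A common
    // workaround allocates Object[] and casts, which cannot be verified.
    @SuppressWarnings("unchecked")
    static <T> T[] uncheckedArray(int n) {
        return (T[]) new Object[n];
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        legacyAdd(strings, 42); // accepted: List<String> converts to raw List
        System.out.println(firstElementIsNotAString(strings)); // prints true

        try {
            // The caller's implicit cast to String[] fails: the array is an Object[].
            String[] slots = RawTypeSketch.<String>uncheckedArray(3);
            System.out.println(slots.length);
        } catch (ClassCastException e) {
            System.out.println("Object[] is not a String[]");
        }
    }
}
```

The unchecked cast in `uncheckedArray` only defers the problem to the call site, which is one way to see why generic array creation could not be supported cleanly under these constraints.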
Bill Venners: Can you elaborate on the problem with Java's covariant
arrays?
Martin Odersky: When Java first shipped, Bill Joy and James Gosling
and the other members of the Java team thought that Java should have
generics, only they didn't have the time to do a good job designing it
in. So because there would be no generics in Java, at least initially,
they felt that arrays had to be covariant. That means an array of
String is a subtype of array of Object, for example. The reason for
that was they wanted to be able to write, say, a “generic” sort method
that took an array of Object and a comparator and that would sort this
array of Object. And then let you pass an array of String to it. It
turns out that this thing is type unsound in general. That's why you
can get an array store exception in Java. And it actually also turns
out that this very same thing blocks a decent implementation of
generics for arrays. That's why arrays in Java generics don't work at
all. You can't have an array of list of string; it's
impossible. You're forced to do the ugly raw type, just an array of
list, forever. So it was sort of like an original sin. They did
something very quickly and thought it was a quick hack. But it
actually ruined every design decision later on. So in order not to fall
into the same trap again, we had to break off and say, now we will not
be upwards compatible with Java, there are some things we want to do
differently.
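As an editorial illustration (not part of the interview), the sketch below shows both sides of array covariance in Java: a sort method over an array of Object that accepts an array of String, and the array store exception that the same rule makes possible. The class and method names are hypothetical, and the lambda syntax assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical names throughout; a sketch of the behavior described above.
class CovariantArraySketch {
    // A pre-generics style "generic" sort over Object[] plus a comparator.
    static void sort(Object[] items, Comparator<Object> order) {
        Arrays.sort(items, order);
    }

    // Returns true if storing an Integer into the array fails at run time.
    static boolean storeFails(Object[] items) {
        try {
            items[0] = Integer.valueOf(1); // type-checks statically
            return false;
        } catch (ArrayStoreException e) {
            return true; // the JVM checked the store against the real element type
        }
    }

    public static void main(String[] args) {
        String[] names = {"pizza", "funnel", "gj"};

        // Covariance at work: a String[] is accepted where Object[] is expected.
        sort(names, (a, b) -> a.toString().compareTo(b.toString()));
        System.out.println(Arrays.toString(names)); // prints [funnel, gj, pizza]

        // The unsound side: the same String[] seen as Object[] rejects an Integer.
        System.out.println(storeFails(names)); // prints true
    }
}
```

Generic collections in Java took the opposite choice: a `List<String>` is not a `List<Object>`, so the equivalent mistake is rejected at compile time instead of surfacing as a run-time exception.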