from http://www.infoq.com/articles/tilkov-rest-doubts
Invariably, learning about REST means that you’ll end up wondering just
how applicable the concept really is for your specific scenario. And
given that you’re probably used to entirely different architectural
approaches, it’s only natural that you start doubting whether
REST, or rather RESTful HTTP, really
works in practice, or simply breaks down once you go beyond
introductory, “Hello, World”-level stuff. In this article, I will try
to address 10 of the most common doubts people have about REST when
they start exploring it, especially if they have a strong background in
the architectural approach behind SOAP/WSDL-based Web services.
1. REST may be usable for CRUD, but not for “real” business logic
This is the most common reaction I see among people who are
skeptical about REST benefits. After all, if all you have is
create/read/update/delete, how can you possibly express more
complicated application semantics? I have tried to address some of
these concerns in the introductory article of this series, but this point definitely merits closer discussion.
First of all, the HTTP verbs - GET, PUT, POST, and DELETE - do not
have a 1:1 mapping to the CRUD database operations. For example, both
POST and PUT can be used to create new resources: they differ in that
with PUT, it’s the client that determines the resource’s URI (which is
then updated or created), whereas a POST is issued to a “collection” or
“factory” resource and it’s the server’s task to assign a URI. But
anyway, back to the question: how do you handle more complex business
logic?
Any computation calc(a, b) that returns a result c can be transformed
into a URI that identifies its result; e.g. c = calc(2, 3) might become
http://example.com/calculation?a=2&b=3.
At first, this seems like a gross misuse of RESTful HTTP — aren’t we
supposed to use URIs to identify resources, not operations? Yes, but in
fact this is what we do: http://example.com/sum?augend=2&addend=3
identifies a resource, namely the result of adding 2 and 3.
And in this particular (obviously contrived) example, using a GET to
retrieve the result might be a good idea — after all, this is
cacheable, you can reference it, and computing it is probably safe and
not very costly.
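As a rough sketch of this idea, the following Python fragment builds the URI that identifies the result of an addition and shows what a server might do when that URI is dereferenced. The helper names sum_uri and resolve_sum are made up for illustration; only the example.com URI pattern is taken from the text above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://example.com/sum"  # URI pattern from the text above


def sum_uri(augend, addend):
    """Return the URI identifying the result of augend + addend (hypothetical helper)."""
    return BASE + "?" + urlencode({"augend": augend, "addend": addend})


def resolve_sum(uri):
    """What a server might do on GET: read the inputs from the query and add them."""
    query = parse_qs(urlsplit(uri).query)
    return int(query["augend"][0]) + int(query["addend"][0])


uri = sum_uri(2, 3)
result = resolve_sum(uri)
```

Because the inputs are fully contained in the URI, the same URI always names the same result, which is what makes it linkable and cacheable.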
Of course in many, if not most cases, using a GET to compute
something might be the wrong approach. Remember that GET is supposed to
be a “safe” operation, i.e. the client does not accept any obligations
(such as paying you for your services) or assume any responsibility,
when all it does is follow a link by issuing a GET. In many other
cases, it’s therefore more reasonable to provide input data to the
server so that it can create a new resource via POST. In its response,
the server can indicate the URI of the result (and possibly issue a
redirect to take you there). The result is then re-usable, can be
bookmarked, can be cached when it’s retrieved … you can basically
extend this model to any operation that yields a result — which is
likely to be every one you can think of.
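The POST-based variant described above might look roughly like the following in-memory sketch. The class CalculationService, the /calculations path and the tuple-shaped responses are assumptions made for illustration and do not correspond to any particular framework.

```python
from itertools import count


class CalculationService:
    """In-memory stand-in for a server; all names and paths are illustrative."""

    def __init__(self):
        self._results = {}
        self._ids = count(1)

    def post(self, path, data):
        """POST to the collection: compute, store the result, return its new URI."""
        if path != "/calculations":
            return 404, {}, None
        uri = f"/calculations/{next(self._ids)}"
        self._results[uri] = {"inputs": data, "result": data["a"] * data["b"]}
        return 201, {"Location": uri}, None

    def get(self, path):
        """GET on a result resource: safe and repeatable."""
        if path not in self._results:
            return 404, {}, None
        return 200, {}, self._results[path]


service = CalculationService()
status, headers, _ = service.post("/calculations", {"a": 6, "b": 7})
fetched_status, _, body = service.get(headers["Location"])
```

The server assigns the URI, and from then on the result is an ordinary resource that any client can retrieve with GET.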
2. There is no formal contract/no description language
From RPC to CORBA, from DCOM to Web Services we’re used to having an
interface description that lists the operations, their names, and the
types of their input and output parameters. How can REST possibly be
usable without an interface description language?
There are three answers to this very frequently asked question.
First of all, if you decide to use RESTful HTTP together with XML
(a very common choice), the whole world of XML schema languages, such as
DTDs, XML Schema, RELAX NG or Schematron,
is still available to you. Arguably, 95% of what you usually describe
using WSDL is not tied to WSDL at all, but rather concerned with the
XML Schema complex types you define. The stuff WSDL adds on top is
mostly concerned with operations and their names — and describing these
becomes pretty boring with REST’s uniform interface: After all, GET,
PUT, POST and DELETE are all the operations you have. With regards to
the use of XML Schema, this means that you can use your favorite data
binding tool (if you happen to have one) to generate data binding code
for your language of choice, even if you rely on a RESTful interface.
(This is not an entirely complete answer, see below.)
Secondly, ask yourself what you need a description for. The most
common — albeit not the only — use case for having some description is
to generate stubs and skeletons for the interface you’re describing. It
is usually not documentation, since the description in e.g.
WSDL format tells you nothing about the semantics of an operation — it
just lists a name. You need some human-readable documentation anyway to
know how to call it. In a typical REST approach, what you would provide
is documentation in HTML format, possibly including direct links to
your resources. Using the approach of having multiple representations,
you might actually have self-documenting resources — just do an HTTP
GET on a resource from your browser and get an HTML document containing
data as well as a list of the operations (HTTP verbs) you can perform
on it and the content types it accepts and delivers.
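One way to picture such a self-documenting resource is the sketch below, which selects a representation based on the Accept header. The resource layout, the field names and the represent function are assumptions made for this example, and the Accept handling is deliberately simplified.

```python
from html import escape

# Hypothetical resource description; the field names are assumptions of this sketch.
customer = {
    "data": {"name": "ACME Corp"},
    "methods": ["GET", "PUT", "DELETE"],
    "content_types": ["application/xml", "text/html"],
}


def represent(resource, accept):
    """Return (content_type, body) for the given Accept header value."""
    name = escape(resource["data"]["name"])
    # Simplification: a substring test instead of full Accept parsing with q-values.
    if "text/html" in accept:
        methods = "".join(f"<li>{m}</li>" for m in resource["methods"])
        types = ", ".join(resource["content_types"])
        body = (
            f"<h1>{name}</h1>"
            f"<p>Supported methods:</p><ul>{methods}</ul>"
            f"<p>Content types: {types}</p>"
        )
        return "text/html", body
    # Any other client gets the plain data representation.
    return "application/xml", f"<customer><name>{name}</name></customer>"


html_type, html_body = represent(customer, "text/html,application/xhtml+xml")
xml_type, xml_body = represent(customer, "application/xml")
```

A browser asking for text/html gets the data together with the documentation, while a program asking for XML gets only the data, both from the same URI.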
Finally, if you insist on using a description language for your RESTful service, you can either use the Web Application Description Language (WADL) or — within limitations — WSDL 2.0,
which according to its authors is able to describe RESTful services,
too. Neither WADL nor WSDL 2.0 is useful for describing hypermedia,
though — and given that this is one of the core aspects of REST, I’m
not at all sure they’re sufficiently useful.
3. Who would actually want to expose so much of their application’s implementation internals?
Another common concern is that resources are too low-level, i.e. an
implementation detail one should not expose. After all, won’t this put
the burden of using the resources to achieve something meaningful on
the client (the consumer)?
The short answer is: No. The implementation of a GET, PUT or any of
the other methods on a resource can be just as simple or complicated as
the implementation of a “service” or RPC operation. Applying REST
design principles does not mean you have to expose individual items
from your underlying data model — it just means that instead of
exposing your business logic in an operation-centric way, you do so in
a data-centric way.
A related concern is that not enabling direct access to resources
will increase security. This is based on an old fallacy known as
“security by obscurity”, and one can argue that in fact it’s the other
way round: By hiding which individual resources you access in your
application-specific protocol, you can no longer easily use the
infrastructure to protect them. By assigning individual URIs to
meaningful resources, you can e.g. use Apache’s security rules (as well
as rewriting logic, logging, statistics etc.) to work differently for
different resources. By making these explicit, you don’t decrease, you
increase your security.
4. REST works with HTTP only, it’s not transport protocol independent
First of all, HTTP is most emphatically not a transport
protocol, but an application protocol. It uses TCP as the underlying
transport, but it has semantics that go beyond it (otherwise it would
be of little use). Using HTTP as a mere transport is abusing it.
Secondly, abstraction is not always a good idea. Web services take
the approach of trying to hide many very different technologies under a
single abstraction layer — but abstractions tend to leak. For example,
there is a huge difference between sending a message via JMS or as an
HTTP request. Trying to dumb widely different options down to their
least common denominator serves no one. An analogy would be to create a
common abstraction that hides a relational database and a file system
under a common API. Of course this is doable, but as soon as you
address aspects such as querying, the abstraction turns into a problem.
Finally, as Mark Baker once put it: “Protocol independence is a bug,
not a feature”. While this may seem strange at first, you need to
consider that true protocol independence is impossible to achieve — you
can only decide to depend on a different protocol that may or may not
be on a different level. Depending on a widely accepted, officially
standardized protocol such as HTTP is not really a problem. This is
especially true if it is much more widespread and better supported than the
abstraction that tries to replace it.
5. There is no practical, clear & consistent guidance on how to design RESTful applications
There are many aspects of RESTful design where there are no
“official” best practices, no standard way to solve a particular
problem using HTTP in a way conforming to the REST principles. There is
little doubt that things could be better. Still, REST embodies many
more application concepts than WSDL/SOAP-based web services. In other
words: while this criticism has a lot of value to it, it’s far more
relevant for the alternatives (which basically offer you no guidance at
all).
Occasionally, this doubt comes up in the form of “even the REST
experts can’t agree how to do it”. In general, that’s not true — for
example, I tend to believe that the core concepts I described here
a few weeks ago haven’t been (nor will they be) disputed by any member
of the REST community (if we can assume there is such a thing), not
because it’s a particularly great article, but simply because there is
a lot of common understanding once people have learned a little more
than the basics. If you have a chance to run an experiment, see
whether it’s easier to get five SOA proponents to agree on anything
than to get five REST proponents to do so. Based on past
experience and long participation in several SOA and REST discussion
groups, I’d tend to bet my money on the REST folks.
6. REST does not support transactions
The term “transaction” is quite overloaded, but in general, when
people talk about transactions, they refer to the ACID variety found in
databases. In an SOA environment — whether based on web services or
HTTP only — each service (or system, or web app) implementation is
still likely to interact with a database that supports transactions: no
big change here, except you’re likely to create the transaction
explicitly yourself (unless your service runs in an EJB container or
another environment that handles the transaction creation for you). The
same is true if you interact with more than one resource.
Things start to differ once you combine (or compose, if you prefer)
transactions into a larger unit. In a Web services environment, there
is at least an option to make things behave similarly to what people
are used to from 2PC scenarios as supported e.g. in a Java EE
environment: WS-Atomic Transaction (WS-AT), which is part of the WS-Coordination
family of standards. Essentially, WS-AT implements something very
similar or equal to the 2PC protocol specified by XA. This means that
your transaction context will be propagated using SOAP headers, and
your implementation will take care of ensuring the resource managers
hook into an existing transaction. Essentially, this is the same model an EJB
developer is used to: your distributed transaction behaves just as
atomically as a local one.
There are lots of things to say about, or rather against, atomic transactions in an SOA environment:
- Loose coupling and transactions, especially those of the
ACID variety, simply don’t match. The very fact that you are
co-ordinating a commit across multiple independent systems creates a
pretty tight coupling between them.
- Being able to do this
co-ordination requires central control over all of the services — it’s
very unlikely, probably impossible to run a 2PC transaction across
company boundaries.
- The infrastructure required to support this is usually quite expensive and complicated.
For the most part, the need for ACID transactions in an SOA or REST
environment is actually a design smell — you’ve likely modeled your
services or resources the wrong way. Of course, atomic transactions are
just one type of transaction — there are extended transaction models
that might be a better match for loosely-coupled systems. They haven’t
seen much adoption yet, though — not even in the Web services camp.
7. REST is unreliable
It’s often pointed out that there is no equivalent to WS-ReliableMessaging
for RESTful HTTP, and many conclude that because of this, it can’t be
applied where reliability is an issue (which translates to pretty much
every system that has any relevance in business scenarios). But very
often what you want is not necessarily some infrastructure component
that handles message delivery; rather, you need to know whether a message has been delivered or not.
Typically, receiving a response message — such as a simple 200 OK in
case of HTTP — means that you know your communication partner has
received the request. Problems occur when you don’t receive a response:
You don’t know whether your request has never reached the other side,
or whether it has been received (resulting in some processing) and it’s
the response message that got lost.
The simplest way to ensure the request message reaches the other
side is to re-send it, which is of course only possible if the receiver
can handle duplicates (e.g. by ignoring them). This capability is
called idempotency. HTTP guarantees that GET, PUT and DELETE are
idempotent — and if your application is implemented correctly, a client
can simply re-issue any of those requests if it hasn’t received a
response. A POST message is not idempotent, though — at least there are
no guarantees in the HTTP spec that say it is. You are left with a
number of options: You can either switch to using PUT (if your
semantics can be mapped to it), use a common best practice described by Joe Gregorio, or adopt any of the existing proposals that aim to standardize this (such as Mark Nottingham’s POE, Yaron Goland’s SOA-Rity, or Bill de hÓra’s HTTPLR).
Personally, I prefer the best-practice approach — i.e., turn the
reliability problem into an application design aspect, but opinions on
this differ quite a bit.
While any of these solutions address a good part of the reliability
challenge, there is nothing — or at least, nothing that I’m aware of —
that would support delivery guarantees such as in-order delivery for a
sequence of HTTP requests and responses. It might be worth pointing
out, though, that many existing SOAP/WSDL scenarios get by without
WS-ReliableMessaging or any of its numerous predecessors, too.
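The re-send strategy for idempotent requests described in this section can be sketched as follows. LossyServer simulates a server whose first response gets lost, and put_with_retry is a hypothetical client helper; both are assumptions made for illustration and are not taken from any of the proposals named above.

```python
class LossyServer:
    """Simulated server whose first response is lost on the way back (illustrative)."""

    def __init__(self):
        self.resources = {}
        self.requests_seen = 0

    def put(self, uri, representation):
        self.requests_seen += 1
        # PUT sets the resource state; repeating it leaves the same state.
        self.resources[uri] = representation
        if self.requests_seen == 1:
            raise TimeoutError("response lost")
        return 200


def put_with_retry(server, uri, representation, attempts=3):
    """Re-send an idempotent PUT until a response arrives or attempts run out."""
    for _ in range(attempts):
        try:
            return server.put(uri, representation)
        except TimeoutError:
            continue  # no response: we cannot tell what was lost, so send again
    raise TimeoutError(f"no response after {attempts} attempts")


server = LossyServer()
status = put_with_retry(server, "/orders/42", {"item": "book", "quantity": 1})
```

The server processes the request twice, yet the resulting state is the same as if it had arrived once, which is why the client is free to repeat it.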
8. No pub/sub support
REST is fundamentally based on a client-server model, and HTTP
always refers to a client and a server as the endpoints of
communication. A client interacts with a server by sending requests and
receiving responses. In a pub/sub model, an interested party subscribes
to a particular category of information and gets notified each time
something new appears. How could pub/sub be supported in a RESTful HTTP
environment?
We don’t have to look far to see a perfect example of this: it’s called syndication, and RSS and Atom Syndication
are examples of it. A client queries for new information by issuing an
HTTP GET against a resource that represents the collection of changes, e.g.
for a particular category or time interval. This would be extremely
inefficient, but isn’t, because GET is the most optimized operation on
the Web. In fact, you can easily imagine that a popular weblog server
would have to scale up much more if it had to actively notify each
subscribed client individually about each change. Notification by
polling scales extremely well.
You can extend the syndication model to your application resources —
e.g., offer an Atom feed for changes to customer resources, or an audit
trail of bookings. In addition to being able to satisfy a basically
unlimited number of subscribing applications, you can also view these
feeds in a feed reader, similarly to viewing a resource’s HTML
representation in your browser.
Of course, this is not a suitable answer for some scenarios. For
example, soft realtime requirements might rule this option out, and
another technology might be more appropriate. But in many cases, the
mixture of loose coupling, scalability and notification enabled by the
syndication model is an excellent fit.
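One of the optimizations that makes polling cheap is the conditional GET, where the client sends back the entity tag it last saw and the server answers 304 (Not Modified) if nothing has changed. The text above does not name this mechanism; the sketch below adds it as an illustration, and the Feed class and its hash-based ETag scheme are assumptions.

```python
import hashlib


class Feed:
    """In-memory feed resource; entries and the ETag scheme are illustrative."""

    def __init__(self):
        self.entries = []

    def _etag(self):
        digest = hashlib.sha256("\n".join(self.entries).encode("utf-8")).hexdigest()
        return f'"{digest[:16]}"'

    def get(self, if_none_match=None):
        """Return (status, etag, entries); 304 with no body if nothing changed."""
        etag = self._etag()
        if if_none_match == etag:
            return 304, etag, None
        return 200, etag, list(self.entries)


feed = Feed()
feed.entries.append("customer 17 updated")

status1, etag, entries = feed.get()                   # first poll: full feed
status2, _, unchanged = feed.get(if_none_match=etag)  # nothing new since then
feed.entries.append("customer 23 created")
status3, new_etag, updated = feed.get(if_none_match=etag)  # feed has changed
```

A poll that finds nothing new costs little more than a header exchange, and intermediaries can answer many such polls without involving the origin server.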
9. No asynchronous interactions
Given HTTP’s request/response model, how can one achieve
asynchronous communication? Again, we have to be aware that there are
multiple things people mean when they talk about asynchronicity. Some
refer to the programming model, which can be blocking or non-blocking
independently of the wire interactions. This is not our concern here.
But how do you deliver a request from a client (consumer) to the server
(provider) where the processing might take a few hours? How does the
consumer get to know the processing is done?
HTTP has a specific response code, 202 (Accepted), the meaning of
which is defined as “The request has been accepted for processing, but
the processing has not been completed.” This is obviously exactly what
we’re looking for. Regarding the result, there are multiple options:
The server can return a URI of a resource which the client can GET to
access the result (although if it has been created specifically due to
this request, a 201 Created would probably be better). Or the client
can include a URI that it expects the server to POST the result to once
it’s done.
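The first of these options, answering 202 and letting the client GET a status resource later, might be sketched like this. ReportService, the /jobs path and the use of a Location header to carry the URI are assumptions of this example; the URI could equally be placed in the response body.

```python
class ReportService:
    """In-memory stand-in for a slow server; URIs and fields are made up."""

    def __init__(self):
        self._jobs = {}

    def post(self, request):
        """Accept the request and answer 202 with the URI of a status resource."""
        uri = f"/jobs/{len(self._jobs) + 1}"
        self._jobs[uri] = {"request": request, "state": "pending", "result": None}
        return 202, {"Location": uri}

    def finish(self, uri, result):
        """Called by background processing, possibly hours later."""
        self._jobs[uri].update(state="done", result=result)

    def get(self, uri):
        """Polling the status resource tells the client whether processing is done."""
        job = self._jobs[uri]
        return 200, {"state": job["state"], "result": job["result"]}


service = ReportService()
status, headers = service.post({"report": "annual"})
job_uri = headers["Location"]
_, before = service.get(job_uri)
service.finish(job_uri, "report ready")
_, after = service.get(job_uri)
```

The client is never blocked: it gets an immediate answer and decides for itself how often to check the status resource.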
10. Lack of tools
Finally, people often complain about the lack of tools available to
support RESTful HTTP development. As indicated in item #2, this is not
really true for the data aspect — you can use all of the data binding
and other data APIs you are used to, as this is a concern that’s
orthogonal to the number of methods and the means of invoking them.
Regarding plain HTTP and URI support, absolutely every programming
language, framework and toolkit on the planet supports them out of the
box. Finally, vendors are offering more and more (supposedly)
easier and better ways to support RESTful HTTP development in their
frameworks, e.g. Sun with JAX-RS (JSR 311) or Microsoft with the REST
support in .NET 3.5 or the ADO.NET Data Services Framework.
Conclusion
So: Is REST, and its most common implementation, HTTP, perfect? Of
course not. Nothing is perfect, definitely not for every scenario, and
most of the time not even for a single scenario. I’ve completely
ignored a number of very reasonable problem areas that require more
complicated answers, for example message-based security, partial
updates and batch processing, and I solemnly promise to address these
in a future installment. I still hope I have addressed some of the
doubts you have — and if I’ve missed the most important ones, you know
what the comments are for.
Stefan Tilkov
is the lead editor of InfoQ’s SOA community and co-founder, principal
consultant and chief RESTafarian of Germany/Switzerland-based innoQ.