from http://java.dzone.com/articles/intro-rest
The bulk of my career has been spent working with and implementing
distributed middleware. In the mid-90s I worked for the parent
company of Open Environment Corporation working on DCE tools. Later
on, I worked for Iona developing their next generation CORBA ORB.
Currently, I work for the JBoss division of Red Hat, which is
entrenched in Java middleware. So, you could say that I have a
pretty rich perspective when it comes to middleware.
Over a year ago, I was exposed to a new way of writing Web
Services called REST. REST is about using the principles of the
World Wide Web to build applications. REST stands for
REpresentational State Transfer and was first defined in a PhD
thesis by Roy Fielding. REST is a set of architectural principles
that address the following questions:
- Why is the World Wide Web so prevalent and ubiquitous?
- What makes the Web scale?
- How can I apply the architecture of the Web to my own applications?
While REST has many similarities to the more traditional ways of
writing SOA applications, in many important ways it is very
different. You would think that my background would be an asset to
understanding this new way of creating web services, but
unfortunately this was not the case. The
reason is that some of the concepts of REST are hard to swallow,
especially if you have written successful SOAP or CORBA applications.
If your career has a foundation in one of these older technologies,
there's a bit of emotional baggage you will have to overcome. For
me, it took a few months and a lot of reading. For you, it may be
easier. Others will never pick REST over something like
SOAP and WS-*. I just ask that you keep an open mind and do some
research if I fail to convince you that REST is an intriguing
alternative to WS-*. So...
RESTful Architectural Principles
REST isn't protocol specific, but when
people talk about REST they usually mean REST over HTTP.
Technologies like SOAP use HTTP strictly as a transport protocol and
thus use a very small subset of its capabilities. Many would say
that WS-* uses HTTP solely to tunnel through firewalls. HTTP is
actually a very rich application protocol which gives us things like
content negotiation and distributed caching. RESTful web services
try to leverage HTTP in its entirety using specific architectural
principles. What are those RESTful principles?
- Addressable Resources. Every
“thing” on your network should have an ID. With REST over HTTP,
every object will have its own specific URI.
- A Uniform, Constrained Interface.
When applying REST over HTTP, stick to the methods provided by the
protocol. This means following the meaning of GET, POST, PUT, and
DELETE religiously.
- Representation oriented. You
interact with services using representations of that service. An
object referenced by one URI can have different formats available.
Different platforms need different formats. AJAX may need JSON. A
Java application may need XML.
- Communicate statelessly.
Stateless applications are easier to scale.
Let's go into more detail on each of
these individual principles.
Addressability
Addressability is the idea that every
object and resource in your system is reachable through a unique
identifier. This seems like a no-brainer, but, if you think about
it, standardized object identity isn't available in many
environments. If you have tried to implement a portable J2EE
application you probably know what I mean. In J2EE, distributed and
even local references to services are not standardized, which makes
portability really difficult. This isn't such a big deal for one
application, but with the new popularity of SOA, we're heading to a
world where disparate applications must integrate and interact. Not
having something as simple as service addressability standardized
adds a whole complex dimension to integration efforts.
In the REST world, addressability is handled through the use of
URIs. URIs are standardized and well-known. Anybody who has ever
used a browser is familiar with URIs. From a URI we know the
object's protocol. In other words, we know how to communicate with
the object. We know its host and port, that is, where it is on the
network. Finally, we know the resource's path on its host, which is
its identity on the server where it resides.
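To make those three pieces concrete, here is a minimal Python sketch that splits a URI into protocol, network location, and path using the standard library. The URI is the customer link from Figure 1-1; falling back to port 80 when none is written is my assumption, based on HTTP's default.

```python
from urllib.parse import urlsplit

# The customer URI used in Figure 1-1 of this article.
uri = "http://customers.myintranet.com/customers/32133"
parts = urlsplit(uri)

protocol = parts.scheme      # how to communicate with the resource
host = parts.hostname        # where it is on the network
port = parts.port or 80      # no explicit port in the URI, so assume HTTP's default
path = parts.path            # the resource's identity on that server

print(protocol, host, port, path)
```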
Using a unique URI to identify each of your services makes each of
your resources linkable. Service references can be embedded in
documents or even emails. For instance, consider the situation where
somebody calls your company's help desk with a problem with your SOA
application. They can email the developers a link to the exact
service they had problems with.
Furthermore, the data which services publish can also be composed
into larger data streams fairly easily.
Figure 1-1
<order id="111">
  <customer>http://customers.myintranet.com/customers/32133</customer>
  <order-entries>
    <order-entry>
      <quantity>5</quantity>
      <product>http://products.myintranet.com/products/111</product>
      ...
In this example, we have an XML document that describes an
e-commerce order entry. We can reference data provided by different
divisions in a company. From this reference we can not only obtain
information about the linked customer and products that were bought,
but we also have the identifier of the service this data comes from.
We know exactly where we can further interact with and manipulate
this data if we so desire.
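As a rough illustration of how a client might follow those references, the Python sketch below collects every link embedded in an order document. The closing tags in `ORDER_XML` are my assumption, since the listing in Figure 1-1 is truncated, and `linked_resources` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Figure 1-1 with closing tags added. The closing tags are an assumption;
# the original listing is truncated.
ORDER_XML = """
<order id="111">
  <customer>http://customers.myintranet.com/customers/32133</customer>
  <order-entries>
    <order-entry>
      <quantity>5</quantity>
      <product>http://products.myintranet.com/products/111</product>
    </order-entry>
  </order-entries>
</order>
"""


def linked_resources(xml_text):
    """Return (tag, uri) pairs for every element whose text is an http(s) URI."""
    root = ET.fromstring(xml_text.strip())
    links = []
    for element in root.iter():          # walks the tree in document order
        text = (element.text or "").strip()
        if text.startswith(("http://", "https://")):
            links.append((element.tag, text))
    return links


links = linked_resources(ORDER_XML)
for tag, uri in links:
    print(tag, "->", uri)
```

Each URI found this way tells the client both what the data is and which service to talk to next.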
The Uniform, Constrained Interface
The REST principle of a constrained interface is perhaps the hardest pill
for an experienced CORBA or SOAP developer to swallow. The idea
behind it is that you stick to the finite set of operations of the
application protocol you're distributing your services upon. For
HTTP, this means that services are restricted to using the methods
GET, PUT, DELETE, and POST. Let's explain each of these methods:
- GET is a read-only operation. It is both an idempotent
and safe operation. Idempotent means that no matter how many times you apply
the operation, the result is always the same. The act of reading an
HTML document shouldn't change the document. Safe means that
invoking a GET does not change the state of the server at all.
Other than request load, the operation will not affect the
server.
- PUT is usually modeled as an insert or update. It is also idempotent.
When using PUT, the client knows the identity of the resource it is
creating or updating. It is idempotent because sending the same PUT
message more than once has no further effect on the underlying service. An
analogy is an MS Word document that you are editing. No matter how
many times you click the “save” button, the file that stores
your document will logically be the same document.
- DELETE is used to remove resources. It is idempotent as well.
- POST is the only one of these operations that is neither idempotent
nor safe. It is a method where the constraints are relaxed to give
some flexibility to the user. In a RESTful system, POST usually models
a factory service. Whereas with PUT you know exactly which object you
are creating, with POST you are relying on a factory service to create
the object for you.
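The semantics of the four methods can be sketched with a small in-memory store. This is only an illustration in Python, not a real HTTP server: `ResourceStore` and its paths are hypothetical names, and the numbering scheme used by `post` is an assumption.

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory stand-in for a server's resources, keyed by path."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def get(self, path):
        # Safe and idempotent: reading never changes server state.
        return self._resources.get(path)

    def put(self, path, representation):
        # Idempotent: the client names the resource, so repeating the
        # same call leaves the store in the same state.
        self._resources[path] = representation

    def delete(self, path):
        # Idempotent: deleting an already-deleted resource changes nothing.
        self._resources.pop(path, None)

    def post(self, collection_path, representation):
        # Not idempotent: the store acts as a factory and picks a new
        # identity on every call.
        path = f"{collection_path}/{next(self._ids)}"
        self._resources[path] = representation
        return path


store = ResourceStore()
store.put("/orders/111", {"quantity": 5})
store.put("/orders/111", {"quantity": 5})          # duplicate PUT: same state
first = store.post("/orders", {"quantity": 1})
second = store.post("/orders", {"quantity": 1})    # duplicate POST: a second resource
```

Repeating the PUT leaves one order at `/orders/111`, while repeating the POST creates two distinct resources.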
You may be scratching your head and thinking, “How is it possible to
write a distributed service with only four methods?” Well... SQL only
has four operations: SELECT, INSERT, UPDATE, and DELETE. JMS and other
MOMs really only have two: send and receive. How powerful are both of
these tools? For both SQL and JMS, the complexity of the interaction
is confined purely to the data model. The addressability and
operations are well defined and finite, and the hard stuff is
delegated to the data model (in SQL's case) or the message body (in
JMS's case).
Why is the Uniform Interface Important?
Constraining
the interface for your web services has many more advantages than
disadvantages. Let's look at a few:
Familiarity
If you have a URI that points to a service, you know exactly which methods
are available on that resource. You don't need an IDL-like file
describing what methods are available. You don't need stubs. All you
need is an HTTP client library. If you have a document that is
composed of links to data provided by many different services, you
already know what method to call to pull in data from those links.
Interoperability
HTTP is a ubiquitous protocol. Most programming languages have an
HTTP client library available to them. So, if your web service is
exposed via REST, there is a very high probability that people who want
to use your service will be able to without any additional
requirements beyond being able to exchange the data formats the
service is expecting. With CORBA or WS-* you have to install
vendor-specific client libraries. How many of you have had the problem of
getting CORBA or WS-* vendors to interoperate? It has traditionally
been very problematic. The WS-* set of specifications has also been
a moving target over the years. So with WS-* and CORBA, you not only
have to worry about vendor interoperability, you have to make sure
that your client and server are using the same specification version.
With REST over HTTP, you don't have to worry about either of these
things and can just focus on understanding the data format of the
service. I like to think that you are focusing on what is really
important: application interoperability, rather than vendor
interoperability.
Scalability
Because REST constrains you to a well-defined set of methods, you have
predictable behavior which can have incredible performance benefits.
GET is the strongest example. Because GET is a read method that is
both idempotent and safe, browsers and HTTP proxies can cache
responses from servers, which can save a huge amount of network traffic
and hits to your website. Add the capabilities of HTTP 1.1's
Cache-Control header, and you have an incredibly rich way of defining
caching policies for your services.
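To give a feel for how a client can exploit this, here is a minimal Python sketch of a GET cache driven by the Cache-Control header. It is a simplification under stated assumptions: it only honors `max-age`, `no-store`, and `no-cache`, it ignores validation and every other directive, and `GetCache` and the `fetch` callable are hypothetical names standing in for a real HTTP client.

```python
import time


def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for item in value.split(","):
        name, _, argument = item.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"')
    return directives


class GetCache:
    """Tiny client-side cache for GET responses, keyed by URI."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: uri -> (headers, body)
        self._clock = clock
        self._entries = {}       # uri -> (expires_at, body)

    def get(self, uri):
        entry = self._entries.get(uri)
        if entry and self._clock() < entry[0]:
            return entry[1]      # still fresh: no network traffic at all
        headers, body = self._fetch(uri)
        directives = parse_cache_control(headers.get("Cache-Control", ""))
        cacheable = "no-store" not in directives and "no-cache" not in directives
        max_age = directives.get("max-age", "")
        if cacheable and max_age.isdigit():
            self._entries[uri] = (self._clock() + int(max_age), body)
        return body
```

This only works because GET is safe: replaying a cached response cannot skip a state change the server needed to see.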
It doesn't end with caching though. Consider both PUT and DELETE.
Because they are idempotent, neither the client nor the server has to
worry about handling duplicate message delivery. This saves a lot of
bookkeeping and complex code.
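One place this shows up is in client retry logic. The Python sketch below retries a request after a lost connection only when the method is idempotent. `send_with_retry` and the `send` callable are hypothetical, and treating `ConnectionError` as the only retryable failure is an assumption made to keep the example short.

```python
IDEMPOTENT_METHODS = {"GET", "PUT", "DELETE"}


def send_with_retry(method, send, attempts=3):
    """Call send(), retrying on ConnectionError only for idempotent methods.

    `send` is a hypothetical zero-argument callable that performs the request.
    """
    allowed = max(1, attempts) if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, allowed + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == allowed:
                raise            # out of attempts, or the method is not safe to repeat


# Simulate a request that fails twice before it succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("response lost")
    return "ok"


result = send_with_retry("PUT", flaky_request)
```

A POST gets a single attempt here, because the client cannot tell whether the first attempt already created something on the server.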
Representation Oriented
The third architectural principle of
REST is that your services should be representation oriented. Each
service is addressable through a specific URI and representations are
exchanged between the client and service. With a GET operation you
are receiving a representation of the current state of that resource.
A PUT or POST passes a representation of the resource to the server
so that the underlying resource's state can change.
In a RESTful system, the complexity of
the client-server interaction is within the representations being
passed back and forth. These representations could be XML, JSON,
YAML, or really any format you can come up with. One really cool
thing about HTTP is that it provides a simple content negotiation
protocol between the client and server. Through the Content-Type
header, the sender of a message specifies the type of the
representation in its body. With the Accept header, the client can
list its preferred response formats. AJAX clients can ask for JSON,
Java for XML, Ruby for YAML. Another thing this is very useful for is
versioning of services. The same service can be available through the
same URI with the same methods (GET, POST, etc.), and all that changes
is the MIME type. For example, the MIME type could be
“application/xml” for an old service while newer services could
exchange “application/xml;schemaVersion=1.1” MIME types.
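Here is a minimal Python sketch of the server side of that negotiation: it ranks the media types in an Accept header and renders the same order in whichever supported format the client prefers. The `RENDERERS` table, the sample order, and the choice of XML as the default for `*/*` are all assumptions for illustration, and partial wildcards such as `application/*` are not handled.

```python
import json
from xml.sax.saxutils import escape, quoteattr


def parse_accept(header):
    """Return the acceptable media types from an Accept header, best first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0                       # q defaults to 1 when omitted
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if media_type and quality > 0:      # q=0 means "not acceptable"
            ranked.append((-quality, position, media_type.lower()))
    return [media_type for _, _, media_type in sorted(ranked)]


def to_json(order):
    return json.dumps(order, sort_keys=True)


def to_xml(order):
    return "<order id=%s><quantity>%s</quantity></order>" % (
        quoteattr(str(order["id"])), escape(str(order["quantity"])))


# Hypothetical table of the representations this service can produce.
RENDERERS = {"application/json": to_json, "application/xml": to_xml}


def negotiate(accept_header, order):
    """Return (content_type, body) for the best match, or None if nothing fits."""
    for media_type in parse_accept(accept_header):
        if media_type == "*/*":
            media_type = "application/xml"  # assumed server default
        renderer = RENDERERS.get(media_type)
        if renderer:
            return media_type, renderer(order)
    return None                             # a real server would answer 406


order = {"id": 111, "quantity": 5}
content_type, body = negotiate("application/json, application/xml;q=0.5", order)
```

The resource and its URI stay the same; only the representation chosen for the response changes.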
All in all, because REST and HTTP
have a layered approach to addressability, method choice, and data
format, you have a much more decoupled protocol that allows your
service to interact with a wide variety of different clients in a
consistent way.
Communicate Statelessly
The last RESTful principle I will
discuss is the idea of statelessness. When I talk about
statelessness though, I don't mean that your applications can't have
state. In REST, stateless means that there is no client session data
stored on the server. The server only records and manages the state
of the resources it exposes. If there needs to be session specific
data, it should be held and maintained by the client and transferred
to the server with each request as needed. A service layer that does
not have to maintain client sessions is a lot easier to scale as it
has to do a lot less expensive replications in a clustered
environment. It's a lot easier to scale out, as all you have to do is
add machines.
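A small Python sketch may help here. The handler below depends only on the incoming request and on resource state, so any machine in a cluster could answer any call, while the client carries its own position in the list. The `PRODUCTS` data, the `offset` and `limit` parameter names, and the dict-shaped request are all assumptions for illustration.

```python
# Resource state owned by the server (sample data, assumed for illustration).
PRODUCTS = ["p%03d" % n for n in range(1, 8)]


def list_products(request):
    """Handle one request using only the request itself and resource state.

    `request` is a hypothetical dict. The client sends its paging position
    every time, so the server keeps no session between calls.
    """
    offset = int(request.get("offset", 0))
    limit = int(request.get("limit", 3))
    page = PRODUCTS[offset:offset + limit]
    has_more = offset + limit < len(PRODUCTS)
    return {"items": page, "next_offset": offset + limit if has_more else None}


# The client holds the session state: where it currently is in the list.
client_state = {"offset": 0, "limit": 3}
seen = []
while client_state["offset"] is not None:
    response = list_products(client_state)   # any server instance could handle this
    seen.extend(response["items"])
    client_state["offset"] = response["next_offset"]
```

Because nothing about the client is remembered between calls, adding a machine requires no session replication.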
A world without server maintained
session data isn't so hard to imagine if you look back 12-15 years
ago. Back then many distributed applications had a fat GUI client
written in Visual Basic, PowerBuilder, or Visual C++ talking RPCs to
a middle-tier that sat in front of a database. The server was
stateless and just processed data. The fat client held all session
state. The problem with this architecture was an operational one. It
was very hard for operations to upgrade, patch, and maintain client
GUIs in large environments. Web applications solved this problem
because the applications could be delivered from a central server and
rendered by the browser. We started maintaining client sessions on
the server because of the limitations of the browser. Now, circa
2008, with the growing popularity of AJAX, Flex, and JavaFX, the
browsers are sophisticated enough to maintain their own session state
like their fat-client counterparts in the mid-90s used to do. We can
now go back to that stateless scalable middle tier that we enjoyed in
the past. It's funny how things come full circle sometimes.
Conclusion
REST identifies the key architectural principles behind why the
World Wide Web is so prevalent and scalable. The next step in the
evolution of the web is to apply these principles to the semantic web
and the world of web services. REST offers a simple, interoperable,
and flexible way of writing web services that can be very different
from the RPC mechanisms like CORBA and WS-* that so many of us have
been trained in.
This article is the first of a two-part series. In this article I
wanted to introduce you to the basic concepts of REST. In my next
article, “Putting Java to REST”, we will build a very simple RESTful
service in Java using the new JCP standard JAX-RS. In other words,
you'll get to see the theory being put into action.
Until then, I urge you to read more about REST. Below are some
interesting links.