6 Best Practices for J2EE Architecture
Leverage "in-the-trench" J2EE best practices
to improve the architecture and design of your existing and future J2EE
applications.
by Tarak Modi
Posted June 28, 2004
Numerous articles have discussed J2EE best practices. So, why am I writing another one,
and how is this article any different from—or better than—the others?
First, this article is aimed at practicing technical
architects. To avoid insulting anyone's intelligence, I'll avoid the cliché
best practices such as "build daily," "test everything,"
and "integrate often." Projects with architects worth their salt
would have well-defined team structures with properly delineated roles. They
would also have properly documented processes for conducting code reviews,
building the code (daily and on-demand), testing (unit, integration, and
system), deployment, and configuration/release management.
Second, I'll skip commonly touted best practices such as
"interface-based design," "use well-known design patterns,"
and "use service-oriented architecture." Instead, I'll focus on six
(out of many) "in-the-trench" lessons I have learned and followed
over the years. Finally, this article's intent is to get you thinking about
your architecture; providing working code samples or solutions is beyond this
article's scope. Without further ado, let's examine the six lessons.
Lesson 1: Never Shortcut Server-Side Validation
As a software consultant, I've had the opportunity
not only to design and implement Web applications, but to assess/audit many Web
applications as well. I often encounter Web pages within the application that
are sophisticated and packed with client-side JavaScript that performs
extensive checks on user-entered data. Even the HTML elements have data validation
attributes such as MAXLENGTH. The HTML form is submitted only upon successful
validation of all the entered data. As a result, the server side happily
performs the business logic once it receives the posted form (request).
Do you see the problem here? The developers have made
several major assumptions. For example, they assume all Web application users
will be equally honest. The developers also assume all users will always access
the Web application through the browser(s) they've tested. And the list goes
on. These developers have forgotten that it's easy to simulate browser-like
behavior through the command line using freely available tools. In fact, you
can submit almost any form by typing the appropriate URL, with the form fields
as query parameters, into the browser's address bar, although you can easily
prevent this kind of "form posting" by rejecting GET requests for these pages. But you can't prevent
people from simulating or even creating their own browsers to hack into your
system.
The underlying problem is that the developers have failed
to recognize the main difference between client-side validation and server-side
validation. That difference is not where the validation occurs (on the client
or on the server) but the purpose behind the validation.
Client-side validation is merely a convenience. It is
performed to provide the user with quick feedback—to make the application
appear responsive and give the illusion of a desktop application.
Server-side validation, on the other hand, is a must for
building secure Web applications. It ensures that all data the client sends to
the server is valid, no matter how the data was entered on the client side.
Thus, only server-side validation provides real
application-level security. Many developers fall
into the trap of a false sense of security by performing all data validation
only on the client side. Here's a common example that illustrates the point:
A typical logon page has a textbox to enter a username and
another textbox to enter a password. On the server side, one might encounter
some code in the receiving servlet that constructs a
SQL query of the form "SELECT * FROM SecurityTable
WHERE username = '" + form.getParameter("username") + "' AND password = '" + form.getParameter("password") + "';"
and executes it. If the query comes back with a row in the result set, the user
is successfully logged in. If not, the user is not logged in.
The first problem is the way the SQL is constructed, but
let's ignore that for now. What if the user types in a username such as "Alice'--"?
Assuming a user named "Alice" exists in SecurityTable, the user (or more
appropriately the "hacker") successfully logs in. I'll leave finding
out why this happens as an exercise for you.
Some creative client-side validation can prevent typical
users from doing this from the browser. But what about clients that have
JavaScript disabled, or advanced users (or hackers) who use another
browser-like program to send HTTP POST and GET requests directly?
Server-side validation is a must to prevent this type of
exploitation. SSL, firewalls, and the like won't help
you here.
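To make the point concrete, here is a minimal Java sketch of server-side checking for the logon example. It is an illustration under assumptions, not code from any audited application: the class name LoginChecker, the allow-list pattern, and both methods are hypothetical; only the SecurityTable name and its username and password columns come from the query above. The sketch validates the username on the server and binds both values through a JDBC PreparedStatement instead of concatenating them into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

/** Hypothetical server-side logon check; names and rules are illustrative only. */
final class LoginChecker {
    // Assumed allow-list: 3 to 32 letters, digits, dots, underscores, or hyphens.
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9._-]{3,32}");

    /** Runs on the server for every request, however the data was entered. */
    static boolean isValidUsername(String username) {
        return username != null && USERNAME.matcher(username).matches();
    }

    /**
     * Returns true if a matching row exists. The raw password comparison only
     * mirrors the query in the text; a real system would compare salted hashes.
     */
    static boolean credentialsMatch(Connection conn, String username, String password)
            throws SQLException {
        if (!isValidUsername(username) || password == null) {
            return false; // reject before touching the database
        }
        String sql = "SELECT 1 FROM SecurityTable WHERE username = ? AND password = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            // Bound parameters are sent as data and are never parsed as SQL.
            ps.setString(1, username);
            ps.setString(2, password);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

With a check like this, a username such as Alice'-- fails the allow-list test before any SQL runs, and even a value that passed the test would be treated as data because it is bound as a parameter.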
Lesson 2: Security is Not an Add-On
As I mentioned in Lesson 1, I have had the privilege of examining many Web
applications. A common theme I see is that all JavaServer Pages (JSP) files
have a layout similar to this pseudo-code:
<%
// getAttribute returns Object, so a cast is required
User user = (User) session.getAttribute("User");
if (user == null)
{
    // redirect to the logon page…
}
if (!user.role.equals("manager"))
{
    // redirect to the "unauthorized" page…
}
%>
<!--
HTML, JavaScript, and JSP
code to display data and
allow user interaction -->
If the project uses an MVC framework such as Struts, all
Action Beans have similar code as well. While ultimately this code works fine,
it presents a maintenance nightmare if, for example, you find a bug or you must
add a new role (such as "guest" or "admin").
Furthermore, all developers, no matter how junior, need to
be familiar with this coding pattern. Sure, you can clean up JSP code with some
JSP tags, and you can create a base Action Bean that cleans up the derived
Action Beans. Even so, the maintenance nightmares still remain because the
security-related code is spread out in multiple places. The Web application is
also more likely to contain vulnerabilities because security is enforced at the
application code level (by multiple developers) rather than at the architecture
level.
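One way to picture enforcement at the architecture level is a single policy object that one entry point (for example, a servlet filter or the MVC controller) consults for every request, so that pages and actions carry no role checks of their own. The following is only a sketch: the AccessPolicy class, its methods, and the example paths are hypothetical names invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical central policy: maps URL path prefixes to the roles allowed to use them. */
final class AccessPolicy {
    // Insertion order is preserved so more specific prefixes can be registered first.
    private final Map<String, Set<String>> rolesByPrefix = new LinkedHashMap<>();

    /** Registers the roles allowed under a path prefix; returns this for chaining. */
    AccessPolicy allow(String pathPrefix, String... roles) {
        rolesByPrefix.put(pathPrefix, new HashSet<>(Arrays.asList(roles)));
        return this;
    }

    /** Returns true only if the first matching prefix lists the given role. */
    boolean isAllowed(String path, String role) {
        for (Map.Entry<String, Set<String>> rule : rolesByPrefix.entrySet()) {
            if (path.startsWith(rule.getKey())) {
                return role != null && rule.getValue().contains(role);
            }
        }
        return false; // no rule matched: deny by default
    }
}
```

Adding a role such as "guest" or "admin" then becomes a change in one place, for example new AccessPolicy().allow("/reports/", "manager").allow("/", "manager", "guest"), rather than an edit to every page.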
More likely, the underlying problem is that security was
slapped onto the project near the end. I recently worked as the architect on a
project to be implemented in six releases over the course of more than a year,
and security wasn't even mentioned until the fourth release—even though the
project was exposing highly sensitive personal data over the Web. We engaged in
a battle with the project sponsors and their management to change the release
schedule to include all security-related functionality in Release 1 and move
some of the "business" capability into subsequent releases. We
ultimately won. We also have a happy client because it has an extremely secure
application that protects its customers' private data, a fact in which it takes
great pride.
In most applications, unfortunately, security does not appear to add any
real business value, so it gets swept under the rug until the end. When this
happens, security-related code just gets bolted on without any consideration of
the solution's long-term maintainability or robustness. Another symptom of this
security neglect is the absence of comprehensive server-side validation, which,
as I illustrated in Lesson 1, is an important part of a secure Web application.
Remember, security in a J2EE Web application is not just about using the
proper declarations in the web.xml or ejb-jar.xml file, or about using J2EE technologies such as
Java Authentication and Authorization Service (JAAS). It is about having a
well-thought-out plan and then implementing an architecture that supports it.
Lesson 3: I18N is Not Just a Buzzword Anymore
The reality of today's world is that non-native English speakers will access
your public Web application. This is especially true with e-government
initiatives that allow constituents (residents of a state) to interact with
their governmental agencies online. Examples include renewing your driver's
license or vehicle registration. Many people whose primary language is not
English will likely access such applications. Internationalization (or
"i18n" because there are 18 characters between the "i" and the "n" in
"internationalization") enables your application to work in multiple
languages.
Obviously if you have hard-coded text in your JSP pages, or if your Java
code returns hard-coded error messages, then you will have a tough time
creating a Spanish version of your Web application. However, text is not the
only piece that must be "externalized" in a Web application that
supports multiple languages. Graphics and images should also be configurable
because many images have text embedded in them. In extreme cases, images (or
colors) that mean one thing in one culture portray a completely different
meaning in another culture. Similarly, any Java code that formats numbers and
dates must be localized. But, here's the biggie: Your page layout might require
change as well.
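As a small illustration of localized number and date formatting, the sketch below takes the Locale as a parameter instead of hard-coding a format. The class and method names are hypothetical; NumberFormat and DateTimeFormatter are standard JDK classes, and the exact strings they produce depend on the JDK's locale data.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

/** Hypothetical helper: all formatting is driven by the caller's locale. */
final class LocalizedFormats {
    /** Formats a number with the grouping and decimal separators of the locale. */
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    /** Formats a date in the locale's long style (month names are localized). */
    static String formatDate(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
                .withLocale(locale)
                .format(date);
    }
}
```

Because the locale arrives as an argument, the same business code can serve an English and a Spanish user; only the value passed in (typically resolved once per request) changes.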
For example, if you use HTML tables to format and display your menu
options, application headers, or footers, then you might have to change the
column widths at a minimum and possibly some other aspects of the table for
each supported language. To accommodate varying colors and fonts, you might
have to use a separate stylesheet for each language.
It should be obvious by now that creating an "internationalizable"
Web application is an architectural challenge rather than an application
challenge. A well-architected Web application means that your JSP pages and all
business-related (application-specific) Java code are oblivious to the selected
locale. The moral here: Don't take internationalization for granted just
because Java and the J2EE platform support it. You must architect your solution
with internationalization in mind from day one.
Lesson 4: Avoid Common Mistakes With MVC Presentation
J2EE development has matured enough that most projects use some form of MVC
framework, such as Struts, on the presentation tier. A common theme I see in
such projects is the misuse of the MVC pattern. Here are a few examples.
A common misuse is that all the business logic is implemented in the model
layer (for example, in the Action Beans in Struts). Remember that the
presentation layer's model layer is still part of the presentation
layer. The proper way to use this model layer is to call the appropriate
business layer services (or objects) and forward the results to the view layer.
In design pattern terms, the MVC presentation layer's model should be
implemented as a Façade for the business layer. Better yet, use the Business
Delegate pattern discussed in Core J2EE Patterns. This excerpt from the book
elegantly summarizes the gist and benefits of implementing your model as a
Business Delegate:
The Business Delegate acts as a client-side business
abstraction; it provides an abstraction for, and thus hides, the implementation
of the business services. Using a Business Delegate reduces the coupling
between presentation-tier clients and the system's business services. Depending
on the implementation strategy, the Business Delegate may shield clients from
possible volatility in the implementation of the business service API.
Potentially, this reduces the number of changes that must be made to the
presentation-tier client code when the business service API or its underlying
implementation changes.
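The idea in the excerpt can be sketched in a few lines of Java. This is an illustrative outline, not the book's reference implementation: AccountService, AccountDelegate, and AccountUnavailableException are hypothetical names, and the service stands in for whatever business-tier API a project actually has.

```java
/** Hypothetical business-tier service; in a real system this might be a remote call. */
interface AccountService {
    long balanceInCents(String accountId) throws Exception;
}

/** Presentation-tier exception that hides business-tier failure details. */
final class AccountUnavailableException extends RuntimeException {
    AccountUnavailableException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Business Delegate: the only business-facing class the presentation tier uses. */
final class AccountDelegate {
    private final AccountService service;

    AccountDelegate(AccountService service) {
        this.service = service;
    }

    long balanceInCents(String accountId) {
        try {
            return service.balanceInCents(accountId);
        } catch (Exception e) {
            // Translate service-level failures so callers do not depend on them.
            throw new AccountUnavailableException("Account lookup failed: " + accountId, e);
        }
    }
}
```

An action class would call new AccountDelegate(service).balanceInCents(id) and forward the result to the view; if the service API changes, only the delegate needs to change.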
Another common mistake is putting a lot of
presentation-type logic in the model layer. For example, if the JSP page needs
the date formatted in a specific way or the data ordered in a specific manner, some
would place that logic in the model layer, which is the wrong place for this
logic. It should actually be in a set of helper classes the JSP pages use. The
Action Bean should forward the data to the view layer as the business layer
returns it. This allows flexibility in supporting multiple view layers (JSP,
Velocity, XML, and so on) without creating unnecessary coupling between the
model and the view. It also allows the view to decide the best way to display
the data to the user.
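A view helper of the kind described above might look like the following sketch. The class name OrderViewHelper, the date pattern, and the sort order are assumptions chosen for illustration; the point is only that the action forwards raw values and the view-side helper decides how to display them.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical view helper: display decisions live here, not in the action. */
final class OrderViewHelper {
    // Assumed display pattern for this particular view.
    private static final DateTimeFormatter DISPLAY_DATE =
            DateTimeFormatter.ofPattern("dd/MM/yyyy");

    /** Formats a date the way this view wants to show it. */
    static String displayDate(LocalDate date) {
        return DISPLAY_DATE.format(date);
    }

    /** Returns a sorted copy (newest first); the list the model supplied is left unchanged. */
    static List<LocalDate> newestFirst(List<LocalDate> dates) {
        List<LocalDate> sorted = new ArrayList<>(dates);
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }
}
```

A different view technology (Velocity or XML output, for instance) could use its own helper with a different pattern or ordering without any change to the action.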
Finally, most MVC applications I've seen have an
under-utilized controller. For example, most Struts applications will create a
base Action class and perform all security-related functions there. All other
Action Beans are derived from this base class. This functionality should be
part of the controller because if the security conditions are not met, then the
call should never reach the Action Bean (or model) in the first place.
Remember, one of the most powerful features of well-designed MVC frameworks is
the presence of a robust and extensible controller. You should leverage this
power to your advantage.
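The difference between checking security in a base action and checking it in the controller can be shown with a stripped-down front controller. This is a sketch in plain Java rather than Struts code: FrontController, its methods, and the view names are hypothetical, and actions are modeled as simple functions that return a view name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical front controller: runs the security check before any action is invoked. */
final class FrontController {
    // Each action takes the caller's role and returns the name of the view to render.
    private final Map<String, Function<String, String>> actions = new HashMap<>();
    private final BiPredicate<String, String> accessCheck; // (path, role) -> allowed?

    FrontController(BiPredicate<String, String> accessCheck) {
        this.accessCheck = accessCheck;
    }

    void register(String path, Function<String, String> action) {
        actions.put(path, action);
    }

    /** Returns the name of the view to render for this request. */
    String handle(String path, String role) {
        Function<String, String> action = actions.get(path);
        if (action == null) {
            return "notFound";
        }
        if (!accessCheck.test(path, role)) {
            return "unauthorized"; // the action is never called
        }
        return action.apply(role);
    }
}
```

Because the check sits in handle, an unauthorized request never reaches the action, and individual actions no longer need to inherit security behavior from a base class.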
Lesson 5: Don't Be Embarrassed by POJOs
I have witnessed many projects that use Enterprise JavaBeans for the sake of
using Enterprise JavaBeans. Sometimes it's the coolness factor, because EJBs appear to give the project an air of superiority and
self-importance. At other times it arises from confusion about the difference
between J2EE and EJB. Remember that EJB and J2EE are not synonyms. EJB is only
one part of J2EE. J2EE is a set of many technologies, including JSP, servlets, Java Message Service (JMS), Java Database
Connectivity (JDBC), JAAS, Java Management Extensions (JMX), and EJBs. J2EE is also a set of guiding principles and patterns
on how to use these technologies together to create solutions.
If you use EJBs when they are not
required, they can hurt your application's performance. EJBs
typically require a more demanding application server than a plain old Web
server. They typically consume more memory and CPU time because of all the
value-added services they provide. Many applications don't require these
services, and the application server consequently competes with the application
for resources.
In some cases, unnecessary EJB use can even cause your
applications to break. For example, I recently came across an application
developed on an open source application server. The business logic was
encapsulated in a series of stateful session beans (EJBs). The developers had worked hard to completely disable
"passivation" of these beans in the
application server. The client wanted the application deployed in a commercial
application server that was part of the client's technology stack. This
application server did not allow turning passivation
off. In fact, the client did not want any changes to its corporate application
server settings. As a result, the vendor had a big problem on its hands. The
(almost) funny thing is that the vendor couldn't provide a good reason why it
even implemented the code as EJBs (and stateful session beans at that). Not only did the vendor
suffer from performance problems, but its application did not work at the
client site.
Plain Old Java Objects, or POJOs,
are powerful alternatives to EJBs in Web
applications. They are lightweight and don't carry all the extra baggage
associated with EJBs. In my opinion, many EJB
benefits such as object pooling are overrated. Don't be embarrassed by POJOs; they are your friends.
Lesson 6: Data Access Does Not Mandate O/R Mapping
All Web applications I have worked with that provided user value accessed data
from somewhere and hence required a data access layer. That does not mean all
the projects identified and delineated such a layer; it simply means that such
a layer existed either implicitly or explicitly. In the case of an implicit
data layer, the data layer was part of the business object layer (or business
services). This works for small applications, but it goes against generally
accepted architecture guidelines for larger projects.
In general, a data access layer must meet or exceed these
four criteria:
Enables transparency
Business objects can use the data source without knowing the specific details
of the data source implementation. Access is transparent because the
implementation details are hidden inside the data access layer.
Enables easier migration
A data access layer makes it easier for an application
to migrate to a different database implementation. The business objects have no
knowledge of the underlying data implementation, so the migration involves
changes only to the data access layer. Further, if you're employing a factory
strategy, you can provide a concrete factory implementation for each underlying
storage implementation. In that case, migrating to a different storage
implementation means providing a new factory implementation to the application.
Reduces code complexity in business objects
Because the data access layer manages all the data
access complexities, it simplifies the code in the business objects and other
data clients that use the data access layer. The data access layer, not the
business object, contains all implementation-related code (such as SQL
statements). Benefits include higher developer productivity, better
maintainability, and improved code readability.
Centralizes all data access into a separate layer
Because all data access operations are now delegated
to the data access layer, you can view the separate data access layer as the
layer that can isolate the rest of the application from the data access
implementation. This centralization makes the application easier to maintain
and manage.
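The four criteria above can be illustrated with a small data access layer built from plain Java objects. This is a sketch under assumptions, not a prescribed design: CustomerDao, InMemoryCustomerDao, DaoFactory, and GreetingService are hypothetical names, and the in-memory map stands in for a real data source so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical DAO interface: the only data access API the business objects see. */
interface CustomerDao {
    Optional<String> findNameById(int id);
    void save(int id, String name);
}

/** One concrete storage implementation; a JDBC-backed one would hide its SQL the same way. */
final class InMemoryCustomerDao implements CustomerDao {
    private final Map<Integer, String> rows = new HashMap<>();

    public Optional<String> findNameById(int id) {
        return Optional.ofNullable(rows.get(id));
    }

    public void save(int id, String name) {
        rows.put(id, name);
    }
}

/** Factory: migrating to other storage means returning a different implementation here. */
final class DaoFactory {
    static CustomerDao customerDao() {
        return new InMemoryCustomerDao();
    }
}

/** Business object: contains no SQL and no knowledge of where the data lives. */
final class GreetingService {
    private final CustomerDao dao;

    GreetingService(CustomerDao dao) {
        this.dao = dao;
    }

    String greet(int customerId) {
        return dao.findNameById(customerId)
                .map(name -> "Hello, " + name)
                .orElse("Hello, guest");
    }
}
```

The business object depends only on the interface, so each query the application needs becomes one method on the DAO, and no O/R mapping tool is involved.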
Note that none of these criteria explicitly call out the
need for an O/R (object-to-relational) mapping layer. An O/R mapping layer,
typically created with an O/R mapping tool, provides an object look-and-feel to
a relational data structure. In my opinion, using O/R mapping is similar to
using EJBs on a project. In most cases, it is simply
not required. O/R mapping can become quite complex for a relational database
with even moderate levels of joins and many-to-many relationships. Add to that
the inherent complexity of O/R mapping solutions themselves, such as lazy
loading and caching, and you have introduced quite a bit of complexity (and
risk) to your project.
To further support my point, I'll point out the many failed
attempts by Sun Microsystems to popularize its Entity Beans (an implementation
of O/R mapping), which have been plagued with problems since version 1.0. In
Sun's defense, some of the earlier problems involved vendors' implementations
of the EJB specification. This, in turn, speaks to the complexity of the Entity
Beans specification itself. As a result, many J2EE architects agree
that staying away from Entity Beans is a good idea.
Most applications have a finite number of queries they run
on their data. An efficient way of accessing the data in such applications is
to implement a data access layer that exposes a series of services (or objects,
or APIs) that execute these queries. As I mentioned earlier, O/R mapping is
simply not required in such cases. O/R mapping works well when you require
query flexibility, but remember that this additional flexibility does not come
for free.
As promised, I kept my distance from parroting cliché best
practices in this article. Instead, I focused and offered my opinions on the
most significant decisions every architect on a J2EE project must make.
Ultimately, you should remember that J2EE is not about any specific technology
or about how many acronyms you can force-fit into the solution. Rather, you
should use the right technology at the right place and right time, and follow
the guidelines and practices embodied within J2EE, which are more important than
the technology itself.
About the Author
Tarak Modi is a senior specialist with North Highland, a management and
technology consulting company. His professional experience includes working
with COM, MTS, COM+, .NET, J2EE, and CORBA. He is a coauthor of Professional
Java Web Services (Wrox Press, 2002). Visit his personal Web site at
http://www.tekNirvana.com.