Java开发人员可以做出的最重要的架构性决策之一就是如何使用Java异常模型。Java异常一直以来就是社群中许多争议的靶子。有人争论到,在Java语言中的异常检查已是一场失败的试验。本文将辨析,失败的原因不在于Java异常模型,而在于Java类库的设计者未能充分了解到方法失败的两个基本原因。
本文倡导一种对异常条件本质的思考方式,并描述一些有助于设计的模式。最后,本文还将在AOP模型中,作为相互渗透的问题,来讨论异常的处理。当你能正确使用异常时,它们会有极大的好处。本文将帮助你做到这一点。
为何异常是如此重要
Java应用中的异常处理在很大程度上揭示了其所基于架构的强度。架构是在应用程序各个层次上所做出并遵循的决定。其中最重要的一个就是决定应用程序中的类,亚系统,或层之间沟通的方式。Java异常是Java方法将另类执行结果交流出去的方式,所以值得在应用架构中给予特殊关注。
一个衡量Java设计师水平和开发团队纪律性的好方法就是读读他们应用程序里的异常处理代码。首先要注意的是有多少代码用于捕获异常,写进日志文件,决定发生了什么,和在不同的异常间跳转。干净,简捷,关联性强的异常处理通常表明开发团队有着稳定的使用Java异常的方式。当异常处理代码的数量甚至要超过其他代码时,你可以看出团队之间的交流合作有很大的问题(可能在一开始就不存在),每个人都在用他们自己的方式来处理异常。
对突发异常的处理结果是可以预见的。如果你问问团队成员为什么异常会被抛出,捕获,或在特定的一处代码里忽视了异常的发生,他们的回答通常是,“我没有别的可做”。如果你问当他们编写的异常真的发生了会怎么样,他们会皱皱眉,你得到的回答类似于这样,“我不知道。我们从没测试过。”
你可以从客户端的代码判断一个java的组件是否有效利用了java的异常。如果它们包含着大堆的逻辑去弄清楚在何时一笔操作失败了,为何失败,是否有弥补的余地,那么原因很有可能要归咎于组件的报错设计。错误的报错系统会在客户端产生大量的“记录然后忘掉”的代码,这些代码鲜有用途。最差的是弄拧的逻辑,嵌套的try/catch/finally代码块,和一些其他的混乱而导致脆弱而难于管理的应用程序。
事后再来解决Java异常的问题,或根本就不解决,是软件项目产生混乱并导致滞后的主要原因。异常处理是一个在设计的各个部分都急需解决的问题。对异常处理建立一个架构性的约定是项目中首要做出的决定。合理使用Java异常模型对确保你的应用简单,易维护,和正确有着长远的影响。
解析异常
正确使用Java异常模型所包含的内容一直以来有着很大的争议。Java不是第一种支持异常算法语义的;但是,它却是第一种通过编译器来执行声明和处理某些异常的规则的语言。许多人都认为编译时的异常检查对精确的软件设计颇有帮助。图1显示的Java异常的等级。
图1:Java异常的等级
通常,Java编译器强迫抛出基于java.lang.Throwable的异常的方法要在它声明中的“throws”部分加上那个异常。而且,编译器还会证实客户端的方法或者捕获已声明的异常,或者特别声明自己也抛出同样的异常。这些简单的规则对世界范围的Java程序员都有深远的影响。
编译器放松了对Throwable继承树中两个分支的异常检查。java.long.Error和java.lang.RuntimeException的子类免于编译时的检查。在这两类中,软件工程师通常对运行中异常更感兴趣。“不检查”的异常指的是这一组,以便和所有其它“检查”的异常区别开。
我可以想象那些接受“检查”的异常的人,也会很看重Java的数据类型。毕竟,编译器对数据类型施加的限制鼓励严格的编码和精确的思维。编译时的类型检查对减少运行时的严重问题有帮助。编译时的异常检查也能起到类似的作用,它会提醒开发人员某个方法可能会有预想不到的结果需要处理好。
早期的建议是尽可能的使用“检察的异常”,以此来最大限度的利用编译器提供的帮助来写出无错误的软件。Java类库API的设计者们都认同这一点,他们广泛地使用“检察的异常”来模拟类库方法中几乎所有的紧急应变措施。在J2SE5.1 API规格中,“检察的异常”类型已2比1的比率超过了“未检查的异常”类型。
对程序员而言,看上去在Java类库中大多数的常用方法对每一个可能的失败都声明了“检察的异常”。例如,java.io包
对IOException这个“检察的异常”就有着很大的依赖。至少有63个Java类库包,或直接,或通过十几个下面的子类,抛出这个异常。
I/O的失败极其稀有,但是却很严重。而且,一旦发生,从你所写的代码里基本上是无法补救的。Java程序员意识到他们不得不提供IOException或类似的不可补救的事件,而一个简单的Java类库方法的调用就可能让这些事件发生。捕获这些异常给本来简单的代码带来了一定的晦涩,因为即使在捕获的代码块里也基本上帮不上忙。但是不加以捕获又可能更糟糕,因为编译器要求你的方法必须要抛出那些异常。这样你的实施细则就不得不暴露在外了,而通常好的面向对象的设计都是要隐藏细节的。
这样一个不可能赢的局面导致了我们今天所警告的绝大多数臭名卓著的异常处理的颠覆性格局。同时也衍生了很多正确或错误的补救之道。
一些Java界的知名人物开始质疑Java的“检察的异常”的模型是否是一个失败的试验。有一些东西肯定是失败的,但是这和在Java语言里加入对异常的检查是毫无关联的。失败是由于在Java API的设计者们的思维里,大多数失败的情形是雷同的,所以可以通过同一种异常传达出去。
故障和应变
让我们来考虑在一个假想的银行应用中的CheckingAccount类。一个CheckingAcccount属于一个用户,记载着用户的存款余额,也能接受存款,接受止兑的通知,和处理汇入的支票。一个CheckingAcccount对象必须协调同步线程的访问,因为任何一个线程都可能改变它的状态。CheckingAcccount类里processCheck的方法会接受一个Check对象为参数,通常从帐户余额里减去支票的金额。但是一个管理支票清算的用户端程序调用processCheck方法时,必须有两种可能的应变措施。一,CheckingAccount对象里可能对该支票已有一个止付的命令;二,帐户的余额可能不足已满足支票的金额。
所以,processCheck的方法对来自客户端的调用可以有3种方式回应。正常的是处理好支票,并把方法签名里声明的结果返回给调用方。两种应变的回应则是需要与支票清算端沟通的在银行领域实实在在存在的情况。processCheck方法所有3种返回结果都是按照典型的银行支票帐户的行为而精心设计的。
在Java里,一个自然的方法来表示上述紧急的应变是定义两种异常,比如StopPaymentException(止付异常)和InsufficientFundsException(余额不足异常)。一个客户端如果忽略这些异常是不对的,因为这些异常在正常操作的情况下一定会被抛出。他们如同方法的签名一样反映了方法的全面行为。
客户端可以很容易的处理好这两种异常。如果对支票的兑付被停止了,客户端把该支票交付特别处理。如果是因为资金不足,用户端可以从用户的储蓄帐户里转移一些资金到支票帐户里,然后再试一次。
在使用CheckingAccount的API时,这些应变都是可以预计的和自然的结果。他们并不是意味着软件或运行环境的失败。这些异常和由于CheckingAccount类中一些内部实施细则引起的真正失败是不同的。
设想CheckingAccount对象在数据库里保持着一个恒定的状态,并使用JDBC API来对之访问。在那个API里,几乎所有的数据库访问方法都有可能因为和CheckingAccount实施无关的原因而失败。比如,有人可能忘了把数据库服务器运行起来,一个未有连上的网络数据线,访问数据库的密码改变了,等等。
JDBC依靠一种“检查的异常”,SQLException,来汇报任何可能的错误。可能出错的绝大多数原由都是数据库的配置,连接,或其所在的硬件设施。对processCheck方法而言,它对上述错误是无计可施的。这不应该,因为processCheck至少了解它自己的实施细则。在调用栈里上游的方法能处理这些问题的可能就更小。
CheckingAccount这个例子说明了一个方法不能成功返回它想要的结果的两个基本原因。这里是两个描述性的术语:
应变
与实际预料相符,一个方法给出另外一种回应,而这种回应可以表达成该方法所要达到的目的之一。这个方法的调用者预料到这个情况的出现,并有相对的应付之道。
故障
在未经计划的情况下,一个方法不能达到它的初衷,这是一个不诉诸该方法的实施细则就很难搞清的情况。
应用这些术语,对processCheck方法而言,一个止付的命令和一个超额的提取是两种可能的应变。而SQLException反映了可能的故障。processCheck方法的调用者应该能够提供应变,但却不一定能有效的处理好可能发生的故障。
Java异常的匹配
在建立应用架构中Java异常的规则时,以应变和故障的方式仔细考虑好“什么可能会出错”是有长远意义的。
条件
|
应变
|
故障
|
被考虑成
|
设计的一部分
|
一个糟糕的意外
|
预计到会发生
|
经常发生
|
绝不发生 |
关注方
|
上游对该方法的调用者
|
需要修好这个问题的人
|
举例 |
另一种返回方式
|
程序bug,硬件系统故障,配置错误,丢失的文件,服务器没有运行 |
最好的匹配
|
一个检查的异常
|
一个未检查的异常
|
应变情况恰如其分地匹配给了Java检查的异常。因为它们是方法的语义算法合同中不可缺少的一部分,在这里借助于编译器的帮助来确保它们得到解决是很有道理的。如果你发现编译器坚持应变的异常必须要处理或者在不方便的时候必须要声明会给你带来些麻烦,你在设计上几乎肯定要做些重构了。这其实是件好事。
出现故障的情况对开发人员而言是蛮有意思的,但对软件逻辑而言却并非如此。那些软件”消化问题“的专家们需要关于故障的信息以便来解决问题。因此,未检查的异常是表示故障的很好方式。他们让故障的通知原封不动地从调用栈上所有的方法滤过,到达一个专门来捕获它们的地方,并得到它们自身包含的有利于诊断的信息,对整个事件提供一个有节制的优雅的结论。产生故障的方法不需要来声明(异常),上游的调用方法不需要捕获它们,方法的实施细则被正确的隐藏起来- 以最低的代码复杂度。
新一些的Java API,比如像Spring架构和Java Data Ojects类库对检查的异常几乎没有依赖。Hibernate ORM架构在3.0版本里重新定义了一些关键功能来去除对检查的异常的使用。这就意味着在这些架构举报的绝大部分异常都是不可恢复的,归咎于错误的方法调用代码,或是类似于数据库服务器之类的底层部件的失败。特别的,强迫一个调用方来捕获或声明这些异常几乎没有任何好处。
设计里的故障处理
在你的计划里,承认你需要去做就迈好了有效处理好故障的第一步。对那些坚信自己能写出无懈可击的软件的工程师们来说,承认这一点是不容易的。这里是一些有帮助的思考方式。首先,如果错误俯拾即是,应用的开发时间将很长,当然前提是程序员自己的bug自己修理。第二,在Java类库中,过度使用检查的异常来处理故障情形将迫使你的代码要应对好故障,即使你的调用次序完全正确。如果没有一个故障处理的架构,凑合的异常处理将导致应用中的信息丢失。
一个成功的故障处理架构一定要达到下面的目标:
- 减少代码的复杂性
- 捕获和保存诊断性信息
- 对合适的人提醒注意
- 优雅地退出行动
故障是应用的真实意图的干扰。因此,用来处理它们的代码应尽量的少,理想上,把它们和应用的语义算法部分隔离开。故障的处理必须满足那些负责改正它们的人的需要。开发人员需要知道故障发生了,并得到能帮助他们搞清为何发生的信息。即使一个故障,在定义上而言,是不可补救的,好的故障处理会试着优雅地结束引起故障的活动。
对故障情况使用未检查的异常
在做框架上的决定时,用未检查的异常来代表故障情况是有很多原因的。Java的运行环境对代码的错误会抛出“运行时异常”的子类,比如,ArithmeticException或ClassCastException。这为你的框架设了一个先例。未检查的异常让上游的调用方法不需要为和它们目的不相关的情况而添加代码,从而减少了混乱。
你的故障处理策略应该认识到Java类库的方法和其他API可能会使用检查的异常来代表对你的应用而言只可能是故障的情况。在这种情形下,采用设计约定来捕获API异常,将其以故障来看待,抛出一个未检查的异常来指示故障的情况和捕获诊断的信息。
在这种情况下抛出的特定异常类型应该由你的框架来定义。不要忘记一个故障异常的主要目的是传递记录下来的诊断信息,以便让人们来想出出错的原因。使用多个故障异常类型可能有些过,因为你的架构对它们都一视同仁。多数情况下,一条好的,描述性强的信息将单一的故障类型嵌入就够用了。使用Java基本的RuntimeException来代表故障情况是很容易的。截止到Java1.4,RuntimeException,和其他的抛出类型一样,都支持异常的嵌套,这样你就可以捕获和报出导向故障的检查的异常。
你也许会为了故障报告的目的而定义你自己的未检查的异常。这样做可能是必要的,如果你使用Java1.3或更早的版本,它们都不支持异常的嵌套。实施一个类似的嵌套功能来捕获和转换你应用中构成故障的检查的异常是很简单的。你的应用在报错时可能需要一个特殊的行为。这可能是你在架构中创建RuntimeException子类的另一个原因。
建立一个故障的屏障
对你的故障处理架构而言,决定抛出什么样的异常,何时抛出是重要的决定。同样重要的是,何时来捕获一个故障异常,之后再怎么办。这里的目的是让你应用中的功能性部分不需要处理故障。把问题分开来处理通常都是一件好事情,有一个中央故障处理机制长远来看是很有裨益的。
在故障屏障的模式里,任何应用组件都可以抛出故障异常,但是只有作为“故障屏障”的组件才捕获异常。采用此种模式去除了大多数程序员为了在本地处理故障而插入的复杂的代码。故障屏障逻辑上位于调用栈的上层,这样在一个默认的行动被激发前,一个异常向上举报的行为就被阻止了。根据不同的应用类型,默认的行动所指也不同。对一个独立的Java应用而言,这个行动指活着的线程被停止。对一个位于应用服务器上的Web应用而言,这个行动指应用服务器向浏览器送出不友好的(甚至令人尴尬的)回应。
一个故障屏障组件的第一要务就是记录下故障异常中包含的信息以为将来所用。到现在为止,一个应用日志是做成此事的首选。异常的嵌套的信息,栈日志,等等,都是对诊断有价值的信息。传递故障信息最差的地方是通过用户界面。把应用的使用者卷进查错的进程对你,对你的用户而言都不好。如果你真的很想把诊断信息放上用户界面,那可能意味着你的日志策略需要改进。
故障屏障的下一个要务是以一种可控的方式来结束操作。这具体的意义要取决于你应用的设计,但通常包括产生一个可通用的回应给可能正在等待的客户端。如果你的应用是一个Web service,这就意味着在回应中用soap:Server的<faultcode>和通用的失败信息<faultstring>来建立一个SOAP故障元素<fault>。如果你的应用于浏览器交流,这个屏障就会安排好一个通用的HTML回应来表明需求是不能被处理的。
在一个Struts的应用里,你的故障屏障会以一种全局异常处理器的形式出现,并被配置成处理RuntimeException的任何子类。你的故障屏障类将延伸org.apache.struts.action.ExceptionHandler类,必要的话,重写它的方法来实施用户自己的特别处理。这样就会处理好不小心产生的故障情况和在处理一个Struts动作时发现的故障。图2显示的就是应变和故障异常。
图2 应变和故障异常
如果你使用的是Spring MVC架构,你可以继承SimpleMappingExceptionResolver类,并配置成处理RuntimeException和它的子类们,这样很容易的就建起了故障屏障。通过重写resolveException的方法,你可以在使用父类的方法来把需求导引到一个发出通用错误提示的view组件之前,加入你需要的用户化的处理。
当你的架构包含了故障屏障,程序员都知晓了后,再写出一次性的故障异常的冲动就会锐减。结果就是应用中出现更干净,更易于维护的代码。
架构中应变的处理
将故障处理交与屏障后,主要组件间的应变交流变得容易多了。一个应变代表着与主要返回结果同等重要的另外一种方法结果。因此,检查的异常类型是一个能够很好地传递应变情况的存在并提供必要的信息来与它竞争的工具。这个方式借助于Java编译器的帮助来提醒程序员关于他们所用的API的方方面面以及提供全套的方法输出的必要性。
仅仅使用方法的返回值类型来传递简单的应变是可能的。比如,返回一个空引用,而不是一个具体的对象,可以意味着对象由于一个已定义的原因不能被建立。Java I/O的方法通常返回一个整数值-1,而不是字节的值或字节的数来表示文件的结尾。如果你的方法的语义简单到可以允许的地步,另一种返回值的方法是可以使用的,因为它摒弃了异常带来的额外的花销。不足之处是方法的调用方要检测一下返回的值来判断是主要结果,还是应变结果。但是,编译器没有办法来保证方法调用者会使用这个判断。
如果一个方法有一个void的返回类型,异常是唯一的方法来表示应变发生了。如果一个方法返回的是一个对象的引用,那么返回值只可能是空或非空(null and non-null)。如果一个方法返回一个整数型,选择与主要返回值不冲突的,可以表示多种应变情况的数值是可能的。但是这样的话,我们就进入了错误代码检查的世界,而这正式Java异常模式所着力避免的。
提供一些有用的信息
定义不同的故障报告的异常类型是没什么道理的,因为故障屏障对所有异常类型一视同仁。应变异常就有很大的不同,因为它们的原意是要向方法调用者传递各种情况。你的架构可能会指出这些异常应该继承java.lang.Exception或一个指定的基类。
不要忘记你的异常应该是百分百的Java类型,你可以用它来存放为你的特殊目的服务的特殊字段,方法,甚至是构造器。比如,被假想的processCheck()方法抛出的InsufficientFundsException这个异常类型就应该包含着一个OverdraftProtection的对象,它能够从另外一个帐户里把短缺的资金转过来。
日志还是不要日志
记录下故障异常是有用处的,因为日志的目的是在一些需要改正的情况下,日志可以吸引人们的注意力。但对应变异常而言却并非如此。应变异常可能代表的只是极少数情况,但是在你的应用里,每一个情况还是会发生的。它们意味着你的应用正在如最初的设计般正常工作着。经常把日志代码加进应变的捕获块里会使你的代码晦涩难懂,而又没有实际的好处。如果一个应变代表了一重要的事件,在抛出一个异常应变来警醒调用者之前,产生一笔日志,记录下这个事件可能会让这个方法更好些。
异常的各个方面
在Aspect Oriented Programming(AOP)的术语里,故障和应变的处理是互相渗透的问题。比如,要实施故障屏障的模式,所有参与的类必须遵循通用规格:
- 故障屏障方法必须存活在遍历参与类的方法调用图的最前端
- 参与类必须使用未检查的异常来表示故障情况
- 参与类必须使用故障屏障期望得到的有针对性的未检查的异常类型
- 参与类必须捕获并从低端方法中把在执行情境下注定的故障转换成检查的异常
- 参与类不能干扰故障异常被传递到故障屏障的过程
这些问题超越了那些本不相干的类的边界。结果就是少数零散的故障处理代码,以及屏障类和参与类间暗含的耦合(这已经比不使用模式进步多了!)。AOP让故障处理的问题被封装在通用的可以作用到参与类的层面上。如AspectJ和Spring AOP这样的Java AOP架构认为异常的处理是添加故障处理行为的切入点。这样,把参与者绑定在故障屏障的模式可以放松些。故障的处理可以存活在一个独立的,不相干的方面里,从而摒弃了屏障方法需要放在方法激活次序的最前头的要求。
如果在你的架构里利用了AOP,故障和应变的处理是理想的在应用里用到的在方面上的候选。对故障和应变的处理在AOP架构下的使用做一个完整的勘探将是将来论文里一个很有意思的题目。
结论
虽然Java异常模型自它出现以来就激发了热烈的讨论,如果使用正确的话,它的价值还是很大的。作为一个设计师,你的任务是建立好规格来最大限度地利用好这个模型。以故障和应变的方式来考量异常可以帮助你做出正确的决定。合理使用好Java异常模型可以让你的应用简单,易维护,和正确。AOP技术将故障和应变定位为相互渗透的问题,这个方法可能会对你的架构提供一些帮助。
引用
作者
Barry Ruzek被Open Group提名为注册IT设计师的大师。他有着30多年的开发操作系统和企业应用的经验。
Abstract
One of the most important architectural decisions a Java developer can make is how to use the Java exception model. Java exceptions have been the subject of considerable debate in the community. Some have argued that checked exceptions in the Java language are an experiment that failed. This article argues that the fault does not lie with the Java model, but with Java library designers who failed to acknowledge the two basic causes of method failure. It advocates a way of thinking about the nature of exceptional conditions and describes design patterns that will help your design. Finally, it discusses exception handling as a crosscutting concern in the Aspect Oriented Programming model. Java exceptions are a great benefit when they are used correctly. This article will help you do that.
Why Exceptions Matter
Exception handling in a Java application tells you a lot about the strength of the architecture used to build it. Architecture is about decisions made and followed consistently at all levels of an application. One of the most important decisions to make is the way that the classes, subsystems, or tiers within your application will communicate with each other. Java exceptions are the means by which methods communicate alternative outcomes for an operation and therefore deserve special attention in your application architecture.
A good way to measure the skill of a Java architect and the development team's discipline is to look at exception handling code inside their application. The first thing to observe is how much code is devoted to catching exceptions, logging them, trying to determine what happened, and translating one exception to another. Clean, compact, and coherent exception handling is a sign that the team has a consistent approach to using Java exceptions. When the amount of exception handling code threatens to outweigh everything else, you can tell that communication between team members has broken down (or was never there in the first place), and everyone is treating exceptions "their own way."
The results of ad hoc exception handling are very predictable. If you ask team members why they threw, caught, or ignored an exception at a particular point in their code, the response is usually, "I didn't know what else to do." If you ask them what would happen if an exception they are coding for actually occurred, a frown follows, and you get a statement similar to, "I don't know. We never tested that."
You can tell if a Java component has made effective use of Java exceptions by looking at the code of its clients. If they contain reams of logic to figure out when an operation failed, why it failed, and if there's anything to do about it, the reason is almost always because of the component's error reporting design. Flawed reporting produces lots of "log and forget" code in clients and rarely anything useful. Worst of all are the twisted logic paths, nested try/catch/finally blocks, and other confusion that results in a fragile and unmanageable application.
Addressing exceptions as an afterthought (or not addressing them at all) is a major cause of confusion and delay in software projects. Exception handling is a concern that cuts across all parts of a design. Establishing architectural conventions for exceptions should be among the first decisions made in your project. Using the Java exception model properly will go a long way toward keeping your application simple, maintainable, and correct.
Challenging the Exception Canon
What constitutes "proper use" of Java's exception model has been the subject of considerable debate. Java was not the first language to support exception-like semantics; however, it was the first language in which the compiler enforced rules governing how certain exceptions were declared and treated. Compile-time exception checking was seen by many as an aid to precise software design that harmonized nicely with other language features. Figure 1 shows the Java exception hierarchy.
In general, the Java compiler forces a method that throws an exception based on java.lang.Throwable
including that exception in the "throws" clause in its declaration. Also, the compiler verifies that clients of the method either catch the declared exception type or specify that they throw that exception type themselves. These simple rules have had far-reaching consequences for Java developers world-wide.
The compiler relaxes its exception checking behavior for two branches of the Throwable
inheritance tree. Subclasses of java.lang.Error
and java.lang.RuntimeException
are exempt from compile-time checking. Of the two, runtime exceptions are usually of greater interest to software designers. The term "unchecked" exception is applied to this group to distinguish it from all other "checked" exceptions.
Figure 1. Java exception hierarchy
I imagine that checked exceptions were embraced by those who also valued strong typing in Java. After all, compiler-imposed constraints on data types encouraged rigorous coding and precise thinking. Compile-time type checking helped prevent nasty surprises at run-time. Compile-time exception checking would work similarly, reminding developers that a method had potential alternate outcomes that needed to be addressed.
Early on, the recommendation was to use checked exceptions wherever possible to take maximum advantage of the help provided by the compiler to produce error-free software. The designers of the Java library API evidently subscribed to the checked exception canon, using these exceptions extensively to model almost any contingency that could occur in a library method. Checked exception types still outnumber unchecked types by more than two to one in the J2SE 5.1 API Specification.
To programmers, it seemed like most of the common methods in Java library classes declared checked exceptions for every possible failure. For example, the java.io
package relies heavily on the checked exception IOException
. At least 63 Java library packages issue this exception, either directly or through one of its dozens of subclasses.
An I/O failure is a serious but extremely rare event. On top of that, there is usually nothing your code can do to recover from one. Java programmers found themselves forced to provide for IOException
and similar unrecoverable events that could possibly occur in a simple Java library method call. Catching these exceptions added clutter to what should be simple code because there was very little that could be done in a catch block to help the situation. Not catching them was probably worse since the compiler required that you add them to the list of exceptions your method throws. This exposes implementation details that good object-oriented design would naturally want to hide.
This no-win situation resulted in most of the notorious exception handling anti-patterns we are warned about today. It also spawned lots of advice on the right ways and the wrong ways to build workarounds.
Some Java luminaries started to question whether Java's checked exception model was a failed experiment. Something failed for sure, but it had nothing to do with including exception checking in the Java language. The failure was in the thinking by the Java API designers that most failure conditions were the same and could be communicated by the same kind of exception.
Faults and Contingencies
Consider a CheckingAccount
class within an imaginary banking application. A CheckingAccount
belongs to a customer, maintains a current balance, and is able to accept deposits, accept stop payment orders on checks, and process incoming checks. A CheckingAccount
object must coordinate accesses by concurrent threads, any of which may alter its state. CheckingAccount
's processCheck()
method accepts a Check
object as an argument and normally deducts the check amount from the account balance. But a check-clearing client that calls processCheck()
must be ready for two contingencies. First, the CheckingAccount
may have a stop payment order registered for the check. Second, the account may not have sufficient funds to cover the check amount.
So, the processCheck()
method can respond to its caller in three possible ways. The nominal response is that the check gets processed and the result declared in the method signature is returned to the invoking service. The two contingency responses represent very real situations in the banking domain that need to be communicated to the check-clearing client. All three processCheck()
responses were designed intentionally to model the behavior of a typical checking account.
The natural way to represent the contingency responses in Java is to define two exceptions, say StopPaymentException
and InsufficientFundsException
. It wouldn't be right for a client to ignore these, since they are sure to be thrown in the normal operation of the application. They help express the full behavior of the method just as importantly as the method signature.
Clients can easily handle both kinds of exception. If payment on a check is stopped, the client can route the check for special handling. If there are insufficient funds, the client can transfer funds from the customer's savings account to cover the check and try again.
The contingencies are expected and natural consequences of using the CheckingAccount
API. They do not represent a failure of the software or of the execution environment. Contrast these with actual failures that could arise due to problems related to the internal implementation details of the CheckingAccount
class.
Imagine that CheckingAccount
maintains its persistent state in a database and uses the JDBC API to access it. Almost every database access method in that API has the potential to fail for reasons unrelated to the implementation of CheckingAccount
. For example, someone may have forgotten to turn on the database server, unplugged a network cable, or changed the password needed to access the database.
JDBC relies on a single checked exception, SQLException
, to report everything that could possibly go wrong. Most of what could go wrong has to do with configuring the database, the connectivity to it, and the hardware it resides on. There's nothing that the processCheck()
method could do to deal with these situations in a meaningful way. That's a shame, because processCheck()
at least knows about its own implementation. Upstream methods in the call stack have an even smaller chance of being able to address problems.
The CheckingAccount
example illustrates the two basic reasons that a method execution can fail to return its intended result. They are worthy of some descriptive terms:
- Contingency
- An expected condition demanding an alternative response from a method that can be expressed in terms of the method's intended purpose. The caller of the method expects these kinds of conditions and has a strategy for coping with them.
- Fault
- An unplanned condition that prevents a method from achieving its intended purpose that cannot be described without reference to the method's internal implementation.
Using this terminology, a stop payment order and an overdraft are the two possible contingencies for the processCheck()
method. The SQL problem represents a possible fault condition. The caller of processCheck()
ought to have a way of providing for the contingencies, but could not be reasonably expected to handle the fault, should it occur.
Mapping Java Exceptions
Thinking about "what could go wrong" in terms of contingencies and faults will go a long way toward establishing conventions for Java exceptions in your application architecture.
Condition
|
Contingency |
Fault |
Is considered to be |
A part of the design |
A nasty surprise |
Is expected to happen |
Regularly but rarely |
Never |
Who cares about it |
The upstream code that invokes the method |
The people who need to fix the problem |
Examples |
Alternative return modes |
Programming bugs, hardware malfunctions, configuration mistakes, missing files, unavailable servers |
Best Mapping |
A checked exception |
An unchecked exception |
Contingency conditions map admirably well to Java checked exceptions. Since they are an integral part of a method's semantic contract, it makes sense to enlist the compiler's help to ensure that they are addressed. If you find that the compiler is "getting in the way" by insisting that contingency exceptions be handled or declared when it is inconvenient, it's a sure bet that your design could use some refactoring. That's actually a good thing.
Fault conditions are interesting to people but not to software logic. Those acting in the role of "software proctologist" need information about faults to diagnose and fix whatever caused them to happen. Therefore, unchecked Java exceptions are the perfect way to represent faults. They allow fault notifications to percolate untouched through all methods on the call stack to a level specifically designed to catch them, capture the diagnostic information they contain, and provide a controlled and graceful conclusion to the activity. The fault-generating method is not required to declare them, upstream methods are not required to catch them, and the method's implementation stays properly hidden—all with a minimum of code clutter.
Newer Java APIs such as the Spring Framework and the Java Data Objects library have little or no reliance on checked exceptions. The Hibernate ORM framework redefined key facilities as of release 3.0 to eliminate the use of checked exceptions. This reflects the realization that the great majority of the exception conditions that these frameworks report are unrecoverable, stemming from incorrect coding of a method call, or a failure of some underlying component such as a database server. Practically speaking, there is almost no benefit to be gained by forcing a caller to catch or declare such exceptions.
Fault handling in your architecture
The first step toward handling faults effectively in your architecture is to admit that you need to do it. Coming to this acceptance is difficult for engineers who take pride in their ability to create impeccable software. Here is some reasoning that will help. First, your application will be spending a great deal of time in development where mistakes are commonplace. Providing for programmer-generated faults will make it easier for your team to diagnose and fix them. Second, the (over)use of checked exceptions in the Java library for fault conditions will force your code to deal with them, even if your calling sequences are completely correct. If there's no fault handling framework in place, the resulting makeshift exception handling will inject entropy into your application.
A successful fault handling framework has to accomplish four goals:
- Minimize code clutter
- Capture and preserve diagnostics
- Alert the right person
- Exit the activity gracefully
Faults are a distraction from your application's real purpose. Therefore, the amount of code devoted to processing them should be minimal and, ideally, isolated from the semantic parts of the application. Fault processing must serve the needs of the people responsible for correcting them. They need to know that a fault happened and get the information that will help them figure out why. Even though a fault, by definition, is not recoverable, good fault handling will attempt to terminate the activity that encountered the fault in a graceful way.
Use unchecked exceptions for fault conditions
There are lots of reasons to make the architectural decision to represent fault conditions with unchecked exceptions. The Java runtime rewards programming mistakes by throwing RuntimeException
subclasses such as ArithmeticException
and ClassCastException
, setting a precedent for your architecture. Unchecked exceptions minimize clutter by freeing upstream methods from the requirement to include code for conditions that are irrelevant to their purpose.
Your fault handling strategy should recognize that methods in the Java library and other APIs may use checked exceptions to represent what could only be fault conditions in the context of your application. In this case, adopt the architectural convention to catch the API exception where it happens, treat it as a fault, and throw an unchecked exception to signal the fault condition and capture diagnostic information.
The specific exception type to throw in this situation should be defined by your architecture. Don't forget that the primary purpose of a fault exception is to convey diagnostic information that will be recorded to help people figure out what went wrong. Using multiple fault exception types is probably overkill, since your architecture will treat them all identically. A good, descriptive message embedded inside a single fault exception type will do the job in most cases. It's easy to defend using Java's generic RuntimeException
to represent your fault conditions. As of Java 1.4, RuntimeException
, like all throwables, supports exception chaining, allowing you to capture and report a fault-inducing checked exception.
You may choose to define your own unchecked exception for the purpose of fault reporting. This would be necessary if you need to use Java 1.3 or earlier versions that do not support exception chaining. It is simple to implement a similar chaining capability to capture and translate checked exceptions that constitute faults in your application. Your application may have a need for special behavior in a fault reporting exception. That would be another reason to create a subclass of RuntimeException
for your architecture.
Establish a fault barrier
Deciding which exception to throw and when to throw it are important decisions for your fault-handling framework. Just as important are the questions of when to catch a fault exception and what to do afterward. The goal here is to free the functional portions of your application from the responsibility of processing faults. Separation of concerns is generally a good thing, and a central facility responsible for dealing with faults will pay benefits down the road.
In the fault barrier pattern, any application component can throw a fault exception, but only the component acting as the "fault barrier" catches them. Adopting this pattern eliminates much of the intricate code that developers insert locally to deal with faults. The fault barrier resides logically toward the top of the call stack where it stops the upward propagation of an exception before default action is triggered. Default action means different things depending on the application type. For a stand-alone Java application, it means that the active thread is terminated. For a Web application hosted by an application server, it means that the application server sends an unfriendly (and embarrassing) response to the browser.
The first responsibility of a fault barrier component is to record the information contained in the fault exception for future action. An application log is by far the best place to do this. The exception's chained messages, stack traces, and so on, are all valuable pieces of information for diagnosticians. The worst place to send fault information is across the user interface. Involving the client of your application in your debugging process is hardly ever good for you or your client. If you are really tempted to paint the user interface with diagnostic information, it probably means that your logging strategy needs improvement.
The next responsibility of a fault barrier is to close out the operation in a controlled manner. What that means is up to your application design but usually involves generating a generic response to a client that may be waiting for one. If your application is a Web service, it means building a SOAP <fault>
element into the response with a <faultcode>
of soap:Server
and a generic <faultstring>
failure message. If your application communicates with a Web browser, the barrier would arrange to send a generic HTML response indicating that the request could not be processed.
In a Struts application, your fault barrier can take the form of a global exception handler configured to process any subclass of RuntimeException
. Your fault barrier class will extendorg.apache.struts.action.ExceptionHandler
, overriding methods as needed to implement the custom processing you need. This will take care of inadvertently generated fault conditions and fault conditions explicitly discovered during the processing of a Struts action. Figure 2 shows contingency and fault exceptions.
Figure 2. Contingency and fault exceptions
If you are using the Spring MVC framework, your fault barrier can easily be built by extending SimpleMappingExceptionResolver
and configuring it to handle RuntimeException
and its subclasses. By overriding the resolveException()
method, you can add any custom handling you need before using the superclass method to route the request to a view component that sends a generic error display.
When your architecture includes a fault barrier and developers are made aware of it, the temptation to write one-off fault exception handling code decreases dramatically. The result is cleaner and more maintainable code throughout your application.
Contingency Handling in Your Architecture
With fault processing relegated to the barrier, contingency communication between primary components becomes much simpler. A contingency represents an alternative method result that is just as important as the principal return result. Therefore, checked exception type is a good vehicle to convey the existence of a contingency condition and supply the information needed to contend with it. This practice enlists the help of the Java compiler to remind developers of all aspects of the API they are using and the need to provide for the full range of method outcomes.
It is possible to convey simple contingencies by using a method's return type alone. For example, returning a null reference instead of an actual object can signify that the object could not be created for a defined reason. Java I/O methods typically return an integer value of -1 instead of a byte value or byte count to indicate an end-of-file condition. If your method's semantics are simple enough to allow it, alternative return values may be the way to go, since they eliminate the overhead that comes with exceptions. The downside is that the method caller is responsible for testing the return value to see if it is a primary result or a contingency result. The compiler will not insist that the method caller makes that test, however.
If a method has a void return type, an exception is the only way to indicate that a contingency occurred. If a method is returns an object reference, the vocabulary that the return value can express is limited to two values (null and non-null). If a method returns an integral value, it may be possible to express several contingency conditions by choosing values that are guaranteed not to conflict with the primary return values. But now we have entered the world of error code checking, something the Java exception model was developed to avoid.
Supply something useful
It made little sense to define different fault reporting exception types, since the fault barrier treats them all the same. Contingency exceptions are quite different, because they are meant to convey diverse conditions to method callers. Your architecture would probably specify that these exceptions should all extend java.lang.Exception
or a designated base class that does.
Do not forget your exceptions are complete Java types that can accommodate specialized fields, methods, and even constructors that can be shaped for your unique purposes. For example, the InsufficientFundsException
type thrown by the imaginary CheckingAccount
processCheck()
method could include an OverdraftProtection
object that is able to transfer funds needed to cover the shortfall from another account whose identity depends on how the checking account is set up.
To log or not to log
Logging fault exceptions makes sense because their purpose is to draw the attention of people to situations that need to be corrected. The same cannot be said for contingency exceptions. They may represent relatively rare events, but every one of them is expected to happen during the life of your application. If anything, they signify that the application is working the way it was designed to work. Routinely adding logging code to contingency catch blocks adds clutter to your code with no actual benefit. If a contingency represents a significant event, it is probably better for a method to generate a log entry recording the event before throwing a contingency exception to alert its caller.
Exception Aspects
In Aspect Oriented Programming (AOP) terms, fault and contingency handling are crosscutting concerns. To implement the fault barrier pattern, for example, all the participating classes must follow common conventions:
- The fault barrier method must reside at the head of a graph of method calls that traverses the participating classes.
- They must all use unchecked exceptions to signify fault conditions.
- They must all use the specific unchecked exception types that the fault barrier is expecting to receive.
- They all must catch and translate checked exceptions from lower methods that are deemed to be faults in their execution context.
- They must not interfere with the propagation of fault exceptions on their way to the barrier.
These concerns cut across the boundaries of otherwise unrelated classes. The result is minor bits of scattered fault handling code and implicit coupling between the barrier class and the participants (although still a great improvement over not using a pattern at all!). AOP allows the fault handling concern to be encapsulated in a common Aspect applied to the participating classes. Java AOP frameworks such as AspectJ and Spring AOP recognize exception handling as a join point to which fault handling behavior (or advice) can be attached. In this way, the conventions that bind participants in the fault barrier pattern can be relaxed. Fault processing can now reside within an independent, out-of-line aspect, eliminating the need for a "barrier" method to be placed at the head of a method invocation sequence.
If you are exploiting AOP in your architecture, fault and contingency handling are ideal candidates for aspects that apply throughout an application. A full exploration of how fault and contingency handling could work in the AOP world would make an interesting topic for a future article.
Conclusion
Although the Java exception model has generated spirited discussion during its lifetime, it provides excellent value when it is applied correctly. As an architect, it is up to you to establish conventions that get the most from the model. Thinking of exceptions in terms of faults and contingencies can help you make the right choices. Using the Java exception model properly will keep your application simple, maintainable, and correct. Aspect Oriented Programming techniques may offer some definite advantages for your architecture by recognizing fault and contingency handling as crosscutting concerns.
References
Barry Ruzek has been named a Master Certified IT Architect by the Open Group. He has over 30 years of experience developing operating systems and enterprise applications.