The Process Virtual Machine
流程虚拟机
by Tom Baeyens and Miguel Valdes Faura
May 8th, 2007.
Translated by Landy(landy.lxy@gmail.com)
Introduction
介绍
There are many process languages for Business Process Management (BPM), workflow and orchestration. For simplicity, we'll refer to that collection as workflow. There are two aspects to workflow: the process modelling aspect and the software implementation aspect. The biggest problem in workflow technologies today is that they don't handle this dual nature properly. The main goal of this paper is to fix exactly that problem.
有许多针对业务流程管理(BPM),工作流和服务编排的流程语言。为了简单起见,我们把它们简称为工作流。工作流包括两个方面:流程模型方面和软件实现方面。如今,在工作流技术领域最大的问题是无法很好的处理这些技术的双重性。这篇文章的主要目的就是为了解决这个问题。
The Process Virtual Machine does not define a process language. Instead, it acknowledges that some process languages might be better suited to a certain situation than others and that, hence, multiple process languages will coexist. The Process Virtual Machine defines a common model that can be shared between all graph based execution languages. It also includes a strategy for treating process constructs as software components. This enables support for multiple process languages and shows much more clearly how process technology fits into software development projects.
流程虚拟机没有定义一门流程语言。相反,它接受“一些流程语言可能相比其他流程语言能够更好的适应特定的领域”的观点。因此,这将导致多种流程语言同时存在。流程虚拟机为所有的基于图的执行语言定义了一组可共享的公共的模型。同时,它也包括“如何将流程元素视为软件组件”的策略,这将使支持多种流程语言成为可能,它也更清晰的说明了流程技术如何合适的应用到软件开发项目中。
Component technology
组件技术
The reason for the fragmentation in process languages is that there are many environments and features that can be handled by workflow. Up to now, the main focus was to build 'the best' process language. There is no sign yet that process languages are converging to each other in some way or another. So the new thing about Process Virtual Machine is the idea that different environments are best served with dedicated process languages.
导致流程语言分裂的原因是许多环境和特性都能够由工作流处理。至今为止,业界主要关注于构建最好的流程语言。还没有迹象表明所有流程语言会以某种方式收敛到一种。因此,流程虚拟机是一种“不同的环境有其适应性最好的特定的流程语言”的思路。
While every developer knows the relational model that underpins relational databases, such a conceptual model is absent for workflow. The Process Virtual Machine will supply this missing piece by defining a component model for process constructs. That way, complete process languages can be built on top of the Process Virtual Machine as a set of process construct implementations. Developers who know the Process Virtual Machine will understand the process engines available today much better. Furthermore, we expect that the Process Virtual Machine will be the basis for all next generation workflow engines.
尽管所有的开发者知道关系模型是关系型数据库的基石,而工作流领域却缺乏这样的一种概念模型。流程虚拟机为流程元素定义了一套组件模型,解决了这个问题。采用这种方式,流程语言可以以添加一套流程元素实现的方式基于流程虚拟机来实现。了解流程虚拟机的开发人员能够更好的理解当前存在的工作流引擎。将来,我们期望流程虚拟机将会成为所有下一代工作流引擎的基础。
More and more, software development will be done in a mix of many languages. Aspects like a domain model, or the glue code that binds frameworks to the domain language, are best expressed in a general purpose programming language like Java. But a shift is taking place away from general purpose programming languages towards frameworks that address specific aspects of software development. Typically, such a framework comes with a library and a Domain Specific Language (DSL). Those frameworks and languages address certain parts of software development in a more natural and easier way. Examples are process engines with process languages, rules engines with rules languages, object relational mapping frameworks with their mapping metadata, parsers based on grammars, inversion of control (IoC) containers with an object wiring language, and web frameworks with configuration files that specify navigation.
越来越多的软件将混合使用多种语言开发。像领域模型或者将框架绑定到领域语言等方面能够使用像Java之类的通用编程语言很好的表示。但是,转变正在从“通用编程语言”到“更多的适应于软件开发特定方面框架”发生。通常的,这些框架会附带一个库和领域特定语言(DSL)。这些框架和语言能够更容易和自然的解决软件开发的某个部分。例子有包含流程语言的流程引擎,包含规则语言的规则引擎,包含映射元数据的对象关系映射框架,基于语法的解析器,包含对象装配语言的控制反转(IOC)容器和包括制定页面导航的配置文件的web框架。
Many of the Domain Specific Languages (DSLs) that we see today are graph based execution languages. For all of those, the Process Virtual Machine is a common foundation that federates graph based DSLs and thereby reduces their maintenance, design and implementation costs.
很多当前的领域特定语言(DSL)是基于图的执行语言。为了减少维护、设计和实现成本,流程虚拟机可以作为这些语言的公共基石。
So process languages will be pluggable on top of the Process Virtual Machine.
因此,流程语言可以以可插拔的方式基于流程虚拟机之上实现。
Embeddable workflow
嵌入式工作流
Only a few workflow automation projects can be realized with process languages alone. The BPM suites that preach the "no code" approach target only those projects, in which everything must be modelled as a process. We think that workflow, BPM and orchestration can ease the implementation of some aspects of a software development project. And a software development project usually combines many aspects, only some of which can be modelled as processes.
仅有少量的工作流自动化项目能够仅仅使用流程语言实现。某些宣扬“零编码”方案的BPM套件仅仅能适应这些“必须将一切建模成流程”的项目。我们认为工作流,BPM和服务编排能够简化软件开发项目的某些方面,一个软件开发项目通常包括多个方面,仅仅部分能够建模为流程。
In this view, workflow technology is much more broadly applicable than its use today suggests. Because today's process engines are not embeddable enough, and because of the current fragmentation and confusion, developers still write homegrown workflow engines for their own projects.
基于这种观点,工作流技术被广泛采纳和使用。因为当前的流程引擎嵌入程度不够和因为当前工作流技术的分裂和给用户造成的混淆,开发者仍然为他自己的项目开发自己的工作流引擎。
The concept of embeddable workflow means that a process engine can be easily integrated into today's software development projects. This is in contrast with the traditional monolithic BPM server approach.
嵌入式工作流的概念意味着流程引擎能够更容易的与当今的软件开发项目整合。这与传统的独立BPM服务器方案截然不同。
In software development in general, there is a clear trend towards the use of more Domain Specific Languages (DSLs). Embeddable workflow fits that trend: a process language becomes just another language that developers can use in their projects. Workflow is complementary to old-fashioned plain programming. The developer should be free to select the language best suited for the job.
在一般的软件开发过程中,更多的使用领域特定语言(DSL)成为趋势。嵌入式工作流完全符合这种趋势:流程语言仅仅作为另一种语言,开发者可以在他的项目中使用它。工作流与以往的plain编程互补。开发者应该能够自由的选择最适合与他的工作的语言。
The following aspects are crucial to make the workflow engine embeddable:
下面是使工作流引擎嵌入化的一些要点:
· Persistence: The workflow engine itself should be decoupled from the actual persistence technology. First of all, persistence itself should be optional and, in case it is required, different persistence technologies such as JPA, Java serialization or XML should be pluggable. In case a relational database is used for persistence, the developer should have the choice to deploy the workflow engine's database tables separately, alongside the application's database tables, or in a separate database. The latter can make the whole application much more manageable.
持久化:工作流引擎应该与特定的流程化技术解耦。首先,持久化本身应该是可选的,如果需要持久化,像JPA,Java序列化或XML之类的不同的持久化技术应该是可插拔的。如果使用关系数据库来持久化,开发者应该能够选择将工作流引擎的数据库表独立开来部署或与应用的数据库表在一起部署或者在单独的数据库中部署。后者能够使整个应用更容易被管理。
· Services: The services that an engine might use, such as a timer service or an asynchronous message service, should be pluggable so that different implementations can be used in different environments such as standard Java and enterprise Java.
服务:引擎可能使用的服务如Timer服务或异步消息服务应该是可插拔的,因此,可在不同的环境下使用不同的实现,如标准java和企业Java。
· Transactions: If the workflow engine can work with the same database connection as the application, there is no need for global or distributed transactions. That definitely reduces complexity. The same reasoning applies to the sessions of an object relational mapper such as Hibernate.
事务:如果工作流引擎可以和应用使用同一个数据库连接,则不需要分布式事务。这将毫无疑问的降低复杂度。同样的理由适应于对象关系映射的会话如Hibernate。
· Libraries: Apart from the traditional monolithic deployment of a workflow engine, it often can be much more practical to deploy just the engine's libraries in the client application.
库:除了传统的独立部署工作流引擎,通常也可以仅将引擎库部署在客户应用中。
· Testability: Executable processes are software so they should be tested the same way. Tests should take process executions through different test scenarios. This should of course be integrated with the rest of the client application infrastructure.
可测试性:可执行的流程是软件,因此,他们应该同样的被测试。测试应该通过不同的测试场景来执行流程。这显然需要流程引擎与其他客户应用集成。
· Binding: It should be easy to bind the process logic to the rest of the software development project. As we'll see in section Process modelling, an executable process is always a combination of the graphical diagram and technical details. Up to now, the focus of tooling has been on editing the graphical diagram. But software development projects usually contain many languages, such as Java, a process language and scripting languages. So we expect that tools will start to focus more on the coherence between the different artifacts in one software project. For example, a refactoring that renames a Java class might span updates to Java classes and process definition files.
绑定:应该能够容易的将流程逻辑绑定到其他软件开发项目。像我们在流程建模章节看到的,一个可执行的流程通常是符号图和技术细节的组合。到现在为止,工具的关注点在编辑符号图。但是软件开发项目通常包括像Java,流程语言和脚本语言之类的需求语言。因此,我们期望工具应该更多的关注在同一软件项目中的不同方面。比如,重命名一个Java类的重构应该更够关联更新java类和流程定义文件。
Scope
范围
We already explained above that the Process Virtual Machine is a common foundation for graph based execution languages. The languages for which the Process Virtual Machine can be leveraged have three main characteristics:
我们已经阐述了流程虚拟机是基于图的执行语言的公共基础。能够基于流程虚拟机的语言有三个主要特征:
· Processes are represented graphically to facilitate communication between all stakeholders
流程是用来图形化表达参与方之间通讯的
· The process expresses some kind of execution flow
流程表示一种执行次序
· Processes can potentially include wait states from the perspective of the process engine and be 'long running'
从流程引擎方面来看,流程可以潜在的包含等待状态;流程也可以是long running的。
Any aspect in software development that meets these criteria can be built on top of the Process Virtual Machine. It's not limited to workflow or BPM as we know it today. An example of such a non trivial aspect might be pageflow; a language to describe navigation between pages in a web application.
如果软件开发的任何方面满足这些条件的都可以基于流程虚拟机构建。不局限于我们今天知道的工作流或BPM。一个例子是页面流:一种在web应用中用来描述不同页面间导航关系的语言。
Basics
基本原理
Here follow the basic principles of the Process Virtual Machine. After the basics, we'll cover a series of extensions that are needed to cover all features needed for real process languages.
这里,讲述了流程虚拟机的基本原则。基本原理之后,我们将讲述一系列的扩展,这些扩展覆盖了真正的流程语言需要的所有特性。
A process is a graphical description of an execution flow. For example the procedure on processing expense notes is a process. It can be deployed in a process engine. One process can have many executions. E.g. my expense note of last Monday could have been handled by one execution of the expense note process. Another example of a process is shown in Figure 1.
流程是一个可执行流程的图形化描述。如,处理费用记录的过程是一个流程。它可以部署在流程引擎中。一个流程可以有多个执行(execution),如我上周一的费用记录可能被费用记录流程的一个执行处理了。另一个流程的例子在图一中说明了。
Figure 1: An example process for an insurance claim
图1:保险索赔的样例流程
The basic structure of a process is made up of nodes and transitions. Transitions have a sense of direction and hence a process forms a directed graph. Nodes can also have a set of nested nodes. Figure 2 shows how transitions and nodes can be modelled in a UML class diagram.
流程的基本结构由节点和转移组成。转移是有向的,因此,流程形成了一个有向图。注意,也可以有一系列的内嵌节点。图二说明了转移和节点如何使用UML类图来建模。
Figure 2: UML class diagram of nodes, transitions and the behavior
图2:节点,转移和行为的UML类图
Each node in the process has a piece of Java code associated as its behaviour. The interface to associate Java code with a node is shown in Figure 3.
流程的每个节点都有一段Java代码与它的行为关联。关联Java代码和节点的接口如图三所示。
    public interface Executable {
        void execute(Execution execution) throws Exception;
    }
Figure 3: The Executable interface for specifying the behaviour of nodes
图三:指定节点行为的Executable接口
Now, let's look at the runtime data structure. An execution is a pointer that keeps track of the current position in the process graph as indicated in Figure 4.
现在,让我们看看运行时的数据结构。一个执行时跟踪流程图的当前位置的指针。如图四。
Figure 4: An execution points to the current position in the process graph
图四:一个执行指向流程图中的当前位置
When a new execution is started for a given process, the execution will be positioned in the initial node of the process. After that, the execution waits for an external trigger. The method and class structure of Execution is indicated in Figure 5.
如果一个指定的流程的新的执行启动了,流程的初始节点会被作为初始位置。然后,执行会等待一个外部的触发。图五指出了执行的方法和类结构。
Figure 5: UML class diagram of an Execution
图五:执行的UML类图
An external trigger can be given with the proceed(String transitionName) method on the execution. Such an external trigger is very similar to the signal operation in finite state machines. The execution knows how to interpret the process graph. When the proceed method is called, the execution takes the specified (or the default) transition and arrives in the destination node of that transition. Then, the execution updates its node pointer and invokes the node's behaviour.
外部触发可以由执行的proceed(String transitionName)方法来发起。这样的一个外部触发与有限状态机中的signal操作非常类似。执行知道如何解释流程图。通过调用proceed方法,执行将沿着指定的(或默认的)转移,到达转移的目的节点。然后,执行会更新它的节点指针和调用节点的行为。
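The mechanics described so far can be sketched in a few lines of Java. This is only a minimal illustration under stated assumptions, not the actual engine: the class PvmBasicsSketch, the map of leaving transitions, the trace list and the rule that the first declared transition is the default one are all invented for this example. Only the Executable interface and the proceed method come from the text above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PvmBasicsSketch {

    /** Behaviour attached to a node, as in Figure 3. */
    interface Executable {
        void execute(Execution execution) throws Exception;
    }

    /** A node with its behaviour and its named leaving transitions. */
    static class Node {
        final String name;
        final Executable behaviour;
        final Map<String, Node> leavingTransitions = new LinkedHashMap<>();

        Node(String name, Executable behaviour) {
            this.name = name;
            this.behaviour = behaviour;
        }

        Node transition(String transitionName, Node destination) {
            leavingTransitions.put(transitionName, destination);
            return this;
        }
    }

    /** Pointer to the current position in the process graph. */
    static class Execution {
        Node node;
        final List<String> trace = new ArrayList<>(); // visited nodes, for the demo only

        Execution(Node initialNode) {
            this.node = initialNode;
        }

        /** Takes the default transition (assumed here: the first one declared). */
        void proceed() {
            if (!node.leavingTransitions.isEmpty()) {
                proceed(node.leavingTransitions.keySet().iterator().next());
            }
        }

        void proceed(String transitionName) {
            Node destination = node.leavingTransitions.get(transitionName);
            if (destination == null) {
                throw new IllegalArgumentException("no transition " + transitionName);
            }
            node = destination;                      // update the node pointer
            trace.add(destination.name);
            try {
                destination.behaviour.execute(this); // invoke the node's behaviour
            } catch (RuntimeException e) {
                throw e;
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Executable waitState = exec -> { };              // returns without propagating
        Executable automatic = exec -> exec.proceed();   // propagates immediately

        Node start = new Node("start", waitState);
        Node validate = new Node("validate", automatic);
        Node review = new Node("review", waitState);
        Node end = new Node("end", waitState);
        start.transition("submit", validate);
        validate.transition("ok", review);
        review.transition("approve", end);

        Execution execution = new Execution(start);
        execution.proceed("submit");
        System.out.println(execution.node.name); // review
        execution.proceed("approve");
        System.out.println(execution.node.name); // end
    }
}
```

The first external trigger passes through the automatic node and stops in the wait state "review"; the second trigger moves the execution to "end".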
The node's behaviour has access to the current state of the process through the execution that is passed in as a parameter. The extensions that are described in detail in the full paper show how, for example, variables or external services are made available through the execution.
节点的行为能够通过作为参数传入的执行来访问流程的当前状态。在本文中详细描述了的扩展将会说明如何在执行过程中使用样例变量和外部服务。
On the other hand, the node's behaviour has got control over the propagation of execution. This means that the executable implementation can just behave as a wait state, continue execution, create concurrent executions or update any information in the execution.
另一方面,节点的行为通过执行的传播来控制。这意味着executable实现可以仅仅实现为:等待状态,继续执行,创建并发的执行或更新当前执行的信息。
Let's look at two example node behaviour implementations:
让我们看看两个节点行为的实现样例:
A task node
任务节点
The reason why task management and workflow are so closely related is because tasks for humans often translate to wait states for a software system. Processes can easily combine software operations with human tasks in the following way.
任务管理和工作流紧密相关的原因是人员的任务通常翻译为软件系统的等待状态。流程能够容易的将软件操作与人员任务以下面的方式组合起来
The first thing that is needed outside of the process execution engine is a task repository, a place where tasks for people are kept. On top of this component, there is a user interface that allows for people to see their task list and complete them.
首先,任务仓库是除了流程执行引擎之外必要的,它是用来存储人员的任务的。这个组件之上,有用户界面允许人员察看他的任务列表和完成它们。
Then you can imagine the following behaviour implementation of a task node. First, some external trigger has to be given (with the proceed method) so that the process starts executing and arrives in the task node. The node behaviour implementation will create a new task for a given person in the task list component. That task also includes a reference back to the execution. Then, the node behaviour returns without propagating the execution. This means that the execution will be positioned in the task node when the proceed invocation returns.
然后,你可以想象任务节点的下面的行为实现。首先,应该发起一个外部触发(通过proceed方法),因此,流程开始执行,到达任务节点。节点行为实现会在节点列表组件中为指定的人创建一个新任务。这个任务同时也包含execution的一个引用。然后,节点行为返回,不需要传播执行。这意味着当proceed调用返回时,执行会定位在任务节点。
    public class TaskNode implements Executable {
        // injected from the process definition file
        String taskName;

        public void execute(Execution execution) {
            // find the responsible person for this new task
            User assignedUser = calculateUser(taskName, execution);
            // create the task, including a reference back to the execution
            Task task = new Task(taskName, assignedUser, execution);
            // add the task to the repository
            TaskRepository taskRepository = execution.getContext().getTaskRepository();
            taskRepository.addTask(task);
            // no proceed() call here: the execution waits in this node
        }
    }
Figure 6: Pseudo code for a task node implementation
图6:任务节点实现的伪代码
The taskName member field shows how configuration information that is specified in the process definition file is injected into the behaviour object.
taskName成员变量表示流程定义文件中指定的配置信息,应该注入到行为对象中。
The execution can then be persisted while the system is waiting for the user to complete the task. After some time, when the user completes the task, the task management component will use the reference to the execution to provide a trigger. This is done with the proceed method. Then the execution resumes, leaves the task node and continues.
因此,在系统等待用户完成任务的间隔,执行可以被持久化。一段时间后,当用户完成任务,任务管理组件会使用执行的引用提供一个触发。这些通过proceed方法来完成。然后执行恢复,离开任务节点,然后继续后续执行。
An email node
Email节点
An email node is different from a task node in the sense that it is automatic. An email has to be sent and then the execution should immediately be propagated over the default leaving transition. That propagation of execution is done with the invocation of the proceed method at the end of EmailNode's behaviour implementation in Figure 7.
Email节点与任务节点的不同点在它是自动的。需要发送一封Email,然后执行应该立刻通过默认的离开转移传播。执行的传播是通过调用图七中EmailNode行为实现的最后面的proceed方法实现的。
    public class EmailNode implements Executable {
        // injected from the process definition file
        String recipient;
        String subject;
        String text;

        public void execute(Execution execution) {
            // send the email
            sendEmail(recipient, subject, text, execution);
            // propagate the execution
            execution.proceed();
        }
    }
Figure 7: Pseudo code for an email node implementation
图七:Email节点实现的伪代码
As with the task node, the member fields recipient, subject and text are specified in the process definition file. The engine will inject the values into the member fields. Those values might be the result of runtime evaluation of expressions, and mappings might also be applied.
与任务节点类似,成员变量recipient,subject和text在流程定义文件中指定。引擎会将这些值注入成员变量。这些可能会是运行时计算表达式的结果,映射可能也需要采用。
Now, let's discuss the main features of the basic model that we have presented so far.
现在,我们讨论我们目前介绍的基本模型的主要特性。
Graphical representation
图形化展现
The basic Process Virtual Machine concepts show how graphical diagrams can be made executable. Processes are made up of nodes and transitions, hence they have a direct relation with the graphical representation. On the other hand, the behaviour member field in the Node class defines the programming logic that is executed at runtime.
流程虚拟机的基本概念展示了如何使符号图可执行。流程由节点和转移组成,因此它们有有向关联。另一方面,在Node类中的behaviour成员变量定义运行时执行的编程逻辑。
The graphical structure of the concepts defined in the Process Virtual Machine covers most constructs of process languages. Even advanced constructs such as UML super states can be mapped directly to the nestedNodes relation as defined in Figure 2. Still, process languages can extend the basic structure with new concepts such as timers, data flow or conditions.
流程虚拟机中定义的概念的图形化结构覆盖了流程语言的大部分元素。甚至高级元素,如UML超状态可以直接映射到图二中定义的nestedNodes关系。流程语言也可以使用新概念来扩展基本结构,如timers,data flow或conditions。
Wait states
等待状态
We also saw in the task node example that nodes can represent wait states. When an execution arrives in a node and the execution.proceed() method is not invoked, the original proceed invocation, made by the external client, will return. That means that the execution is now in a wait state, and typically it is then waiting for an external trigger. Typically, that is the right time to persist an execution.
我们在任务节点的例子中看到了节点可以表示等待状态。当执行到达一个节点时,不调用execution.proceed()方法, 由外部客户端调用的初始的执行方法会返回。这意味着执行当前处于等待状态,典型场景是执行等待一个外部触发。通常,这是持久化只执行的合适时间。
With wait states, processes can express execution flows that span software operations (typically in a transaction) and wait states in which the engine must wait for an external trigger. Such processes are also known as long running processes.
通过等待状态,流程可以表示跨软件操作(通常在一个事务中)的执行流和当引擎必须等待外部触发的等待状态。通常这也叫长运行的流程。
Synchronous execution
同步执行
When a client invokes the proceed method, the execution will start to interpret and execute the process in that thread. An execution typically starts by taking a transition and then executing the destination node. When the execution invokes the behaviour of the destination node, it passes itself as the parameter. The node behaviour can then in turn propagate the execution onward by calling the proceed method on the execution again. Note that this is still done in the thread of the client, which is still waiting for the original proceed invocation to return.
当客户端调用proceed方法,执行会在那么线程中开始解析和执行流程。执行通常开始于选择一个转移,然后执行目的节点。当执行调用目的节点的行为,它将自己作为参数传递。然后节点行为可以以此向下传播执行通过再次调用执行的proceed方法。注意,这仍然在客户线程中完成,客户端仍然在等待原来的proceed调用返回。
In other words, the execution is always interpreted in the thread of the client. When the execution is not propagated any further, the nested proceed invocations start returning, since proceed should be the last operation in the behaviour's execute method. The proceed invocation blocks the caller until a wait state is reached.
换句话说,执行通常在客户端线程中解释。当执行不再传播,内嵌的proceed调用开始返回,因为proceed应该是behaviour的执行方法的最后一个操作。调用者的Proceed调用会阻塞,直到到达等待状态。
The default behaviour is to have synchronous executions because this limits the number of transactions and hence improves performance. But in case there are many automatic things to be done before the process execution reaches another wait state, asynchronous continuations (see the section of that name) have to be considered.
默认行为是同步执行,因为它限制了事务的数量,因此提升了性能。但是当有许多自动的事情要在流程执行到达另外的等待状态前完成时,应该考虑异步执行。
Interpretation
解释
The bulk of the interpretation of the process is delegated to the behaviour implementations. The only responsibility of the execution is to follow the transition to its destination node and update the node pointer accordingly. That way, the execution's node pointer will always indicate the current state.
大量流程的解释是在behaviour的实现中完成的。Execution的唯一职责是沿着转移到目的节点和更新节点位置。Execution的节点指示器会始终指向当前状态。
This is important because it means that the bulk of the behaviour of the engine is pluggable. The essence of interpreting a graph is baked in the component model. But all the rest is left up to the process construct implementations. That is why it is possible to build such diverse process languages on top of the same framework.
这一点非常重要,因为这意味着引擎的大量behaviour是可插拔的。解释图的本质是baked组件模型。但是,所有其他的留给了流程元素实现。这就是为什么能够在同一个框架之上构建多种多样的流程语言。
Another way of describing the responsibilities of the node implementations is the following: First, they can use any Java API to perform any kind of business logic, like sending emails, creating tasks, calling web services and so on. To do this, the implementations have access to contextual information about the process, such as process variables, services from the environment, and configuration information that gets injected into the member variables. After the business logic, node implementations can propagate the control flow of the process execution. The basic scenarios are wait states and propagation of the incoming execution over one leaving transition. But in more exotic scenarios, the node implementation can reorganise the complete runtime execution data structure as well.
下面是描述节点实现的职责的另一种方式:首先,能够使用任意Java API来执行任意业务逻辑,如发电子邮件,创建任务,调用web服务等等。要实现这样,实现要求能够访问流程的上下文信息如流程变量,环境的服务和注入成员变量的配置信息。业务逻辑执行完之后,节点实现可以传播流程执行的控制流。基本的场景是等待状态和根据一个离开转移传播执行。但是,在更加特定的场景中,节点实现也可改变整个运行时执行数据结构。
Extensions
扩展
The previous sections explained the basic operations of the Process Virtual Machine. In the next sections we'll describe a set of extensions to this basic model to show how advanced features can be built on top.
前面的章节解释了流程虚拟机的基本操作。后续的章节中,我们将阐述对基本模型的一组扩展来展现如何增加高级特性。
Variables
变量
Process variables can contain contextual information associated with a single execution. For example, in a payraise process, the desired amount and the reason could be process variables.
流程变量可以包含与单个execution关联的上下文信息。比如,在一个加薪流程中,加薪数目和理由可作为流程变量。
Process variables can be added to the basic Process Virtual Machine by associating a set of key value pairs within an execution.
流程变量可以以将一组key-value对关联到单个execution的方式加入流程虚拟机。
Processes or nodes can have variable declarations. Some process languages mandate variable declarations and others don't. That is why the key value pairs should be dynamic, just like a HashMap in Java. The variable declarations could be used at runtime to initialize or destroy a certain variable, or to check for its availability.
流程或节点可拥有变量声明。一些流程语言强制变量声明但是另外一些不强制。这种情况下的key-value对是动态的,就像Java中的hashmap。变量声明可以在运行时用来初始化,销毁或检测特定变量是否存在。
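A small sketch may help to picture the dynamic key value pairs together with optional declarations. Everything in it is an assumption made for illustration: the names PvmVariablesSketch, VariableDeclaration, setVariable, getVariable and missingVariables do not come from the paper, and a real process language would decide for itself how declarations are enforced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PvmVariablesSketch {

    /** Optional declaration of a variable; the shape is hypothetical. */
    static class VariableDeclaration {
        final String name;
        final Object initialValue; // null means "no initial value"

        VariableDeclaration(String name, Object initialValue) {
            this.name = name;
            this.initialValue = initialValue;
        }
    }

    static class Execution {
        // dynamic key value pairs: undeclared variables are accepted too
        private final Map<String, Object> variables = new HashMap<>();

        /** Uses the declarations to initialize variables when the execution starts. */
        Execution(List<VariableDeclaration> declarations) {
            for (VariableDeclaration declaration : declarations) {
                if (declaration.initialValue != null) {
                    variables.put(declaration.name, declaration.initialValue);
                }
            }
        }

        void setVariable(String key, Object value) {
            variables.put(key, value);
        }

        Object getVariable(String key) {
            return variables.get(key);
        }

        /** Uses the declarations to check availability; returns the names still missing. */
        List<String> missingVariables(List<VariableDeclaration> declarations) {
            List<String> missing = new ArrayList<>();
            for (VariableDeclaration declaration : declarations) {
                if (!variables.containsKey(declaration.name)) {
                    missing.add(declaration.name);
                }
            }
            return missing;
        }
    }

    public static void main(String[] args) {
        List<VariableDeclaration> declarations = List.of(
                new VariableDeclaration("amount", 0),
                new VariableDeclaration("reason", null));

        Execution payRaise = new Execution(declarations);
        System.out.println(payRaise.missingVariables(declarations)); // [reason]

        payRaise.setVariable("amount", 500);
        payRaise.setVariable("reason", "promotion");
        payRaise.setVariable("comment", "not declared, still accepted");
        System.out.println(payRaise.missingVariables(declarations)); // []
    }
}
```

A process language that mandates declarations could reject the undeclared "comment" variable in setVariable; one that does not can simply ignore the declarations.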
Process variables should be Plain Old Java Objects (POJOs) because the component programming model is in Java. Process languages like BPEL only store xsd types or XML snippets. In that case, the process language itself is responsible for translating those datatypes into the desired POJO type.
流程变量应该是POJO,因为组件编程模型是Java。流程语言像BPEL仅存储xsd类型或XML片段。在这种情况下,流程语言自己应负责转换这些数据类型为目标POJO类型。
Persistence is optional, so in general, it should be possible to store any POJO in the process variables. Only when an execution is stored should transformations be applied to map the POJOs to their persistable format.
持久化是可选的,因此,一般来讲,应该能够在流程变量中存储任何POJO对象。仅当execution存储的时候,可将POJO映射到它的持久化格式。
The section about Concurrent paths of execution will introduce a tree of executions. That tree also defines a natural scoping structure for process variables.
并行路径执行一章介绍了一种执行树。这种树也为流程变量定义了一种域结构。
Actions
动作
An action is the crucial concept that will enable modelling freedom for the analyst, while the developer has to make the process executable.
动作是非常重要的概念,它可使得分析师独立于模型,即使开发人员需要让流程可执行。
An action is a piece of programming logic that is inserted invisibly in the process graph. For example, perform a certain database update when leaving this node. Or, delete a certain file when this transition is taken. Actions can also be described as listeners to process events.
动作是不可视的插入流程图的一段编程逻辑,当离开当前节点时完成特定的db更新。或者,当转移选取时删除一个特定文件。动作可被描述为流程事件的监听器。
The Executable interface that is shown above can be used to refer to actions.
上面的Executable接口可以用来引用动作。
An event is a point in the process where actions can be specified. The three most common events are entering a node, leaving a node and taking a transition. As an implementation detail, we mention that the Execution class can be enhanced with a fire method in the API. That way, clients, actions and node behaviour implementations can all fire events. Despite the goal of being hidden from the graphical diagram, in Figure 8, we indicate how the actions are related to the diagram just to serve this explanation.
事件是流程中可以指定动作的点。三个最常用的事件是节点进入事件,节点离开事件和转移获取事件。作为实现细节,我们提及了Execution类可以在api中以增加一个fire方法的方式增强。客户端,动作,和节点行为实现可以触发事件。尽管目的被隐藏在符号图中了,在图八中,我们指出来动作是如何与图相关的。
Figure 8: Most common events
图八:最常用的事件
But it is good to use the combination of a process element and a plain event type string to identify an event. That way, many more event types can be defined and users can even define their own event types.
但是,最好使用流程元素和简单事件类型字符串的组合来标识一个事件。用这种方式,可以定义更多的事件类型,甚至用户可以定义他们自己的事件类型。
Events can also be propagated. The process can have a hierarchical structure with nested nodes. If a process language implements a construct such as UML super states, events on a nested node can be propagated to the enclosing super state. That propagation can continue recursively up to the level of the process definition itself. So super states or processes might listen to all events of a certain type that are fired inside them. For example, invoke this action for every transition that is taken within this super state.
事件也可以传播。流程可以带有内嵌的节点,组成分层结构。如果流程语言实现了一个构造如UML 超state,内嵌的节点上的事件可以传播到封闭的超状态。传播可以递归进行,直到流程定义级别为止。所以,超状态或流程可能监听所有触发的指定类型的事件。例如,为超状态中的每一次“获取转移”调用动作。
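The following sketch illustrates one way actions, events and event propagation could fit together. It is an assumption-laden illustration rather than the engine's API: the fire method is mentioned in the text, but its signature, the eventSource field, the enclosing reference, the "node-leave" event type string and the class name PvmEventsSketch are made up here. The checked exception of the Executable interface is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PvmEventsSketch {

    /** Same role as the Executable of Figure 3, without the checked exception. */
    interface Executable {
        void execute(Execution execution);
    }

    static class Node {
        final String name;
        final Node enclosing; // enclosing super state, or null at process level
        private final Map<String, List<Executable>> actions = new HashMap<>();

        Node(String name, Node enclosing) {
            this.name = name;
            this.enclosing = enclosing;
        }

        /** Registers an action as a listener to events of the given type. */
        void addAction(String eventType, Executable action) {
            actions.computeIfAbsent(eventType, key -> new ArrayList<>()).add(action);
        }

        List<Executable> actionsFor(String eventType) {
            return actions.getOrDefault(eventType, List.of());
        }
    }

    static class Execution {
        final List<String> log = new ArrayList<>();
        Node eventSource; // process element on which the current event was fired

        /** An event is identified by a process element plus an event type string. */
        void fire(String eventType, Node source) {
            eventSource = source;
            // run the actions on the source first, then propagate to enclosing nodes
            for (Node current = source; current != null; current = current.enclosing) {
                for (Executable action : current.actionsFor(eventType)) {
                    action.execute(this);
                }
            }
        }
    }

    public static void main(String[] args) {
        Node handling = new Node("handling", null);   // plays the role of a super state
        Node review = new Node("review", handling);   // nested node

        review.addAction("node-leave", e -> e.log.add("review: release lock"));
        handling.addAction("node-leave", e -> e.log.add("handling: left " + e.eventSource.name));

        Execution execution = new Execution();
        execution.fire("node-leave", review);
        System.out.println(execution.log);
        // [review: release lock, handling: left review]
    }
}
```

Because the event type is a plain string, a process language or a user can introduce new event types without changing the Node or Execution classes.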
Concurrent paths of execution
并行路径执行
One path of execution can point to one single location in the process graph. In some situations, there are parts in the process that can progress independently, for instance the shipping and billing parts of a process. In those scenarios, multiple paths of execution are needed to keep track of the state for one complete process execution. To support this, the executions can be structured in a parent-child relation. A new execution for a given process acts as the main path of execution. To create concurrent paths of execution, child executions can be created. Typically, the parent execution will remain inactive in the fork (aka and-split) while the child executions are executing.
执行的一个路径可以指向流程图中的单个位置。在一些情形下,流程的这些环节能够独立的执行。例如,发货和开发票环节。在这些场景中,需要execution的多个路径来跟踪一个完整的流程执行的状态。为了支撑这种情形,execution可以构造为父子关系。指定的流程的新执行可以作为执行的主要路径。要创建execution的并行路径,可创建子exection。典型的,当子执行正在执行时,父执行会保持“不活动”状态在分支(或and-split)。
Several fork and join behaviours can be implemented as node behaviours. The fork typically launches a new execution over each of its leaving transitions. Optionally, guard conditions might be added to prevent some of the concurrent executions from being created.
一些fork和join行为可以实现为节点行为。Fork通常会为每一个离开转移发起一个新执行。可选的是,可以增加条件来阻止创建一些并行执行。
Note that process concurrency mostly doesn't require multithreaded calculations. Assume for a second that no actions are involved and that all nodes right after the fork (aka and-split) behave as wait states, as illustrated in Figure 9. In that case, one database transaction should contain the operations to inactivate the parent execution and to add the two new child executions, each referencing its respective wait state.
注意流程并行通常不需要多线程执行。假定某一时刻没有执行任何动作,且fork(或and split)后面的所有节点都是等待状态,就像图9中的例子。在这种情况下,一个数据事务应该包含去激活父执行和增加两个新的子执行的操作。每个子执行应该引用各自的等待状态。
Figure 9: Concurrent paths of execution
图九:并行路径执行
Calculating the updates for that single transaction doesn't need to be done in two separate threads. In fact, those threads would have to be synchronized anyway because they operate on the same database transaction.
为那个独立的事务的更新计算不需要在两个独立的线程中完成。事实上,不论怎样,那些线程将会需要同步,因为它们操作同一个数据库事务。
The main difference is in the join behaviour. We'll describe two examples of join implementations just to illustrate that any type of join behaviour can be implemented. The Process Virtual Machine itself doesn't include a fixed mechanism for process concurrency. Instead, the different forms of concurrency can be implemented just like any other process construct.
主要的不同点在join行为中。我们将描述join实现的两个例子来说明任意类型的join行为可以被实现。流程虚拟机本身没有包括流程并行的一个固定机制。相反,不同形式的并行可以像其他任何流程元素一样实现。
As long as all paths leaving a fork arrive at the same join and no other paths can arrive at that join, a simple join implementation can be built that takes advantage of the hierarchical structure of the executions: When an execution arrives in a join, it first marks the incoming execution as deactivated and then checks if there are any active siblings left. If there are no active siblings left, the parent execution is reactivated, positioned in the join, and propagated over the default leaving transition.
只要来自同一个fork的所有路径到达同一个join,并且没有其他的路径可以到达那个join,一个简单的join实现可以利用执行的分层结构:当执行到达join,它首先标记进入的执行为去激活状态,然后检查是否存在活动的兄弟。如果没有活动的兄弟,父执行可以重新激活,定位在fork,然后执行通过默认的离开转移传播。
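The fork and the simple sibling-checking join can be sketched as two ordinary node behaviours. This is a hedged illustration only: the class PvmForkJoinSketch, the active flag, the children list and the FORK, JOIN and WAIT constants are assumptions of this example, and the checked exception of Executable is omitted. The sketch moves the reactivated parent to the join node before propagating it, and it runs entirely in the calling thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PvmForkJoinSketch {

    interface Executable {
        void execute(Execution execution);
    }

    static class Node {
        final String name;
        final Executable behaviour;
        final Map<String, Node> leaving = new LinkedHashMap<>();

        Node(String name, Executable behaviour) {
            this.name = name;
            this.behaviour = behaviour;
        }
    }

    static class Execution {
        Node node;
        boolean active = true;
        final Execution parent;
        final List<Execution> children = new ArrayList<>();

        Execution(Node node, Execution parent) {
            this.node = node;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        /** Takes the default transition (assumed: the first one declared). */
        void proceed() {
            take(node.leaving.values().iterator().next());
        }

        void proceed(String transitionName) {
            take(node.leaving.get(transitionName));
        }

        private void take(Node destination) {
            node = destination;
            destination.behaviour.execute(this);
        }
    }

    static final Executable WAIT = execution -> { };

    /** Inactivates the parent and launches one child over each leaving transition. */
    static final Executable FORK = execution -> {
        execution.active = false;
        List<Node> destinations = new ArrayList<>(execution.node.leaving.values());
        List<Execution> launched = new ArrayList<>();
        for (int i = 0; i < destinations.size(); i++) {
            launched.add(new Execution(execution.node, execution)); // create all children first
        }
        for (int i = 0; i < launched.size(); i++) {
            launched.get(i).take(destinations.get(i));              // then propagate each one
        }
    };

    /** Deactivates the arriving child; the last one to arrive reactivates the parent. */
    static final Executable JOIN = execution -> {
        execution.active = false;
        Execution parent = execution.parent;
        boolean siblingsActive = parent.children.stream().anyMatch(child -> child.active);
        if (!siblingsActive) {
            parent.active = true;
            parent.node = execution.node; // position the parent in the join
            parent.proceed();
        }
    };

    public static void main(String[] args) {
        Node start = new Node("start", WAIT);
        Node fork = new Node("fork", FORK);
        Node ship = new Node("ship", WAIT);
        Node bill = new Node("bill", WAIT);
        Node join = new Node("join", JOIN);
        Node done = new Node("done", WAIT);
        start.leaving.put("begin", fork);
        fork.leaving.put("shipping", ship);
        fork.leaving.put("billing", bill);
        ship.leaving.put("shipped", join);
        bill.leaving.put("billed", join);
        join.leaving.put("finish", done);

        Execution main = new Execution(start, null);
        main.proceed("begin");
        main.children.get(0).proceed("shipped");
        System.out.println(main.active + " " + main.node.name); // false fork
        main.children.get(1).proceed("billed");
        System.out.println(main.active + " " + main.node.name); // true done
    }
}
```

Creating all children before propagating any of them keeps the sibling check correct even when one branch is fully automatic and reaches the join immediately.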
Another approach is to count the number of executions that have arrived in a join. If that number equals the number of arriving transitions, a new execution will be created, positioned in the join and propagated over the default transition. In this case the complete execution tree needs to be pruned of inactive executions after a join has been triggered.
另一个方法是去计算已经到达join的执行的数目。如果该数目正好与到达转移的数目相同,则新创建一个执行,定位在join然后通过默认转移传播。在这种情况下,当join触发之后,该完整的执行树需要为非激活的执行修正。
This type of join implementation can handle unstructured combinations of forks and joins. I leave it up to the imagination of the reader to find the differences with the previously explained join behaviour. The most important point is that different types of fork and join behaviours can be implemented on top of the Process Virtual Machine.
该类型的join实现能够处理fork和join的无结构组合。我留给读者去发现其与前面阐述的join行为的差别。最重要的一点是不同类型的fork和join行为可以基于流程虚拟机实现。
Process composition
流程组成
Process composition means that a node in one process can reference another process. When an execution arrives in such a process node, that execution will wait in that node until an execution of the referenced process has completed.
流程组成意思是流程的一个节点可以引用另一个流程。当执行到达这样的流程节点时,执行会在那个节点等待,直到引用的流程执行完成。
Figure 10: Process composition
图十:流程组成
This can be added to the Process Virtual Machine by leveraging the hierarchical relation of the execution tree that was created for concurrent paths of execution. When an execution arrives in a process node, a child execution can be created for the referenced process, as illustrated in Figure 10.
可以将其通过像并行路径执行一样采用分层关系的执行树的方式增加到流程虚拟机中。当执行到达一个流程节点,如图10所示为引用的流程创建子执行。
A check needs to be added when executions complete: an execution that completes needs to check whether it has a parent execution. If it does, that parent execution needs to be propagated. The parent execution, the one that originally arrived in the process node, will then leave that process node and continue over the default transition.
当执行完成时需要增加一个检测。在这种情况下,完成的执行需要检查是否存在父执行。如果存在,执行需要被传播。最初到达流程节点的父执行会离开流程节点,通过默认的转移继续执行。
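A minimal sketch of this parent check, assuming a hypothetical Execution class in which the default transition of the process node is reduced to a stored destination name:

```java
// Illustrative sketch only; all names are hypothetical, not a PVM API.
class CompositionSketch {
    static class Execution {
        final Execution parent;
        String node;
        String resumeAt; // destination of the default transition leaving the process node
        boolean ended;

        Execution(Execution parent, String node) {
            this.parent = parent;
            this.node = node;
        }

        // The execution arrives in a process node: remember where to continue,
        // start the referenced process in a child execution, and wait.
        Execution enterProcessNode(String processNode, String subProcessStart,
                                   String defaultDestination) {
            this.node = processNode;
            this.resumeAt = defaultDestination;
            return new Execution(this, subProcessStart);
        }

        // The extra check on completion: if there is a parent, propagate it.
        void end() {
            ended = true;
            if (parent != null) {
                parent.node = parent.resumeAt;
            }
        }
    }

    public static void main(String[] args) {
        Execution order = new Execution(null, "start");
        Execution payment = order.enterProcessNode("pay", "payment-start", "ship");
        System.out.println(order.node); // pay
        payment.end();
        System.out.println(order.node); // ship
    }
}
```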
Asynchronous continuations
异步执行
Up to now, all propagation of the execution was done synchronously. In some situations that might be problematic. For example, suppose a process in which a large PDF file has to be generated automatically after a user has completed a task. Most likely, the invocation of execution.proceed() will be done in the request-response cycle of the web application for the user. In that case, the web response is unnecessarily delayed by the generation of the PDF because that is done inside the proceed method invocation. In general, this means that the client that calls Execution.proceed will be blocked until the process has reached a new wait state.
到目前为止,执行的传播是同步进行的。在某些场景下市有问题的。例如,假设当用户完成一个任务后,流程需要自动生成一个大pdf文档。很有可能的是,execution.proceed()的调用是用户在请求-应答模式的web应用下进行的,在这种情况下,web应答将不必要的为pdf的生成而等待,因为pdf生成是在proceed方法调用中完成的。一般地,这意味着调用Execution.proceed的客户端会阻塞,直到流程到达新的等待状态。
The Process Virtual Machine can be extended to handle these situations by using an asynchronous message queue. To see how that works, we first must define what the atomic operations are. Atomic operations cannot be interrupted. Here, we'll define two atomic operations: Executing a node and taking a transition. So execution of a node and of a transition cannot be interrupted. But in between those atomic operations, a thread can decide to stop the interpretation and instruct another thread to continue the interpretation asynchronously from that point forward.
流程虚拟机可以使用异步消息队列扩展用来处理这些场景。为了明白它如何工作,我们首先必须定义什么是原子操作。原子操作不能中断。这里,我们会定义两个原子操作:执行一个节点和获取一个转移。因此,执行一个节点和执行一个转移不能中断。但是在这些原子操作之间,线程可以决定停止解释,通知另一个线程来从那一点向前继续异步解释。
In the node and transition elements, an extra configuration flag can be added. Let's call it the asynchronous flag. When a node has the asynchronous flag set, it will be executed asynchronously at runtime. Similar for transitions.
在节点和转移元素中,可以增加一个额外的配置项。我们可称它为异步标记。当一个节点设置了异步标记,它将在运行时异步执行。转移也是一样的。
Remember that the execution is responsible for interpreting the graph, finding the destination node of a transition and executing the behaviour of that node. Now, when an execution arrives in an asynchronous node, the node will not be executed. Instead, a message is sent over an asynchronous messaging system. The message contains a reference to the execution and the instruction to execute that node. After that, Execution.proceed returns and the transaction, which contains both the process updates and the production of the asynchronous message, can be committed.
记住执行的职责是解释图,找到转移的目的节点和执行那个节点的行为。现在,当执行到达异步节点时,该节点不会执行。但是,会通过异步消息系统发送一个消息。该消息包括执行的一个引用和执行那个节点的指示。然后,execution.proceed返回,包含流程更新和异步消息产生的事务可被提交。
At the other end of the queue, there is a component called the job executor. That component will start a new transaction, fetch a message from the queue and perform the instructions in the message. In our scenario, this means that the node will be executed. When that is done, the transaction will be committed. That transaction included consumption of the message and the new process updates.
在这个队列的另一端,有一个叫job执行器的组件。该组件会启动一个新事务,从队列中获取一个消息,按照消息的指示执行。在我们的场景中,这意味着会执行那个节点。当这个完成后,事务会被提交。这个事务包括消息消费和新的流程更新。
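The hand-over between the calling thread and the job executor could be sketched as follows. This is a single-threaded illustration under assumptions: the Engine, Message and Execution classes are hypothetical, an in-memory queue stands in for the messaging system, and transactions are only indicated in comments.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class AsyncContinuationSketch {
    static class Execution {
        String node;
        final List<String> executedNodes = new ArrayList<>();
    }

    // The message carries a reference to the execution and the node to execute.
    static class Message {
        final Execution execution;
        final String node;

        Message(Execution execution, String node) {
            this.execution = execution;
            this.node = node;
        }
    }

    static class Engine {
        final Set<String> asyncNodes; // nodes that carry the asynchronous flag
        final Queue<Message> queue = new ArrayDeque<>();

        Engine(Set<String> asyncNodes) {
            this.asyncNodes = asyncNodes;
        }

        // The execution arrives in a node: execute it now, or hand it to the queue.
        void arrive(Execution execution, String node) {
            execution.node = node;
            if (asyncNodes.contains(node)) {
                queue.add(new Message(execution, node)); // part of the caller's transaction
            } else {
                execute(execution, node);
            }
        }

        void execute(Execution execution, String node) {
            execution.executedNodes.add(node); // stands in for the node behaviour
        }

        // Job executor: consume one message and execute it (one transaction per job).
        boolean runNextJob() {
            Message message = queue.poll();
            if (message == null) {
                return false;
            }
            execute(message.execution, message.node);
            return true;
        }
    }

    public static void main(String[] args) {
        Engine engine = new Engine(Set.of("generatePdf"));
        Execution execution = new Execution();
        engine.arrive(execution, "generatePdf");     // returns immediately
        System.out.println(execution.executedNodes); // []
        engine.runNextJob();                         // done later by the job executor
        System.out.println(execution.executedNodes); // [generatePdf]
    }
}
```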
Figure 11: Asynchronous continuations
图十一:异步执行
This can also be seen as a kind of transaction demarcation. Because by default the interpretation of the graph will continue until a wait state is reached. But with asynchronous continuations, it's possible to mark a position in the process where the current transaction should be committed and a new transaction should be started automatically.
这也可以看做一种事务划分。因为默认情况下图的解释会继续进行,直到到达等待状态。但是,有了异步执行,就可以在流程中标记一个位置,哪里应该提交当前的事务,哪里应该自动启动一个新事务。
The tradeoff for asynchronous continuations is exception handling versus response times. We already saw in the beginning of this section that response time can be significantly reduced by introducing asynchronous continuations. But on the other hand, the downside can be that the original caller is not aware of any problems that might arise further in the process. For example, a task node is followed by an email notification node and suppose that it is crucial that this notification is sent. When the email notification node is marked as asynchronous, the task completion will be faster, but that user might not get a notification when the mail server is down.
权衡是否采用异步执行的原则是异常处理和响应时间。我们已经在本章的开始看到响应时间能够采用异步执行大大的减少。但是从另一方面讲,缺点是调用者无法发现流程中可能发生的任何问题。例如,一个任务节点后面有一个email通知节点,假定这封邮件的发送非常重要,如果email通知节点标记为异步,任务完成会非常快,但是如果邮件服务器当掉了,用户将无法得到通知。
In case an exception occurs in the job executor, an automatic retry mechanism could be used. Most message queueing systems have a configurable number of retries before the message is sent to the dead letter queue. Administrators could monitor that dead letter queue or a timer could be added to notify the stakeholders of the process execution in case of asynchronous failure.
如果job执行器发生了异常,可使用自动重试的机制。多数消息队列系统在消息被送到死信队列前可配置一定数量的重试。在异步失败的情况下,管理员可以监控死信队列或者可以增加一个定时器来通知流程执行的涉众。
Process updates
流程更新
A set of process updates describe modifications to a particular process. For example additions and removals of nodes, transitions or actions.
流程更新描述对特定流程的更改。例如,增加和移除节点,转移或动作。
With process updates, the Process Virtual Machine can be extended for two new use cases: process inheritance and per-instance process customizations.
通过流程更新,流程虚拟机可以为两个新用户用力扩展:流程继承和单个实例流程定制。
Suppose a situation where each country has a process describing how banks must make an effort to detect suspicious stock transactions. And suppose that the basic layout of all those processes is the same, with only some minor details that differ. In that case, it might be an option to describe a single default process and then define a kind of process inheritance. A specialized process might be defined as a reference to another process plus a set of updates. Of course, the execution's interpretation algorithm has to be modified to take these process modifications into account.
假定一个场景,每个国家应有个流程用于检测可疑的股票交易。假定大多数的流程的设计是相同的,仅有很少的细节不一样。在这种情况下,可以选择描述一个默认的流程,然后定义一种流程继承。一个特定的流程可能定义为引用一个其他的流程加一些更新。当然,执行的解释算法应考虑相应的流程修改。
The second use case is updates per execution. Suppose that a user wants to monitor all progress for a specific execution. In that case, he could add an action on a process definition for the transition event. That means that the notification action will be executed for each transition that is taken in the whole process.
另一个用户用例是更新每个执行。假定一个用户想要监控一个特定执行的所有进展。在这种情形下,他可以在流程定义上为转移事件增加一个动作。这意味着这个动作会在整个流程的每个转移获取时被执行。
Another example of updates per execution is removing transitions for a given execution. Suppose that the process handles urgent packets differently from normal packets. Now imagine that early in the process, a user recognizes a client who always tries to get the fast delivery while he's not entitled to it. This might be implemented with process updates by letting the user remove the transition for urgent handling.
更新每个执行的另一个例子是移除给定执行的转移。假定在流程中,会区别处理紧急邮包和正常邮包。现在想象在流程的早期,一个用户能够认识总是尝试取得最快的邮递的客户,但是他有没有这样的权利,这可以通过让用户移除用作紧急处理的转移的流程更新方式来实现。
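Both use cases can be read as "base process plus a list of updates". A rough sketch, assuming that transitions are represented as simple "from->to" strings and that an update is any function that modifies the set of transitions (both are simplifications made for illustration):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class ProcessUpdateSketch {
    // Computes the transitions an execution sees: the shared base definition
    // with the updates applied on a copy, so the base is never modified.
    static Set<String> effectiveTransitions(Set<String> base,
                                            List<Consumer<Set<String>>> updates) {
        Set<String> result = new LinkedHashSet<>(base);
        for (Consumer<Set<String>> update : updates) {
            update.accept(result);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> base = Set.of("receive->normal", "receive->urgent");
        // Per-execution update: this client is not entitled to urgent handling.
        List<Consumer<Set<String>>> updates = List.of(t -> t.remove("receive->urgent"));
        System.out.println(effectiveTransitions(base, updates)); // [receive->normal]
    }
}
```

The same function covers process inheritance when the updates belong to a specialized process definition instead of a single execution.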
Figure 12: Process update for a single execution
图12:为单个执行的流程更新
History
历史
While a process is being executed, history logs can be generated. Those history logs can contain information about when transitions were taken, when nodes are entered and when nodes are completed. Also the process variable updates and tasks can produce logs. Basically, the whole execution of a process can be logged.
One way to add configurability and flexibility is to define a logging service. The execution and the nodes can then generate log objects in the form of Java objects and pass them to the logging service. That way, log filtering and transformations can be applied centrally. Also different forms of storage can be configured in the logging service.
当执行流程时,可生成历史日志。历史日志可包含信息:转移何时获取,何时进入节点和何时完成节点,还包括流程变量更新和任务生成日志。基本的,整个流程的执行可被记录。
增加可配置型和灵活性的一种方式是定义一个日志服务。执行和解点可以以Java对象的方式生成日志对象,然后传递他们到日志服务,用这种方式,日志过滤和转换可以集中应用。不同形式的存储也能够配置在日志服务中。
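Such a logging service might be sketched as an interface with a filtering in-memory implementation. The names LoggingService, LogEntry and InMemoryLoggingService are assumptions made for illustration; a real implementation could write to a database instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class LoggingServiceSketch {
    static class LogEntry {
        final String type;   // e.g. "transition", "node-enter", "variable-update"
        final String detail;

        LogEntry(String type, String detail) {
            this.type = type;
            this.detail = detail;
        }
    }

    // The execution and the nodes only depend on this interface.
    interface LoggingService {
        void log(LogEntry entry);
    }

    // One possible implementation: central filtering, storage in memory.
    static class InMemoryLoggingService implements LoggingService {
        private final Predicate<LogEntry> filter;
        final List<LogEntry> stored = new ArrayList<>();

        InMemoryLoggingService(Predicate<LogEntry> filter) {
            this.filter = filter;
        }

        @Override
        public void log(LogEntry entry) {
            if (filter.test(entry)) {
                stored.add(entry);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryLoggingService service =
                new InMemoryLoggingService(e -> !e.type.equals("variable-update"));
        service.log(new LogEntry("transition", "review -> approve"));
        service.log(new LogEntry("variable-update", "amount = 100"));
        System.out.println(service.stored.size()); // 1
    }
}
```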
The most important use case for history logging is explained in more depth in section Business intelligence (BI).
历史记录最重要的用户用例在BI章节中更深入的解释了。
If the logs keep complete track of all the updates that are done in a process, the logs can be used to generate compensating transactions. A compensating transaction means that a new transaction will undo the effects of previous transactions. A technique to do this could be to walk through the logs in reverse order and restore the begin state as indicated in the log entries.
如果日志完整的记录了流程中所有的更新,日志可以被用来生成补偿事务。补偿事务的意思是一个新事物会用于取消前一个事务的所有效果。一种实现技术可以是反向遍历日志然后恢复日志实体中指定的开始状态。
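The reverse walk could be sketched for process variables as follows. It assumes, purely for illustration, that every log entry records the value before the update and that a missing variable is logged as null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class CompensationSketch {
    static class VariableUpdate {
        final String name;
        final Object before; // null means the variable did not exist yet
        final Object after;

        VariableUpdate(String name, Object before, Object after) {
            this.name = name;
            this.before = before;
            this.after = after;
        }
    }

    // Applies an update and records the begin state in the log.
    static void set(Map<String, Object> variables, List<VariableUpdate> log,
                    String name, Object value) {
        log.add(new VariableUpdate(name, variables.get(name), value));
        variables.put(name, value);
    }

    // Walks the log in reverse order and restores each recorded begin state.
    static void compensate(Map<String, Object> variables, List<VariableUpdate> log) {
        for (int i = log.size() - 1; i >= 0; i--) {
            VariableUpdate update = log.get(i);
            if (update.before == null) {
                variables.remove(update.name);
            } else {
                variables.put(update.name, update.before);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("status", "new");
        List<VariableUpdate> log = new ArrayList<>();
        set(variables, log, "status", "approved");
        set(variables, log, "approver", "john");
        compensate(variables, log);
        System.out.println(variables); // {status=new}
    }
}
```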
Persistence
持久化
Up to now, the Process Virtual Machine was explained in terms of objects, without any reference to persistence. This is possible because of Object Relational Mapping (ORM) technologies such as Hibernate. ORM can translate between objects and database records. On the Java platform the database is accessed using JDBC. Object relational mappers can also abstract away the differences between the SQL dialects of databases.
到目前为止,用对象的方式解释了流程虚拟机,没有提及持久化。可以通过对象关系映射技术如hibernate来实现。ORM可以转换对象和数据库记录。在Java平台上,使用JDBC来访问数据库。对象关系映射也可以用来屏蔽数据库的方言的差异。
The algorithms we have defined for process interpretation never use the database directly. Persistence of processes and executions can therefore be offered as a separate service. That way, process languages that don't need persistence, such as pageflow, are not required to use a database.
我们为流程解释定义的算法不会直接使用数据库。流程和执行的持久化可以由一个独立的服务提供。用这种方式,流程语言不需要持久化,比如,pageflow不需要使用数据库。
The data in a workflow database can be split into three main compartments, as illustrated in Figure 13: static process definition information, runtime execution information and history logs.
工作流数据库中的数据可以如图13所示分为三个主要部分:静态流程定义信息,运行时执行信息和历史日志。
Figure 13: The three parts of a workflow database
图13:工作流数据库的三部分
The object relational mapper will take a complete process object graph and translate that into records in the database. That way, the complete process structure can be stored in the database. For example, the node objects in the process object graph will be inserted as records in the nodes table. Since process definition information normally doesn't change, it can be cached in memory. In jBPM's case that is done using the second level cache of Hibernate.
对象关系映射会获取一个完整的流程对象图,转化为数据库记录。完整的流程结构可以存储在数据库中。例如,流程对象图中的节点对象会作为节点表的记录插入。因为流程定义信息一般不会改变,可以将它缓存在内存中。JBPM中是使用了hibernate的二级缓存。
The runtime process execution information is typically updated in every transaction that includes a workflow operation because the execution's node pointer will have moved to a next node. In Figure 14, you can see the scenario of the workflow persistence operations in one transaction.
运行时流程执行信息通常在每个事务中更新,包括工作流操作,因为执行的节点指针会移到下一个节点。在图14中,你能够看到工作流在一个事务中的持久化操作的场景。
In this scenario, the starting situation is that an execution was already persisted in the database and now an external client wants to provide an external trigger to resume the process execution. So the execution record in the database has a foreign key column that references the node in the nodes table. To get started, the client needs the id of the execution.
在这个场景中,初始的情形是执行已经持久化在数据库中,现在,一个外部客户端想要提供一个外部的出发来恢复流程的执行。因此数据库中的执行记录有一个引用节点表中的节点的外键。客户端需要执行的id来启动。
In the first step, the client starts a transaction and uses the id to load the execution object through the object relational mapper.
在第一步中,客户端启动一个事务,通过使用ORM使用id装载执行对象。
Figure 14: The persistence scenario of a runtime workflow transaction
图14:运行时工作流事务的持久化场景
In the second step, the client will invoke some methods on the execution object. Typically this will be the proceed-method. In general, methods will be invoked on the runtime data structure. After those method invocations, the execution is pointing to a new node. This second step was executed without any consideration of persistence.
在第二步中,客户端会调用执行对象的一些方法。通常会是proceed方法。通常,运行时数据结构的方法会被调用。在这些方法调用之后,执行会指向到一个新节点。这个第二步实在没有任何考虑持久化的前提下执行的。
In the third step, the changes made to the execution object will be saved in the database. It's the task of the object relational mapper at this stage to analyse the execution Java object structure and compare it with the data that was originally loaded from the database. The object relational mapper will then issue the necessary insert, update and delete SQL statements to bring the database in sync with the execution object graph.
在第三步中,执行对象的更改会保存到数据库中。分析执行java对象结构和对比它与最初从数据库中装载的对象的数据是数据关系映射工具的任务。对象关系映射工具然后会执行必要的插入,更新和删除sql语句,使数据库与执行对象图同步。
In the case where an execution just moves to another node, the result will be a single update statement in which only the foreign key column of the execution is changed to the id of the node to which the execution points after the method invocation.
Then, the transaction can be committed.
在执行仅仅移动到另一个节点的情况下,结果会仅仅是单个更新语句更新执行表的外键列。
Another aspect of object relational mapping solutions that deserves attention in this context is optimistic locking. ORM solutions like Hibernate have a built-in mechanism for optimistic concurrency control. A version column is added to the database and each time an object is updated, the version number is increased as well. With each update, an extra condition is added to the where clause that specifies the version of the object that was originally loaded. If execution of that SQL statement reports that zero rows were updated, the transaction knows it was working with stale data and should be rolled back. For more about this, see the Hibernate reference manual.
在当前上下文中值得注意的对象关系映射方案的另一个方面是乐观锁。ORM方案像hibernate有内置的乐观并发控制机制。数据库中会增加一个版本列,每次更新对象时,版本数目会相应的增加。但是对每次更新,一个额外的用来指定初始装载的对象的版本的条件会增加在where子句中。如果那条sql语句执行返回0行,则事务知道他使用了脏数据,事务应该回滚。可通过Hibernate参考指南了解更多这方面的知识。
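The version check can be illustrated without a database. The sketch below mimics the effect of a versioned update statement on a single in-memory row; the Row class and the column names in the comment are assumptions made for illustration, not the SQL that Hibernate actually generates.

```java
// Illustrative sketch only; it imitates the version check in memory.
class OptimisticLockSketch {
    static class Row {
        String nodeId;
        int version;

        Row(String nodeId, int version) {
            this.nodeId = nodeId;
            this.version = version;
        }
    }

    // Mimics a statement of the form (hypothetical table and column names):
    //   UPDATE execution SET node_id = ?, version = version + 1
    //   WHERE id = ? AND version = ?
    // Returns the number of updated rows: 1 on success, 0 for stale data.
    static int update(Row row, String newNodeId, int loadedVersion) {
        if (row.version != loadedVersion) {
            return 0; // another transaction updated the row first
        }
        row.nodeId = newNodeId;
        row.version++;
        return 1;
    }

    public static void main(String[] args) {
        Row row = new Row("review", 7);
        int loadedByA = row.version; // both clients load version 7
        int loadedByB = row.version;
        System.out.println(update(row, "approve", loadedByA)); // 1
        System.out.println(update(row, "reject", loadedByB));  // 0, B must roll back
    }
}
```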
The result of all this optimistic locking is that process engines based on the Process Virtual Machine can scale by synchronizing on the database, using this lightweight concurrency control mechanism. As long as they all work on the same database and use this optimistic concurrency control, all process updates coming from multiple systems will be synchronized.
但是,乐观锁的结果是基于流程虚拟机的流程引擎能够通过同步数据库的方式扩展,使用这种轻量级的并发控制机制。只要他们都是操作同一个数据库且使用乐观并发控制,来自不同系统的所有流程更新会同步。
Timers
定时器
To add timers to the Process Virtual Machine, a timer service is required that can schedule timers to be executed in the future. Timers must have the ability to contain some contextual information and reference the program logic that needs to be executed when the timer expires.
为了给流程虚拟机增加定时器,需要一个定时器服务,它可以调度将来执行的定时器。定时器必须有能力获取一些上下文信息和引用定时器过期时要执行的程序逻辑。
Let's look at the typical scenario where a wait state needs to be monitored with a timer. For example, if a user task is not completed within 2 days, notify the manager.
让我们来看看需要用一个定时器来监控的等待状态的典型场景。例如,如果用户任务没有在两天内完成,通知经理。
To implement this, the timer service as described above can be addressed from actions on the node-enter and node-leave events, as indicated in Figure 15. When the node is entered, a timer is created with a given due date and it is scheduled with the timer service. Then the node will behave as a wait state until an external trigger is given.
为了实现它,上面描述的定时器服务可以用动作来解决,动作可基于图15中指定的节点进入和节点离开事件。当节点进入时,使用给定的过期日期创建一个定时器,然后用timer服务来调度它。然后节点会进入等待状态,直到一个外部触发发起。
Figure 15: A node with a timer
图15:带一个定时器的节点
Suppose now that an external trigger is given before the due date of the timer. Then the execution will leave the node and the cancellation action will be executed. This action will cancel the timer that was created upon entering the node.
If on the other hand, the external trigger is not given, the execution will still be positioned in the node when the timer fires. In that case, the programming logic of the timer is executed. In our example that was the notification to the manager.
假定在定时器的过期日期之前,发起了一个外部触发。然后,执行会离开节点,取消动作会被执行。这个动作会取消在进入节点时创建的定时器。
在另一方面,如果没有外部触发,执行会仍然指向定时器激活的节点。在这种情形下,定时器的编程逻辑会被执行。在我们的例子中,会通知经理。
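Both outcomes can be sketched with a small in-memory timer service. All names are assumptions made for illustration, and the clock is passed in explicitly so that the example does not depend on real time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class TimerSketch {
    static class ScheduledTimer {
        final long dueMillis;
        final Runnable action; // the program logic to run when the timer expires

        ScheduledTimer(long dueMillis, Runnable action) {
            this.dueMillis = dueMillis;
            this.action = action;
        }
    }

    static class TimerService {
        private final Map<String, ScheduledTimer> timers = new HashMap<>();

        void schedule(String id, long dueMillis, Runnable action) {
            timers.put(id, new ScheduledTimer(dueMillis, action));
        }

        void cancel(String id) {
            timers.remove(id);
        }

        // Fires and removes every timer whose due date has passed.
        void fireDue(long nowMillis) {
            List<ScheduledTimer> due = new ArrayList<>();
            timers.values().removeIf(t -> {
                if (t.dueMillis <= nowMillis) {
                    due.add(t);
                    return true;
                }
                return false;
            });
            for (ScheduledTimer timer : due) {
                timer.action.run();
            }
        }
    }

    // Action on the node-enter event: create and schedule the timer.
    static void onNodeEnter(TimerService service, String executionId,
                            long nowMillis, long delayMillis, Runnable action) {
        service.schedule(executionId, nowMillis + delayMillis, action);
    }

    // Action on the node-leave event: cancel the timer created on entering.
    static void onNodeLeave(TimerService service, String executionId) {
        service.cancel(executionId);
    }

    public static void main(String[] args) {
        TimerService service = new TimerService();
        long twoDays = 2L * 24 * 60 * 60 * 1000;
        onNodeEnter(service, "exec-1", 0, twoDays,
                () -> System.out.println("notify the manager"));
        service.fireDue(twoDays); // task not completed in time: prints the notification
    }
}
```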
Services
服务
In this section about services, we'll discuss an implementation aspect of the Process Virtual Machine that is crucial for the embeddability of the engine inside the client application.
Whenever the execution or the node behaviour implementation needs some kind of service, always consider using a self made interface. Examples of such services are the asynchronous message service and the timer service as discussed above. That way, different implementations can be provided in different environments. For example, in a Java enterprise environment, the asynchronous message service will probably need to be tied to JMS. Whereas in a standard environment this might be done using an in-memory queue.
在关于服务的这一章中,我们会讨论流程虚拟机的实现方面,这对将引擎嵌入到客户应用中非常重要。
每当执行或节点的行为实现需要某些服务时,通常考虑使用自定义的接口。这种服务的一个例子是异步消息服务和上面讨论的定时器服务。基于这样,在不同的环境中可以提供不同的实现。例如,在Java企业环境中,异步消息服务会可能是JMS。如果是在标准环境中,这可能会使用内存中的队列。
A simple way to make all external services available to the execution and all the node behaviour implementations is to add a context property to the execution. The context can manage a set of named services and objects. The client knows in which environment it is running, so it constructs the context object and injects it in the execution before a method is invoked on it.
使所有外部服务对执行和所有节点行为实现可用的一个简单办法是为执行增加一个上下文属性。上下文可管理一组命名服务和对象。客户端知道它们在哪种环境中运行,因此它构造上下文对象,在调用执行的方法前将它注入到执行中。
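A context with named services might look like the sketch below. The Context class, the service name "message" and the MessageService interface are assumptions made for illustration; the in-memory implementation stands in for what a standard environment might use, where an enterprise environment could bind a JMS-backed implementation under the same name.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class ContextSketch {
    // Self-made interface: the engine never depends on a concrete technology.
    interface MessageService {
        void send(String message);
    }

    static class InMemoryMessageService implements MessageService {
        final Queue<String> queue = new ArrayDeque<>();

        @Override
        public void send(String message) {
            queue.add(message);
        }
    }

    // Manages a set of named services and objects.
    static class Context {
        private final Map<String, Object> services = new HashMap<>();

        void set(String name, Object service) {
            services.put(name, service);
        }

        <T> T get(String name, Class<T> type) {
            return type.cast(services.get(name));
        }
    }

    static class Execution {
        Context context; // injected by the client before a method is invoked

        void proceed() {
            context.get("message", MessageService.class).send("execute next node");
        }
    }

    public static void main(String[] args) {
        InMemoryMessageService messages = new InMemoryMessageService();
        Context context = new Context();
        context.set("message", messages);

        Execution execution = new Execution();
        execution.context = context;
        execution.proceed();
        System.out.println(messages.queue); // [execute next node]
    }
}
```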
This mechanism can be given an extra dimension when an Inversion of Control (IoC) container is used for the context. In that case, transactional services can be lazy created on demand during execution of a process. The typical XML based configuration files of the IoC containers can be leveraged to specify which implementations for the services need to be used.
当为上下文使用了控制反转容器,这种机制可以给定一个额外的方面。在这种情形下,事务服务可以在流程执行时惰性创建。典型的基于XML的IOC容器配置文件能够用来指定使用服务的哪种实现。
Features
特性
In this section we explain the essential features of process languages and how they relate to the Process Virtual Machine. Where the PVM basics and extensions explains a strategy on implementing a common foundation for process engines, this section describes more the essential features of BPM Systems and how the Process Virtual Machine relates to those features.
本章中我们会解释流程语言的基本特性和它们如何与流程虚拟机联系起来。前述流程虚拟机的基本和扩展章节阐述了实现流程引擎的公共基础的策略,本章会介绍更多的BPM系统基本特性和如何使流程虚拟机关联到这些特性。
Process modeling
流程建模
First we need to distinguish between two categories of processes: descriptive process models and executable processes.
The purpose of descriptive process models is to describe business processes to other people. They support various kinds of notations like BPMN, Event-driven Process Chains (EPC) or UML activity diagrams. The main goal of these languages is to define precise semantics of the notational elements, resulting in an expressive graphical language. The modeller can use the graphical constructs with great freedom for the purpose of communicating the process to the reader. Apart from the expressive notation, they may also provide value in managing large sets of interrelated diagrams and models. These processes are not executable on a runtime environment.
我们首先要区分两类流程:描述性流程模型和可执行的流程模型。
描述性流程模型的目的是向其他人员描述业务流程。存在许多种类的图元支持它,像BPMN,事件驱动的流程链(EPC)或UML活动图。这些语言的主要目的是定义图元的准确语义,产生一个用于表示的图形化语言。建模人员可以自主使用这些图形构造来达到将流程传达给读者的目的。除了表示符号,可能也会在管理大量的相关联的图和模型方面提供价值。这些流程不能在运行环境执行。
The second category aims to make process models executable on a runtime environment, also known as a process engine. In this case, executable process models are in fact software artifacts that specify the behaviour of a computer system. In that respect executable processes are exactly the same as programs in an Object Oriented (OO) programming language such as Java, even though the format and language are completely different. So in this case, the executable process as a whole is not so free any more, because it has to specify a particular desired behaviour of a software system.
第二类关注于使流程模型在运行环境如流程引擎可执行。在这种情形下,可执行的流程模型指定了计算机系统的行为,是事实上的软件器件。在这些方面,可执行的流程与面向对象语言如Java中的程序完全一样,即使格式和语言完全不同。因此在此情形下,可执行的流程作为一个整体不再如此随意(指相对描述性流程而言),因为它需要指定软件系统的特定目标行为。
The Process Virtual Machine defines a model for the implementation of process engines. As for any process that is executable on a process engine, processes for the Process Virtual Machine are a combination of graphical process elements and related technical details. So the graphical diagram of an executable process can be seen as a kind of projection that excludes those technical details.
The big value of this combination is that the process diagram can serve as a common language between all stakeholders of a process, regardless of their technical skills. This is shown in Figure 16.
流程虚拟机定义了流程引擎实现的模型。任何在流程引擎中可执行的流程都是由图形化的流程元素和相关的技术细节组成的。因此,可执行的流程的流程图可被看作一种不包括技术细节的投影。
这种组合的巨大价值是流程图可以作为面向流程的所有涉众的公共语言,而不需要关心他们的技术能力。如图16所示:
Figure 16: Technical details of a process
图16:流程的技术细节
Business analysts have very different technical skills. Some have none. Some have a little bit of technical skill from a previous life as a developer, and other people might actually combine the roles of developer and business analyst. This translates into Figure 16 as the dotted line that can shift up or down.
业务专家有完全不同的技术能力。一些没有,一些从事过开发者职业的会有一些技术能力,其它的可能同时承担开发者和业务专家的角色。图16中的可上下移动的虚线反映了这一点。
To translate that into the concepts of the Process Virtual Machine, there are three levels of detail.
要转换那些信息成流程虚拟机的概念,存在三级细节。
1. Process structure: These are the nodes and transitions. The graphical structure is the highest level that can always serve as a communication starting point. Even for technical illiterates.
流程结构:节点和转移。图形化结构通常是可作为交流的起点的最高级别。对技术文盲也是一样。
2. Node type: In process engines, the node type defines an exact runtime behaviour. In the Process Virtual Machine this corresponds to the node behaviour implementations. That is the second level of detail. This typically also corresponds to the process constructs available in the palette of the graphical editor. For example, a task node, a decision, a web service invocation node and so on.
节点类型:在流程引擎中,节点类型定义了额外的运行时行为。在流程虚拟机中有对应的节点行为实现。这是二级细节。这通常也对应到图形化编辑器的控件面板中的流程构件。例如,一个任务节点,一个决策,web服务调用节点等等。
3. Configurations: The finest level of detail is the configurations that customize the process construct for a particular use of the node type. In terms of the Process Virtual Machine, this corresponds to the member field values that are injected into the Executable behaviour implementation objects. In graphical editors, usually a form pops up when selecting a node in the diagram to enter those configuration details. Examples of configurations are the URL for a web service invocation, the role or swimlane for a task assignment, the condition for a decision and so on.
配置:配置是最后级别的细节,它定制流程构件的节点类型的特定用法。从流程虚拟机的方面来讲,这对应到注入到Executable行为实现对象的成员变量值。在图形化编辑器中,通常在图中选择一个节点时会弹出一个窗口来输入这些配置。例如调用的web服务的URL,用于任务分派的泳道的角色,决策节点的条件等等。
On top of these three traditional levels, the Process Virtual Machine has two features that we want to highlight that still give a great deal of modelling freedom to the business analyst, even for executable business processes.
First of all, if the desired behaviour of a node in a process doesn't match any of the available process constructs in a process language, a developer is always free to write program logic to implement custom behaviour in the nodes. Secondly, in case the developer needs to add a piece of programming logic to make the process executable and this is of no interest to the business analyst, an action can be used to inject this piece of program logic without a change in the graphical representation of the diagram. This provides an extra level of separation between the graphical representation and the technical execution details, keeping as much modelling freedom as possible for the business analyst.
在这经典的三级之上,流程虚拟机有两个特性,我们要特别指出业务专家仍然有很大的自由来建模,对可执行的业务流程也是一样的。
首先,如果流程节点的预期行为不匹配流程语言中的任何流程构造,开发者通常会有编写程序逻辑来定制节点的行为的自由。其次,当开发者需要增加程序逻辑来让流程可执行,业务专家对这些没有兴趣,可以使用动作来注入这些编程逻辑而不需要改变图的图形化展现。这样分离了图形化展现和执行技术细节,让模型对业务专家更自由。
Business Intelligence (BI)
商业智能(BI)
Business intelligence is about extracting information from the history of process executions that is useful for business people. As described above in History, a lot of information can be captured from process executions. Each time an external trigger is given or when a transition is taken, this can be logged in the history database.
商业智能是从流程执行历史中提取对业务人员有用的信息。按照前面历史章节中描述的,可以从流程执行中捕获许多信息。每次给定一个外部出发或当转移发生时,可以记录在历史数据库中。
Now the nice thing is that from those history logs, meaningful business level information can be captured. For example, "How long does each step in the process take on average?" or "What percentage of priority claims are handled within two weeks?" It is not a coincidence that these meaningful questions can be easily answered by the workflow engine's history logs because the graphical diagram was built in collaboration with business people.
There are many ways in which that historic information can be logged and processed. While this is outside the scope of the Process Virtual Machine, we still want to highlight the typical processing and usage of historic information. Often process engines log every event in a kind of flat list style during runtime. Then, at some point (e.g. when the process execution finishes) the flat list of logs is processed and transformed into a separate database schema that is optimized for querying.
现在比较好的事情是可从历史日志中捕获有意义的业务信息。例如,“流程中的每一步平均花了多长时间?”或“有多少百分比的高优先级的报销单被处理了?”因为流程的图形图是业务人员协力完成的,因此工作流引擎的历史日志能够回答这些有意义的问题不是一个巧合。
有许多方法来记录和处理这些有重要的信息。尽管这不是流程虚拟机的范畴,我们仍然想要特别指出这些重要信息的典型的处理和使用。通常,流程引擎在运行时以一种平面列表格式来记录每一个事件,然后,在某一时刻(例如:当流程执行完成),这些平面列表记录会被处理和转换到一个独立的数据库中,使其更易于被查询。
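As an illustration of how a flat list of logs answers a question such as "How long does each step in the process take on average?", consider the sketch below. The NodeLog record layout is an assumption made for illustration; a real engine would typically run such a query against the history database.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only; the log layout is hypothetical.
class HistoryStatsSketch {
    // One flat log record: when a node was entered and when it was left.
    static class NodeLog {
        final String node;
        final long enteredMillis;
        final long leftMillis;

        NodeLog(String node, long enteredMillis, long leftMillis) {
            this.node = node;
            this.enteredMillis = enteredMillis;
            this.leftMillis = leftMillis;
        }
    }

    // Groups the flat list by node and averages the time spent in each node.
    static Map<String, Double> averageDurationPerNode(List<NodeLog> logs) {
        return logs.stream().collect(Collectors.groupingBy(
                log -> log.node,
                Collectors.averagingLong(log -> log.leftMillis - log.enteredMillis)));
    }

    public static void main(String[] args) {
        List<NodeLog> logs = List.of(
                new NodeLog("review", 0, 100),
                new NodeLog("review", 0, 300),
                new NodeLog("approve", 0, 50));
        Map<String, Double> averages = averageDurationPerNode(logs);
        System.out.println(averages.get("review"));  // 200.0
        System.out.println(averages.get("approve")); // 50.0
    }
}
```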
Workflow engines usually record these history logs by default. Many statistics typically come out of the box and it is very easy to define new meaningful queries on this business intelligence database. With old-fashioned plain programming, this would take a lot more time and make the overall project more complex. So when in doubt about using a workflow process or plain programming, consider whether this kind of information is important for your application.
工作流引擎通常默认会记录这些历史日志。许多统计数据通常会因此产生出来,也能够很容易的基于商业智能数据库定义新的有意义的查询。如果采用旧的plain编程,这将导致花费许多时间和让整个项目更加复杂。因此,当有使用工作流或还是使用plain编程的疑问时,要考虑这些信息是否对你的应用很重要。
Task management
任务管理
Task management means that all people know exactly what they should do and when, and that no tasks are done twice. Therefore, the right information needs to be provided to the right people at the right time. The benefits of proper task management are outside the scope of this paper. The goal of this section is to give you an idea of what task management is and how it relates to the Process Virtual Machine.
A task management component maintains a set of tasks. Task management features include direct (push) and group (pull) assignment, task forms, reassignments, notifications, and so on. The task list for a given user is typically made available through a web application and an API. The combination of task notifications and the related preferences that can be customized by users is also known as 'user awareness'.
任务管理意思是所有的人都清楚的知道他们应该做什么,不应该有任务被做两次。因此,正确的信息需要在正确的时间提供给正确的人。从适当的任务管理中受益不是本书的范畴,本章的目的是告诉你什么是任务管理,他如何与流程虚拟机关联。
任务管理组件维护一个任务集合。任务管理特性包含直接(推)和集体(拉)安排,任务表单,再分配,通知等等。指定用户的任务列表通常通过web应用和API来可用。任务通知和相关属性的结合可由用户定制,也可以称作“用户知会”。
Now, we'll zoom in on the relation between the Process Virtual Machine and a task management component. A process contains a mix of tasks for people and automatic activities. Looking from the perspective of the workflow engine, a user task represents a wait state. As we already described above, support for wait states is exactly one of the most important features of the Process Virtual Machine and workflow in general. So that is why process technologies and task management go hand in hand.
The most common scenario is when a node in the process corresponds to a single user task. In that case, a task node behaviour can be implemented as follows. When an Execution arrives in a task node, it can add a task entry to the task management component. Then, the execution should wait in the task node. The new task will now appear in the assigned user's task list. As an integration between the workflow engine and the task management component, the task must keep a reference to the execution.
现在,我们深入流程虚拟机和任务管理组件的关系。一个流程包含给人员的一组任务和自动活动。从流程引擎的方面来看,用户任务表示等待状态。像我们前面已经说过了的,支持等待状态通常是流程虚拟机或工作流的非常重要的特性。这是为什么流程技术和任务管理通常是一起的。
非常普遍的场景是当流程中的一个节点对应到单个用户任务时,任务节点行为可以像下面所说的方式来实现。当Execution到达任务节点,它可以增加任务条目至任务管理组件。然后,Execution应该在任务节点等待。新创建的任务会出现在安排的用户的任务列表中。因为工作流引擎和任务管理组件的集成,任务需要保持一个到execution的引用。
After some time, the assigned user might select that task from the task list and complete it through the user interface or the API. When a task with a related execution is completed, the task management component is responsible for invoking the execution's proceed. So completing the task triggers the execution to resume.
Notifications can be built on top of Process Virtual Machine events with actions. Tasks can propagate their events like e.g. assign, duedate expired or completed to the execution. Once that integration is in place, listeners to those events can be specified as actions in the process definition or dynamically added to the execution at runtime. The dynamic subscription to task events is a crucial feature for user awareness.
一段时间后,分配的用户会从任务列表中选择那个任务,通过用户界面或API来完成它。当一个包含相关的执行的任务完成后,任务管理组件需要负责调用execution的proceed方法。因此,任务的完成会触发execution恢复执行。
通知可以基于流程虚拟机的事件和动作来构建。任务可以传播它们的事件如分配,超时或完成到execution。如果做了这种集成,这些事件的监听器可以在流程定义中作为动作配置或者运行时动态增加到execution。任务事件的动态订阅是用户需要知道的重要特性。
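The basic integration between a task node and the task management component could be sketched as follows. The TaskService, Task and Execution classes are hypothetical and reduced to the minimum needed to show the wait state and the call back into proceed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names are hypothetical, not a PVM API.
class TaskNodeSketch {
    static class Execution {
        boolean waiting;
        int proceedCalls;

        void proceed() { // the external trigger that resumes the execution
            waiting = false;
            proceedCalls++;
        }
    }

    static class Task {
        final String name;
        final String assignee;
        final Execution execution; // reference back to the waiting execution
        boolean completed;

        Task(String name, String assignee, Execution execution) {
            this.name = name;
            this.assignee = assignee;
            this.execution = execution;
        }
    }

    static class TaskService {
        private final List<Task> tasks = new ArrayList<>();

        Task create(String name, String assignee, Execution execution) {
            Task task = new Task(name, assignee, execution);
            tasks.add(task);
            return task;
        }

        // The open tasks of one user, i.e. that user's task list.
        List<Task> taskList(String user) {
            List<Task> result = new ArrayList<>();
            for (Task task : tasks) {
                if (!task.completed && task.assignee.equals(user)) {
                    result.add(task);
                }
            }
            return result;
        }

        // Completing a task with a related execution triggers that execution.
        void complete(Task task) {
            task.completed = true;
            if (task.execution != null) {
                task.execution.proceed();
            }
        }
    }

    // Behaviour of a task node: add a task entry, then wait.
    static void executeTaskNode(Execution execution, TaskService taskService,
                                String taskName, String assignee) {
        taskService.create(taskName, assignee, execution);
        execution.waiting = true;
    }

    public static void main(String[] args) {
        TaskService taskService = new TaskService();
        Execution execution = new Execution();
        executeTaskNode(execution, taskService, "approve expense", "mary");
        Task task = taskService.taskList("mary").get(0);
        taskService.complete(task);
        System.out.println(execution.waiting); // false
    }
}
```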
This concludes the basic integration between task management and process execution. The many variations to this theme, like e.g. dynamic task creation, swimlanes and others can be handled by the pvm but they don't really contribute to the purpose of this article.
这章论述了任务管理和流程执行的基本集成。这个主题的许多变化如动态任务创建,泳道和其它能够被pvm处理,但是这些与本文的目的无关。
Cooperative applications and Human Interaction Management (HIM)
应用协同和人工交互管理(HIM)
From a workflow perspective, an expense note process is completely different from building a new leisure palm-tree resort with those nice red cocktails and waitresses in minuscule biki... Anyway, the difference that we want to highlight is that an expense note process is completely known and all possible scenarios can be modelled upfront. Whereas for a leisure resort, there will be no fixed process at the start. That kind of process will be defined on the fly. The category of software systems that support those kinds of ad hoc processes is called cooperative applications or Human Interaction Management (HIM).
从工作流的方面来看,一个费用记录流程与构建一个旅游胜地完全不同…不管如何,我们要指出的不同点是费用记录流程的所有可能场景都完全了解,能够提前建模。而在开始构建旅游胜地前,可能不会有固定的流程。这种流程会即时定义。这一类支持那种类型的特别流程的软件系统被叫做应用协同或人工交互管理。
The Process Virtual Machine as described above suggests that all processes are as predictable as submitting an expense note. That is because a distinction is made between the process model --with Nodes and Transitions-- on the one hand and the runtime execution structure --with the Execution-- on the other hand. Let's call this the static approach, reflecting the static process definition.
As a side note, the more frequently a process is executed, the more predictable all the scenarios will be and the more compelling it will be to automate that process. So that is why we took the approach of a single static definition having many executions as the default one.
前面描述的流程虚拟机提倡所有的流程都应该像费用记录流程一样清晰,因为流程模型—包含节点和转移—不同于另一方面的运行时执行结构—包含Execution—在另一方面。让我们叫它做静态方法,反映静态的流程定义。
作为一个边注,流程执行的越频繁,所有的场景越能被预测,流程会越可能被自动化。这是为什么我们采取一个静态定义有多个执行的方法的原因。
Cooperative processes can also be implemented easily on top of the process virtual machine. An execution is created before any process nodes and transitions are defined. Then typically people, roles and task nodes are created on the fly while the process is executing. So each process execution will create its own process model, whereas in the static approach the single static process model will be used for many executions.
协作流程可以基于流程虚拟机容易的实现。在任意流程节点和转移定义之前创建一个执行,然后,在流程执行时,即时创建典型的人,角色和任务节点。因此每个流程执行会创建它自己的流程模型,但是在静态方法中,单个静态流程模型会被许多执行使用。
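As a minimal sketch of the cooperative approach, the Java fragment below creates an execution before any node exists and then lets the process model grow while the execution runs. The names AdHocNode, AdHocExecution and addTaskAndMove are hypothetical and only illustrate the idea; they are not the API of the Process Virtual Machine or of any product.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a task node in an ad hoc process.
class AdHocNode {
    final String name;
    final String assignee;                           // person or role responsible for the task
    final List<AdHocNode> next = new ArrayList<>();  // outgoing transitions

    AdHocNode(String name, String assignee) {
        this.name = name;
        this.assignee = assignee;
    }
}

// Hypothetical execution that owns its private process model.
class AdHocExecution {
    final List<AdHocNode> model = new ArrayList<>(); // grows while the execution runs
    AdHocNode current;                               // null until the first task is added

    // Create a task node on the fly, link it from the current node and move there.
    AdHocNode addTaskAndMove(String name, String assignee) {
        AdHocNode node = new AdHocNode(name, assignee);
        if (current != null) {
            current.next.add(node);
        }
        model.add(node);
        current = node;
        return node;
    }
}

class AdHocDemo {
    public static void main(String[] args) {
        AdHocExecution resort = new AdHocExecution(); // no nodes or transitions defined yet
        resort.addTaskAndMove("buy land", "investor");
        resort.addTaskAndMove("design resort", "architect");
        System.out.println(resort.model.size() + " nodes, now at: " + resort.current.name);
    }
}
```

In the static approach the model list would live in a shared process definition instead of inside each execution; here every execution carries its own.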
From an implementation perspective, storing, updating and removing parts of the process model is done in much the same way as for execution data. The only issue to keep in mind is that process definition caching needs to be turned off in such use cases.
The nice thing is that with the Process Virtual Machine it's possible to build processes that mix the static and the cooperative approach. As explained in "Process updates", a basic fixed model can be defined and updates to that model can be associated with the Execution.
从实现方面来说,存储,更新和删除流程模型的部分与执行数据的更新类似。唯一的问题是要记住在这种情形下,流程定义缓存需要关闭。
一个好的事情是可以基于流程虚拟机构建混合了静态和协作方法的流程,像在流程更新中解释的,一个基本的固定的模型可以被定义,对模型的更新可以关联到Execution。
Asynchronous architectures
异步架构
Asynchronous architectures are another environment where workflow technology can be very useful. In such architectures, multiple machines or components communicate through queues or other forms of asynchronous messaging.
If you look at one component in such an architecture, it will receive many messages (or service invocations) that are related to one overall execution flow. That overall execution flow can be modelled and implemented as an executable process. The period between the response sent by the component and the next related message that is expected by the component translates to a wait state in the process.
This principle applies to various environments. One example is a web services environment such as an Enterprise Service Bus (ESB). In that case BPEL is the most appropriate language. In essence, with BPEL, new web services can be scripted as a function of other web services. But other languages such as jPDL can be very convenient to orchestrate a set of Message Driven Beans (MDB) in a Java enterprise environment.
异步架构是工作流技术非常有用的另一个场景。在这种架构下,多个机器或组件通过队列或其他形式的一步消息交互。
如果你看看在这种架构中的一个组件,在整个执行流中,它会收到许多相关的消息(或服务调用)。这个执行流可以用执行流程建模和实现。组件发送应答到组件期望收到下一个相关的消息的间隔可转换为流程的等待状态。
这一原则适用于多种环境。例如,web服务环境如ESB。在这种场景下,BPEL是最合适的语言。本质上,使用bpel,新web服务能够给予其它web服务的功能实现。但是其他语言如JPDL能够非常方便的用于在java企业环境下编排一组消息驱动bean。
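The idea that the time between a response and the next related message corresponds to a wait state can be sketched as follows. This is an illustration under stated assumptions: the OrderFlow and MessageDispatcher classes, the message types and the correlation id are all invented for the example, and no messaging library is involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical execution flow of one component; every state is a wait state.
class OrderFlow {
    enum State { WAIT_FOR_ORDER, WAIT_FOR_PAYMENT, ENDED }

    State state = State.WAIT_FOR_ORDER;

    // Handle one incoming message and return the response the component sends.
    // After returning, the execution rests in its wait state until the next message.
    String onMessage(String type) {
        switch (state) {
            case WAIT_FOR_ORDER:
                if (type.equals("order")) {
                    state = State.WAIT_FOR_PAYMENT;
                    return "order accepted";
                }
                break;
            case WAIT_FOR_PAYMENT:
                if (type.equals("payment")) {
                    state = State.ENDED;
                    return "payment received";
                }
                break;
            default:
                break;
        }
        return "unexpected message: " + type; // the wait state is left unchanged
    }
}

// Routes each message to the execution it belongs to.
class MessageDispatcher {
    // Assumption: every message carries a correlation id identifying its overall flow.
    private final Map<String, OrderFlow> executions = new HashMap<>();

    String dispatch(String correlationId, String type) {
        OrderFlow flow = executions.computeIfAbsent(correlationId, id -> new OrderFlow());
        return flow.onMessage(type);
    }

    OrderFlow.State stateOf(String correlationId) {
        OrderFlow flow = executions.get(correlationId);
        return flow == null ? null : flow.state;
    }
}

class AsyncDemo {
    public static void main(String[] args) {
        MessageDispatcher dispatcher = new MessageDispatcher();
        System.out.println(dispatcher.dispatch("42", "order"));
        System.out.println(dispatcher.stateOf("42"));
        System.out.println(dispatcher.dispatch("42", "payment"));
    }
}
```

In a real deployment the dispatch method would be called from a queue listener such as a Message Driven Bean, and the executions would be persisted between messages rather than kept in memory.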
Process languages
This section describes the process languages for which we have proven that they can be built on top of the Process Virtual Machine.
BPEL
BPEL is a standardized service orchestration language. If you can excuse us for the overly simplified statement, we describe BPEL as an executable language to script web services as a function of other web services.
BPEL is a language that fits right on top of an Enterprise Service Bus (ESB), which is a piece of integration infrastructure. Because an ESB targets integration, services are typically described in WSDL and based on XML technologies. That is where BPEL fits with ESBs: BPEL is also based on WSDL and XML technologies.
XPDL
XPDL is a standardized BPM process language. The background of BPM Systems (BPMS) and hence of XPDL is very different from BPEL. XPDL processes describe a combination of user tasks and automated activities. All references to resources, automatic activities and applications are addressed indirectly, meaning there is no implicit assumption of a technological environment such as enterprise Java or an ESB.
XPDL 2.0 defines a complete mapping with BPMN. BPMN standardizes a graphical notation that defines the shapes, icons and decorations of process models.
jPDL
jPDL is a process language of the JBoss jBPM platform. This language is in essence the simplest wrapper around the Process Virtual Machine, to make it available in a Java environment. jPDL processes can reference Plain Old Java Objects (POJO) directly. The process variables and APIs are based on standard Java too.
SEAM pageflow
This is a language that describes the navigation between web pages of a SEAM application graphically. Nodes in the diagram represent pages and transitions represent the navigation between the pages.
The nice thing about the SEAM pageflow language is that it really shows the diversity of languages that can be built on top of the Process Virtual Machine. For instance, pageflow needs neither persistence nor transactions, whereas pageflow executions do need to be serialized in the HTTP session.
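To make the pageflow requirements concrete, here is a small sketch in plain Java of an execution whose nodes are pages and whose transitions are navigation outcomes. The PageflowExecution class and its methods are hypothetical and are not the SEAM API; the round trip through standard Java serialization only imitates what may happen to an object stored in an HTTP session.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical pageflow execution: no database, no transactions, but serializable.
class PageflowExecution implements Serializable {
    private static final long serialVersionUID = 1L;

    // page -> (outcome -> next page)
    private final Map<String, Map<String, String>> transitions = new HashMap<>();
    private String currentPage;

    PageflowExecution(String startPage) {
        this.currentPage = startPage;
    }

    void addTransition(String from, String outcome, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(outcome, to);
    }

    // Follow the transition for the given outcome; stay on the page if there is none.
    String navigate(String outcome) {
        Map<String, String> outgoing = transitions.get(currentPage);
        String target = (outgoing == null) ? null : outgoing.get(outcome);
        if (target != null) {
            currentPage = target;
        }
        return currentPage;
    }

    String getCurrentPage() {
        return currentPage;
    }

    // Serialize and deserialize the execution in memory.
    static PageflowExecution roundTrip(PageflowExecution execution) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(execution);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PageflowExecution) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}

class PageflowDemo {
    public static void main(String[] args) {
        PageflowExecution flow = new PageflowExecution("cart");
        flow.addTransition("cart", "checkout", "payment");
        flow.addTransition("payment", "confirm", "receipt");
        flow.navigate("checkout");
        PageflowExecution restored = PageflowExecution.roundTrip(flow);
        System.out.println(restored.navigate("confirm"));
    }
}
```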
Conclusion
First of all, this paper has outlined the essential principles of workflow, BPM and orchestration. This is already crucial knowledge for understanding today's workflow and BPM systems. But the bigger goal of this article is to facilitate a big step forward in resolving the current fragmentation and confusion around workflow technologies. Both developers and workflow tool vendors will benefit significantly from a unified model for workflow.
Secondly, a component model was introduced that shows how a base framework can be built in a programming language that allows for multiple process languages to be developed on top. This served as a clear illustration for developers to indicate what exactly a workflow engine does and when this technology is appropriate. This component model is also targeted at workflow tool vendors. Both jBPM in collaboration with Bonita and Orchestra, and Microsoft independently developed a very similar component model. We believe that engines based on this component model will be much more broadly applicable because of their support for multiple process languages in multiple environments, whereas most of the current workflow systems only cover a very small niche of use cases.
As a third item, a realistic and practical approach is detailed that acknowledges and copes with the dual nature of Business Process Management (BPM). Non-technical business people are focused on the graphical diagram. But we have shown that an executable business process always has a graphical and a technical part. The graphical diagram serves as the common language between the business analyst and the developer.
References
· JBoss jBPM
· Bonita The XPDL workflow engine by Bull hosted at OW2
· jPDL The workflow for Java process language of JBoss jBPM
· JBoss SEAM Pageflow The process language built on jBPM for specifying navigation between web pages
· Orchestra The BPEL engine from Bull hosted at OW2
· JBoss jBPM BPEL The BPEL engine from JBoss built on top of jBPM
· Windows Workflow Foundation
· XPDL The workflow language defined by the WfMC and supported by Bonita
· BPEL The service orchestration language defined by OASIS
· (TODO) The reference to the summary article at OnJava
About the authors
Tom Baeyens
Tom Baeyens is the founder and lead developer of JBoss jBPM, the open source platform for workflow, BPM and orchestration engines. His mission is to bring the value of BPM technology to the developer community. Tom works for Red Hat and is a frequent speaker on this subject at international conferences. He's also participating in the Java Community Process. Tom's blog is called Process Developments and can be found at http://processdevelopments.blogspot.com/.
Miguel Valdes Faura
Miguel Valdes Faura is the Workflow Project Manager working for Bull R&D. He is also a member of the OW2 Technical Council, in which he is leading the Bonita workflow project. Before that he worked in Spain on different European projects based on the J2EE platform and Open Source application servers. He joined INRIA, the French Research Institute in Computer Science, in February 2001, co-founding the Bonita Workflow System. He is a regular speaker at international conferences: JavaOne, Internet Global Congress, Open Source World Conference, javaHispano Conference, ObjectWebCon, COSGov, JavaBin...