Program Specialization
Let us consider a program P, taking two arguments S and D, and producing a result R:
run P(S,D) = R
A specialization of P with respect to S is a program PS such that, for all input D,
run PS(D) = run P(S,D) = R
Input S is called static: it is known (i.e., available) at specialization time. Input D is called dynamic: it is unknown (i.e., unavailable) until run time.
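The defining equation run PS(D) = run P(S,D) can be illustrated with a minimal C sketch (the names scale and scale_by_10 are hypothetical, chosen for illustration only):

```c
/* Generic program P, taking a static input s and a dynamic input d. */
int scale(int s, int d) {
    return s * d;
}

/* Specialization of scale with respect to s = 10: the static input
 * has been consumed; only the dynamic input remains. */
int scale_by_10(int d) {
    return 10 * d;
}
```

For every dynamic input d, scale_by_10(d) computes the same result as scale(10, d).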
Specialization Examples
Program specialization makes sense in any programming language. Consider for example the following Scheme program. (See below for more examples, in C.)
(define (append list1 list2)
  (if (null? list1)
      list2
      (cons (car list1) (append (cdr list1) list2))))
A possible specialization of append with respect to a static argument list1 = (4 2) is the function append42 below.
(define (append42 list2)
  (cons 4 (cons 2 list2)))
Function append42 preserves the semantics of append; more precisely, it has the same semantics as the trivial specialization function triv-append42, defined as
(define (triv-append42 list2)
  (append '(4 2) list2))
Depending on the context, S is called a specialization value or an invariant. In the general case, a specialization may exploit several invariants, whether input values or constants already present in the code of P.
Benefits of Specialization
The advantage of function append42 above, as opposed to triv-append42, is that the computations depending only on the static input list1 = (4 2) have already been performed. More generally, specialization affects the speed and size of programs, thus offering applications to program optimization.
- Speed
- Specialization factors out computations from the specialized program. As a result, a specialized program runs faster than the original program (although, in some rare cases, cache effects may cause a slight slowdown). For example, function append42 above runs faster than append (or, more precisely, than triv-append42) because the traversal of argument list1 has already been performed.
- Size
- A specialized program is sometimes smaller than the original program (e.g., when a static input corresponds to an option that dispatches on different functionalities of the program). It is sometimes bigger (e.g., when a loop or a recursive call is unfolded).
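Both effects can be seen in a small C sketch (hypothetical names, for illustration only): specializing a power function with respect to its static exponent removes the loop overhead, but each unfolded multiplication grows the code.

```c
/* Generic power: x is dynamic, the exponent n is static. */
int power(int x, int n) {
    int r = 1;
    while (n-- > 0)      /* loop control depends only on the static n */
        r *= x;
    return r;
}

/* Specialization with respect to n = 5: the loop has been unfolded,
 * so the specialized code is faster but larger. */
int power5(int x) {
    return x * x * x * x * x;
}
```

For a larger static exponent, the specialized body keeps growing: the speed gain is paid for in code size.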
Note that not all program arguments have the same impact on specialization. For example, specializing append with respect to list2 = (4 2) leads to the rather unexciting function below.
(define (dull-append42 list1)
  (if (null? list1)
      '(4 2)
      (cons (car list1) (dull-append42 (cdr list1)))))
Specialization is used in particular (sometimes unknowingly) to optimize critical sections of code. It is often handwritten.
Partial Evaluation
Partial evaluation (PE) is the process that automates program specialization [CD93, DRT96, JGS93]. A partial evaluator (or specializer) is a program M that takes two arguments, the source of a program P and a static (known) subset of the input S, and produces a specialized program PS:
run M(P,S) = PS
Roughly speaking, partial evaluation can be thought of as a combination of aggressive constant folding, inlining, loop unrolling and inter-procedural constant propagation applied to all data types (including pointers, structures and arrays) instead of just scalars.
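These transformations combine in practice. As a hypothetical C sketch of what a partial evaluator could do with a dot product whose coefficient array is static (names invented for illustration):

```c
/* Generic dot product: coef and n are static, v is dynamic. */
int dot(const int coef[], const int v[], int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += coef[i] * v[i];
    return sum;
}

/* Specialization with respect to coef = {2, 0, 0, 7} and n = 4:
 * the loop is unrolled, the array lookups are constant-folded,
 * and the terms multiplied by 0 disappear entirely. */
int dot_2007(const int v[]) {
    return 2 * v[0] + 7 * v[3];
}
```

Here constant propagation reaches into the array, loop unrolling exposes each term, and constant folding then erases the zero terms.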
Applications of Partial Evaluation
Handwritten specialization is tedious, error-prone, and does not scale to large programs. Because it is automatic, specialization via partial evaluation does not have those drawbacks; it is even predictable (see below). As a result, specialization becomes relevant to software engineering: it is possible to rapidly write generic programs, which are maintainable but slow, and to automatically produce fast specialized instances. Because the programmer focuses less on optimization hacks and more on reusability, partial evaluation greatly improves productivity and program safety.
Partial evaluation has been successfully applied as an optimizer in various domains such as operating systems and networking, computer graphics, numerical computation, circuit simulation, software architectures, compiling and compiler generation.
It has also been used for program understanding and reengineering: given various running options, partial evaluation may split large programs into smaller ones.
Off-line vs. On-line Partial Evaluation
An on-line partial evaluator takes as arguments the source of a program P and a static subset of the input S, performs symbolic computations on available data, and directly yields the source of a specialized program PS.
In an off-line partial evaluator, specialization is divided into two steps. First, a binding-time analysis propagates abstract information about static and dynamic values throughout the code. This prepares the second phase, which, given actual specialization values, produces specialized code.
On-line partial evaluators are theoretically more powerful: specialization relies on actual values, not merely on the fact that values are known. On the other hand, off-line partial evaluators are faster because value propagation is "pre-compiled". Moreover, they are predictable, in the sense that it is possible to assess the degree of specialization in advance.
Some partial evaluators, like Tempo, can specialize programs not only at compile time (i.e., source-to-source transformation) but also at run time (i.e., run-time code generation). Only off-line partial evaluation lends itself to run-time specialization.
Binding-Time Analysis
As a first step, the user provides a program and specifies initial binding times, that is, which arguments (including global variables) are static (i.e., known) and which are dynamic (i.e., yet unknown). For example, the user provides the following code for a miniprintf function, and specifies that the first argument is static whereas the second is dynamic: miniprintf(S,D).
miniprintf(char fmt[], int val[])
{
  int i = 0;
  while( *fmt != '\0' ) {
    if( *fmt != '%' )
      putchar(*fmt);
    else
      switch(*++fmt) {
        case 'd' : putint(val[i++]); break;
        case '%' : putchar('%');     break;
        default  : prterror(*fmt);   break;
      }
    fmt++;
  }
}
Binding-time analysis (BTA) propagates the static/dynamic information throughout the program and annotates each statement and expression with a binding time. These annotations can be visualized using colors (or font effects).
[Annotated listing of miniprintf: the binding-time analysis marks each construction as STATIC (blue, bold) or DYNAMIC (red, standard font); the color annotations cannot be reproduced here.]
The blue color (bold face on black-and-white displays) represents static constructions, i.e., values that can be computed at specialization time. The red color (standard font on black-and-white displays) marks dynamic expressions, whose value cannot be precomputed knowing only the static arguments. Basically, everything in blue (bold) will disappear after specialization; only the red (standard font) parts will remain. Visualizing the analysis is very important for the user to assess the amount of specialization in the code.
Note that for languages like C, the binding-time analysis must take pointer aliasing and side effects into account.
Compile-Time Specialization
When the user is satisfied with the analysis (i.e., what the user expects to specialize is indeed considered as static by the
BTA), actual specialization values must be provided. For example, giving
"<%d|%d>"
as the actual specialization value for the
fmt
argument of the
miniprintf()
function yields the following specialized code.
miniprintf_1(int val[])
{
  putchar( '<' );
  putint ( val[0] );
  putchar( '|' );
  putint ( val[1] );
  putchar( '>' );
}
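That the specialization preserves the semantics of miniprintf can be checked by running both versions on a common output channel. In this runnable sketch, putchar and putint are replaced by hypothetical buffer-writing helpers (emit_char, emit_int), and the prterror error case is omitted for brevity:

```c
#include <stdio.h>
#include <string.h>

static char out[256];   /* captured output */
static int  pos;

static void emit_char(char c) { out[pos++] = c; out[pos] = '\0'; }
static void emit_int(int n)   { pos += sprintf(out + pos, "%d", n); }

/* Generic version, as in the text, with output redirected to the buffer. */
void miniprintf(const char fmt[], const int val[]) {
    int i = 0;
    while (*fmt != '\0') {
        if (*fmt != '%')
            emit_char(*fmt);
        else
            switch (*++fmt) {
                case 'd': emit_int(val[i++]); break;
                case '%': emit_char('%');     break;
            }
        fmt++;
    }
}

/* Version specialized with respect to fmt = "<%d|%d>". */
void miniprintf_1(const int val[]) {
    emit_char('<');
    emit_int (val[0]);
    emit_char('|');
    emit_int (val[1]);
    emit_char('>');
}
```

For val = {8, 15}, both miniprintf("<%d|%d>", val) and miniprintf_1(val) produce the string "<8|15>".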
Many specializations can be performed from a single analysis; only different specialization values have to be provided.
Run-Time Specialization
Partial evaluators like
Tempo [
CHL+98,
CHN+96] can also perform run-time specialization [
CN96], using optimized binary code templates [
NHCL97]. A dedicated run-time specializer is generated from the results of the program analysis. In the case of the
miniprintf
function, a runtime specializer
rts_miniprintf()
is generated, which can be used as in the following example.
/*
 * Some dynamic execution context setting variable 'format'
 */
spec_printf = rts_miniprintf(format);  // specialize
...
(*spec_printf)(val1);  // <=> miniprintf(format, val1)
(*spec_printf)(val2);  // <=> miniprintf(format, val2)
The function rts_miniprintf() is a dedicated run-time specializer. It returns a pointer to the specialized function. Several specialized versions can also be generated and used at the same time.
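Tempo's run-time specializer emits actual binary code from templates; that cannot be reproduced in a short portable sketch. As a rough, hypothetical simulation of the interface, the specializer below pre-compiles the static format string, once, into a table of operations that a replay function then executes without any remaining format interpretation (all names are invented):

```c
#include <stdio.h>

/* One "compiled" operation: either emit a literal character,
 * or emit the integer at a fixed, precomputed index (idx < 0 = literal). */
typedef struct { char lit; int idx; } op_t;

typedef struct { op_t ops[64]; int n; } spec_t;

/* Simulated run-time specializer: every decision that depends only on
 * the static format string is taken here, once. */
spec_t rts_miniprintf(const char fmt[]) {
    spec_t s = { .n = 0 };
    int i = 0;
    while (*fmt != '\0') {
        if (*fmt != '%')
            s.ops[s.n++] = (op_t){ *fmt, -1 };
        else if (*++fmt == 'd')
            s.ops[s.n++] = (op_t){ 0, i++ };
        else  /* "%%" */
            s.ops[s.n++] = (op_t){ '%', -1 };
        fmt++;
    }
    return s;
}

/* Running the "specialized" code on dynamic values. */
void spec_run(const spec_t *s, const int val[], char *out) {
    for (int k = 0; k < s->n; k++)
        if (s->ops[k].idx < 0) out += sprintf(out, "%c", s->ops[k].lit);
        else                   out += sprintf(out, "%d", val[s->ops[k].idx]);
    *out = '\0';
}
```

As in the text, one specialization can serve many dynamic inputs: spec_run(&sp, v1, buf) then spec_run(&sp, v2, buf) reuse the same pre-compiled table.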
References
Various resources concerning partial evaluation, including existing specializers, PE-related events and basic references are accessible from pe_resources.php3.
- [CD93]
- C. Consel and O. Danvy. Tutorial notes on partial evaluation. In ACM Symposium on Principles of Programming Languages, pages 493-501, 1993.
- [CHL+98]
- C. Consel, L. Hornof, J. Lawall, R. Marlet, G. Muller, J. Noyé, S. Thibault, and N. Volanschi. Tempo: Specializing systems applications and beyond. ACM Computing Surveys, Symposium on Partial Evaluation, 1998. To appear.
- [CHN+96]
- C. Consel, L. Hornof, F. Noël, J. Noyé, and E.N. Volanschi. A uniform approach for compile-time and run-time specialization. In O. Danvy, R. Glück, and P. Thiemann, editors, Partial Evaluation, International Seminar, Dagstuhl Castle, number 1110 in Lecture Notes in Computer Science, pages 54-72, February 1996.
- [CN96]
- C. Consel and F. Noël. A general approach for run-time specialization and its application to C. In Conference Record of the 23rd Annual ACM SIGPLAN-SIGACT Symposium on Principles Of Programming Languages, pages 145-156, St. Petersburg Beach, FL, USA, January 1996. ACM Press.
- [DRT96]
- O. Danvy, R. Glück, and P. Thiemann, editors. Partial Evaluation. Number 1110 in Lecture Notes in Computer Science, 1996.
- [JGS93]
- N.D. Jones, C. Gomard, and P. Sestoft. Partial Evaluation and Automatic Program Generation. Prentice Hall International Series in Computer Science, 1993.
- [NHCL97]
- F. Noël, L. Hornof, C. Consel, and J. Lawall. Automatic, template-based run-time specialization: Implementation and experimental study. In International Conference on Computer Languages, Chicago, IL, May 1998. IEEE Computer Society Press. Also available as IRISA report PI-1065.
Last modified: 2003-09-25. - Jocelyn.Frechot@labri.fr - http://compose.labri.fr
from: http://compose.labri.fr/documentation/pe/pe_overview.php3