FOray Development: FOTree Design Notes



There are two major epochs in FOTree processing: 1) parse time, and 2) use time. One of the biggest design decisions is how to divide the work between these two. FOray has chosen to essentially defer as much work as possible until use-time (late binding), for the following reasons:

  • Of primary importance, we think that this gives us the cleanest and clearest way to handle the complexities of the standard. Most attempts at early binding are for performance reasons, and we are pretty sure that they make correct processing more difficult, perhaps impossible.
  • Late binding should save memory, at the possible expense of speed.
  • We think that late binding gives us the most flexibility for dealing with future standard changes.
  • At some point, we would like to build a mechanism that infers similarities in styles used throughout a document, builds one instance of that style, and then shares that one instance throughout the document. This is sort of an extensible shorthand concept that we also expect to see in some future version of the standard. The general concept is also important to possible future FOray work that ties an FO document back to its semantic XML source, by creating a non-XSLT-based stylesheet concept.

Some of these reasons are based more on gut feel than on hard facts or examples, and they may frankly be wrong. The decision to do late binding is not written in stone, but reflects our best thinking to date. A comparison between FOray's approach and some of the others is probably in order.

One drawback to late binding is that it might tend to duplicate some processing, by forcing values to be computed more often than with early binding. We think that in most cases, the effect of this is insignificant, especially when weighed against the extra memory consumption of early binding (which also uses processing time to allocate and garbage-collect). However, this bears more research. Also, in some cases, it may make sense to cache the results of a use-time computation rather than repeatedly computing the value. The computation of table column widths is an example where FOray currently uses this approach. It may be possible to make a more general solution available as an option in the future, but that is a low priority at the moment.
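The caching idea mentioned above can be sketched roughly as follows. This is only an illustration of computing a value at first use and then reusing it; the class name CachedTrait and its methods are hypothetical and are not taken from the FOray source.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper (not a FOray class): computes a value on first use, then reuses it. */
final class CachedTrait {
    private final IntSupplier computation; // the use-time computation being deferred
    private boolean computed;              // true once "value" holds a valid result
    private int value;
    private int computeCount;              // for illustration: how often we really computed

    CachedTrait(IntSupplier computation) {
        this.computation = computation;
    }

    /** Returns the cached value, running the computation only if needed. */
    int get() {
        if (!computed) {
            value = computation.getAsInt();
            computed = true;
            computeCount++;
        }
        return value;
    }

    /** Discards the cached value so that the next get() recomputes it. */
    void invalidate() {
        computed = false;
    }

    int computeCount() {
        return computeCount;
    }
}
```

The tradeoff is the one discussed above: each cached value costs a little memory, so this probably only pays off for computations that are both expensive and frequently repeated.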

Data Structure

If you are going to modify (as opposed to simply use) FOray FOTree, it is very important that you understand how the data is structured. Although FOTree attempts to follow the XSL-FO standard in the way that its classes are organized, the standard itself is relatively complex. There is a significant amount of detail, and several axes that must be handled in order to get the data stored and retrieved in a predictable manner. There are probably a large number of possible ways to accurately handle the job.

Here is an outline of the hierarchy of data in the FOray FOTree:

StreamRenderer (implements FOTreeControl)
   |  |
   |  |--Classes for converting elements
   |      and attributes into FO objects
   |      and properties

The structure of the FO objects and the classes that represent them is fairly easy to follow, but the storage and retrieval of the property values is much more complex. The first principle to grasp is that although the API only exposes methods to obtain refined, computed traits, the data is stored internally as raw property values. Of primary importance is distinguishing between the following three concepts:

  • The FO property itself.
  • The value of the property.
  • The Java classes that are used to manage the above concepts.

The abstract Java superclass Property is the container object for each FO property. A collection of Property instances is stored in the PropertyList, which is attached to a PropertyManager, which is in turn attached to the FObj instance.

Each Property instance stores its parent FObj, its type (stored as an integral value) and its value. The integral type is important for at least two reasons. First, there is no one-to-one relationship between the Property subclasses and the property types. The subclasses are used more for processing and programming convenience than for distinguishing between property types. Second, some performance efficiencies can be achieved by avoiding casting and instanceof operations when an object is searching for a specific property.
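To make the second point concrete, here is a minimal sketch of a lookup that compares integral type codes instead of using casts or instanceof. The classes SketchProperty and SketchPropertyList are simplified stand-ins invented for this illustration; the parent FObj and the PropertyManager layer are left out, and the real FOray classes will differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for Property: an integral type code plus one value. */
class SketchProperty {
    private final short propertyType; // identifies which FO property this is
    private final Object value;       // stand-in for a PropertyValue instance

    SketchProperty(short propertyType, Object value) {
        this.propertyType = propertyType;
        this.value = value;
    }

    short getPropertyType() {
        return propertyType;
    }

    Object getValue() {
        return value;
    }
}

/** Simplified stand-in for PropertyList. */
class SketchPropertyList {
    private final List<SketchProperty> properties = new ArrayList<>();

    void add(SketchProperty property) {
        properties.add(property);
    }

    /** Finds a property by comparing type codes; no casts or instanceof tests are needed. */
    SketchProperty find(short propertyType) {
        for (SketchProperty property : properties) {
            if (property.getPropertyType() == propertyType) {
                return property;
            }
        }
        return null; // the property was not specified on this object
    }
}
```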

Each property can have exactly one value, which will be a subclass of PropertyValue. PropertyValue instances can be:

  • XSL-FO datatypes, which include classes like DtInteger, DtNumber (for floating-point numbers), DtPercentage, etc.
  • XSL-FO functions.
  • A PropertyCollection instance. PropertyCollection is a PropertyValue that stores a collection of Property instances (each one of them of course having its own PropertyValue). PropertyCollection is useful for certain shorthand properties. For example, the border-width shorthand property can designate 1-4 instances of a generic property that affects all four sides of a box. These properties correspond exactly to named properties like border-bottom-width, and it is convenient to simply collect the variable number of them and store them inside of a specialized PropertyValue instance.
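As a rough illustration of the shorthand case, the sketch below keeps the 1-4 raw border-width values as given and only decides at use time which one applies to a given side, following the usual CSS2 ordering (top, right, bottom, left). The class name, the side constants, and the use of plain strings for values are assumptions made for this example; FOray's PropertyCollection stores Property instances instead.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a collection holding the 1-4 values of a border-width shorthand. */
final class BorderWidthShorthandSketch {
    static final int TOP = 0;
    static final int RIGHT = 1;
    static final int BOTTOM = 2;
    static final int LEFT = 3;

    private final List<String> values; // raw values, stored exactly as specified

    BorderWidthShorthandSketch(String attributeValue) {
        String[] tokens = attributeValue.trim().split("\\s+");
        if (tokens.length > 4 || tokens[0].isEmpty()) {
            throw new IllegalArgumentException("Expected 1-4 values: " + attributeValue);
        }
        this.values = Arrays.asList(tokens);
    }

    /** Resolves the value for one side at use time, per the CSS2 shorthand rules. */
    String valueFor(int side) {
        switch (values.size()) {
            case 1:  // one value applies to all four sides
                return values.get(0);
            case 2:  // top and bottom, then right and left
                return values.get(side == TOP || side == BOTTOM ? 0 : 1);
            case 3:  // top, then right and left, then bottom
                return values.get(side == TOP ? 0 : side == BOTTOM ? 2 : 1);
            default: // top, right, bottom, left
                return values.get(side);
        }
    }
}
```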

One of the complexities of the standard is that certain functions and expressions are not designated in terms of the XSL-FO datatypes, but instead as Numeric. We have chosen to handle this complexity with a Java interface called Numeric, which provides certain methods that are suitable to such functions and expressions. For example, IntegerDT, NumberDT, and LengthDT all implement this interface.


Namespaces are handled through the Namespace abstract class. Subclasses of Namespace are responsible for being able to convert elements and attributes in the namespace into instances of FObj and Property subclasses. There are some standard tools available to do that, but using these tools is not required.

Before parsing begins, Namespace instances are registered with the FOTreeBuilder. During parsing, a list of registered namespaces is consulted, and the appropriate Namespace instance is then called upon to do the actual conversion to FObj and Property instances.

FObj subclasses must identify which Namespace they belong to. For the standard namespaces, this does not require any extra memory, as the Namespace instance can be obtained pretty easily from the Namespace registry. However, for non-standard namespaces (i.e. those that you might add), you may need to cache the Namespace instance in the FObj subclass instance itself.

Property instances do not explicitly know what namespace they are in. However, this information is implied in the propertyType variable. Although there is no formal mechanism or any enforcement, the propertyType values are segregated by namespace. Here are the ranges assigned to each of the standard namespaces:

  • FO: 1-500.
  • XML: 501-1000.
  • SVG: 1001-1500. (FOray does not actually parse properties for this namespace).
  • Extensions (FOray): 1501-2000. (Currently, all attributes in this namespace use attributes from the FO namespace).
  • MathML: 2001-2500. (This namespace is not yet actually supported, but the space is reserved for possible future support).

At the moment, namespace clashes are not a big issue at the attribute level. The only non-FO attribute that is supported is the "xml:lang" that is specified by the XSL-FO standard as an FO shorthand. However, we think that the infrastructure is robust enough to handle much more complexity within the scheme outlined above.

The propertyType variable is a short, so there are 65,536 possibilities. Please do not ever use items in the range between -100 and 0 or the standard ranges above. Non-standard namespaces (i.e. those not directly supported by FOray) are encouraged to use negative values to avoid conflict with future standard namespaces that might be added to FOray.

It is possible that enforcement of these ranges may be added in the future. This can easily be done by simply passing the Namespace instance to the Property constructor, and requiring the Namespace subclasses to report the range of propertyType values that they are claiming. These ranges can then be checked for conflict as they are registered.
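A possible shape for such a check is sketched below. It is not part of FOray: the class PropertyTypeRangeRegistry is hypothetical, namespaces are identified by a plain string rather than a Namespace instance, and the reserved band from -100 to 0 is assumed to be inclusive at both ends.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical registry that checks propertyType ranges as namespaces are registered. */
final class PropertyTypeRangeRegistry {

    /** One namespace's claimed range, inclusive at both ends. */
    private static final class Claim {
        final String namespace;
        final int low;
        final int high;

        Claim(String namespace, int low, int high) {
            this.namespace = namespace;
            this.low = low;
            this.high = high;
        }
    }

    private final List<Claim> claims = new ArrayList<>();

    /** Registers a range, rejecting invalid, reserved, or conflicting claims. */
    void register(String namespace, int low, int high) {
        // propertyType is a short, so a claim must fit in that type.
        if (low > high || low < Short.MIN_VALUE || high > Short.MAX_VALUE) {
            throw new IllegalArgumentException("Invalid range for " + namespace);
        }
        // Assumption: the reserved band -100..0 is inclusive at both ends.
        if (low <= 0 && high >= -100) {
            throw new IllegalArgumentException(namespace + " overlaps the reserved range -100..0");
        }
        // Two inclusive ranges overlap when each starts before the other ends.
        for (Claim claim : claims) {
            if (low <= claim.high && claim.low <= high) {
                throw new IllegalStateException(namespace + " conflicts with " + claim.namespace);
            }
        }
        claims.add(new Claim(namespace, low, high));
    }
}
```

With the standard ranges registered first, a non-standard namespace claiming, say, -2000 to -1501 would be accepted, while one claiming 400 to 600 would be rejected as conflicting with FO.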

Challenges in Creating an Independent FO Tree

The overriding challenge in creating an independent FO Tree is that fully resolving many FO Tree values is dependent on information from outside of the raw FO Tree data.

One area where this seems to be true is Fonts. The FOray model wants FOTree to not be dependent on layout or rendering information, so that it can be reused in several render contexts. Since the availability of Fonts may differ from one render context to another, ideally Font resolution should be deferred until layout. However, some FOTree attributes depend on information from the resolved font. (The baseline and alignment properties in the Area Alignment property section are good examples). There are several potential ways to handle this:

  • Have the layout or AreaTree pass the resolved font back to FOTree for these computations.
  • If we wish to have the fonts resolved on the FOTree side, provide a mechanism to re-resolve them if the FOTree wants to be reused in a different render context.
  • Move the computation of the resolved values over into the AreaTree itself. In other words, have the FOTree track only the nominal font, and have any methods that require a resolved font to exist in the Area Tree, to pull the nominal font information from FOTree and to resolve and use the font information in the AreaTree.


The order in which an FO object and its properties are created is important, especially when validation is considered. Here is the order in which the key events occur:

  1. An empty PropertyList is created.
  2. The FObj is constructed, with the empty PropertyList encapsulated in it. If the object constructor requires a more specific type for the parent than is supplied from the parser, the object maker factory is responsible for validating the cast and throwing an exception if the type is not right. This allows objects to avoid unnecessary casts on their parents as the tree is being used. No other validation should be done during construction.
  3. The new object knows who its parent is, but the parent does not know about it yet. Also, the new object does not yet know what its properties are, nor what its children are.
  4. The FObj.validateAncestry() method is run. This allows the object to validate its parent and other ancestors if it needs to. Note that the parent type may have been validated during construction.
  5. The properties are parsed.
  6. The FObj.validateProperties() method is run. This allows the object to do any validation work that it needs to on the properties themselves.
  7. The new object is registered as a child of its parent. The parent may do validation work on the child during this registration.
  8. The FObj startup() method is run.
  9. All children are parsed. As each child is parsed, the FObj.addChild(FONode) method is run. (This is the registration process mentioned above, but now we are looking at it from the standpoint of being the parent instead of being the child). This method is abstract and the parameter type cannot be made more specific, but it allows an exception to be thrown if the node is not of the correct type. This allows us to cast the children as needed, to do specialized storage of them, and to accumulate any state information that might be needed. In some cases this state information needs to be accumulated for each child because later children depend on it. The parsing of a table is a good example. The information parsed from the column children is useful in validating later children of the table-body, table-header, and table-footer children.
  10. The FObj.end() method is run. This allows the object to complete any state computations it might have before proceeding. It now knows that all children have been completely parsed and added.
  11. The FObj.validateDescendants() method is run. This allows the object to do any validation work on the descendant objects that was not completed as the descendant objects were added. This is especially useful for looking for errors of omission, as those might not necessarily be caught in any of the steps above. For example, only after parsing all children of a table can we be sure that it had at least one table-body child.
  12. The end-object event is fired.
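The sequence above can be sketched as a small recursive builder that logs each step. Everything here is a simplified stand-in: Element, SketchFObj, and build() are invented for the illustration, each hook only records that it ran, and the method names simply mirror the steps in the list rather than the actual FOray signatures.

```java
import java.util.Arrays;
import java.util.List;

final class LifecycleSketch {

    /** Hypothetical parsed element: a name plus nested child elements. */
    static final class Element {
        final String name;
        final List<Element> children;

        Element(String name, Element... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Simplified stand-in for FObj; each hook only records that it ran. */
    static final class SketchFObj {
        final String name;
        final SketchFObj parent;
        private final List<String> log;

        SketchFObj(String name, SketchFObj parent, List<String> log) {
            this.name = name;
            this.parent = parent;
            this.log = log;
            note("constructed");
        }

        void note(String step) {
            log.add(name + ": " + step);
        }

        void addChild(SketchFObj child) {
            note("addChild(" + child.name + ")");
        }
    }

    /** Builds one object and its descendants in the order listed above. */
    static SketchFObj build(Element element, SketchFObj parent, List<String> log) {
        SketchFObj obj = new SketchFObj(element.name, parent, log); // steps 1-3
        obj.note("validateAncestry");                                // step 4
        obj.note("properties parsed");                               // step 5
        obj.note("validateProperties");                              // step 6
        if (parent != null) {
            parent.addChild(obj);                                    // step 7
        }
        obj.note("startup");                                         // step 8
        for (Element child : element.children) {
            build(child, obj, log);                                  // step 9
        }
        obj.note("end");                                             // step 10
        obj.note("validateDescendants");                             // step 11
        obj.note("end-object event");                                // step 12
        return obj;
    }
}
```

Building a table element with a single table-body child produces a log in which the table's startup entry precedes everything from the table-body, and the table's end entry follows everything from the table-body.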

It may turn out to be useful to fire events at other places along the way. Note that the FObj.start() method runs in pre-traversal order and that the FObj.end() method runs in post-traversal order. From all of this, the optimal times for various validation tasks on an FObj instance are as follows:

  • FObj construction: validate ancestry.
  • FObj start() method: validate properties.
  • FObj end() method: validate children.

fo:marker and fo:retrieve-marker

Essentially we are asked to graft a marker's content into a retrieve-marker location. This presents some interesting challenges. The brute force approach is to simply copy the marker content each time it is needed by a new retrieve-marker. This interferes with some of FOray's goals, especially leanness of memory use and the ability to round-trip the FOTree. So we are left with these major challenges:

fo:marker inheritance

One solution considered is to lock the marker with a retrieve-marker instance, then release it when done. However, this is inefficient. The AreaTree must be involved, therefore it must, for every trait computation, search within itself to see if it is in a marker-generated area, then get the retrieve-marker, then lock the marker, then remember to undo it all when done.

The solution implemented is to pass something through the FOTree that tells it how to do the substitution when it is needed. The FOContext interface provides a method that can return the appropriate RetrieveMarker instance to use when a Marker instance is found in the FOTree. AreaNode implements this.
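The general idea can be sketched as follows: when a lookup for an inherited value climbs the tree and reaches a marker, it continues from the retrieve-marker supplied by the context instead of from the marker's own parent. The names Node, Context, retrieveMarkerFor, and inherited are hypothetical and chosen only for this illustration; the real FOContext method and the trait methods in FOray may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

final class MarkerContextSketch {

    /** Simplified FO tree node holding explicitly specified property values. */
    static class Node {
        final Node parent;
        final Map<String, String> specified = new HashMap<>();

        Node(Node parent) {
            this.parent = parent;
        }
    }

    static final class Marker extends Node {
        Marker(Node parent) {
            super(parent);
        }
    }

    static final class RetrieveMarker extends Node {
        RetrieveMarker(Node parent) {
            super(parent);
        }
    }

    /** Stand-in for the role the text gives to FOContext. */
    interface Context {
        RetrieveMarker retrieveMarkerFor(Marker marker);
    }

    /** Looks up an inherited value, detouring through the grafting point at a marker. */
    static String inherited(Node start, String property, Context context) {
        Node node = start;
        while (node != null) {
            String value = node.specified.get(property);
            if (value != null) {
                return value;
            }
            if (node instanceof Marker) {
                // Continue from the retrieve-marker, not from the marker's own parent.
                node = context.retrieveMarkerFor((Marker) node);
            } else {
                node = node.parent;
            }
        }
        return null; // no ancestor specified the property
    }
}
```

Because the context is passed in with each request, no locking or copying of the marker content is needed, and the same marker can be resolved against different retrieve-markers.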

fo:marker line-breaking

Line-breaking presents a challenge, because it has its own abstraction of the data it needs. Some extra work was required to pass FOContext information through the line-breaking system, use it within that system, and then pass it through to the processes that create Area instances. The solution chosen was to add subinterfaces for LineText and LineNonText to the FOTree system (FOLineText and FOLineNonText). These subinterfaces have methods that can be used to wrap real data items up with their context information, so that the context-aware values are used by the line-breaking, and then to unwrap them on the other side for Area instantiation.

To Do

Known code deficiencies are recorded in the source code itself, and tagged with the string "TODO".

General features that we would like to add include the following:

  • If the user doesn't need round-tripping of the FOTree, consider allowing earlier binding of property values where possible. Also, if performance is improved by it, allow bound properties to be pushed down to child objects that need them. These are purely for performance reasons.
  • So far, data validation has been done in an opportunistic, haphazard manner. We need to comprehensively address every FO object for proper ancestry, properties, and descendants.
  • We need a testing system specifically for the FOTree.
  • After absolutely all property-related to-do items in the code itself are cleared, we need a project to clean up and consolidate code in the property sub-system.
  • Devise a new (extension) Formatting Object whose purpose is solely to contain properties, and which can be referenced from other Formatting objects by name. This would significantly reduce parsing time and increase FO document human readability, and may make stylesheet creation more intuitive.
  • FOTrees that cannot fit in memory should be serialized.
  • Consider replacing some trait method parameters with FOContext, and adding the method parameters to it instead. This might make trait processing more efficient by skipping context computations if they are not actually needed.

Resolved Issues

Compound properties can be created either with a short form or a complete form, or both. See Section 5.11 of the standard for an example using space-before. Coding either one is pretty straightforward, but handling the situation where both can occur in any order makes the code much more complex, requiring some method of keeping track of whether a component of the property was explicitly set, or whether an initial value was created. All of this can be avoided by ensuring that any short form is processed before any complete form. This can be accomplished by processing the attributes in alphabetical order. To accomplish this, we have chosen the expedient of a virtual sort of the attributes before they are processed. The SAX-generated attribute list itself is untouched, but a separate integer array is created which holds the order in which the attribute list elements should be processed. The sort itself occurs in int[] FOTreeBuilder.sortAttributes(Attributes attlist).
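A minimal sketch of such a virtual sort is shown below. It is not the FOray implementation: it only follows the description above, and it assumes that attributes are ordered by their qualified names. The SAX types Attributes and AttributesImpl are the standard ones shipped with the JDK.

```java
import java.util.Arrays;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

final class AttributeSortSketch {

    /** Returns attribute indexes in alphabetical order of qualified name; attlist is not modified. */
    static int[] sortAttributes(Attributes attlist) {
        Integer[] order = new Integer[attlist.getLength()];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort the indexes, not the attributes themselves.
        Arrays.sort(order, (a, b) -> attlist.getQName(a).compareTo(attlist.getQName(b)));
        int[] result = new int[order.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = order[i];
        }
        return result;
    }

    public static void main(String[] args) {
        AttributesImpl attlist = new AttributesImpl();
        attlist.addAttribute("", "space-before.optimum", "space-before.optimum", "CDATA", "12pt");
        attlist.addAttribute("", "font-size", "font-size", "CDATA", "10pt");
        attlist.addAttribute("", "space-before", "space-before", "CDATA", "6pt");
        // Process the attributes in sorted order; the short form comes before the complete form.
        for (int index : sortAttributes(attlist)) {
            System.out.println(attlist.getQName(index) + "=" + attlist.getValue(index));
        }
    }
}
```

Alphabetical order works here because a short form such as space-before is a prefix of each of its complete forms, such as space-before.optimum, and a prefix always sorts first.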

Open Issues

This section contains issues that we have essentially had to bypass in order to keep moving, but that we did not want to lose track of. They represent known deficiencies in the current FOray FOTree implementation.

  1. Expressions are not currently handled completely. Specifically, they are only parsed for some properties. This is primarily because the scheme for processing them has changed from evaluation at parse-time to evaluation at use-time. The expression parser probably needs to be overhauled to use more of a top-down approach, and should be fed an array of suitable possible return values.
  2. Markers and static content still obtain their traits from their FOTree location instead of their grafting point.
  3. Border and padding traits need to be reviewed to make sure that they are being extracted in the correct order.
  4. All “corresponding” properties need another round of review.

Issues with the XSL-FO Standard

Remarks in this section apply to the XSL-FO 1.1 Working Draft.

  1. Within the line-height specification is an implied concept of a line-height-multiplier. The “normal”, <number>, <percentage>, and “inherit” values all operate directly on this principle. For <number>, the spec explains that the multiplier is what should be inherited, not the computed value. Since “normal” is merely a special case of <number>, it is included here as well. What about <percentage>? Also, since a line-height-multiplier can be inferred from <length> divided by font-size, should that value be inherited also? Some clarifying text would help a lot here.
  2. There is some confusion about the column-number trait as it applies to fo:table-cell. The spec (both 1.0 and 1.1WD) says “The initial value is the current column-number. For the first table-cell in a table-row, the current column number is 1.” However, in the case where a previous cell in the row has been affected by a number-rows-spanned trait on a cell from a previous row, this does not seem to be correct. For example, if the first column of the previous row has number-rows-spanned="2", then that cell includes the cell in the first column of the current row. This could mean that the first specified cell in the current row should have a column-number initial value of 2. So the question is whether a) the FO stylesheet author is responsible for indicating that the column-number for such a cell is 2, or b) the computed column-number should consider the effect of number-rows-spanned from previous rows.
  3. There is a nasty problem with the definition of the relationship between the <length> datatype and the <percentage> datatype. On many (perhaps all) properties where a value may be a length, a percentage is also allowed. The definition shows “<length> | <percentage>”. In other words, an “or” relationship is implied. However, note 1 under Section 5.11 states “Since a <percentage> value, that is not interpreted as "auto", is a valid <length> value it may be used in a short form.” The length definition indicates that a unit qualification is required. Is a percent-sign considered a unit qualification? Percentages, considered by themselves, have a length-power of zero, while all other length values have a length power of 1. This is not such a big problem when length and percentage are considered together, but the problem becomes more obvious when length-range (or space) and percentage are considered together, as they are for leader-length and space-start. In each of these cases, percentages are accepted, not in their role as a percentage per se, but in their role as either a short form for a set of length definitions, or as length definitions themselves. In other words, if a length can be specified as a percentage (with no other unit qualifier), it is incorrect to allow both a length and a percentage in the definition. They are not two separate things, but rather one is a subset of the other.
  4. Should U+2028 result in a line-break? If not, how should it be treated?
  5. Should U+2029 be interpreted as the beginning of a new block? If so, presumably the innermost block? If not, how should it be treated?
  6. Between 1.0 and 1.1, white-space-treatment changed. In 1.0, it was processed as part of FO Tree refinement; in 1.1, it is part of line-breaking, which is clearly an Area Tree concept. In 1.0, it had to be done before linefeed-treatment or white-space-collapse. In 1.1, it appears that it must be done after these. Please clarify.
  7. In most cases, the entire width of a border lies entirely inside the border-rectangle. However, per CSS2 Section 17.6.2, pertaining to border-collapse, it appears that when that model is used, one-half of the outer borders fall into the margin-area. Please clarify for XSL-FO.