Document-centric XML processing

Document-centric XML processing is one of two conceptual approaches to processing XML content, the other being data-centric XML processing. Although there is no universally accepted definition of the term, the following articles discuss features typically associated with this approach:

Applications based on the document-centric approach

VTD-XML

Before VTD-XML, traditional XML processing models (e.g. DOM, SAX and JAXB) were designed around the notion of objects. The XML text, treated merely as the serialization of those objects, was relegated to the status of a second-class citizen. Applications are built on DOM nodes, strings and various business objects, but rarely on the physical documents. This object-oriented approach to XML processing has serious performance problems on several fronts. Not only are object creation and garbage collection inherently inefficient in memory and CPU use, but applications also incur the cost of re-serializing the document for even the smallest change to the original text.
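The re-serialization cost can be illustrated with a minimal sketch. It uses Python's standard xml.dom.minidom as a stand-in for an object-based model, not any of the libraries named above; the function name edit_via_dom and the sample document are hypothetical. Changing one text value requires building the whole node tree and writing the whole document out again, and bytes that were never edited (here, the quote style of an attribute) may come back different.

```python
from xml.dom import minidom


def edit_via_dom(xml_text: str, tag: str, new_text: str) -> str:
    """Change the text of the first <tag> element by going through a DOM tree.

    Hypothetical helper for illustration: the entire document is parsed into
    node objects, one text node is modified, and the entire tree is then
    serialized back to text.
    """
    dom = minidom.parseString(xml_text)           # every node becomes an object
    element = dom.getElementsByTagName(tag)[0]
    element.firstChild.data = new_text            # the one intended change
    return dom.documentElement.toxml()            # the whole tree is re-serialized


original = "<order id='7'><item>pen</item><price>9.99</price></order>"
rewritten = edit_via_dom(original, "price", "12.50")

# The attribute was not edited, yet its original single-quoted form is not
# preserved, because the output is regenerated from objects rather than
# copied from the source text.
print(rewritten)
```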

With document-centric XML processing, the XML document (the persistent format of the data) is the starting point from which everything else comes about. Whether parsing, evaluating XPath, modifying content, or slicing element fragments, by default you no longer work directly with objects. You do so only when it makes sense. More often than not, one treats documents purely as syntax, and thinks in bytes, byte arrays, integers, offsets, lengths, fragments and namespace-compensated fragments. The first-class citizen in this paradigm is the XML text. The object-centric notions of XML processing, such as serialization and de-serialization (or marshalling and unmarshalling), are often displaced, if not replaced, by the more document-centric notions of parsing and composition.
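Thinking in offsets and lengths can be sketched as follows. This is not VTD-XML's API; it is a minimal illustration built on Python's standard xml.parsers.expat module, whose CurrentByteIndex attribute reports where each parse event starts in the input. The names locate_content and splice are hypothetical, and the sketch assumes the target element is not written as an empty-element tag and that its content does not begin with a comment or processing instruction.

```python
import xml.parsers.expat


def locate_content(doc: bytes, tag: str):
    """Return (offset, length), in bytes, of the content of the first <tag> element.

    Only integers are recorded while scanning; no node objects are built.
    """
    parser = xml.parsers.expat.ParserCreate()
    span = {"start": None, "end": None}
    state = {"inside": False, "depth": 0}

    def mark_content_start():
        # The first event after the start tag begins where the content begins.
        if state["inside"] and span["start"] is None:
            span["start"] = parser.CurrentByteIndex

    def on_start(name, attrs):
        mark_content_start()
        if state["inside"]:
            state["depth"] += 1
        elif name == tag and span["end"] is None:
            state["inside"] = True
            state["depth"] = 1

    def on_end(name):
        mark_content_start()
        if state["inside"]:
            state["depth"] -= 1
            if state["depth"] == 0:
                # This event starts at the "<" of the matching end tag.
                span["end"] = parser.CurrentByteIndex
                state["inside"] = False

    def on_text(data):
        mark_content_start()

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(doc, True)

    if span["end"] is None:
        raise KeyError(tag)
    return span["start"], span["end"] - span["start"]


def splice(doc: bytes, offset: int, length: int, replacement: bytes) -> bytes:
    """Compose a new document: untouched bytes are copied verbatim."""
    return doc[:offset] + replacement + doc[offset + length:]


doc = b"<order id='7'><item>pen</item><price>9.99</price></order>"
offset, length = locate_content(doc, "price")
fragment = doc[offset:offset + length]           # slice the content as raw bytes
updated = splice(doc, offset, length, b"12.50")  # edit without re-serializing
print(fragment, updated)
```

Everything outside the spliced range is carried over byte for byte, so details such as the single-quoted attribute survive the edit. That is the sense in which composition displaces serialization here.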