public class DOM2DTM extends DTMDefaultBaseIterators
The DOM2DTM class serves up a DOM's contents via the DTM API.
Note that it doesn't necessarily represent a full Document
tree. You can wrap a DOM2DTM around a specific node and its subtree
and the right things should happen. (I don't _think_ we currently
support DocumentFragment nodes as roots, though that might be worth
considering.)
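As a rough illustration of wrapping a subtree rather than a whole Document, the sketch below builds a `DOMSource` around a single element using only standard JAXP classes. It stops short of calling the DOM2DTM constructor, since that needs a `DTMManager` and the other arguments listed in the constructor summary. The class name `SubtreeSourceSketch`, its helper methods, and the sample XML are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xalan: prepares a DOMSource rooted at a subtree.
final class SubtreeSourceSketch {

    /** Parses XML text into a namespace-aware DOM Document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** Wraps the first element with the given tag name, not the whole Document. */
    static DOMSource sourceForFirst(Document doc, String tagName) {
        Node subtreeRoot = doc.getElementsByTagName(tagName).item(0);
        return new DOMSource(subtreeRoot);
    }

    public static void main(String[] args) {
        Document doc = parse("<book><chapter><title>One</title></chapter></book>");
        DOMSource source = sourceForFirst(doc, "chapter");
        // The source is rooted at the chapter element, so a wrapper built
        // from it would only see that element and its descendants.
        System.out.println(source.getNode().getNodeName());
    }
}
```

A `DOMSource` prepared this way is the kind of object the `domSource` constructor parameter expects.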
Note too that we do not currently attempt to track document
mutation. If you alter the DOM after wrapping DOM2DTM around it,
all bets are off.
Modifier and Type | Class and Description |
---|---|
static interface | DOM2DTM.CharacterNodeHandler |
DTMDefaultBaseIterators.AncestorIterator, DTMDefaultBaseIterators.AttributeIterator, DTMDefaultBaseIterators.ChildrenIterator, DTMDefaultBaseIterators.DescendantIterator, DTMDefaultBaseIterators.FollowingIterator, DTMDefaultBaseIterators.FollowingSiblingIterator, DTMDefaultBaseIterators.InternalAxisIteratorBase, DTMDefaultBaseIterators.NamespaceAttributeIterator, DTMDefaultBaseIterators.NamespaceChildrenIterator, DTMDefaultBaseIterators.NamespaceIterator, DTMDefaultBaseIterators.NthDescendantIterator, DTMDefaultBaseIterators.ParentIterator, DTMDefaultBaseIterators.PrecedingIterator, DTMDefaultBaseIterators.PrecedingSiblingIterator, DTMDefaultBaseIterators.RootIterator, DTMDefaultBaseIterators.SingletonIterator, DTMDefaultBaseIterators.TypedAncestorIterator, DTMDefaultBaseIterators.TypedAttributeIterator, DTMDefaultBaseIterators.TypedChildrenIterator, DTMDefaultBaseIterators.TypedDescendantIterator, DTMDefaultBaseIterators.TypedFollowingIterator, DTMDefaultBaseIterators.TypedFollowingSiblingIterator, DTMDefaultBaseIterators.TypedNamespaceIterator, DTMDefaultBaseIterators.TypedPrecedingIterator, DTMDefaultBaseIterators.TypedPrecedingSiblingIterator, DTMDefaultBaseIterators.TypedRootIterator, DTMDefaultBaseIterators.TypedSingletonIterator
Modifier and Type | Field and Description |
---|---|
protected Vector | m_nodes: The node objects. |
DEFAULT_BLOCKSIZE, DEFAULT_NUMBLOCKS, DEFAULT_NUMBLOCKS_SMALL, m_documentBaseURI, m_dtmIdent, m_elemIndexes, m_expandedNameTable, m_exptype, m_firstch, m_indexing, m_mgr, m_mgrDefault, m_namespaceDeclSetElements, m_namespaceDeclSets, m_nextsib, m_parent, m_prevsib, m_shouldStripWhitespaceStack, m_shouldStripWS, m_size, m_traversers, m_wsfilter, m_xstrf, NOTPROCESSED, ROOTNODE
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NAMESPACE_NODE, NOTATION_NODE, NTYPES, NULL, PROCESSING_INSTRUCTION_NODE, ROOT_NODE, TEXT_NODE
Constructor and Description |
---|
DOM2DTM(DTMManager mgr, DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing): Construct a DOM2DTM object from a DOM node. |
Modifier and Type | Method and Description |
---|---|
protected int | addNode(Node node, int parentIndex, int previousSibling, int forceNodeType): Construct the node map from the node. |
void | dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize): Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). |
protected static void | dispatchNodeData(Node node, ContentHandler ch, int depth): Retrieve the text content of a DOM subtree, dispatching it to a user-supplied ContentHandler. |
void | dispatchToEvents(int nodeHandle, ContentHandler ch): Directly create SAX parser events from a subtree. |
int | getAttributeNode(int nodeHandle, String namespaceURI, String name): Retrieves an attribute node by qualified name and namespace URI. |
ContentHandler | getContentHandler(): getContentHandler returns "our SAX builder" -- the thing that someone else should send SAX events to in order to extend this DTM model. |
DeclHandler | getDeclHandler(): Return this DTM's DeclHandler. |
String | getDocumentTypeDeclarationPublicIdentifier(): Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML]. |
String | getDocumentTypeDeclarationSystemIdentifier(): A document type declaration information item has the following properties: 1. |
DTDHandler | getDTDHandler(): Return this DTM's DTDHandler. |
int | getElementById(String elementId): Returns the Element whose ID is given by elementId. |
EntityResolver | getEntityResolver(): Return this DTM's EntityResolver. |
ErrorHandler | getErrorHandler(): Return this DTM's ErrorHandler. |
int | getHandleOfNode(Node node): Get the handle from a Node. |
LexicalHandler | getLexicalHandler(): Return this DTM's lexical handler. |
String | getLocalName(int nodeHandle): Given a node handle, return its XPath-style localname. |
String | getNamespaceURI(int nodeHandle): Given a node handle, return its DOM-style namespace URI. (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.) |
protected int | getNextNodeIdentity(int identity): Get the next node identity value in the list, and call the iterator if it hasn't been added yet. |
Node | getNode(int nodeHandle): Return a DOM node for the given node handle. |
protected static void | getNodeData(Node node, FastStringBuffer buf): Retrieve the text content of a DOM subtree, appending it into a user-supplied FastStringBuffer object. |
String | getNodeName(int nodeHandle): Given a node handle, return its DOM-style node name. |
String | getNodeNameX(int nodeHandle): Given a node handle, return the XPath node name. |
String | getNodeValue(int nodeHandle): Given a node handle, return its node value. |
int | getNumberOfNodes(): Get the number of nodes that have been added. |
String | getPrefix(int nodeHandle): Given a namespace handle, return the prefix that the namespace decl is mapping. |
SourceLocator | getSourceLocatorFor(int node): No source information is available for DOM2DTM, so return null here. |
XMLString | getStringValue(int nodeHandle): Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). |
String | getUnparsedEntityURI(String name): The getUnparsedEntityURI function returns the URI of the unparsed entity with the specified name in the same document as the context node (see [3.3 Unparsed Entities]). |
boolean | isAttributeSpecified(int attributeHandle): 5. |
boolean | isWhitespace(int nodeHandle): Determine if the string-value of a node is whitespace. |
protected Node | lookupNode(int nodeIdentity): Get a Node from an identity index. |
boolean | needsTwoThreads() |
protected boolean | nextNode(): This method iterates to the next node that will be added to the table. |
void | setIncrementalSAXSource(IncrementalSAXSource source): Bind an IncrementalSAXSource to this DTM. |
void | setProperty(String property, Object value): For the moment all the run time properties are ignored by this class. |
getAxisIterator, getTypedAxisIterator
getAxisTraverser
_exptype, _firstch, _level, _nextsib, _parent, _prevsib, _type, appendChild, appendTextChild, declareNamespaceInContext, documentRegistration, documentRelease, dumpDTM, dumpNode, ensureSizeOfIndex, error, findGTE, findInSortedSuballocatedIntVector, findNamespaceContext, getDocument, getDocumentAllDeclarationsProcessed, getDocumentBaseURI, getDocumentEncoding, getDocumentRoot, getDocumentStandalone, getDocumentSystemIdentifier, getDocumentVersion, getDTMIDs, getExpandedTypeID, getExpandedTypeID, getFirstAttribute, getFirstAttributeIdentity, getFirstChild, getFirstNamespaceNode, getLastChild, getLevel, getLocalNameFromExpandedNameID, getManager, getNamespaceFromExpandedNameID, getNamespaceType, getNextAttribute, getNextAttributeIdentity, getNextNamespaceNode, getNextSibling, getNodeHandle, getNodeIdent, getNodeType, getOwnerDocument, getParent, getPreviousSibling, getShouldStripWhitespace, getStringValueChunk, getStringValueChunkCount, getTypedAttribute, getTypedFirstChild, getTypedNextSibling, hasChildNodes, indexNode, isCharacterElementContentWhitespace, isDocumentAllDeclarationsProcessed, isNodeAfter, isSupported, makeNodeHandle, makeNodeIdentity, migrateTo, popShouldStripWhitespace, pushShouldStripWhitespace, setDocumentBaseURI, setFeature, setShouldStripWhitespace, supportsPreStripping
protected Vector m_nodes
public DOM2DTM(DTMManager mgr, DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing)
mgr
- The DTMManager who owns this DTM.
domSource
- the DOM source that this DTM will wrap.
dtmIdentity
- The DTM identity ID for this DTM.
whiteSpaceFilter
- The white space filter for this DTM, which may
be null.
xstringfactory
- XMLString factory for creating character content.
doIndexing
- true if the caller considers it worth it to use
indexing schemes.
protected int addNode(Node node, int parentIndex, int previousSibling, int forceNodeType)
node
- The node that is to be added to the DTM.
parentIndex
- The current parent index.
previousSibling
- The previous sibling index.
forceNodeType
- If not DTM.NULL, overrides the DOM node type.
Used to force nodes to Text rather than CDATASection when their
coalesced value includes ordinary Text nodes (current DTM behavior).
public int getNumberOfNodes()
getNumberOfNodes
in class DTMDefaultBase
protected boolean nextNode()
nextNode
in class DTMDefaultBase
public Node getNode(int nodeHandle)
getNode
in interface DTM
getNode
in class DTMDefaultBase
nodeHandle
- The node ID.
protected Node lookupNode(int nodeIdentity)
protected int getNextNodeIdentity(int identity)
getNextNodeIdentity
in class DTMDefaultBase
identity
- The node identity (index).
public int getHandleOfNode(Node node)
%OPT% This will be pretty slow.
%REVIEW% This relies on being able to test node-identity via object-identity. DTM2DOM proxying is a great example of a case where that doesn't work. DOM Level 3 will provide the isSameNode() method to fix that, but until then this is going to be flaky.
node
- A node, which may be null.
DTM.NULL.
public int getAttributeNode(int nodeHandle, String namespaceURI, String name)
getAttributeNode
in interface DTM
getAttributeNode
in class DTMDefaultBase
nodeHandle
- int Handle of the node upon which to look up this attribute.
namespaceURI
- The namespace URI of the attribute to
retrieve, or null.
name
- The local name of the attribute to
retrieve.
(nodeName) or DTM.NULL if there is no such attribute.
public XMLString getStringValue(int nodeHandle)
getStringValue
in interface DTM
getStringValue
in class DTMDefaultBase
nodeHandle
- The node ID.
public boolean isWhitespace(int nodeHandle)
nodeHandle
- The node Handle.
protected static void getNodeData(Node node, FastStringBuffer buf)
There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us.
%REVIEW% Actually, since this method operates on the DOM side of the fence rather than the DTM side, it SHOULDN'T do any special handling. The DOM does what the DOM does; if you want DTM-level abstractions, use DTM-level methods.
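The subtree walk described here can be sketched with plain `org.w3c.dom` calls. This is an illustration only, not the Xalan source: it substitutes `java.lang.StringBuilder` for `FastStringBuffer`, and the class name `NodeDataSketch` and its methods are hypothetical. As in the description, it does no whitespace stripping and simply concatenates what the DOM holds.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the getNodeData idea, using StringBuilder.
final class NodeDataSketch {

    /** Appends the content of every Text and CDATASection node under node, in document order. */
    static void collectText(Node node, StringBuilder buf) {
        switch (node.getNodeType()) {
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                buf.append(node.getNodeValue());
                break;
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
            case Node.ELEMENT_NODE:
                for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
                    collectText(child, buf);
                }
                break;
            default:
                // Comments, processing instructions and attributes add nothing here.
                break;
        }
    }

    /** Convenience wrapper: parse the XML text and gather the text of the whole document. */
    static String textOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            StringBuilder buf = new StringBuilder();
            collectText(doc, buf);
            return buf.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf("<a>one<b>two</b><!--skip--><![CDATA[<3>]]></a>"));
    }
}
```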
node
- Node whose subtree is to be walked, gathering the
contents of all Text or CDATASection nodes.
buf
- FastStringBuffer into which the contents of the text
nodes are to be concatenated.
public String getNodeName(int nodeHandle)
getNodeName
in interface DTM
getNodeName
in class DTMDefaultBase
nodeHandle
- the id of the node.
public String getNodeNameX(int nodeHandle)
getNodeNameX
in interface DTM
getNodeNameX
in class DTMDefaultBase
nodeHandle
- the id of the node.
public String getLocalName(int nodeHandle)
getLocalName
in interface DTM
getLocalName
in class DTMDefaultBase
nodeHandle
- the id of the node.
public String getPrefix(int nodeHandle)
%REVIEW% Are you sure you want "" for no prefix?
%REVIEW-COMMENT% I think so... not totally sure. -sb
getPrefix
in interface DTM
getPrefix
in class DTMDefaultBase
nodeHandle
- the id of the node.
public String getNamespaceURI(int nodeHandle)
%REVIEW% Null or ""? -sb
getNamespaceURI
in interface DTM
getNamespaceURI
in class DTMDefaultBase
nodeHandle
- the id of the node.
public String getNodeValue(int nodeHandle)
getNodeValue
in interface DTM
getNodeValue
in class DTMDefaultBase
nodeHandle
- The node id.
public String getDocumentTypeDeclarationSystemIdentifier()
getDocumentTypeDeclarationSystemIdentifier
in interface DTM
getDocumentTypeDeclarationSystemIdentifier
in class DTMDefaultBase
public String getDocumentTypeDeclarationPublicIdentifier()
getDocumentTypeDeclarationPublicIdentifier
in interface DTM
getDocumentTypeDeclarationPublicIdentifier
in class DTMDefaultBase
public int getElementById(String elementId)
Returns the Element whose ID is given by elementId. If no such element exists, returns DTM.NULL. Behavior is not defined if more than one element has this ID. Attributes (including those with the name "ID") are not of type ID unless so defined by DTD/Schema information available to the DTM implementation. Implementations that do not know whether attributes are of type ID or not are expected to return DTM.NULL.
%REVIEW% Presumably IDs are still scoped to a single document, and this operation searches only within a single document, right? Wouldn't want collisions between DTMs in the same process.
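The point about ID typing can be seen at the DOM level, which is where DOM2DTM gets its data. The sketch below uses only the standard DOM API, not the DTM API. An attribute that merely happens to be named `id` is not found by `Document.getElementById` until it is given ID type, here through the DOM Level 3 call `Element.setIdAttribute` as a stand-in for a DTD or schema declaration. The class `IdTypingSketch` and the sample XML are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo of ID typing at the DOM level.
final class IdTypingSketch {

    /** Parses XML text (no DTD, so no attribute is typed as ID). */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    /** Gives the named attribute ID type, much as an ATTLIST declaration would. */
    static void declareId(Element element, String attributeName) {
        element.setIdAttribute(attributeName, true);
    }

    public static void main(String[] args) {
        Document doc = parse("<catalog><item id=\"a1\"/></catalog>");
        Element item = (Element) doc.getDocumentElement().getFirstChild();
        // Without ID typing the lookup finds nothing, even though the attribute is called "id".
        System.out.println("before: " + doc.getElementById("a1"));
        declareId(item, "id");
        System.out.println("after:  " + doc.getElementById("a1").getTagName());
    }
}
```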
getElementById
in interface DTM
getElementById
in class DTMDefaultBase
elementId
- The unique id value for an element.
public String getUnparsedEntityURI(String name)
XML processors may choose to use the System Identifier (if one is provided) to resolve the entity, rather than the URI in the Public Identifier. The details are dependent on the processor, and we would have to support some form of plug-in resolver to handle this properly. Currently, we simply return the System Identifier if present, and hope that it is a usable URI or that our caller can map it to one. TODO: Resolve Public Identifiers... or consider changing function name.
If we find a relative URI reference, XML expects it to be resolved in terms of the base URI of the document. The DOM doesn't do that for us, and it isn't entirely clear whether that should be done here; currently that's pushed up to a higher level of our application. (Note that DOM Level 1 didn't store the document's base URI.) TODO: Consider resolving Relative URIs.
(The DOM's statement that "An XML processor may choose to completely expand entities before the structure model is passed to the DOM" refers only to parsed entities, not unparsed, and hence doesn't affect this function.)
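Since relative URI resolution is left to the caller, a caller might resolve the returned System Identifier against the document's base URI along these lines. This is a minimal sketch using `java.net.URI`; the class `EntityUriSketch`, its method, and the example URIs are hypothetical, and DOM2DTM itself does not do this.

```java
import java.net.URI;

// Hypothetical caller-side helper for resolving a relative System Identifier.
final class EntityUriSketch {

    /**
     * Resolves systemId against the document's base URI.
     * An absolute systemId is returned unchanged.
     */
    static String resolveSystemId(String documentBaseUri, String systemId) {
        return URI.create(documentBaseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        System.out.println(resolveSystemId("http://example.com/docs/a.xml", "img/pic.gif"));
    }
}
```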
getUnparsedEntityURI
in interface DTM
getUnparsedEntityURI
in class DTMDefaultBase
name
- A string containing the Entity Name of the unparsed
entity.
public boolean isAttributeSpecified(int attributeHandle)
isAttributeSpecified
in interface DTM
isAttributeSpecified
in class DTMDefaultBase
attributeHandle
- the attribute handle
true if the attribute was specified; false if it was defaulted.
public void setIncrementalSAXSource(IncrementalSAXSource source)
source
- The IncrementalSAXSource that we want to receive events from
on demand.
public ContentHandler getContentHandler()
public LexicalHandler getLexicalHandler()
public EntityResolver getEntityResolver()
public DTDHandler getDTDHandler()
public ErrorHandler getErrorHandler()
public DeclHandler getDeclHandler()
public boolean needsTwoThreads()
public void dispatchCharactersEvents(int nodeHandle, ContentHandler ch, boolean normalize) throws SAXException
dispatchCharactersEvents
in interface DTM
dispatchCharactersEvents
in class DTMDefaultBase
nodeHandle
- The node ID.
ch
- A non-null reference to a ContentHandler.
normalize
- true if the content should be normalized according to
the rules for the XPath normalize-space function.
SAXException
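To make the `normalize` flag concrete, the sketch below sends a string to a SAX `ContentHandler` through its `characters` method, optionally applying the XPath `normalize-space` rules first: leading and trailing whitespace is stripped and each internal run of space, tab, CR, or LF becomes one space. It works on a plain `String` rather than a node handle, and the class `CharactersDispatchSketch` and its methods are hypothetical, not the DOM2DTM implementation.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of dispatching a string-value as characters events.
final class CharactersDispatchSketch {

    /** XPath normalize-space: collapse XML whitespace runs and trim both ends. */
    static String normalizeSpace(String s) {
        // XML whitespace is space, tab, CR and LF only.
        String collapsed = s.replaceAll("[ \\t\\r\\n]+", " ");
        int start = collapsed.startsWith(" ") ? 1 : 0;
        int end = collapsed.length();
        if (end > start && collapsed.endsWith(" ")) {
            end--;
        }
        return collapsed.substring(start, end);
    }

    /** Passes the (optionally normalized) value to the handler in one characters call. */
    static void dispatch(String value, ContentHandler ch, boolean normalize) throws SAXException {
        char[] chars = (normalize ? normalizeSpace(value) : value).toCharArray();
        ch.characters(chars, 0, chars.length);
    }

    /** Runs dispatch against a handler that simply records what it receives. */
    static String collect(String value, boolean normalize) {
        StringBuilder out = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        };
        try {
            dispatch(value, recorder, normalize);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + collect("  a \n b\t", true) + "]");
    }
}
```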
protected static void dispatchNodeData(Node node, ContentHandler ch, int depth) throws SAXException
There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us.
%REVIEW% Note that as a DOM-level operation, it can be argued that this routine _shouldn't_ perform any processing beyond what the DOM already does, and that whitespace stripping and so on belong at the DTM level. If you want a stripped DOM view, wrap DTM2DOM around DOM2DTM.
node
- Node whose subtree is to be walked, gathering the
contents of all Text or CDATASection nodes.
SAXException
public void dispatchToEvents(int nodeHandle, ContentHandler ch) throws SAXException
dispatchToEvents
in interface DTM
dispatchToEvents
in class DTMDefaultBase
nodeHandle
- The node ID.
ch
- A non-null reference to a ContentHandler.
SAXException
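For readers without a DTM at hand, the general idea of replaying a DOM subtree as SAX events can be approximated with a JAXP identity transform from a `DOMSource` to a `SAXResult`. This is a stand-in technique from the standard library, not how DOM2DTM implements `dispatchToEvents`. The class `SubtreeEventsSketch` and its recording handler are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: turn a DOM tree into SAX events with an identity transform.
final class SubtreeEventsSketch {

    /** Parses the XML, replays it as SAX events, and returns the element events seen. */
    static List<String> elementEvents(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            // A Transformer created without a stylesheet copies its input to its output.
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new SAXResult(recorder));
        } catch (Exception e) {
            throw new IllegalStateException("could not replay sample XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(elementEvents("<a><b>hi</b><c/></a>"));
    }
}
```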
public void setProperty(String property, Object value)
property
- a String value
value
- an Object value
public SourceLocator getSourceLocatorFor(int node)
No source information is available for DOM2DTM, so return null here.
node
- an int value
Copyright © 2021 JBoss by Red Hat. All rights reserved.