JAXB 0.21: Class XMLScanner

javax.xml.marshal
Class XMLScanner

java.lang.Object
  |
  +--javax.xml.marshal.XMLScanner

public abstract class XMLScanner
extends Object

A scanner of XML input streams or data structures.

When unmarshalling XML into a content tree it is not necessary to use a full-fledged XML parser because schema-derived classes enforce all validity constraints as well as the following non-local well-formedness constraints of the XML 1.0 specification:

Element type match: The name in an element's end-tag must match the element type in its start-tag.
Unique Att Spec: No attribute name may appear more than once in the same start-tag or empty-element tag.

An XML scanner therefore enforces only the remaining lexical well-formedness constraints of XML 1.0.

An XML scanner is in one of the following states:

Start: The scanner is positioned at a start tag. This state may be followed by the AttributeName, Chars, Start, or End states. An empty tag, that is, a tag of the form <foo/>, will yield the Start state followed by the End state, possibly with some intervening attribute states.
AttributeName: The scanner is positioned at an attribute name. This state may be followed only by the AttributeValue state. This state will be entered exactly once for each attribute that is read. Attributes are read in the order in which they appear in the input document.
AttributeValue: The scanner is positioned at an attribute value. This state may be followed by the AttributeName, Chars, Start, or End states. If the tokenizeAttributeValue method is invoked then this state may also be followed by the AttributeValueToken state.
AttributeValueToken: The scanner is positioned at one of the tokens of a tokenized attribute value. This state may be followed by the AttributeValueToken, AttributeName, Chars, Start, or End states.
Chars: The scanner is positioned at some character content. This state may be followed by the Start or End states.
End: The scanner is positioned at an end tag. This state may be followed by the Chars, Start, End, or EndOfDocument states.
EndOfDocument: The scanner has reached the end of the input document, at which point it closes itself. The state of the scanner will not change after it reaches this state.
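The transitions listed above can be written down as a small lookup table. The following is only a sketch: the State enum, the ScanStateTable class, and the canFollow method are hypothetical names introduced for illustration and are not part of XMLScanner, which exposes its state only through the atFoo and takeFoo methods.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class ScanStateTable {
    // Hypothetical enum mirroring the documented states; not part of the real API.
    enum State {
        START, ATTRIBUTE_NAME, ATTRIBUTE_VALUE, ATTRIBUTE_VALUE_TOKEN, CHARS, END, END_OF_DOCUMENT
    }

    // For each state, the set of states that the description says may follow it.
    private static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);

    static {
        NEXT.put(State.START,
                EnumSet.of(State.ATTRIBUTE_NAME, State.CHARS, State.START, State.END));
        NEXT.put(State.ATTRIBUTE_NAME,
                EnumSet.of(State.ATTRIBUTE_VALUE));
        // ATTRIBUTE_VALUE_TOKEN follows only if tokenizeAttributeValue was invoked.
        NEXT.put(State.ATTRIBUTE_VALUE,
                EnumSet.of(State.ATTRIBUTE_NAME, State.CHARS, State.START, State.END,
                        State.ATTRIBUTE_VALUE_TOKEN));
        NEXT.put(State.ATTRIBUTE_VALUE_TOKEN,
                EnumSet.of(State.ATTRIBUTE_VALUE_TOKEN, State.ATTRIBUTE_NAME, State.CHARS,
                        State.START, State.END));
        NEXT.put(State.CHARS,
                EnumSet.of(State.START, State.END));
        NEXT.put(State.END,
                EnumSet.of(State.CHARS, State.START, State.END, State.END_OF_DOCUMENT));
        // The state never changes once the end of the document is reached.
        NEXT.put(State.END_OF_DOCUMENT, EnumSet.noneOf(State.class));
    }

    static boolean canFollow(State from, State to) {
        return NEXT.get(from).contains(to);
    }

    public static void main(String[] args) {
        // State sequence for an empty tag with one attribute, e.g. <foo a="1"/>.
        State[] emptyTag = {
            State.START, State.ATTRIBUTE_NAME, State.ATTRIBUTE_VALUE, State.END
        };
        for (int i = 0; i + 1 < emptyTag.length; i++) {
            System.out.println(emptyTag[i] + " -> " + emptyTag[i + 1] + ": "
                    + canFollow(emptyTag[i], emptyTag[i + 1]));
        }
    }
}
```

Note that the table records only which states may follow one another; it does not capture nesting depth or tag-name matching, which the class description assigns to the schema-derived classes.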

For each state Foo there is at least one of each of the following kinds of methods:

Methods named atFoo return a boolean value indicating whether the scanner is in the Foo state and possibly whether some other condition holds.
Methods named takeFoo check that the scanner is in the Foo state; if so, a relevant value is returned and the scanner is (in most cases) advanced to the next state. An InvalidContentException is thrown if the scanner is not in the Foo state.

The methods for reading attribute values and character data take a whitespace parameter, one of the constants WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE, indicating how whitespace is to be processed. Whitespace is defined here exactly as in the XML 1.0 specification: A whitespace character is one of TAB ('\u0009'), LINE FEED ('\u000A'), CARRIAGE RETURN ('\u000D'), or SPACE ('\u0020').

This class also defines a method for retrieving the scanner's position and factory methods for creating scanners that read byte-input streams and scanners that read DOM trees.

Version:: 1.10, 01/05/31

Field Summary

static int WS_COLLAPSE
          Constant indicating that whitespace is to be collapsed.

static int WS_NORMALIZE
          Constant indicating that whitespace is to be normalized.

static int WS_PRESERVE
          Constant indicating that whitespace is to be preserved.

Constructor Summary

XMLScanner()


Method Summary

abstract boolean atAttribute()
          Tests whether the scanner is positioned at an attribute name.

abstract boolean atAttributeValue()
          Tests whether the scanner is positioned at an attribute value.

abstract boolean atAttributeValueToken()
          Tests whether the scanner is positioned at an attribute-value token.

abstract boolean atChars(int whitespace)
          Tests whether the scanner is positioned at some character data.

abstract boolean atEnd()
          Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag.

abstract boolean atEnd(String name)
          Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag with the given name.

abstract boolean atEndOfDocument()
          Skips whitespace, if any, and then tests whether the scanner has reached the end of the input document.

abstract boolean atStart()
          Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag.

abstract boolean atStart(String name)
          Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag with the given name.

abstract void close()
          Closes this scanner.

static XMLScanner open(org.w3c.dom.Document doc)
          Creates a new scanner that scans the given DOM tree.

static XMLScanner open(InputStream in)
          Creates a new scanner that reads an XML document from the given input stream.

abstract String peekStart()
          Skips whitespace, if any, and then reads the current start tag.

abstract ScanPosition position()
          Returns a new scan-position object reporting the scanner's current position.

abstract String takeAttributeName()
          Reads the current attribute name and then advances the scanner to the next state.

String takeAttributeValue()
          Reads the current attribute value, collapsing whitespace, and then advances the scanner to the next state.

abstract String takeAttributeValue(int whitespace)
          Reads the current attribute value and then advances the scanner to the next state.

abstract String takeAttributeValueToken()
          Reads the current attribute-value token and then advances the scanner to the next state.

abstract String takeChars(int whitespace)
          Reads the current character data and then advances the scanner to the next state.

void takeEmpty(String name)
          Takes an empty tag.

abstract String takeEnd()
          Skips whitespace, if any, reads the current end tag, and then advances the scanner to the next state.

abstract void takeEnd(String name)
          Skips whitespace, if any, checks that the current end tag's name is equal to the given name, and then advances the scanner to the next state.

abstract void takeEndOfDocument()
          Skips whitespace, if any, and then checks that the scanner has reached the end of the input document.

String takeLeaf(String name, int whitespace)
          Takes a simple leaf element.

abstract String takeStart()
          Skips whitespace, if any, reads the current start tag, and then advances the scanner to the next state.

abstract void takeStart(String name)
          Skips whitespace, if any, checks that the current start tag's name is equal to the given name, and then advances the scanner to the next state.

abstract void tokenizeAttributeValue()
          Reads the current attribute's value as a sequence of non-whitespace tokens, returning them in succeeding AttributeValueToken states.

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

WS_COLLAPSE

public static final int WS_COLLAPSE

Constant indicating that whitespace is to be collapsed. When collapsing, every contiguous sequence of whitespace characters is replaced by a single SPACE character ('\u0020'), and then leading and trailing spaces are removed.

WS_NORMALIZE

public static final int WS_NORMALIZE

Constant indicating that whitespace is to be normalized. When normalizing, every whitespace character that is not a SPACE character ('\u0020') is replaced by a SPACE character.

WS_PRESERVE

public static final int WS_PRESERVE

Constant indicating that whitespace is to be preserved.
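The three whitespace modes can be illustrated with plain string processing. This is a sketch of the documented rules, not the scanner's actual implementation: the class WhitespaceModes, its methods, and the numeric values of PRESERVE, NORMALIZE, and COLLAPSE are assumptions made for the example (the values of the real WS_ constants are not stated here).

```java
final class WhitespaceModes {
    // Hypothetical stand-ins for WS_PRESERVE, WS_NORMALIZE and WS_COLLAPSE.
    static final int PRESERVE = 0;
    static final int NORMALIZE = 1;
    static final int COLLAPSE = 2;

    // Whitespace as defined by XML 1.0: TAB, LINE FEED, CARRIAGE RETURN, SPACE.
    static boolean isXmlWhitespace(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    static String apply(String s, int mode) {
        switch (mode) {
            case PRESERVE:
                return s;
            case NORMALIZE:
                return normalize(s);
            case COLLAPSE:
                return collapse(s);
            default:
                throw new IllegalArgumentException("unknown whitespace mode: " + mode);
        }
    }

    // Replaces every whitespace character with a single SPACE, one for one.
    static String normalize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(isXmlWhitespace(c) ? ' ' : c);
        }
        return out.toString();
    }

    // Replaces each run of whitespace with one SPACE and drops leading and trailing runs.
    static String collapse(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isXmlWhitespace(c)) {
                // Remember a separator only once some non-whitespace has been emitted.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "  red \t\n green  ";
        System.out.println("[" + apply(raw, NORMALIZE) + "]");
        System.out.println("[" + apply(raw, COLLAPSE) + "]"); // prints "[red green]"
    }
}
```

Normalizing keeps the length of the string unchanged, whereas collapsing can shorten it; preserving returns the input as is.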

Constructor Detail

XMLScanner

public XMLScanner()

Method Detail

atAttribute

public abstract boolean atAttribute()

Tests whether the scanner is positioned at an attribute name.

Returns:: true if the scanner's state is AttributeName

atAttributeValue

public abstract boolean atAttributeValue()

Tests whether the scanner is positioned at an attribute value.

Returns:: true if the scanner's state is AttributeValue

atAttributeValueToken

public abstract boolean atAttributeValueToken()

Tests whether the scanner is positioned at an attribute-value token.

Returns:: true if the scanner's state is AttributeValueToken

atChars

public abstract boolean atChars(int whitespace)
                         throws ScanException

Tests whether the scanner is positioned at some character data.

If the value of the whitespace parameter is WS_COLLAPSE then any initial whitespace is first skipped.

Parameters:: whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:: true if the scanner's state is Chars
Throws:: IllegalStateException - If this method has already been invoked for the current state but with a different value for the whitespace parameter; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atEnd

public abstract boolean atEnd()
                       throws ScanException

Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag.

Returns:: true if, after skipping whitespace, the scanner's state is End
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atEnd

public abstract boolean atEnd(String name)
                       throws ScanException

Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag with the given name.

Parameters:: name - The element name to be tested
Returns:: true if, after skipping whitespace, the scanner's state is End and the name in the tag is equal to name
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atEndOfDocument

public abstract boolean atEndOfDocument()
                                 throws InvalidContentException,
                                        ScanException

Skips whitespace, if any, and then tests whether the scanner has reached the end of the input document.

Returns:: true if, after skipping whitespace, the scanner's state is EndOfDocument
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atStart

public abstract boolean atStart()
                         throws ScanException

Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag.

Returns:: true if, after skipping whitespace, the scanner's state is Start
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atStart

public abstract boolean atStart(String name)
                         throws ScanException

Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag with the given name.

Parameters:: name - The element name to be tested
Returns:: true if, after skipping whitespace, the scanner's state is Start and the name in the tag is equal to name
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

close

public abstract void close()
                    throws ScanIOException

Closes this scanner.

If this scanner was created from a byte-input stream then the input stream is closed.

Throws:: ScanIOException - If an I/O error occurs

open

public static XMLScanner open(org.w3c.dom.Document doc)
                       throws ScanException

Creates a new scanner that scans the given DOM tree.

Parameters:: doc - The document to be scanned
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

open

public static XMLScanner open(InputStream in)
                       throws ScanException

Creates a new scanner that reads an XML document from the given input stream.

Parameters:: in - The input stream to be scanned
Throws:: ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

peekStart

public abstract String peekStart()
                          throws InvalidContentException,
                                 ScanException

Skips whitespace, if any, and then reads the current start tag.

Returns:: The name in the current start tag
Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not Start; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

position

public abstract ScanPosition position()

Returns a new scan-position object reporting the scanner's current position.

Returns:: A scan-position object

takeAttributeName

public abstract String takeAttributeName()
                                  throws InvalidContentException,
                                         ScanException

Reads the current attribute name and then advances the scanner to the next state.

Returns:: The current attribute name
Throws:: InvalidContentException - If the scanner's state is not AttributeName; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValue

public final String takeAttributeValue()
                                throws InvalidContentException,
                                       ScanException

Reads the current attribute value, collapsing whitespace, and then advances the scanner to the next state.

An invocation of this method behaves in exactly the same way as an invocation of the takeAttributeValue(int) method, passing WS_COLLAPSE for the whitespace argument.

Returns:: The current attribute value
Throws:: InvalidContentException - If the scanner's state is not AttributeValue; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValue

public abstract String takeAttributeValue(int whitespace)
                                   throws InvalidContentException,
                                          ScanException

Reads the current attribute value and then advances the scanner to the next state.

Parameters:: whitespace - Determines how whitespace in the attribute value will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:: The current attribute value
Throws:: InvalidContentException - If the scanner's state is not AttributeValue; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValueToken

public abstract String takeAttributeValueToken()
                                        throws InvalidContentException,
                                               ScanException

Reads the current attribute-value token and then advances the scanner to the next state.

Returns:: The current attribute-value token
Throws:: InvalidContentException - If the scanner's state is not AttributeValueToken; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeChars

public abstract String takeChars(int whitespace)
                          throws InvalidContentException,
                                 ScanException

Reads the current character data and then advances the scanner to the next state.

Parameters:: whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:: The current character data
Throws:: InvalidContentException - If the scanner's state is not Chars; IllegalStateException - If the atChars(int) method has already been invoked for the current state but with a different value for the whitespace parameter; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEmpty

public final void takeEmpty(String name)
                     throws InvalidContentException,
                            ScanException

Takes an empty tag.

This method takes a start tag with the given name and then takes an end tag with the same name.

Parameters:: name - The element name of the expected start and end tags
Throws:: InvalidContentException - If the expected tags cannot be scanned; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEnd

public abstract String takeEnd()
                        throws InvalidContentException,
                               ScanException

Skips whitespace, if any, reads the current end tag, and then advances the scanner to the next state.

Returns:: The name in the current end tag
Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not End; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEnd

public abstract void takeEnd(String name)
                      throws InvalidContentException,
                             ScanException

Skips whitespace, if any, checks that the current end tag's name is equal to the given name, and then advances the scanner to the next state.

Parameters:: name - The element name to be scanned
Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not End or the name in the tag is not equal to name; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEndOfDocument

public abstract void takeEndOfDocument()
                                throws InvalidContentException,
                                       ScanException

Skips whitespace, if any, and then checks that the scanner has reached the end of the input document.

Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not EndOfDocument; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeLeaf

public final String takeLeaf(String name,
                             int whitespace)
                      throws InvalidContentException,
                             ScanException

Takes a simple leaf element.

This method takes a start tag with the given name, takes a sequence of characters, if present, takes an end tag with the given name, and then returns the character data, if any.

Parameters:: name - The element name of the expected start and end tags; whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:: The character data, or null if no character data was scanned
Throws:: InvalidContentException - If the expected tags and the character data cannot be scanned; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs
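The composition described for takeLeaf (and for takeEmpty) can be sketched against a toy event queue. Everything in the block is hypothetical: LeafSketch, its Kind and Event types, and the builder methods are stand-ins written for this example, IllegalStateException stands in for InvalidContentException, and the whitespace parameter is omitted for brevity. It is meant to show the at/take pattern, not how XMLScanner is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class LeafSketch {
    enum Kind { START, CHARS, END }

    // One pre-scanned event: a tag name for START/END, or text for CHARS.
    static final class Event {
        final Kind kind;
        final String text;

        Event(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    private final Deque<Event> events = new ArrayDeque<>();

    // Builder-style helpers for queueing events in document order.
    LeafSketch start(String name) { events.add(new Event(Kind.START, name)); return this; }
    LeafSketch chars(String data) { events.add(new Event(Kind.CHARS, data)); return this; }
    LeafSketch end(String name)   { events.add(new Event(Kind.END, name));   return this; }

    // "atFoo": reports the current state without consuming anything.
    boolean at(Kind kind) {
        Event e = events.peek();
        return e != null && e.kind == kind;
    }

    // "takeFoo": checks the state (and the name, if one is given), then advances.
    String take(Kind kind, String expectedName) {
        Event e = events.peek();
        if (e == null || e.kind != kind
                || (expectedName != null && !expectedName.equals(e.text))) {
            // The real scanner throws InvalidContentException here.
            throw new IllegalStateException("expected " + kind
                    + (expectedName == null ? "" : " " + expectedName));
        }
        events.remove();
        return e.text;
    }

    // Start tag, optional character data, matching end tag.
    String takeLeaf(String name) {
        take(Kind.START, name);
        String data = at(Kind.CHARS) ? take(Kind.CHARS, null) : null;
        take(Kind.END, name);
        return data;
    }

    // Start tag immediately followed by the matching end tag.
    void takeEmpty(String name) {
        take(Kind.START, name);
        take(Kind.END, name);
    }

    public static void main(String[] args) {
        LeafSketch s = new LeafSketch()
                .start("title").chars("JAXB").end("title")
                .start("br").end("br");
        System.out.println(s.takeLeaf("title")); // prints "JAXB"
        s.takeEmpty("br");
        System.out.println(s.at(Kind.START));    // prints "false"
    }
}
```

As in the description above, a leaf with no character data yields null rather than an empty string.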

takeStart

public abstract String takeStart()
                          throws InvalidContentException,
                                 ScanException

Skips whitespace, if any, reads the current start tag, and then advances the scanner to the next state.

Returns:: The name in the current start tag
Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not Start; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeStart

public abstract void takeStart(String name)
                        throws InvalidContentException,
                               ScanException

Skips whitespace, if any, checks that the current start tag's name is equal to the given name, and then advances the scanner to the next state.

Parameters:: name - The element name to be scanned
Throws:: InvalidContentException - If, after skipping whitespace, the scanner's state is not Start or the name in the tag is not equal to name; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

tokenizeAttributeValue

public abstract void tokenizeAttributeValue()
                                     throws InvalidContentException,
                                            ScanException

Reads the current attribute's value as a sequence of non-whitespace tokens, returning them in succeeding AttributeValueToken states. If the current attribute's value is only whitespace then the next state will not be AttributeValueToken.

Throws:: InvalidContentException - If the scanner's state is not AttributeValue; ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs
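The splitting rule can be illustrated independently of the scanner. The sketch below assumes that tokens are maximal runs of non-whitespace characters, using the XML 1.0 definition of whitespace given in the class description; AttributeTokens and tokenize are hypothetical names. A whitespace-only value produces an empty list, which corresponds to the case in which no AttributeValueToken state follows.

```java
import java.util.ArrayList;
import java.util.List;

final class AttributeTokens {
    // Whitespace as defined by XML 1.0: TAB, LINE FEED, CARRIAGE RETURN, SPACE.
    static boolean isXmlWhitespace(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    // Splits an attribute value into its non-whitespace tokens, in document order.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        int n = value.length();
        while (i < n) {
            // Skip the whitespace separating tokens.
            while (i < n && isXmlWhitespace(value.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one maximal run of non-whitespace characters.
            while (i < n && !isXmlWhitespace(value.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(value.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(" id1\tid2\n id3 ")); // prints "[id1, id2, id3]"
        System.out.println(tokenize(" \t\n").isEmpty());  // prints "true"
    }
}
```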

Comments to: jaxb-comments@java.sun.com
More information at: http://java.sun.com/xml/jaxb
