JAXB
v0.21

javax.xml.marshal
Class XMLScanner

java.lang.Object
  |
  +--javax.xml.marshal.XMLScanner

public abstract class XMLScanner
extends Object

A scanner of XML input streams or data structures.

When unmarshalling XML into a content tree it is not necessary to use a full-fledged XML parser because schema-derived classes enforce all validity constraints as well as the following non-local well-formedness constraints of the XML 1.0 specification:

An XML scanner therefore enforces only the remaining lexical well-formedness constraints of XML 1.0.

An XML scanner is in one of the following states:

For each state Foo there is at least one of each of the following kinds of methods:

The methods for reading attribute values and character data take a whitespace parameter, one of the constants WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE, indicating how whitespace is to be processed.

Whitespace is defined here exactly as in the XML 1.0 specification: A whitespace character is one of TAB ('\u0009'), LINE FEED ('\u000A'), CARRIAGE RETURN ('\u000D'), or SPACE ('\u0020').
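As a minimal sketch of the whitespace definition above, the following predicate accepts exactly the four listed characters. The class and method names are illustrative assumptions and are not part of this API. The definition is narrower than java.lang.Character.isWhitespace, which also accepts characters such as VERTICAL TAB and FORM FEED.

```java
class XmlWhitespace {
    /** Returns true only for TAB, LINE FEED, CARRIAGE RETURN, and SPACE. */
    static boolean isXmlWhitespace(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    public static void main(String[] args) {
        // Compare the XML 1.0 definition with Java's broader notion of whitespace.
        char[] samples = {'\t', '\n', '\r', ' ', '\u000B', '\f', '\u00A0'};
        for (char c : samples) {
            System.out.printf("U+%04X xml=%b java=%b%n",
                    (int) c, isXmlWhitespace(c), Character.isWhitespace(c));
        }
    }
}
```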

This class also defines a method for retrieving the scanner's position and factory methods for creating scanners that read byte-input streams and scanners that read DOM trees.

Version:
1.10, 01/05/31

Field Summary
static int WS_COLLAPSE
          Constant indicating that whitespace is to be collapsed.
static int WS_NORMALIZE
          Constant indicating that whitespace is to be normalized.
static int WS_PRESERVE
          Constant indicating that whitespace is to be preserved.
 
Constructor Summary
XMLScanner()
           
 
Method Summary
abstract  boolean atAttribute()
          Tests whether the scanner is positioned at an attribute name.
abstract  boolean atAttributeValue()
          Tests whether the scanner is positioned at an attribute value.
abstract  boolean atAttributeValueToken()
          Tests whether the scanner is positioned at an attribute-value token.
abstract  boolean atChars(int whitespace)
          Tests whether the scanner is positioned at some character data.
abstract  boolean atEnd()
          Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag.
abstract  boolean atEnd(String name)
          Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag with the given name.
abstract  boolean atEndOfDocument()
          Skips whitespace, if any, and then tests whether the scanner has reached the end of the input document.
abstract  boolean atStart()
          Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag.
abstract  boolean atStart(String name)
          Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag with the given name.
abstract  void close()
          Closes this scanner.
static XMLScanner open(org.w3c.dom.Document doc)
          Creates a new scanner that scans the given DOM tree.
static XMLScanner open(InputStream in)
          Creates a new scanner that reads an XML document from the given input stream.
abstract  String peekStart()
          Skips whitespace, if any, and then reads the current start tag.
abstract  ScanPosition position()
          Returns a new scan-position object reporting the scanner's current position.
abstract  String takeAttributeName()
          Reads the current attribute name and then advances the scanner to the next state.
 String takeAttributeValue()
          Reads the current attribute value, collapsing whitespace, and then advances the scanner to the next state.
abstract  String takeAttributeValue(int whitespace)
          Reads the current attribute value and then advances the scanner to the next state.
abstract  String takeAttributeValueToken()
          Reads the current attribute-value token and then advances the scanner to the next state.
abstract  String takeChars(int whitespace)
          Reads the current character data and then advances the scanner to the next state.
 void takeEmpty(String name)
          Takes an empty tag.
abstract  String takeEnd()
          Skips whitespace, if any, reads the current end tag, and then advances the scanner to the next state.
abstract  void takeEnd(String name)
          Skips whitespace, if any, checks that the current end tag's name is equal to the given name, and then advances the scanner to the next state.
abstract  void takeEndOfDocument()
          Skips whitespace, if any, and then checks that the scanner has reached the end of the input document.
 String takeLeaf(String name, int whitespace)
          Takes a simple leaf element.
abstract  String takeStart()
          Skips whitespace, if any, reads the current start tag, and then advances the scanner to the next state.
abstract  void takeStart(String name)
          Skips whitespace, if any, checks that the current start tag's name is equal to the given name, and then advances the scanner to the next state.
abstract  void tokenizeAttributeValue()
          Reads the current attribute's value as a sequence of non-whitespace tokens, returning them in succeeding AttributeValueToken states.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

WS_COLLAPSE

public static final int WS_COLLAPSE
Constant indicating that whitespace is to be collapsed. When collapsing, every contiguous sequence of whitespace characters is replaced by a single SPACE character ('\u0020'), and then leading and trailing spaces are removed.


WS_NORMALIZE

public static final int WS_NORMALIZE
Constant indicating that whitespace is to be normalized. When normalizing, every whitespace character that is not a SPACE character ('\u0020') is replaced by a SPACE character.


WS_PRESERVE

public static final int WS_PRESERVE
Constant indicating that whitespace is to be preserved.
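
The three whitespace modes described above can be illustrated on plain strings. This is only a sketch under stated assumptions: the class name, the helper methods, and the numeric values given to the constants are hypothetical, since this page does not state the real constant values or how a scanner implements the modes internally.

```java
class WhitespaceModes {
    // Hypothetical values; the real constants' values are not documented here.
    static final int WS_PRESERVE = 0;
    static final int WS_NORMALIZE = 1;
    static final int WS_COLLAPSE = 2;

    static boolean isXmlWhitespace(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    /** Replaces every whitespace character with a single SPACE. */
    static String normalize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(isXmlWhitespace(c) ? ' ' : c);
        }
        return sb.toString();
    }

    /** Replaces each whitespace run with one SPACE and trims both ends. */
    static String collapse(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isXmlWhitespace(c)) {
                // Remember a separator only once some text has been emitted,
                // which drops leading whitespace.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(c);
            }
        }
        // A pending space at the end is never emitted, dropping trailing whitespace.
        return sb.toString();
    }

    static String apply(String s, int whitespace) {
        switch (whitespace) {
            case WS_PRESERVE:  return s;
            case WS_NORMALIZE: return normalize(s);
            case WS_COLLAPSE:  return collapse(s);
            default: throw new IllegalArgumentException("unknown mode: " + whitespace);
        }
    }

    public static void main(String[] args) {
        String raw = "  one \t two\n";
        System.out.println("[" + apply(raw, WS_NORMALIZE) + "]");
        System.out.println("[" + apply(raw, WS_COLLAPSE) + "]"); // [one two]
    }
}
```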

Constructor Detail

XMLScanner

public XMLScanner()
Method Detail

atAttribute

public abstract boolean atAttribute()
Tests whether the scanner is positioned at an attribute name.

Returns:
true if the scanner's state is AttributeName

atAttributeValue

public abstract boolean atAttributeValue()
Tests whether the scanner is positioned at an attribute value.

Returns:
true if the scanner's state is AttributeValue

atAttributeValueToken

public abstract boolean atAttributeValueToken()
Tests whether the scanner is positioned at an attribute-value token.

Returns:
true if the scanner's state is AttributeValueToken

atChars

public abstract boolean atChars(int whitespace)
                         throws ScanException
Tests whether the scanner is positioned at some character data.

If the value of the whitespace parameter is WS_COLLAPSE then any initial whitespace is first skipped.

Parameters:
whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:
true if the scanner's state is Chars
Throws:
IllegalStateException - If this method has already been invoked for the current state but with a different value for the whitespace parameter
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs
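
The IllegalStateException rule above (the whitespace mode may not change between invocations for the same state) could be enforced with a small guard object. The sketch below is an assumption about one possible approach, not the actual implementation; the class name, its methods, and the numeric mode values are all hypothetical.

```java
class WhitespaceModeGuard {
    private boolean fixed; // true once a mode was chosen for the current state
    private int mode;

    /** Records the first mode seen for this state and rejects any different one. */
    void check(int whitespace) {
        if (!fixed) {
            fixed = true;
            mode = whitespace;
        } else if (mode != whitespace) {
            throw new IllegalStateException(
                    "whitespace mode already fixed for the current state");
        }
    }

    /** Called when the scanner moves to its next state. */
    void advance() {
        fixed = false;
    }

    public static void main(String[] args) {
        final int preserve = 0, collapse = 2; // hypothetical constant values
        WhitespaceModeGuard guard = new WhitespaceModeGuard();
        guard.check(collapse);
        guard.check(collapse); // same mode again is accepted
        try {
            guard.check(preserve); // different mode for the same state
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        guard.advance();
        guard.check(preserve); // accepted after advancing to a new state
    }
}
```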

atEnd

public abstract boolean atEnd()
                       throws ScanException
Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag.

Returns:
true if, after skipping whitespace, the scanner's state is End
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atEnd

public abstract boolean atEnd(String name)
                       throws ScanException
Skips whitespace, if any, and then tests whether the scanner is positioned at an end tag with the given name.

Parameters:
name - The element name to be tested
Returns:
true if, after skipping whitespace, the scanner's state is End and the name in the tag is equal to name
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atEndOfDocument

public abstract boolean atEndOfDocument()
                                 throws InvalidContentException,
                                        ScanException
Skips whitespace, if any, and then tests whether the scanner has reached the end of the input document.

Returns:
true if, after skipping whitespace, the scanner has reached the end of the input document
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atStart

public abstract boolean atStart()
                         throws ScanException
Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag.

Returns:
true if, after skipping whitespace, the scanner's state is Start
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

atStart

public abstract boolean atStart(String name)
                         throws ScanException
Skips whitespace, if any, and then tests whether the scanner is positioned at a start tag with the given name.

Parameters:
name - The element name to be tested
Returns:
true if, after skipping whitespace, the scanner's state is Start and the name in the tag is equal to name
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

close

public abstract void close()
                    throws ScanIOException
Closes this scanner.

If this scanner was created from a byte-input stream then the input stream is closed.

Throws:
ScanIOException - If an I/O error occurs

open

public static XMLScanner open(org.w3c.dom.Document doc)
                       throws ScanException
Creates a new scanner that scans the given DOM tree.

Parameters:
doc - The document to be scanned
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

open

public static XMLScanner open(InputStream in)
                       throws ScanException
Creates a new scanner that reads an XML document from the given input stream.

Parameters:
in - The input stream to be scanned
Throws:
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

peekStart

public abstract String peekStart()
                          throws InvalidContentException,
                                 ScanException
Skips whitespace, if any, and then reads the current start tag.

Returns:
The name in the current start tag
Throws:
InvalidContentException - If, after skipping whitespace, the scanner's state is not Start
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

position

public abstract ScanPosition position()
Returns a new scan-position object reporting the scanner's current position.

Returns:
A scan-position object

takeAttributeName

public abstract String takeAttributeName()
                                  throws InvalidContentException,
                                         ScanException
Reads the current attribute name and then advances the scanner to the next state.

Returns:
The current attribute name
Throws:
InvalidContentException - If the scanner's state is not AttributeName
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValue

public final String takeAttributeValue()
                                throws InvalidContentException,
                                       ScanException
Reads the current attribute value, collapsing whitespace, and then advances the scanner to the next state.

An invocation of this method behaves in exactly the same way as an invocation of the takeAttributeValue(int) method, passing WS_COLLAPSE for the whitespace argument.

Returns:
The current attribute value
Throws:
InvalidContentException - If the scanner's state is not AttributeValue
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValue

public abstract String takeAttributeValue(int whitespace)
                                   throws InvalidContentException,
                                          ScanException
Reads the current attribute value and then advances the scanner to the next state.

Parameters:
whitespace - Determines how whitespace in the attribute value will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:
The current attribute value
Throws:
InvalidContentException - If the scanner's state is not AttributeValue
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeAttributeValueToken

public abstract String takeAttributeValueToken()
                                        throws InvalidContentException,
                                               ScanException
Reads the current attribute-value token and then advances the scanner to the next state.

Returns:
The current attribute-value token
Throws:
InvalidContentException - If the scanner's state is not AttributeValueToken
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeChars

public abstract String takeChars(int whitespace)
                          throws InvalidContentException,
                                 ScanException
Reads the current character data and then advances the scanner to the next state.

Parameters:
whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:
The current character data
Throws:
InvalidContentException - If the scanner's state is not Chars
IllegalStateException - If the atChars(int) method has already been invoked for the current state but with a different value for the whitespace parameter
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEmpty

public final void takeEmpty(String name)
                     throws InvalidContentException,
                            ScanException
Takes an empty tag.

This method takes a start tag with the given name and then takes an end tag with the same name.

Parameters:
name - The element name of the expected start and end tags
Throws:
InvalidContentException - If the expected tags cannot be scanned
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEnd

public abstract String takeEnd()
                        throws InvalidContentException,
                               ScanException
Skips whitespace, if any, reads the current end tag, and then advances the scanner to the next state.

Returns:
The name in the current end tag
Throws:
InvalidContentException - If, after skipping whitespace, the scanner's state is not End
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEnd

public abstract void takeEnd(String name)
                      throws InvalidContentException,
                             ScanException
Skips whitespace, if any, checks that the current end tag's name is equal to the given name, and then advances the scanner to the next state.

Parameters:
name - The element name to be scanned
Throws:
InvalidContentException - If, after skipping whitespace, the scanner's state is not End or the name in the tag is not equal to name
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeEndOfDocument

public abstract void takeEndOfDocument()
                                throws InvalidContentException,
                                       ScanException
Skips whitespace, if any, and then checks that the scanner has reached the end of the input document.

Throws:
InvalidContentException - If, after skipping whitespace, the scanner has not reached the end of the input document
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeLeaf

public final String takeLeaf(String name,
                             int whitespace)
                      throws InvalidContentException,
                             ScanException
Takes a simple leaf element.

This method takes a start tag with the given name, takes a sequence of characters, if present, takes an end tag with the given name, and then returns the character data, if any.

Parameters:
name - The element name of the expected start and end tags
whitespace - Determines how whitespace in the character data will be handled; must be one of WS_COLLAPSE, WS_NORMALIZE, or WS_PRESERVE
Returns:
The character data, or null if no character data was scanned
Throws:
InvalidContentException - If the expected tags and the character data cannot be scanned
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs
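
The way takeEmpty and takeLeaf are described, each is a fixed sequence of the primitive take operations. The sketch below illustrates that composition on a deliberately simplified model: MiniScanner, EventScanner, InvalidContent, and LeafDemo are hypothetical names, the input is a pre-split list of events rather than real XML, and the whitespace parameter and checked exceptions of the real methods are omitted for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LeafDemo {
    public static void main(String[] args) {
        MiniScanner s = new EventScanner(
                "<name>", "Ada", "</name>", "<flag>", "</flag>", "<note>", "</note>");
        System.out.println(s.takeLeaf("name")); // character data of <name>
        s.takeEmpty("flag");                    // start tag followed by end tag
        System.out.println(s.takeLeaf("note")); // no character data was scanned
    }
}

/** Unchecked stand-in for the documented InvalidContentException. */
class InvalidContent extends RuntimeException {
    InvalidContent(String message) { super(message); }
}

/** Simplified model of the scanner: primitives are abstract, composites are final. */
abstract class MiniScanner {
    abstract void takeStart(String name);
    abstract boolean atChars();
    abstract String takeChars();
    abstract void takeEnd(String name);

    final void takeEmpty(String name) {
        takeStart(name);
        takeEnd(name);
    }

    final String takeLeaf(String name) {
        takeStart(name);
        String text = atChars() ? takeChars() : null;
        takeEnd(name);
        return text;
    }
}

/** Reads from a queue of events such as "<a>", "text", and "</a>". */
class EventScanner extends MiniScanner {
    private final Deque<String> events = new ArrayDeque<>();

    EventScanner(String... input) {
        for (String e : input) {
            events.add(e);
        }
    }

    private void expect(String event) {
        String next = events.poll();
        if (!event.equals(next)) {
            throw new InvalidContent("expected " + event + " but found " + next);
        }
    }

    @Override void takeStart(String name) { expect("<" + name + ">"); }

    @Override void takeEnd(String name) { expect("</" + name + ">"); }

    @Override boolean atChars() {
        String next = events.peek();
        return next != null && !next.startsWith("<");
    }

    @Override String takeChars() {
        if (!atChars()) {
            throw new InvalidContent("not positioned at character data");
        }
        return events.poll();
    }
}
```

Keeping the composite methods final, as the real class does for takeEmpty and takeLeaf, means a concrete scanner only has to supply the primitives.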

takeStart

public abstract String takeStart()
                          throws InvalidContentException,
                                 ScanException
Skips whitespace, if any, reads the current start tag, and then advances the scanner to the next state.

Returns:
The name in the current start tag
Throws:
InvalidContentException - If, after skipping whitespace, the scanner's state is not Start
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

takeStart

public abstract void takeStart(String name)
                        throws InvalidContentException,
                               ScanException
Skips whitespace, if any, checks that the current start tag's name is equal to the given name, and then advances the scanner to the next state.

Parameters:
name - The element name to be scanned
Throws:
InvalidContentException - If, after skipping whitespace, the scanner's state is not Start or the name in the tag is not equal to name
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs

tokenizeAttributeValue

public abstract void tokenizeAttributeValue()
                                     throws InvalidContentException,
                                            ScanException
Reads the current attribute's value as a sequence of non-whitespace tokens, returning them in succeeding AttributeValueToken states. If the current attribute's value is only whitespace then the next state will not be AttributeValueToken.

Throws:
InvalidContentException - If the scanner's state is not AttributeValue
ScanException - If input that is not lexically well-formed is scanned, or if an I/O error occurs
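
The tokenization described above amounts to splitting the attribute value on runs of XML whitespace. The following is a minimal sketch on a plain string, assuming the whitespace definition given in the class description; the class and method names are hypothetical, and the real method delivers the tokens through successive scanner states rather than as a list.

```java
import java.util.ArrayList;
import java.util.List;

class AttributeValueTokens {
    static boolean isXmlWhitespace(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    /** Splits a value into its maximal runs of non-whitespace characters. */
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        int n = value.length();
        while (i < n) {
            // Skip the whitespace that separates tokens.
            while (i < n && isXmlWhitespace(value.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < n && !isXmlWhitespace(value.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(value.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(" id1  id2\tid3 "));
        // A whitespace-only value yields no tokens at all.
        System.out.println(tokenize(" \t ").isEmpty());
    }
}
```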


Comments to: jaxb-comments@java.sun.com
More information at: http://java.sun.com/xml/jaxb

Copyright © 2001 by Sun Microsystems, Inc., 901 San Antonio Road,
Palo Alto, California, 94303, U.S.A. All Rights Reserved.