Class AbstractXmlParser

java.lang.Object
org.apache.maven.doxia.parser.AbstractParser
org.apache.maven.doxia.parser.AbstractXmlParser
All Implemented Interfaces:
LogEnabled, Markup, XmlMarkup, Parser
Direct Known Subclasses:
DocBookParser, FmlParser, Xhtml5BaseParser, XhtmlBaseParser

public abstract class AbstractXmlParser extends AbstractParser implements XmlMarkup
An abstract class that defines some convenience methods for XML parsers.
Since:
1.0
  • Field Details

    • PATTERN_ENTITY_1

      private static final Pattern PATTERN_ENTITY_1
      Entity pattern for an HTML entity, e.g. &nbsp;: "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&[a-zA-Z]{2,6};)(\\s)*\"(\\s)*>"
      see http://www.w3.org/TR/REC-xml/#NT-EntityDecl.
    • PATTERN_ENTITY_2

      private static final Pattern PATTERN_ENTITY_2
      Entity pattern for a Unicode entity, e.g. &#38;: "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&(#x?[0-9a-fA-F]{1,5};)*)(\\s)*\"(\\s)*>"
      see http://www.w3.org/TR/REC-xml/#NT-EntityDecl.
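      The two patterns can be tried out in isolation. The sketch below compiles the expressions as transcribed from the field descriptions above (assumed, not confirmed, to equal the private constants) and reads the entity name from group 2 and the entity value from group 5. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityPatternSketch {
    // Transcribed from the field descriptions; assumed to match the private constants.
    static final Pattern HTML_ENTITY = Pattern.compile(
            "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&[a-zA-Z]{2,6};)(\\s)*\"(\\s)*>");
    static final Pattern UNICODE_ENTITY = Pattern.compile(
            "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&(#x?[0-9a-fA-F]{1,5};)*)(\\s)*\"(\\s)*>");

    // Returns "name=value" for the first declaration found, or null if none matches.
    static String firstDeclaration(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(2) + "=" + m.group(5) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDeclaration(HTML_ENTITY, "<!ENTITY space \"&nbsp;\">"));
        System.out.println(firstDeclaration(UNICODE_ENTITY, "<!ENTITY bar \"&#x160;\">"));
    }
}
```

      A declaration whose value is a numeric reference does not match the HTML pattern, and a named reference does not match the Unicode pattern, so the two appear to cover disjoint cases.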
    • ignorableWhitespace

      private boolean ignorableWhitespace
    • collapsibleWhitespace

      private boolean collapsibleWhitespace
    • trimmableWhitespace

      private boolean trimmableWhitespace
    • entities

      private Map<String,String> entities
    • validate

      private boolean validate
  • Constructor Details

    • AbstractXmlParser

      public AbstractXmlParser()
  • Method Details

    • parse

      public void parse(Reader source, Sink sink) throws ParseException
      Parses the given source model and emits Doxia events into the given sink.
      Specified by:
      parse in interface Parser
      Parameters:
      source - a non-null reader that provides the source document. You can obtain one from the newReader methods of ReaderFactory.
      sink - A sink that consumes the Doxia events.
      Throws:
      ParseException - if the model could not be parsed.
    • initXmlParser

      protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Initializes the parser with custom entities or other options.
      Parameters:
      parser - A parser, not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem initializing the parser
    • parse

      public void parse(String string, Sink sink) throws ParseException
      Convenience method to parse an arbitrary string and emit any XML events into the given sink.
      Overrides:
      parse in class AbstractParser
      Parameters:
      string - A string that provides the source input.
      sink - A sink that consumes the Doxia events.
      Throws:
      ParseException - if the string could not be parsed.
    • getType

      public final int getType()
      Returns the parser type, one of Parser.UNKNOWN_TYPE, Parser.TXT_TYPE or Parser.XML_TYPE.
      Specified by:
      getType in interface Parser
      Overrides:
      getType in class AbstractParser
      Returns:
      an int.
    • getAttributesFromParser

      protected SinkEventAttributeSet getAttributesFromParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser)
      Converts the attributes of the current start tag of the given parser to a SinkEventAttributeSet.
      Parameters:
      parser - A parser, not null.
      Returns:
      a SinkEventAttributeSet or null if the current parser event is not a start tag.
      Since:
      1.1
    • parseXml

      private void parseXml(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Parse the model from the XmlPullParser into the given sink.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
    • handleStartTag

      protected abstract void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Goes through the possible start tags.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
    • handleEndTag

      protected abstract void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Goes through the possible end tags.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
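      The abstract start-tag and end-tag hooks suggest a template-method design: a driver loop pulls events and routes each one to a handler that subclasses fill in. The sketch below illustrates that shape only. It swaps in the JDK's StAX XMLStreamReader for XmlPullParser and a List<String> for the Doxia Sink, and every class and method name in it is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustration only: StAX stands in for XmlPullParser, a List<String> for the Sink.
abstract class PullDispatchSketch {
    protected abstract void handleStartTag(XMLStreamReader parser, List<String> sink);

    protected abstract void handleEndTag(XMLStreamReader parser, List<String> sink);

    // Default text handling: only non-empty text is emitted.
    protected void handleText(XMLStreamReader parser, List<String> sink) {
        String text = parser.getText();
        if (!text.isEmpty()) {
            sink.add("text:" + text);
        }
    }

    // Pulls events one at a time and routes each to the matching hook.
    final void parse(String xml, List<String> sink) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader parser = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (parser.hasNext()) {
                switch (parser.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        handleStartTag(parser, sink);
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        handleEndTag(parser, sink);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        handleText(parser, sink);
                        break;
                    default:
                        break; // other event types are ignored in this sketch
                }
            }
        } finally {
            parser.close();
        }
    }

    // Runs the loop with a trivial subclass that records tag names.
    static List<String> demo(String xml) {
        List<String> events = new ArrayList<>();
        try {
            new PullDispatchSketch() {
                @Override
                protected void handleStartTag(XMLStreamReader p, List<String> s) {
                    s.add("start:" + p.getLocalName());
                }

                @Override
                protected void handleEndTag(XMLStreamReader p, List<String> s) {
                    s.add("end:" + p.getLocalName());
                }
            }.parse(xml, events);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo("<p>Hi<b>x</b></p>"));
    }
}
```

      A concrete parser would translate each tag into the corresponding sink call instead of recording strings.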
    • handleText

      protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles text events.

      This is a default implementation: if the parser points to a non-empty text element, it is emitted as a text event into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleCdsect

      protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles CDATA sections.

      This is a default implementation: all data are emitted as text events into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleComment

      protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles comments.

      This is a default implementation: all data are emitted as comment events into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleEntity

      protected void handleEntity(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities.

      This is a default implementation: all entities are resolved and emitted as text events into the specified sink, except:

      • the entities with names #160, nbsp and #x00A0 are emitted as nonBreakingSpace() events.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
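      The exception for non-breaking spaces can be pictured as a small dispatch on the entity name. The sketch below is a minimal illustration, not the library's code: the class, the method and the string labels for the events are hypothetical, and only the three entity names come from the description above.

```java
class EntityDispatchSketch {
    // Hypothetical helper: names the event the default handling would emit.
    static String eventFor(String entityName, String resolvedText) {
        switch (entityName) {
            case "#160":
            case "nbsp":
            case "#x00A0":
                // The three names listed in the description map to one event.
                return "nonBreakingSpace()";
            default:
                // Every other entity is emitted as its resolved text.
                return "text(" + resolvedText + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(eventFor("nbsp", "\u00A0"));
        System.out.println(eventFor("copy", "\u00A9"));
    }
}
```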
    • handleUnknown

      protected void handleUnknown(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, int type)
      Handles an unknown event.

      This is a default implementation: all events are emitted as unknown events into the specified sink.

      Parameters:
      parser - the parser to get the event from.
      sink - the sink to receive the event.
      type - the tag event type. This should be one of HtmlMarkup.TAG_TYPE_SIMPLE, HtmlMarkup.TAG_TYPE_START, HtmlMarkup.TAG_TYPE_END or HtmlMarkup.ENTITY_TYPE. It is passed as the first element of the required parameters to the Sink.unknown(String, Object[], org.apache.maven.doxia.sink.SinkEventAttributes) method.
    • isIgnorableWhitespace

      protected boolean isIgnorableWhitespace()

      Indicates whether whitespace will be ignored.

      Returns:
      true if whitespace will be ignored, false otherwise.
      Since:
      1.1
    • setIgnorableWhitespace

      protected void setIgnorableWhitespace(boolean ignorable)
      Specify that whitespace will be ignored. For example:
      <tr> <td/> </tr>
      is equivalent to
      <tr><td/></tr>
      Parameters:
      ignorable - true to ignore whitespace, false otherwise.
      Since:
      1.1
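      The equivalence stated for ignorable whitespace can be reproduced at the string level. The sketch below only illustrates the before and after forms; it is an assumption-laden stand-in (a hypothetical class that rewrites markup with a regular expression) and says nothing about how the parser itself skips whitespace events.

```java
class IgnorableWhitespaceSketch {
    // Drops whitespace-only runs between a closing '>' and the next '<'.
    static String dropInterTagWhitespace(String xml) {
        return xml.replaceAll(">\\s+<", "><");
    }

    public static void main(String[] args) {
        System.out.println(dropInterTagWhitespace("<tr> <td/> </tr>"));
    }
}
```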
    • isCollapsibleWhitespace

      protected boolean isCollapsibleWhitespace()

      Indicates whether whitespace in text will be collapsed.

      Returns:
      true if text will collapse, false otherwise.
      Since:
      1.1
    • setCollapsibleWhitespace

      protected void setCollapsibleWhitespace(boolean collapsible)
      Specify that whitespace in text will be collapsed. For example:
      Text   Text
      is equivalent to
      Text Text
      Parameters:
      collapsible - true to allow collapsible text, false otherwise.
      Since:
      1.1
    • isTrimmableWhitespace

      protected boolean isTrimmableWhitespace()

      Indicates whether text will be trimmed.

      Returns:
      true if text will be trimmed, false otherwise.
      Since:
      1.1
    • setTrimmableWhitespace

      protected void setTrimmableWhitespace(boolean trimmable)
      Specify that text will be trimmed. For example:
      <p> Text </p>
      is equivalent to
      <p>Text</p>
      Parameters:
      trimmable - true to allow trimmable text, false otherwise.
      Since:
      1.1
    • getText

      protected String getText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser)

      Returns the text of the current parser event.

      Parameters:
      parser - A parser, not null.
      Returns:
      the result of XmlPullParser.getText(), adjusted according to the trimmable and collapsible whitespace settings.
      Since:
      1.1
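      How the trimmable and collapsible settings might combine when text is read can be sketched as below. This is a guess at the behaviour from the descriptions on this page: the order of the two steps and the use of a regular expression for collapsing are assumptions, and the class and method names are hypothetical.

```java
class TextWhitespaceSketch {
    // Applies the two whitespace options to raw text; trims first, then collapses.
    static String normalize(String raw, boolean trimmable, boolean collapsible) {
        String text = raw;
        if (trimmable) {
            text = text.trim(); // " Text " becomes "Text"
        }
        if (collapsible) {
            text = text.replaceAll("\\s+", " "); // "Text   Text" becomes "Text Text"
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize(" Text ", true, false) + "]");
        System.out.println("[" + normalize("Text   Text", false, true) + "]");
    }
}
```

      With both flags off the text is returned unchanged, which matches the idea that the options are opt-in.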
    • getLocalEntities

      protected Map<String,String> getLocalEntities()
      Returns the entities defined in a local doctype. For example:
       <!DOCTYPE foo [
         <!ENTITY bar "&#x160;">
         <!ENTITY bar1 "&#x161;">
       ]>
       
      Returns:
      a map of the defined entities in a local doctype.
      Since:
      1.1
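      To make the shape of such a map concrete, the sketch below collects the declarations of a local doctype into a name-to-value map. It uses a simplified, hypothetical pattern that accepts a single numeric character reference as the value, and it decodes that reference into the character it denotes. Whether the real map holds decoded or raw values is not stated on this page, so treat the decoding as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LocalEntitiesSketch {
    // Simplified, hypothetical pattern: group 1 = name, group 2 = hex digits, group 3 = decimal digits.
    private static final Pattern DECL =
            Pattern.compile("<!ENTITY\\s+(\\S+)\\s+\"&#(?:x([0-9a-fA-F]+)|([0-9]+));\"\\s*>");

    // Collects every matching declaration, in document order.
    static Map<String, String> localEntities(String doctype) {
        Map<String, String> entities = new LinkedHashMap<>();
        Matcher m = DECL.matcher(doctype);
        while (m.find()) {
            int codePoint = m.group(2) != null
                    ? Integer.parseInt(m.group(2), 16)
                    : Integer.parseInt(m.group(3), 10);
            // Assumption: store the decoded character rather than the raw reference.
            entities.put(m.group(1), new String(Character.toChars(codePoint)));
        }
        return entities;
    }

    public static void main(String[] args) {
        String doctype = "<!DOCTYPE foo [\n"
                + "  <!ENTITY bar \"&#x160;\">\n"
                + "  <!ENTITY bar1 \"&#x161;\">\n"
                + "]>";
        System.out.println(localEntities(doctype));
    }
}
```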
    • isValidate

      public boolean isValidate()

      Indicates whether the XML content will be validated.

      Returns:
      true if XML content will be validated, false otherwise.
      Since:
      1.1
    • setValidate

      public void setValidate(boolean validate)
      Specify whether the XML content should be validated.
      Parameters:
      validate - true to validate the XML content, false otherwise.
      Since:
      1.1
    • addEntity

      private void addEntity(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String entityName, String entityValue) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Add an entity given by entityName and entityValue to entities.
      By default, we exclude the default XML entities: &amp;, &lt;, &gt;, &quot; and &apos;.
      Parameters:
      parser - not null
      entityName - not null
      entityValue - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any
      See Also:
      • XmlPullParser.defineEntityReplacementText(String, String)
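      The exclusion of the five predefined XML entities can be sketched as a simple guard in front of the map update. The class below is hypothetical and omits the parser entirely; going by the See Also entry, the real method presumably also registers the replacement text with the XmlPullParser, which this sketch does not attempt.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class AddEntitySketch {
    // The predefined XML entities named in the description.
    private static final Set<String> PREDEFINED = Set.of("amp", "lt", "gt", "quot", "apos");

    private final Map<String, String> entities = new LinkedHashMap<>();

    // Stores the entity unless it is one of the predefined ones; returns true if stored.
    boolean addEntity(String entityName, String entityValue) {
        if (PREDEFINED.contains(entityName)) {
            return false;
        }
        entities.put(entityName, entityValue);
        return true;
    }

    Map<String, String> entities() {
        return entities;
    }

    public static void main(String[] args) {
        AddEntitySketch sketch = new AddEntitySketch();
        sketch.addEntity("amp", "&");
        sketch.addEntity("bar", "\u0160");
        System.out.println(sketch.entities().keySet());
    }
}
```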
    • addLocalEntities

      private void addLocalEntities(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String text) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities defined in a local doctype, such as the following:
       <!DOCTYPE foo [
         <!ENTITY bar "&#x160;">
         <!ENTITY bar1 "&#x161;">
       ]>
       
      Parameters:
      parser - not null
      text - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any
    • addDTDEntities

      private void addDTDEntities(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String text) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities defined in external doctypes, such as the following:
       <!DOCTYPE foo [
         <!-- These are the entity sets for ISO Latin 1 characters for the XHTML -->
         <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
                "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
         %HTMLlat1;
       ]>
       
      Parameters:
      parser - not null
      text - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any