Class HtmlParseNode

java.lang.Object
com.renomad.minum.htmlparsing.HtmlParseNode

public final class HtmlParseNode extends Object
Represents one of the kinds of things we may encounter when parsing an HTML string. The kind of each node is given by its ParseNodeType.

See W3.org Elements

  • Field Details

  • Constructor Details

  • Method Details

    • print

      public List<String> print()
      Return a list of strings of the text content of the tree.

      This method traverses the tree from this node downwards, adding the text content as it goes. Its main purpose is to quickly render all the strings out of an HTML document at once.
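
      The downward traversal described above can be sketched with a small stand-in type. This is only an illustration under stated assumptions: Node, ofText, ofElement and collectText are hypothetical names, not part of Minum, and skipping empty text is a choice made for this sketch rather than documented behavior of print().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HtmlParseNode; this is NOT the Minum class.
class PrintSketch {

    /** A node is either plain text (no children) or an element with children. */
    record Node(String text, List<Node> children) {
        static Node ofText(String text) { return new Node(text, List.of()); }
        static Node ofElement(Node... children) { return new Node("", List.of(children)); }
    }

    /** Walks depth-first from the given node downwards, collecting text as it goes. */
    static List<String> collectText(Node node) {
        List<String> result = new ArrayList<>();
        collect(node, result);
        return result;
    }

    private static void collect(Node node, List<String> result) {
        // Assumption for this sketch: nodes without text contribute nothing.
        if (!node.text().isEmpty()) {
            result.add(node.text());
        }
        for (Node child : node.children()) {
            collect(child, result);
        }
    }

    public static void main(String[] args) {
        // Models <div><p>Hello</p><p>world</p></div>
        Node tree = Node.ofElement(
                Node.ofElement(Node.ofText("Hello")),
                Node.ofElement(Node.ofText("world")));
        System.out.println(collectText(tree)); // prints [Hello, world]
    }
}
```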

    • search

      public List<HtmlParseNode> search(TagName tagName, Map<String,String> attributes)
      Return a list of HtmlParseNode nodes in the HTML that match the provided tag name and attributes.
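
      One plausible reading of this matching rule is sketched below with stand-in types. The Node record and the string tag names are hypothetical (the real method takes a TagName and works on HtmlParseNode), and the rule used here, that a node matches when its tag is equal and it carries every requested attribute with the same value, is an assumption rather than something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; this is NOT the Minum implementation.
class SearchSketch {

    record Node(String tag, Map<String, String> attributes, List<Node> children) {}

    /** Collects every node at or below root whose tag and attributes match. */
    static List<Node> search(Node root, String tag, Map<String, String> wanted) {
        List<Node> found = new ArrayList<>();
        walk(root, tag, wanted, found);
        return found;
    }

    private static void walk(Node node, String tag, Map<String, String> wanted, List<Node> found) {
        boolean tagMatches = node.tag().equals(tag);
        // Assumed rule: every requested key/value pair must be present on the node.
        boolean attributesMatch = node.attributes().entrySet().containsAll(wanted.entrySet());
        if (tagMatches && attributesMatch) {
            found.add(node);
        }
        for (Node child : node.children()) {
            walk(child, tag, wanted, found);
        }
    }

    public static void main(String[] args) {
        // Models <div class="intro"><p class="intro"></p><p></p></div>
        Node intro = new Node("p", Map.of("class", "intro"), List.of());
        Node plain = new Node("p", Map.of(), List.of());
        Node root = new Node("div", Map.of("class", "intro"), List.of(intro, plain));

        List<Node> hits = search(root, "p", Map.of("class", "intro"));
        System.out.println(hits.size()); // prints 1
    }
}
```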

    • getType

      public ParseNodeType getType()
      Gets the type of this node: either an element, with opening and closing tags, attributes, and inner content, or plain text.

    • getTagInfo

      public TagInfo getTagInfo()
      Returns the TagInfo, which contains valuable information like the type of element (p, a, div, and so on) and attributes like class, id, etc.

    • getInnerContent

      public List<HtmlParseNode> getInnerContent()
      The inner content is the data between the opening and closing tags of this element. It may consist of other elements, characters, or a mix of both. If there is nothing between the tags, an empty list is returned.

    • getTextContent

      public String getTextContent()
      If the ParseNodeType is ParseNodeType.CHARACTERS, then this will have text content. Otherwise, it returns an empty string.

    • innerText

      public String innerText()
      Return the inner text of a node.

      If this element has only one inner content item, and it's a ParseNodeType.CHARACTERS element, return its text content.

      If there is more than one node, concatenates them to a single string, with each section wrapped in square brackets.
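
      The two cases above can be sketched as follows. The Kind enum and Node record are hypothetical stand-ins for ParseNodeType and HtmlParseNode. What goes inside each pair of brackets is not specified on this page, so this sketch uses the child's own text, which is an assumption. The handling of zero children (an empty string) and of a single non-text child (bracketed) is likewise a choice made here for illustration.

```java
import java.util.List;

// Hypothetical stand-in types; this is NOT the Minum implementation.
class InnerTextSketch {

    enum Kind { ELEMENT, CHARACTERS }

    record Node(Kind kind, String text, List<Node> children) {
        static Node characters(String text) { return new Node(Kind.CHARACTERS, text, List.of()); }
        static Node element(Node... children) { return new Node(Kind.ELEMENT, "", List.of(children)); }
    }

    static String innerText(Node node) {
        List<Node> children = node.children();
        // Case 1: exactly one child and it is plain text, so return it directly.
        if (children.size() == 1 && children.get(0).kind() == Kind.CHARACTERS) {
            return children.get(0).text();
        }
        // Case 2: otherwise concatenate, wrapping each child in square brackets.
        // Assumption: the bracketed content is the child's own text.
        StringBuilder sb = new StringBuilder();
        for (Node child : children) {
            sb.append('[').append(child.text()).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node single = Node.element(Node.characters("hello"));
        Node several = Node.element(Node.characters("a"), Node.characters("b"));
        System.out.println(innerText(single));  // prints hello
        System.out.println(innerText(several)); // prints [a][b]
    }
}
```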

    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object

    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object

    • toString

      public String toString()
      Overrides:
      toString in class Object