java.lang.Object
com.renomad.minum.htmlparsing.HtmlParseNode
Represents the expected types of things we may encounter when parsing an HTML string, which for our purposes is ParseNodeType.
See W3.org Elements
Field Summary
Fields
Constructor Summary
Constructors
HtmlParseNode(ParseNodeType type, TagInfo tagInfo, List<HtmlParseNode> innerContent, String textContent)
Method Summary
boolean equals
getInnerContent: The inner content is the data between the opening and closing tags of this element, composed of other elements, characters, or a mix (or nothing at all, in which case an empty list is returned).
getTagInfo: Returns the TagInfo, which contains valuable information like the type of element (p, a, div, and so on) and attributes like class, id, etc.
getTextContent: If the ParseNodeType is ParseNodeType.CHARACTERS, then this will have text content.
getType(): Gets the type of this node: either an element (with opening and closing tags, attributes, and inner content) or plain text.
int hashCode()
innerText: Return the inner text of a node.
print(): Return a list of strings of the text content of the tree.
search: Return a list of HtmlParseNode nodes in the HTML that match provided attributes.
toString()
Field Details
Constructor Details
HtmlParseNode
public HtmlParseNode(ParseNodeType type, TagInfo tagInfo, List<HtmlParseNode> innerContent, String textContent)
Method Details
print
Return a list of strings of the text content of the tree.
This method traverses the tree from this node downwards, adding the text content as it goes. Its main purpose is to quickly extract all the strings from an HTML document at once.
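The traversal described above can be illustrated with a minimal sketch. It does not use the library's classes: `PrintSketch`, its `Node` record, and `collectText` are hypothetical stand-ins, and the depth-first, document-order walk that skips empty text is an assumption about how such a traversal might work, not a statement of how print is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class PrintSketch {

    // Hypothetical stand-in for a parsed node; NOT the library's HtmlParseNode.
    public record Node(boolean isCharacters, List<Node> innerContent, String textContent) {
        // A plain-text node: has text, no children.
        public static Node text(String s) {
            return new Node(true, List.of(), s);
        }

        // An element node: has children, no text of its own.
        public static Node element(Node... children) {
            return new Node(false, List.of(children), "");
        }
    }

    // Walks the tree from the given node downwards and gathers text in document order.
    public static List<String> collectText(Node node) {
        List<String> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        // Assumption: only text nodes with non-empty content contribute a string.
        if (node.isCharacters() && !node.textContent().isEmpty()) {
            out.add(node.textContent());
        }
        for (Node child : node.innerContent()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        // Roughly <div><p>Hello</p><p>big <b>world</b></p></div>
        Node tree = Node.element(
                Node.element(Node.text("Hello")),
                Node.element(Node.text("big "), Node.element(Node.text("world"))));
        System.out.println(collectText(tree));
    }
}
```

Collecting into a flat list rather than a single string keeps each piece of text separate, which matches the "list of strings" return described for print.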
search
Return a list of HtmlParseNode nodes in the HTML that match provided attributes.
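As a rough illustration of attribute-based searching, here is a self-contained sketch. The parameters of search are not shown on this page, so everything here is assumed: `SearchSketch`, its `Node` record, and the `search(root, wanted)` signature are hypothetical, and treating a node as a match when it carries every requested attribute with an equal value is one plausible rule, not necessarily the library's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SearchSketch {

    // Hypothetical stand-in for an element node; NOT the library's HtmlParseNode or TagInfo.
    public record Node(String tagName, Map<String, String> attributes, List<Node> innerContent) {}

    // Returns every node at or below root that has all the wanted attributes (assumed matching rule).
    public static List<Node> search(Node root, Map<String, String> wanted) {
        List<Node> found = new ArrayList<>();
        walk(root, wanted, found);
        return found;
    }

    private static void walk(Node node, Map<String, String> wanted, List<Node> found) {
        // A node matches when each wanted key is present with the same value.
        if (node.attributes().entrySet().containsAll(wanted.entrySet())) {
            found.add(node);
        }
        for (Node child : node.innerContent()) {
            walk(child, wanted, found);
        }
    }

    public static void main(String[] args) {
        // Roughly <div class="box"><p class="note"></p><span class="note" id="x"></span></div>
        Node p = new Node("p", Map.of("class", "note"), List.of());
        Node span = new Node("span", Map.of("class", "note", "id", "x"), List.of());
        Node div = new Node("div", Map.of("class", "box"), List.of(p, span));

        for (Node match : search(div, Map.of("class", "note"))) {
            System.out.println(match.tagName());
        }
    }
}
```

Under this assumed rule, extra attributes on a node do not prevent a match, so the span with both class and id is still found when searching by class alone.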
getType
Gets the type of this node: either an element (with opening and closing tags, attributes, and inner content) or plain text.
getTagInfo
Returns the TagInfo, which contains valuable information like the type of element (p, a, div, and so on) and attributes like class, id, etc.
getInnerContent
The inner content is the data between the opening and closing tags of this element: other elements, characters, or a mix of both. If there is nothing between the tags, this returns an empty list.
getTextContent
If the ParseNodeType is ParseNodeType.CHARACTERS, then this will have text content. Otherwise, it returns an empty string.
innerText
Return the inner text of a node.
If this element has only one inner content item, and it's a ParseNodeType.CHARACTERS element, return its text content.
If there is more than one node, concatenate them into a single string, with each section wrapped in square brackets.
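The two cases above can be sketched as follows. This is an interpretation, not the library's code: `InnerTextSketch` and its `Node` record are hypothetical stand-ins. The description does not say what happens with nested elements, with a single non-text child, or with no children at all, so recursing into nested elements and returning an empty string for an empty node are assumptions of this sketch.

```java
import java.util.List;

public class InnerTextSketch {

    // Hypothetical stand-in for a parsed node; NOT the library's HtmlParseNode.
    public record Node(boolean isCharacters, List<Node> innerContent, String textContent) {
        public static Node text(String s) {
            return new Node(true, List.of(), s);
        }

        public static Node element(Node... children) {
            return new Node(false, List.of(children), "");
        }
    }

    public static String innerText(Node node) {
        List<Node> children = node.innerContent();

        // Case 1: exactly one child and it is plain text, so return that text unwrapped.
        if (children.size() == 1 && children.get(0).isCharacters()) {
            return children.get(0).textContent();
        }

        // Case 2: otherwise, wrap each child's section in square brackets and concatenate.
        // Assumption: a nested element contributes its own inner text, computed recursively.
        StringBuilder sb = new StringBuilder();
        for (Node child : children) {
            String section = child.isCharacters() ? child.textContent() : innerText(child);
            sb.append('[').append(section).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Roughly <p>Hello</p>
        System.out.println(innerText(Node.element(Node.text("Hello"))));
        // Roughly <p>Hi <b>there</b></p>
        System.out.println(innerText(
                Node.element(Node.text("Hi "), Node.element(Node.text("there")))));
    }
}
```

The brackets make the boundaries between sections visible in the combined string, which is useful when inspecting mixed content by eye.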
equals
hashCode
public int hashCode()
toString