// Jericho HTML Parser - Java based library for analysing and manipulating HTML
// Version 2.2
// Copyright (C) 2006 Martin Jericho
// http://sourceforge.net/projects/jerichohtml/
//
// This library is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 2.1 of the License, or (at your option) any later version.
// http://www.gnu.org/copyleft/lesser.html
//
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License along with this library; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

package au.id.jericho.lib.html;

import java.util.*;

/**
 * Represents an <a target="_blank" HREF="http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.1">element</a>
 * in a specific {@linkplain Source source} document, which encompasses a {@linkplain #getStartTag() start tag},
 * an optional {@linkplain #getEndTag() end tag} and all {@linkplain #getContent() content} in between.
 * <p>
 * The term <i><a name="Normal">normal element</a></i> refers to an element having a {@linkplain #getStartTag() start tag}
 * with a {@linkplain StartTag#getStartTagType() type} of {@link StartTagType#NORMAL}.
 * This comprises all {@linkplain HTMLElements HTML elements} and <a HREF="HTMLElements.html#NonHTMLElement">non-HTML elements</a>.
 * <p>
 * <code>Element</code> instances are obtained using one of the following methods:
 * <ul>
 * <li>{@link StartTag#getElement()}
 * <li>{@link EndTag#getElement()}
 * <li>{@link Segment#findAllElements()}
 * <li>{@link Segment#findAllElements(String name)}
 * <li>{@link Segment#findAllElements(StartTagType)}
 * </ul>
 * See also the {@link HTMLElements} class, and the
 * <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-element">XML 1.0 specification for elements</a>.
 * <h3><a name="Structure">Element Structure</a></h3>
 * <p>
 * The three possible structures of an element are listed below:
 * <dl class="Separated">
 * <dt><a name="SingleTag">Single Tag Element</a>:
 * <dd>
 * The element consists only of a single {@linkplain #getStartTag() start tag} and has no {@linkplain #getContent() element content}
 * (although the start tag itself may have {@linkplain StartTag#getTagContent() tag content}).
 * <br />{@link #getEndTag()}<code>==null</code>
 * <br />{@link #isEmpty()}<code>==true</code>
 * <br />{@link #getEnd() getEnd()}<code>==</code>{@link #getStartTag()}<code>.</code>{@link #getEnd() getEnd()}
 * <p>
 * This occurs in the following situations:
 * <ul class="Unseparated">
 * <li>An <a HREF="HTMLElements.html#HTMLElement">HTML element</a> for which the {@linkplain HTMLElements#getEndTagForbiddenElementNames() end tag is forbidden}.
 * <li>An <a HREF="HTMLElements.html#HTMLElement">HTML element</a> for which the {@linkplain HTMLElements#getEndTagRequiredElementNames() end tag is required},
 * but the end tag is not present in the source document.
 * <li>An <a HREF="HTMLElements.html#HTMLElement">HTML element</a> for which the {@linkplain HTMLElements#getEndTagOptionalElementNames() end tag is optional},
 * where the <a HREF="#ImplicitlyTerminated">implicitly terminating</a> tag is situated immediately after the element's
 * {@linkplain #getStartTag() start tag}.
 * <li>An {@linkplain #isEmptyElementTag() empty element tag}
 * <li>A <a HREF="HTMLElements.html#NonHTMLElement">non-HTML element</a> that is not an {@linkplain #isEmptyElementTag() empty element tag} but is missing its end tag.
 * <li>An element with a start tag of a {@linkplain StartTag#getStartTagType() type} that does not define a
 * {@linkplain StartTagType#getCorrespondingEndTagType() corresponding end tag type}.
 * <li>An element with a start tag of a {@linkplain StartTag#getStartTagType() type} that does define a
 * {@linkplain StartTagType#getCorrespondingEndTagType() corresponding end tag type} but is missing its end tag.
 * </ul>
 * <dt><a name="ExplicitlyTerminated">Explicitly Terminated Element</a>:
 * <dd>
 * The element consists of a {@linkplain #getStartTag() start tag}, {@linkplain #getContent() content},
 * and an {@linkplain #getEndTag() end tag}.
 * <br />{@link #getEndTag()}<code>!=null</code>.
 * <br />{@link #isEmpty()}<code>==false</code> (provided the end tag doesn't immediately follow the start tag)
 * <br />{@link #getEnd() getEnd()}<code>==</code>{@link #getEndTag()}<code>.</code>{@link #getEnd() getEnd()}.
 * <p>
 * This occurs in the following situations, assuming the start tag's matching end tag is present in the source document:
 * <ul class="Unseparated">
 * <li>An <a HREF="HTMLElements.html#HTMLElement">HTML element</a> for which the end tag is either
 * {@linkplain HTMLElements#getEndTagRequiredElementNames() required} or {@linkplain HTMLElements#getEndTagOptionalElementNames() optional}.
 * <li>A <a HREF="HTMLElements.html#NonHTMLElement">non-HTML element</a> that is not an {@linkplain #isEmptyElementTag() empty element tag}.
 * <li>An element with a start tag of a {@linkplain StartTag#getStartTagType() type} that defines a
 * {@linkplain StartTagType#getCorrespondingEndTagType() corresponding end tag type}.
 * </ul>
 * <dt><a name="ImplicitlyTerminated">Implicitly Terminated Element</a>:
 * <dd>
 * The element consists of a {@linkplain #getStartTag() start tag} and {@linkplain #getContent() content},
 * but no {@linkplain #getEndTag() end tag}.
 * <br />{@link #getEndTag()}<code>==null</code>.
 * <br />{@link #isEmpty()}<code>==false</code>
 * <br />{@link #getEnd() getEnd()}<code>!=</code>{@link #getStartTag()}<code>.</code>{@link #getEnd() getEnd()}.
 * <p>
 * This only occurs in an <a HREF="HTMLElements.html#HTMLElement">HTML element</a> for which the
 * {@linkplain HTMLElements#getEndTagOptionalElementNames() end tag is optional}.
 * <p>
 * The element ends at the start of a tag which implies the termination of the element, called the <i>implicitly terminating tag</i>.
 * If the implicitly terminating tag is situated immediately after the element's {@linkplain #getStartTag() start tag},
 * the element is classed as a <a HREF="#SingleTag">single tag element</a>.
 * <p>
 * See the <a HREF="Element.html#ParsingRulesHTMLEndTagOptional">element parsing rules for HTML elements with optional end tags</a>
 * for details on which tags can implicitly terminate a given element.
 * <p>
 * See also the documentation of the {@link HTMLElements#getEndTagOptionalElementNames()} method.
 * </dl>
 * <h3><a name="ParsingRules">Element Parsing Rules</a></h3>
 * The following rules describe the algorithm used in the {@link StartTag#getElement()} method to construct an element.
 * The detection of the start tag's matching end tag or other terminating tags always takes into account the possible nesting of elements.
 * <p>
 * <ul class="Separated">
 * <li>
 * If the start tag has a {@linkplain StartTag#getStartTagType() type} of {@link StartTagType#NORMAL}:
 * <ul>
 * <li>
 * If the {@linkplain StartTag#getName() name} of the start tag matches one of the
 * recognised {@linkplain HTMLElementName HTML element names} (indicating an <a HREF="HTMLElements.html#HTMLElement">HTML element</a>):
 * <ul>
 * <li>
 * <a name="ParsingRulesHTMLEndTagForbidden"></a>
 * If the end tag for an element of this {@linkplain StartTag#getName() name} is
 * {@linkplain HTMLElements#getEndTagForbiddenElementNames() forbidden},
 * the parser does not conduct any search for an end tag and a <a HREF="#SingleTag">single tag element</a> is created.
 * <li>
 * <a name="ParsingRulesHTMLEndTagRequired"></a>
 * If the end tag for an element of this {@linkplain StartTag#getName() name} is
 * {@linkplain HTMLElements#getEndTagRequiredElementNames() required}, the parser searches for the start tag's matching end tag.
 * <ul class="Unseparated">
 * <li>
 * If the matching end tag is found, an <a HREF="#ExplicitlyTerminated">explicitly terminated element</a> is created.
 * <li>
 * If no matching end tag is found, the source document is not valid HTML and the incident is
 * {@linkplain Source#setLogWriter(Writer) logged} as a missing required end tag.
 * In this situation a <a HREF="#SingleTag">single tag element</a> is created.
 * </ul>
 * <li>
 * <a name="ParsingRulesHTMLEndTagOptional"></a>
 * If the end tag for an element of this {@linkplain StartTag#getName() name} is
 * {@linkplain HTMLElements#getEndTagOptionalElementNames() optional}, the parser searches not only for the start tag's matching end tag,
 * but also for any other tag that <a HREF="#ImplicitlyTerminated">implicitly terminates</a> the element.
 * <br />For each tag (<i>T2</i>) following the start tag (<i>ST1</i>) of this element (<i>E1</i>):
 * <ul class="Unseparated">
 * <li>
 * If <i>T2</i> is a start tag:
 * <ul>
 * <li>
 * If the {@linkplain StartTag#getName() name} of <i>T2</i> is in the list of
 * {@linkplain HTMLElements#getNonterminatingElementNames(String) non-terminating element names} for <i>E1</i>,
 * then continue evaluating tags from the {@linkplain Element#getEnd() end} of <i>T2</i>'s corresponding
 * {@linkplain StartTag#getElement() element}.
 * <li>
 * If the {@linkplain StartTag#getName() name} of <i>T2</i> is in the list of
 * {@linkplain HTMLElements#getTerminatingStartTagNames(String) terminating start tag names} for <i>E1</i>,
 * then <i>E1</i> ends at the {@linkplain StartTag#getBegin() beginning} of <i>T2</i>.
 * If <i>T2</i> follows immediately after <i>ST1</i>, a <a HREF="#SingleTag">single tag element</a> is created,
 * otherwise an <a HREF="#ImplicitlyTerminated">implicitly terminated element</a> is created.
 * </ul>
 * <li>
 * If <i>T2</i> is an end tag:
 * <ul>
 * <li>
 * If the {@linkplain EndTag#getName() name} of <i>T2</i> is the same as that of <i>ST1</i>,
 * an <a HREF="#ExplicitlyTerminated">explicitly terminated element</a> is created.
 * <li>
 * If the {@linkplain EndTag#getName() name} of <i>T2</i> is in the list of
 * {@linkplain HTMLElements#getTerminatingEndTagNames(String) terminating end tag names} for <i>E1</i>,
 * then <i>E1</i> ends at the {@linkplain EndTag#getBegin() beginning} of <i>T2</i>.
 * If <i>T2</i> follows immediately after <i>ST1</i>, a <a HREF="#SingleTag">single tag element</a> is created,
 * otherwise an <a HREF="#ImplicitlyTerminated">implicitly terminated element</a> is created.
 * </ul>
 * <li>
 * If no more tags are present in the source document, then <i>E1</i> ends at the end of the file, and an
 * <a HREF="#ImplicitlyTerminated">implicitly terminated element</a> is created.
 * </ul>
 * </ul>
 * Note that the syntactical indication of an {@linkplain StartTag#isEmptyElementTag() empty-element tag} in the start tag
 * is ignored when determining the end of <a HREF="HTMLElements.html#HTMLElement">HTML elements</a>.
 * See the documentation of the {@link #isEmptyElementTag()} method for more information.
 * <li>
 * If the {@linkplain StartTag#getName() name} of the start tag does not match one of the
 * recognised {@linkplain HTMLElementName HTML element names} (indicating a <a HREF="HTMLElements.html#NonHTMLElement">non-HTML element</a>):
 * <ul>
 * <li>
 * If the start tag is an {@linkplain StartTag#isEmptyElementTag() empty-element tag},
 * the parser does not conduct any search for an end tag and a <a HREF="#SingleTag">single tag element</a> is created.
 * <li>
 * Otherwise, section <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#CleanAttrVals">3.1</a>
 * of the XML 1.0 specification states that a matching end tag MUST be present, and
 * the parser searches for the start tag's matching end tag.
 * <ul class="Unseparated">
 * <li>
 * If the matching end tag is found, an <a HREF="#ExplicitlyTerminated">explicitly terminated element</a> is created.
 * <li>
 * If no matching end tag is found, the source document is not valid XML and the incident is
 * {@linkplain Source#setLogWriter(Writer) logged} as a missing required end tag.
 * In this situation a <a HREF="#SingleTag">single tag element</a> is created.
 * </ul>
 * </ul>
 * </ul>
 * <li>
 * If the start tag has any {@linkplain StartTag#getStartTagType() type} other than {@link StartTagType#NORMAL}:
 * <ul>
 * <li>
 * If the start tag's type does not define a {@linkplain StartTagType#getCorrespondingEndTagType() corresponding end tag type},
 * the parser does not conduct any search for an end tag and a <a HREF="#SingleTag">single tag element</a> is created.
 * <li>
 * If the start tag's type does define a {@linkplain StartTagType#getCorrespondingEndTagType() corresponding end tag type},
 * the parser assumes that a matching end tag is required and searches for it.
 * <ul class="Unseparated">
 * <li>
 * If the matching end tag is found, an <a HREF="#ExplicitlyTerminated">explicitly terminated element</a> is created.
 * <li>
 * If no matching end tag is found, the missing required end tag is {@linkplain Source#setLogWriter(Writer) logged}
 * and a <a HREF="#SingleTag">single tag element</a> is created.
 * </ul>
 * </ul>
 * </ul>
 * @see HTMLElements
 */
public final class Element extends Segment implements HTMLElementName {
    private final StartTag startTag;
    private final EndTag endTag;
    private Segment content=null;
    Element parentElement=Element.NOT_CACHED;
    private int depth=-1;

    static final Element NOT_CACHED=new Element();

    Element(final Source source, final StartTag startTag, final EndTag endTag) {
        super(source, startTag.begin, endTag==null ? startTag.end : endTag.end);
        this.startTag=startTag;
        this.endTag=(endTag==null || endTag.length()==0) ? null : endTag;
    }

    private Element() {
        startTag=null;
        endTag=null;
    }

    /**
     * Returns the parent of this element in the document element hierarchy.
     * <p>
     * The {@link Source#fullSequentialParse()} method should be called after construction of the <code>Source</code> object if this method is to be used.
     * <p>
     * This method returns <code>null</code> for a <a HREF="Source.html#TopLevelElement">top-level element</a>,
     * as well as any element formed from a {@linkplain TagType#isServerTag() server tag}, regardless of whether it is nested inside a normal element.
     * <p>
     * See the {@link Source#getChildElements()} method for more details.
     *
     * @return the parent of this element in the document element hierarchy, or <code>null</code> if this element is a <a HREF="Source.html#TopLevelElement">top-level element</a>.
     * @see #getChildElements()
     */
    public Element getParentElement() {
        if (parentElement==Element.NOT_CACHED) {
            source.getChildElements();
            if (parentElement==Element.NOT_CACHED) parentElement=null;
        }
        return parentElement;
    }

    /**
     * Returns a list of the immediate children of this element in the document element hierarchy.
     * <p>
     * The objects in the list are all of type {@link Element}.
     * <p>
     * See the {@link Source#getChildElements()} method for more details.
     *
     * @return a list of the immediate children of this element in the document element hierarchy, guaranteed not <code>null</code>.
     * @see #getParentElement()
     */
    public final List getChildElements() {
        return childElements!=null ? childElements : getChildElements(-1);
    }

    final List getChildElements(int depth) {
        if (depth!=-1) this.depth=depth;
        if (childElements==null) {
            if (!Config.IncludeServerTagsInElementHierarchy && end==startTag.end) {
                childElements=Collections.EMPTY_LIST;
            } else {
                final int childDepth=(depth==-1 ? -1 : depth+1);
                childElements=new ArrayList();
                int pos=Config.IncludeServerTagsInElementHierarchy ? begin+1 : startTag.end;
                final int maxChildBegin=(Config.IncludeServerTagsInElementHierarchy || endTag==null) ? end : endTag.begin;
                while (true) {
                    final StartTag childStartTag=source.findNextStartTag(pos);
                    if (childStartTag==null || childStartTag.begin>=maxChildBegin) break;
                    if (Config.IncludeServerTagsInElementHierarchy) {
                        if (childStartTag.begin<startTag.end && !childStartTag.getTagType().isServerTag() && !startTag.getTagType().isServerTag()) {
                            // A start tag is found within another start tag, but neither is a server tag.
                            // This only legitimately happens in very rare cases like entity definitions in doctype.
                            // We don't want to include the child elements in the hierarchy.
                            pos=childStartTag.end;
                            continue;
                        }
                    } else if (childStartTag.getTagType().isServerTag()) {
                        pos=childStartTag.end;
                        continue;
                    }
                    final Element childElement=childStartTag.getElement();
                    childElement.parentElement=this;
                    if (childElement.end>end && source.isLoggingEnabled()) source.log("Child element "+childElement.getDebugInfo()+" extends beyond end of parent "+getDebugInfo());
                    childElements.add(childElement);
                    childElement.getChildElements(childDepth);
                    pos=childElement.end;
                }
            }
        }
        return childElements;
    }

    /**
     * Returns the nesting depth of this element in the document element hierarchy.
     * <p>
     * The {@link Source#fullSequentialParse()} method should be called after construction of the <code>Source</code> object if this method is to be used.
     * <p>
     * A <a HREF="Source.html#TopLevelElement">top-level element</a> has a nesting depth of <code>0</code>.
     * <p>
     * An element formed from a {@linkplain TagType#isServerTag() server tag} always has a nesting depth of <code>0</code>,
     * regardless of whether it is nested inside a normal element.
     * <p>
     * See the {@link Source#getChildElements()} method for more details.
     *
     * @return the nesting depth of this element in the document element hierarchy.
     * @see #getParentElement()
     */
    public int getDepth() {
        if (depth==-1) {
            getParentElement();
            if (depth==-1) depth=0;
        }
        return depth;
    }

    /**
     * Returns the segment representing the <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-content">content</a> of the element.
     * <p>
     * This segment spans between the end of the start tag and the start of the end tag.
     * If the end tag is not present, the content reaches to the end of the element.
     * <p>
     * Note that before version 2.0 this method returned <code>null</code> if the element was {@linkplain #isEmpty() empty},
     * whereas now a zero-length segment is returned.
     *
     * @return the segment representing the content of the element, guaranteed not <code>null</code>.
     */
    public Segment getContent() {
        if (content==null) content=new Segment(source,startTag.end,getContentEnd());
        return content;
    }

    /**
     * Returns the start tag of the element.
     * @return the start tag of the element.
     */
    public StartTag getStartTag() {
        return startTag;
    }

    /**
     * Returns the end tag of the element.
     * <p>
     * If the element has no end tag this method returns <code>null</code>.
     *
     * @return the end tag of the element, or <code>null</code> if the element has no end tag.
     */
    public EndTag getEndTag() {
        return endTag;
    }

    /**
     * Returns the {@linkplain StartTag#getName() name} of the {@linkplain #getStartTag() start tag} of this element, always in lower case.
     * <p>
     * This is equivalent to {@link #getStartTag()}<code>.</code>{@link StartTag#getName() getName()}.
     * <p>
     * See the {@link Tag#getName()} method for more information.
     *
     * @return the name of the {@linkplain #getStartTag() start tag} of this element, always in lower case.
     */
    public String getName() {
        return startTag.getName();
    }

    /**
     * Indicates whether this element has zero-length {@linkplain #getContent() content}.
     * <p>
     * This is equivalent to {@link #getContent()}<code>.</code>{@link Segment#length() length()}<code>==0</code>.
     * <p>
     * Note that this is a broader definition than that of both the
     * <a target="_blank" HREF="http://www.w3.org/TR/html401/intro/sgmltut.html#didx-element-4">HTML definition of an empty element</a>,
     * which is only those elements whose end tag is {@linkplain HTMLElements#getEndTagForbiddenElementNames() forbidden}, and the
     * <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-empty">XML definition of an empty element</a>,
     * which is "either a start-tag immediately followed by an end-tag, or an {@linkplain #isEmptyElementTag() empty-element tag}".
     * The other possibility covered by this property is the case of an <a HREF="HTMLElements.html#HTMLElement">HTML element</a> with an
     * {@linkplain HTMLElements#getEndTagOptionalElementNames() optional} end tag that is immediately followed by another tag that implicitly
     * terminates the element.
     *
     * @return <code>true</code> if this element has zero-length {@linkplain #getContent() content}, otherwise <code>false</code>.
     * @see #isEmptyElementTag()
     */
    public boolean isEmpty() {
        return startTag.end==getContentEnd();
    }

    /**
     * Indicates whether this element is an <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-eetag">empty-element tag</a>.
     * <p>
     * It is signified by an {@linkplain #isEmpty() empty} element with the characters "<code>/&gt;</code>" at the end of the
     * {@linkplain #getStartTag() start tag}.
     * <p>
     * This is equivalent to {@link #isEmpty()}<code> && </code>{@link #getStartTag()}<code>.</code>{@link StartTag#isEmptyElementTag() isEmptyElementTag()}.
     * <p>
     * The {@link StartTag#isEmptyElementTag()} property only checks whether the start tag is syntactically an
     * <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-eetag">empty-element tag</a>, whereas this property also makes sure
     * the element is in fact {@linkplain #isEmpty() empty}.
     * <p>
     * A syntactical empty-element tag that is not actually empty can occur if the end tag of an <a HREF="HTMLElements.html#HTMLElement">HTML element</a>
     * is either {@linkplain HTMLElements#getEndTagRequiredElementNames() required} or {@linkplain HTMLElements#getEndTagOptionalElementNames() optional},
     * but the start tag is erroneously terminated with the characters "<code>/&gt;</code>" in the source document.
     * All major browsers ignore the syntactical hint of an empty element in this case, even in an
     * <a target="_blank" HREF="http://www.w3.org/TR/xhtml1/">XHTML</a> document, so this parser does the same.
     *
     * @return <code>true</code> if this element is an <a target="_blank" HREF="http://www.w3.org/TR/REC-xml#dt-eetag">empty-element tag</a>, otherwise <code>false</code>.
     */
    public boolean isEmptyElementTag() {
        return isEmpty() && startTag.isEmptyElementTag();
    }

    /**
     * Returns the attributes specified in this element's start tag.
     * <p>
     * This is equivalent to {@link #getStartTag()}<code>.</code>{@link StartTag#getAttributes() getAttributes()}.
     *
     * @return the attributes specified in this element's start tag.
     * @see StartTag#getAttributes()
     */
    public Attributes getAttributes() {
        return getStartTag().getAttributes();
    }

    /**
     * Returns the {@linkplain CharacterReference#decode(CharSequence) decoded} value of the attribute with the specified name (case insensitive).
     * <p>
     * Returns <code>null</code> if the {@linkplain #getStartTag() start tag of this element} does not
     * {@linkplain StartTagType#hasAttributes() have attributes},
     * no attribute with the specified name exists or the attribute {@linkplain Attribute#hasValue() has no value}.
     * <p>
     * This is equivalent to {@link #getStartTag()}<code>.</code>{@link StartTag#getAttributeValue(String) getAttributeValue(attributeName)}.
     *
     * @param attributeName the name of the attribute to get.
     * @return the {@linkplain CharacterReference#decode(CharSequence) decoded} value of the attribute with the specified name, or <code>null</code> if the attribute does not exist or {@linkplain Attribute#hasValue() has no value}.
     */
    public String getAttributeValue(final String attributeName) {
        return getStartTag().getAttributeValue(attributeName);
    }

    /**
     * Returns the {@link FormControl} defined by this element.
     * @return the {@link FormControl} defined by this element, or <code>null</code> if it is not a <a target="_blank" HREF="http://www.w3.org/TR/html401/interact/forms.html#form-controls">control</a>.
     */
    public FormControl getFormControl() {
        return FormControl.construct(this);
    }

    public String getDebugInfo() {
        if (this==NOT_CACHED) return "NOT_CACHED";
        final StringBuffer sb=new StringBuffer();
        sb.append("Element ").append(super.getDebugInfo()).append(": ");
        startTag.appendStartTagDebugInfo(sb);
        sb.append(endTag==null ? "(no end tag)" : "(with end tag)");
        return sb.toString();
    }

    /**
     * Returns the {@linkplain #getContent() content} text of the element.
     * <p>
     * This method has been deprecated as of version 2.0 as the {@link Segment} returned by the {@link #getContent()} method
     * now implements <code>CharSequence</code> and can be used directly in many cases.
     * Use {@link #getContent()}<code>.</code>{@link #toString() toString()} if a <code>String</code> is required.
     *
     * @return the content text of the element, or <code>null</code> if the element is {@linkplain #isEmpty() empty}.
     * @deprecated Use {@link #isEmpty() isEmpty()}<code> ? null : </code>{@link #getContent()}<code>.</code>{@link #toString() toString()} instead.
     */
    public String getContentText() {
        return isEmpty() ? null : getContent().toString();
    }

    /**
     * Indicates whether the specified element name is an HTML {@linkplain HTMLElements#getBlockLevelElementNames() block-level element}.
     * <p>
     * This method has been deprecated as of version 2.0 as the {@link HTMLElements#getBlockLevelElementNames()} method now provides
     * a complete set of the element names for which this method returns <code>true</code>.
     *
     * @param elementName an element {@linkplain #getName() name}.
     * @return <code>true</code> if the specified element name is an HTML {@linkplain HTMLElements#getBlockLevelElementNames() block-level element}, otherwise <code>false</code>.
     * @deprecated Use {@link HTMLElements#getBlockLevelElementNames()}<code>.contains(elementName.toLowerCase())</code> instead.
     */
    public static boolean isBlock(final String elementName) {
        return HTMLElements.getBlockLevelElementNames().contains(elementName.toLowerCase());
    }

    /**
     * Indicates whether the specified element name is an HTML {@linkplain HTMLElements#getInlineLevelElementNames() inline-level element}.
     * <p>
     * This method has been deprecated as of version 2.0 as the {@link HTMLElements#getInlineLevelElementNames()} method now provides
     * a complete set of the element names for which this method returns <code>true</code>.
     *
     * @param elementName an element {@linkplain #getName() name}.
     * @return <code>true</code> if the specified element name is an HTML {@linkplain HTMLElements#getInlineLevelElementNames() inline-level element}, otherwise <code>false</code>.
     * @deprecated Use {@link HTMLElements#getInlineLevelElementNames()}<code>.contains(elementName.toLowerCase())</code> instead.
     */
    public static boolean isInline(final String elementName) {
        return HTMLElements.getInlineLevelElementNames().contains(elementName.toLowerCase());
    }

    int getContentEnd() {
        return endTag!=null ? endTag.begin : end;
    }
}
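
The three element structures described in the class documentation can be told apart from character offsets alone. The following is a minimal, self-contained sketch of that idea, not the library's code: the class `ElementStructureSketch`, its `Structure` enum, and the convention of passing `-1` for a missing end tag are all hypothetical names introduced here for illustration. It mirrors the logic of `getContentEnd()` and `isEmpty()` shown in the listing.

```java
// Hypothetical, simplified stand-in for illustration; not part of the Jericho library.
final class ElementStructureSketch {
    enum Structure { SINGLE_TAG, EXPLICITLY_TERMINATED, IMPLICITLY_TERMINATED }

    /**
     * Classifies an element from its offsets.
     * startTagEnd: offset just past the start tag.
     * endTagBegin: offset where the end tag begins, or -1 if there is no end tag (assumed convention).
     * elementEnd:  offset just past the whole element.
     */
    static Structure classify(int startTagEnd, int endTagBegin, int elementEnd) {
        if (endTagBegin >= 0) return Structure.EXPLICITLY_TERMINATED;
        // No end tag: the element either stops right after its start tag or runs on to a terminating tag.
        return elementEnd == startTagEnd ? Structure.SINGLE_TAG : Structure.IMPLICITLY_TERMINATED;
    }

    /** Same rule as getContentEnd() in the listing: content stops at the end tag if one exists. */
    static int contentEnd(int endTagBegin, int elementEnd) {
        return endTagBegin >= 0 ? endTagBegin : elementEnd;
    }

    /** Same rule as isEmpty() in the listing: zero-length content. */
    static boolean isEmpty(int startTagEnd, int endTagBegin, int elementEnd) {
        return startTagEnd == contentEnd(endTagBegin, elementEnd);
    }

    public static void main(String[] args) {
        // "<br>": start tag spans 0..4, no end tag, element ends at 4.
        System.out.println(classify(4, -1, 4));
        // "<p>text</p>": start tag spans 0..3, end tag begins at 7, element ends at 11.
        System.out.println(classify(3, 7, 11));
        // "<li>one" followed by another "<li>": start tag spans 0..4, no end tag, element ends at 7.
        System.out.println(classify(4, -1, 7));
    }
}
```

Note that an explicitly terminated element can still be empty when the end tag immediately follows the start tag, which is why the sketch keeps `classify` and `isEmpty` as separate checks.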
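
The parsing rule for HTML elements with optional end tags (scan each following tag T2 until one terminates E1) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the algorithm in `StartTag.getElement()`: the `Tag` class and `findElementEnd` method are hypothetical, the name sets for `li` are illustrative values supplied by the caller rather than read from `HTMLElements`, and the step that skips over non-terminating nested elements is omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified sketch for illustration; not part of the Jericho library.
final class OptionalEndTagSketch {
    /** A tag reduced to its name, kind and character offsets. */
    static final class Tag {
        final String name;
        final boolean isEndTag;
        final int begin;
        final int end;

        Tag(String name, boolean isEndTag, int begin, int end) {
            this.name = name;
            this.isEndTag = isEndTag;
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Returns the offset at which the element opened by startTag (ST1) ends.
     * Skipping of non-terminating nested elements is deliberately left out of this sketch.
     */
    static int findElementEnd(Tag startTag, List<Tag> followingTags,
            Set<String> terminatingStartTagNames, Set<String> terminatingEndTagNames,
            int documentEnd) {
        for (Tag t2 : followingTags) {
            if (t2.isEndTag) {
                // Matching end tag: explicitly terminated, the element includes the end tag.
                if (t2.name.equals(startTag.name)) return t2.end;
                // Terminating end tag: the element stops where that tag begins.
                if (terminatingEndTagNames.contains(t2.name)) return t2.begin;
            } else if (terminatingStartTagNames.contains(t2.name)) {
                // Terminating start tag: the element stops where that tag begins.
                return t2.begin;
            }
        }
        // No more tags: the element runs to the end of the document.
        return documentEnd;
    }

    public static void main(String[] args) {
        // Offsets for the source text "<ul><li>one<li>two</ul>".
        Tag firstLi = new Tag("li", false, 4, 8);
        Tag secondLi = new Tag("li", false, 11, 15);
        Tag ulEnd = new Tag("ul", true, 18, 23);
        // Illustrative terminating names for an "li" element (assumed for this example).
        Set<String> startNames = new HashSet<>(Arrays.asList("li"));
        Set<String> endNames = new HashSet<>(Arrays.asList("ul", "ol"));
        // The first li is implicitly terminated by the second li at offset 11.
        System.out.println(findElementEnd(firstLi, Arrays.asList(secondLi, ulEnd), startNames, endNames, 23));
        // The second li is implicitly terminated by </ul> at offset 18.
        System.out.println(findElementEnd(secondLi, Arrays.asList(ulEnd), startNames, endNames, 23));
    }
}
```

In both cases the returned offset differs from the end of the start tag, so each `li` would be classed as an implicitly terminated element rather than a single tag element.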