Module htmldata :: Class _HTMLTag

Class _HTMLTag


HTML tag extracted by _full_tag_extract.

Method Summary
  __init__(self, pos, name, attrs, key_pos, value_pos)
Create an _HTMLTag object.

Instance Variable Summary
  attrs: Dictionary mapping tag attributes to corresponding tag values.
  key_pos: Dictionary mapping each attribute name to the (start, end) indices of its key in the HTML document.
  name: Name of tag.
  pos: (start, end) indices of the entire tag in the HTML document.
  value_pos: Dictionary mapping each attribute name to the (start, end) indices of its value in the HTML document.

Method Details

__init__(self, pos, name, attrs, key_pos, value_pos)
(Constructor)

Create an _HTMLTag object. The parameter names match the instance variables documented below.

Instance Variable Details

attrs

Dictionary mapping tag attributes to corresponding tag values.

Example:
>>> tag = _full_tag_extract('<a href="d.com">')[0]
>>> tag.attrs
{'href': 'd.com'}

Surrounding quotes are stripped from the values.

key_pos

Key position dict.

Maps the name of a tag attribute to (start, end) indices for the key string in the "key=value" HTML pair. Indices are absolute, where 0 is the start of the HTML document.

Example:
>>> tag = _full_tag_extract('<a href="d.com">')[0]
>>> tag.key_pos['href']
(3, 7)

>>> '<a href="d.com">'[3:7]
'href'
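
Because the indices are absolute, a key can be rewritten by splicing the document string directly. The following is a minimal sketch that reuses the position from the example above. The helper replace_span is hypothetical and is not part of htmldata.

```python
# Hypothetical helper, not part of htmldata: splice new text into a
# document at a half-open (start, end) span.
def replace_span(doc, span, new_text):
    start, end = span
    return doc[:start] + new_text + doc[end:]

doc = '<a href="d.com">'
key_pos = {'href': (3, 7)}  # copied by hand from the example above

# Rename the attribute key in place; the rest of the tag is untouched.
renamed = replace_span(doc, key_pos['href'], 'data-href')
print(renamed)
```

After a splice that changes the length of the document, positions that lie after the edited span no longer line up with the new string and would need to be recomputed.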

name

Name of tag. For example, 'img'.

pos

(start, end) indices of the entire tag in the HTML document.
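
No example is given for pos. Assuming it follows the same half-open slicing convention as the key_pos and value_pos examples (an assumption, not confirmed here), the tag text could be recovered by slicing the document. The document string and the position tuple below are hand-computed illustrative values, not output produced by htmldata.

```python
doc = 'x <a href="d.com"> y'

# Assumed value of tag.pos for the single tag in doc, computed by hand
# under the half-open convention used by key_pos and value_pos.
pos = (2, 18)

start, end = pos
tag_text = doc[start:end]
print(tag_text)
```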

value_pos

Value position dict.

Maps the name of a tag attribute to (start, end) indices for the value in the HTML document string. Surrounding quotes are excluded from this range. Indices are absolute, where 0 is the start of the HTML document.

Example:
>>> tag = _full_tag_extract('<a href="d.com">')[0]
>>> tag.value_pos['href']
(9, 14)

>>> '<a href="d.com">'[9:14]
'd.com'
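
One use for absolute value positions is rewriting several attribute values in a single document. The sketch below applies the edits from the end of the document backwards, so that each splice leaves the offsets of the remaining spans valid. The helper rewrite_values is hypothetical and not part of htmldata, and the spans are hand-computed for illustration rather than taken from _full_tag_extract.

```python
def rewrite_values(doc, spans_and_values):
    """Replace several half-open (start, end) spans of doc with new text.

    Hypothetical helper, not part of htmldata. Spans must not overlap.
    """
    # Work from the last span to the first so that earlier offsets are
    # not shifted by replacements made later in the document.
    for (start, end), new_text in sorted(spans_and_values, reverse=True):
        doc = doc[:start] + new_text + doc[end:]
    return doc

doc = '<a href="d.com">' + '<img src="p.png">'

# The first span comes from the example above; the second is the value
# of src, offset by the 16 characters of the first tag.
edits = [
    ((9, 14), 'https://d.com/'),
    ((26, 31), 'https://d.com/p.png'),
]

result = rewrite_values(doc, edits)
print(result)
```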