htmldata

Manipulate HTML/XHTML/CSS documents.

By Connelly Barnes

Features

Compatible with Python 2.0 - 2.5.
Extract and modify URLs in an HTML/XHTML/CSS document.
Translate HTML/XHTML documents to and from list data structures.
This allows you to read and write HTML documents programmatically, with much flexibility.
Use cases: robots, proxy CGI scripts, filtering of HTML and CSS, and flexible wget-like mirroring.
Works in the "real world," unlike many Python HTML parsers.
In other words, if a Web browser can parse it, then htmldata should parse it (that's the *goal* at least...:-).
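To illustrate the idea behind the list-based features above, here is a minimal sketch of turning an HTML document into a flat list and pulling URLs out of it. It does not use htmldata's own API: it is written for Python 3 with the standard library's html.parser, and the names ListBuilder, html_to_list, and extract_urls are hypothetical, chosen only for this example. The list layout (plain strings for text, (tag, attrs) tuples for tags) is likewise an assumption and may differ from what htmldata actually produces.

```python
from html.parser import HTMLParser


class ListBuilder(HTMLParser):
    """Collect a document as a flat list of text strings and (tag, attrs) tuples.

    Hypothetical helper for illustration only; not part of htmldata.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.items = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; store it as a dict.
        self.items.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Mark closing tags with a leading slash and no attributes.
        self.items.append(("/" + tag, {}))

    def handle_data(self, data):
        self.items.append(data)


def html_to_list(doc):
    """Parse an HTML string into a flat list of text and tag items."""
    parser = ListBuilder()
    parser.feed(doc)
    parser.close()
    return parser.items


def extract_urls(items, url_attrs=("href", "src")):
    """Return the values of URL-bearing attributes, in document order."""
    urls = []
    for item in items:
        if isinstance(item, tuple):
            _tag, attrs = item
            for name in url_attrs:
                if attrs.get(name):
                    urls.append(attrs[name])
    return urls


doc = '<a href="http://example.com/">link</a><img src="a.png">'
items = html_to_list(doc)
print(items)
print(extract_urls(items))
```

Because the document ends up as an ordinary list, filtering or rewriting it is a matter of normal list manipulation. The sketch only looks at href and src attributes, so it will miss URLs in CSS and in other attributes.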

Current Version

Version 1.1.1 source code and manual are available.

Version History

Version history (changelog):

1.1.1: source code
1.1.0: source code
1.0.9: source code
1.0.8: source code
1.0.7: source code
1.0.6: source code
1.0.5: source code
1.0.4: source code
1.0.3: source code
1.0.2: source code
1.0.1: source code
1.0.0: source code

Author

The htmldata source code and supporting documentation have been placed in the public domain.

Please send patches and bug reports to my e-mail address (evaluate the following in a Python 2 interpreter):

>>> 'Y29ubmVsbHliYXJuZXNAZ21haWwuY29t'.decode('base64')