13. Structured Markup Processing Tools

Python provides a variety of modules for working with structured data markup. These include modules for the Standard Generalized Markup Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces for the Extensible Markup Language (XML).

HTMLParser   A simple parser that can handle HTML and XHTML.
sgmllib   Only as much of an SGML parser as needed to parse HTML.
htmllib   A parser for HTML documents.
htmlentitydefs   Definitions of HTML general entities.
xml.parsers.expat   An interface to the Expat non-validating XML parser.
xml.dom   Document Object Model API for Python.
xml.dom.minidom   Lightweight Document Object Model (DOM) implementation.
xml.dom.pulldom   Support for building partial DOM trees from SAX events.
xml.sax   Package containing SAX2 base classes and convenience functions.
xml.sax.handler   Base classes for SAX event handlers.
xml.sax.saxutils   Convenience functions and classes for use with SAX.
xml.sax.xmlreader   Interface which SAX-compliant XML parsers must implement.
xmllib   A parser for XML documents.
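
To give a feel for how two of the XML interfaces listed above differ, here is a minimal sketch that reads the same small document once with xml.dom.minidom (whole tree in memory) and once with xml.sax (event callbacks). The sample document, the BookHandler class, and the two helper function names are assumptions made for illustration only; they are not part of the library.

```python
import xml.dom.minidom
import xml.sax
import xml.sax.handler

# Hypothetical sample document, used only for illustration.
SAMPLE = ("<catalog>"
          "<book id='b1'>Dive In</book>"
          "<book id='b2'>Deep Sea</book>"
          "</catalog>")


def titles_with_dom(text):
    """DOM style: build the whole tree in memory, then query it."""
    document = xml.dom.minidom.parseString(text)
    try:
        titles = []
        for node in document.getElementsByTagName("book"):
            # Join the text children of each <book> element.
            parts = [child.data for child in node.childNodes
                     if child.nodeType == child.TEXT_NODE]
            titles.append("".join(parts))
        return titles
    finally:
        document.unlink()  # release the tree's internal references


class BookHandler(xml.sax.handler.ContentHandler):
    """SAX style: react to events as the parser reports them."""

    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.ids = []

    def startElement(self, name, attrs):
        # Called once per opening tag; no tree is ever built.
        if name == "book":
            self.ids.append(attrs.get("id"))


def ids_with_sax(text):
    handler = BookHandler()
    # Encode to bytes so the call behaves the same across Python versions.
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.ids


if __name__ == "__main__":
    print(titles_with_dom(SAMPLE))
    print(ids_with_sax(SAMPLE))
```

As a rough guide, the DOM approach tends to be more convenient when a document is small and needs to be navigated freely, while the SAX approach avoids holding the whole document in memory. xml.dom.pulldom sits between the two.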
