
New Python HTML Libraries 2026

last commit 2 years ago html5lib/html5lib-python 1K +1

added 1 year ago

Standards-compliant library for parsing and serializing HTML documents and fragments in Python.

Python HTML Libraries

last commit 7 months ago alir3z4/html2text 2K +5

added 1 year ago

Convert HTML to Markdown-formatted text.

Python HTML Libraries Markdown Libraries

last commit 4 months ago gawel/pyquery 2K +1

added 1 year ago

A jQuery-like library for Python.

Python HTML Libraries DOM Libraries XML Libraries

this project has been archived mozilla/bleach 2K

added 1 year ago

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes.

Python HTML Sanitizers HTML Libraries

last commit 4 months ago buriy/python-readability 2K

added 1 year ago

Given an HTML document, extract and clean up the main body text and title.

Python HTML Libraries

last commit 4 days ago lxml/lxml 3K +1

added 1 year ago

lxml is the most feature-rich and easy-to-use library for processing XML and HTML in the Python language.

Python HTML Libraries XML Libraries DOM Libraries

last commit 4 months ago scrapy/parsel 1K +5

added 1 year ago

Parsel lets you extract data from XML/HTML/JSON documents using XPath or CSS selectors.

Python XML Libraries HTML Libraries JSON Libraries

last commit 3 years ago psf/requests-html 13K -4

added 1 year ago

This library intends to make parsing HTML as simple and intuitive as possible.

Python HTML Libraries DOM Libraries