
Here are 560 public repositories matching this topic...

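Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Ships with built-in libraries for multiple languages.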

Updated Nov 20, 2025
Python

CnOCR: Awesome Chinese/English OCR Python toolkit based on PyTorch/MXNet. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation.

Updated Feb 7, 2026
Python

Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

Updated Dec 8, 2025
Python

An end and a new beginning

Updated Nov 19, 2023
QML

Lightweight & fast OCR models for license plate text recognition.

Updated Mar 14, 2026
Python

A math workspace for screenshot OCR, handwriting-to-LaTeX, editing, preview, and symbolic computation, powered by MathCraft OCR and MathLive.

Updated Jun 15, 2026
Python

Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

Updated Sep 5, 2022
Jupyter Notebook

OCR, Archive, Index and Search: Implementation agnostic OCR framework.

Updated Nov 3, 2023
Jupyter Notebook

Collection of PDF parsing libraries such as the AI-based docling, claude, openai, gemini, Meta's llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, etc., for efficient snapshot, text, table, and metadata extraction.

Updated Apr 21, 2026
Python

Perform text detection in a variety of languages with your computer webcam using Google Tesseract OCR and OpenCV. This script achieves a real-time OCR effect via multi-threading.

Updated Jan 30, 2023
Python

A carefully designed OCR pipeline for universal bordered table recognition and reconstruction.

Updated Jan 10, 2023
C++

Manga OCR snipping application for desktop

Updated Jan 7, 2023
Python

Python3 package for Chinese/English OCR that uses the paddleocr-v5 ONNX model (~20MB), with ultra-fast inference speed. Inference is based on the ppocr-v5-onnx model, an open-source SOTA for Chinese/English OCR.

Updated Apr 11, 2026
Python

Anansi is a Python-based crawler that combines computer vision (cv2 and FFmpeg) with OCR (EasyOCR and Tesseract) to find and extract questions and correct answers from video files of popular TV game shows in the Balkan region.

Updated Sep 26, 2022
Python

The open-source universal adapter for LLMs. Turn messy real-world data into clean, agent-ready context.

Updated Jun 12, 2026
Python

PDF text data extraction web app with OCR for scanned documents

Updated Jun 5, 2024
Python

FLOSS software for Persian Optical Character Recognition

Updated Jun 19, 2024
Jupyter Notebook

OCR Tamil is a tool that detects and recognizes Tamil text in natural scene images with high accuracy.

Updated Sep 6, 2025
Python

Easter2.0: Improving Convolutional Models for Handwritten Text Recognition

Updated Apr 25, 2023
Jupyter Notebook

OCR CLI tool for extracting text from screenshots (images) using Bash and Python scripts, for both X11 and Wayland

Updated Mar 28, 2026
Shell
