This site has been retired.



1. Overview

OCRFeeder is a document layout analysis and optical character recognition system.

Given a set of images, it automatically outlines their contents, distinguishes between graphics and text, and performs OCR on the latter. It can export to multiple formats, the main one being ODT.
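The three steps described above (outline, classify, recognize) can be pictured with a minimal Python sketch. This is not OCRFeeder's actual code: `Region`, `process_page`, `is_text`, and `run_ocr` are hypothetical names introduced here for illustration, and the classification and OCR steps are simply passed in as callables so that the flow of the pipeline is visible.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Region:
    """A rectangular area outlined on a page image (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int
    kind: str = "unknown"        # becomes "text" or "graphics"
    text: Optional[str] = None   # filled in only for text regions


def process_page(
    regions: List[Region],
    is_text: Callable[[Region], bool],
    run_ocr: Callable[[Region], str],
) -> List[Region]:
    """Classify each outlined region and run OCR on the text ones only."""
    for region in regions:
        if is_text(region):
            region.kind = "text"
            region.text = run_ocr(region)
        else:
            # Graphics are kept as images; no OCR is attempted on them.
            region.kind = "graphics"
    return regions


if __name__ == "__main__":
    # Stand-in callables: a real system would use layout analysis
    # and an OCR engine here. Both are assumptions for this sketch.
    page = [Region(0, 0, 400, 30), Region(0, 40, 400, 300)]
    result = process_page(
        page,
        is_text=lambda r: r.height < 50,
        run_ocr=lambda r: "placeholder text",
    )
    for r in result:
        print(r.kind, r.text)
```

Passing the classifier and the OCR step as parameters keeps the sketch independent of any particular engine, which loosely mirrors the idea that only the regions identified as text are sent to OCR.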

It features a complete GTK graphical user interface that allows users to correct any unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load the project, export everything to multiple formats, etc. OCRFeeder was developed as Joaquim Rocha's Master's thesis project in Computer Science.

2. News

OCRFeeder's category on the author's blog

3. Preview


3.1. Screenshots

4. Where Do I Get It


Release packages:

Issue tracker:

4.1. Show Me The Code

You can check out the latest sources by running the following:

git clone

4.2. How Do I Use It

Please visit the link below to read OCRFeeder's README:

Manual in German from Ubuntu Users' wiki:

Configuration for Chinese OCR:

4.3. Roadmap


2024-10-23 10:58