Configuration of Tesseract

Tesseract is one of the best-known free libraries for optical character recognition (OCR). It was originally developed at Hewlett-Packard in the 1980s and 1990s and was released as open source in 2005; since 2006 its development has been sponsored by Google. This tutorial gives some hints on how to set up and use the library on Windows.

As a first step, we need to download and install the following programs:

Git: http://git-scm.com/
SlikSVN (Subversion client): http://www.sliksvn.com/en/download
Visual Studio 2013: https://www.visualstudio.com

 

Once these are installed, it is time to create a directory for building Tesseract. Let’s assume that our path is D:\Tesseract files\. The next step is to change to this directory in the Git CMD prompt (for example with cd /d "D:\Tesseract files", where /d also switches the drive). The figure below illustrates how this can be done.

Fig. 1. Git CMD example

Now we are in the required folder. First, we fetch all dependencies from the GitHub repository with the following command:

git clone https://github.com/pvorb/tesseract-vs2013.git

The next step is to open the VS2013 developer command prompt

Fig. 2. Path to VS2013 command prompt

and change to the folder created earlier

Fig. 3. VS2013 command prompt

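As you can see, the “tesseract-vs2013” folder is already inside. If the prompt is not yet in that folder, change into it first (cd tesseract-vs2013), because msbuild looks for build.proj in the current directory. Now we can build the dependencies with the command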

msbuild build.proj

After this step, the VS2013 command prompt can be closed.

Building the Tesseract

At this stage we are ready to build the library. This is done with the following steps:

  1. Re-open the Git command prompt (Fig. 1) and make sure it is still in D:\Tesseract files\.
  2. Get the latest source using SVN (type in cmd): svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
  3. Change to the newly checked-out repository (in cmd): cd tesseract-ocr
  4. Apply the patch provided in tesseract-vs2013 (type in cmd): svn patch ..\tesseract-vs2013\vs2013+64bit_support.patch
     You should see something like this:

Fig. 4. Tesseract building – step 1

After closing the Git command prompt, we already have the folders containing the header and library files:

D:\Tesseract files\include\

D:\Tesseract files\lib\

Now all we need to do is open the VS2013 solution and build the source:

  1. Open D:\Tesseract files\tesseract-ocr\vs2013\tesseract.sln with Visual Studio 2013.
  2. Build the project.

The current VS2013 solution contains configurations for both the Win32 (32-bit) and x64 platforms, each as a dynamic and a static library. After the build, you can find the compiled binaries in D:\Tesseract files\tesseract-ocr\vs2013\bin\.
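Since the binaries come in Win32 and x64 flavours, the project that uses them has to be built for the same platform as the libraries it links against. The following minimal sketch shows one way a program could report what it was compiled as. It assumes the Visual C++ convention that Debug configurations define the _DEBUG macro; the function name buildFlavor is made up for this illustration and is not part of Tesseract.

```cpp
#include <string>

// Hypothetical helper (not part of Tesseract): describes the platform
// and configuration this source file was compiled for.
std::string buildFlavor()
{
    // A pointer is 8 bytes wide in a 64-bit build and 4 bytes in a 32-bit one.
    const std::string platform = (sizeof(void*) == 8) ? "x64" : "Win32";

    // Assumption: Visual C++ defines _DEBUG in Debug configurations.
#ifdef _DEBUG
    const std::string configuration = "Debug";
#else
    const std::string configuration = "Release";
#endif

    return platform + " " + configuration;
}
```

Printing the result of buildFlavor() at program start-up makes it easier to spot a mismatch, for example a 64-bit program that was given the Win32 libraries.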

Connecting Tesseract to an existing VS2013 C++ project

To use the OCR engine in a VS2013 project, we need to copy the Tesseract library files to the right place and set up the include and library paths in the C++ project properties.

  1. Copy the Tesseract .dll files to the target project:
    From “Tesseract files\lib”, copy “libtesseract304.dll” (or “libtesseract304d.dll”) to the “Release” (or “Debug”) folder of the target project (the folder that contains the .exe file).
    From “Tesseract files\lib\Win32” (or “X64”), copy “liblept171.dll” (or “liblept171d.dll”) to the same “Release” (or “Debug”) folder.
  2. Set the properties of the target project in VS (Alt+F7) for the Debug configuration and the Win32 platform:
    In C/C++ -> General -> Additional Include Directories:
    D:\Tesseract files\tesseract-ocr\ccmain
    D:\Tesseract files\tesseract-ocr\ccstruct
    D:\Tesseract files\tesseract-ocr\ccutil
    D:\Tesseract files\include\leptonica
    D:\Tesseract files\tesseract-ocr\api
    D:\Tesseract files\include
  3. In Linker -> General -> Additional Library Directories:
    D:\Tesseract files\lib\Win32
    D:\Tesseract files\lib
  4. In Linker -> Input -> Additional Dependencies:
    libtesseract304d.lib
    liblept171d.lib
Example with recognized text:

Fig. 5. OCR example
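If the program fails to start because a DLL cannot be found, it can help to check that the files copied in the first step really sit next to the .exe file. The sketch below does this with the standard library only. The DLL names are the ones used in this tutorial; the helper names requiredDlls and missingDlls are hypothetical and chosen only for this illustration.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Names of the runtime DLLs from this tutorial.
// Debug builds use the variants with the "d" suffix.
std::vector<std::string> requiredDlls(bool debugBuild)
{
    const std::string suffix = debugBuild ? "d.dll" : ".dll";
    std::vector<std::string> names;
    names.push_back("libtesseract304" + suffix);
    names.push_back("liblept171" + suffix);
    return names;
}

// Hypothetical helper: returns the names of the required DLLs
// that cannot be opened in the folder exeDir.
std::vector<std::string> missingDlls(const std::string& exeDir, bool debugBuild)
{
    std::vector<std::string> missing;
    const std::vector<std::string> names = requiredDlls(debugBuild);
    for (std::size_t i = 0; i < names.size(); ++i) {
        // Windows accepts "/" as a path separator as well as "\".
        const std::string path = exeDir + "/" + names[i];
        std::ifstream file(path.c_str(), std::ios::binary);
        if (!file.is_open()) {
            missing.push_back(names[i]);
        }
    }
    return missing;
}
```

Calling missingDlls with the path of the project's “Debug” folder and true as the second argument returns an empty list when both debug DLLs are in place. Remember that backslashes in a C++ string literal have to be doubled, as in "D:\\Tesseract files\\lib".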