Python Techniques for Text Extraction From Images

Explore two methods of text extraction from images using Python 3.

By Stylianos Kampakis · May. 06, 24 · Tutorial

Python is one of the most widely used programming languages today, and it is a popular choice for AI-related tasks such as Optical Character Recognition (OCR).

Python also has extensive community support, with numerous libraries and packages that help with building AI software. Today, we are going to look at two methods of text extraction from images using Python 3.

The only prerequisites are a computer, an internet connection, and a Google account, because we are going to do everything in Google Colaboratory (Colab).

Python Techniques for Image-to-Text Conversion 

Several Python libraries can help you extract text from an image. Below are two straightforward methods of using such libraries.

1. Tesseract In Google Colab

Tesseract is an open-source OCR engine. You can use it from Python with the help of pytesseract, a wrapper library. We are going to show you how to use Tesseract for image-to-text conversion in Google Colab, an online tool for running Python code.

The advantage of using Colab is that you don’t have to install Tesseract or any large libraries on your own system; everything runs on a hosted machine.

So, let’s see how you can do that.

  1. Installing Tesseract OCR

The tesseract-ocr package provides the Tesseract engine itself, the program that actually performs the OCR. It is vital for converting images to text.

The command for installing it is:

Python
 
!sudo apt install tesseract-ocr


Normally, you would need to install Tesseract OCR on your own system. “sudo apt install” is a Linux terminal command, and the leading “!” tells Colab to run the line as a shell command rather than as Python. With Google Colab, you don’t need to go to that trouble: simply run this command in a code cell, and it installs Tesseract on the Colab machine.
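After the install cell finishes, it can be useful to confirm that the Tesseract binary is actually available. The following is a minimal sketch using only the standard library; the helper names find_tesseract and tesseract_version are made up for this illustration and are not part of Tesseract or pytesseract.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path to the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Depending on the build, the version may go to stdout or stderr.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


path = find_tesseract()
if path is None:
    print("tesseract not found; run the apt install cell first")
else:
    print(path, "->", tesseract_version(path))
```

If the script reports that tesseract was not found, rerun the install cell before moving on.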

The installation may look like this:

Tesseract installation

  2. Installing pytesseract and Pillow

Now, we need to install pytesseract, the Python wrapper for Tesseract. This is a simple matter. All you need to do is write the following command in a code block and run it.

Python
 
!pip install pytesseract


pip is Python’s package installer; the name is commonly explained as a recursive acronym for “pip installs packages.” It is used to install all kinds of Python libraries and their dependencies. The leading “!” runs the command in Colab’s shell.

After you run this command, you will see the packages being installed. The output may look like this:

pytesseract installation

You may have noticed that installing pytesseract also installs Pillow. Pillow is a fork of the Python Imaging Library (PIL). It provides functions for opening, manipulating, and saving image files.

Without Pillow, we cannot open an image and pass it to the program for image-to-text conversion. Because pytesseract lists Pillow as a dependency, pip installs it automatically. Sweet!
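To confirm that both packages ended up in the environment, you can query the installed versions. This is a small sketch using importlib.metadata from the standard library (Python 3.8+); the function name installed_versions is an assumption made for this example.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["pytesseract", "Pillow"]).items():
    print(f"{name}: {version or 'not installed'}")
```

The version numbers you see depend on when you run the cell, so none are shown here.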

  3. Image Import Preparation

Now, we need to import a few modules to prepare for importing images: pytesseract, plus the standard-library modules ‘shutil,’ ‘os,’ and ‘random.’

  1. shutil: Helps you copy, move, and delete files and directories in Python.
  2. os: Lets you work with the operating system, like navigating files, checking file existence, and executing commands.
  3. random: Generates random numbers and selections and is useful for things like games, simulations, and statistical sampling.

Here’s how to import them:

Python
 
import pytesseract
import shutil
import os
import random


Then, right below them, add the following code:

Python
 
try:
    from PIL import Image
except ImportError:
    import Image


This snippet is a common pattern for importing the Image module from the Python Imaging Library (PIL) or its fork, Pillow. Here's what it does:

  1. It attempts to import the Image module from the PIL package using the from ... import ... syntax. This is the import that Pillow supports.
  2. If that import fails, it falls back to import Image, the top-level module name used by some old installations of the original PIL.

This lets the same code run in environments that have either library. In Colab, where Pillow is installed, the first import succeeds and the fallback is never reached.
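The shutil, os, and random modules imported above are not needed by the OCR call itself. As a rough illustration of how they could be used when several images are available, the sketch below lists the images in a folder and copies one randomly chosen file into a working folder. The function names, the folder layout, and the list of extensions are all assumptions made for this example.

```python
import os
import random
import shutil
import tempfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def list_images(folder):
    """Return the sorted image file names found directly inside `folder`."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def copy_random_image(src_folder, dst_folder, rng=random):
    """Copy one randomly chosen image from src_folder to dst_folder.

    Returns the path of the copy, or None when src_folder has no images.
    """
    images = list_images(src_folder)
    if not images:
        return None
    chosen = rng.choice(images)
    os.makedirs(dst_folder, exist_ok=True)
    return shutil.copy(os.path.join(src_folder, chosen), dst_folder)


# Demonstration with throwaway folders and empty placeholder files.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("a.png", "b.jpg", "notes.txt"):
        open(os.path.join(src, name), "wb").close()
    copied = copy_random_image(src, dst)
    print(os.path.basename(copied))  # either a.png or b.jpg
```

The placeholder files here are empty, so they are not valid images; the point is only to show the file handling.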

Now, we are ready to import our image. 

  4. Image Import to Colab

To import an image from your device to Colab, you need to write the following snippet of code:

Python
 
from google.colab import files

uploaded = files.upload()


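Running this code opens a file picker so you can select a file from your device and upload it to the Colab runtime. The file is saved in the current working directory, and uploaded holds a dictionary that maps each file name to its contents.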
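If you want the uploaded files in a specific folder, you can write them out yourself. The sketch below assumes that files.upload() returns a dictionary mapping file names to raw bytes; it uses a stand-in dictionary so that it also runs outside Colab, and the helper name save_uploads is hypothetical.

```python
import os
import tempfile


def save_uploads(uploaded, dest_dir):
    """Write each uploaded file to dest_dir and return the saved paths.

    `uploaded` is assumed to map file names to raw bytes.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name, data in uploaded.items():
        # basename() guards against names that contain directory parts.
        path = os.path.join(dest_dir, os.path.basename(name))
        with open(path, "wb") as handle:
            handle.write(data)
        saved.append(path)
    return saved


# Stand-in for the dictionary returned by files.upload(); the bytes are fake.
fake_upload = {"sample.png": b"\x89PNG placeholder"}
with tempfile.TemporaryDirectory() as folder:
    for path in save_uploads(fake_upload, folder):
        print(os.path.basename(path), os.path.getsize(path), "bytes")
```

In Colab, you would pass the real uploaded dictionary and a folder of your choice instead of the stand-in values.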

  5. Text Extraction From Image

To extract text from an image, you need to write the following two commands:

Python
 
extractedInformation = pytesseract.image_to_string(Image.open('sample.png'))


Python
 
print(extractedInformation)


Here is a simple explanation of this code.

  • Image.open('sample.png'): This part opens the image file named "sample.png". The Image.open() function comes from Pillow (or the older PIL) and lets you open and manipulate image files. Replace 'sample.png' with the name of the file you uploaded.
  • pytesseract.image_to_string(...): This part calls the image_to_string function from the pytesseract package. The function takes an image (in this case, the one opened with Image.open()) and extracts the text from it using the Tesseract OCR engine.
  • extractedInformation = ...: This assigns the extracted text to the variable named extractedInformation.
  • print(extractedInformation): This simply outputs the result, which is the extracted text.
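Raw OCR output often contains stray blank lines, and Tesseract may append a form-feed character at the end of a page. The sketch below wraps the two calls explained above in a function and adds a small cleanup step. The names clean_ocr_text and extract_text are assumptions for this example; only pytesseract.image_to_string and Image.open come from the libraries themselves.

```python
def clean_ocr_text(raw):
    """Tidy raw OCR output: drop form feeds, trim lines, remove empty lines."""
    lines = (line.strip() for line in raw.replace("\x0c", "").splitlines())
    return "\n".join(line for line in lines if line)


def extract_text(image_path, lang="eng"):
    """Run Tesseract on one image file and return the cleaned text.

    The imports live inside the function so that clean_ocr_text can be
    used even where pytesseract and Pillow are not installed.
    """
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        raw = pytesseract.image_to_string(image, lang=lang)
    return clean_ocr_text(raw)


# Cleanup demonstrated on a hand-written string, not on real OCR output.
sample = "It was the best of times,\n\n  it was the worst of times \n\x0c"
print(clean_ocr_text(sample))
```

In Colab, after uploading an image, you could call extract_text('sample.png') in place of the two separate commands.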

The image we chose for this exercise was this one:

"It was the best of times, it was the worst of times" quote.

As you can see, our output matched the text in the image.

Extracted information

So, there you have it. You've learned how to use Python for text extraction from an image using Tesseract and Google Colab.

2. Editpad: A Python-Powered Online Tool

There is another way to extract text with Python’s help: use a Python-powered online tool like Editpad.

Editpad is a simple tool that uses Python in its backend to deploy OCR and extract text from images. Here’s how you can use this tool.

  1. Open a web browser and search for Editpad’s “Extract Text from Image” tool.
  2. Open the result that matches your query.
  3. You will see a simple interface like this:

Editpad: Extract text from image

  4. Follow the on-screen instructions to input your image.

  5. Click the “Extract Text” button.

Extract Text button

  6. You will get your output in a matter of seconds. Simply download or copy it to use it.

Option to download or copy

Overall, this is a much simpler way of extracting text from images. Another advantage is that you can input multiple images for extraction. There is also an API you can use if you want to integrate this functionality into your own programs or apps.

Conclusion

You have learned two Python techniques for text extraction from images. One method was to write a short program in Python and use Tesseract OCR for text extraction. The other was to use an online tool that relies on Python in the back end. Both approaches have their merits: the program gives you control over the code, while the online tool is simpler to use. Choose the one that fits your task.


Opinions expressed by DZone contributors are their own.
