Commit e78b973

gguuss authored and dpebot committed
Adds document text detection tutorial. [(#868)](GoogleCloudPlatform/python-docs-samples#868)
* Adds document text detection tutorial.
* Feedback from review
* Less whitespace and fewer hanging indents
1 parent 38682ca commit e78b973

7 files changed: +280 -0 lines changed
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
output-text.jpg
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Vision API Python Samples
===============================================================================

This directory contains samples for Google Cloud Vision API. `Google Cloud Vision API`_ allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.




.. _Google Cloud Vision API: https://cloud.google.com/vision/docs

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

Authentication is typically done through `Application Default Credentials`_,
which means you do not have to change the code to authenticate as long as
your environment has credentials. You have a few options for setting up
authentication:

#. When running locally, use the `Google Cloud SDK`_

   .. code-block:: bash

      gcloud beta auth application-default login


#. When running on App Engine or Compute Engine, credentials are already
   set up. However, you may need to configure your Compute Engine instance
   with `additional scopes`_.

#. You can create a `Service Account key file`_. This file can be used to
   authenticate to Google Cloud Platform services from any environment. To use
   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
   the path to the key file, for example:

   .. code-block:: bash

      export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json

.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount

Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them.

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

      $ virtualenv env
      $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

      $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Document Text tutorial
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



To run this sample:

.. code-block:: bash

    $ python doctext.py

    usage: doctext.py [-h] image_file

    positional arguments:
      image_file  The image for text detection.

    optional arguments:
      -h, --help  show this help message and exit




The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
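
The service account option in the README depends on the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable pointing at a key file. As a minimal sketch, a script could check that the variable is set and names an existing file before calling the API. The helper name `find_key_file` is hypothetical; it is not part of the samples or of any Google library, and it does not validate the credentials themselves.

```python
import os


def find_key_file(environ=None):
    """Return the key file path named in the environment, or None.

    Hypothetical helper: it only inspects the environment variable and the
    file system. It does not check that the credentials are valid.
    """
    environ = os.environ if environ is None else environ
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
    if path and os.path.isfile(path):
        return path
    return None


if __name__ == '__main__':
    key_file = find_key_file()
    if key_file is None:
        # Other Application Default Credentials sources may still apply.
        print('No service account key file found in the environment.')
    else:
        print('Using key file: {}'.format(key_file))
```

A `None` result does not mean authentication will fail, since the other options listed in the README (the Cloud SDK login, or App Engine and Compute Engine credentials) do not use this variable.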
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Vision API
  short_name: Cloud Vision API
  url: https://cloud.google.com/vision/docs
  description: >
    `Google Cloud Vision API`_ allows developers to easily integrate vision
    detection features within applications, including image labeling, face and
    landmark detection, optical character recognition (OCR), and tagging of
    explicit content.

setup:
- auth
- install_deps

samples:
- name: Document Text tutorial
  file: doctext.py
  show_help: True

cloud_client_library: true
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
#!/usr/bin/env python

# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Outlines document text given an image.

Example:
    python doctext.py resources/text_menu.jpg
"""
# [START full_tutorial]
# [START imports]
import argparse
from enum import Enum
import io

from google.cloud import vision
from PIL import Image, ImageDraw
# [END imports]


class FeatureType(Enum):
    PAGE = 1
    BLOCK = 2
    PARA = 3
    WORD = 4
    SYMBOL = 5


def draw_boxes(image, blocks, color):
    """Draw a border around the image using the hints in the vector list."""
    # [START draw_blocks]
    draw = ImageDraw.Draw(image)

    for block in blocks:
        draw.polygon([
            block.vertices[0].x, block.vertices[0].y,
            block.vertices[1].x, block.vertices[1].y,
            block.vertices[2].x, block.vertices[2].y,
            block.vertices[3].x, block.vertices[3].y], None, color)
    return image
    # [END draw_blocks]


def get_document_bounds(image_file, feature):
    # [START detect_bounds]
    """Returns document bounds given an image."""
    vision_client = vision.Client()

    bounds = []

    with io.open(image_file, 'rb') as image_handle:
        content = image_handle.read()

    image = vision_client.image(content=content)
    document = image.detect_full_text()

    # Collect specified feature bounds by enumerating all document features
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
                        if (feature == FeatureType.SYMBOL):
                            bounds.append(symbol.bounding_box)

                    if (feature == FeatureType.WORD):
                        bounds.append(word.bounding_box)

                if (feature == FeatureType.PARA):
                    bounds.append(paragraph.bounding_box)

            if (feature == FeatureType.BLOCK):
                bounds.append(block.bounding_box)

            # PAGE is drawn using the bounding box of each block.
            if (feature == FeatureType.PAGE):
                bounds.append(block.bounding_box)

    return bounds
    # [END detect_bounds]


def render_doc_text(filein, fileout):
    # [START render_doc_text]
    image = Image.open(filein)
    bounds = get_document_bounds(filein, FeatureType.PAGE)
    draw_boxes(image, bounds, 'blue')
    bounds = get_document_bounds(filein, FeatureType.PARA)
    draw_boxes(image, bounds, 'red')
    bounds = get_document_bounds(filein, FeatureType.WORD)
    draw_boxes(image, bounds, 'yellow')

    if fileout is not None:
        image.save(fileout)
    else:
        image.show()
    # [END render_doc_text]


if __name__ == '__main__':
    # [START run_crop]
    parser = argparse.ArgumentParser()
    parser.add_argument('detect_file', help='The image for text detection.')
    parser.add_argument('-out_file', help='Optional output file', default=None)
    args = parser.parse_args()

    render_doc_text(args.detect_file, args.out_file)
    # [END run_crop]
# [END full_tutorial]
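
`render_doc_text` calls `get_document_bounds` once per feature type, and each call reads the image and requests text detection again. One possible alternative, sketched below, walks the page, block, paragraph, word, and symbol nesting once and groups the bounding boxes by level. The name `collect_all_bounds` and the `SimpleNamespace` stand-in document are hypothetical and only illustrate the traversal; no API call is made, and the sketch assumes Python 3.

```python
from types import SimpleNamespace


def collect_all_bounds(document):
    """Walk the document hierarchy once, grouping bounding boxes by level."""
    bounds = {'block': [], 'para': [], 'word': [], 'symbol': []}
    for page in document.pages:
        for block in page.blocks:
            bounds['block'].append(block.bounding_box)
            for paragraph in block.paragraphs:
                bounds['para'].append(paragraph.bounding_box)
                for word in paragraph.words:
                    bounds['word'].append(word.bounding_box)
                    for symbol in word.symbols:
                        bounds['symbol'].append(symbol.bounding_box)
    return bounds


def _node(box, **children):
    # Hypothetical stand-in for one element of an API response.
    return SimpleNamespace(bounding_box=box, **children)


# A tiny fake document: one block, one paragraph, two words, three symbols.
sample_document = SimpleNamespace(pages=[SimpleNamespace(blocks=[
    _node('B1', paragraphs=[
        _node('P1', words=[
            _node('W1', symbols=[_node('S1'), _node('S2')]),
            _node('W2', symbols=[_node('S3')]),
        ]),
    ]),
])])

all_bounds = collect_all_bounds(sample_document)
print(all_bounds['word'])  # ['W1', 'W2']
```

With a mapping like this, the boxes for each level could be drawn from a single detection result; whether the extra memory is worth the saved requests depends on the image and on how many levels are drawn.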
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import doctext


def test_text(cloud_config, capsys):
    """Checks that the output image with the text bounds drawn is created."""
    doctext.render_doc_text('resources/text_menu.jpg', 'output-text.jpg')
    out, _ = capsys.readouterr()
    assert os.path.isfile('output-text.jpg')
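
The test above writes `output-text.jpg` into the working directory, which is why that file name is added to the ignore list in this commit. As a sketch of one alternative, the same check can run in a scratch directory that is removed afterwards. The names `check_output_created` and `fake_render` are hypothetical; `fake_render` stands in for `doctext.render_doc_text` so the example runs without the Vision API.

```python
import os
import shutil
import tempfile


def check_output_created(render, image_path):
    """Call ``render(image_path, out_path)`` in a scratch directory.

    ``render`` is any callable with the same (filein, fileout) signature as
    ``doctext.render_doc_text``. Returns True if the output file was created.
    """
    scratch = tempfile.mkdtemp()
    try:
        out_path = os.path.join(scratch, 'output-text.jpg')
        render(image_path, out_path)
        return os.path.isfile(out_path)
    finally:
        # Clean up so that no output file is left behind.
        shutil.rmtree(scratch)


def fake_render(filein, fileout):
    # Hypothetical stand-in for doctext.render_doc_text: writes a dummy file.
    with open(fileout, 'wb') as handle:
        handle.write(b'not a real image')


print(check_output_created(fake_render, 'resources/text_menu.jpg'))
```

In the real test, `doctext.render_doc_text` would be passed in place of `fake_render`.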
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
google-cloud-vision==0.23.2
pillow==4.0.0