Commit d487509

beccasaurus and dizcology authored and committed
Data Labeling Beta samples (#2096)
* add files
* update create_annotation_spec_set and test
* add requirements.txt
* update create_instruction and test
* update import data and test
* add label image and test
* add label_text test
* add label_video_test
* add manage dataset and tests
* flake
* fix
* add README
1 parent c4a0e6d commit d487509

18 files changed: 1270 additions, 0 deletions

datalabeling/README.rst

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Data Labeling Service Python Samples
===============================================================================

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=datalabeling/README.rst


This directory contains samples for Google Cloud Data Labeling Service. `Google Cloud Data Labeling Service`_ lets you request that human labelers label a collection of data that you plan to use to train a custom machine learning model.




.. _Google Cloud Data Labeling Service: https://cloud.google.com/data-labeling/docs/

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

This sample requires you to have authentication set up. Refer to the
`Authentication Getting Started Guide`_ for instructions on setting up
credentials for applications.

.. _Authentication Getting Started Guide:
    https://cloud.google.com/docs/authentication/getting-started

Install Dependencies
++++++++++++++++++++

#. Clone python-docs-samples and change directory to the sample directory you want to use.

    .. code-block:: bash

        $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions.

   .. _Python Development Environment Setup Guide:
       https://cloud.google.com/python/setup

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    .. code-block:: bash

        $ virtualenv env
        $ source env/bin/activate

#. Install the dependencies needed to run the samples.

    .. code-block:: bash

        $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/



The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
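
Review note: the Authentication step in this README can be checked before any sample is run. The following is a minimal sketch, assuming credentials are supplied through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is only one of the ways Application Default Credentials can be configured. The helper name `find_credentials_file` is hypothetical and is not part of this commit.

```python
import os


def find_credentials_file(environ=None):
    """Return the key file path from the environment, or None.

    Only GOOGLE_APPLICATION_CREDENTIALS is inspected. Other credential
    sources (for example gcloud user credentials) are not checked here.
    """
    environ = os.environ if environ is None else environ
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS', '').strip()
    if not path:
        return None
    # Report the path only if it points at an existing file.
    return path if os.path.isfile(path) else None


if __name__ == '__main__':
    found = find_credentials_file()
    if found:
        print('Using credentials file: {}'.format(found))
    else:
        print('GOOGLE_APPLICATION_CREDENTIALS is unset or not a file.')
```

A `None` result does not prove that authentication will fail, since the client library may find credentials elsewhere; it only flags the most common misconfiguration.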

datalabeling/README.rst.in

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Data Labeling Service
  short_name: Cloud Data Labeling
  url: https://cloud.google.com/data-labeling/docs/
  description: >
    `Google Cloud Data Labeling Service`_ lets you request that human
    labelers label a collection of data that you plan to use to train a
    custom machine learning model.

setup:
- auth
- install_deps

cloud_client_library: true

folder: datalabeling
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
#!/usr/bin/env python

# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse


# [START datalabeling_create_annotation_spec_set_beta]
def create_annotation_spec_set(project_id):
    """Creates a data labeling annotation spec set for the given
    Google Cloud project.
    """
    from google.cloud import datalabeling_v1beta1 as datalabeling
    client = datalabeling.DataLabelingServiceClient()

    project_path = client.project_path(project_id)

    annotation_spec_1 = datalabeling.types.AnnotationSpec(
        display_name='label_1',
        description='label_description_1'
    )

    annotation_spec_2 = datalabeling.types.AnnotationSpec(
        display_name='label_2',
        description='label_description_2'
    )

    annotation_spec_set = datalabeling.types.AnnotationSpecSet(
        display_name='YOUR_ANNOTATION_SPEC_SET_DISPLAY_NAME',
        description='YOUR_DESCRIPTION',
        annotation_specs=[annotation_spec_1, annotation_spec_2]
    )

    response = client.create_annotation_spec_set(
        project_path, annotation_spec_set)

    # The format of the resource name:
    # projects/{project_id}/annotationSpecSets/{annotation_spec_set_id}
    print('The annotation_spec_set resource name: {}'.format(response.name))
    print('Display name: {}'.format(response.display_name))
    print('Description: {}'.format(response.description))
    print('Annotation specs:')
    for annotation_spec in response.annotation_specs:
        print('\tDisplay name: {}'.format(annotation_spec.display_name))
        print('\tDescription: {}\n'.format(annotation_spec.description))

    return response
# [END datalabeling_create_annotation_spec_set_beta]


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument(
        '--project-id',
        help='Project ID. Required.',
        required=True
    )

    args = parser.parse_args()

    create_annotation_spec_set(args.project_id)
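
Review note: callers of this sample often need the IDs embedded in the printed resource name. The sketch below assumes the name follows the pattern `projects/{project_id}/annotationSpecSets/{annotation_spec_set_id}`; the helper `parse_annotation_spec_set_name` is hypothetical and is not provided by this commit or claimed to exist in the client library.

```python
import re

# Assumed resource name pattern:
# projects/{project_id}/annotationSpecSets/{annotation_spec_set_id}
_SPEC_SET_NAME = re.compile(
    r'^projects/(?P<project_id>[^/]+)'
    r'/annotationSpecSets/(?P<annotation_spec_set_id>[^/]+)$')


def parse_annotation_spec_set_name(name):
    """Split a resource name into (project_id, annotation_spec_set_id)."""
    match = _SPEC_SET_NAME.match(name)
    if match is None:
        raise ValueError('Unexpected resource name: {!r}'.format(name))
    return match.group('project_id'), match.group('annotation_spec_set_id')
```

For example, the value of `response.name` returned by `create_annotation_spec_set` could be passed straight to this helper to recover the spec set ID for later calls.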
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
#!/usr/bin/env python

# Copyright 2019 Google, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import create_annotation_spec_set
from google.cloud import datalabeling_v1beta1 as datalabeling
import pytest

PROJECT_ID = os.getenv('GCLOUD_PROJECT')


@pytest.mark.slow
def test_create_annotation_spec_set(capsys):
    response = create_annotation_spec_set.create_annotation_spec_set(
        PROJECT_ID)
    out, _ = capsys.readouterr()
    assert 'The annotation_spec_set resource name:' in out

    # Delete the created annotation spec set.
    annotation_spec_set_name = response.name
    client = datalabeling.DataLabelingServiceClient()
    client.delete_annotation_spec_set(annotation_spec_set_name)
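
Review note: in this test the delete call runs only when the assertion passes, so a failing assertion could leave the annotation spec set behind in the project. One possible way to guarantee cleanup is a `try`/`finally` block. The sketch below illustrates the pattern with a hypothetical `FakeClient` stand-in so that it runs without credentials; `run_with_cleanup` is also a hypothetical name.

```python
class FakeClient(object):
    """Stand-in for the real client; records which names were deleted."""

    def __init__(self):
        self.deleted = []

    def delete_annotation_spec_set(self, name):
        self.deleted.append(name)


def run_with_cleanup(client, create, check):
    """Create a resource, run the check, and always delete the resource.

    `create` returns the resource name; `check` may raise AssertionError.
    """
    resource_name = create()
    try:
        check(resource_name)
    finally:
        # Runs whether or not the check raised.
        client.delete_annotation_spec_set(resource_name)
```

In the real test, the same shape would wrap the `capsys` assertion in the `try` body and call the real client's `delete_annotation_spec_set` in the `finally` body.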

datalabeling/create_instruction.py

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
#!/usr/bin/env python

# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse


# [START datalabeling_create_instruction_beta]
def create_instruction(project_id, data_type, instruction_gcs_uri):
    """Creates a data labeling PDF instruction for the given Google Cloud
    project. The PDF file should be uploaded to the project in
    Google Cloud Storage.
    """
    from google.cloud import datalabeling_v1beta1 as datalabeling
    client = datalabeling.DataLabelingServiceClient()

    project_path = client.project_path(project_id)

    pdf_instruction = datalabeling.types.PdfInstruction(
        gcs_file_uri=instruction_gcs_uri)

    instruction = datalabeling.types.Instruction(
        display_name='YOUR_INSTRUCTION_DISPLAY_NAME',
        description='YOUR_DESCRIPTION',
        data_type=data_type,
        pdf_instruction=pdf_instruction
    )

    operation = client.create_instruction(project_path, instruction)

    result = operation.result()

    # The format of the resource name:
    # projects/{project_id}/instructions/{instruction_id}
    print('The instruction resource name: {}\n'.format(result.name))
    print('Display name: {}'.format(result.display_name))
    print('Description: {}'.format(result.description))
    print('Create time:')
    print('\tseconds: {}'.format(result.create_time.seconds))
    print('\tnanos: {}'.format(result.create_time.nanos))
    print('Data type: {}'.format(
        datalabeling.enums.DataType(result.data_type).name))
    print('Pdf instruction:')
    print('\tGcs file uri: {}'.format(
        result.pdf_instruction.gcs_file_uri))

    return result
# [END datalabeling_create_instruction_beta]


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument(
        '--project-id',
        help='Project ID. Required.',
        required=True
    )

    parser.add_argument(
        '--data-type',
        help='Data type: IMAGE, VIDEO, TEXT or AUDIO. Required.',
        required=True
    )

    parser.add_argument(
        '--instruction-gcs-uri',
        help='The Cloud Storage URI of the instruction PDF. Required.',
        required=True
    )

    args = parser.parse_args()

    create_instruction(
        args.project_id,
        args.data_type,
        args.instruction_gcs_uri
    )
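
Review note: the `--data-type` help text names four accepted values, but the parser accepts any string and leaves rejection to the API call. A possible refinement, sketched below, is to let `argparse` validate the value through `choices`. The `build_parser` function and the `DATA_TYPES` tuple are hypothetical names, and the list of values is taken only from the help text in this sample.

```python
import argparse

# Values named in the sample's --data-type help text.
DATA_TYPES = ('IMAGE', 'VIDEO', 'TEXT', 'AUDIO')


def build_parser():
    """Build a parser that rejects unknown data types before any API call."""
    parser = argparse.ArgumentParser(
        description='Create a data labeling instruction (argument sketch).')
    parser.add_argument('--project-id', required=True, help='Project ID.')
    parser.add_argument(
        '--data-type',
        required=True,
        type=str.upper,  # Normalize case before the choices check.
        choices=DATA_TYPES,
        help='One of: {}.'.format(', '.join(DATA_TYPES)))
    parser.add_argument(
        '--instruction-gcs-uri',
        required=True,
        help='Cloud Storage URI of the instruction PDF.')
    return parser
```

With this parser, an unsupported value such as `--data-type SPREADSHEET` makes `argparse` print a usage error and exit with status 2 instead of sending a request.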
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
#!/usr/bin/env python

# Copyright 2019 Google, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import create_instruction
from google.cloud import datalabeling_v1beta1 as datalabeling
import pytest

PROJECT_ID = os.getenv('GCLOUD_PROJECT')
INSTRUCTION_GCS_URI = ('gs://cloud-samples-data/datalabeling'
                       '/instruction/test.pdf')


@pytest.mark.slow
def test_create_instruction(capsys):
    result = create_instruction.create_instruction(
        PROJECT_ID,
        'IMAGE',
        INSTRUCTION_GCS_URI
    )
    out, _ = capsys.readouterr()
    assert 'The instruction resource name: ' in out

    # Delete the created instruction.
    instruction_name = result.name
    client = datalabeling.DataLabelingServiceClient()
    client.delete_instruction(instruction_name)
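
Review note: the instruction URI in this test is assembled from two string literals, so a quick structural check can catch a malformed value before a slow API call is made. The sketch below assumes Python 3 and the `gs://bucket/object` form; `split_gcs_uri` is a hypothetical helper, not part of the client library.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/object' into (bucket, object_name).

    Object names containing '?' or '#' are not handled by this sketch,
    because urlparse treats them as query and fragment separators.
    """
    parsed = urlparse(uri)
    object_name = parsed.path.lstrip('/')
    if parsed.scheme != 'gs' or not parsed.netloc or not object_name:
        raise ValueError('Not a gs://bucket/object URI: {!r}'.format(uri))
    return parsed.netloc, object_name
```

Applied to the URI used in this test, the helper would separate the `cloud-samples-data` bucket from the `datalabeling/instruction/test.pdf` object path.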
