[video] Streaming AutoML Classification #2313

Merged
8 commits merged on Aug 26, 2019
Changes from 7 commits
video/cloud-client/analyze/beta_snippets.py (77 additions, 0 deletions)
@@ -36,6 +36,9 @@

python beta_snippets.py streaming-annotation-storage resources/cat.mp4 \
    gs://mybucket/myfolder

python beta_snippets.py streaming-automl-classification \
    resources/cat.mp4 projects/myproject/location/mylocation/model/mymodel
Contributor:
Nit:

Thoughts on changing:
projects/myproject/location/mylocation/model/mymodel
To:
projects/[PROJECT_ID]/location/[LOCATION]/model/[MODEL_NAME]

Or something like the above to help show which things need to be changed?

Member Author:
Updated. [LOCATION] needs to be hard coded to us-central1 because it is the only region that supports this feature at the moment.
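
For illustration, the updated usage line could read something like the following; the bracketed placeholder names are an assumption, following the resource-path format used later in the sample:

python beta_snippets.py streaming-automl-classification \
    resources/cat.mp4 projects/[PROJECT_ID]/locations/us-central1/models/[MODEL_ID]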

"""

import argparse
@@ -629,6 +632,72 @@ def stream_generator():
    # [END video_streaming_annotation_to_storage_beta]


def streaming_automl_classification(path, model_path):
    # [START video_streaming_automl_classification_beta]
    import io

    from google.cloud import videointelligence_v1p3beta1 as videointelligence
    from google.cloud.videointelligence_v1p3beta1 import enums

    # path = 'path_to_file'
    # model_path = 'projects/project_id/locations/location_id/models/model_id'
Contributor:
Could we ask for project_id and model_id to be passed in separately?

This is how we traditionally do samples which require a project path :)

Python has helper methods for creating these long model resource paths, e.g.

client.model_path("project id", "us-central1", "model id")

We have pre-existing examples for samples that require project ID (in "resource paths" like this) for: Product Search and Translation (v3 translation has Glossary objects with resource paths)

See Product Search example:
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/vision/cloud-client/product_search/product_set_management.py#L39

Excerpt:

# [START vision_product_search_create_product_set]
def create_product_set(
        project_id, location, product_set_id, product_set_display_name):
    """Create a product set.
    Args:
        project_id: Id of the project.
        location: A compute region name.
        product_set_id: Id of the product set.
        product_set_display_name: Display name of the product set.
    """
    client = vision.ProductSearchClient()

    # A resource that represents Google Cloud Platform location.
    location_path = client.location_path(
        project=project_id, location=location)

    # Create a product set with the product set specification in the region.
    product_set = vision.types.ProductSet(
            display_name=product_set_display_name)

    # The response is the product set with `name` populated.
    response = client.create_product_set(
        parent=location_path,
        product_set=product_set,
        product_set_id=product_set_id)

    # Display the product set information.
    print('Product set name: {}'.format(response.name))
# [END vision_product_search_create_product_set]

Member Author:
We are missing a method that constructs the model path from the video intelligence client; dir(client) below shows no model_path helper. How about I use string formatting here instead (see the sketch after the listing)?

['SERVICE_ADDRESS',
 '_INTERFACE_NAME',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_client_info',
 '_inner_api_calls',
 '_method_configs',
 'enums',
 'from_service_account_file',
 'from_service_account_json',
 'streaming_annotate_video',
 'transport']
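
A minimal sketch of that string-formatting approach, assuming project_id comes from the environment and model_id is passed in separately (the model_id value here is a placeholder, not a real model):

import os

project_id = os.environ['GCLOUD_PROJECT']
model_id = 'my_model_id'  # placeholder for illustration only
model_path = 'projects/{}/locations/us-central1/models/{}'.format(
    project_id, model_id)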


    client = videointelligence.StreamingVideoIntelligenceServiceClient()

    # Here we use classification as an example.
    automl_config = (videointelligence.types
                     .StreamingAutomlClassificationConfig(
                         model_name=model_path))

    video_config = videointelligence.types.StreamingVideoConfig(
        feature=enums.StreamingFeature.STREAMING_AUTOML_CLASSIFICATION,
        automl_classification_config=automl_config)

    # config_request should be the first in the stream of requests.
    config_request = videointelligence.types.StreamingAnnotateVideoRequest(
        video_config=video_config)

    # Set the chunk size to 5MB (recommended less than 10MB).
    chunk_size = 5 * 1024 * 1024

    # Load file content.
    stream = []
    with io.open(path, 'rb') as video_file:
        while True:
            data = video_file.read(chunk_size)
            if not data:
                break
            stream.append(data)

    def stream_generator():
        yield config_request
        for chunk in stream:
            yield videointelligence.types.StreamingAnnotateVideoRequest(
                input_content=chunk)

    requests = stream_generator()

    # streaming_annotate_video returns a generator.
    # The default timeout is about 300 seconds.
    # To process longer videos it should be set to
    # larger than the length (in seconds) of the stream.
    responses = client.streaming_annotate_video(requests, timeout=600)

    for response in responses:
        # Check for errors.
        if response.error.message:
            print(response.error.message)
            break

        for label in response.annotation_results.label_annotations:
            for frame in label.frames:
                print("At {:3d}s segment, {:5.1%} {}".format(
                    frame.time_offset.seconds,
                    frame.confidence,
                    label.entity.entity_id))
    # [END video_streaming_automl_classification_beta]


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
@@ -678,6 +747,12 @@ def stream_generator():
    video_streaming_annotation_to_storage_parser.add_argument('path')
    video_streaming_annotation_to_storage_parser.add_argument('output_uri')

    video_streaming_automl_classification_parser = subparsers.add_parser(
        'streaming-automl-classification',
        help=streaming_automl_classification.__doc__)
    video_streaming_automl_classification_parser.add_argument('path')
    video_streaming_automl_classification_parser.add_argument('model_path')

    args = parser.parse_args()

    if args.command == 'transcription':
@@ -700,3 +775,5 @@ def stream_generator():
        detect_explicit_content_streaming(args.path)
    elif args.command == 'streaming-annotation-storage':
        annotation_to_storage_streaming(args.path, args.output_uri)
    elif args.command == 'streaming-automl-classification':
        streaming_automl_classification(args.path, args.model_path)
video/cloud-client/analyze/beta_snippets_test.py (14 additions, 1 deletion)
@@ -1,6 +1,6 @@
#!/usr/bin/env python

# Copyright 2017 Google, Inc
# Copyright 2019 Google, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -16,6 +16,7 @@

from six.moves.urllib.request import urlopen
import time
import os
import uuid

import beta_snippets
@@ -160,3 +161,15 @@ def test_track_objects_gcs():
    assert text_exists
    assert object_annotations[0].frames[0].normalized_bounding_box.left >= 0.0
    assert object_annotations[0].frames[0].normalized_bounding_box.left <= 1.0


@pytest.mark.slow
def test_streaming_automl_classification(capsys, in_file):
    model_id = 'VCN6363999689846554624'
Contributor:
@anguillanneuf, responding here, but for model_id I'm not sure if there is a better way.

But for project_id you can typically use:

import os
os.environ['GCLOUD_PROJECT']

Member Author:
Done!

Contributor:
This still shows a literal model_id identifier here; I don't see it updated to use an environment variable.

Collaborator:
Sorry I'm late to the party here, but I just noticed that in_file isn't defined in this test; it's defined in test_track_objects_gcs(). Should its definition also be added to this function so that we're following the practice of making sure tests can run alone and/or in any order? @anguillanneuf

Member Author:
Thanks Leah for catching this! I had meant to use 'video_path' instead of 'in_file' because 'video_path' can be used as a fixture. Addressing it now.
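
A minimal sketch of a 'video_path'-style fixture, assuming a local test asset; the name and path are illustrative, not the final code:

import pytest

@pytest.fixture(scope='session')
def video_path():
    # Hypothetical local asset used by the streaming samples.
    return 'resources/cat.mp4'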

    model_path = 'projects/{}/locations/us-central1/models/{}'.format(
        os.environ['GCLOUD_PROJECT'],
        model_id,
    )
    beta_snippets.streaming_automl_classification(in_file, model_path)
    out, _ = capsys.readouterr()
    assert 'brush_hair' in out
video/cloud-client/analyze/requirements.txt (1 addition, 1 deletion)
@@ -1,2 +1,2 @@
google-cloud-videointelligence==1.8.0
google-cloud-videointelligence==1.11.0
google-cloud-storage==1.14.0
Binary file modified video/cloud-client/analyze/resources/googlework_short.mp4