Skip to content

Speech transcribe async bug #688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Nov 30, 2016
Merged

Speech transcribe async bug #688

merged 11 commits into from
Nov 30, 2016

Conversation

ryanmats
Copy link
Contributor

Fixed bugs for Speech API GRPC sample code:

  • Issues with import statements
  • Changed default encoding choice from FLAC to LINEAR16. Added code to print error message if there was an error with the transcribe_async long running operation. (As per Jerjou's suggestion)
  • Added tests for GRPC transcribe and transcribe_async samples. Code uploads the audio.raw blob to the default CLOUD_STORAGE_BUCKET, tests sample code using this blob's URI, and deletes the blob at the end.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Nov 29, 2016
@theacodes
Copy link
Contributor

theacodes commented Nov 29, 2016

@ryanmats all of the speech test resources should be pre-uploaded, so there should be no need to upload/delete the blob every time (see scripts/prepare-testing-project.sh L47).

@theacodes
Copy link
Contributor

Other than the note about the tests, this LGTM. @jerjou can you take a quick look before we merge?

if operation.error.message:
print("")
print("Operation error:")
print(operation.error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? Jon didn't wasn't like, "Please use single quotes for strings"?? WHO ARE YOU AND WHAT HAVE YOU DONE TO @jonparrott ?!? :)

I'd just combine this into one string, for brevity:

print('\nOperation error:\n{}'.format(operation.error))

We should be paid proportional to like, the inverse of lines of code :) Except then we'd be incentivized to not write any code at all... hm...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I blame my lack of sleep.

assert re.search(r'how old is the Brooklyn Bridge', out, re.DOTALL | re.I)

# Delete audio file from storage bucket
blob.delete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 just have the asset on the GCS bucket already :)

But for future reference, if you're going to do this, should probably put this in a context manager (so that eg if the assertion fails, the blob is still deleted), or in one of those fancy pytest fixtures (same idea)

@jerjou
Copy link
Contributor

jerjou commented Nov 29, 2016

Minor comments. Mainly just Jon's thing :) Otherwise, looks good.

@ryanmats
Copy link
Contributor Author

@jonparrott @jerjou I pushed the changes based on your code review comments. Also updated the speech sample audio filename in scripts/prepare-testing-project.sh.

@@ -44,7 +44,7 @@ echo "Creating pubsub resources."
gcloud alpha pubsub topics create gae-mvm-pubsub-topic

echo "Creating speech resources."
gsutil cp speech/api/resources/audio.flac gs://$GCLOUD_PROJECT/speech/
gsutil cp speech/api-client/resources/audio.raw gs://$GCLOUD_PROJECT/speech/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there other tests that depend on audio.flac?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just searched through the repo and it doesn't look like there are. Also, the audio.flac file doesn't exist in that directory anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to set -e on the top of this file, so that the build will fail in the future when this happens..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(doesn't have to be in this CL - just saying it'd probably be a good idea..)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in this PR

def test_main(resource, capsys):

# Run the transcribe_async sample on audio.raw, verify correct results
speech_storage_uri = 'gs://' + os.environ.get('CLOUD_STORAGE_BUCKET')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the cloud_config or remote_resource fixture.

@@ -26,12 +26,15 @@ def __init__(self, **kwargs):
@pytest.fixture(scope='session')
def cloud_config():
"""Provides a configuration object as a proxy to environment variables."""
speech_storage_uri = 'gs://' + os.environ.get('CLOUD_STORAGE_BUCKET')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ryanmats there's already the storage_bucket thing here, there isn't any need to add another one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

def test_main(resource, capsys, cloud_config):

# Run the transcribe sample on audio.raw, verify correct results
speech_storage_uri = 'gs://' + cloud_config.storage_bucket
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol okay, I promise I'm not doing this just to annoy you:

Prefer:

storage_uri = 'gs://{}/speech/audio.raw'.format(cloud_config.storage_bucket)

(Here and in the other test file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@theacodes theacodes merged commit ff54b24 into master Nov 30, 2016
@theacodes theacodes deleted the speech-transcribe-async-bug branch November 30, 2016 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants