Skip to content

Add initial Sentiment Analysis tutorial to Natural Language samples #524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

manshreck
Copy link

This sample will be used within the Sentiment Analysis Tutorial at:

https://cloud.google.com/natural-language/docs/sentiment-tutorial

This pull does not yet include [SECTION] tags, which I will add subsequently.

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Sep 14, 2016
## Download the Code

```
git clone sso://python-dev-samples/language/sentiment/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,27 @@
# Introduction
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file should have an .md extension

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


This sample contains the code referenced in the
[Sentiment Analysis Tutorial](http://cloud.google.com/natural-language/docs/sentiment-tutorial)
within the Natural Language API Documentation. A full walkthrough of this sample
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First mention of the product should use the entire (verbose) name

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

git clone sso://python-dev-samples/language/sentiment/
```

### Usage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably add a section for creating a virtualenv and installing the requirements. example

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really necessary? I don't see that in a lot of other samples, and I think it will add a lot of implementation detail that's unnecessary. For complex samples, using virtualenv and a requirements.txt file is good practice, but I don't quite agree it should be always the case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, every sample should use a virtualenv. Not using a virtualenv is like going out in the rain without an umbrella.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show me a README that you wish to use as the canonical README for that purpose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://go/python-samples-guide#canonical-sample has apparently been factored away, but you can check out python-docs-samples/storage for its README

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also added requirements.txt

@@ -0,0 +1,20 @@
I really wanted to love 'Bladerunner' but ultimately I couldn't get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename the sentiment dir to resources, for consistency with the other samples

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "sentiment" dir refers to this tutorial, and I'd like to keep it named for what it is. Did you mean move the test files into a subdirectory: "sentiment/resources"? I can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - for all the other python samples, files used by the code for testing, etc, are in a directory named resources. Unless there's a good reason to deviate, having consistency between our samples is valuable. Can you update your tutorial to say "resources" instead of "sentiment"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing my point. All of these files are already in a "language/sentiment" directory. I'm not renaming that to "language/resources." If you want me to put these text files into a subdirectory so that they are in a "language/sentiment/resources" directory, that I can do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, language/sentiment/resources would be preferable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

http = httplib2.Http()
credentials.authorize(http)

service = discovery.build('language', 'v1beta1', http=http)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just do:

credentials = GoogleCredentials.get_application_default()
service = discovery.build('language', 'v1beta1', credentials=credentials)

The python client lib will take care of adding the right scope, and authorizing the http object.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the former way even though it is more steps, because it emphasizes that you first get credentials, and then authorize them explicitly via http (and that's how I document it in the tutorial).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the idiomatic way that @jerjou suggested, there's absolutely no reason to expose the http client to the user, and we're actively trying to remove httplib2. The less samples I have to clean up after that, the better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

try:
polarity = response['documentSentiment']['polarity']
magnitude = response['documentSentiment']['magnitude']
except KeyError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just let the KeyError propagate up. Error handling doesn't really provide much educational value IMO..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, but there are competing philosophies about this. I, for one, don't like to see typical Python stack traces for known error cases, even if they aren't handled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I think it comes down to trading off teaching about best practices, vs keeping the signal / noise ratio high (where 'signal' is 'stuff that's relevant to the api I'm trying to learn about'). My feeling is that sample code should be as focused as is prudent on the specific thing you're trying to teach, so the user doesn't have to wade through boilerplate to get at the point. More general best practices can be covered in other venues, where they can be given the exposition they deserve.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ackknowledged.

polarity = response['documentSentiment']['polarity']
magnitude = response['documentSentiment']['magnitude']
except KeyError:
print("The response did not contain the expected fields.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: prefer single quotes over double quotes

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

except KeyError:
print("The response did not contain the expected fields.")
print('Sentiment: polarity of %s with magnitude of %s'
% (polarity, magnitude))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using the 'string {}'.format(arg) syntax

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done



if __name__ == '__main__':
parser = argparse.ArgumentParser()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a module-level docstring and set that as the description. See http://go/python-sample-guide#argparse-section

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@manshreck manshreck removed the cla: no This human has *not* signed the Contributor License Agreement. label Sep 15, 2016
@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Sep 15, 2016
@googlebot
Copy link

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Sep 15, 2016
@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@googlebot googlebot added cla: no This human has *not* signed the Contributor License Agreement. and removed cla: yes This human has signed the Contributor License Agreement. labels Sep 15, 2016

```
$ gcloud init
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think this won't work. Does gcloud init set up a service account for you? IIUC, without setting up a service account, this will attempt to use the gcloud auth, which doesn't have the api enabled.


## Prerequisites

1. Install the [Google Cloud SDK](https://cloud.google.com/sdk/), including the [gcloud tool](https://cloud.google.com/sdk/gcloud/), and [gcloud app component](https://cloud.google.com/sdk/gcloud-app).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step isn't necessary anymore

Copy link
Contributor

@jerjou jerjou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, after removing step 1

@jerjou
Copy link
Contributor

jerjou commented Sep 16, 2016

@dpebot could you kindly merge this pull request once travis has given its approval? Many thanks! Cheerio and all the best in all your endeavors!

@theacodes
Copy link
Contributor

@jerjou I'm gonna have to add that to his trigger phrases.

@theacodes
Copy link
Contributor

@dpebot merge when travis passes

#triggered.

@dpebot
Copy link
Collaborator

dpebot commented Sep 16, 2016

Okay! I'll merge when all statuses are green.

@dpebot dpebot added the automerge Merge the pull request once unit tests and other checks pass. label Sep 16, 2016
@jerjou
Copy link
Contributor

jerjou commented Sep 17, 2016

Still can't merge it :-(

@theacodes
Copy link
Contributor

@jerjou I merged your other PR, is this one still needed?

@theacodes theacodes closed this Sep 19, 2016
@jerjou
Copy link
Contributor

jerjou commented Sep 19, 2016

Yeah - this one can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Merge the pull request once unit tests and other checks pass. cla: no This human has *not* signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants