
Add sentiment analysis sample #533

Merged
merged 7 commits into from
Sep 19, 2016
4 changes: 4 additions & 0 deletions language/README.md
@@ -13,5 +13,9 @@ to extract text from images, then uses the NL API to extract entity information
from those texts, and stores the extracted information in a database in support
of further analysis and correlation.

- [sentiment](sentiment) contains the [Sentiment Analysis
Tutorial](https://cloud.google.com/natural-language/docs/sentiment-tutorial)
code as used within the documentation.

- [syntax_triples](syntax_triples) uses syntax analysis to find
subject-verb-object triples in a given piece of text.
48 changes: 48 additions & 0 deletions language/sentiment/README.md
@@ -0,0 +1,48 @@
# Introduction

This sample contains the code referenced in the
[Sentiment Analysis Tutorial](https://cloud.google.com/natural-language/docs/sentiment-tutorial)
within the Google Cloud Natural Language API Documentation. A full walkthrough of this sample
is located within the documentation.

This sample is a simple illustration of how to construct a sentiment analysis
request and process a response using the API.

## Prerequisites

Set up your
[Cloud Natural Language API project](https://cloud.google.com/natural-language/docs/getting-started#set_up_a_project),
which includes:

* Enabling the Natural Language API
* Setting up a service account
* Setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable so that the
sample can authenticate to the service.
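
Before running the sample, it can help to confirm that the credentials variable is actually set. The following is a minimal sketch, not part of the sample itself: the helper name `check_credentials_env` is hypothetical, and it assumes that `GOOGLE_APPLICATION_CREDENTIALS` holds the path to a service account key file.

```python
import os


def check_credentials_env(environ=None):
    """Return the credentials path, or raise if it is unset or missing.

    Hypothetical helper: the sample does not ship this function.
    """
    environ = os.environ if environ is None else environ
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
    if not path:
        raise RuntimeError('GOOGLE_APPLICATION_CREDENTIALS is not set.')
    if not os.path.isfile(path):
        raise RuntimeError('No credentials file found at {}'.format(path))
    return path
```

Calling `check_credentials_env()` with no arguments inspects the current process environment; passing a dictionary makes the check easy to exercise in isolation.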

## Download the Code

```
$ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
$ cd python-docs-samples/language/sentiment
```

## Run the Code

From the sample folder, create a virtualenv and install the dependencies:

```
$ virtualenv env
$ source env/bin/activate
(env)$ pip install -r requirements.txt
```

### Usage

The `resources` directory contains four sample movie reviews
(`pos.txt`, `neg.txt`, `mixed.txt`, and `neutral.txt`) that you can
pass to the sample on the command line. You can also pass your own
text files.

```
(env)$ python sentiment_analysis.py textfile.txt
Sentiment: polarity of -0.1 with magnitude of 6.7
```
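
The output reports two numbers: polarity (the sign indicates positive or negative sentiment) and magnitude (the overall strength of sentiment, regardless of sign). One rough way to read them is sketched below. The function name `describe_sentiment` is hypothetical, and the cutoffs of 0.3 and 2.0 are assumptions borrowed from the bounds this sample's tests check, not values defined by the API.

```python
def describe_sentiment(polarity, magnitude,
                       polarity_cutoff=0.3, magnitude_cutoff=2.0):
    """Map a (polarity, magnitude) pair to a coarse label.

    The cutoffs are illustrative assumptions, not API-defined thresholds.
    """
    if abs(polarity) <= polarity_cutoff:
        # Weak overall polarity: little emotion reads as neutral, while
        # strong emotion suggests positive and negative passages that
        # cancel each other out.
        return 'neutral' if magnitude <= magnitude_cutoff else 'mixed'
    return 'positive' if polarity > 0 else 'negative'
```

With these assumed cutoffs, the example output shown (polarity -0.1, magnitude 6.7) would be labelled `mixed`.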
2 changes: 2 additions & 0 deletions language/sentiment/requirements.txt
@@ -0,0 +1,2 @@
google-api-python-client==1.5.3

20 changes: 20 additions & 0 deletions language/sentiment/resources/mixed.txt
@@ -0,0 +1,20 @@
I really wanted to love 'Bladerunner' but ultimately I couldn't get
myself to appreciate it fully. However, you may like it if you're into
science fiction, especially if you're interested in the philosophical
exploration of what it means to be human or machine. Some of the gizmos
like the flying cars and the Voight-Kampff machine (which seemed very
steampunk), were quite cool.

I did find the plot pretty slow, but the dialogue and action sequences
were good. Unlike most science fiction films, this one was mostly quiet, and
not all that much happened, except during the last 15 minutes. I didn't
understand why a unicorn was in the movie. The visual effects were fantastic,
however, and the musical score and overall mood was quite interesting.
A futurist Los Angeles that was both highly polished and also falling apart
reminded me of 'Outland.' Certainly, the style of the film made up for
many of its pedantic plot holes.

If you want your sci-fi to be lasers and spaceships, 'Bladerunner' may
disappoint you. But if you want it to make you think, this movie may
be worth the money.

4 changes: 4 additions & 0 deletions language/sentiment/resources/neg.txt
@@ -0,0 +1,4 @@
What was Hollywood thinking with this movie! I hated,
hated, hated it. BORING! I went afterwards and demanded my money back.
They refused.

3 changes: 3 additions & 0 deletions language/sentiment/resources/neutral.txt
@@ -0,0 +1,3 @@
I neither liked nor disliked this movie. Parts were interesting, but
overall I was left wanting more. The acting was pretty good.

11 changes: 11 additions & 0 deletions language/sentiment/resources/pos.txt
@@ -0,0 +1,11 @@
`Bladerunner` is often touted as one of the best science fiction films ever
made. Indeed, it satisfies many of the requisites for good sci-fi: a future
world with flying cars and humanoid robots attempting to rebel against their
creators. But more than anything, `Bladerunner` is a fantastic exploration
of the nature of what it means to be human. If we create robots which can
think, will they become human? And if they do, what makes us unique? Indeed,
how can we be sure we're not human in any case? `Bladerunner` explored
these issues before such movies as `The Matrix,' and did so intelligently.
The visual effects and score by Vangelis set the mood. See this movie
in a dark theatre to appreciate it fully. Highly recommended!

54 changes: 54 additions & 0 deletions language/sentiment/sentiment_analysis.py
@@ -0,0 +1,54 @@
# Copyright 2016, Google, Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

'''Demonstrates how to make a simple call to the Natural Language API'''

import argparse
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials


def main(movie_review_filename):
    '''Run a sentiment analysis request on text within a passed filename.'''

    credentials = GoogleCredentials.get_application_default()
    service = discovery.build('language', 'v1beta1', credentials=credentials)

    with open(movie_review_filename, 'r') as review_file:
        service_request = service.documents().analyzeSentiment(
            body={
                'document': {
                    'type': 'PLAIN_TEXT',
                    'content': review_file.read(),
                }
            }
        )
        response = service_request.execute()

    polarity = response['documentSentiment']['polarity']
    magnitude = response['documentSentiment']['magnitude']

    print('Sentiment: polarity of {} with magnitude of {}'.format(
        polarity, magnitude))
    return 0


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument(
        'movie_review_filename',
        help='The filename of the movie review you\'d like to analyze.')
    args = parser.parse_args()
    main(args.movie_review_filename)
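
The request body is a plain dictionary, so it can be built and inspected without credentials or a network call. A minimal sketch, assuming a hypothetical helper named `build_sentiment_request_body` that is not part of this change:

```python
def build_sentiment_request_body(text):
    """Build the analyzeSentiment request body for a plain-text document.

    Hypothetical helper: it mirrors the dictionary that main() passes
    as the `body` argument.
    """
    return {
        'document': {
            'type': 'PLAIN_TEXT',
            'content': text,
        }
    }
```

With such a helper, the call in `main` could read `service.documents().analyzeSentiment(body=build_sentiment_request_body(review_file.read()))`, which keeps the file handling separate from the request construction.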
46 changes: 46 additions & 0 deletions language/sentiment/sentiment_analysis_test.py
@@ -0,0 +1,46 @@
# Copyright 2016, Google, Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import re
from sentiment_analysis import main


def test_pos(resource, capsys):
    main(resource('pos.txt'))
    out, err = capsys.readouterr()
    polarity = float(re.search(r'polarity of (.+?) with', out).group(1))
    # The magnitude group must be greedy: a lazy (.+?) at the very end of
    # the pattern would capture only the first character of the number.
    magnitude = float(re.search(r'magnitude of (.+)', out).group(1))
    assert polarity * magnitude > 0


def test_neg(resource, capsys):
    main(resource('neg.txt'))
    out, err = capsys.readouterr()
    polarity = float(re.search(r'polarity of (.+?) with', out).group(1))
    magnitude = float(re.search(r'magnitude of (.+)', out).group(1))
    assert polarity * magnitude < 0


def test_mixed(resource, capsys):
    main(resource('mixed.txt'))
    out, err = capsys.readouterr()
    polarity = float(re.search(r'polarity of (.+?) with', out).group(1))
    assert polarity <= 0.3
    assert polarity >= -0.3


def test_neutral(resource, capsys):
    main(resource('neutral.txt'))
    out, err = capsys.readouterr()
    magnitude = float(re.search(r'magnitude of (.+)', out).group(1))
    assert magnitude <= 2.0
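
Parsing the sample's output line with a regular expression has one pitfall: a lazy group such as `(.+?)` placed at the very end of a pattern matches a single character, so `6.7` would be read as `6`. The sketch below shows one way to parse both values at once. The helper name `parse_sentiment_line` is hypothetical, and the pattern assumes the exact output format printed by `main`.

```python
import re

# \S+ consumes the whole number and stops at whitespace or end of line.
_OUTPUT_PATTERN = re.compile(
    r'polarity of (?P<polarity>\S+) with magnitude of (?P<magnitude>\S+)')


def parse_sentiment_line(line):
    """Return (polarity, magnitude) as floats from the sample's output."""
    match = _OUTPUT_PATTERN.search(line)
    if match is None:
        raise ValueError('Unrecognized output: {!r}'.format(line))
    return float(match.group('polarity')), float(match.group('magnitude'))


# The lazy group at the end of the pattern captures one character only.
lazy_capture = re.search('magnitude of (.+?)', 'magnitude of 6.7').group(1)
```

Here `lazy_capture` is the string `'6'`, whereas `parse_sentiment_line` recovers the full value.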