
Getting Started with Google NLP API Using Python


Natural Language Processing (NLP) has been a revolution for search engines and SEO. NLP is the set of methods that lets machines understand human language. This matters because machines now perform the bulk of page evaluation, not humans. Although understanding some of the science behind NLP is useful, today’s tools let you apply NLP without a data-science degree. By understanding how machines interpret our content, we can correct misalignment or ambiguity.

This is a two-part series:

  1.  Process using user-entered text
  2.  Process a comparison between two different web pages

In this intermediate tutorial, I’ll walk you through basic implementations of four of the five Google NLP API features (excluding Syntax). Given a text, we will:

  • Identify Entities and generate salience scores
  • Calculate sentiment scores
  • Calculate sentiment magnitude
  • Categorize text

I recommend reading the Google NLP documentation for instructions on setting up Google Cloud Platform, enabling the NLP API, and configuring authentication: Google NLP documentation.

These scripts include modified portions from Google’s samples — no need to reinvent the wheel.

Requirements and Assumptions

  • Python 3 is installed and you understand basic Python syntax
  • Access to a Linux installation (I recommend Ubuntu) or Google Colab
  • A Google Cloud Platform account
  • NLP API enabled
  • A service account was created and its JSON key downloaded

Import Modules and Set Authentication

We’ll import several modules. In Google Colab they are preinstalled; otherwise, install the google-cloud-language client library with pip.

  • os – set the environment variable for credentials
  • google.cloud – Google’s NLP modules
  • numpy – used for a dictionary comparison function
  • matplotlib – used for scatter plots
import os

# These imports follow the google-cloud-language < 2.0 API
# (enums and types were removed in version 2.x)
from google.cloud import language_v1
from google.cloud.language_v1 import enums
from google.cloud.language import types

import numpy as np
import matplotlib.pyplot as plt

Next, set the environment variable that points to your credentials JSON file. Google uses this environment variable for authentication. The example assumes Google Colab (upload the file). On Linux (I use Ubuntu), add the following line to ~/.profile or ~/.bashrc and replace “path_to_json_credentials_file” as needed: export GOOGLE_APPLICATION_CREDENTIALS="path_to_json_credentials_file". Keep this JSON file secure.

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "path_to_json_credentials_file"

Now we’re ready to use the API. The code below is presented as a single block so you can see it in context. The text_content variable holds the text to analyze. I limit it to 1000 characters (one unit) because Google NLP API pricing is unit-based — avoid pasting very long texts to prevent unexpected charges. Next we initialize the NLP client, specify the document type (plain text), and set a language (optional; the API can auto-detect).
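Since pricing is unit-based, a small helper can estimate billable units before you send a request. This is a sketch under the assumption that each block of 1,000 characters counts as one unit, rounded up; confirm current rates on Google’s pricing page.

```python
import math

def billable_units(text, unit_size=1000):
    """Estimate Google NLP billable units: each block of unit_size
    characters (default 1,000) counts as one unit, rounded up."""
    return max(1, math.ceil(len(text) / unit_size))

print(billable_units("short text"))  # 1 unit
print(billable_units("x" * 2500))    # 3 units
```

A check like this is useful before looping the API over many pages, where character counts add up quickly.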

After packaging the request, we send it to Google’s NLP. We loop over the returned entities to print name, type, salience score, and metadata. Salience is rounded to three decimal places; adjust as needed.

Identify Entities

text_content = "The key to successful internet marketing is to make decisions that make sense for your business, your company and your customers. We work with you to build a custom strategy that drives both visits and conversions."

text_content = text_content[0:1000]

client = language_v1.LanguageServiceClient()

type_ = enums.Document.Type.PLAIN_TEXT

lang = "en"
document = {"content": text_content, "type": type_, "language": lang}

encoding_type = enums.EncodingType.UTF8

response = client.analyze_entities(document, encoding_type=encoding_type)

for entity in response.entities:
    print(u"Entity Name: {}".format(entity.name))

    print(u"Entity type: {}".format(enums.Entity.Type(entity.type).name))

    print(u"Salience score: {}".format(round(entity.salience,3)))

    for metadata_name, metadata_value in entity.metadata.items():
        print(u"{}: {}".format(metadata_name, metadata_value))

    print('\n')

Below is the example entity output for the text (from Wikipedia): “Summerfest, the largest music festival in the world, is also a large economic engine and cultural attraction for the city. In 2018, Milwaukee was named “The Coolest City in the Midwest” by Vogue magazine.”

The salience score measures an entity’s relative importance within the text. If salience scores don’t align with your content goals, adjust the text to steer the model. “MID” stands for machine ID — an identifier for the entity. Entities with mids indicate strong confidence and often correspond to entries in the Google Knowledge Graph.
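To compare salience at a glance, you can collect the (name, salience) pairs and rank them. This is a minimal sketch; the sample values below are hypothetical stand-ins for a live API response, not real output.

```python
def rank_by_salience(entities):
    """Rank (name, salience) pairs from most to least salient,
    so the entities the model treats as central appear first."""
    return sorted(entities, key=lambda e: e[1], reverse=True)

# Hypothetical sample values standing in for response.entities
sample = [("customers", 0.12), ("internet marketing", 0.42), ("business", 0.18)]
for name, salience in rank_by_salience(sample):
    print(f"{name}: {salience:.3f}")
```

In practice you would build the list inside the entity loop above instead of printing each entity as it arrives.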

[Image: entity output from the Google NLP API]

Calculate Sentiment Score

We’ll pass the document to client.analyze_sentiment(), which returns a sentiment score and magnitude. Here we process the score (magnitude is handled next). Scores are rounded to four decimal places. Sentiment score ranges from -1 (most negative) to 1 (most positive). The code maps score ranges to human-readable labels and prints the result. I also visualize the score using a simple scatter plot as a number line, coloring negative scores red and non-negative scores green; adjust as desired.

document = types.Document(
    content=text_content,
    type=enums.Document.Type.PLAIN_TEXT)

sentiment = client.analyze_sentiment(document=document).document_sentiment
sscore = round(sentiment.score,4)
smag = round(sentiment.magnitude,4)

if sscore <= -0.5:
  sent_label = "Very Negative"
elif sscore < 0:
  sent_label = "Negative"
elif sscore == 0:
  sent_label = "Neutral"
elif sscore < 0.5:
  sent_label = "Positive"
else:
  sent_label = "Very Positive"

print('Sentiment Score: {} is {}'.format(sscore,sent_label))

predictedY = [sscore]

if sscore < 0:
    plotcolor = 'red'
else:
    plotcolor = 'green'

plt.scatter(predictedY, np.zeros_like(predictedY),color=plotcolor,s=100)

plt.yticks([])
plt.subplots_adjust(top=0.9,bottom=0.8)
plt.xlim(-1,1)
plt.xlabel('Negative                                                            Positive')
plt.title("Sentiment Attitude Analysis")
plt.show()

Below is the sentiment output for the text we used for the entity analysis above. As you can see, it registers as slightly positive.

[Image: sentiment score output from the Google NLP API]

Calculate Sentiment Magnitude

Next we process and visualize sentiment magnitude. Magnitude quantifies the amount of emotional content in the text. The code labels magnitude: 0–1 as no/little emotion, 1–2 as low emotion, and 2+ as high emotion. Larger documents often have larger magnitudes, so adjust these thresholds as needed. The visualization follows the same approach used for the sentiment score.

if smag < 1:
  sent_m_label = "No Emotion"
elif smag < 2:
  sent_m_label = "Low Emotion"
else:
  sent_m_label = "High Emotion"

print('Sentiment Magnitude: {} is {}'.format(smag,sent_m_label))

predictedY = [smag]

if smag < 2:
    plotcolor = 'red'
else:
    plotcolor = 'green'

plt.scatter(predictedY, np.zeros_like(predictedY),color=plotcolor,s=100)

plt.yticks([])
plt.subplots_adjust(top=0.9,bottom=0.8)
plt.xlim(0,5)
plt.xlabel('Low Emotion                                                          High Emotion')
plt.title("Sentiment Magnitude Analysis")
plt.show()

Below is the sentiment magnitude for the text. Values near zero indicate little or neutral emotion.
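Because magnitude accumulates across a document, one rough way to compare texts of different lengths is to normalize it by sentence count. This is a sketch; the sentence count would come from the sentence-level results the sentiment response also returns, and the thresholds you apply to the normalized value are your own judgment call.

```python
def magnitude_per_sentence(magnitude, sentence_count):
    """Normalize overall sentiment magnitude by sentence count,
    so longer documents aren't flagged as emotional just for
    accumulating magnitude across many sentences."""
    if sentence_count <= 0:
        return 0.0
    return round(magnitude / sentence_count, 4)

# e.g. a 10-sentence page with overall magnitude 3.2:
print(magnitude_per_sentence(3.2, 10))  # 0.32
```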

[Image: sentiment magnitude output from the Google NLP API]

Calculate Categorization

Category analysis assigns the text to Google-defined categories when the API has sufficient confidence.

response = client.classify_text(document)

for category in response.categories:
    print(u"Category name: {}".format(category.name))
    print(u"Confidence: {}%".format(round(category.confidence * 100)))

Below is the calculated categorization. If the categories don’t match your intent, adjust the content and try again.
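Note that classify_text needs a minimum amount of text to work (Google’s documentation mentions a floor of roughly 20 tokens; very short inputs return an error or no categories). A simple pre-check can avoid wasted calls. This is a sketch: the whitespace word count is only an approximation of the API’s own tokenization.

```python
MIN_TOKENS = 20  # approximate documented minimum for classify_text

def can_classify(text, min_tokens=MIN_TOKENS):
    """Rough pre-check before calling classify_text, using a
    whitespace word count as a stand-in for API tokenization."""
    return len(text.split()) >= min_tokens

print(can_classify("Too short to classify."))  # False
```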

[Image: category output from the Google NLP API]

Here is the Google Colab notebook.

Now you have tools to identify entities, categorize text, and calculate sentiment and magnitude. In part two, we’ll apply these NLP tools to live web page content rather than pasted text, and compare two web pages. Stay tuned!

Google NLP and Python FAQ

How can I begin using the Google NLP API with Python?

To start using the Google NLP API with Python, set up a Google Cloud project, enable the NLP API, and obtain API credentials. Then install the google-cloud-language Python library.

What Python libraries are commonly used for interacting with the Google NLP API?

The primary Python library for interacting with the Google NLP API is google-cloud-language. Install this library to make API requests and process natural language data.

What tasks can be performed using the Google NLP API with Python?

The API supports various natural language processing tasks, including sentiment analysis, entity recognition, syntax analysis, and content classification. Python scripts can utilize these capabilities for language-related tasks.

Are there any authentication requirements for using the Google NLP API with Python?

Yes. Obtain API credentials (for example, a service account key) and set the required environment variables in your Python script to authenticate requests.

Where can I find comprehensive documentation and examples for using the Google NLP API with Python?

Refer to the official Google Cloud documentation for the NLP API, which includes detailed guides, reference material, and Python examples to help you get started.

Greg Bernhardt