27 Jul 2023

OpenAI/GPT: Returning consistent/valid JSON from a prompt

When using OpenAI it can be tricky to get it to return a consistent response for a prompt. In this blog post, we’re going to learn how to use functions to return a consistent JSON format for a basic sentiment analysis prompt.

I’ve created a video showing how to do this on my YouTube channel, Learn Data with Mark, so if you prefer to consume content through that medium, I’ve embedded it below:

Let’s start by installing OpenAI and Pandas:

BASH pip install openai pandas

And now, we’ll open a Python REPL and import the following libraries:

PYTHON import openai
import json
import pandas as pd

We’re going to analyse the sentiment of some reviews from goodreads about The Culture Map book. I’ve put these into a CSV file, which we can load using the following code:

PYTHON reviews_df = pd.read_csv("reviews.csv")
reviews = reviews_df['review'].tolist()
reviews

Output

["Possibly the worst book I've ever read.It's a huge collection of biases for all the possible countries and cultures. The whole book is structured with examples like: if you are working with Chinese people, you should take this approach, instead if your team is composed by German people you should do this etc....",
 'A book full of oversimplifications, generalisations and self-contradiction. Plus many of the examples felt simply made up. Although it had one or two good ideas thrown in there, I am honestly not sure if this book can hardly help anyone.',
 'I had it on my recommendations list for a long time, but my impression was always like: "damn, I don\'t need a book on cultural differences; I\'ve worked in many international enterprises, I have been trained, I have practical experience - it would be just a waste of time". In the end, it wasn\'t (a waste of time).',
 'Candidate for the best book I have read in 2016 unless another one can beat it. The author made is fun to read with great examples that I could easily relate to.',
 'A practical and comprehensive guide to how different cultures should be approached regarding business relations, but it can also be used outside of that.',
 'The book was OK. It offers a good overview of differences between cultures. Sometimes we may assume that 2 cultures are similar, but in the end there is a possibility of conflict, because they have different "mentality" on a certain point (trust or time perception, for instance). But Erin often limits herself to personal stories and doesn\'t cite almost any researcher or study.']

We’re then going to write a function that computes the sentiment for a list of reviews using OpenAI:

PYTHON def analyse_reviews(user_input):
    prompt = f""" (1)
    {user_input}
    Analyse the sentiment of the reviews above and return a JSON array as the result.
Provide sentiment on a scale of 1-100?
The JSON must have these fields: sentiment, sentiment_score.
    """
    completion = openai.ChatCompletion.create( (2)
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful review analysis tool."},
            {"role": "user", "content": prompt},
        ]
    )
    try:
        generated_text = completion.choices[0].message.content (3)
        return json.loads(generated_text)
    except Exception as e:
        print(f"An error occurred: {e}")
        return None

1	Create a prompt asking GPT to return the sentiment of provided reviews
2	Create and execute prompt
3	Parse result

We can then run the prompt like this:

PYTHON analyse_reviews(reviews)

If we run that a few times, we’ll likely get a different result each time:

Output

{'sentiment': 'mixed', 'sentiment_score': 64.2}

Output

[{'sentiment': 'negative', 'sentiment_score': 15},
 {'sentiment': 'negative', 'sentiment_score': 25},
 {'sentiment': 'positive', 'sentiment_score': 70},
 {'sentiment': 'positive', 'sentiment_score': 85},
 {'sentiment': 'positive', 'sentiment_score': 90},
 {'sentiment': 'neutral', 'sentiment_score': 50}]

Output

{'reviews': [{'sentiment': 'negative', 'sentiment_score': 12},
  {'sentiment': 'negative', 'sentiment_score': 22},
  {'sentiment': 'neutral', 'sentiment_score': 50},
  {'sentiment': 'positive', 'sentiment_score': 85},
  {'sentiment': 'positive', 'sentiment_score': 70},
  {'sentiment': 'neutral', 'sentiment_score': 65}]}

Output

{'reviews': [{'review': "Possibly the worst book I've ever read.It's a huge collection of biases for all the possible countries and cultures. The whole book is structured with examples like: if you are working with Chinese people, you should take this approach, instead if your team is composed by German people you should do this etc....",
   'sentiment': 'negative',
   'sentiment_score': 15},
  {'review': 'A book full of oversimplifications, generalisations and self-contradiction. Plus many of the examples felt simply made up. Although it had one or two good ideas thrown in there, I am honestly not sure if this book can hardly help anyone.',
   'sentiment': 'negative',
   'sentiment_score': 20},
  {'review': 'I had it on my recommendations list for a long time, but my impression was always like: "damn, I don\'t need a book on cultural differences; I\'ve worked in many international enterprises, I have been trained, I have practical experience - it would be just a waste of time". In the end, it wasn\'t (a waste of time).',
   'sentiment': 'positive',
   'sentiment_score': 80}]}

These are just 4 of the responses that I saw after running this multiple times. Either way, we need a more deterministic response.

Lucky for us, OpenAI recently added the concept of functions. With functions, you can tell OpenAI that you’re going to pass its response to a function that takes in a specific JSON structure. We aren’t going to pass the result to a function, but we’ll co-opt this functionality to get ourselves a consistent response!

Using this functionality, we end up with the following function:

PYTHON def analyse_reviews(user_input):
    prompt = f"""
    {user_input}
    Analyse the sentiment of the reviews above and return a JSON array as the result.
Provide sentiment on a scale of 1-100?
The JSON must have these fields: sentiment, sentiment_score.
    """
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful review analysis tool."},
            {"role": "user", "content": prompt},
        ],
        functions=[{"name": "dummy_fn_set_sentiment", "parameters": { (1)
          "type": "object",
          "properties": { (2)
            "sentiments": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "sentiment": {"type": "string", "description": "Sentiment of the review"},
                  "sentiment_score": {"type": "integer","description": "Score between 1-100 of the sentiment"}
                }
              }
            }
          }
        }}],
    )
    try:
        generated_text = completion.choices[0].message.function_call.arguments
        return json.loads(generated_text)
    except Exception as e:
        print(f"An error occurred: {e}")
        return None

1	Specify a function name (we’ll use a made-up name since we aren’t going to call it)
2	Define the arguments expected by this function in JSON schema format.

We can then call this function the same way as we did with the other one:

PYTHON sentiments = analyse_reviews(reviews)
sentiments

Output

{'sentiments': [{'sentiment': 'negative', 'sentiment_score': 30},
  {'sentiment': 'negative', 'sentiment_score': 25},
  {'sentiment': 'positive', 'sentiment_score': 70},
  {'sentiment': 'positive', 'sentiment_score': 90},
  {'sentiment': 'positive', 'sentiment_score': 80},
  {'sentiment': 'neutral', 'sentiment_score': 50}]}

We won’t get the same sentiment or sentiment_score each time, but the structure will be consistent. Finally, let’s put everything together into a nice DataFrame:

PYTHON sentiment_df = pd.DataFrame(sentiments["sentiments"])
result = pd.concat([reviews_df, sentiment_df], axis=1)
pd.set_option('max_colwidth', 100)
result

Table 1. Output
review	sentiment	sentiment_score
Possibly the worst book I’ve ever read.It’s a huge collection of biases for all the possible cou…	negative	30
A book full of oversimplifications, generalisations and self-contradiction. Plus many of the exa…	negative	25
I had it on my recommendations list for a long time, but my impression was always like: "damn, I…	positive	70
Candidate for the best book I have read in 2016 unless another one can beat it. The author made …	positive	90
A practical and comprehensive guide to how different cultures should be approached regarding bus…	positive	80
The book was OK. It offers a good overview of differences between cultures. Sometimes we may ass…	neutral	50

About the author

I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.