Live training | 2023-03-31 | Getting started with the OpenAI API and ChatGPT

While chatting with GPT is commonly done via the ChatGPT web interface, today we are looking at using the API. This is great for programming and automation workflows (more ideas on that later). We'll cover:

  • Getting set up with an OpenAI developer account and integration with Workspace.
  • Calling the chat functionality in the OpenAI API.
  • Extracting the response text.
  • Holding a longer conversation.
  • Combining the OpenAI API with other APIs.

Before you begin

Create a developer account with OpenAI

  1. Go to the API signup page.

  2. Create your account (you'll need to provide your email address and your phone number).

  3. Go to the API keys page.

  4. Create a new secret key.

  5. Take a copy of it. (If you lose it, delete the key and create a new one.)

Add a payment method

OpenAI sometimes provides free credits for the API, but it's not clear if that is worldwide or what the conditions are. You may need to add debit/credit card details.

The API costs $0.002 / 1000 tokens for GPT-3.5-turbo. 1000 tokens is about 750 words. Today's session should cost less than 2 US cents (but if you rerun tasks, you will be charged every time).
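The back-of-the-envelope numbers above can be checked in a couple of lines. The session token count here is an assumption, purely for illustration:

```python
# Pricing quoted above: $0.002 per 1000 tokens for gpt-3.5-turbo
price_per_1k_tokens = 0.002

# 1000 tokens is about 750 words, so assume a generous 4000-token session
session_tokens = 4000

cost_usd = session_tokens / 1000 * price_per_1k_tokens
print(f"${cost_usd:.4f}")  # → $0.0080, i.e. less than 2 US cents
```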

  1. Go to the Payment Methods page.

  2. Click Add payment method.

  3. Fill in your card details.

Set up a Workspace integration

  1. In Workspace, click on Integrations.
  2. Click on the "Create integration" plus button.
  3. Select an "Environment Variables" integration.
  4. In the "Name" field, type "OPENAI". In the "Value" field, paste in your secret key.
  5. Click "Create", and connect the new integration.
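Once the integration is connected, the secret key shows up in your notebook as an environment variable named OPENAI. A quick, safe sanity check (this sketch never prints the full key):

```python
import os

# Look up the key set by the "Environment Variables" integration;
# .get() returns None instead of raising if it isn't connected yet
key = os.environ.get("OPENAI")

if key is None:
    print("OPENAI is not set - check that the integration is connected")
else:
    # Show only a masked preview, never the full secret
    print(f"OPENAI is set: {key[:3]}... ({len(key)} characters)")
```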

Task 0: Setup

To use GPT, we need to import the os and openai packages, and some functions from IPython.display to render Markdown. A later task will also use Yahoo! Finance data via the yfinance package.

We also need to put the environment variable we just created in a place that the openai package can see it.

Instructions

  • Import the os package.
  • Import the openai package.
  • Import the yfinance package with the alias yf.
  • From the IPython.display package, import display and Markdown.
  • Set openai.api_key to the OPENAI environment variable.
# Import the os package


# Import the openai package


# Import yfinance as yf


# From the IPython.display package, import display and Markdown


# Set openai.api_key to the OPENAI environment variable

Task 1: Get GPT to create a dataset

It's time to chat! Having a conversation with GPT involves a single function call of this form.

response = openai.ChatCompletion.create(
    model="MODEL_NAME",
    messages=[
        {"role": "system", "content": 'SPECIFY HOW THE AI ASSISTANT SHOULD BEHAVE'},
        {"role": "user", "content": 'SPECIFY WHAT YOU WANT THE AI ASSISTANT TO SAY'}
    ]
)

There are a few things to unpack here.

The model names are listed in the Model Overview page of the developer documentation. Today we'll be using gpt-3.5-turbo, which is the latest model used by ChatGPT that has broad public API access.

If you have access to GPT-4, you can use that instead by setting model="gpt-4", though note that the price is 15 times higher.

There are three types of message, documented in the Introduction to the Chat documentation:

  • system messages describe the behavior of the AI assistant. If you don't know what you want, try "You are a helpful assistant".
  • user messages describe what you want the AI assistant to say. We'll cover examples of this today.
  • assistant messages describe previous responses in the conversation. We'll cover how to have an interactive conversation in later tasks.

The first message should be a system message. Additional messages should alternate between user and assistant.
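As a concrete illustration of that ordering, here is a conversation history with all three roles (the content is made up for the example):

```python
# System message first, then user / assistant messages alternating
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a token?"},
    {"role": "assistant", "content": "A token is a chunk of text, roughly three quarters of a word."},
    {"role": "user", "content": "So about how many tokens is a 750-word article?"},
]

roles = [m["role"] for m in messages]
print(roles)  # → ['system', 'user', 'assistant', 'user']
```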

Pro tip

GPT-4 is more "steerable" than GPT-3.5-turbo. That means that it can play a wider range of roles more convincingly. For API usage, it means that the system message has a larger effect on the output conversation in GPT-4 compared to GPT-3.5-turbo.

Pro tip

If you are worried about the price of API calls, you can also set a max_tokens argument to limit the amount of output created.

Instructions

  • Define the system message, system_msg as

'You are a helpful assistant who understands data science.'

  • Define the user message, user_msg as:

'Create a small dataset of data about people. The format of the dataset should be a data frame with 5 rows and 3 columns. The columns should be called "name", "height_cm", and "eye_color". The "name" column should contain randomly chosen first names. The "height_cm" column should contain randomly chosen heights, given in centimeters. The "eye_color" column should contain randomly chosen eye colors, taken from a choice of "brown", "blue", and "green". Provide Python code to generate the dataset, then provide the output in the format of a markdown table.'

  • Ask GPT to create a dataset using the gpt-3.5-turbo model. Assign to response.
# Define the system message


# Define the user message


# Create a dataset using GPT
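A sketch of one way to complete the cell above. The create() call itself is left commented out here, because it requires a valid API key and is billed on every run; max_tokens (from the pro tip) is included as an optional cost cap:

```python
# Define the system message
system_msg = 'You are a helpful assistant who understands data science.'

# Define the user message
user_msg = ('Create a small dataset of data about people. The format of the '
            'dataset should be a data frame with 5 rows and 3 columns. The '
            'columns should be called "name", "height_cm", and "eye_color". '
            'The "name" column should contain randomly chosen first names. '
            'The "height_cm" column should contain randomly chosen heights, '
            'given in centimeters. The "eye_color" column should contain '
            'randomly chosen eye colors, taken from a choice of "brown", '
            '"blue", and "green". Provide Python code to generate the '
            'dataset, then provide the output in the format of a markdown '
            'table.')

messages = [
    {"role": "system", "content": system_msg},
    {"role": "user", "content": user_msg},
]

# Create a dataset using GPT. Uncomment to make the (billable) API call;
# needs openai imported and openai.api_key set as in Task 0.
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=messages,
#     max_tokens=500,  # optional cost cap (see the pro tip above)
# )
```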

Task 2: Check the response is OK

API calls are "risky" because problems can occur outside of your notebook: internet connectivity issues, a problem with the server sending you data, or running out of API credit. You should check that the response you get is OK.

GPT models return a status code with one of four values, documented in the Response format section of the Chat documentation.

  • stop: API returned complete model output
  • length: Incomplete model output due to max_tokens parameter or token limit
  • content_filter: Omitted content due to a flag from our content filters
  • null: API response still in progress or incomplete

The GPT API sends data to Python in JSON format, so the response variable contains deeply nested lists and dictionaries. It's a bit of a pain to work with!

For a response variable named response, the status code is stored in response["choices"][0]["finish_reason"].

Pro tip

If you prefer to work with dataframes rather than nested lists and dictionaries, you can flatten the output to a single row dataframe with the following code.

import pandas as pd

pd.json_normalize(response, "choices", ['id', 'object', 'created', 'model', 'usage'])
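To see what the flattened frame looks like without spending API credit, you can run the same call on a hand-built dictionary shaped like a real response (all field values below are made up):

```python
import pandas as pd

# A made-up response with the same shape as the real API JSON
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1680000000,
    "model": "gpt-3.5-turbo",
    "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30},
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello!"},
        }
    ],
}

df = pd.json_normalize(response, "choices", ["id", "object", "created", "model", "usage"])
print(df.shape)  # one row per element of "choices"
print(df["finish_reason"].iloc[0])
```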

Instructions

  • Check the status code of the response variable.
# Check the status code of the response variable
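Here is what the check looks like, run against a hand-built dictionary with the same shape as a real response (with the real response from Task 1, the indexing line is identical):

```python
# A hand-built stand-in for `response`, in the same shape as the real API JSON
response = {
    "choices": [
        {
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "...model output..."},
        }
    ]
}

# Check the status code of the response variable
status = response["choices"][0]["finish_reason"]
print(status)  # → stop
```

A value of "stop" means the model finished cleanly; any of the other three values deserves a closer look before you use the output.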

Task 3: Extract the AI assistant's message

Buried within the response variable is the text we asked GPT to generate. Luckily, it's always in the same place.

response["choices"][0]["message"]["content"]

Assigning this to a variable named content, you can print it as usual with print(content); but since it's Markdown, Jupyter notebooks can render it nicely via display(Markdown(content)).
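Putting the extraction together, again with a hand-built response stand-in (the content value is made up; the display(Markdown(...)) line is commented out since it only renders inside a notebook, after the Task 0 imports):

```python
# A hand-built stand-in for `response`; the content is a made-up Markdown table
response = {
    "choices": [
        {
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "| name | height_cm |\n| --- | --- |\n| Ana | 170 |",
            },
        }
    ]
}

# Extract the AI assistant's message
content = response["choices"][0]["message"]["content"]

# Plain-text view
print(content)

# Rendered view (inside a Jupyter notebook):
# display(Markdown(content))
```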