How to get an image description with OpenAI

Use this method to get a text description of an image previously sent to the bot.

1. Add Message. In its text, ask the user to send an image.

2. Add User Input to save this image to a variable. Change Expected data type to File. In the example, the variable is named image, but you can choose a different name. If you use your own name, remember to change it wherever the variable appears in the Request Body, which is described in step 5.

3. Add another Message, or Buttons with hints, which send a message to the bot on the user's behalf when clicked. Regular Buttons will not work in this case, because they do not simulate user input. The message that the user sends should be something like this: "What is depicted in this picture?". This message is the prompt that will be applied to the image the user just sent.

4. Add User Input to record this prompt. In the example, the variable is named prompt, but you can choose a different name. If you use your own name, remember to change it wherever the variable appears in the Request Body, which is described in step 5.

5. Add the Request component to send the data to OpenAI:

  • select the POST request method
  • add the request URL:
https://api.openai.com/v1/chat/completions
  • add the Request Body:
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "{{prompt}}"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "{{image}}",
            "detail": "low"
          }
        }
      ]
    }
  ],
  "max_tokens": 300
}
  • if you chose different variable names in the User Input steps, change them in the Request Body accordingly. If your variables are named image and prompt, as in this example, you don't need to change anything;
  • the detail parameter controls the quality of the image that is sent to OpenAI. It can be set to low, high, or auto. If you don't specify this parameter, auto is used by default. With low, a lower-resolution image is sent to OpenAI and fewer tokens are spent than with high. For more information on the detail parameter, see the official OpenAI documentation;
  • add a new Key to the Request Headers: Authorization. As its Value, enter the word Bearer, then a space, then your token from OpenAI.

Example of Value:

Bearer sk-12aBCdEfGhiJkLmnOPqRS3TuvwXYzqWER7tYUIO456OPASdf
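
For reference, the request configured in step 5 can be sketched outside the bot builder. The Python below is a minimal illustration using only the standard library, not part of the bot setup. The function names build_body, build_request, and send are hypothetical, the Content-Type header is an assumption (the Request component may add it for you), and image_url is assumed to be a publicly reachable link to the uploaded file.

```python
import json
import urllib.request

OPENAI_URL = "https://api.openai.com/v1/chat/completions"


def build_body(prompt, image_url, detail="low"):
    """Build the same JSON body as in step 5, with the variables filled in."""
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": image_url, "detail": detail},
                    },
                ],
            }
        ],
        "max_tokens": 300,
    }


def build_request(prompt, image_url, api_key, detail="low"):
    """Prepare a POST request with the Authorization header (hypothetical helper)."""
    data = json.dumps(build_body(prompt, image_url, detail)).encode("utf-8")
    headers = {
        # Same value format as in the example above: "Bearer <token>"
        "Authorization": f"Bearer {api_key}",
        # Assumption: the body is sent as JSON
        "Content-Type": "application/json",
    }
    return urllib.request.Request(OPENAI_URL, data=data, headers=headers, method="POST")


def send(request):
    """Send the prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_request(prompt, image_url, api_key)) with a valid token would perform the actual request. Building and sending are kept separate so the body and headers can be inspected without spending tokens.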

6. Add a Message that outputs the response:

{{last_request.choices.[0].message.content}}

To output the number of tokens spent on processing the last request, use:

{{last_request.usage.total_tokens}}
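
The two expressions above read fields from the JSON that OpenAI returns. As a rough illustration of where those fields sit, here is a minimal Python sketch. The helper names are hypothetical, and the sample response is shortened and uses made-up values, not real output.

```python
def extract_description(response):
    """Equivalent of last_request.choices.[0].message.content."""
    return response["choices"][0]["message"]["content"]


def extract_total_tokens(response):
    """Equivalent of last_request.usage.total_tokens."""
    return response["usage"]["total_tokens"]


# Shortened, illustrative response; the text and token counts are invented.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "A tabby cat sitting on a windowsill.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 120, "completion_tokens": 10, "total_tokens": 130},
}

print(extract_description(sample_response))
print(extract_total_tokens(sample_response))
```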

7. Save the changes.

Done.
