AI -> Image to Text

Hi All,

Goal: The user uploads a UML / BPMN diagram (picture), and the app analyzes and evaluates it using AI mechanisms.

Question: Is there a way to invoke Python or JavaScript code from within Glide to perform image analysis (specifically using the gpt-4-vision-preview model)? Can Make/Zapier be used for this?

I tried using the AI Image2Text feature in Glide, but it doesn’t work well for me (by the way, which model does Glide use - from Google or OpenAI, and is there a way to customize the prompt used by Glide)?

Code:

from openai import OpenAI

api_key = ‘sk-…’

image_url = “https://cdn-images.visual-paradigm.com/tutorials/howtodrawsequencediagram_screenshots/23-final-uml-sequence-diagram.png

client = OpenAI(api_key=api_key)

response = client.chat.completions.create(
model=“gpt-4-vision-preview”,
messages=[
{
“role”: “user”,
“content”: [
{“type”: “text”, “text”: “What does the diagram show - describe it in Polish and find any errors in it.”},
{
“type”: “image_url”,
“image_url”: {
“url”: image_url,
“detail”: “high”
},
},
],
}
],
max_tokens=700,
)

Best Regards,
Andrzej

I believe they use OpenAI’s model under the hood, but I’m not entirely sure. I assume you are using the “Image to Text” column, but “Complete chat” might actually be what you need instead.

It allows you to add an image, and ask a question, whilst “Image to text” doesn’t allow you to specify a prompt, I believe.

@ThinhDinh Thank you very much!
First tests show that it works much better. There is not always a stable answer - but I will definitely test this method.

2 Likes

Ooh this could be helpful for my recent new question- thanks!

2 Likes

Can you check if it works for you? I am trying to use an image in different ways but always receive the same error:

Complete chat: Invalid type for messages[1].content[1]image_url: expected an object, but got a string instead.

@David

Please try a model that has the “vision” capability. In my test, “gpt-4-1106-vision-preview” works.