Finding a word in an image


I would like to build a very basic app (from Glide point of view) but with one “little detail” that I have absolutely no idea how to start it.

My use case is that I have a shelf with 300 books. And I never find the one I am looking for.
So, I would like to:

  • take a picture of the shelf
  • enter the title of the book to be found
  • then have the image modified with a frame on the title of the book.

If you had some advices, it would be great (knowing that my knowledge in code and AI is below 0).

Thanks in advance

Frame around the book will be tricky, but an arrow pointing to the book should be much easier.

Something like this:

  1. Upload photo to Glide
  2. Find book button → Send image and keyword to Zapier/Make → Analyze with Google Cloud Vision → Find X/Y of located text → send coordinates back to Glide → cloudinary toadd arrow overlay pointing at X/Y

An arrow will completely ok, I just want to identify where it is.
Thanks @Robert_Petitto … let’s discover Google Cloud Vision

(“nocode” they used to say :slight_smile: )

I imagine if the front of the book is visible to the computer vision then it would be easier.

But if you have something like this, which I expect to be the standard, that might be a little harder to read the text out, since they are presented vertically.

Yes @ThinhDinh this is exactly what I want to build…

I was thinking about having an option copying three time the original image then turning one 90°, one -90° and one 180°.
And finally send them three to computer vision.

But this was before going to “google cloud vision”… nightmare, don’t even know where to start…

nb: just for fun, I asked to ChatGPT, it recommends tensorflow (?) and gives the code!


i dont know if im late for this, but a solution could be using a no-code web app that use data science templates for idetifying what you are looking for.

for example, using this web app: “” can already do what you want to, but like every data science project you need data to make it better (in your case a lot of images and photos)

then connect the results to glide with any web automation (like make, zapier or power automate) and voila!

i hope it help you! :slight_smile:

1 Like

Thank you @Alvaro_Souza for your answer.

I have 0 knowledge in AI, and have a basic question: given my app should “just” read some typo letters, cf. book title (which would be previously entered in a form), why do I need to train a model? Is letters recognition not something quite basic?

Thanks in advance

as i undesrtand what you need, the model have to first recognize that is a book and then read whats on them. right?

Well, I’m not sure that it’s mandatory to recognize that it is a book.
The goal is to find a “title”.
Therefore, on below picture for example, it " “just” " has to search for text and show where it matches the title looked for

Google Lens is pretty good at reading text in any orientation. I use it a lot for translating other languages in Images, or live in real time with my camera. Also for copying text out of images. Pretty good for identifying plants too. It even solves typed or handwritten math problems. Not sure though if they have an API or any way to use it as a service. Pretty powerful feature on my phone though.

I found a random picture of books online and with Google Lens it was about to find most of the book titles. It’s subtle, but all text is highlighted in this image and I can click on each book to do a Google search or copy that text.

I can even translate everything to French.

If you can find something that uses Google Lens in some way, then that might work for you.


Ya…Google Lens = Google Cloud Vision. Make has a scenario for that.


Thank you both,
I’ve tried to plays with Google Cloud Vision, it’s complicated for a beginner… I’m still trying to generate an API Key :slight_smile:
[edit: I’ve asked to ChatGPT and it explained me the how to!!]

But what @Jeff_Hager achieved is exactly what I will try to do (+ a form to enter the “title” searched)

[re: edit] I achieved to initiate this “star wars nocode” suite!
I “just” have to have the “blue” part to do something…:upside_down_face::face_with_spiral_eyes: because up to now it just copies-pastes the initial picture

1 Like

The response to the Google cloud vision module should return XY coordinates of the text it’s instructed to find. If you’re able to achieve this, let me know and I can give you next steps.


Hi, as @Robert_Petitto said, Google Vision + Make.

BUT what Robert did not say :slight_smile: : there is a :snake: → you need to do some Python !, or java, go and all their barbarian friends!
(cf. "Label detection part)

I could copy the code, play loto by changing some lines, but I don’t even know where pasting it :slight_smile:
I think it is a step too high a poor lonesome NO-coder !

Hi! Im not clear about topic, but i remember cloudinary have addons to recognize text on images. No code solution.

1 Like