Finding a word in an image

AyS_0908 · December 10, 2022, 5:23pm

Hi,

I would like to build a very basic app (from Glide point of view) but with one “little detail” that I have absolutely no idea how to start it.

My use case is that I have a shelf with 300 books. And I never find the one I am looking for.
So, I would like to:

take a picture of the shelf
enter the title of the book to be found
then have the image modified with a frame on the title of the book.

If you had some advices, it would be great (knowing that my knowledge in code and AI is below 0).

Thanks in advance

Robert_Petitto · December 10, 2022, 8:08pm

Frame around the book will be tricky, but an arrow pointing to the book should be much easier.

Something like this:

Upload photo to Glide
Find book button → Send image and keyword to Zapier/Make → Analyze with Google Cloud Vision → Find X/Y of located text → send coordinates back to Glide → cloudinary toadd arrow overlay pointing at X/Y

AyS_0908 · December 10, 2022, 10:14pm

An arrow will completely ok, I just want to identify where it is.
Thanks @Robert_Petitto … let’s discover Google Cloud Vision

(“nocode” they used to say )

ThinhDinh · December 11, 2022, 12:21am

I imagine if the front of the book is visible to the computer vision then it would be easier.

But if you have something like this, which I expect to be the standard, that might be a little harder to read the text out, since they are presented vertically.

AyS_0908 · December 11, 2022, 9:44am

Yes @ThinhDinh this is exactly what I want to build…

I was thinking about having an option copying three time the original image then turning one 90°, one -90° and one 180°.
And finally send them three to computer vision.

But this was before going to “google cloud vision”… nightmare, don’t even know where to start…

nb: just for fun, I asked to ChatGPT, it recommends tensorflow (?) and gives the code!

Alvaro_Souza · December 11, 2022, 3:58pm

Hi!

i dont know if im late for this, but a solution could be using a no-code web app that use data science templates for idetifying what you are looking for.

for example, using this web app: “obviously.ai” can already do what you want to, but like every data science project you need data to make it better (in your case a lot of images and photos)

then connect the results to glide with any web automation (like make, zapier or power automate) and voila!

i hope it help you!

AyS_0908 · December 11, 2022, 5:04pm

Thank you @Alvaro_Souza for your answer.

I have 0 knowledge in AI, and have a basic question: given my app should “just” read some typo letters, cf. book title (which would be previously entered in a form), why do I need to train a model? Is letters recognition not something quite basic?

Thanks in advance

Alvaro_Souza · December 12, 2022, 12:39am

as i undesrtand what you need, the model have to first recognize that is a book and then read whats on them. right?

AyS_0908 · December 12, 2022, 9:35am

Well, I’m not sure that it’s mandatory to recognize that it is a book.
The goal is to find a “title”.
Therefore, on below picture for example, it " “just” " has to search for text and show where it matches the title looked for

Jeff_Hager · December 12, 2022, 12:44pm

Google Lens is pretty good at reading text in any orientation. I use it a lot for translating other languages in Images, or live in real time with my camera. Also for copying text out of images. Pretty good for identifying plants too. It even solves typed or handwritten math problems. Not sure though if they have an API or any way to use it as a service. Pretty powerful feature on my phone though.

I found a random picture of books online and with Google Lens it was about to find most of the book titles. It’s subtle, but all text is highlighted in this image and I can click on each book to do a Google search or copy that text.

I can even translate everything to French.

If you can find something that uses Google Lens in some way, then that might work for you.

Robert_Petitto · December 12, 2022, 1:02pm

Ya…Google Lens = Google Cloud Vision. Make has a scenario for that.

AyS_0908 · December 12, 2022, 1:54pm

Thank you both,
I’ve tried to plays with Google Cloud Vision, it’s complicated for a beginner… I’m still trying to generate an API Key
[edit: I’ve asked to ChatGPT and it explained me the how to!!]

But what @Jeff_Hager achieved is exactly what I will try to do (+ a form to enter the “title” searched)

[re: edit] I achieved to initiate this “star wars nocode” suite!
I “just” have to have the “blue” part to do something… because up to now it just copies-pastes the initial picture

Robert_Petitto · December 12, 2022, 11:34pm

The response to the Google cloud vision module should return XY coordinates of the text it’s instructed to find. If you’re able to achieve this, let me know and I can give you next steps.

AyS_0908 · December 21, 2022, 4:36pm

Hi, as @Robert_Petitto said, Google Vision + Make.

BUT what Robert did not say : there is a → you need to do some Python !, or java, go and all their barbarian friends!
(cf. "Label detection part)

I could copy the code, play loto by changing some lines, but I don’t even know where pasting it
I think it is a step too high a poor lonesome NO-coder !

slscustom.ru · December 24, 2022, 8:39pm

Hi! Im not clear about topic, but i remember cloudinary have addons to recognize text on images. No code solution.

Topic		Replies	Views
OCR application on camera Ask for Help	4	41	January 20, 2025
Add Text to Image via API Ask for Help	2	1391	January 18, 2021
Glide AI: Text to Image Feature Requests	0	184	March 6, 2024
Taking a picture of a business card and OCR data to contact Ask for Help	18	1638	May 26, 2023
Image > Text AI - Passport Information Report a Bug	2	40	May 16, 2025

Finding a word in an image

Related topics