Extracting long text to JSON

Hi All
I am presenting text as the input that lists the company’s Disciplinary Code as a number of standards, each standard has offences falling under the standard, and each offence has a number of penalties available to impose.

Sample of first Standard

Standards

  1. Standard - Behaviour - Employees will treat clients, visitors, suppliers and members of the public on client premises and at special events, patiently, diligently and courteously.

1.1 Offence - Rudeness, insolence, insulting or neglect regarding clients visitors and members of the public on client premises and at special events (serious).

1.1.1 Penalty - First Offence Summary dismissal

1.2 Offence - Rudeness, insolence, insulting or neglect regarding clients, visitors and members of the public at special events (minor incident).

1.2.1 Penalty - First Offence Written warning

1.2.2 Penalty - Second Offence Final Written warning

1.2.3 Penalty - Third Offence Termination with notice.

1.3
etc…

I am using AI Text to JSON to try and extract the offences in a list that I can then use to allocate specific offences on a disciplinary report.

My action to convert text to JSON uses the following instruction:

Convert the following text into JSON format with the structure:
{
“standards”: [
{
“standard”: “Standard Title”,
“offences”: [
{
“offence”: “Offence Name”,
“penalties”: [
“Penalty”,
“Penalty”,
“Penalty”
]
}
]
}
]
}

I keep getting the error “Could not parse JSON”.

Appreciate any advice? (The list is a long one!)

So you have a long string, and you want to convert that string to JSON, but the column isn’t working?

How long are we talking about here? Possibly it crossed the tokens allowed?

Thanks for responding @ThinhDinh
I am getting inconsistent results. Currently I have the text successfully converted and stored as JSON. (Via Glide Action)
I don’t expect the disciplinary code to change often so will not need to generate the JSON often.

It did not occur to me to look in the documentation for token limits! So have learned something new. Thanks

I used ChatGPT to analyse my JSON - lots of tokens.
Let’s calculate it:

  1. Character count: 12061 characters.
  2. Estimated token count: 120614≈3015\frac{12061}{4} \approx 3015412061​≈3015 tokens.

What would be a better way to achieve the conversion of a well structured text document into JSON?

Hi @mwza . I am facing the same issue. Did you find a better way to convert?

Hi @Gui_Rodrigues_Goncal
I ended up creating a “Sections” table to store sections of the document. My use case was suitable for this approach since every section was identical in structure. So I simply reduced the size of the document / text and created the JSON per section. I can add as many sections as required (I used 11), and then used a lookup to combine JSON in all sections into a single column in my “Code” table.

1 Like

Mark, thank you for the answer.

In my case the user upload a Excel File and I use the DOC TO TXT (AI) column to transform in TXT.
So, we always have diferent data and I´ll be not able to divide in sections.

I will keep searching another alternative.

1 Like

Have you tried the Text to JSON column?

@ThinhDinh That’s what i am doing. The problem is that the column stay in loop and after a while I receive the message: Could not parse JSON.

I guess it is related to the lenght of the Excel File because when I used a spreadsheet with 20 lines it worked very well, but when I used a 300 lines sheet it didn´t work.

I don´t know if there are limits with the lenght of the TXT used as source for the “TXT TO JSON’, and, in this case, can cause problems related to"Tokens allowed”.

Do you know if there is another alternative to read the rows from an Excel file and save them in Glide Tables? (in front end, I mean)

1 Like

I would have a flow to upload a CSV file to Glide, then use Make to loop through the CSV and add rows to your table.

I have no experience with Make but I will try this approach. Thank you.

1 Like