Extracting long text to JSON

Hi All
My input is text that lists the company’s Disciplinary Code as a number of standards. Each standard has offences falling under it, and each offence has a number of penalties that can be imposed.

Sample of first Standard

Standards

  1. Standard - Behaviour - Employees will treat clients, visitors, suppliers and members of the public on client premises and at special events, patiently, diligently and courteously.

1.1 Offence - Rudeness, insolence, insulting or neglect regarding clients visitors and members of the public on client premises and at special events (serious).

1.1.1 Penalty - First Offence Summary dismissal

1.2 Offence - Rudeness, insolence, insulting or neglect regarding clients, visitors and members of the public at special events (minor incident).

1.2.1 Penalty - First Offence Written warning

1.2.2 Penalty - Second Offence Final Written warning

1.2.3 Penalty - Third Offence Termination with notice.

1.3
etc…

I am using AI Text to JSON to try to extract the offences into a list that I can then use to allocate specific offences on a disciplinary report.

My action to convert text to JSON uses the following instruction:

Convert the following text into JSON format with the structure:
{
  "standards": [
    {
      "standard": "Standard Title",
      "offences": [
        {
          "offence": "Offence Name",
          "penalties": [
            "Penalty",
            "Penalty",
            "Penalty"
          ]
        }
      ]
    }
  ]
}

I keep getting the error “Could not parse JSON”.

I would appreciate any advice. (The list is a long one!)

So you have a long string, and you want to convert that string to JSON, but the column isn’t working?

How long are we talking about here? Possibly it exceeded the token limit?

Thanks for responding @ThinhDinh
I am getting inconsistent results. Currently I have the text successfully converted and stored as JSON (via a Glide action).
I don’t expect the disciplinary code to change often, so I will not need to generate the JSON often.

It did not occur to me to look in the documentation for token limits! So I have learned something new. Thanks!

I used ChatGPT to analyse my JSON - lots of tokens.
Let’s calculate it:

  1. Character count: 12061 characters.
  2. Estimated token count: $\frac{12061}{4} \approx 3015$ tokens.
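
For anyone wanting to repeat that estimate: it is just the common rule of thumb of roughly 4 characters per token for English text, so a real tokenizer may give a different number. A minimal Python sketch of the arithmetic (the function name `estimate_tokens` and the default divisor are my own illustration, not anything from Glide or ChatGPT):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic only)."""
    # Assumption: about 4 characters per token; actual tokenizers vary.
    return int(len(text) / chars_per_token)


# A 12061-character string, matching the character count reported above.
print(estimate_tokens("x" * 12061))  # 12061 / 4 = 3015.25, truncated to 3015
```

Note that this counts only the text itself. The instruction and the generated JSON also use tokens, so the real total for a request would be higher.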

What would be a better way to convert a well-structured text document into JSON?
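
One option I am considering: because the numbering is so regular (1. Standard, 1.1 Offence, 1.1.1 Penalty), the text could be parsed with plain rules outside Glide instead of AI, which avoids token limits entirely. Below is a minimal Python sketch, assuming every entry sits on a single line and follows exactly the "number, label, dash, text" pattern from my sample. The regular expressions and the `parse_code` function name are my own illustration and would need adjusting if the real document deviates from that pattern.

```python
import json
import re

# Assumed line formats, taken from the sample above:
#   "1. Standard - ...", "1.1 Offence - ...", "1.1.1 Penalty - ..."
STANDARD_RE = re.compile(r"^\s*\d+\.\s*Standard\s*-\s*(.+)$")
OFFENCE_RE = re.compile(r"^\s*\d+\.\d+\s+Offence\s*-\s*(.+)$")
PENALTY_RE = re.compile(r"^\s*\d+\.\d+\.\d+\s+Penalty\s*-\s*(.+)$")


def parse_code(text: str) -> dict:
    """Build the nested standards/offences/penalties structure line by line."""
    standards = []
    for line in text.splitlines():
        if m := PENALTY_RE.match(line):
            # Attach the penalty to the most recent offence, if there is one.
            if standards and standards[-1]["offences"]:
                standards[-1]["offences"][-1]["penalties"].append(m.group(1).strip())
        elif m := OFFENCE_RE.match(line):
            # Attach the offence to the most recent standard, if there is one.
            if standards:
                standards[-1]["offences"].append(
                    {"offence": m.group(1).strip(), "penalties": []}
                )
        elif m := STANDARD_RE.match(line):
            standards.append({"standard": m.group(1).strip(), "offences": []})
        # Any other line (blank lines, "etc…") is ignored.
    return {"standards": standards}


SAMPLE = """
  1. Standard - Behaviour - Employees will treat clients courteously.

1.1 Offence - Rudeness (serious).

1.1.1 Penalty - First Offence Summary dismissal

1.2 Offence - Rudeness (minor incident).

1.2.1 Penalty - First Offence Written warning

1.2.2 Penalty - Second Offence Final Written warning
"""

print(json.dumps(parse_code(SAMPLE), indent=2))
```

The output uses the same "standards" / "offences" / "penalties" keys as my instruction above, so it could be pasted into the app as stored JSON. The sample text here is shortened from the real code just to keep the example small.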