AI Extract Text As JSON Failing

I have built a Payments Processing app. I upload pdf documents only. I am using Glide AI to generate firstly text from uploaded supplier bills, then to generate text as JSON.

Mostly the extraction is working flawlessly. I have a scenario where a particular document is failing in that the Query JSON columns that reference the generated JSON is failing to produce results. (They remain empty.)
The generated JSON column still appears to work fine - My generated JSON for two examples is as follows:

Working Scenario:

{
  "supplierName": "LX CARNATIONS & EVENTS",
  "documentType": "INVOICE",
  "documentReference": "INV00001480",
  "documentDate": "21/11/2023",
  "serviceDescription": "Florals Florals with only Peonies and Hydrangeas, White platters",
  "documentTotal": "R1,650.00",
  "documentTax": "R0.00",
  "documentDueByDate": "30/11/2023",
  "nameOfBank": "FIRST NATIONAL BANK",
  "bankACNumber": "62931111113",
  "bankBranchCode": "250655",
  "supplierTaxNumber": "",
  "supplierRegistrationNumber": "",
  "supplierAddress": "17 Southway steet\nSandton\nKelvin\n2090",
  "SalesRepName":"OVERALL DISCOUNT %: 0.00%",
  "supplierTelephone": "",
  "supplierEmailAddress": "",
  "documentAmountDue": "R1,650.00",
  "RefundableDeposit": "",
  "lineItems": [
    {
      "description": "Florals Florals with only Peonies and Hydrangeas",
      "quantity": "-",
      "unitPrice": "R450.00",
      "discountPercentage": "0.00%",
      "vatPercentage": "0.00%",
      "totalPrice": "R900.00"
    },
    {
      "description": "White platters",
      "quantity": "5.00",
      "unitPrice": "R150.00",
      "discountPercentage": "0.00%",
      "vatPercentage": "0.00%",
      "totalPrice": "R750.00"
    }
  ]
}

Failed Scenario:

{
  "supplierName": "Monique Neethling",
  "documentType": "INVOICE",
  "documentReference": "1000",
  "documentDate": "2023/11/22",
  "serviceDescription": "Setup Box Tower Discounted as per Budget",
  "documentTotal": "1 250.00",
  "documentTax": "-",
  "documentDueByDate": "Due Upon Receipt",
  "nameOfBank": "FNB",
  "bankACNumber": "62841111113",
  "bankBranchCode": "-",
  "supplierTaxNumber": "-",
  "supplierRegistrationNumber": "-",
  "supplierAddress": "131046 Bryanston, Bryanston",
  "SalesRepName": "-",
  "supplierTelephone": "(+27) 117917001",
  "supplierEmailAddress": "segall.warner@keynote.co.za",
  "documentAmountDue": "-",
  "RefundableDeposit": "-",
  "lineItems": [
    {
      "description": "Setup Box Tower Discounted as per Budget",
      "quantity": 2,
      "unitPrice": 625.00,
      "amount": 1 250.00
    }
  ]
}

Can anyone spot a reason? Or have suggestions on where I might look for the issue? (I have used JSON validators to test my generated JSON and not seeing any issues that would result in failure described above.)

Appreciate any input.

The space in the amount?

Thanks @Robert_Petitto -

  1. Would that cause the other fields (like supplier name for example) to also fail?
  2. How would I clean that up? I am thinking I would need to create an AI Generate Text column per outputted number field and “exclude” spaces?

I imagine if any of your key value pairs are formatted incorrectly, then the whole column will fail.you could try to tell it to output all values as strings?

1 Like

Not a problem that contributes directly, but why is “documentTotal” not a number? Is it due to the currency symbol?

Thank you!

@ThinhDinh The “documentTotal” value is extracted from a pdf - I am guessing if this value is in fact text (the value in the “failed” scenario had a space intended as thousands separator), then the output is not recognised as a number?

@Robert_Petitto I tried changing the prompt:

"Here is a supplier document against which payment will be made. Format as JSON in this structure with all key value pairs expressed as strings:

This seems to have resolved the issue :grinning: :pray:

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.