Words per sentence

I want to count the number of words in successive sentences of a paragraph. If one cell contains the whole paragraph, it’s easy to split it into sentences, making a cell with multiple values, each value being one sentence. Then I’m stuck. A roll-up column will count the number of sentences, but what I want is the number of words per sentence. Can Glide do that or do I have to write a bit of code?

Couple ways to do this.
First way:

  1. Split text column > separated by space
  2. Rollup column of split text column to count words

Second way:
Use the word count plug-in column.

How do you want the data to be presented? Let’s say I have a paragraph like this.

“My sentence one. Sentence two. Three”

Then should it return “3, 2, 1” for you?

Yes, that’s exactly right.

(Even better if it returned separate rows:
3
2
1
– but I suspect that’s impossible)

It’s not, but it could get really messy and ugly.

What’s your use case? ie. What are you going to do with these numbers?

I tried that. What you suggest will count the words in a column, but not the words per sentence.

If I split the text by full stop (period) to get sentences, the roll-up counts the number of sentences but not the number of words.

I think I need two split-texts, one for sentences and then one for words.

I wanted to demonstrate one of the characteristic differences between writers, the statistical distribution of number of words per sentence. Some writers have all sentences much the same length, some use a wide variety of lengths. I know it’s not what Glide was really meant for, but it’s the sort of thing that end-user programmers might be doing – developing simple text analysis tools; and I’m interested in seeing where Glide’s limits are.

Okay, I get it.
So, for example, you may want to plot these on a scatter chart?
Roughly how much text at a time would you want to analyse? I would imagine it would be significantly more than just a few paragraphs? A whole paper, or a whole book?
Would you do it for a single author at a time, or would you want to do it simultaneously for a whole group of authors, so that you could do side by side comparisons?

I’m asking all these questions because I find these types of challenges intriguing. I love pushing the limits of Glide and finding new ways to solve problems. That said, as you point out - Glide is probably not the best tool for this sort of exercise. But that doesn’t mean it can’t be done.

Oh, the other thing I wanted to point out is that using merely the presence of a period to denote the end of a sentence may not be that reliable. What about sentences that end with a question mark, like this one? Or sentences that end with an exclamation! Or those that have a writing style… like this? (was that one sentence, four sentences, or zero?) I’m sure you get my point, but that’s just a side issue :slight_smile:

Amount of text: A couple of pages would be enough: I’m after a demo, not a professional’s tool. But that’s a good question. How well does Glide scale?

Defining ‘sentence’: excellent point. There’s no way afaik to split text by ‘period or question mark or exclamation mark’. Hmmm …

There is… Glide now (indirectly) supports regular expressions which will take care of this part.
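To illustrate the regular-expression idea, here is a minimal sketch in plain JavaScript, not anything Glide-specific. The names `splitSentences` and `terminator` are hypothetical, and it assumes a sentence ends with one or more of `.`, `?` or `!` followed by whitespace or the end of the text. Whether an ellipsis should count as a sentence break is left as a choice for the caller.

```javascript
// Hypothetical helper (not a Glide built-in): split a paragraph into sentences.
// Assumption: a sentence ends with one or more of . ? ! followed by
// whitespace or the end of the text.
function splitSentences(text, terminator) {
  // Fall back to the assumed default pattern when none is supplied.
  terminator = terminator || /[.?!]+(?:\s+|$)/;
  return text
    .split(terminator)
    .map(function (s) { return s.trim(); })
    // Drop the empty fragment left behind when the text ends with a terminator.
    .filter(function (s) { return s.length > 0; });
}

// "Is this one? Yes! Done." gives ["Is this one", "Yes", "Done"]
console.log(splitSentences("Is this one? Yes! Done."));
```

Passing a different pattern as the second argument, for example `/[.?!…]+(?:\s+|$)/`, would treat the single-character ellipsis as a sentence break as well.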

I have a few ideas bubbling around inside my head. I’ll have a play around with this a bit later and may come back to you with a suggestion.

How many paragraphs max?

Shall we say 10?

I see @Darren_Murphy replying…wondering if he’s thinking the same thing I’m thinking…

Hmm… I’ll wait too, lol. In GS that would take a few minutes to solve.

So I’ve been having a bit of a play with this.

My initial thought was that the Extract Multiple Matching Text plugin column would handle the first part of this. But it turns out that it doesn’t.

So I ended up writing a bit of javascript to do the initial parsing of the paragraph…

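// p1 is the paragraph text passed in to the column
if (!p1) { return undefined; }
// Split on ., ? or ! followed by whitespace, then drop any empty fragments
var sentences = p1.trim().split(/[.?!]+\s+/).filter(function (s) {
  return s.length > 0;
});
// Count the whitespace-separated words in each sentence
var counts = sentences.map(function (sentence) {
  return sentence.trim().split(/\s+/).length;
});
return counts.join(',');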

The above will give you a comma separated list, representing the number of words in each sentence of the paragraph.

My thought from there is that it could be transposed into a series of rows, and from there you could do various rollups to get average, median, etc. But that’s where I’ve got a bit stuck. I’m able to transpose the data okay, but the rollup column doesn’t seem to play nicely with it.
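As a rough alternative to the rollups, the same figures could be computed in JavaScript straight from the comma-separated list. This is only a sketch: `summariseCounts` is a hypothetical name, the input is assumed to look like "3,2,1", and the standard deviation uses the population formula (dividing by the number of sentences).

```javascript
// Hypothetical helper: summarise a comma-separated list of per-sentence
// word counts, e.g. "3,2,1". Returns undefined when the list is empty.
function summariseCounts(csv) {
  var counts = csv.split(',')
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s !== ''; })
    .map(Number);
  if (counts.length === 0) { return undefined; }

  // Median: middle value of the sorted list, or the mean of the two middle values.
  var sorted = counts.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  var median = sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;

  var sum = counts.reduce(function (a, b) { return a + b; }, 0);
  var mean = sum / counts.length;

  // Population variance: average squared distance from the mean.
  var variance = counts.reduce(function (acc, n) {
    return acc + (n - mean) * (n - mean);
  }, 0) / counts.length;

  return {
    count: counts.length,
    mean: mean,
    median: median,
    stdDev: Math.sqrt(variance)
  };
}

console.log(summariseCounts("3,2,1"));
```

A small standard deviation would suggest a writer whose sentences are all much the same length, and a large one a writer who varies them widely.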

I’ll come back to this a bit later and see if I can figure it out. In the meantime, perhaps somebody else will jump in with an alternative solution.

And that’s what code is for… :wink: Btw @Darren_Murphy, I did find a solution for not running scripts On Change when adding or deleting a row or column… lol, pretty easy

Probably not, haha :joy:
But I am curious to see what you come up with.

No code option:

I wish you could do video tutorials for my apps… @Robert_Petitto
You are the best of the best in that field, no doubt, and very good in many other fields… but your video tutorials, even the ones made quickly, are impressive.

hehe, that’s more or less the approach I thought you’d take.

Only problem is this:

That’s why I ended up resorting to code, to deal with that. I initially thought that one of the plugins would help overcome that, but it seems not.