To relate, or not to relate, that's the question

Hi,
Relation is one of the best features of Glide. But, it’s also one of the heaviest computing tasks for the Glide engine. It can directly effect the speed of the app.

Assuming our Glide App will reach thousands of rows (or more), should we prefer to avoid using relations? The alternative is to duplicate the required information on every single table we need it.

To better explain, here’s an example for a list of table inside an app:

  1. client information (thousands)
  2. client events/meetings/etc. (hundreds)
  3. client meeting/call summary (thousands)
  4. client documents (thousands)
    and many more.
    The connecting thread is always the client ID, showing up on each of these tables. If we want to show details for a meetings summary, we’ll need to create a relation inside table 3, which will relate between the client ID column in table 1 and table 3.
    Or, we can add more columns to table 3, such as client name, client phone number, etc.

Yes, It’s like putting training wheels on a brand new race car.
Thoughts?
Ideas?
Alternatives?

Thanks

Glide only calculates the relations required to show the current screen. In your example, relations will be calculated efficiently.

3 Likes

Oh, so it’s not computing all the relations with each change in the database?
Could you perhaps provide an example for when NOT to use relations, but have duplicate data instead? (or any other solution)

This is bringing us back to the discussions about the 25K limit.

I personally don’t think there is really any case where duplicating data is a good thing. At least not in respect to good database design. However, a case where I do technically duplicate data in my app is with lesson tracking. When a lesson is taught, a price is attached to that lesson and it’s based on a coach’s current pricing structure. Now, I could use a relation to link that lesson to the related price, but in the future, the coach’s pricing structure will change, and I don’t want new prices to affect old lessons. In that case I duplicate the price for the lesson at the time it was taught.

So, I would say duplicate data if you intend the duplicate data to be static and not dependent on a related table. Basically, you expect it to potentially not match anything in the future, or have a rock solid plan to update that duplicated data if necessary.

Use relations if you want to reliably maintain database integrity and a change to a row in one table will be visible in all other tables that have a relation established.

When an app is opened, all data is downloaded, but not necessarily computed. Then, when a table is accessed in any way through the app, such as opening a tab that uses a particular table, then I believe all computed columns (e.g. Relations) in that table will initially compute to make sure any filters, or visibility, or anything else that uses computed columns is computed before displaying anything in the screen. While the app is open and changes are made to a specific row, I believe only rows affected by that change will recompute, so it’s not a full database computation again. Only what’s affected by that changed data. I observe much faster response times after my app is loaded and all of the tables have been accessed at least once. I have one particular tab in my app that is slow to load initially, but subsequent use of that tab has normal response times, until the app is closed and reopened again.

If I have any advice, it would be to try to avoid using computed values as the source or destination value of a relation. That means that all computations would have to loop through all rows before looping through again so relations can build. It ends up being an exponential number of loops whenever a new row is added. Or, just watch for any computations that rely on large chunks of data to be computed first. I feel like computations that are dependent on other computations can exponentially compound into a large set of calculations that take much longer to complete.

4 Likes

Wow!
Now that’s a detailed response :slight_smile:
Thank you very much for these great insights.

Have you gotten any closer to figuring out what is the root cause of these delays? Even better, how to avoid it?

1 Like

In my case, I have around 8k rows in one table that belong to one user. That table is linked through relations to at least 4 or 5 other tables, one of which is around 2k rows. Some of those other tables also link to each other, so it’s a big circle of interlinking tables.

Of the bigger three tables, one contains a list of billing cycles for each coach, another table contains each lesson that was taught by a coach to a student, and the last table is a google sheet query that builds a list of unique coach/billing cycle/student combinations. All of this is to track lessons, and calculate invoices to bill each student per billing cycle, per coach.

The math is simple, but all individual lesson amounts need to calculate before the amounts per coach/billing cycle/student can calculate, and before the totals for the entire coach/billing cycle can calculate. I couldn’t even tell you how many times it’s iterating through that data to come up with those totals, but I’m sure it’s quite a lot.

So, I have a pretty good idea that my problems are self induced knowing how Glide currently works and how I have my data interconnected. I’ve done several tests to see if different methods would be better or not, which has helped me to better understand how glide works and get my app to where it is today. Over the past few years, I’ve done several things to make it better.

  • I’ve simplified my logic to reduce as many computed columns as I can.
  • I used to have payment logic to track if bills were paid, but my users never used it and it greatly complicated things, so I recently removed it and will rebuild it if needed in the future.
  • Since Glide does not have conditional relations, I had to resort to building keys using google sheet logic so I would have a value to use for relations between tables. I can reproduce it using glide logic, but it got much to complicated and slowed everything down terribly. I usually don’t suggest using google sheets for any computations, but if timing doesn’t matter, it can be a good way to offload some logic outside of glide. When I had the payment stuff, I even went as far as creating a separate google sheet to offload some calculations from the primary google sheet since it was taking minutes to process and would cause weird issues with lost data within glide due to google syncing with glide while google was to still processing. Since I didn’t need instant results, offloading the computations to one or two other google sheets wasn’t much of an issue, but kind of a pain to maintain if I started deleting columns for moving columns around.
  • Recently I finally applied row owners, which makes it 100% better for those that don’t track lessons, but still slow for those that do. With this I’ve rebuilt the lesson entry from a native form to a custom form. I did this for a couple of reasons, but mainly I built a series of relations and actions that will find the earliest lesson and billing cycle that belongs to a coach, but is earlier that the most recent 2 years. Through those relations, I will append ARCHIVE to the row owner email. This will make the rows unowned to the coach but still in the table. For each lesson that a coach enters, potentially two old lessons and two old billing cycle records will be archived and permanently prevented from being downloaded to the user’s device. If a billing cycle still has lessons attached to it, it will not archive until all related lessons are archived. The results won’t be immediate, but over time it will reduce the number of rows that are downloaded and the app will get faster. Eventually the number of lessons and billing cycles will will become a much more manageable number and hover at around 4k rows that are downloaded and processed, even though the underlying table will continue to grow.
  • Once my app converts to the new computation model, I’m also hoping that I will see improvements in speed. Fingers crossed.

It’s been an ongoing process to simplify things as new ideas pop in my head, I learn more, and as new features are released by glide. This whole thing started before glide had any computed columns or even a data editor, so all computations logic had to be in the google sheet. Over time, I’ve moved logic to glide and made it better and more efficient. I’ll never be done making improvements. I think the archiving functionality will give me the best results in the long run. I want to keep the data, but there is no reason that the app needs to load and process that old data anymore.

Overall, it’s really hard to advise on best methods because it can largely depend on each and every situation. All I can say is to keep things as simple as possible and just try different things. Have an open mind to try something regardless of how outlandish it may seem. It just might work. And I’m sure glide will keep improving. It’s already light years ahead of where it was three years ago.

3 Likes

hello, while we are on the subject I have a few quick questions:

  1. Does visibility prevent the system from having to download all of the data? I know that it is not secure because advanced users can get access to the information, but does the app technically load let’s say 500 orders to then only display the 3 that are listed as “out for delivery”?

As an example, you can see from my screenshot that I display orders that are “ordered” or “installed” etc. Is glide downloading all the info or just the ones that are filtered. Down the road, there might be hundred or thousands of orders. Or technically for the other Installer app that I have, they will be the row owner for a whole bunch of installations. I am worried the system will slow down if it downloads all the old orders just to display a handful.

  1. You can also see from the screenshot that I have a selection button at the top that either shows current or completed orders. It just hides different sets of lists. So in total, I am playing around with 6 lists: Installations, Service Orders & Inventory Orders as well as all of their “completed” counterpart. Am I correct to assume that the system only downloads the table info once and then uses it as needed throughout all of the components?

  2. (EDIT: I see that Jeff might have touched on this while I was typing my question. Possibly changing the row owner to something else so that it stays in the table) If the visibility filter does allow for the whole row to be downloaded each time, should I be implementing a system where I move completed orders to a different sheet with no computations and then delete the row in the original table that does have a whole bunch of computed tables? I will have a whole bunch of “time till order” math calculations, which will technically keep running after the order is completed. I could see that bogging down the system.

Yes, all data that is not protected by row owners is downloaded. Visibility and Filtering do not prevent data from downloading, so that is why they are not secure methods of hiding data. The filtering and visibility happens after the data is downloaded to the user’s device.

Same answer as above. You are just controlling visibility of table data that has already been downloaded. That’s why the app appears to be so responsive. It already has the data instead of requesting new data from the server every time you change a view.

Yes, moving old rows to a different table or changing the row owner will definitely make a difference if you plan to have thousands of rows that aren’t needed anymore. You just need to keep an eye on your update counts if you go that route. Just keep in mind though, if you choose to move rows to another table via the app, it’s still a referenced table by the app, so it will be possible that the data is still downloaded, even though it’s not used by the app or has any computations. It should still help, but something to keep in mind. That’s why I chose to change the row owner email. All the data will remain in the same table, but due to row owners, only part of the data will be downloaded. Data that’s not downloaded will not have any computations performed against it.

3 Likes

Hi
Would you mind providing some details about this? I’m not familiar with archiving and/or other method to flag data as “do not download to user device”.
Thanks

I touch on my method in this post below. It’s not a native glide feature, but uses existing glide functionality to change the email in a way that makes it an unowned row. Tables that have row owners enabled, but do not have a matching email for the signed in user will not be downloaded because they are unowned.

1 Like

This looks very interesting and it’s truly out-of-the-box thinking. Without sounding pessimistic, has this been vetted by anyone from Glide? My concern is that it’ll break (or just hinder) some other aspects of the computation/data-handling.

No, it hasn’t been vetted. It’s not so much a hacky workaround, as it is working with existing native features in a way that gets the results I need. It’s pretty straightforward logic. With Row Owners, if I’m signed in with my email, I won’t see data that belongs to someone with another email. All I’m doing is changing the email to one that doesn’t exist in real life, so I no longer own that row, and nobody does for that matter. A user will only be altering their own rows, so it wouldn’t affect anybody else. The only thing that could ever “break” is if glide decided to prevent the option to change an email in a row owner column…which I highly doubt would ever be the case. Just based on my own experience, I think the idea is pretty sound. By appending the word ARCHIVE to the email, the original email address is still intact, so it’s easily traceable who the row used to belong to and can be easily reversed if needed. All it does is basically create orphan rows that will remain in the database, but will never be downloaded or used in any way, so computations against those old rows will never occur.

2 Likes