Parsing XML/HTML into Google Sheets

Does anybody have any experience parsing data from a website into Google Sheets? I am trying to create a workout logging app where people can input (via a form) the “share” links commonly provided by apps like Strava into my app, and the app will parse the linked webpage for specific data (date, distance, time, etc.). I have made a tiny bit of progress but am encountering a number of challenges:

  • Share links, even from the same app, have different formats. For example, the website gives an “Embed in Blog” code block [<iframe height='405' width='590' frameborder='0' allowtransparency='true' scrolling='no' src='https://www.strava.com/activities/2675402780/embed/b2c1380635d8d09c6c7188ba0811cb8e39c5b3c3'></iframe>], while the app gives what looks like a short URL [9.7 mi Run Activity on August 12, 2019 by Mark E on Strava].

  • Links may actually be redirects, and it’s not clear whether they can be parsed using IMPORTXML or IMPORTHTML

  • IMPORT functions don’t work with ARRAYFORMULA

  • Parsing using XPath doesn’t work, which may be the result of the above, or a separate problem

I’d appreciate thoughts on the best way to approach this.

Thanks!

I have used some Google scripting to do this. It’s a bit of trial and error.