Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 9 Nov 2018, 4:55pm
by Richard Fairhurst
geomannie wrote:
I wouldn't refer to "Strava and Other Public Domain Data Sources". In a machine-readable data context, "public domain" usually takes its US meaning of "free of copyright", and Strava Metro very definitely isn't that.


Strava Metro heatmap is public domain in that Strava put it in the public domain. "Public-facing source" may be a better description.


Indeed, that's the UK usage of "public domain". If you're talking in a data context, people will often assume the US meaning, so better to be unambiguous!

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 9 Nov 2018, 7:18pm
by geomannie
Mick F wrote:
meic wrote:I would be willing to accept Strava data being used on a "we have nothing better" basis.


This is actually an interesting question. If Strava is not the best metric we currently have for mapping cycling activity, what is? If one were to argue that another metric is better than Strava, which one and why?

I fully accept that one would be unwise to interpret Strava heatmaps as "the truth" about cycling activity, but they seem to be a really good starting point, especially as the data I have had access to correlate well with point surveys.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 8:19am
by Mick F
geomannie wrote:I fully accept that one would be unwise to interpret Strava heatmaps as "the truth" about cycling activity, but they seem to be a really good starting point, especially as the data I have had access to correlate well with point surveys.
Yes, it's a good starting point, but the "facts" from Strava should be taken with a pinch of salt.

It may be more relevant in some parts of the country than in others, so the data cannot be taken as Fact or even nearly Fact. At any one survey point, the Strava info could well be spot on, but at another point, it could be entirely incorrect.

The main thing though, is that Strava data cannot be relied upon, but it's possibly the best data available even though it's probably wrong.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 8:20am
by Cunobelin
Considering that the average commuting, leisure or utility cyclist will not be using Strava, the answer is an unequivocal NO.

From a quick scan of the station bike park and the shopping centre for evidence of devices, I would reckon that between 1 and 2% of journeys are recorded.

Also, given the emphasis on "performance", Strava routes will be chosen for the ability to ride fast rather than for convenience or enjoyment.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 9:57am
by pwa
Using Strava puts a person in an unrepresentative subgroup to start with, so looking at Strava use is just looking at the behaviour patterns of an unrepresentative subgroup. It would be a weird coincidence if that pattern matched general cycling patterns.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 10:09am
by meic
Cyclists, they are all the same. :lol:

More seriously, this would be just "stereotyping" (if you wish to be trendy), a thumbnail sketch, an approximation or a low-resolution model (if you wish to be pretentious), but you need something to work with and there isn't a better option, is there?

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 10:16am
by PH
Si wrote:In Birmingham we used to have lots of BC Sky/HSBC rides - perhaps typically the sort of people likely to use Strava?
We also have lots of rides put on by Community Cycle Clubs and BC's programme for getting people riding in deprived areas. And we gave away 5000 free bikes to people in deprived areas, which we could track.
We put all of the above on a map... guess what: very little overlap between the two groups. This would tend to suggest that Strava is a bit middle class oriented.

Even more than this, some of those in your first group may not even be recording all their rides. There's no obligation when you use Strava to use it all the time, though I know some do. This is how I use it and it wouldn't be an accurate reflection of the routes I choose, only those I choose when using Strava.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 11:29am
by Vorpal
geomannie wrote:Wanlock Dod's point about lack of normal distribution is very valid. I had the Glasgow data a couple of years ago and noted that while the correlation was positive, the survey points were very skewed towards quieter routes. I had parked the correlation as interesting but inconclusive. I received the East Renfrewshire data much more recently and was pleased to note that while it is somewhat skewed towards quieter locations, it included survey points over wider ranges of cycling activity. I plan to look at this further.

I had made a comment about the normal distribution, and then edited it out, as I thought it would require too much explanation.

Cycle usage fits a log-normal distribution. I've made a little picture below to show which part of the usage spectrum I think that we are looking at & how it fits with other types of usage.

I think that part of the problem is that you are comparing two groups that both fit more or less within the orange oval, but leaving out the type of cyclists who are the most common.
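
As a rough illustration of why the skew matters when correlating Strava counts against survey counts: a rank (Spearman) correlation depends only on the ordering of sites, so it is less swayed by one or two very busy locations than a plain Pearson correlation on raw counts, and taking logs is the natural transform if usage really is log-normal. The sketch below uses only the Python standard library. The counts are invented for illustration and are not the Glasgow or East Renfrewshire data.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def log_pearson(xs, ys):
    """Pearson correlation of log counts (all counts must be > 0)."""
    return pearson([math.log(x) for x in xs], [math.log(y) for y in ys])

# Hypothetical counts at five survey points: four quiet sites, one busy one.
survey_counts = [5, 8, 12, 20, 400]
strava_counts = [2, 1, 3, 2, 60]

print("Pearson, raw counts:", round(pearson(survey_counts, strava_counts), 3))
print("Pearson, log counts:", round(log_pearson(survey_counts, strava_counts), 3))
print("Spearman, ranks:    ", round(spearman(survey_counts, strava_counts), 3))
```

With data shaped like this, the raw Pearson figure is driven almost entirely by the single busy site, which is one reason a positive correlation from a skewed set of survey points is better treated as interesting than as conclusive.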

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 11:58am
by Richard Fairhurst
geomannie wrote:This is actually an interesting question. If Strava is not the best metric we currently have for mapping cycling activity, what is? If one were to argue that another metric is better than Strava, which one and why?


Ideally you want to minimise the selection bias inherent in something like Strava. There are two alternatives I know of. One is plain and simple old-fashioned traffic counters, which exist for all A roads, many B roads, and a smattering of smaller roads. But possibly more interesting is sensor data from mobile phones - i.e. GPS readings which are "phoned home" to a server where the data is anonymised and aggregated. Lots of companies gather this: some don't resell it (Google, Apple, Mapbox), some do (Here, TomTom, Inrix). I suspect Inrix, at least, can isolate cycling journeys - they've done some work in Copenhagen that would suggest they can.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 10 Nov 2018, 9:16pm
by Vorpal
Richard Fairhurst wrote:Ideally you want to minimise the selection bias inherent in something like Strava. There are two alternatives I know of. One is plain and simple old-fashioned traffic counters, which exist for all A roads, many B roads, and a smattering of smaller roads. But possibly more interesting is sensor data from mobile phones - i.e. GPS readings which are "phoned home" to a server where the data is anonymised and aggregated. Lots of companies gather this: some don't resell it (Google, Apple, Mapbox), some do (Here, TomTom, Inrix). I suspect Inrix, at least, can isolate cycling journeys - they've done some work in Copenhagen that would suggest they can.

Traffic counters aren't always accurate. The pneumatic tubes across the road often miss cyclists. Squirrel cameras and some sensor-type counters are more accurate, and such things are better now than they used to be, but each one has some problems.

The DfT largely rely on manual counts, but these are often done during office hours, with extrapolations made to other times of the day. They do occasional counts at other times of the day to verify the traffic models, but these models do not account for cyclists well. In addition, the manual counts do not count people on the pavement or side facilities, only cyclists in the main carriageway.

If that wasn't enough, it has the same sort of bias problem that using Strava data does. Many of the people who cycle during office hours avoid main roads.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 11 Nov 2018, 8:39am
by Wanlock Dod
Both of the underlying distributions (i.e. of Strava users and cycle count survey locations) are inherently log-normal.
The sampling approach applied to both is considerably skewed toward the higher tail (as per Vorpal's figure).

Whilst this might not be good statistics, does it actually matter for the purpose of the analysis? I suspect not, because we are only really interested in the places where people cycle. The message from the Strava analysis is loud and clear and comes across in all of the different cities:
Strava data wrote:Build high quality routes along main roads


The uncertainty issue is probably one which can be addressed easily though because the Strava result is simply a minimum:

Total Cyclists = Strava Cyclists + Other Cyclists

The thing that is of interest once you have the Strava data (and have already built high quality routes along main roads) is how the value of Other Cyclists varies according to stuff like traffic density and Strava Cyclists.

So as well as using the Strava data to tell you to build high quality routes along main roads, you could also use it to identify the best locations for installing automatic cycle counters, both to validate the Strava data and to quantify the actual values of Other Cyclists at those locations.
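
As a rough sketch of that validation step: at any site where an automatic counter gives a total, Other Cyclists is just the counter total minus the Strava count, and the Strava share is the ratio of the two. The site names and numbers below are made up for illustration; nothing here comes from Strava Metro or from any real counter.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SiteCount:
    name: str             # hypothetical site label
    counter_total: int    # all cyclists seen by an automatic counter
    strava_cyclists: int  # Strava-recorded cyclists, same site and period

def other_cyclists(site: SiteCount) -> int:
    """Other Cyclists = Total Cyclists - Strava Cyclists."""
    if site.strava_cyclists > site.counter_total:
        # Strava is meant to be a minimum, so this suggests the counter
        # is missing riders or the two periods do not match.
        raise ValueError(f"{site.name}: Strava count exceeds counter total")
    return site.counter_total - site.strava_cyclists

def strava_share(site: SiteCount) -> Optional[float]:
    """Fraction of counted cyclists who appear in Strava (None if no riders)."""
    if site.counter_total == 0:
        return None
    return site.strava_cyclists / site.counter_total

# Invented example sites, assumed to cover the same counting period.
sites = [
    SiteCount("main road", counter_total=1000, strava_cyclists=50),
    SiteCount("back street", counter_total=400, strava_cyclists=4),
]

for site in sites:
    print(site.name, other_cyclists(site), strava_share(site))
```

If the share turned out to be roughly constant from site to site, Strava counts could be scaled up with some confidence; if it varies a lot with traffic density or type of street, that variation is itself the thing worth measuring.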

There is probably a good reason why all the cars use the main road, I can't help wondering if it might be similar for cyclists.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 11 Nov 2018, 9:03am
by Richard Fairhurst
Up to a point (and this is drifting slightly away from Strava towards general cycle campaigning, but it's an interesting issue).

In city centres then, yes, absolutely, cycle provision should be segregated tracks along main roads. The main roads are where the destinations are - shops, schools, entertainment, whatever - so by definition that's where people want to go.

In suburban and rural areas it's much less clear-cut. There is no particular reason, for example, why a north-south cycle route in Oxford should have to follow Woodstock Road in the north and Abingdon Road in the south, rather than parallel routes like (say) Kingston Road/Walton Street in the north, and Marlborough Road/Wytham Street in the south. There aren't really more destinations on the main roads - indeed, Walton Street is probably more of a destination than the parallel bit of Woodstock Road.

So why do cyclists still cluster along the main roads in such circumstances? Partly it's road design: main roads tend to have priority, so you can put your head down and just bomb along, whereas back streets have speed-bumps and crossing roads and other encumbrances. Sometimes there's existing minimal 'infrastructure' on the main roads, such as bus lanes.

But I think a lot of it is that many people just aren't very good at route-planning. Their mental map, shaped by driving and (perhaps) buses, tells them that the main roads are the routes from A to B: so naturally that's what they take. When I drive along the rural A44 near here, I'm always amazed how many novice/occasional cyclists I see - by their appearance, not confident main-road warriors like Mick F who've made a deliberate choice, but people who seem to be cycling there because they don't know any better. So I'd be reluctant to say that Strava data (or whatever) proves that infrastructure provision should be focused on the main roads in such areas.

It will be interesting to see how the Quietways fare in London. Clearly the recalcitrance of borough highways officers means they're not as good as they could have been, but nonetheless they do seem to be a step up from the earlier London Cycle Network backstreet routes.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 11 Nov 2018, 2:20pm
by Wanlock Dod
I appreciate your point, and I think we are probably both arguing that there is a need for provision beyond just the main roads, but surely the main roads simply represent a bare minimum standard of provision. Modern Little British A roads are hardly suitable for cycling, but they go to the places people need to go, and they are the routes that the people of a driving nation already know. If you want to explore the nice quiet routes, then being an enthusiast with plenty of time to dedicate to finding them, and some expensive navigational equipment, is surely going to help.

Even if some routes do not go along the main roads, they need to cross them from time to time. In the UK, if you are not comfortable with 3 to 5 miles on a busy trunk road from time to time, then there are rather a lot of potential quiet routes and roads that will not be accessible to you. This situation never occurs in the Netherlands, although you still have to put up with the traffic noise and perhaps a couple of sets of traffic lights. If we don't do this then main roads will (actually many already have) become impenetrable barriers to cyclists, much like a major river or a motorway. For sure there needs to be more, but do we meet the bare minimum anywhere yet?

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 11 Nov 2018, 4:56pm
by mjr
I still feel that Milton Keynes achieved that bare minimum in the late 1990s in the brief window between completion of the cross city grid redways (currently being rebranded as super red ways, but it seems to be branding and news releases only so far IMO) and the surfaces degrading when basic minimal maintenance was left undone. But driving is so easy there that the city will probably always be held up as a cycling failure for achieving the average levels despite the hostile environment.

Re: Are Strava heatmaps reliably indicative of general cycling activity?

Posted: 11 Nov 2018, 6:22pm
by Vorpal
I don't think that there is a problem with using Strava data as an input to local cycle strategy & provision. However, I don't think it should be the main input. Furthermore, it is extremely important to understand the limitations, and to make sure to include data from the sort of cyclists represented in the left-hand part of the usage distribution.