Could someone here please explain to me why Gitlab's product managers would be so interested in client-side analytics in the first place?
From my familiarity with their service, almost every operation requires an ajax call, or a full page refresh. Is there really that much value for the product managers in these additional analytics?
Even anonymous cohort analysis can be super useful as a product manager. If you want to encourage usage of a particular feature and the most successful users of that feature fit into a cohort, you can reach out to them for feedback, optimize paths between those features, improve documentation connecting relevant features together, etc.
This doesn't mean it's malicious or all about the $$$...it might be that users who set up GitLab CI have 40% fewer security incidents and they want to encourage that behavior as a better customer outcome with the overall product (a rough sketch of that kind of cohort comparison is below).
edit: and this behavior might take place over a long period of time, not something you can get from access logs or just-in-time stats.
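To make that concrete, here is a rough, hypothetical sketch of the kind of cohort comparison a PM might run over anonymized event data. Every file, column, and event name here is invented for illustration and is not GitLab's actual schema.

    # Hypothetical sketch: anonymous cohort comparison over event data.
    # All file/column/event names here are invented, not GitLab's schema.
    import pandas as pd

    events = pd.read_csv("events.csv", parse_dates=["timestamp"])
    # columns assumed: pseudonymous_id, event_name, timestamp

    # Cohort = pseudonymous users who ever set up CI.
    ci_users = set(
        events.loc[events["event_name"] == "ci_pipeline_created", "pseudonymous_id"]
    )
    events["cohort"] = events["pseudonymous_id"].map(
        lambda uid: "set_up_ci" if uid in ci_users else "no_ci"
    )

    # Outcome metric: how many users in each cohort ever had a security incident.
    incidents = (
        events[events["event_name"] == "security_incident"]
        .groupby("cohort")["pseudonymous_id"]
        .nunique()
    )
    cohort_sizes = events.groupby("cohort")["pseudonymous_id"].nunique()
    print((incidents / cohort_sizes).fillna(0))  # incident rate per cohort

None of that needs a real identity, just a stable pseudonymous id that persists across sessions.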
That's an interesting point. But wouldn't the vast majority of Gitlab users be signed in (and thus server-side trackable)? Pretty much all functionality other than just reading code seems to require it.
The only way to know this would be to read the entire discussion across 2+ threads on the GitLab site for their event-tracking MR. Basically, this whole shit show started like so:
* Gitlab previously used 3rd party infrastructure for their user event tracking
* They did not send this 3rd party a user id, for GDPR and other reasons
* Because they did not have a user id, they could not understand user behavior across sessions. Understanding user behavior across sessions is important, so they wanted to add it.
* Gitlab had just finished moving their event tracking infrastructure in house.
* The original MR was to add a user id as an attribute to their event tracking (roughly the kind of change sketched below)
What followed was what I consider a very reasonable back-and-forth between data, infrastructure, and legal on the correct way to add a user id. But somewhere along the line it went off the rails. How it turned from simply adding a user id into including Pendo JS tags for on-prem customers, I have no idea.
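For what it's worth, the original ask boils down to something like the sketch below. The names (build_event, pseudonymous_user_id, the salt, the event names) are all invented; this is not GitLab's or Snowplow's actual code, just the general shape of "attach a stable user id to each event."

    # Hypothetical sketch: attach a stable (pseudonymous) user id to each
    # tracked event so events from different sessions can be tied to one user.
    # All names invented for illustration; not GitLab's or Snowplow's API.
    import hashlib
    import json
    import time

    def pseudonymous_user_id(real_user_id: int, salt: str) -> str:
        """Stable but non-reversible id: lets you join sessions without
        handing the raw user id to the analytics backend."""
        return hashlib.sha256(f"{salt}:{real_user_id}".encode()).hexdigest()[:16]

    def build_event(event_name: str, real_user_id: int, session_id: str) -> dict:
        return {
            "event": event_name,
            "user_id": pseudonymous_user_id(real_user_id, salt="instance-secret"),
            "session_id": session_id,
            "ts": int(time.time()),
        }

    # The same user in two different sessions now shares one user_id,
    # which is exactly what makes cross-session analysis possible.
    print(json.dumps(build_event("merge_request_created", 42, "sess-a"), indent=2))
    print(json.dumps(build_event("pipeline_viewed", 42, "sess-b"), indent=2))

The point is just that two events from different sessions share one user_id; whether that id ever leaves the instance is a separate question, and per this thread, the contentious one.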
I read in one of the issues that it was Marketing that wanted Pendo tracking. (I'm guessing marketing mainly wanted it for gitlab.com).
In one place I saw a developer basically say Pendo is marketing's, and product is only interested in using Snowplow (with first party data processing).
Development was entirely happy with a true opt-in. Development does want to be able to get data back from on-premises instances, but is totally fine with having it be an instance-wide option that can be turned off.
Years ago, at a previous employer of mine, one of the IT staff was giving a presentation on an internal website/app they had created for some widely used business function (I don't recall exactly what), and they were rather proud of the fact that they had tied in Google Analytics so they could get a better view of how people were using the site.
I expect that a lot of other companies are also doing this to themselves...
Wasn't this proposal to add the client-side tracking to their self-hosted (on-premises) product?
They don't have any visibility into how those instances are being used, and wanted to use the telemetry to get that. (Not understanding that, often, the entire point of choosing to host your own instance is to avoid this in the first place.)
No, it was for everyone. They initially conceded not to add it to their enterprise versions (post-backlash), but the intent was always to add it for the rest of us.
I believe in this case it is server-side analytics. The case where this would have been significant is self-hosted instances, as there GitLab controls neither the servers nor the clients.