Last night, I learned that Twitter will now track what apps you have installed on your phone in order to better target its promoted tweets.
This is why we can't have good things on the internet.
Generally speaking, though, I do not think we have a privacy problem. Like Jeff Jarvis, who is constantly chanting this, I believe that an open and sharable internet provides us with far more benefits than a closed and private one. I shake my head at people who turn their Instagram accounts to private. They are restricting their networks to be the same as in the disconnected world, and I just don't see the point of that.
For the same reason, I publish everything in public. Of course, when I say everything, I really mean "everything I choose to share I do in public", with the choice being the operative word.
Like every other person, especially one living in Europe, I also have a strong sense of privacy. We have seen too many bad examples of how data has been misused to suppress or mislead people.
But having said that, most of the privacy problems that we see today are based on massive misconceptions over how privacy really works.
The EU cookie law, for instance, is a perfect example of one such misconception. A browser cookie is completely incapable of violating anyone's privacy. It was only in the old days when third party cookies were still a thing that this was possible. But since every browser is now blocking those cookies by default, what is the point of legislating against first party cookies?
Another example is all the stories about US tech companies selling our data. That too is generally not true. Google, for instance, has never sold any data about you or your person to anyone else. And using Google Analytics doesn't enable Google to track you. Many people think it does, but that's only because they have no idea how the internet works.
The internet is a logical place based on data. To match this data (and hence track people), you need something to identify them with, namely an ID. And in order to track someone's movement across two sites, you would need to somehow transfer that ID between them, because otherwise you have no match.
With Google Analytics, you don't have matching IDs between sites, which is also why Google cannot use the data it has from one site to track individuals on another site. It just doesn't work that way.
It's also the reason why we, as site owners, have so much trouble analyzing data between devices.
If you are visiting a website first from your mobile and later from your laptop, you will be recorded as being two different people. The analytics system just doesn't know that you are the same person because the IDs used to track you are not the same.
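The matching problem can be sketched in a few lines of Python. This is a simplified illustration, not how Google Analytics is actually implemented: the `FirstPartyTracker` class, its `visitor_id` method and the site names are all hypothetical, and the sketch assumes each site keeps its own random visitor ID per browser.

```python
import uuid


class FirstPartyTracker:
    """Hypothetical per-site analytics: one random ID per (site, browser)."""

    def __init__(self, site):
        self.site = site
        # Browser name -> visitor ID, standing in for that browser's
        # first-party cookie on this site (an assumption of this sketch).
        self._ids = {}

    def visitor_id(self, browser):
        # A returning browser keeps its existing ID;
        # a browser seen for the first time gets a fresh random one.
        if browser not in self._ids:
            self._ids[browser] = uuid.uuid4().hex
        return self._ids[browser]


site_a = FirstPartyTracker("site-a.example")
site_b = FirstPartyTracker("site-b.example")

# Same person, same laptop, two different sites: the IDs do not match.
laptop_on_a = site_a.visitor_id("laptop")
laptop_on_b = site_b.visitor_id("laptop")

# Same person, same site, two devices: recorded as two different visitors.
mobile_on_a = site_a.visitor_id("mobile")

# Only a repeat visit from the same browser to the same site matches.
laptop_on_a_again = site_a.visitor_id("laptop")
```

With nothing shared between the two trackers, there is no key to join their records on, which is the whole point.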
So in many cases where people think their privacy is being violated, no such violation actually takes place, nor is it even possible.
That doesn't mean we don't have a problem. We do have privacy problems, and despite my positive view of the open internet, things are currently going the wrong way.
But let's define what privacy really is.
Privacy is a very simple principle to both explain and understand, which we can define using four simple laws:
The first law of privacy is that you are the only one who can choose what to share. Meaning that any sharing of information about you as a person must be opt-in.
You can never have opt-out privacy, because that very notion is a violation of the first law.
The second law of privacy is that people you communicate with have an equal right to remember. If you walk into a store, the salesperson there has a right to remember that you came in, what you looked at and what you talked about.
Privacy is not a one-way street. If you have a conversation with someone, both you and the person you are speaking with have equal rights to remember what the discussion was about.
The very idea, for instance, that when a reader comes to this site I should not be allowed to remember that, is absurd.
You cannot have a connected world where only one part of that connection can know what happened. A connection is two-way by definition, and both parties must have equal rights to remember the transaction that took place.
This is a very important thing to remember about privacy. You are the one who chooses when and where to communicate. But when you communicate, both parties have equal rights of remembrance.
It's that simple.
There is, of course, a massive loophole here where privacy violations are possible, so let's close this. And we can do that by looking at the data protection laws of my country, which (like in most of Europe) make a lot of sense.
It goes like this:
The third law of privacy is that you are only allowed to collect data that is necessary and relevant within the sphere of your service.
For instance, if you are a car salesman, you can track what car a person was looking at, but you cannot track people's eating habits. What people eat, and what car they might want to buy are two different things.
This is actually a real law in my country. You are not allowed to track anything outside the sphere of your service and product. And you can only track what is necessary and relevant.
Another example is that it's illegal in my country for a company to ask people to fill out a survey with their personal details, just so that they can download a PDF white paper about something.
Asking for data about people is outside the sphere of the service of downloading a PDF file. You are gathering data that is not related to the interaction that exists between us, and that is illegal in my country. And it should be illegal in any country.
You are not allowed to collect data outside the sphere of what you are offering.
And finally, we have the fourth law of privacy, which is simply that you may not disclose data to a third party for the purpose of marketing or other use, unless people have given their explicit consent.
Again, this fourth law is a real law in my country.
The actual privacy laws in Europe are much more complicated, but these four simple principles are what everyone needs to remember when we are discussing privacy.
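To make the four laws a bit more concrete, here is a minimal Python sketch that checks a single use of data against them. It is only an illustration of the principles as stated in this article, not a model of any actual legislation; the `DataUse` class, its field names and the `violated_laws` function are hypothetical. The second law grants a right rather than setting a restriction, so it has no check of its own.

```python
from dataclasses import dataclass


@dataclass
class DataUse:
    """Hypothetical description of one use of personal data."""
    opted_in: bool                  # law 1: the person chose to share this
    relevant_to_service: bool       # law 3: necessary and relevant to the service
    disclosed_to_third_party: bool  # law 4: passed on to someone else
    explicit_consent: bool          # law 4: the person agreed to that disclosure


def violated_laws(use):
    """Return the numbers of the laws that this use of data breaks."""
    broken = []
    if not use.opted_in:
        broken.append(1)
    # Law 2 is a right (both parties may remember), so nothing to check here.
    if not use.relevant_to_service:
        broken.append(3)
    if use.disclosed_to_third_party and not use.explicit_consent:
        broken.append(4)
    return broken


# A car salesman remembering which car you looked at.
car_salesman = DataUse(True, True, False, False)

# Demanding personal details before a white paper download.
white_paper_survey = DataUse(True, False, False, False)

# A store selling purchase data to a broker without asking.
sold_to_broker = DataUse(True, True, True, False)

results = [violated_laws(u) for u in (car_salesman, white_paper_survey, sold_to_broker)]
```

The three examples are the ones used in this article: the salesman breaks nothing, the survey breaks the third law, and selling data to a broker breaks the fourth.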
The fourth law is, however, the trickiest one of them all, because most people don't understand how it works. By default, there is no problem in using 3rd party services on a website. For instance, when you embed a YouTube video in an article, that is a third party feature, but it doesn't violate your privacy in any way.
It's the same with most analytics services. They too are 3rd party services that site owners add to their site, but they too are not violating anyone's privacy, nor any of the four laws.
The key question in the fourth law is whether the data is usable by the third party or not.
Take a service like, say, Chartbeat. It's one of the many great analytics systems out there, allowing people to track data in real time. Thousands of sites use it, including this site.
Note: Also see "The Usefulness of Real-time Analytics".
The reason it's not in conflict with the 4th law is because it's tracking data individually for each site. In other words, there is no way for Chartbeat to use the analytics data it has about this site for its own gains.
Chartbeat cannot look at the data that is being collected on this site, and then use it to target you as a person on another site. There is nothing in the data being shared that allows such an action. The data shared is not usable outside the site it is shared from.
And this is true for pretty much all analytics services, and in fact, most 3rd party services that sites use.
Another thing about the fourth law is that, unless the data contains information that can be specifically linked to you, no violation of privacy is taking place.
A simple example: If you head over to the US Census Bureau, you can find thousands of charts and raw data files about the demographics of US citizens.
Like this one:
While this is based on personal information of each and every person in the US, it's not violating anyone's privacy. Yes, you might be a part of that graph, but we have no way to identify where.
The 4th law of privacy (you may not share usable data with a third party) only comes into effect when it is data that a third party can use to target you specifically. Otherwise, no privacy violation is taking place.
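The difference between personal records and published aggregates can be sketched like this. The records below are made up purely for illustration (they are not real census data), and `aggregate_by_age_group` is a hypothetical helper, not anything the Census Bureau uses.

```python
from collections import Counter

# Made-up records for illustration only; not real census data.
records = [
    {"name": "Person A", "age_group": "18-34"},
    {"name": "Person B", "age_group": "18-34"},
    {"name": "Person C", "age_group": "35-64"},
    {"name": "Person D", "age_group": "65+"},
]


def aggregate_by_age_group(rows):
    """Count people per age group, dropping every identifying field."""
    return dict(Counter(row["age_group"] for row in rows))


published = aggregate_by_age_group(records)
# published == {"18-34": 2, "35-64": 1, "65+": 1}
```

Every person is counted in `published`, but nothing in it points back to any one of them.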
The problem, of course, is that there are legitimate privacy violations taking place every single day, and when it's done by services that we all use, it gets really bad.
For instance, when a large store like Target is actively selling their customer data to data brokers, that's a massive violation of privacy... not to mention trust.
In fact, the whole concept of a data broker industry is a violation of privacy (and such brokers are illegal in large parts of Europe).
Just because I walk into a store to buy a pair of socks, doesn't mean this information can be sold to anyone who wants that data.
If you think the NSA is bad, I would say this is even worse. And Slate published a good article about this a while back: "What Do Data Brokers Know About Me?". It illustrates how massively inaccurate most of this data gathering really is, but it also illustrates how much these companies are violating people's privacy.
Remember, these are third party companies that nobody has had any direct communication with, who are buying data from companies you thought you could trust. This is then used to build up a profile about you as a person, which they will then happily sell to anyone who wants it.
And, in the US, this is perfectly legal.
When companies like Facebook then partner with big data brokers like Acxiom and Epsilon, they are violating our privacy as well.
It means that if you go into a store to buy a box of Pampers, Facebook can add that information to their internal profile of how to target you.
Let's compare what Facebook is doing to our four privacy laws:
In fact, this is only possible because Facebook is a US company. In many parts of Europe, including in my country, none of this would be legal. And it's one of the reasons why there is so much tension between Europe and US tech companies.
Another example is Twitter.
Twitter has now decided that it will collect and track what apps you use on your phone, for the purpose of tailoring its promoted posts.
As they say:
To help build a more personal Twitter experience for you, we are collecting and occasionally updating the list of apps installed on your mobile device so we can deliver tailored content that you might be interested in.
That too is a massive privacy violation. Again, let's look at the laws.
This is completely unacceptable, and sadly we are seeing more and more of it as social networks optimize how they target their advertising.
It's the same problem with mobile apps that ask for access to data beyond what their service needs. In my country, this would be illegal because you are not allowed to collect data that isn't necessary for the service you provide.
But what about Google, the company most people think about when it comes to privacy?
Well, Google is weird. Google theoretically doesn't violate any of the four laws (in most cases). But it depends on how you define what Google is, and how you define the relationship you have with them.
For one thing, and as far as I know, Google has never bought, shared or sold any personal information to 3rd parties.
Sure, their advertising network allows people to target ads based on how they are profiling people, but that's not sharing of data. The brands who advertise via Google, have no idea who they are reaching. They only know they are reaching people within a certain 'target'. They never get any of the profiling data.
That is exactly the same thing that happens when a brand places a print ad in a magazine. You have no idea who the readers are, but you do know that you are reaching the readers of that magazine. If that magazine is about gardening, you know that you are reaching people who are interested in gardening.
It may feel creepy at times, especially when the targeting is really good, but it's not a violation of any of the four laws. Your privacy has never been violated.
But let's look at some other cases where people often say that Google is violating their privacy. Specifically in terms of Remarketing, profiling, ad scanning in Gmail, and Google sharing data between services.
First up is 'ReMarketing'. We all know how this works. You visit 'Site A', which sends a remarketing ID to Google's advertising engine. Later, when you visit 'Site B', an ad for 'Site A' appears.
This looks like a privacy violation, but it isn't. First of all, you did visit 'Site A' (first law), and the interaction that took place does give equal right to both parties (second law). The knowledge about your interest in their product is also perfectly relevant in relation to the ad (third law).
So nothing violates the first three laws. But what about the 4th law about not providing usable data to a 3rd party?
That law isn't violated either, because 'Site B' doesn't know what ads Google is displaying (no sharing of data is happening). And while Google does have that information, it's only being used to facilitate the service back to 'Site A'. So the fourth law is fine as well.
It may seem creepy that ads appear on other sites (and sometimes annoying), but that is the power of the connected world. Remarketing as a concept isn't violating anything.
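The remarketing flow can be sketched as follows. This is a simplified illustration and not Google's actual system: the `AdNetwork` class, its methods and the browser IDs are hypothetical, and the sketch assumes the ad network can recognize the same browser on both sites, which is what the remarketing ID implies.

```python
class AdNetwork:
    """Hypothetical ad network that keeps its remarketing list to itself."""

    def __init__(self):
        # Browser ID (as seen by the ad network) -> advertiser that tagged it.
        self._remarketing_list = {}

    def tag_visitor(self, browser_id, advertiser):
        # Called from the advertiser's own site when the visitor loads a page.
        self._remarketing_list[browser_id] = advertiser

    def serve_ad(self, browser_id):
        # The ad is chosen inside the network and sent to the visitor's browser.
        # The site hosting the ad slot is never handed the remarketing list.
        advertiser = self._remarketing_list.get(browser_id)
        return f"Ad for {advertiser}" if advertiser else "Generic ad"


network = AdNetwork()
network.tag_visitor("browser-123", "Site A")       # you visit Site A

ad_on_site_b = network.serve_ad("browser-123")     # later, on Site B
ad_for_stranger = network.serve_ad("browser-999")  # someone who never visited Site A
```

Site B only provides the slot where the ad is shown. The lookup happens inside the network, so Site B learns nothing about your visit to Site A.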
But what about the profiling that takes place on the part of Google? For instance, when Site A tells Google that this person should see that ad, Google obviously stores that information. The question is then if Google is using that information to allow other sites to target the same data?
In other words, when you are visiting a non-Google site, is Google then collecting data that it can use for its own gain, outside the sphere of that interaction?
The answer to this is: Yes, they are and yes, they do.
Is that a violation of any of the four laws of privacy? Yes, it is. It's violating the 4th law of privacy in which sites that you visit should not be allowed to share usable information with 3rd parties without explicit consent.
However, this is where things get a bit tricky.
You see, Google profiles you in a very different way depending on whether you are an active Google user or not.
If you are not an active Google user (as in not signed in to Google), Google has no idea who you are. They will set an 'unknown' ID and try to profile you based on that. But even though Google is trying to build up a profile, at no point will they know that this is somehow linked to you.
In other words, Google will have no idea that "Thomas read an article on the LA Times about the Good Dinosaur". It will only know that "A person read an article on the LA Times about the Good Dinosaur".
Is that a violation of your privacy?
Well, technically speaking, it is because data about what you did on another site is being shared with Google, who can then build a profile that they can use for other things.
Of course, if you are an active Google User (like me), and you are signed in to Google (like I am all the time), Google will match their profiling specifically to my Google Account. In other words, they know exactly who I am when I'm visiting a site using Google Adsense.
Is that a violation of privacy? Well, yes it is. Why should Google be allowed to track what I'm reading on the LA Times?
But... it's not that simple. Because I like that Google is tracking me across the web. Maybe not as much because of the ads, but I absolutely love it in relation to YouTube.
For instance, if I go to a site that has embedded a video from YouTube, when I start playing that, it will automatically be added to my 'history list' on my YouTube account. This is immensely useful, in that it allows me to very quickly find videos I have seen earlier, but can't remember where.
I use that feature several times per week. And yet, it's doing the exact same thing as Google Adsense. It's linking my activity on another site, to my profile on Google.
As I said, it's tricky. But, technically it is a violation of the fourth law since I didn't give Google explicit consent to act that way.
What about the notorious ad scanning that goes on in Gmail? Is that a privacy violation? Well, yes and no. First of all, scanning the email itself isn't a privacy violation in any way. It's a very useful service that allows a whole slew of amazing features.
These are features like spam filtering, malware protection, usability enhancements (like quick links to interactions, grouping content by conversations), filtering, and many other things. Email scanning is probably one of the most useful features of any email service today.
It's also not a privacy violation if the email service tries to match the email with an ad, as long as none of that data is shared or used in any other way.
So far, nothing about email scanning is in violation of the four laws.
The problem is, once again, if that scanning is also used to build up a profile that Google can then use outside the scope of that mail window, like, for instance, for Google to know who your connections are, or to target your person on other sites.
Remember, the 4th law states that "No data may be shared in usable form with third parties."
The reason why email is so complicated is because an email conversation isn't part of Google. When you send an email to someone, you are not having a conversation between yourself and Google. You are having a conversation between yourself and another person. Google is the 3rd party in that transaction, which means it can only process the data, but it cannot use it for itself.
For Google to use that data for anything else than what takes place at just that specific moment is, again, a violation. It's not a violation to do it from inside Gmail, but it is a violation if they take it outside of Gmail and use it for something else.
At this point, it's not entirely clear if Google uses the result of the ad scanning for other things than just showing the ads.
Note: I'm not affected by this, because I have a Google Apps Account. And these accounts have no ad scanning (they used to, but not anymore).
Finally, let's talk about the last concern that many people have about Google: that it shares data between its services. Is that a privacy concern?
In short, no. But it's a bit complicated because it depends on how we define Google.
We can define Google in two ways. One way is to define the whole of Google as just one big company. This is, in fact, how Google defines itself, and it's also very much how I would define it.
Think about it like this. If you go into a Walmart store, you can find many different types of products. You can buy a TV, a new pair of jeans, a toy train, an electric razor, a cat bed, a yoga mat, a scrapbook kit, and many other things.
And whenever you decide to buy any of these from any part of the store, any other part of the store would obviously know about it as well. Walmart may have many different products that you can use, but it's still just one big store.
You would never suggest that when you buy a cat bed from aisle 11, the Walmart employees in aisle 12 shouldn't be allowed to know about it.
It's the same with Google. If you think of Google as a single company (which, in fact, is exactly what it is), what you do in Google+ is no different from what you do in Maps. Sure, it's a different product, but it's all within the single entity of what we know as Google.
Based on this definition, Google is in no way violating your privacy when data from one part of Google is used in another part of Google. It's all Google.
The problem is that, in Europe, many people don't think of Google that way. They think of Google more like a big mall. The mall itself might be one big building, but within this mall, each store is completely separated from all the other stores.
If you go down to the local mall and visit a fashion shop, you wouldn't like the bicycle store in another part of the mall to know about it.
And this is where the conflict is. Some people, due to the size of Google, think of it more as a collection of separate services, and so think that each should work as a separate private interaction. Other people (and Google itself) define it as just one single company.
I, for one, agree 100% with Google on this, which is probably caused by how much I use its services. I wrote this article in Google Drive, using Chrome, with my Nexus phone next to me, often interacting with people on Google+, and searching for things on Google Search. I don't think of Google's products as separate things. To me it's all Google.
I don't understand the concept that what I do in one part of Google is somehow a violation of my privacy because the data is then also used in another part. That kind of thinking just doesn't make sense to me. Nor do I see how that violates any of the four laws outlined in this article.
A violation of privacy is when information pertaining to your person is being used outside what you choose to interact with. As long as the data that I have myself provided to Google (by using its services) stays within Google (which it does), my privacy is completely intact.
The only real privacy issue with Google, is how it's profiling people as a 3rd party ad network for other sites. I do have some concerns over that. But none of those concerns have any relation to how you and I choose to use Google's services, like Gmail, Search, Google+, Drive, Maps etc.
As I started out saying, generally speaking, I do not think we have a privacy problem. Most of what people are afraid of is based on misconceptions about how the internet works and how your data is being used. And in almost all the cases, no actual privacy is being violated.
But every now and then we see real privacy violations like what Twitter just did. The very idea that Twitter feels it can just scan my private phone to read what other apps I have installed is insane. And it doesn't help that the purpose of this violation of my personal property is to help them sell more ads, which I don't want to see.
That's not acceptable, not even if I can opt-out. Privacy is always opt-in. And not a single person in the world would allow Twitter to have that kind of access.
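My wish is for every country in the world, and specifically the US, to adopt the four laws of privacy:
1. You are the only one who can choose what to share. Sharing is always opt-in.
2. The people you communicate with have an equal right to remember.
3. You may only collect data that is necessary and relevant within the sphere of your service.
4. You may not disclose data to a third party unless people have given their explicit consent.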
And, of course, this should apply to Governments as well.
But we also have a responsibility to teach the future generations how data on the internet actually flows. Every single day I come across people (and journalists) who think that data can be tracked and matched together by magic. And they think that just having this data automatically constitutes a violation of one's privacy.
None of that is true.
An actual violation of privacy can only take place when a third party is using your data, or when the data in question wasn't part of the interaction that happened between you and the site you interacted with.
The flow of data between those you interact with is not a violation of privacy at all. It's something we call: 'communicating with each other'.
For instance, I have now been communicating with you through the words in this article. And you have communicated with me by reading it. Within this, you now know what the article is about, and I hope you liked it. And I know that you have read it, which I thank you for.
This is not a violation of anyone's privacy. In fact, this is what the internet was designed to do. It's a place where you and I can get together. In this case, in the form of you as a reader and me as a writer. Of course, you can also head over to Google+ and post a comment, if you like. But that is entirely up to you, just as reading this article was up to you.
We are communicating via this amazing thing called the internet. And all forms of communication are always based on data being exchanged.
Founder, media analyst, author, and publisher.
"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé