Data Protection: Towards Data Disarmament

The virtual world has split into virtual countries: Google, Facebook, Amazon, Alibaba, Tencent and Apple.

Nikhil Pahwa @nixxin

02 Apr 2018, 04:34 PM IST

Green light illuminates the keyboard of laptop computer as a man enters the data using the computer keyboard. (Photographer: Chris Ratcliffe/Bloomberg).

This is the fourth article in a series on the future of data protection in India. Read the first, second and third here.

Data is power, and we must restrict the amount of power we allow others to have over us.

There’s a sense of fatalism about privacy these days: that it really doesn’t exist any more, and that, given the amount of data we generate, the amount of data that devices are able to track, and how powerful algorithms can determine things about us even without needing to collect that specific data, there is no hope. As Rahul Matthan put it: “Never before in history have we bled information more freely and in all directions.”

Imagine there’s no countries. It isn’t hard to do.

For a few minutes, suspend all judgment and really imagine that there were no countries. No boundaries. You could flit in and out of places without any immigration queues or a passport. You could belong to several regions – all regions – of the world, if you chose to do so. It is a bit of a stretch, but there was once such a place: the virtual world that was the World Wide Web, with an uninhibited free flow of people, ideas, knowledge, debates, music, videos. Today that virtual world has split into, in a manner of speaking, virtual countries: Google, Facebook, Amazon, Alibaba, Tencent and Apple. This is not to say that the rest of the web doesn’t exist. But increasingly, apps and platforms control access to it, and are winning the battle against the web.

The only source of ownership these large platform businesses have over you is your data.

Data helps them create new services, find ways to get you to spend more time and more money with them, and maybe influence how you feel. The more dopamine hits they deliver to you, the more of your time belongs to them. These platforms can be used to influence whom you vote for, and create revolutions: there is no doubt that they’re having an impact on the physical world. The Arab Spring changed the Internet forever: it put physical countries on alert.

While physical countries are effectively competing with virtual countries for your attention and time, there are two things to keep in mind: the virtual countries know more about you than the physical country you belong to does, and those running the physical countries need access to that data, or to the tools to reach you, to stay in power.

This is why, when Sundar Pichai, Mark Zuckerberg and Jeff Bezos visit India, it’s like a state visit: as between countries, there are problems, and there is cooperation.

Data isn’t oil: it’s not a diminishing commodity that is expendable and needs replenishment. But data is power, and it is the source of the power struggle over you, your attention and your time. This is why more countries are now seeking to roll out national ID projects, and are looking to develop more and more datasets that they own: publicly owned datasets of private information, including health, education and mobile data, among others, are on their way. While at one level these datasets will allow physical countries to offer better targeted services to citizens, they’ll also give those countries more data with which to exert control over you. The US, allegedly, does this through access to data from its own companies, snooping not just on its own citizens, but on the world. India’s NATGRID intends to gain access to data from 950 organisations.

It’s hard to ignore these geopolitical realities, and the tensions between the physical and virtual worlds, when we’re discussing privacy, and any privacy law must take into account these battles. But for whom and how?

Our data is being weaponised and used on us. Thus we need to be given more control over it: who gets to collect it, how they use it, how they share it, and our right to have it removed. Democratic republics are built on the idea of individual rights, and our rights and liberty need to be at the centre of any data protection legislation, and the rules that follow.

At the same time, while we need to be given more control over our data, I’d also like to suggest that we push for a data-disarmament of sorts.

There are eight key aspects for lawmakers to weigh when considering a data protection law.

The first of these is Data Generation. This includes the ability of devices and users to generate data. A large part of this explosion of data comes from the fact that our interactions with devices, and our behaviour on them, generate a lot of data about our preferences and about whom we interact with on or through these devices. Devices also generate substantial data by observing ambient information.

The second is Data Collection. There’s a reason why Mark Zuckerberg has tape over his webcam and microphone: to prevent the device from capturing audio and video without his consent. What a device may and may not capture needs to be limited, based on a specified purpose that has been clearly communicated to each user. There’s no reason why a company that gains access to your SMSes in order to read an OTP to authenticate a login shouldn’t be penalised for collecting and storing copies of all of them. In the same manner, for sensitive data, there might need to be limits on who can collect what kind of data.

The third is Data Transmission. Not all the data that is collected needs to be transmitted, and transmission of data needs to be governed by certain rules and regulations, based on the type of data that is being collected for transmission, and the need for transmission.

There’s no reason why data that can be processed locally on a device needs to be transmitted.

The fourth is Data Processing. This must be done for the legitimate purpose of serving the user need/interest, and based on clear notice and consent.

The fifth is Data Sharing. This must depend on the explicit consent of the user, and on the type of data being shared. At the same time, given that we’re in a world of information asymmetry, and that users have bounded rationality, there might be highly sensitive data which the user should not be allowed to disclose without explicit consent.

Consent by itself shouldn’t be enough, and there need to be instances where active consent is required.

There should also be types of data that companies shouldn’t be allowed to share. This can be a tricky issue, because many services of conglomerates such as Alphabet (Google) rely on collecting data and sharing it between their businesses.

The sixth is Data Storage. Data needs to be kept in a manner that doesn’t compromise the user, whether through pseudonymisation, through differential privacy (adding random noise to aggregate data), or by limiting storage to certain geographical boundaries, based on the type of data being collected and its sensitivity.
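
For readers who want to see what two of these storage safeguards might look like in practice, here is a minimal sketch, not a production design. It assumes pseudonymisation is done with a keyed hash (HMAC-SHA256) and that differential privacy is applied to a simple count using Laplace noise. The function names, the secret key and the epsilon parameter are all illustrative assumptions, not something any particular law or company prescribes.

```python
import hashlib
import hmac
import random


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same user_id and key always give the same pseudonym, so records
    can still be linked, but the raw identifier is not stored.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Add Laplace noise to a count (a counting query has sensitivity 1).

    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    key = b"hypothetical-secret-kept-separately"  # illustrative key only
    print(pseudonymise("user@example.com", key))
    print(noisy_count(1200, epsilon=0.5, rng=random.Random()))
```

The point of the sketch is the trade-off: the pseudonym is only as safe as the key that produced it, and the noisy count is useful for aggregate statistics while saying little about any single person.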

The seventh is the idea of Data Erasure and Disposal.

Data that is no longer necessary should be disposed of by those collecting it.

There are types of data which should be collected with a specific disposal time attached to them, as indicated by users. Users should be free to ask for closure of their accounts, and the erasure of that data as well.

The last in my list is Data Classification. This is key because we need to have different rules for different types of data, depending on the sensitivity of that data. It is true that the nature of data changes based on context, and over the years; one point I’ve heard often is that location data is not treated as sensitive data anymore, whereas it was many years ago. That doesn’t mean that people don’t choose whom they inform about their location at a point in time. Fingerprints, for example, are a type of data that shouldn’t be stored externally, given that they are irreplaceable and highly sensitive. A public DNA bank, or publicly accessible health records, would be a terrible idea.

The fatalistic attitude towards data that I pointed to at the beginning of this post is misplaced: just because we’re living in an era of massive data generation and collection doesn’t mean that we can’t take steps towards fixing the situation. If anything, our past must inform our future. We need more tape, and not just on our webcams and microphones. Start with addressing data generation and collection.

This article was originally published on Pragati.

Nikhil Pahwa is the Founder and Editor of MediaNama, which reports on technology and tech-policy. He is also the co-Founder of the Internet Freedom Foundation, and the SaveTheInternet.in campaign for Net Neutrality in India.

The views expressed here are those of the author and do not necessarily represent the views of BloombergQuint or its editorial team.