BIG DATA AND FAKE NEWS

December 24, 2019

The digital revolution we have been experiencing in recent years will radically change our lives, in ways we are not yet fully aware of. The scale is extraordinary: 90% of the data in history has been produced in the last 2 years, and in 2020 these numbers will keep growing. This data has a precise name: big data. The term refers to very large volumes of generated data, and it has led to the birth of a new professional figure, the data scientist, who is responsible for extracting the most relevant information from this huge amount of data. In general, the greater the amount of information available, the more accurate the predictions about what is being analysed.

These data not only help the analysis, but also bring significant benefits in different sectors. In the field of health, for example, it will be possible to monitor a person’s vital parameters in real time and prevent any pathologies thanks to the Internet of Things.

Big data also have negative aspects, however. One of them is fake news, which spreads at great speed on social networks and can confuse people. It influences people’s judgment and can even damage a person’s image. Leaders around the world know these risks well, because nowadays they have to fend off attacks from rivals who behave unfairly during election campaigns. Precisely because it is so easy to share opinions on social networks, fake news is generated very quickly and spreads wrong information on a wide variety of topics.
In this article we will talk about the proliferation of big data, which is now part of our daily lives and makes it necessary to protect ourselves from misinformation and fake news.

BIG DATA AND PRIVACY

Despite the GDPR provisions, protecting privacy in a world submerged by big data is becoming increasingly complicated. Recently Agcom, the Italian Communications Authority, found that a large amount of fake news circulated in Italy during the last political elections, and it is trying to prevent a recurrence of the same phenomenon, which is driven by bots or algorithms.

Current, traditional protection schemes are based on a data management model that dates back to the twentieth century and protects the data that people themselves voluntarily release. Big data work differently: they start from the concept of massive data storage, and therefore include not only data released voluntarily but also data derived from individuals’ behaviour. The real commercial potential of big data lies precisely in the latter category, and many privacy issues are linked to it.

This means that we leave traces of ourselves, or rather we are profiled, every time we ask for a loan, shop online, or interact with a smartphone or social media. The Cambridge Analytica case and the non-transparent use of big data highlighted the need for new forms of protection. The GDPR has improved the management of privacy in relation to big data, but today it is still treated as a bureaucratic formality, like the old privacy legislation, and is therefore not always taken into account.

It is necessary to strengthen the GDPR to ensure real protection of privacy in the world of big data. Anonymity no longer guarantees protection, and so-called sensitive data can be used to extract additional value. Through big data, in fact, it is possible to gain access to highly confidential information.
With AI, the distinctions between personal data, sensitive data and data tout court no longer hold. Anyone who processes data and generates new data from the original should declare it with total transparency and be authorized to do so. Each of us is subject to continuous profiling by third parties, which influences the choices that shape our lives. For this reason, explicit regulations and user authorizations are needed, since we are talking not only about privacy but about individual and social freedom.

One example is the reputation rating based on big data, which can serve as a social guarantee but can also become a tool that limits the individual freedom of citizens. The profiling of political ideas and the spread of fake news with the aim of guiding people’s political choices undermine the mechanism of democratic choice. These protections are necessary in order to avoid new forms of totalitarianism based on the control of confidential data.

THE FAKE NEWS PROBLEM

Under its own rules, Facebook removes fake news only if it violates Facebook’s policies or the law, while the most common fake news is sent to a fact-checking agency that has the task of verifying it; in Italy this agency is Pagella Politica. If the (private) agency considers the news potentially false, it is analysed and a notice appears under the news item inviting the user to think twice before sharing it.
Among many initiatives, a Technical Table was established in Italy to guarantee pluralism and correctness of information on digital platforms, with the aim of combating online misinformation. Its members include publishers, audiovisual operators, journalists and web platforms.

The main problem with big data and fake news is that there are currently no binding rules, only self-regulation proposals, and big data are not subject to any jurisdiction. On the one hand, big data have allowed the largest online search and social network platforms to dominate online advertising; on the other hand, these platforms are becoming increasingly important in the information landscape, given that news always passes through search engines and social networks.
Moreover, online advertising is almost the exclusive source of funding for these platforms. This affects the quantity and quality of the information they carry, and it takes on further importance for safeguarding pluralism and privacy.

54.5% of Italians keep informed through tools governed by algorithms, while 39.4% of the population uses websites and traditional media. This leads to a shift in advertising investments, which are moved online, to Google and Facebook.
Policy makers and experts have been working to find a solution to the problem of fake news, but at the moment we only have some simple self-regulations. An analysis of five countries has shown that the manipulation of search engines can significantly affect undecided voters, which shows the obvious political and economic power of the big names on the web.
The story is different for user data, for which specific consent is needed. We are talking about information such as the religious, political and sexual orientation used by Facebook and Instagram (for example) to customize functions and products. Digital players can access not only online information, but also particularly sensitive “offline” information, such as health and banking data.

The problem of fake news has not yet been solved because it is very complex and because digital players keep postponing the solution to better times (as they are not subject to strict rules), while continuing to process our personal data undisturbed, to the detriment of our privacy and more.

The data involved, however, do not only concern individuals, but also include sensitive and strategic data about companies and their corporate strategies, as well as confidential documents of public administrations and security forces.
The problem, therefore, is not only fake news, but also the large and uncontrolled amount of information made available to multinationals.

Even for the editors of the main Italian newspapers, the solution must be the regulation of new technologies and social media, but the path is not easy. Legislation on social networks cannot be national, precisely because everywhere in the world there is the risk that the consumer is linked to a product. To arrive at globally shared laws, a legal, political and social battle is needed against those who hold this data. It is a question of defending people’s freedom from an invasion that puts democracy at serious risk; this is not about censorship but about intervening to defuse such a dangerous situation.

How to recognize fake news

To hinder the spread of fake news, Facebook drew up some guidelines to help users recognize it and stay away from it:

headlines are the element that attracts the most attention, and this is why fake news relies on sensational and exaggerated headlines, often written in capital letters or with too many exclamation marks
a URL very similar to that of another existing website ("Il fatto Quotidaino" in Italy is an emblematic example) is often a clear indication that we are dealing with fake news
images and videos are also used to capture the reader's attention, but they are often retouched to match false news, or they may be authentic but taken out of context. With TinEye it is possible to search by image and verify where an image comes from
websites that spread fake news are often full of typos and anomalous text formatting
it is important to verify that the source that released the news is reliable, in order to be sure of its truthfulness
publication dates are often useful for working out whether we are dealing with fake news. Sometimes old news is republished with the aim of catching new likes on social media, or the dates shown in the article are wrong
if we find references to experts who are not named, or if there is no evidence to support what the news claims, we are probably dealing with a false report. Furthermore, if a piece of news is true, it will almost certainly be reported by multiple sources; if no one else reports it, there is probably a reason
some news seems true, but there are satirical sites that publish fake news to entertain. For this reason, it is good to reflect before sharing news if you are not sure of its truthfulness
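
The first two guidelines above (sensational headlines and look-alike URLs) are simple enough to be sketched in code. The Python snippet below is only an illustration, not a tool used by Facebook or by any fact-checking agency: the function names, the thresholds and the `KNOWN_DOMAINS` list are assumptions chosen for this example, and a real check would need a much larger list of trusted sites and careful tuning.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

# Hypothetical list of trusted domains, assumed for this example only.
KNOWN_DOMAINS = ["ilfattoquotidiano.it"]


def sensational_headline(headline, max_exclamations=1, max_upper_ratio=0.6):
    """Flag headlines with many exclamation marks or mostly capital letters.

    Both thresholds are arbitrary assumptions, not published values.
    """
    letters = [c for c in headline if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return headline.count("!") > max_exclamations or upper_ratio > max_upper_ratio


def lookalike_domain(url, known=KNOWN_DOMAINS, threshold=0.85):
    """Return the trusted domain that the URL's host imitates, or None.

    A host that exactly matches a trusted domain is not flagged.
    """
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in known:
        return None
    for domain in known:
        # Character-level similarity between 0.0 and 1.0.
        if SequenceMatcher(None, host, domain).ratio() >= threshold:
            return domain
    return None


if __name__ == "__main__":
    print(sensational_headline("SHOCKING!!! YOU WON'T BELIEVE THIS"))
    print(lookalike_domain("https://www.ilfattoquotidaino.it/news"))
```

Checks like these can only raise a warning: a headline in capital letters or a similar-looking domain is a hint, not proof, and the other guidelines (source, date, evidence, multiple reports) still require human judgment.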

Data protection and democracy go hand in hand: false or biased news can influence public opinion and voting intentions. Digitized information has therefore become the main resource of every political and economic engine, an instrument through which power and strategy are built.

Big data and algorithms have therefore transformed what was an ethical and individual problem into a private and public security problem. Fake news is perceived as something new precisely because of the effects it has. The inability to identify the author is the main difference between fake news and classical disinformation. The possibility of disseminating targeted news is a further risk. This occurs when the disinformation agent holds information about the targets based on accurate profiling of their habits and interests. These data derive from likes, purchases, searches and so on, i.e. all the metadata useful for building a user profile and organizing effective campaigns, even ones based on fake news.

Finding legal solutions to prevent the dispersion of this information, collected for malicious purposes, is a problem that affects national security. If the data of millions of users are handled or spread in an unsafe way, the democratic balance is seriously endangered.
Ultimately, citizens should always be aware of the risks involved in disclosing their sensitive data in exchange for services or goods.
