I was wondering which of our data come closest to our true self: data are pieces of information, traces of our path on the internet. I have always thought that social media are an underrated tool of self-expression, since each of us owns a platform that gives us unprecedented opportunities to share information about ourselves. But the urge to manipulate our virtual persona for the different publics on the various social media makes it harder to identify where we feel most comfortable sharing our most spontaneous and sincere thoughts, our most intimate events. The first thing that came to my mind was the dichotomy between what I would write in a status and what I would write, on the same social media, in a private message to a friend. Imagine for one second that, through an unfortunate mistake, you posted a very intimate confession to a friend as a public status: anyone would feel extremely uncomfortable even just thinking about such a little social disaster. But is that the most intimate narration of ourselves we share with a public? We are used to thinking of the word public as a noun, as a sum of individuals, while the definition of the adjective public is "available for anyone to hear, watch, go to, or be involved in".
This definition lets us broaden the concept. In this sense, Google itself becomes a public.
We forget that, and the fact that we don't think of the words we type as information we are actually sharing allows us to be our truer self. Our worst self, some survey results might argue, but that is another territory, too wide to be tackled in this context. Everybody Lies, by Seth Stephens-Davidowitz, offers some enlightening considerations on this, which I will present below. In order to define the data-self on social media, it is of great importance to separate it from our Google-self and to try to understand how big data work.
In its introduction, Seth Stephens-Davidowitz declares that ever since philosophers speculated about a “cerebroscope”, a mythical device that would display a person’s thoughts on a screen, social scientists have been looking for tools to expose the workings of human nature: rating scales, reaction times, pupil dilation, functional neuroimaging... Yet none of these methods provides an unobstructed view into the mind.
Human thoughts are complex propositions, but propositions in all their tangled multidimensional glory are difficult for a scientist to analyze. Sure, when people pour their hearts out, we apprehend the richness of their stream of consciousness, but monologues are not the ideal dataset for testing hypotheses. And I would personally argue that even the stream of consciousness of a person opening up to someone might be manipulated in order to get closer to the tailor-made reaction that person craves.
On the other hand, if we trust scientific experiments we can do statistics, but we’ve pureed the complex texture of cognition into a single number. Even the most sophisticated neuroimaging methodologies can tell us how a thought is splayed out in 3-D space, but not what the thought consists of. When and where people search for facts, quotes, likes, places, persons, things, or help, it turns out, can tell us a lot more about what they really think, really desire, and really fear than anyone might have guessed. The everyday act of typing a word or a phrase into a compact, rectangular white box leaves a small trace of truth that, when multiplied by millions, eventually reveals profound realities. Google Trends will provide data only when lots of people make the same search. If you look for your own name, you might get the response: “We’re sorry, there is not enough search volume”. This is a very intuitive way of understanding when data become big data: when they become useful in recognizing patterns.
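The idea that a search only becomes data once enough people make it can be pictured with a small sketch. This is only an illustration under stated assumptions: the cutoff `MIN_VOLUME`, the function `search_volume`, and the sample log are all made up for this example, and nothing here reflects how Google Trends actually works internally.

```python
from collections import Counter

# Hypothetical cutoff: the real threshold used by Google Trends is not public.
MIN_VOLUME = 3


def search_volume(queries, term, min_volume=MIN_VOLUME):
    """Return how often `term` was searched, or None if it is too rare to report."""
    # Normalize case and surrounding spaces so "Weather " and "weather" count together.
    counts = Counter(q.strip().lower() for q in queries)
    volume = counts[term.strip().lower()]
    return volume if volume >= min_volume else None


# Made-up search log, for illustration only.
log = ["weather", "Weather", "weather ", "jane doe", "weather"]

print(search_volume(log, "weather"))   # common search: a count is reported
print(search_volume(log, "jane doe"))  # rare search: not enough volume
```

A single person's search for their own name stays invisible, while the same words typed by many people turn into a pattern that can be reported.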
The intimacy that Google has been able to achieve is making surveys obsolete. During the Obama election, there was a darkness and hatred that was hidden from the traditional sources but was actually quite apparent in the searches that people made. It's not just about politics and polls: surveys cannot be trusted to tell us the truth about our sex lives, for example. Google searches give a far more accurate picture of sex during marriage than surveys; on Google, the top complaint about a marriage is not having sex.
People frequently lie - to themselves and to others. Seth Stephens-Davidowitz is now convinced that Google searches are the most important dataset ever collected on the human psyche and that the new data increasingly available in our digital age will radically expand our understanding of humankind. The microscope showed us there is more to a drop of pond water than we think we see. The telescope showed us there is more to the night sky than we think we see. And new, digital data now shows us there is more to human society than we think we see.
Some of this data will include information that would otherwise never be admitted to anybody. If we aggregate it all, keep it anonymous to make sure we never know about the fears, desires, and behaviours of any specific individual, and add some data science, we start to get a new look at human beings - their behaviours, their desires, their natures.
Intuition on steroids
Too many data scientists today are accumulating massive sets of data and telling us very little of importance - e.g., that the Knicks are popular in New York. Too many businesses are drowning in data. They have lots of terabytes but few major insights. The size of the dataset is frequently overrated. The smartest Big Data companies are often cutting down their data. At Google, major decisions are based on only a tiny sampling of all their data. You don’t always need a ton of data; you need the right data. A major reason that Google searches are so valuable is not that there are so many of them: it is that people are so honest in them. People lie to friends, lovers, doctors, surveys and themselves. But on Google they might share embarrassing information about, among other things, their sexless marriage, their mental health issues, their insecurities.
Data is playing an increasingly important role in all of our lives - and its role is going to get larger. Newspapers now have full sections devoted to data. Many people think that a quantitative understanding of the world is for a select few left-brained prodigies, not for them. Yet good data science is less complicated than people think. The best data science is surprisingly intuitive: it is about spotting patterns and predicting how one variable will affect another. People do this all the time.
While the methodology of good data science is often intuitive, the results are frequently counterintuitive. Data science takes a natural and intuitive human process and injects it with steroids, potentially showing us that the world works in a completely different way from how we thought it did.
The truth about your Facebook friends
Many Big Data sources, such as Facebook, are often the opposite of digital truth serum. On social media, as in surveys, you have no incentive to tell the truth: you have a larger incentive to make yourself look good. This is because you have an audience. An example: The Atlantic and the National Enquirer have very different reputations but roughly the same circulation. On Facebook, however, roughly 1.5 million people either like The Atlantic or discuss its articles, while only about 50,000 like the Enquirer or discuss its contents. In Facebook world, a girlfriend posts twenty-six happy pictures from her getaway with her boyfriend. In the real world, immediately after posting this, she Googles “my boyfriend won’t have sex with me.”