During one of our home learning sessions, I gave my daughter an old ‘connect-the-dots’ book. She practised her counting, number recognition, and drawing skills. She loved it. Fairies, unicorns, and rainbows appeared in front of her. Magical. I was happy too because, frankly, it kept her busy for 10 minutes so I could do some spelling with my son.
It brought back memories of the dot-to-dot books I did years ago. I remember some that had so many dots I could only guess what the image was before I started. Most of the time, I guessed wrong. The really good ones kept me guessing the whole time, with the big reveal coming at the end.
Individually, each dot told me nothing about the end picture. Sometimes even seeing all the dots gave no clues. Sometimes a few of the dots already had lines drawn in, connecting numbers out of sequence (so dot 4 had a line connecting it to dot 46). These were always teasers; my brain would focus on them, wondering why. However, once I followed the numbers, the connections and the image came to life. The same is true of other activities, such as mazes.
And this is true of data. We can anonymise datasets so that, by themselves, they tell us nothing about a specific person. Effectively, each dataset is like a dot on my daughter’s page. And we do this with lots of datasets, creating lots of dots. Sometimes a connection has already been made: a public dataset, or data from Facebook or LinkedIn, that connects otherwise unrelated dots.
And slowly we connect data with other data with other data and the image emerges. The anonymised datasets are no longer anonymous. This is called the Mosaic Effect. Sometimes this takes a long time, but with the amount of data available and the power of analytic software, the time is becoming shorter and shorter and shorter.
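To make the dot-connecting concrete, here is a minimal sketch in Python of how two datasets might be linked. Everything in it is made up for illustration: the records, the names, the field names, and the choice of postcode, birth year, and gender as the shared fields are all my assumptions, not taken from any real dataset. Neither table on its own ties a name to a diagnosis, but matching on the fields they share does.

```python
# Hypothetical "anonymised" health data: no names, just a few attributes.
health_records = [
    {"postcode": "AB1 2CD", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "AB1 2CD", "birth_year": 1975, "gender": "M", "diagnosis": "diabetes"},
    {"postcode": "EF3 4GH", "birth_year": 1980, "gender": "F", "diagnosis": "migraine"},
]

# Hypothetical public profiles: names, but nothing sensitive.
public_profiles = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1980, "gender": "F"},
    {"name": "Bob Example", "postcode": "AB1 2CD", "birth_year": 1975, "gender": "M"},
    {"name": "Carol Example", "postcode": "ZZ9 9ZZ", "birth_year": 1990, "gender": "F"},
]

# Assumed shared fields that act as the "numbers" joining the dots.
SHARED_FIELDS = ("postcode", "birth_year", "gender")


def shared_key(record):
    """Build a comparable key from the fields both datasets have in common."""
    return tuple(record[field] for field in SHARED_FIELDS)


def link(anonymised, public):
    """Return (name, diagnosis) pairs where exactly one public profile matches."""
    # Group public profiles by their shared-field key.
    profiles_by_key = {}
    for profile in public:
        profiles_by_key.setdefault(shared_key(profile), []).append(profile)

    matches = []
    for row in anonymised:
        candidates = profiles_by_key.get(shared_key(row), [])
        # Only a single candidate counts as a match in this sketch.
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], row["diagnosis"]))
    return matches


reidentified = link(health_records, public_profiles)
print(reidentified)
```

In this toy example two of the three "anonymous" health records end up with a name attached. Real linkage is messier (fuzzy matches, missing values, many more datasets), but the principle is the same as following the numbers on the page.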
However, the lure and false security of anonymised data persist. And so we add another dot to the page, and one more…