12 Dec 2016
Team Sherlock began the scenario with one clue: the leaked documents about fracking chemicals had been sent to firstname.lastname@example.org.
With access to the kind of metadata that has been retained and made available to Australian government agencies for the past year, the team of three primary school students were then able to track down the mock corporate whistleblower in two hours.
They were part of a ‘cyber fox hunt’ co-hosted by University of Melbourne to explore how Australia’s 2015 metadata laws affect our privacy. In the scenario, twelve teams used software to filter through a database of mobile, internet and location metadata. All but one team tracked down the home address of the whistleblower, and the winning team took just one hour.
“It was a lot easier than I expected,” she said.
“Basically what happened was we found the data that had the Google searches and the ones that corresponded with searches the whistleblower would use.
We then found the IP address they used with the Google searches and we linked the IP address to their email. We used the email to find their phone number and their address.”
Dr Suelette Dreyfus, a Melbourne Uni technology researcher and privacy expert who helped organise the ‘snitch hunt’ at the weekend, said she was “shocked, surprised and slightly horrified” by how quickly the teams found the whistleblower.
“It illustrates this data is easy to get – agencies don’t need to have a warrant,” she said.
You can have a go at the scenario here, using the analysis tools to try and find the whistleblower. Or read below for a step-by-step guide. (The screenshots aren’t of what people would have searched for in the scenario, but they are of the same kinds of searches).
Step One: Search Google for suspicious searches
The scenario is about an employee at ‘Minecorp’ emailing confidential information to an investigative journalist at ‘MineWatch’.
You are a police data analyst who is told to identify and arrest the employee. “May I remind you the mines in Australia are all critical infrastructure, and those leaked docs cannot get into the wrong hands,” your boss tells you.
You can sift through four sets of information.
One set is a search query log that represents information that would be held by Google and other search engines about what their users have searched for.
You can find the IP address of anyone who has searched for log for “MineWatch” (the news website) or “Anna Dupont” (the name of the journalist who wrote the story).
Through a combination of searches you can narrow it down to a few likely IP addresses.
Step Two: Link the IP address with an email address
Once you have the IP, you can also search the database for what email address has been using that IP.
Step Three: Use email address to access address and phone number
Once you know the email, you can search the telcos’ customer list for a phone number and the billing and postal address. (When you activate a SIM card you have to provide photo ID with an address).
The email and phone metadata logs also contain information about who has emailed who, and who has phoned who, and when. These are kept by telcos and made available without a warrant to 22 agencies. You can now confirm an email was sent from the address to ‘email@example.com’.
Step Four: Use phone number to get a recent location
The phone metadata also gives a record of where the suspected whistleblower has travelled. This is based on cell tower records.
You can use this to predict where they will be, and then intercept them to make an arrest.
Metadata isn’t only about knowing a person’s past movements, it also about predicting their future activities. That’s because most of us live according to routine. Once you know the pattern, it’s easy to work out where a person will be on a given day.
“There’s been research that’s shown nearly half of most people’s everyday activities follow a pretty regular pattern,” Dr Dreyfus said.
“You go to your university class every Tuesday or you go to your job every Monday morning or Mum’s house for dinner every Saturday night. Those patterns become a way of identifying people. One study that was done looked at the geolocation data points of more than a million people. It found that four space and time location points were all you needed to uniquely identify 95 per cent of people.”
In a graphic illustration of how this works, the US’s armed drone program identifies targets through their metadata. People are killed on the basis their metadata fits pattern of a terrorist. “We know former head of the National Security Agency said ‘We kill people based on metadata’,” Dr Dreyfus said.
Step Five: Find out who else has contacted the journalist
Using a graphing tool, you can interpret the metadata to understand relationships and social connections – you can use this to identify the journalist’s other sources.
In April, the AFP admitted it had sought access to a Guardian reporter’s metadata without a warrant in an attempt to hunt down his sources.
The Snitch Hunt used data analysis software that might be similar to that used by the AFP, Dr Dreyfus said.
Gen, the 12-year-old from Team Sherlock, said she had expected it would be harder to find the whistleblower.
“It’s interesting how easy it is to find small pieces of data, and then linking them you can find out so much about a person.”
Her brother, Miles, 10, said that it was fun and his team beat half the adults.
The Snitch Hunt is co-sponsored by partners ThoughtWorks, CryptoParty Sydney, the Platypus Initiative, Hack for Privacy, Blueprint for Free Speech, Digital Rights Watch and Electronic Frontiers Australia.