Data-Driven Investigation Leads to Second Ongoing Offense


In 2015, a complaint was received from an ophthalmologist concerning inconsistent refund amounts that had begun appearing in her financial system. The ophthalmologist explained that under the new Ontario Health Insurance Plan (OHIP) rules, her office charges $75 for an adult eye exam. If an ocular-related disease is discovered, then OHIP will cover the exam and the $75 charge is refunded.

According to Michael Akpata, Team Lead, Investigations and Counterfraud i2 National Sales at IBM Canada, it's normal to see debits and credits of $75 in multiple accounts. He shared this case with a room full of Pre-Conference attendees at the 2018 ACFE Fraud Conference Canada in Ottawa.

The complainant produced a transaction statement that identified two refunds in amounts that weren't consistent with the regular scope of business. The credits had been placed on two transaction cards, and these cards appeared on the transaction statement several times. During an investigation for fraud and theft, multiple data sources were analyzed to determine the true extent of the offense. These data sources were both traditional data sets (known data sets) and nontraditional data (data collected through technology).
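The kind of check that surfaces such a pattern can be sketched in a few lines of Python. This is only an illustration, not the method used in the case: the card labels, the sample amounts and the one-refund-per-card threshold are all hypothetical, and only the $75 exam fee comes from the complaint described above.

```python
from collections import defaultdict
from decimal import Decimal

# The $75 exam fee is taken from the article; everything else here is assumed.
EXPECTED_REFUND = Decimal("75.00")
# Assumption: a patient's card would normally receive a single exam refund.
MAX_REFUNDS_PER_CARD = 1


def flag_refunds(refunds):
    """Return {card: [reasons]} for cards whose refund history looks unusual.

    `refunds` is an iterable of (card_id, amount) pairs, with amounts
    given as strings so they convert to Decimal without rounding error.
    """
    per_card = defaultdict(list)
    for card, amount in refunds:
        per_card[card].append(Decimal(amount))

    flagged = {}
    for card, amounts in per_card.items():
        reasons = []
        # The same card receiving refunds again and again.
        if len(amounts) > MAX_REFUNDS_PER_CARD:
            reasons.append(f"{len(amounts)} refunds on one card")
        # Refund amounts that do not match the regular exam fee.
        odd = [a for a in amounts if a != EXPECTED_REFUND]
        if odd:
            reasons.append(f"{len(odd)} refund(s) not equal to {EXPECTED_REFUND}")
        if reasons:
            flagged[card] = reasons
    return flagged


# Hypothetical sample data, invented purely for illustration.
sample = [
    ("card-A", "75.00"),
    ("card-B", "75.00"),
    ("card-B", "75.00"),
    ("card-B", "150.00"),
    ("card-C", "75.00"),
]
flagged = flag_refunds(sample)
for card, reasons in flagged.items():
    print(card, "->", "; ".join(reasons))
```

A check like this only points to where to look. As the rest of the case shows, the flagged transactions still had to be interpreted alongside other data sets before the investigation reached the right person.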

Akpata had been pursuing the case using traditional data sets. “This is the thing that was difficult for me: I went down a number of paths based on my presuppositions,” he explained. Akpata had been pursuing paths of conspiracy and other charges using manual investigative techniques that pinned fraud on an innocent party. “When it was all brought together in a manageable and malleable path, the amalgamation of unrelated data sets correctly interpreted led to the discovery of other offenses. Only by closing the data loop was the investigation completed.”

This data-driven investigation led to the discovery of a second ongoing offense of theft by a power of attorney — the daughter of a woman in an assisted-living facility. The daughter slowly depleted her mother’s account of $330,00 over the course of six years. She then stole money from her employer, the ophthalmologist, and transferred it back into her mother’s account $75 at a time. By analyzing multiple data sets, investigators were able to examine both incidents.

“This turned into a $228,000 fraud,” Akpata said. “This case forced me to pivot my investigative position. It wasn’t until I layered the other disparate data sets on top that it changed the investigation.”

Big data can help fraud examiners conduct quality investigations, but what happens when they’re inundated with massive amounts of information? Akpata first detailed the different dimensions of big data:

  1. Volume: The amount of data that is generated from all points and all sources every day across the planet. This data is so large, substantial and overwhelming that it cannot be analyzed with conventional database technology.

  2. Velocity: The speed at which data is generated. Using social media as an example, imagine the number of tweets, messages and other interactions that take place in cyberspace.

  3. Variety: This may be the bane of a fraud examiner’s existence. There are two common types of data sources: structured and unstructured. Structured data, as the name implies, fits into a defined layout, while unstructured data does not follow any such fixed format. To complicate matters, most data that is produced is unstructured.

  4. Veracity: Fraud examiners must look at the different ways that truth may be expressed in their data. As data are gleaned from systems, validation processes are used to confirm that they are accurate and trustworthy.

  5. Value: The data’s worth. The value of data is tied directly to what can be extrapolated from it. For example, a massive amount of cat picture data is useless until one can examine the metadata and geolocation information indicating where the images were taken and where they were posted.

With big data’s complexities in mind, Akpata explained that fraud examiners must then think about how to actually solve the problem using data, as in the opening case. “I was at one of my son’s hockey games and my daughter opened up her notebook with a grade school chart that broke down how to solve a problem and I thought, ‘oh! If I had this, I would put it up on a white board in a room full of fraud examiners.’”

His daughter’s chart included problem-solving strategies like using tables and charts, drawing out the problem, simplifying the problem and working backwards. “Every single fraud means the money went somewhere. What hole in our system did they find? Work backwards,” Akpata said. “How did the fraudster know how to do that? Dumb it down, make it simple.”

He then went to the audience to ask what problems they experience in their different fields. His first participant told the crowd he’s self-employed, his clients are mainly in e-commerce and he’s here to learn best practices for data analytics.

“What does the data look like?” Akpata asked. “Your clients are e-commerce, so let’s look at bitcoin. It’s asynchronous data so you need to use a tool to analyze asynchronous data. Using standard data analytics won’t work for you.”

Fraud examiners must understand that analytics will propel the investigation forward to the legal position of “reasonable grounds to believe”; however, the best analytics will not aid a flawed investigation. Fraud examiners should always pursue the standard principles of a quality investigation while using the data-based model.