Catfishing, the act of pretending to be someone you’re not to deceive people over the internet, can be caught early on by machine learning systems to prevent online romance scams.
Dating websites and apps are the best place for catfishers to set up shop. All you have to do is create a convincing profile. Make it appealing by adding attractive photos and writing a little bit about your fictional self. Perhaps you like long walks on the beach or reading poetry. Maybe you’ll catch the attention of potential suitors right away, or you might have to do some legwork yourself by reaching out to other people first.
Send a sweet message and wait. Victims will take the bait and a conversation is sparked. Eventually a rapport is built and the catfisher can strike, pulling all sorts of tricks to win the other person’s heart and wallet – asking for funds to buy a ticket to see them or sort out a visa are the most common scams. Figures from the FBI’s 2018 Internet Crime Report, its most recent, indicate that thousands of people fell prey to online romance fraud that cost hundreds of millions of dollars.
So a group of researchers from King’s College London and the University of Bristol in the UK, Boston University in the US, and Australia’s University of Melbourne decided to tackle the problem using AI.
“Our work presents the first system for automatically detecting this fraud. Our aim is to provide an early detection system to stop romance scammers as they create fraudulent profiles or before they engage with potential victims,” they wrote in the abstract of a paper hosted on arXiv.
Dating ‘N More and its scamlist
The team trained a neural network to analyze dating profiles scraped from the free dating website Dating ‘N More. The dataset contained 14,720 real profiles from ordinary people, and 5,402 profiles tagged as fake. They used 60 per cent of the data for training, 20 per cent for validation, and the other 20 per cent for testing.
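For the curious, a 60/20/20 split like the one described above can be sketched in a few lines of Python. This is only an illustration: the profile counts come from the article, but the `split_dataset` helper, the random shuffle, and the seed are assumptions, not the researchers' actual code.

```python
import random

REAL_COUNT = 14_720  # real profiles, as reported in the article
FAKE_COUNT = 5_402   # profiles tagged as fake, as reported in the article


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and cut them into train, validation and test lists.

    Hypothetical helper: whatever is left after the training and
    validation slices becomes the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Stand-in records: (label, index) pairs rather than real profile data.
profiles = [("real", i) for i in range(REAL_COUNT)]
profiles += [("fake", i) for i in range(FAKE_COUNT)]

train, val, test = split_dataset(profiles)
print(len(train), len(val), len(test))
```

Holding back a validation set lets the model be tuned without touching the test profiles, so the final accuracy figure is measured on data the system has never seen.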
Dating ‘N More may not be the most popular romance service out there, but it provided the researchers with unique data. It boasts that there are no scammers on its site and prevents scroungers registering to the platform by posting their failed profiles publicly so that its users are aware. The list of fake profiles allowed the researchers to look for common patterns that catfishers use to attract people over the internet.
These features were then used to train a neural network classifier to sniff out fraudulent profiles. The system is made up of a convolutional neural network to analyze profile pictures and a language-generating recurrent neural network to study the text in people’s biographies. The system looks at a person’s age, gender, ethnicity, occupation, marital status, and country of residence.
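The article doesn't say how the outputs of the separate parts are merged into one verdict. As a rough sketch, assuming each part produces a probability that a profile is fake, one simple option is to average them and apply a cut-off. The `ComponentScores` class, the `combine_scores` function, the equal weighting, and the 0.5 threshold are all hypothetical, and the paper may well do something different.

```python
from dataclasses import dataclass


@dataclass
class ComponentScores:
    """Hypothetical per-component probabilities that a profile is fake."""
    demographics: float  # age, gender, occupation and so on
    images: float        # from the picture-analysis network
    description: float   # from the biography-text network


def combine_scores(scores, threshold=0.5):
    """Average the three scores and flag the profile if above threshold.

    Equal weighting is an assumption made for illustration only.
    """
    average = (scores.demographics + scores.images + scores.description) / 3
    return average, average >= threshold


# Made-up scores for a single imaginary profile.
probability, is_fake = combine_scores(ComponentScores(0.9, 0.6, 0.75))
print(f"fake probability: {probability:.2f}, flagged: {is_fake}")
```

Averaging means one weak signal, such as a bland biography, doesn't sink the verdict if the other parts are suspicious enough.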
What they found was pretty interesting. The gender split in fake profiles was about 60 per cent men and 40 per cent women. There are slight differences in details between a fake profile for a man and a woman. These fake men reported an average age of 50, whereas for women it was younger at 30.
The top three occupations for these middle-aged men were being in the military, an engineer, or self-employed. For the women, however, it was being a student, self-employed, or a carer. Scammers are more likely to report being Caucasian and living in the US or Western European countries such as the UK or Germany. They are also more likely to put themselves down as widowed or single.
There are more generic patterns that scammers employ to make their profiles more alluring regardless of what gender they pretend to be. Sham profiles contain more pictures and text with “emotive language” such as people describing themselves as “caring,” or “passionate,” as well as “loving.”
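To give a flavour of how "emotive language" might become a number a classifier can use, here is a toy feature that measures the share of words in a biography that appear on an emotive word list. The three words are the examples quoted above; the `emotive_word_ratio` function and the simple tokenizer are assumptions, and the researchers' real text model is far more sophisticated than a word count.

```python
import re

# The three example words quoted in the article; a real list would be longer.
EMOTIVE_WORDS = {"caring", "passionate", "loving"}


def emotive_word_ratio(text):
    """Return the fraction of words in text that are on the emotive list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in EMOTIVE_WORDS)
    return hits / len(words)


# Invented biography text, not taken from the data set.
sample = "I am a caring, passionate and loving man"
ratio = emotive_word_ratio(sample)
print(f"emotive ratio: {ratio:.3f}")
```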
The convolutional neural network part studied profile pictures and assigned short caption descriptions to the images. The researchers discovered that there was a higher chance of the images being fake if they contained more than one person, children, food and/or animals.
93 per cent accuracy
When the team trained their system to recognise these features, it could scrutinize a profile and estimate the probability that it was real or fake with about 93 per cent accuracy.
It’s not perfect. Sometimes profiles are misclassified: real ones can be tagged as fake, and fake ones can slip past if they’re judged to be real. Smarter scammers who avoid all the common tropes can evade the detector and go on to send cringey messages like this one from the data set.
“I must confess to you, you look charming and from all I read on your profile Id want you to be my one special woman. I wish to build a one big happy family around you. I’m widowed with two girls, Emily and Mary, I lost their mom some years ago and since then I’ve been celibate. I think you’ve got all it takes to fill the vacuum left by my late wife to me and the kids. I seek to grow old with you, children everywhere and grey hairs on our head. I wanna love you for a life time. Hope to read from you soon. You can add me on Facebook or mail me at @yahoo.com. Looking forward to your message. Regards, Larry.”
Larry’s profile was missed by the system, but if there was a way to look at private messages then it’d be even more accurate, the researchers noted.
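A single accuracy figure hides the two kinds of mistake described above: real users wrongly flagged, and scammers like Larry who slip through. The sketch below counts both on a tiny set of invented labels. The labels and the `confusion_counts` helper are purely illustrative and are not the study's results.

```python
def confusion_counts(actual, predicted):
    """Tally outcomes, where True means 'fake profile'."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for is_fake, flagged in zip(actual, predicted):
        if is_fake and flagged:
            counts["tp"] += 1   # scammer caught
        elif not is_fake and flagged:
            counts["fp"] += 1   # real user wrongly flagged
        elif is_fake and not flagged:
            counts["fn"] += 1   # scammer missed
        else:
            counts["tn"] += 1   # real user left alone
    return counts


# Ten invented profiles: three fake, seven real.
actual = [True, True, True, False, False, False, False, False, False, False]
predicted = [True, True, False, False, False, False, False, False, False, True]

counts = confusion_counts(actual, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
print(counts, f"accuracy: {accuracy:.0%}")
```

Which error matters more is a judgment call for the site: a false positive annoys a genuine user, while a false negative leaves a scammer free to start messaging.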
These systems may be impressive and actually pretty useful, but they’re probably not very generalizable. A detector trained to sniff out scammers on Dating ‘N More won’t necessarily work on other platforms like OkCupid or Tinder.
The software depends on identifying common features in scammer profiles, and these can vary on different dating sites. The average age on Tinder is younger, for example, so people are probably less interested in having a relationship with a 50-year-old widowed soldier, and these types of profiles probably aren’t as effective.
“As future directions, we aim to more broadly examine the available data on online dating fraud, seeking information actionable for enforcement and other countermeasures. We also hope to explore the question of how, at a local level, interventions designed to warn and protect users from scammers can avoid forming dependencies that reduce awareness,” the paper concluded.
If you want to take a look at their gold-digger-catching code, it’s here on GitHub. ®