They seem to come in waves: articles about the preponderance of fake followers on celebrity accounts (Lady Gaga, Barack Obama), or about the proliferation of fake Twitter accounts. And the evidence often comes from one of the automated fake-follower tools on the web, most notably StatusPeople or SocialBakers.

But how accurate are these tools at identifying fake accounts? Probably not as accurate as you might think.

What’s in a number?

At first glance, there’s a lot of consistency between the tools. Consider two assessments of the @leaderswest Twitter account:

[Screenshot: StatusPeople fake follower check for @leaderswest]

[Screenshot: SocialBakers fake follower check for @leaderswest]

Each tool determined that of my 20,000 followers, about 600 may be “fake.” But the SocialBakers tool goes one step further and proposes “fake” accounts to block:

[Screenshot: SocialBakers suggested block list]

To SocialBakers’ credit, they list the criteria for inclusion on this list and say specifically:

“We understand that these criteria, number 6 in particular, don’t necessarily define fake followers. However these kinds of followers can be considered empty or inactive and therefore not helpful to you in terms of reach.”

But let’s look at the first “fake” follower @tweetcaroline:

[Screenshot: @tweetcaroline’s Twitter profile]

[Screenshot: Caroline Barry’s Klout profile]

[Screenshot: @tweetcaroline’s fake follower breakdown]

Caroline appears to have been miscategorized as a “fake” account because she uses the paper.li content aggregator, and the standard paper.li tweet repeats the same phrase across multiple tweets. But @tweetcaroline most certainly isn’t a fake account, and this kind of automated repetition is a common reason legitimate accounts get flagged as “fake” by the SocialBakers tool.
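To see how easily a rule like this can misfire, here is a minimal sketch in Python of a repeated-content heuristic. This is not SocialBakers’ actual algorithm; the phrase matching, the threshold, and the sample tweets are all my own assumptions. But it shows why an aggregator that posts the same template every day can look “fake” to this kind of check.

```python
from collections import Counter

def looks_fake(tweets, repeat_threshold=0.5):
    """Flag an account if a large share of its recent tweets open with the
    same phrase -- a crude stand-in for a 'repeated content' criterion."""
    if not tweets:
        return True  # no activity at all
    # Use the first few words of each tweet as its "phrase"
    phrases = [" ".join(t.lower().split()[:4]) for t in tweets]
    most_common_count = Counter(phrases).most_common(1)[0][1]
    return most_common_count / len(tweets) >= repeat_threshold

# A legitimate paper.li user: the aggregator posts the same template daily
caroline = [
    "The Caroline Daily is out! Stories via @a @b",
    "The Caroline Daily is out! Stories via @c @d",
    "The Caroline Daily is out! Stories via @e @f",
    "Great meeting everyone at the conference today!",
]
print(looks_fake(caroline))  # True -- a false positive on a real person
```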

Assumptions for fake Twitter followers

Anytime you make assumptions about a large population, slight imprecision can produce hugely inaccurate results. A recent (and more serious) example is the Excel errors two renowned economists made in the calculations behind European austerity measures: policies implemented over the last few years rested on small miscalculations that produced huge errors.

The percentage of fake Twitter followers will never approach the seriousness of austerity policy, but the same principle applies: assumptions about a group of two hundred million people, however slight, can result in big errors.
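To put rough numbers on that, here is a back-of-the-envelope sketch. The 200 million figure is the population size mentioned above; the error rates are illustrative assumptions, not measured accuracy of any tool.

```python
# Back-of-the-envelope: how a small per-account error scales across Twitter.
# The error rates below are illustrative assumptions, not measurements.
total_accounts = 200_000_000

for error_rate in (0.001, 0.005, 0.03):  # 0.1%, 0.5%, 3%
    misclassified = total_accounts * error_rate
    print(f"{error_rate:.1%} error rate -> {misclassified:,.0f} accounts mislabeled")

# 0.1% error rate -> 200,000 accounts mislabeled
# 0.5% error rate -> 1,000,000 accounts mislabeled
# 3.0% error rate -> 6,000,000 accounts mislabeled
```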

What do you think? Are these tools useful? Are they accurate?