Online social influence is one of those phenomena that are hard to define, but we “know it when we see it.”

And social influence is even harder to track than it is to define. Businesses are becoming increasingly social in their marketing, sales and customer service, using a wide range of strategies, tactics and platforms. Some work, and some work better than others.

How big a challenge is measuring social influence online? The answer lies in why we’re asking the question. Do we want to know who influences whom, and in what ways, to get people to buy a certain car or vote for a certain political candidate? If that’s the case, we’re in for a wild ride, because the psychology of individual choice is wide, deep and rich. We can understand social influence in its correlations: when certain influencers say something, we can see a correlated set of responses occurring.

But correlation isn’t the same as causality. Proving causality means you can show that when certain influencers say something, it causes specific responses. This is not measurement, it’s attribution. And attribution is the real proof of social influence.
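To make the distinction concrete, here is a minimal sketch in Python. All of the numbers are invented for illustration, and the variable names (news_event, influencer_posts, audience_mentions) are hypothetical. The assumption is that an outside news event drives both the influencer's posting and the audience's chatter, so the two series move together even though neither is shown to cause the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented daily data (assumption): 1 marks a day with a big news event.
news_event = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
influencer_posts = [1, 0, 5, 1, 4, 6, 0, 1, 5, 1]
audience_mentions = [10, 12, 60, 11, 55, 70, 9, 10, 58, 12]

# Across all days, posts and mentions rise and fall together.
r_all = pearson(influencer_posts, audience_mentions)

# Hold the news event fixed by looking only at quiet days.
quiet = [i for i, event in enumerate(news_event) if event == 0]
r_quiet = pearson(
    [influencer_posts[i] for i in quiet],
    [audience_mentions[i] for i in quiet],
)

print(f"all days:   r = {r_all:.2f}")
print(f"quiet days: r = {r_quiet:.2f}")
```

In this made-up data the correlation is strong across all days and weak once the news days are set aside. That is the gap between measuring a correlation and attributing a cause: the first number alone says nothing about whether the influencer moved anyone.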

Klout and its mongrel brethren have built simple models defining “influencers” as people whose social messages are often repeated by others. Klout then parses these influencers into buckets numbered from 1 to 100 and sells them to Virgin Airlines and other corporations like a PR agency in reverse.

I’ve wondered what the respected minds in data science think about the challenge of defining, capturing and measuring online social influence.  I’ve set out to meet and talk with people who are trying to understand and attribute social causality on the largest scales.

Gilad Elbaz, a Google alum, has a startup called Factual Inc., which the New York Times profiled last Sunday. The goal of Factual leaves one gobsmacked: to capture and normalize all the data in the world. All the data, baby. I’m sure that will take a little time and present enough challenges to keep armies of experts busy on the cutting edges of technology, semantics, and library science.

Gilad Elbaz (courtesy Datafile.com)

I’ve been challenged by some to point to organizations and people who are going about the business of measuring social influence and using that to create understanding and then actions based on that understanding.  If Klout is wrong, who is right?

One place to start is in Austin, Texas, twenty-six floors above Congress Avenue, in the aviary headquarters of Dachis Group. Here, among half-finished offices lined with whiteboards filled to overflowing, are people working to understand the causality in social influence: how can we accurately attribute real-world actions to social communications?

Jeff Dachis founded Dachis Group in 2008 in Austin.  The idea behind Dachis Group is consulting with clients about what he calls “Social Business,” a sweeping change he sees as corporations become more transparent and social.  With this change comes the opportunity to build strong, direct relationships with customers, participating in the marketplace of social conversation.

Dachis is best known as the founder of Razorfish, now the lead digital agency owned by advertising giant Publicis Groupe. He saw Razorfish through a successful IPO in 1998, and then the dot-bomb crash, before leaving in 2001. Now he’s on to social, and Dachis Group has become a leading strategic consulting firm to Fortune 500 companies on social media, transparency and authenticity. Interestingly, they consider themselves to be foremost a SaaS and solutions provider.

As a foundation for that strategy, a year ago Dachis announced an ambitious project, the Dachis Social Business Index (SBI), a data platform that could ingest enormous amounts of social network messages and use them to provide detailed analysis of how brands and corporations are talked about in the vast social communications ecosystem.

What nascent start-ups like Kred, Klout and PROScore are trying to do to measure the “social influence” of people, the SBI is doing in the online world for global corporations. And the scale, complexity and analytical capabilities of the SBI platform are close to staggering. Corporations pay well to collaborate with Dachis on the SBI as it adds more and more capabilities. The basic SBI has been available to the public for free since September, providing information and graphing the social communications of major corporations around the world, ranking them on their effectiveness. (Top dog? News Corp, owned by Rupert Murdoch.)

Starting with 2,000 corporations worldwide, the SBI began to scrape 100 million social sources, pulling in conversations and links, analyzing the language of those tweets, posts and comments, and adding overlays of dozens of other data sources. Today more than 30,000 companies and brands are analyzed by the SBI platform. According to Dachis Group, more than 50 million “brand relevant” social network messages are analyzed each day.

Courtney Boyd Myers, writing last September in The Next Web, described the SBI as: “derived from company, employee, partner/vendor, customer, engaged market and influencer data, and it is sourced from scraping sites like Twitter, Facebook, Wikis, YouTube, forums, and blogs as well as data buys, data partnerships, company contributions, and its own internal data team. It had over 100 of the world’s largest companies participating in its early access program to help cement the data and gain insights as to how the data is used.”

“Social listening” platforms like Radian6 provide data about keywords. Assuming you define your keywords correctly, there’s a lot you can learn. But it’s a bit like playing 20 Questions: you have to ask the right question to get a worthwhile answer. SBI is a “big data” platform that grabs a huge portion of the relevant social communications of a brand’s “business social graph,” and you can then query, organize, drill down into, and slice and dice this enormous data set.

This pushes much farther towards a new standard for social media attribution. How effective is a specific marketing campaign? Which events affected a brand’s perception in the southeastern US? What messages were triggered by a Facebook promotion?

The SBI platform extends into China, Russia and other foreign countries, further complicating the problem of gathering this morass of big data, wrestling it to the ground and beating it into coherent shape enough to provide analysis and attribution for a brand or corporation.

In the midst of the SXSW swirl this month, I sat with Erik Huddleston, CTO and EVP-Products, and Craig Bromberg, Vice President of Business Development, of Dachis Group and talked with them about the challenge of measuring the social influence of corporations.

I also took a closer look at the structure and performance of the Social Business Index (SBI) tool and the far more comprehensive cloud platform offering, SBISaaS. (The public SBI is available at http://socialbusinessindex.com.) In November, the next module, Social Portfolio Insight, was released to be the “system of record” for a company’s social accounts across all of its brands. Social Performance Monitor was released in January to enable real-time measurement of social conversations about a brand’s performance. It’s also the first module that requires a substantial subscription. Coming next are two more subscription modules, Advocate Insight and Employee Insight, which will enable companies to drill down into the influence of brand advocates and their own employees.

The fundamental question is, given the huge morass of unstructured social communication about a brand, realistically how can anyone measure “social influence?”

“Measuring is a big problem,” Huddleston granted, “but attribution is a much bigger one.”

Social messages are what are called “unstructured data,” meaning they travel with almost no metadata that relates them to other messages. Structured data is like the data in a spreadsheet or a database. Unstructured data has to be harvested and then evaluated with natural language processing, semantic analysis, and machine learning algorithms in order to interpret what the data mean. That’s the profound challenge of social network messages.

This is akin to taking all the rocks in Yosemite National Park and carefully reviewing each one for size, shape, mineral content, location, relationship to other rocks and so on.

“From the NLP and so on we build what we call a brand vocabulary index,” said Huddleston.  “This is the taxonomy of meaning of the brand—the words, the ideas, the phrases that the brand uses, and that people use in talking about the brand.  In this way we’re trying to understand conversations, not just keywords.”
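As a rough illustration of the difference between matching a keyword and matching a brand vocabulary, here is a toy sketch in Python. It is not how the SBI works. The brand ("Acme Air"), the phrases, the topic labels, and the function names are all invented, and plain substring matching stands in for the natural language processing a real platform would need.

```python
from dataclasses import dataclass

# Hypothetical vocabulary for an invented brand (assumption, not SBI data).
# Each phrase maps to the topic it signals.
BRAND_VOCABULARY = {
    "acme air": "brand_name",
    "legroom": "product_experience",
    "boarding": "product_experience",
    "lost my bag": "customer_service",
}


@dataclass
class TaggedMessage:
    """A raw message plus the structure extracted from it."""
    text: str
    topics: list


def keyword_match(message, keyword):
    """Keyword listening: does the message contain this one term?"""
    return keyword.lower() in message.lower()


def tag_message(message):
    """Vocabulary matching: attach every topic whose phrase appears."""
    lowered = message.lower()
    topics = sorted(
        {topic for phrase, topic in BRAND_VOCABULARY.items() if phrase in lowered}
    )
    return TaggedMessage(text=message, topics=topics)


messages = [
    "Acme Air gave me so much legroom today",
    "They lost my bag again, never flying them again",
    "Great weather in Austin this week",
]

keyword_hits = [m for m in messages if keyword_match(m, "Acme Air")]
tagged = [tag_message(m) for m in messages]
vocabulary_hits = [t for t in tagged if t.topics]

print(len(keyword_hits), "message(s) found by keyword")
print(len(vocabulary_hits), "message(s) found by vocabulary")
```

The keyword search misses the lost-bag complaint because it never names the brand, while the vocabulary lookup catches it and labels it. The tagged records are also a small example of turning unstructured text into structured data that can be counted and queried.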

“Radian6 and others track and find keywords,” Bromberg explained.  “We’re not saying that’s not valuable—it is.  There is a more important opportunity to analyze the performance of a brand or a company in the conversations and the context they provide.”

“BP could have used Radian6 to analyze social conversations when their oil platform was leaking,” Bromberg continued.  “But they didn’t know which of the five or six methods of social media crisis management were working, and how they were working in different geographies, precisely on what platforms.   That’s what the SBISaaS platform is designed to do.”

Huddleston used a term coined by Jeff Dachis, the “social business graph,” which they take to mean the connections and relationships of a brand or corporation to people across social communications.  SBI builds a map of the social business graph of a corporation across all of the messaging that relates to the brand vocabulary.

“The SBI helps our clients aggregate social media conversations into a system of record,” said Huddleston. “From this they can look into the performance of their campaigns and see what measures they can take to reach the full potential of the subscriber set in each social platform.”

“Advocacy is a perplexing and seductive problem,” Bromberg pointed out. “Every company has to create content programming, and social media represents an opportunity to measure this programming. The key is that we’re now able with SBI to not just measure ideas like sentiment and trending, but we can follow influence down to specific conversations. This isn’t just measurement; we’re drilling down to provide attribution.”

“When you get the complete picture like this it allows you to see how all of your communications are actually working in the market,” Bromberg continued. “With this you see the power of advertising receding and the power of advocacy becoming much more important. It gives you answers to questions of ‘how do we reach them?’”

Dachis Group says it is currently working with 40% of Fortune 500 companies.  When it began working on the SBI in 2010, Dachis approached a selected group of companies and asked them to collaborate by integrating internal data with the social data that the SBI would collect and helping think through how to model the data to make it useful.

So how does a platform like the SBI deliver a real understanding of “social influence?”

“There are two answers to that,” said Huddleston with a smile. “The first is that we gather the data and figure out what it means, then (the client) can figure out the value from that meaning according to their existing measures. Procter and Gamble has a long track record of modeling what certain actions and chains of action mean to a brand in the marketplace.”

“If Nokia and Microsoft want to reposition Nokia as a smartphone rather than a value brand phone they come to the table looking for specific ways to use social to communicate to achieve their goals.”

“Second,” Huddleston continued, “then we help customers do this, figure out what the attribution means to them. We’re concerned with business outcomes, how this directly drives our clients’ businesses. We’ve got the biggest social strategy group in the world to help them do this.”

Still he admitted that we’re standing at the dawn of really understanding how communication in social networks drives real business outcomes.

“It’s just too early in this,” he said. “There are just not solid models of ROMI (return on marketing investment). I’m sure we’ll see best practices emerge about how to truly accurately measure return on investment, but it’s way too early in the game for that.”

Huddleston smiled. “We’re just over a year into it.”

Rohn Jay Miller is Managing Partner of Content & Social, an agency that collaborates with traditional marketing organizations to adopt content and social business strategies, working with senior management to bring positive change to how they do business.