Quantifying Spin

With the most fiercely fought Canadian election in more than a decade taking place on Monday, the crossfire of political rhetoric between the incumbent prime minister and his Conservative Party challenger is becoming heated – but which one is more trustworthy?

According to a new computer algorithm, Prime Minister Paul Martin, of the Liberal Party, spins the subject matter of his speeches dramatically more than Conservative Party leader, Stephen Harper, and the New Democratic Party leader, Jack Layton.

Spin, in this case, is defined as “text or speech where the apparent meaning is not the true belief of the person saying or writing it”, says the algorithm’s developer, David Skillicorn at Queen’s University in Ontario, Canada.

He and his team analysed the usage patterns of 88 deception-linked words within the text of recent campaign speeches from the political leaders. They then determined the frequency of these patterns in each speech, and averaged that number over all of that candidate’s speeches. Martin received a ranking of 124, while Harper and Layton scored 73 and 88, respectively.

“I think it’s expected that any party in power is going to use spin more than the challenging party,” Skillicorn says. “They have a track record to defend.”

(New Scientist article)

Prof. Skillicorn is an expert on data mining and is especially interested in applying matrix decomposition techniques to it. I looked up his home page and found an article that applies the technique described in the New Scientist piece to a collection of over 289,000 Enron emails. In that setting, the value of the technique is as an automated filter: it picks out the much smaller set of documents that deserve human scrutiny. This is pretty cool stuff, and I think it could be a good technique to have in the toolkit.
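
To make the idea concrete, here is a minimal sketch in Python of the kind of scoring described above: count how often a set of marker words shows up in each document, average that per speaker, and rank documents so that only the top few go to a human reader. This is not Prof. Skillicorn’s algorithm. The marker words are hypothetical placeholders (not the 88 words from the study), the function names are my own, and the sketch uses a plain frequency count with every marker treated as a positive signal rather than any matrix decomposition.

```python
import re
from statistics import mean

# Hypothetical marker words for illustration only.
# These are NOT the 88 deception-linked words used in the study.
MARKER_WORDS = {"honestly", "frankly", "never", "always"}


def marker_rate(text, markers=MARKER_WORDS):
    """Return the number of marker-word occurrences per 1,000 words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in markers)
    return 1000.0 * hits / len(words)


def average_rate(texts, markers=MARKER_WORDS):
    """Average the per-document marker rate over a list of documents."""
    if not texts:
        return 0.0
    return mean(marker_rate(text, markers) for text in texts)


def flag_for_review(docs, top_n, markers=MARKER_WORDS):
    """Return the ids of the top_n highest-scoring documents.

    docs maps a document id to its text.
    """
    ranked = sorted(docs, key=lambda doc_id: marker_rate(docs[doc_id], markers),
                    reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    docs = {
        "a": "honestly I never said that",
        "b": "the meeting is at noon",
        "c": "frankly it always rains",
    }
    print(average_rate(list(docs.values())))
    print(flag_for_review(docs, top_n=2))
```

Normalising by document length keeps a long speech or email from outscoring a short one simply because it has more words. A real system would need a validated word list and a more careful model than this.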

Update: It appears that Prof. Skillicorn’s methods specify the use of two commercial products, the “British National Corpus” and the “Linguistic Inquiry and Word Count” software. I had been hoping that the actual software might be open source, but that appears to be out of the question.

I’ve corresponded with Prof. Skillicorn, and he suggests the Coredge “Logik” knowledge discovery product. He is working with Coredge to incorporate his algorithms into their product.

Wesley R. Elsberry
