Monthly Archives: January 2012

TV News Political Slant Report by Network: 1/16 – 1/20

Welcome to the Mediate Metrics inaugural TV news political slant measurement report, based on our version 1.0 text classifier.

To our knowledge, this is the first objective TV news slant rating service ever published. Slant, by our definition, is news containing an embedded statement of bias (opinion) OR an element of editorial influence (factual content that reflects positively or negatively on a particular U.S. political party). This initial report focuses specifically on evaluating slant contained in the weekday transcripts of the national nightly news programs on the 3 major broadcast networks (ABC, CBS, and NBC), as well as programming aired from 5 PM until 11 PM Eastern Time on the top 3 cable news channels (CNN, Fox, and MSNBC). At this stage, analytical coverage varies by network, program, and date, but our intention is to fill in the blanks over time.

CHART 1: Slant by Network - January 16 to January 20, 2012

In keeping with U.S. political tradition, content favoring the Republican Party in Chart 1 is portrayed in red (positive numbers), while content that tilts toward the Democratic Party is shown in blue (negative numbers).

To grossly over-simplify, the numerical slant ratings behind Chart 1 emanate from a custom text analysis “classifier,” built to extract statements of political slant from TV news transcripts. (For more on the underlying technology, see our post on Text Analytics Basics at http://wp.me/p1MQsU-at.) We have trained our classifier to interpret slant quite conservatively, conforming to strict guidelines for the sake of consistency and objectivity. As such, the ratings we present may appear to under-report the absolute slant of the content under review; the appropriate way to view them is relative to similar programming.
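As a rough illustration only (the labels, scale, and aggregation below are simplified assumptions for this sketch, not our actual methodology), per-statement slant labels might be rolled up into a single network rating along these lines:

    # Illustrative sketch only: the real classifier and scoring scheme are not shown here.
    # Each extracted statement is assumed to carry a label of +1 (favors Republicans),
    # -1 (favors Democrats), or 0 (neutral); a network's rating is the scaled average.

    def network_slant(statement_labels):
        """Aggregate per-statement slant labels into one net rating."""
        if not statement_labels:
            return 0.0
        return 100.0 * sum(statement_labels) / len(statement_labels)

    # Hypothetical labels drawn from one network's transcripts
    labels = [+1, 0, 0, -1, -1, 0, +1, -1]
    print(network_slant(labels))  # a negative result indicates a Democratic tilt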

As mentioned, our analytical coverage varies by network, program, and date. Correspondingly, our rating confidence is directly proportional to the amount of transcript text available for classification. The exact amount of coverage per network is shown in the table to the right, but we have graphically indicated depth-of-coverage in Chart 1 by way of color shading. For example, the bars representing the slant ratings for both NBC and CBS were purposely made lighter to reflect the relatively small transcript coverage for those particular networks.

During development, we determined that the Republican presidential primaries are an enterprise for which scrutiny is a normal and valuable part of the vetting process. Related news content, however, tends to be disproportionately negative, and oftentimes does not contain a clear inter-party comparison, an element we view as a crucial condition for the evaluation of political slant. With those factors in mind, we have partitioned statements about the Republican presidential primaries and have excluded them from most slant ratings at this juncture. Similarly, the Republican presidential debates and other such dedicated program segments have been excluded in their entirety from classification, since they do not reflect the political positions of the networks, programs, or contributors under consideration.

We’ll publish slant ratings by program for the same January 16 – 20 time period tomorrow.


Text Analytics Basics

Text analytics, also known as “text mining,” automates what people (researchers, writers, and all seekers of knowledge through the written word) have been doing for years[i]. Thanks to the power of the computer and advancements in the field of Natural Language Processing (NLP), interested parties can tap into the enormous amount of text-based data that is electronically archived, mining it for analysis and insight. In execution, text analytics involves a progression of linguistic and statistical techniques that identify concepts and patterns. When properly tuned, text analytics systems can efficiently extract meaning and relationships from large volumes of information.

To some degree, one can think of the process of text analytics as the evolution of the simple internet search function we use every day, but with added layers of complexity. Searching and ranking words, or even small phrases, is a relatively simple task. Extracting information from large combinations of words and punctuation (which may include elements of slang, humor, or sarcasm) is significantly more difficult. Still, text mining systems generally employ a number of layered techniques to extract meaningful units of information from unstructured text, including the following (a brief illustration of several of these steps appears after the list):

  • Word/Phrase Frequency Ranking – Determining which words or “n-grams” appear most often.
  • Tokenization – Identification of distinct elements within a text.
  • Stemming – Identifying variants of word bases created by conjugation, case, pluralization, etc.
  • POS (Part of Speech) Tagging – Labeling each word with its grammatical role (noun, verb, adjective, etc.).
  • Lexical Analysis – Reduction and statistical analysis of text and the words and multi-word terms it contains.
  • Syntactic Analysis – Evaluation of sequences of language elements, from words to punctuation, ultimately mapping natural language onto a set of grammatical patterns.
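As a concrete illustration, the short Python sketch below runs a single sentence through several of these steps: tokenization, stemming, POS tagging, and simple word-frequency ranking. It uses the open-source NLTK toolkit purely as an example; it is not necessarily the tooling behind our own classifier.

    # Sketch using the open-source NLTK library; the 'punkt' tokenizer and
    # 'averaged_perceptron_tagger' data packages must be installed via nltk.download(...).
    from collections import Counter

    import nltk
    from nltk.stem import PorterStemmer

    text = "The senators debated the budget, and the debate ran late into the night."

    tokens = nltk.word_tokenize(text)                 # tokenization: split text into distinct elements
    stemmer = PorterStemmer()
    stems = [stemmer.stem(t) for t in tokens]         # stemming: reduce word variants to a common base
    tagged = nltk.pos_tag(tokens)                     # POS tagging: label each token's part of speech
    freq = Counter(t.lower() for t in tokens if t.isalpha())  # word frequency ranking

    print(tokens)
    print(stems)            # "debated" and "debate" reduce to the same stem
    print(tagged)
    print(freq.most_common(3))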

The purpose of all this NLP processing is to compare those computational nuggets with classifications, or “codings,” that trained experts have assigned to representative text samples. Interpreting language is an exceedingly complex endeavor, and one that computers and software cannot effectively do without being “trained.” As such, text classification systems are designed to compare human codings with the patterns that emerge from computational analysis, and then mimic the expert coders for all future input.
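A minimal sketch of that train-then-mimic pattern, using the general-purpose scikit-learn library and a handful of invented, human-coded sentences (purely illustrative, not our production system), looks something like this:

    # Minimal supervised text classification sketch: human "codings" (labels) paired
    # with sample sentences train a model that then labels unseen text.
    # The sentences and labels below are invented for illustration only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    coded_samples = [
        ("The Republican plan drew praise from economists.", "favors_gop"),
        ("Critics said the Republican proposal would raise deficits.", "favors_dem"),
        ("The Democratic bill was hailed as a breakthrough.", "favors_dem"),
        ("Opponents called the Democratic measure reckless.", "favors_gop"),
        ("The committee will meet again on Tuesday.", "neutral"),
        ("Lawmakers adjourned without a vote.", "neutral"),
    ]
    texts, labels = zip(*coded_samples)

    # Bag-of-words (and two-word phrase) features feed a Naive Bayes classifier
    # that learns to imitate the human coders' judgments.
    model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
    model.fit(texts, labels)

    print(model.predict(["The Democratic plan drew praise from economists."]))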

As you may expect, the quality of any custom text analysis system is largely determined by the quality of the human coders it is trained on. As such, strict rules must be enforced on the human coders, with the knowledge that software classification systems are very literal (think “Mr. Spock”). Still, once effective coding rules are established that result in discernible patterns, text analysis systems are incredibly fast and consistent. Advanced classification systems, like the one employed by Mediate Metrics, are also adaptive, constantly evolving with the ebb and flow of political issues and rhetoric.


[i] Much of the explanation contained herein was gleaned from Text Analytics Basics, Parts 1 & 2, by Seth Grimes. July 28, 2008. http://www.b-eye-network.com/view/8032.
