Tag Archives: Text analytics

Mediate Metrics Update

My sincere apologies to those who have been following my work, but I have come to the conclusion that I must suspend my efforts to measure political bias in the media, at least for the time being. The demands of other life priorities, coupled with the challenges of getting the system to work to my satisfaction, have made this decision necessary.

Despite this unfortunate turn, the effort was highly educational and afforded me certain perspectives on political news bias — both in how it is delivered and in how it is received — that I will share with readers over the coming days, weeks, and months. After devoting over 60 hours a week to this task for six months, one cannot help but gain a few insights along the way.

Perhaps most interesting (and amusing) was the reaction I received from the blogosphere when my efforts came to light. I have often commented that so-called media “watchdog” groups are all about watching the other dogs, and therefore lose their value for those who simply want a way of handicapping the political information they gather. But the most engaged viewers ARE partisan, and the feedback I received from them suggested that they were not interested in an objective bias metric. This phenomenon parallels the media construct of the day: “slanted” news outlets are far more popular than those which tend towards the middle, particularly in cable news.

Simply put, partisan viewers tend to be engaged and participate like sports fans at a pep rally.

Not surprisingly, some media people aggressively challenged the fundamental value of measuring news bias at all. My favorite comment came from a British journalist, who starkly said, “I’m not so into the whole impartial journalism ideal. My ideal is fealty to the truth, not to balance.” When I first read that comment, I pictured a courtroom in which a lawyer imperiously states, “I don’t have facts or witnesses, but I am uniquely blessed to know the absolute TRUTH!”

Short trial.

In fairness to that commenter, my view of media objectivity is not — nor has it ever been — robotic commentators stating cold facts without passion or perspective. Rather, it is a healthy balance of thoughtful, engaging analysis that fairly presents BOTH sides of key political issues. That, and the fact that I’m an early riser, is probably why I am a fan of MSNBC’s “Morning Joe.” Viewpoints are intelligently and passionately delivered on both sides of any political topic, although the format equally exposes them to raging partisan criticism (especially when Joe Scarborough takes issue with his own). Still, for an independent like me, it’s a great way to hear a passionate, 2-sided discourse and form my own opinion, discounting for MSNBC’s over-arching liberal bias, of course.

One conclusion I could not help but come to is that those most passionate and engaged about their political views want to be affirmed by the media, not informed. Of course, those folks were not the market segment I was trying to reach, but they were the most vocal. The challenge for any media bias rating service like the one I had envisioned was reaching the next tier — those who are going about their busy daily lives, and simply grazing the news for political insights. As I have noted elsewhere, I cannot tell you how many times I have had conversations with uninitiated viewers who proudly state that, “The only news program I watch is the O’Reilly Factor … or Hardball …,” etc.

If such low engagement viewers and voters are acquiring their political insights this way … or from political news sound bites that resonate throughout our society at the speed of light … or from the deluge of Super-Pac ads sponsored by some seemingly high-minded “citizens” group …

… then we all have cause for concern.


FLASH REPORT: Political Slant Ratings by Show – 1/30 to 2/3

Our latest TV News measurement metrics, targeting individual programs aired by the 3 major broadcast networks (ABC, CBS, and NBC) and the top 3 cable news channels (CNN, Fox News, and MSNBC), are fairly consistent with our previous studies. As has been our pattern, we have limited our focus to shows airing from 5 PM until 11 PM eastern time, Monday through Friday. Transcript coverage is lighter than normal this week, however, because Florida primary coverage preempted several regular programs under study.

It’s worth noting that, particularly in the case of CNN, those special programs garnered Nielsen ratings of roughly twice the average of the programs they replaced (Erin Burnett Outfront & Anderson Cooper 360). For those who wonder why the media is obsessively covering the Republican primaries, your answer lies there.

CHART 1: Slant Rating by Show - January 30 to February 3

As always, content with a numerical rating above zero indicates a Republican slant, with ratings below zero representing a Democratic slant. In this case, however, those shows which are in the +2.0 to -2.0 range are shown in gray, indicating that they are in the “balanced” news category. The one notable exception this week is CBS Evening News, but our content coverage for it was exceptionally light. Red remains the color indicator for “slanted” news which favors the Republican Party, while slanted content that favors the Democratic Party is shown in blue. Those interested in the underpinnings of the Mediate Metrics slant rating system should review our January 31st post, or see our primer on Text Analytics Basics at: http://wp.me/p1MQsU-at.
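For readers who like to see conventions made concrete, the charting rule above can be expressed in a few lines of code (a simple sketch; the function name and label strings are ours, not part of the actual charting software):

```python
def slant_category(rating):
    """Map a numerical slant rating to the chart's color category.

    Ratings above zero indicate a Republican slant, ratings below zero a
    Democratic slant; anything in the +2.0 to -2.0 range is "balanced".
    """
    if -2.0 <= rating <= 2.0:
        return "gray (balanced)"
    return "red (Republican slant)" if rating > 2.0 else "blue (Democratic slant)"

print(slant_category(1.5))   # gray (balanced)
print(slant_category(3.2))   # red (Republican slant)
print(slant_category(-4.0))  # blue (Democratic slant)
```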

Since our analytical coverage varies by program and date, so does our confidence in the associated show ratings. The exact amount of program coverage is shown in Table 1 below, but we have graphically indicated depth-of-coverage by way of color shading in Chart 1. For example, the cones representing CBS Evening News, Special Report (Fox), and NBC Nightly News were purposely made lighter to reflect the relatively small transcript coverage for those particular shows. We should also note that our version 1.4 classifier did exhibit some anomalies that caused our NBC Nightly News ratings to be disproportionately skewed towards Republicans.

TABLE 1: Slant Rating by Show - January 30 to February 3

Further information about our rating system can be found in previous posts, or by contacting us via email at: barry@mediatemetrics.com


Mediate Metrics FAQ #1

Thanks in large part to coverage initiated by Inside Cable News, interest in our media bias/slant rating system has increased dramatically. Rather than field all questions individually, we’ve decided to post some of the most popular ones below:

How does one measure “bias” in the media without introducing bias into the system?

We were diligent in trying to maintain objectivity by adhering to very strict social science/text analytics guidelines, working with a partner who is very experienced in this area, and engaging multiple “coders” for the sake of system integrity. After many months of incrementally refining the system — and waiting until we achieved high levels of inter-coder correlation — we released our version 1.0 classifier, and have continued to refine it every day since. Systems like ours must be constantly refined to adapt to the changing political rhetoric of the day. Fortunately, our platform is designed to do just that.

Text classification systems use Natural Language Processing elements — basically, a progression of statistical correlation techniques — to mimic the results of expert human coders. That being the case, the human coding process is key, since that is where bias can most readily be introduced. Some of the provisions we included to minimize coder bias include:

  • Defining VERY strict rules for identifying transcript statements which can be coded as either “Favoring Democrats/Critical of Republicans” or “Favoring Republicans/Critical of Democrats.” For example, the experts can only code for slant if the explicit terms or specific proxy labels for Democrats or Republicans are contained in the text.
  • Randomizing transcript statements for the human coding process so that “slant inertia” is drastically reduced. Even expert coders tend to bring outside context into their evaluations, especially when reading a narrative which has a repetitive theme. Randomizing statements helps the “man” component of this man-machine partnership to be more clinical, and enhances objectivity.
  • Regular adjudication sessions, in which the team members review their mismatches and recommend rule refinements to improve coding clarity. Having done this innumerable times, and operating under the proviso of, “When in doubt, code NEUTRAL,” I can tell you that bias is controlled rather effectively this way.
  • Partitioning statements related to the Republican Presidential primaries. This was critical to making the ratings fair and reasonable. News coverage about the Republican primaries is decidedly negative, and is often about Republican candidates bashing other Republican candidates, while we specifically target inter-party comparisons. Once again, we have VERY strict guidelines for how we treat this situation.
  • Following slant assessment templates which involve identifying the speaker, determining the object of his/her discussion, assessing inter-party comparison(s), uncovering embedded judgments, and noting factual references that clearly reflect positively-or-negatively towards a particular party.

Hopefully, you get the idea. We’ve gone to great pains to make our ratings objective, but I am not so bold as to represent that the system is perfect. Even the best text analytics systems have limitations. Ours is no exception.
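For readers curious about the inter-coder correlation mentioned above, here is a rough sketch of how agreement between two coders can be quantified, using raw percent agreement and Cohen’s kappa (a standard chance-corrected statistic; the codings and label set below are invented for illustration):

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Fraction of statements on which two coders assigned the same label."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for chance, given each coder's label frequencies."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability both coders pick the same label at random.
    p_chance = sum((freq_a[label] / n) * (freq_b[label] / n) for label in freq_a)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codings: D = favors Democrats, R = favors Republicans, N = neutral
coder1 = ["N", "D", "R", "N", "N", "R", "D", "N", "R", "N"]
coder2 = ["N", "D", "R", "N", "D", "R", "D", "N", "N", "N"]
print(percent_agreement(coder1, coder2))  # 0.8
print(cohens_kappa(coder1, coder2))
```

Kappa is typically lower than raw agreement, which is why the definition of “agreement” matters so much when comparing reliability figures.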

What is the business model for this service?

Beyond the high-level slant metrics we have initially provided free-of-charge, there is additional business value to be reaped from:

  • Networks, news analysts, and interest groups, through secondary slant studies on specific topics such as health care, labor/union issues, military spending, right-to-life, tax reform, and regulatory measures.
  • Watchdog agencies, via insight reports on the political views of prominent news anchors, correspondents, and contributors.
  • Various political groups desiring a deeper understanding of each network’s Republican primary coverage and slant.
  • Commercial, governmental, and educational bodies desiring to analyze the resonance of TV news slant through social media platforms like Twitter, Facebook, and the blogosphere.
  • Media outlets who want to certify that their content meets specific political/informational criteria, for the purpose of differentiation.

Say the President has a bad news day: bad job numbers, a court case goes against the administration, a scandal in the West Wing … whatever. How does your system handle that scenario?

A bad (or good) day for the President will influence our ratings. Slant ratings effectively “move with the market.” Therefore, our ratings are best viewed as relative measures. Said another way, you would find that certain networks and programs are more slanted than others during a “bad” news week for Democrats or Republicans, but all will be affected by a dominant political news theme.
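To illustrate the “relative measure” idea, one simple approach is to compare each show’s raw rating against the weekly average, so a dominant news theme that shifts every rating in the same direction cancels out (a hypothetical sketch; the show names and ratings are invented):

```python
def relative_slant(ratings):
    """Express each show's slant relative to the week's average.

    A dominant political news theme shifts every raw rating; subtracting
    the weekly mean isolates how each show compares with the rest.
    """
    weekly_mean = sum(ratings.values()) / len(ratings)
    return {show: round(rating - weekly_mean, 2) for show, rating in ratings.items()}

# A hypothetical week in which one story pushed all raw ratings upward
raw = {"Show A": 4.0, "Show B": 2.0, "Show C": 0.0}
print(relative_slant(raw))  # {'Show A': 2.0, 'Show B': 0.0, 'Show C': -2.0}
```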

How does one evaluate “bias” in content that is, by design, supposed to be opinionated?

From our perspective, Op-Ed news content is absolutely valid, as long as viewers are aware that the content they are watching is indeed that. Frankly, we think the boundary between opinion pieces and straight news is often blurry for the general public. News wonks know the difference intuitively, but we have all experienced instances in which an uninitiated viewer proudly states that, “The only news program I watch is {INSERT YOUR OP-ED PROGRAM OF CHOICE}.” Furthermore, straight news programming often contains a subtle-but-consistent political tilt, despite claims to the contrary.

The fact is that TV news programs, regardless of type, often frame the political discourse of the day, which then translates into voting behavior and government policies that dramatically affect our daily lives. That being the case, don’t you think an objective entity should “watch the watchers” in order to serve the greater good?

That may sound pretentious, but I don’t know how else to say it.

One skeptic commented: “If Mediate Metrics had been through a rigorous process of development, which can take several months of hard work, they’d be telling us about it, because it would be a big step forward. The biggest trouble is that the initial degree of inter-annotator agreement, depending on how you define it and measure it, is likely to be spectacularly low, say around 30%.”

Actually, our inter-coder reliability reached a peak of over 80% before the 1.0 classifier was released.

Our system had been in development for many months, and the supporting code book is substantial. Still, there are many different outlets for this service, many of which are not staffed with linguistic/text analysis experts. Knowing that, and in consideration of our limited resources, we did not publicize all of our details, but they are available with certain concessions to confidentiality.

 


FLASH REPORT: Political Slant by Show – 1/23 to 1/27

Building on our previous post, today we are publishing a separate version of our TV news measurement metrics which focuses on the political slant of individual programs aired by the 3 major broadcast networks (ABC, CBS, and NBC) and the top 3 cable news channels (CNN, Fox News, and MSNBC), based on our enhanced 1.2 classifier. The analysis is focused on shows airing from 5 PM until 11 PM eastern time, Monday through Friday.

CHART 2: Slant Rating by Show - January 23 to 27

We’ve constructed this chart slightly differently than in the past. As always, content with a numerical rating above zero indicates a Republican slant, with ratings below zero representing a Democratic slant. In this case, however, those shows which are in the +2.0 to -2.0 range are shown in gray, indicating that they are in the “balanced” news category. The one notable exception this week is NBC Nightly News, but our content coverage for it was exceptionally light. Red remains the color indicator for “slanted” news which favors the Republican Party, while slanted content that favors the Democratic Party is shown in blue. Those interested in the underpinnings of the Mediate Metrics slant rating system should review our January 31st post, or see our primer on Text Analytics Basics at: http://wp.me/p1MQsU-at.

Since our analytical coverage varies by program and date, so does our confidence in the associated show ratings. The exact amount of program coverage is shown in Table 2 below, but we have graphically indicated depth-of-coverage by way of color shading in Chart 2. For example, the cones representing CBS Evening News, Special Report (Fox), and NBC Nightly News were purposely made lighter to reflect the relatively small transcript coverage for those particular shows.

TABLE 2: Political Slant by Show - 1/23 to 27

Further information about our rating system can be found in previous posts, or by contacting us via email at: barry@mediatemetrics.com


TV News Political Slant Report by Show: 1/16 – 1/20

Building on our previous post, today we are publishing a separate version of our TV news measurement metrics which focuses on the political slant of individual programs aired by the 3 major broadcast networks (ABC, CBS, and NBC) and the top 3 cable news channels (CNN, Fox News, and MSNBC), for shows aired from 5 PM until 11 PM eastern time, Monday through Friday. As highlighted yesterday, our analytical coverage varies by network, program, and date, but our intention is to augment it over time.

CHART 2: Slant Rating by Program - January 16 to 20, 2012

Content favoring the Republican party in Chart 2 is portrayed in red (numerically positive), while content that slants towards the Democratic Party is shown in blue (numerically negative). Those interested in the underpinnings of the Mediate Metrics slant rating system should review our January 31st post, or see our primer on Text Analytics Basics at: http://wp.me/p1MQsU-at.

Since our analytical coverage varies by network, program, and date, so does the associated confidence factor in our slant ratings. The exact amount of coverage per program is shown in Table 2 below, but we have graphically indicated depth-of-coverage by way of color shading in Chart 2. For example, the cones representing The Five, Hannity, and On The Record were purposely made lighter to reflect the relatively small transcript coverage for those particular programs. Low transcript coverage likely accounts for certain results that may seem counter-intuitive; we expect those metrics to adapt with volume and time.

TABLE 2: Slant by Program - January 16 to 20, 2012

As mentioned yesterday, we have partitioned statements about the Republican Presidential primaries, since they tend to be disproportionately negative and often lack inter-party comparison, and have largely excluded them from these slant ratings. Similarly, the Republican Presidential debates and other such dedicated program segments have been omitted in their entirety since they do not reflect the political positions of the networks, programs, or contributors under consideration.

We’ll publish an “impact rating” for the same January 16 – 20 time period tomorrow.


Text Analytics Basics

Text analytics, also known as “text mining,” automates what people — researchers, writers, and all seekers of knowledge through the written word — have been doing for years.[i] Thanks to the power of the computer and advancements in the field of Natural Language Processing (NLP), interested parties can tap into the enormous amount of text-based data that is electronically archived, mining it for analysis and insight. In execution, text analytics involves a progression of linguistic and statistical techniques that identify concepts and patterns. When properly tuned, text analytics systems can efficiently extract meaning and relationships from large volumes of information.

To some degree, one can think of the process of text analytics as the evolution of the simple internet search function we use every day, but with added layers of complexity. Searching and ranking words, or even small phrases, is a relatively simple task. Extracting information from large combinations of words and punctuation — which may include elements of slang, humor, or sarcasm — is significantly more difficult. Still, text mining systems generally employ a number of layered techniques to extract meaningful units of information from unstructured text, including:

  • Word/Phrase Search Frequency Ranking – Identifying which words or “n-grams” appear most often.
  • Tokenization – Identification of distinct elements within a text.
  • Stemming – Identifying variants of word bases created by conjugation, case, pluralization, etc.
  • POS (Part of Speech) Tagging – Specifically identifying parts of speech.
  • Lexical Analysis – Reduction and statistical analysis of text and the words and multi-word terms it contains.
  • Syntactic Analysis – Evaluation of sequences of language elements, from words and punctuation, ultimately mapping natural language into a set of grammatical patterns.
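To make a few of the layered techniques above concrete, here is a toy pipeline covering tokenization, crude suffix-stripping stemming, and n-gram frequency ranking (a deliberately simple sketch; production systems use far more sophisticated tokenizers and stemmers):

```python
import re
from collections import Counter

def tokenize(text):
    """Split raw text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def stem(token):
    """Crudely strip common suffixes to approximate a word's base form."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def ngram_counts(tokens, n=2):
    """Count how often each n-gram (sequence of n tokens) appears."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = tokenize("The senator praised the bill while senators debated the bill")
print([stem(t) for t in tokens])          # "senators" collapses to "senator"
print(ngram_counts(tokens).most_common(1))  # [(('the', 'bill'), 2)]
```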

The purpose of all this NLP processing is to compare those computational nuggets with the classifications, or “codings,” that trained experts have assigned to representative text samples. Interpreting language is an exceedingly complex endeavor, and one that computers and software cannot effectively do without being “trained.” As such, text classification systems are designed to compare human codings with the patterns that emerge from computational analysis, and then mimic the expert coders for all future input.

As you may expect, the quality of any custom text analysis system is largely determined by the quality of the human coders it is trained on. As such, strict rules must be enforced on the human coders, with the knowledge that software classification systems are very literal (think “Mr. Spock”). Still, once effective coding rules are established that result in discernible patterns, text analysis systems are incredibly fast and consistent. Advanced classification systems, like the one employed by Mediate Metrics, are also adaptive, constantly evolving with the ebb and flow of political issues and rhetoric.
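As a sketch of the train-then-mimic process described above, here is a miniature naive Bayes text classifier, a common baseline technique for this kind of problem (the training statements and labels are invented for illustration; this is not the Mediate Metrics classifier itself):

```python
import math
from collections import Counter, defaultdict

class NaiveBayesClassifier:
    """Learn word-label statistics from human-coded statements,
    then mimic those coders on new text."""

    def fit(self, statements, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(statements, labels):
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.label_counts:
            # Log prior plus log likelihood, with add-one smoothing.
            score = math.log(self.label_counts[label] / total)
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word] + 1
                score += math.log(count / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training data mimicking human-coded transcript statements
statements = [
    "the president's plan created jobs",
    "the governor's tax cut spurred growth",
    "the president's plan failed badly",
    "the governor's tax cut hurt workers",
]
labels = ["favors-D", "favors-R", "favors-R", "favors-D"]

clf = NaiveBayesClassifier().fit(statements, labels)
print(clf.predict("the plan created jobs"))  # favors-D
```

The statistical machinery is simple; as the surrounding text argues, the hard part is the quality and consistency of the human-coded training data it mimics.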


[i] Much of the explanation contained herein was gleaned from Text Analytics Basics, Parts 1 & 2, by Seth Grimes. July 28, 2008. http://www.b-eye-network.com/view/8032.
