Thanks in large part to coverage initiated by Inside Cable News, interest in our media bias/slant rating system has increased dramatically. Rather than field all questions individually, we’ve decided to post some of the most popular ones below:
How does one measure “bias” in the media without introducing bias into the system?
We were diligent in trying to maintain objectivity by adhering to very strict social science/text analytics guidelines, working with a partner who is very experienced in this area, and engaging multiple “coders” for the sake of system integrity. After many months of incrementally refining the system, and waiting until we achieved high levels of inter-coder correlation, we released our version 1.0 classifier, and have continued to refine it every day since. Systems like ours must be constantly refined to adapt to the changing political rhetoric of the day. Fortunately, our platform is designed to do just that.
Text classification systems use Natural Language Processing elements (basically, a progression of statistical correlation techniques) to mimic the results of expert human coders. That being the case, the human coding process is key, since that is where bias can most readily be introduced. The provisions we adopted to minimize coder bias include:
- Defining VERY strict rules for identifying transcript statements which can be coded as either “Favoring Democrats/Critical of Republicans,” or “Favoring Republicans/Critical of Democrats.” For example, the experts can only code for slant if the explicit terms or specific proxy labels for Democrats or Republicans are contained in the text.
- Randomizing transcript statements for the human coding process so that “slant inertia” is drastically reduced. Even expert coders tend to bring outside context into their evaluations, especially when reading a narrative which has a repetitive theme. Randomizing statements helps the “man” component of this man-machine partnership to be more clinical, and enhances objectivity.
- Regular adjudication sessions, in which the team members review their mismatches and recommend rule refinements to improve coding clarity. Having done this innumerable times, and operating under the proviso of, “When in doubt, code NEUTRAL,” I can tell you that bias is controlled rather effectively this way.
- Partitioning statements related to the Republican Presidential primaries. This was critical to making the ratings fair and reasonable. News coverage about the Republican primaries is decidedly negative, and is often about Republican candidates bashing other Republican candidates, while we specifically target inter-party comparisons. Once again, we have VERY strict guidelines for how we treat this situation.
- Following slant assessment templates which involve identifying the speaker, determining the object of his/her discussion, assessing inter-party comparison(s), uncovering embedded judgments, and noting factual references that clearly reflect positively or negatively on a particular party.
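To make the first two provisions concrete, here is a minimal Python sketch of how statements might be screened and shuffled before they reach the human coders. It is an illustration only, not our production code: the term lists in `PARTY_TERMS` and the function names `parties_mentioned` and `prepare_for_coding` are hypothetical, and the actual code book is far more detailed.

```python
import random
import re

# Hypothetical party terms and proxy labels, for illustration only.
# The real code book's lists are not reproduced here.
PARTY_TERMS = {
    "democrat": ["democrat", "democrats", "democratic party", "dnc"],
    "republican": ["republican", "republicans", "gop", "rnc"],
}


def parties_mentioned(statement):
    """Return the set of parties whose explicit terms appear in the statement."""
    text = statement.lower()
    found = set()
    for party, terms in PARTY_TERMS.items():
        for term in terms:
            # Whole-word match so that, e.g., "gop" does not match inside another word.
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.add(party)
                break
    return found


def prepare_for_coding(statements, seed=None):
    """Keep only statements that name a party, then shuffle them.

    Shuffling strips away the narrative order, which is the idea behind
    reducing "slant inertia" for the human coders.
    """
    eligible = [s for s in statements if parties_mentioned(s)]
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    rng.shuffle(eligible)
    return eligible
```

A statement that names neither party never reaches the coders as a slant candidate, and the statements that do reach them arrive out of their original sequence.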
Hopefully, you get the idea. We’ve gone to great pains to make our ratings objective, but I am not so bold as to represent that they are perfect. Even the best text analytics systems have limitations. This one is no exception.
What is the business model for this service?
Beyond the high-level slant metrics we have initially provided free-of-charge, there is additional business value to be reaped from:
- Networks, news analysts, and interest groups, through secondary slant studies on specific topics such as health care, labor/union issues, military spending, right-to-life, tax reform, regulatory measures, etc.
- Watchdog agencies, via insight reports on the political views of prominent news anchors, correspondents, and contributors.
- Various political groups desiring a deeper understanding of each network’s Republican Primary coverage and slant.
- Commercial, governmental, and educational bodies desiring to analyze the resonance of TV News slant through social media platforms like Twitter, Facebook, and the blogosphere.
- Media outlets that want to certify that their content meets specific political/informational criteria, for the purpose of differentiation.
Say the President has a bad news day…something bad happens…bad job numbers, court case goes against the administration, scandal in the West Wing…whatever. How does your system handle that scenario?
A bad (or good) day for the President will influence our ratings. Slant ratings effectively “move with the market.” Therefore, our ratings are best viewed as a relative measure. Said another way, you would find that certain networks and programs are more slanted than others during a “bad” news week for Democrats or Republicans, but all will be affected by a dominant political news theme.
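One way to picture a relative reading is to compare each network against the average of all networks for the same period. The sketch below assumes a hypothetical signed score per network (negative meaning less favorable to one party, positive meaning more favorable); the scale, the network names, and the function `relative_slant` are illustrative assumptions and not a description of how our published ratings are computed.

```python
from statistics import mean


def relative_slant(raw_scores):
    """Subtract the cross-network average from each network's raw score.

    raw_scores maps a network name to a signed slant score for one period.
    The result shows how each network sits relative to the pack, so a news
    theme that pushes every score in the same direction cancels out.
    """
    baseline = mean(raw_scores.values())
    return {name: score - baseline for name, score in raw_scores.items()}


# Hypothetical week in which every network's raw score is negative.
week = {"Network A": -0.30, "Network B": -0.10, "Network C": -0.20}
relative = relative_slant(week)
```

In this made-up week all three raw scores are negative, yet relative to the group average Network A sits below the pack, Network B above it, and Network C roughly on it.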
How does one evaluate “bias” in content that is, by design, supposed to be opinionated?
From our perspective, Op-Ed news content is absolutely valid, as long as viewers are aware that the content they are watching is indeed that. Frankly, we think that the boundary between opinion pieces and straight news is often blurry for the general public. News wonks know the difference intuitively, but we have all experienced instances in which an uninitiated viewer proudly states that, “The only news program I watch is {INSERT YOUR OP-ED PROGRAM OF CHOICE}.” Furthermore, straight news programming often contains a subtle-but-consistent political tilt, despite claims to the contrary.
The fact is that TV news programs, regardless of type, often frame the political discourse of the day, which then translates into voting behavior and government policies that dramatically affect our daily lives. That being the case, don’t you think an objective entity should “watch the watchers” in order to serve the greater good?
That may sound pretentious, but I don’t know how else to say it.
If Mediate Metrics had been through a rigorous process of development, which can take several months of hard work, they’d be telling us about it, because it would be a big step forward. The biggest trouble is that the initial degree of inter-annotator agreement, depending on how you define it and measure it, is likely to be spectacularly low, say around 30%.
Actually, our inter-coder reliability reached a peak of over 80% before the 1.0 classifier was released.
Our system had been in development for many months, and the supporting code book is substantial. Still, there are many different outlets for this service, many of which are not staffed with linguistic/text analysis experts. Knowing that, and in consideration of our limited resources, we did not publicize all of our details, but they are available with certain concessions to confidentiality.
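Since agreement figures depend on how they are defined and measured, here is a short Python sketch of two common measures for a pair of coders: raw percent agreement and Cohen's kappa, which discounts the agreement expected by chance. This is a generic illustration; it does not state which measure lies behind the figure quoted above, and the labels and function names are assumptions.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of statements on which two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of statements")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(coder_a) | set(coder_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels: D = favors Democrats, R = favors Republicans, N = neutral.
a = ["D", "R", "N", "N", "D"]
b = ["D", "R", "N", "D", "D"]
print(percent_agreement(a, b))       # 0.8
print(round(cohens_kappa(a, b), 4))  # 0.6875
```

The same pair of coders scores 80% on raw agreement but about 0.69 on kappa, which is why any agreement figure should be read alongside the measure used to produce it.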