Thursday, March 5, 2015

Google’s Fact Checking Scheme: A War on Truth

OK, this is SCARY! This site, at Blogger.com, is OWNED BY GOOGLE.

Within a day of the FCC taking over the Internet, Google has decided it wants to define the truth of all web content!

From here:

Google to become final arbiter of “facts”?

Google recently published a research paper proposing that the world’s largest search engine change its ranking algorithms to dampen sites with a high number of “false facts”. The paper specifically uses the example of Barack Obama’s nationality, and New Scientist cites “anti-vaccination sites”, leading some to speculate this is an effort to target “conspiracy theories” and alternative news.
It’s a worthy concern. Facebook not long ago launched a feature to do just that. When people don’t like a story that’s getting around, they can report it as false, and Facebook will dampen that link, no matter who shares it. I’m going to speak anecdotally about my own experience here for a moment, but I hope you’ll come to see the bigger picture. For those of us who make enemies on social media, this presents a pretty serious problem, because Facebook does not check these reports, at least not thoroughly. I have been banned from Facebook dozens of times, and usually it is because somebody reported something I posted as containing nudity, violence, or racism, where none existed. Those reports are made by ideological rivals for ideological purposes, and my voice is repeatedly and severely diminished as a result.
The High Tech War on Truth
Whether the issue at hand is Facebook’s reporting system or Google’s fact checking, those of us with unpopular ideas are going to be contradicting the vast majority of people out there, and a system that punishes us for doing so is troubling to say the least. For me, Facebook is my top referrer for traffic to this website, and Google is my second. I imagine that I’m not alone in this. Algorithm changes of any sort damage my business and my voice, and they happen more frequently than you might imagine if you don’t monitor these things. Were the standard for relevancy to change to credibility, measured by how many people agree with me, I would essentially be erased from the Internet. That might sound like a good idea to some people, but I’m not the one you need to be concerned about.
What I do here, for better or for worse, is start conversations. This works out pretty well in the present paradigm, where the primary ranking indicator is popularity in terms of who is linking to and talking about you. I’ve long said that there is no such thing as bad press, in part because all “bad press” does is drive up your relevancy on search engines and social media. I certainly prefer that people say nice things about me, but I would rather they say bad things than say nothing at all.
This doesn’t always work out so well on a system like Reddit, where the reputation system includes a downvote. There are coalitions of people on the web who hate me so viciously that they will do anything to damage my reputation and diminish my distribution. One such coalition exists on Reddit, and it downvotes anything posted there from this website without even reading the content. The fact that it comes from my domain is enough. This has caused Redditors to delete their posts linking to this website, because the downvotes on my content damage their reputation on Reddit. Other times it causes the content to be labeled as controversial, and the competition between upvotes and downvotes makes the content appear more popular, driving a great deal of traffic to this website. In large part, it just depends on who gets to it first, but it has almost nothing to do with the quality, much less the factual accuracy, of the content in question. The point being, agreement has no bearing on quality or accuracy.
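To make that concrete: Reddit’s ranking code has been published as open source, and its “controversial” sort boils down to rewarding volume and balance of votes. Here is a rough Python sketch modeled on that published formula (treat it as an approximation; the exact code has changed over the years):

```python
def controversy_score(ups: int, downs: int) -> float:
    """Approximation of Reddit's open-sourced 'controversial' sort.

    Posts with many votes in BOTH directions score highest;
    one-sided posts score near zero.
    """
    if ups <= 0 or downs <= 0:
        return 0.0
    magnitude = ups + downs
    balance = downs / ups if ups > downs else ups / downs
    return magnitude ** balance

# A brigaded-but-defended post reads as far more "controversial"
# than a quietly well-liked one.
print(controversy_score(500, 500))  # 1000.0 -- evenly split, heavily voted
print(controversy_score(50, 2))     # ~1.17  -- well liked, barely registers
```

Note that nothing in that formula consults the content at all, only the split of the votes, which is exactly the point.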

Potential Upside

Rather than paint this as all doom and gloom, I should point out that this could be used for good. We’ll talk about problems with the fact checking algorithm later, but for a moment let’s consider the value of a “truth meter”. Presuming one existed and was accurate, this could be an excellent tool.
The paper proposes a model similar to Google’s “Knowledge Vault”. If you ask Google a question, Google will often give you an answer that doesn’t require you to click through to another site. When I want to check the exchange rate of Bitcoin, for example, I search Google for “Bitcoin price” and Google displays the current average from Coinbase.com, updated every three minutes. This is fine with me, because I consider Coinbase to be a reputable source for this data. If I search for “cure for cold” I get some information from the Mayo Clinic on cold remedies. Even though we all know there is no cure for the common cold, this is about as close as we’re going to get to an answer to my question.
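The Bitcoin example is easy to make concrete. Here is a minimal sketch of pulling that same figure yourself, assuming Coinbase’s public spot-price endpoint and its documented response shape (both are assumptions on my part and could change; this is an illustration, not how Google actually sources the number):

```python
import json
import urllib.request

# Assumed public endpoint; the documented response shape is
# {"data": {"base": "BTC", "currency": "USD", "amount": "<price>"}}.
URL = "https://api.coinbase.com/v2/prices/BTC-USD/spot"

with urllib.request.urlopen(URL) as resp:
    payload = json.load(resp)

data = payload["data"]
print(f"1 {data['base']} = {data['amount']} {data['currency']}")
```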
There are other reputation metrics on the web right now, and clearly there is nothing wrong with that. Web of Trust, for example, is a browser plugin with over 131 million downloads. It checks the reputation of the website you’re visiting “based on a unique crowdsourcing approach that collects ratings and reviews from a global community of millions of users who rate and comment on websites based on their personal experiences”. That store with the prices that seem too good to be true? WOT might save you from a frustrating and time consuming battle with identity theft. LazyTruth is a browser plugin that checks if the email you’re reading is some kind of chain letter hoax by comparing it against FactCheck.org and PolitiFact, which are themselves a sort of reputation metric.
If Google wants to get into the fact checking business, they could save me the trouble of installing yet another browser plugin, or of searching for topics on fact checking sites, much as they save me the trouble of visiting Coinbase or the Mayo Clinic now. That’s a valuable service which a lot of people might appreciate.
Additionally, ranking a site highly just because it is already popular might not be the best way to determine the quality of the content, factual or not. New websites pop up every day, and some method of promoting them over a popular competitor whose quality is in decline might be of great benefit to content producers and consumers alike. Fact checking could be one way of establishing that quality.

Mission Creep

The primary problem, as I see it, is that Google, Facebook, and the other services we use to connect ourselves to information have largely served to connect us with the information we want by determining what others have found relevant. This is a very specific task, and changing it to fact checking is a dramatic change in protocol. It is a completely different business model that in no way resembles the service you originally signed up for.
As stated earlier, there are already reputation metrics available for people who want to use them. These are specific services provided by entities who specialize in providing them. If they aren’t doing a good job, you just stop using that service, or go to a competitor. You lose nothing other than the fact checking service by leaving that website.
If Google gets into the fact checking business and does a bad job, you lose your search engine. If Facebook gets into it and does a bad job, you lose your social network. More importantly, since the idea behind these algorithms is to prevent you from seeing the content in the first place, you’ll never actually know if the content is fact or fiction, because you’ll never even see it. When the services that connect us to content start hiding it from us because they disagree with it, we never get the opportunity to find out whether the service provider is doing a bad job. If we don’t find out that they are doing a bad job, then we lack the information necessary to choose a different service provider.
I would have next to no problem with “Google Truth” or “Facebook Reputation” services. Even if they did a poor job, I could always choose a different service, or compete with them if I saw fit. They might wrongly smear and harm the reputations of good people, but if they did so on a large enough scale, it would in turn damage the reputation of those services. Take, for example, the Southern Poverty Law Center. They smear people as violent racists, sexists, and gay bashers all the time, often on very flimsy evidence. This type of institution gains a lot of sway with race baiters and social justice warriors, but they are despised by most conservatives and libertarians. For the most part, we just make a value judgement on whose word we’re going to accept and move on.
The SPLC, on the other hand, isn’t deciding whether or not I see the content of the people they smear. Google and Facebook do. That presents a far more serious problem, which makes me inclined to stop doing business with them.

Chilling Effect

Let us say that Google takes the position that widgets do not cause cancer. Let us also say that some smaller source says that widgets do cause cancer. Now there is a dispute between these two institutions on the factual accuracy of the question.
As a content producer, I see that Google sends me a great deal of traffic. I am dependent on that traffic for my livelihood. I also see that the smaller source provides me with no traffic at all, because it is not a search engine; that is not its function. Should I take the side of the smaller entity, Google will punish me by pushing my site down in the search results. Not only the article in question, but potentially the entire domain could become discredited, and even content Google sees as credible would be pushed down in the rankings, dramatically injuring my business. This makes content producers extraordinarily unlikely to disagree with Google.
Let us say someone does defy Google, and gets their business ruined as a result. Someone else sees that happening. They know that Google is constantly adding new “facts” to its database, and that contradicting those facts will punish your search engine placement. A statement they make today could be deemed false by Google tomorrow. Even if the content producer Google punished yesterday was factually incorrect, even if he deserved to be erased from the Internet, doing that to him will have a dramatic impact on other well meaning content producers. They will be wary of challenging mainstream narratives in general.
Google’s research paper mentioned the “Birther” conspiracy, and this provides a perfect example. Let us pretend that in 2016 there emerged a presidential candidate whose nationality really was in question. Content producers would be inclined not to report on the issue, for fear of being silenced by the world’s largest search engine, a system they rely on to put food on their tables.
New Scientist mentioned “anti-vaccination” sites. Well, if Google were to take the position that vaccines were safe and that anybody who said they weren’t was a liar, then the emergence of a dangerous vaccine at some point in the future would not be as widely reported, due to fears of censorship by the sites’ biggest driver of traffic and advertising revenue. The real fact on vaccines is that they are not as safe as marijuana, so would pro-vaccine sites that say “vaccines are safe” as a blanket statement be flushed? I have a hard time imagining that this would result in anything positive.
In short, Google would not just be driving down the emergence of incorrect information, they would have a dramatic impact on what people said before they said it. That goes contrary to everything we’ve come to enjoy about the Internet as a place for the exchange of ideas, and competition for hearts and minds.

The Myth of Neutrality

As I recently remarked, there is no such thing as neutral. We all have certain biases, and the best we can hope for is to make those biases known and let people decide for themselves how to interpret data. The FCC cannot make the net “neutral”; it can only control it in the way it sees fit. There is no such thing as an unbiased news source, because to even say a thing is news is to take a position on an issue. The pages of Wikipedia are fraught with left wing and feminist bias that has been well reported on by many, not the least of which was the decision to label “Cultural Marxism” a conspiracy theory.
To decide whether or not a thing is true, and to flush the untrue statement to the bottom of the search results, would necessarily make Google an arbiter of debates. Google might make some attempt to appear neutral in that, but the biases of their developers would bleed through in some way no matter what. Say a developer implements a fact checking system, sees results he does not agree with, and then alters the algorithm to fit his worldview. If Wikipedia, a system that anybody can edit, will throw cultural Marxism into the conspiracy theory dumpster, then let us not pretend that a more tightly controlled mechanism like Google’s search algorithms won’t be subject to the same biases.
Were Google search results to begin flushing “conspiracy theories” down the toilet by Wikipedia’s standards, just imagine how many conservative and libertarian websites would cease to have a voice on the world’s largest search engine.

Popularity By Any Other Name

Ultimately, Google is incapable of actually being the arbiter of truth. The proposed method of determining the factual accuracy of a given statement is really just a different measurement of popularity than the one they currently use. Instead of determining popularity based on how many people are linking to the content, they would judge popularity based on how many people were saying the same thing, and call that factual accuracy. This itself is factually inaccurate, as popularity and truth are often worlds apart.
Per the research paper from Google (emphasis mine):
We propose using Knowledge-Based Trust (KBT) to estimate source trustworthiness as follows. We extract a plurality of facts from many pages using information extraction techniques. We then jointly estimate the correctness of these facts and the accuracy of the sources using inference in a probabilistic model. Inference is an iterative process, since we believe a source is accurate if its facts are correct, and we believe the facts are correct if they are extracted from an accurate source. We leverage the redundancy of information on the web to break the symmetry. Furthermore, we show how to initialize our estimate of the accuracy of sources based on authoritative information, in order to ensure that this iterative process converges to a good solution.
Facts are accurate because they come from a reputable source, and a source is reputable because its facts are accurate. They determine whether a fact is accurate based on how many sources are saying the same thing. This is just another popularity contest using a different yardstick. Instead of judging relevancy based on inbound links, they are determining “accuracy” based on agreement between sources.
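To see why, consider a bare-bones truth-discovery loop in the same spirit as the passage above. This is emphatically not Google’s model, which the paper describes as a much richer probabilistic inference over extracted triples, and the source names and claim below are made up, but it shows how the circular definition resolves: whichever value has more total trust behind it wins, and then rewards its backers.

```python
from collections import defaultdict

# Hypothetical sources, each asserting a value for the same claim.
claims = {
    "site_a":         {"widgets_cause_cancer": "no"},
    "site_b":         {"widgets_cause_cancer": "no"},
    "site_c":         {"widgets_cause_cancer": "no"},
    "lone_dissenter": {"widgets_cause_cancer": "yes"},
}

trust = {source: 0.5 for source in claims}  # uniform prior over sources

for _ in range(20):
    # A value is believed in proportion to the total trust of its backers.
    belief = defaultdict(float)
    for source, facts in claims.items():
        for subject, value in facts.items():
            belief[(subject, value)] += trust[source]
    # Normalize competing values for the same subject into probabilities.
    totals = defaultdict(float)
    for (subject, _value), score in belief.items():
        totals[subject] += score
    belief = {(s, v): score / totals[s] for (s, v), score in belief.items()}
    # A source is trusted in proportion to the belief in what it asserts.
    trust = {
        source: sum(belief[(s, v)] for s, v in facts.items()) / len(facts)
        for source, facts in claims.items()
    }

print({source: round(t, 3) for source, t in trust.items()})
# -> the three agreeing sites converge toward ~1.0, the dissenter toward
#    ~0.0, regardless of which side is actually correct about widgets.
```

Agreement is the only input here; swap the labels around and the “truth” flips with the majority.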
This, of course, is no measure of accuracy at all.
For a great deal of time, it was thought that tomatoes were poisonous. This very popular notion, parroted by many at the time, turned out to be complete nonsense. It took a while for people to figure out that aristocrats were dying of lead poisoning leached from their pewter plates by the acidic fruit, not of the tomatoes themselves. Imagine that phenomenon in the age of Google fact checking: any site that said tomatoes were safe would quickly be flushed to the bottom of the search results.
Think about the hysteria that followed the release of “Reefer Madness” in the 1930s. That nonsense went largely unrefuted, and was in no small part responsible for the countless cages and coffins now filled by the war on drugs. Challenging those ideas on the Internet is why Washington, Colorado, Alaska, and Washington DC have legalized marijuana in defiance of the federal government, and why states like New Hampshire are moving towards decriminalization. Now imagine that trying to take place in an era of Google “fact checking”, where disagreeing with the notion that marijuana leads to rape and murder would flush you to the bottom of the search engine rankings.
Trying to pass off popularity, however the measurement is taken, as accuracy is a fundamental flaw in the way human beings think. Confirmation bias is a huge problem in discussions on all sorts of topics, from politics, to economics, to health, and beyond. It is precisely because Google and Facebook have operated the way they do that we’re able to overcome some portion of that hazard, and the proposal by Google’s researchers would put a huge dent in that progress while competitors struggled to win market share back with better ranking policies.

That doesn’t just harm businesses or screw up your enjoyment of the Internet. That costs lives, fills prisons, and steers nations to war.
