HINTS Whitepaper

Published Dec 04, 2018

Detecting fake news via network analysis (HINTS)

Fake news is considered a relatively hard problem with important social impact. With the rise of automated disinformation, we need automated ways to identify fake news. Network analysis (https://en.wikipedia.org/wiki/Network_theory accessed 10/28/2018) of the social and other accounts that share fake news can help classify or identify it, and limit its reach, as it is being shared. This is in contrast to content analysis and source analysis, which attempt to limit fake news before it is shared.

There are many attempts to detect, discover and define fake news. For example, Facebook has hired thousands of reviewers to manually detect, review, rank, and mark fake news. For a documentary on this manual process, see The Cleaners (http://www.pbs.org/independentlens/videos/the-cleaners/). Facebook has signed contracts with external organizations such as Politifact to detect and rank fake news. Other attempts use NLP to discover fake news (see e.g., https://towardsdatascience.com/i-trained-fake-news-detection-ai-with-95-accuracy-and-almost-went-crazy-d10589aa57c accessed on 10/12/2018 or the attempts on https://www.ramp.studio/problems/fake_news, accessed 10/12/2018). Several startups use NLP for fake news detection (e.g. https://www.logically.co.uk/ accessed 11/17/2018). Most of these use a combination of humans and machine learning to analyze the content of the text/article/video, or the quality of the source, and teach away from using network analysis. [Indeed, network analysis is only useful where you have access to data about how the story will be shared. For example, “AP Verify”, a joint project of Google and the AP, uses only textual understanding and humans, since at publication, AP does not have access to the data about how the story will be shared.]

This problem is not unique to Facebook. Reddit, Twitter, Instagram, WhatsApp, YouTube (comments and recommendations) and email providers all face a version of this challenge.

Solutions and history

Present attempts

Automated attempts to identify problematic texts from their content include Google’s ‘hate speech AI’ (https://thenextweb.com/artificial-intelligence/2018/09/11/googles-hate-speech-ai-easily-fooled/) and China’s keyword-based censorship of social media. Twitter attempts to detect bots by relying on reports from human users (https://www.theverge.com/2018/10/31/18048838/twitter-report-fake-accounts-spam-bot-crackdown accessed 11/1/2018).

Other attempts exist. For example, “Our work on the Credibility Coalition, an effort to develop web-wide standards around online-content credibility, and PATH, a project aimed at translating and surfacing scientific claims in new ways, is part of two efforts of many to think about data standards and information access across different platforms. The Trust Project, meanwhile, has developed a set of machine-readable trust indicators for news platforms; Hypothesis is a tool used by scientists and others to annotate content online; and Hoaxy visualizes the spread of claims online.” (https://www.theatlantic.com/technology/archive/2018/08/how-misinfodemics-spread-disease/568921/ accessed 11/1/2018)

However, these attempts can be fooled by manipulating the exact words used in an article (or tweet), and have trouble detecting sarcasm, irony, criticism of the problematic texts, and other subtle elements of discourse. For some media, such as videos (e.g., beheadings by ISIS) or photos, text search does not work and other methods are employed (see e.g., http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.706.7108&rep=rep1&type=pdf accessed 11/8/2018), but these are not sufficient.

For examples of other attempts at ranking and increasing trust in news, see http://www.niemanlab.org/2018/04/so-what-is-that-er-trusted-news-integrity-trust-project-all-about-a-guide-to-the-many-similarly-named-new-efforts-fighting-for-journalism/ accessed 9/10/2018.

One notable attempt is TrustRank (https://en.wikipedia.org/wiki/TrustRank accessed 10/13/2018), which attempts to combat web spam by defining reliability. TrustRank uses a manually selected seed of reliable websites and propagates reliability using PageRank. Notably, TrustRank does not utilize passive data collected from user behaviors, or measures of user reliability.

Search is an important component of the way we use the Internet. Early search engines attempted to understand pages in terms of how humans classified them. For instance, the Yahoo directory attempted to manually annotate and rank the content of the web. Manual ranking suffered from immense scaling issues, and fell out of favor.

The next generation of search engines tried to understand page content automatically. Methods such as tf-idf (term-frequency inverse document-frequency https://en.wikipedia.org/wiki/Tf%E2%80%93idf accessed 10/13/2018) or natural language processing were widely used. Difficulties arose due to the complexity of natural language processing, language subtleties, context and differing languages; however, this is still a component of many search tools.

The current generation of search engines utilizes very different mechanisms. Algorithms such as HITS (https://en.wikipedia.org/wiki/HITS_algorithm accessed 9/29/2018) and PageRank (https://en.wikipedia.org/wiki/PageRank accessed 10/13/2018) have become mainstays of modern search. The unifying factor is that they look at networks of webpages, bootstrapping reliability and relevance scores, more than they look at the page content itself.


In HITS, each node is assigned two numerical scores. The Authoritative score indicates how likely a given webpage is to have good information, while the Hub score indicates how likely it is to link to pages with a good Authoritative score. A page with a good Authoritative score is pointed to by many pages with good Hub scores, and one with a good Hub score points to many Authoritative pages.

These definitions are recursive, as each page’s score references the scores of neighbors in its link graph. This recursion is solved by assigning initial weights to each page and updating the scores until the values converge.
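The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `hits` and `normalize`, the dict-of-lists graph format, the fixed iteration count, and the Euclidean normalization are all assumptions made for the example.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits(links, iterations=50):
    """links: dict mapping each page to the list of pages it links to.

    Returns (hub, authority) score dicts after a fixed number of updates.
    """
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages that point to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub: sum of the authority scores of the pages it points to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, [])) for p in pages
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Toy graph: two pages point to "c", which in turn points to "d".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": ["d"], "d": []})
```

In this toy graph, "c" ends up with a higher authority score than "d" because it is pointed to by two pages rather than one.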


Our approach (HINTS)

We describe an automated, robust fake news detector, which we call the Human Interaction News Trust System (HINTS), to detect fake news and misinformation, even in the presence of adversaries who know how the detector works. Our key tools are network dynamics and classification of members of the network in terms of their historical interaction with news. We look at how known and suspected fake news propagates in a dynamic network of people, and use this data to identify new posts/publications/news items that are likely to be fake as well. This also gives us information about accounts controlled by an adversary. Platforms can use this data to limit the damage a fake news article can do, by limiting the reach of such an article. While limiting its reach, they can still increase confidence in the fakeness of the article, e.g., by making it visible to small groups of users whose use patterns are the strongest indicators.

Our solution works for a wide variety of classification problems.

Applying HITS to news sharing

A key insight behind our fake news detector is that we focus on limiting the exposure of people to fake news, rather than trying to block all such news from being shared. This dramatically increases the cost of spreading fake news, since fake news is most cost-effective when people spread it on their own. For instance, there is very little fake news on broadcast television.

We identify people who are disproportionately likely to spread fake news, and use this to weight the credibility of what they share. The proverb “consider the source” shows that we already implicitly weigh the source of a statement in deciding how much to trust it.

This leads us to the following working definition: A credulous person is someone who disproportionately interacts positively with fake news, and a piece of fake news is one that is interacted with disproportionately by credulous people. Of course, some of these credulous accounts are intentionally sharing fake news, and may not be real people. As with HITS, this definition is recursive and converges: we assign an initial fake value to each article and a credulous value to each user, and iterate.
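A minimal sketch of this mutual recursion, written by analogy with HITS, is shown here. The names `hints_scores` and `scale`, the `(user, article, weight)` tuple format, the default starting value of 0.5, and the choice to rescale by the largest absolute score are assumptions made for illustration; the text does not fix these details.

```python
def scale(scores):
    """Rescale so the largest absolute score is 1 (an assumed normalization)."""
    top = max(abs(v) for v in scores.values()) or 1.0
    return {k: v / top for k, v in scores.items()}


def hints_scores(interactions, initial_fake, iterations=30):
    """interactions: list of (user, article, weight) tuples.
    initial_fake: dict of starting fake values; missing articles start at 0.5.

    Returns (fake, credulous) score dicts.
    """
    interactions = list(interactions)
    users = {u for u, _, _ in interactions}
    articles = {a for _, a, _ in interactions}
    fake = {a: initial_fake.get(a, 0.5) for a in articles}
    credulous = {}
    for _ in range(iterations):
        # A user is credulous to the extent they interact with fake articles.
        new_cred = {u: 0.0 for u in users}
        for u, a, w in interactions:
            new_cred[u] += w * fake[a]
        credulous = scale(new_cred)
        # An article is fake to the extent credulous users interact with it.
        new_fake = {a: 0.0 for a in articles}
        for u, a, w in interactions:
            new_fake[a] += w * credulous[u]
        fake = scale(new_fake)
    return fake, credulous


interactions = [
    ("u1", "f1", 1.0), ("u1", "f2", 1.0),
    ("u2", "f1", 1.0), ("u2", "f2", 1.0),
    ("u3", "f2", 1.0), ("u3", "r1", 1.0),
    ("u4", "r1", 1.0),
]
fake, credulous = hints_scores(interactions, initial_fake={"f1": 1.0})
```

As with HITS, this un-anchored form is dominated by the most densely connected group of users and articles, so in practice the scores need to be tied to known labels.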

Depending on the application, modes of interaction can include liking, sharing, spending time reading a source (estimated, for instance, by mouse movement over an article), commenting, reposting, following, favoriting, etc. Other metrics such as bounce time (the amount of time before the user returns to the previous page) and changes in search patterns can also be used. For any individual, this signal might be weak (or wrong); for example, some individuals might comment to disprove an article. However, different modes of interaction can be assigned different weights, to make the aggregate signal useful. (And despite disclaimers, retweets are endorsements http://cs.wellesley.edu/~pmetaxas/WorkingPapers/Retweet-meaning.pdf).
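One way to combine interaction modes into a single link weight is sketched here. The weight table is purely hypothetical (the text gives no numbers), and the names `INTERACTION_WEIGHTS` and `edge_weights` are invented for this example.

```python
from collections import defaultdict

# Hypothetical per-mode weights; real values would have to be tuned.
INTERACTION_WEIGHTS = {
    "share": 1.0,
    "like": 0.6,
    "long_read": 0.3,
    "comment": 0.2,  # weak signal: a comment may be a rebuttal
}


def edge_weights(events):
    """events: iterable of (user, article, interaction_type) tuples.

    Returns a dict mapping (user, article) to the summed weight of all
    interactions between them. Unknown interaction types contribute 0.
    """
    totals = defaultdict(float)
    for user, article, kind in events:
        totals[(user, article)] += INTERACTION_WEIGHTS.get(kind, 0.0)
    return dict(totals)


weights = edge_weights([
    ("u1", "a1", "share"),
    ("u1", "a1", "like"),
    ("u2", "a1", "comment"),
])
```

Here a user who both shares and likes an article gets a stronger link to it than a user who only comments.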

The method of user identification can vary. Some websites have enough user activity of their own to rank users by themselves. Others can utilize plugins on other websites, such as Facebook or Twitter plugins, or can use the browser, such as Google sync (https://www.theverge.com/2018/9/24/17895536/google-chrome-69-log-in-sync-password-user-data-privacy accessed 10/13/2018), which tracks data through backup of users’ behaviors. Another way is to utilize ad network data (https://en.wikipedia.org/wiki/Advertising_network accessed 10/13/2018), such as cookies on a user’s computer, device identification (https://www.devhub.com/blog/2672675-reach-a-target-audience-with-device-id-targeting/ accessed 10/13/2018), or device fingerprinting, to identify users and to calibrate a user’s information level or other traits. Yet another way is to use browser history (https://www.spinda.net/papers/smith-2018-revisited.pdf accessed 11/4/2018). Further methods are possible.

Thus, similar to HITS, we can define a graph. In the case of fake news the graph will be bipartite (the HITS graph itself is not bipartite, but here a person and a webpage are different kinds of entities): one side consists of people and the other of articles or posts (or clusters of articles and posts), and there is a weighted link wherever there is an interaction between a person and an article. The weight can depend on the type of interaction, and can be negative if the person saw but declined to interact with the article, e.g., if a person habitually interacts with links they see on their Twitter feed, and we know (or can assign a probability) that they saw an article and did not interact with it. Weights can be modified by the individual’s propensity to interact with content (this would be equivalent to the ‘out-degree’ measure in the original HITS algorithm).
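One possible way to turn this description into a signed link weight is an "observed minus expected" rule, sketched here. This rule is an assumption chosen for illustration, not a formula given in the text; the function name `signed_edge_weight` and its parameters `propensity`, `p_seen`, and `base_weight` are likewise hypothetical.

```python
def signed_edge_weight(interacted, propensity, p_seen=1.0, base_weight=1.0):
    """Signed weight of a person-article link (illustrative assumption).

    interacted:  True if the person interacted with the article.
    propensity:  fraction of seen items this person usually interacts
                 with, between 0 and 1.
    p_seen:      probability that the person actually saw the article.
    base_weight: weight of this type of interaction.
    """
    if not 0.0 <= propensity <= 1.0:
        raise ValueError("propensity must be between 0 and 1")
    if not 0.0 <= p_seen <= 1.0:
        raise ValueError("p_seen must be between 0 and 1")
    if interacted:
        # An interaction from someone who rarely interacts is a strong signal.
        return base_weight * (1.0 - propensity)
    # Declining to interact is a negative signal only to the extent that the
    # person probably saw the article and usually does interact.
    return -base_weight * p_seen * propensity


rare_sharer = signed_edge_weight(True, propensity=0.1)
habitual_sharer_skipped = signed_edge_weight(False, propensity=0.9, p_seen=0.5)
```

Under this rule a person who interacts with almost everything contributes little when they interact, and a clearly negative weight when they see an article and pass it by.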

Details and novel elements

Negative links are novel to this use case. Among web pages we do not have negative links: while we see which links exist on a webpage, we do not see which pages an author considered and rejected.

In order to seed the algorithm and teach it, we can use existing labeling by humans. Sources that label data include Politifact, ABC News, the Associated Press, FactCheck.org, Snopes (see e.g., https://www.cjr.org/tow_center/facebook-fact-checking-partnerships.php accessed 10/1/2018), and AP Verify (https://newsinitiative.withgoogle.com/dnifund/dni-projects/ap-verify/ accessed 10/12/2018). When an article is manually fact checked, we can set the ‘fakeness’ value of that article to zero or one (or some appropriate value). While the algorithm can modify the fake news value for most articles, articles which are manually checked can optionally be pegged to that value, and the algorithm will not update them. This does not interfere with convergence.

A user can similarly be assigned a fixed credulous value of one if it is known to be a bot controlled by an adversary.
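Pegging can be illustrated with a small propagation loop in which checked articles and known bots are reset to their fixed values after every update. This sketch makes several assumptions that go beyond the text: it uses weighted averages rather than sums, requires positive link weights, starts unlabeled articles at 0.5, and uses the invented names `weighted_average` and `propagate_with_pegs`.

```python
def weighted_average(edges, values):
    """edges: (target, source, weight) triples with positive weights.

    Returns, for each target, the weighted mean of its sources' values.
    """
    num, den = {}, {}
    for target, source, weight in edges:
        num[target] = num.get(target, 0.0) + weight * values[source]
        den[target] = den.get(target, 0.0) + weight
    return {t: num[t] / den[t] for t in num}


def propagate_with_pegs(interactions, pegged_fake, pegged_credulous=None,
                        iterations=100):
    """interactions: list of (user, article, weight), weights positive.
    pegged_fake: fact-checked articles mapped to their fixed fake value.
    pegged_credulous: known accounts mapped to their fixed credulous value.
    """
    pegged_credulous = pegged_credulous or {}
    by_user = list(interactions)
    by_article = [(a, u, w) for u, a, w in by_user]
    fake = {a: pegged_fake.get(a, 0.5) for _, a, _ in by_user}
    credulous = {}
    for _ in range(iterations):
        credulous = weighted_average(by_user, fake)
        credulous.update(pegged_credulous)  # known bots keep their value
        fake = weighted_average(by_article, credulous)
        fake.update(pegged_fake)  # checked articles keep their value
    return fake, credulous


interactions = [
    ("u1", "checked_fake", 1.0), ("u1", "x", 1.0),
    ("u2", "x", 1.0), ("u2", "checked_true", 1.0),
    ("u3", "checked_fake", 1.0), ("u3", "y", 1.0),
]
fake, credulous = propagate_with_pegs(
    interactions, pegged_fake={"checked_fake": 1.0, "checked_true": 0.0}
)
```

In this toy data, "y" is shared only by a user who also shared the checked fake item, so its score moves toward 1, while "x" is shared by users on both sides and stays near 0.5.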

Clustering: when an article is marked as being untrustworthy, we do not merely mark an individual link. We can aggregate links to similar stories, or similar links to the same story. This is similar to how Google News aggregates stories based on text similarity. If multiple links point to the same text (e.g., short links such as bit.ly), it is even easier to aggregate stories. Users can similarly be clustered when the same user has accounts on multiple platforms. Users can be identified or linked, e.g., by cookies on their computers, browser fingerprinting or other methods. If users cannot be identified, the algorithm will still work but convergence will be slightly slower.

Of course, the spread of news in a social network differs from the spread of new webpages. In particular, the speed of distribution is much faster. So it is useful to calculate marginal values for the ranking of articles and people based on the already-calculated values from the graph at a prior time point. This makes it unnecessary to recalculate the values from scratch (though that can be done as a sanity check from time to time). For example, we can frequently update the fakeness of an article based on user interactions, and only update the user values infrequently, or when a new user appears.

Updating one side of the graph (e.g., articles) much faster than the other side of the graph (e.g., users) is a novel need for this type of graph. We can also update the values of users with a limited number of steps. All of these methods introduce additional error, but it is small compared to the signal.
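The fast side of this asymmetric update can be sketched as a running weighted average over user values that are held fixed between full recalculations. The class name `ArticleScore`, the prior of 0.5, and the use of a weighted mean are assumptions made for this example, not details given in the text.

```python
class ArticleScore:
    """Running fakeness estimate for one article (illustrative sketch).

    User credulous values are treated as fixed inputs; they would be
    refreshed only occasionally by a full recalculation.
    """

    def __init__(self, prior=0.5, prior_weight=1.0):
        # The prior keeps the estimate stable while few users have interacted.
        self.weighted_sum = prior * prior_weight
        self.total_weight = prior_weight

    def add_interaction(self, user_credulous, weight=1.0):
        """Fold one new interaction into the estimate in constant time."""
        self.weighted_sum += weight * user_credulous
        self.total_weight += weight

    @property
    def fakeness(self):
        return self.weighted_sum / self.total_weight


score = ArticleScore()
for user_credulous in (0.9, 0.9, 0.8):  # values from the last full run
    score.add_interaction(user_credulous)
current_estimate = score.fakeness
```

Each new interaction costs a constant amount of work, which is what makes it practical to refresh article scores far more often than user scores.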

Applications

Given these rankings, various actions can be taken. For example, pages or sources can be downranked and appear less frequently in newsfeeds or social media feeds. Warnings can be displayed or sources can be banned. It is also possible to show information from other sources to counterbalance. Of course, this can require some integration with other providers. However, in some cases a plugin can be used, similar to how Adblock (https://en.wikipedia.org/wiki/AdBlock accessed 10/28/2018) hides ads or how Facebook Purity filters posts (https://en.wikipedia.org/wiki/Fluff_Busting_Purity accessed 10/28/2018).

Extended use cases

While we have focused on fake news, similar analysis can be performed on other issues or objectionable content, such as deepfakes.

Note that the same person will have different scores for different propensities. It is possible that some sources (e.g., bots) might have high scores in multiple areas. For instance, some people are particularly good at detecting deepfakes (https://www.sciencedaily.com/releases/2018/10/181011173106.htm accessed 10/12/2018). Propaganda, conspiracy theories and misinformation are subject to similar analysis. This scoring can also be used to divide people into a variety of bins. For example, given a seed of political affiliation (e.g., Fox News links vs MSNBC links) one can detect political affiliation as well as the bias of various news outlets. This approach is particularly useful where there is a correlation between the properties of the different types of entities.

Another use case is identifying patterns of small social media channels. For example, some chat servers running the Discord chat tool have issues with Nazi communities forming against the wishes of the server maintainers. Some of these have names such as “Nazism ’n’ Chill,” “Reich Lords,” “Rotten Reich,” “KKK of America,” “Oven Baked Jews,” and “Whitetopia.” By manually labeling these groups we can then use the algorithm to find other groups which are disproportionately inhabited by Nazis. These can be shut down or marked for manual inspection. Similar efforts can be done for chatrooms frequented by ISIS or other militant groups.

The place of a “user” can be replaced with other aspects of identity, such as IP address, username, typing habits (e.g., by using https://www.typingdna.com/ accessed on 10/9/2018) or any other method of statistically identifying a user across time or location. This identification can be unique or merely probabilistic.

We can also seed such a network with reliable classifications of users as well as, or instead of, with content classification. For example, if a user makes a statement that “I’d be the first to sign up and help you slaughter Muslims.” (example from https://slate.com/technology/2018/10/discord-safe-space-white-supremacists.html accessed on 9/10/2018) we can mark that user as a racist/threat and then see where similar users congregate or what articles similar users read.

One interesting effect of using users to detect servers/articles/webpages/newspapers/groups/etc. is that while it is easy to change the name of a server, it is much harder to simultaneously change all of the users. Thus even if an adversary tries to rename/reinstall/move their chatrooms/webpage/Twitter account/etc., they must simultaneously change their user base’s IDs (which can be tracked, e.g., using adtech which tracks users across the web). This poses some slight technical difficulties for an adversary.

Further notes

Adversarial models

We expect adversaries to try to outwit such detection methods, for instance by creating fake profiles which appear to be benign (e.g., no interaction with any fake article) until they are called upon to manipulate the algorithm. But compared to the Sybil attacks possible on current platforms, fooling HINTS is expensive and time-consuming for an adversary.

In particular, while a successful Sybil account becomes more effective over time as its follower count increases, our network analysis reduces its effectiveness after its first few broadcasts of fake items.

Compounding and chaining

We can chain this method with other known methods of identifying fake news, including manual human input or NLP. It is similar to methods for identifying clickfraud, and can enhance those efforts too. Explicit or implicit knowledge about an account, of the sort that ad networks gather, can improve effectiveness by highlighting what is disproportionate.

Feedback loops

HINTS can be paired with human and ML classification methods to improve fake news detection before network interaction. It can be integrated into content ranking on Google, or used as a prefilter for human filtering by Facebook.

Scores collected for a set of articles can provide labelled data for training a classifier. HINTS also assigns a probability, which can be thought of as a margin; boosting can reduce the number of users who have to interact with a given piece of content before we can do classification.

Time based linkages and harassment

Another useful application is to detect harassment.

Phrases

The unit of analysis does not have to be pages. It can be phrases or hashtags.

