Post Google-Twitter Launch: How is Google Indexing Twitter Today?
View original post by Stone Temple Consulting
On February 4th, 2015, news broke of a new deal between Google and Twitter, and on May 19th the deal went live. On February 10, 2015, we took a snapshot of how Google was indexing tweets before the deal went into effect. Today, we are releasing data on how Google is indexing tweets now that the deal has been live for a number of weeks.
TL;DR: we see a significant increase in Google’s indexation of tweets, but Google is a long, long way from indexing all tweets. Google remains selective about what it chooses to index, and its choices definitely skew towards people with higher follower counts or “authority” (we used Followerwonk’s Social Authority as a measure of authority).
Indexation of tweets within their first 7 days increased from about 0.6% in February to 3.4% in June. That’s a whopping 466% increase, but it still leaves more than 96% of tweets out of Google’s index. By no means do I think that this will be the end of the story. I would bet that Google is testing many things with Twitter integration, and that we will see changes over time. Not to worry, we will repeat our tests on an ongoing basis!
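For anyone who wants to check the arithmetic, here is a minimal Python sketch of the percent-increase calculation, using the two rounded rates quoted above. The function name `percent_increase` is ours, chosen for illustration; it is not part of any tool used in the study.

```python
def percent_increase(old_rate, new_rate):
    """Relative increase from old_rate to new_rate, expressed in percent."""
    return (new_rate - old_rate) / old_rate * 100

feb_rate = 0.6  # % of tweets indexed within 7 days, February (rounded)
jun_rate = 3.4  # % of tweets indexed within 7 days, June (rounded)

increase = percent_increase(feb_rate, jun_rate)
not_indexed = 100 - jun_rate  # share of tweets still outside the index

print(f"Increase: {increase:.1f}%")
print(f"Not indexed in June: {not_indexed:.1f}%")
```

With these rounded inputs the increase works out to about 466.7%, in line with the 466% figure above, and the share not indexed is 96.6%.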
In this video from our Here’s Why with Mark and Eric series, Mark Traphagen asks me to explain why I think Google still doesn’t index all tweets, even though Google now has full access to those tweets:
It’s easy for all of us to believe that Google captures all the data to be found everywhere on the web. After all, they have the best infrastructure for data capture on planet Earth. However, that does not mean that they don’t have limits. They do, and they need to be selective. Even with this new Twitter deal, where they get all of Twitter’s tweets by firehose, it’s just too much for them to swallow and index it all.
That does not mean that their indexation rate won’t expand over time. It may well do so, but it will only do so after they find an effective use for that additional data.
Show Me the Details!
One of the most interesting areas to explore is how quickly Google indexes tweets. People have long believed that Google places more weight on recency of tweets. For that reason, we evaluated indexation of tweets by day for the first 7 days. That leaves the question as to how this changed between February and June, and here is your detailed answer:
There is clear evidence here that Google has significantly picked up their level of indexation, with an increase of 466%. This is a big deal, and probably brings a lot of incremental value to Twitter. However, Google is still NOT indexing 96.6% of the data. Note also that Google’s indexation of Twitter does go up over longer periods of time, to about 12% of all tweets tested, still leaving 88% not indexed.
We also looked at indexation based on follower count. Both February and June show a strong bias towards indexing content tweeted by people with larger follower counts:
Note that the time horizon used for this June data slice was 7 weeks. The older tweets in that sample date from before Google turned the switch on for the new deal with Twitter, so the increases shown are somewhat dampened.
We also took a look at the data based on Followerwonk Social Authority, to see how that might vary:
We believe that using Social Authority is a better metric for us to use going forward, as it takes into account the engagement level with a person’s tweets (which a simple count of followers doesn’t). In this view, you can see a strong skew towards indexing the content from higher authority people.
This suggests that Google is looking at more than simple follower count to pick out what tweets they want to index.
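To make the idea of "indexation by authority" concrete, here is a rough Python sketch of how a breakdown like this could be computed. It is not the code used in our study: the records are made-up placeholders, and the bucket width of 25 and the names `bucket` and `indexation_by_bucket` are assumptions chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical records: (social_authority, was_indexed).
# These values are invented for illustration and are NOT study data.
tweets = [
    (12, False), (18, False), (35, False), (41, True),
    (58, False), (63, True), (77, True), (85, True),
]

def bucket(authority, width=25):
    """Return a label such as '0-24' for the bucket containing `authority`."""
    low = (authority // width) * width
    return f"{low}-{low + width - 1}"

def indexation_by_bucket(records, width=25):
    """Map each authority bucket to the fraction of its tweets that were indexed."""
    totals = defaultdict(int)
    indexed = defaultdict(int)
    for authority, was_indexed in records:
        label = bucket(authority, width)
        totals[label] += 1
        if was_indexed:
            indexed[label] += 1
    return {label: indexed[label] / totals[label] for label in totals}

rates = indexation_by_bucket(tweets)
for label, rate in rates.items():
    print(f"{label}: {rate:.0%} indexed")
```

The same grouping works for follower count by swapping the first field and choosing a suitable bucket width.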
In this study we used a fixed user set: the same 900+ users in both February and June of this year. (Note that we also tested that exact same sample of users in a Twitter indexation study we ran in July of 2014.)
Using the exact same user set is important because we do not know what criteria Google uses to decide whether or not to index tweets. By keeping the user set constant, we are trying to eliminate some of those variables.
As noted in the TL;DR at the beginning, Google’s indexation of Twitter has taken a significant jump upwards, an increase of perhaps as much as 466%, which puts the June rate at more than five times the February rate. That’s significant, but they are still clearly not indexing the great majority of tweets.
I expect to see significant changes in the way Google uses Twitter data over time, and we will continue to monitor that here at Stone Temple Consulting.