
We have a big rankings dataset. How to use it to detect algo changes?

   
5:20 pm on May 23, 2011 (gmt 0)

10+ Year Member



We've got a lot of data that we're collecting for our clients, and I realized that it might be possible to detect search engine algorithm changes retroactively, and possibly in real time as they're happening.

But we're not really mathematicians, so I'm not sure how we could use the data to draw meaningful conclusions.

We track all the search engine traffic to our customers over a 30 day period.

  • We have 3 million data points of traffic data containing the keyword and rank in Google (from the referrer)
  • We have a total of 5 million unique keywords in our database across all our customers
  • We're tracking 2500 different websites
  • We have 2 million pages indexed
  • Most of our clients are integrated with Google Webmaster Tools and Google Analytics, so we could draw data from there as well
  • ... and much more.

    We need to protect our client confidentiality, but we'd be happy to search for information in aggregate to detect search engine algorithm changes, or... anything. It didn't occur to us back then, but I'm pretty sure we would have detected Panda within minutes of it sweeping through our clients.

    Let me give you an example.

    I rank highly for the term "universe", so it gets a few hundred search visitors a day. Roughly 30% of those visits carry the rank in the referrer URL from Google, so I get a measurement of my rank every few minutes. Obviously it's going to vary because of personalization settings, but it should be possible to pick out larger changes, and something like Panda would have been pretty significant.
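A minimal sketch of pulling the rank out of a referrer, for anyone who wants to try the same measurement. It assumes the position travels in a query parameter named `cd`; that name, the function name and the sample URLs are illustrative assumptions, so check them against your own logs.

```python
from urllib.parse import urlparse, parse_qs


def rank_from_referrer(referrer, param="cd"):
    """Return the rank carried in a referrer URL as an int, or None if absent.

    `param` is the query parameter assumed to hold the position
    (hypothetical default; verify it against real referrers).
    """
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    if not values:
        return None  # this visit did not carry a rank
    try:
        rank = int(values[0])
    except ValueError:
        return None  # parameter present but not a number
    return rank if rank > 0 else None


# Made-up referrers for illustration only.
with_rank = "http://www.google.com/url?sa=t&q=universe&cd=3"
without_rank = "http://www.google.com/search?q=universe"
print(rank_from_referrer(with_rank), rank_from_referrer(without_rank))
```

Visits that return None are simply skipped, which matches the situation above where only a fraction of referrers contain a rank.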

    Our plan right now is to give our clients some kind of diagnostic tool, so they see how their overall rank is doing from day to day, but I think we could aggregate that across all our clients.

    Anyway, if anyone has any ideas, we'd be happy to try and implement them and pull some signal from the noise.

    [edited by: tedster at 7:42 pm (utc) on May 23, 2011]

  • 7:41 pm on May 23, 2011 (gmt 0)

    WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



    The first thing that strikes me is that you would be able to know when a major algo change occurs, but you would not know WHAT that change is all about. In order to do that you would need to pull a lot of other data into your analysis.

    A few key factors would be essential - things like:
    - type of website and page (direct ecommerce, affiliate, general information, company B2B information, online app, etc);
    - some indicator of the backlink profile that goes beyond number of links;
    - where the specific query term fits in a user intention taxonomy, etc.

    Even a few bits of related data like this could make a lot of difference in making sense of the ranking shifts you see.
    7:57 pm on May 23, 2011 (gmt 0)

    10+ Year Member



    Determining "why" a ranking shift happened is a whole other level of complexity. My first goal is to just detect that a shift occurred at all.

    Regarding your suggestions, we don't classify the sites we're tracking, but that would be possible.

    We do track their backlink profile, but we don't hold them up to any criteria. We could count links and compare them to the number of domains, but again, I think that would be trying to focus on the "why". I'm not sure I'm prepared to "chase the algo".

    We only get the one query, so we don't see it as part of a larger stream.

    I thought it might be helpful to SEOs in general to pinpoint the moment that a new algorithm was released into the wild, so they can compare before and after that time to see if it gives them any clues.
    8:31 pm on May 23, 2011 (gmt 0)

    WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



    I've used a simple tool to detect if a shift occurred - posting levels on WebmasterWorld. It's worked for over 10 years. The more Google changes their algo, the more posts there are asking for help.

    Knowing there has been an algo change is not very useful to me, since Google makes over 500 algo tweaks a year. The "why" it changed today and "what" will be needed tomorrow imho is what is really valuable.
    8:38 pm on May 23, 2011 (gmt 0)

    WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



    A basic measure of volatility may be to use click throughs with the kind of percentages mentioned on this thread [webmasterworld.com]

    e.g. these figures
    1 - 42%
    2 - 12%
    3 - 8%
    4 - 6%
    5 - 5%
    6 - 4%
    7 - 3%
    8 - 3%
    9 - 3%
    10 - 3%


    So a move from #1 to #5 may amount to a 37-point drop in click share (42% down to 5%). Since you only have data for your sites (and assuming you only have one site appearing for most queries)... it could be a basic measure of a rolled-out algo change. Whatever metric you test out, graph it and check the interesting points in time.
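One way this click-share idea could be sketched in code. The percentages are the ones quoted above; treating any position past #10 as a 0% share is an assumption, and the function names are made up for illustration.

```python
# Click share by position, in percent, copied from the figures quoted above.
CTR_BY_POSITION = {1: 42, 2: 12, 3: 8, 4: 6, 5: 5, 6: 4, 7: 3, 8: 3, 9: 3, 10: 3}


def ctr_share(position):
    """Click share in percent for a position; 0 past the first page (assumption)."""
    return CTR_BY_POSITION.get(position, 0)


def ctr_weighted_change(old_position, new_position):
    """Signed change in click share, in percentage points (negative = lost share)."""
    return ctr_share(new_position) - ctr_share(old_position)


def volatility(moves):
    """Mean absolute click-share change over (old_position, new_position) pairs."""
    moves = list(moves)
    if not moves:
        return 0.0
    return sum(abs(ctr_weighted_change(old, new)) for old, new in moves) / len(moves)


print(ctr_weighted_change(1, 5))       # the #1 to #5 move discussed above
print(volatility([(1, 5), (3, 3)]))    # one big move, one unchanged keyword
```

Computing `volatility` per hour or per day gives a single number to graph over time.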

    I've had this discussion recently though... and it was pointed out to me that relative traffic from SERPs is not necessarily a good proxy for the weight Google gives any given page on a query. (rank #10 usually gets more traffic than #9)
    8:52 pm on May 23, 2011 (gmt 0)

    10+ Year Member



    Well, here's an example of what we just did. We tracked 2 million search referrers hour-by-hour over the last month. During each hour, we checked whether a keyword went up or down from the last time we checked it. Again... not a mathematician.

    Date|Rank Up|Rank Down
    23-05-2011 6 PM|23.92%|27.59%
    23-05-2011 5 PM|32.51%|27.29%
    23-05-2011 4 PM|27.02%|33.12%
    23-05-2011 3 PM|25.94%|28.72%
    23-05-2011 2 PM|40.15%|27.63%
    23-05-2011 1 PM|27.64%|37.92%
    23-05-2011 12 PM|41.29%|31.61%
    23-05-2011 11 AM|43.19%|26.34%
    23-05-2011 10 AM|5.68%|38.43%
    23-05-2011 9 AM|34.54%|40.02%
    23-05-2011 8 AM|22.25%|26.34%
    23-05-2011 7 AM|33.85%|27.68%
    23-05-2011 6 AM|36.22%|22.77%
    23-05-2011 5 AM|21.98%|24.21%
    23-05-2011 4 AM|32.07%|30.4%
    23-05-2011 3 AM|24.43%|26.75%
    23-05-2011 2 AM|37.9%|26.21%
    23-05-2011 1 AM|27.74%|29.34%
    23-05-2011 12 AM|28.4%|27.31%
    22-05-2011 11 PM|29.89%|24.14%
    22-05-2011 10 PM|28.97%|26.32%
    22-05-2011 9 PM|20.8%|31.43%
    22-05-2011 8 PM|22.92%|27.28%
    22-05-2011 7 PM|24.65%|25.4%
    22-05-2011 6 PM|28.38%|24.76%
    22-05-2011 5 PM|26.56%|30.16%
    22-05-2011 4 PM|24.85%|25.46%
    22-05-2011 3 PM|38.34%|24.14%
    22-05-2011 2 PM|32.24%|28.08%
    22-05-2011 1 PM|29.62%|19.26%
    22-05-2011 12 PM|19.67%|28.56%
    22-05-2011 11 AM|10.02%|28.11%
    22-05-2011 10 AM|22.53%|24.33%
    22-05-2011 9 AM|14.17%|7.21%
    22-05-2011 8 AM|20.37%|13.91%
    22-05-2011 7 AM|30.22%|65.28%
    22-05-2011 6 AM|13.42%|38.37%
    22-05-2011 5 AM|34.01%|20.49%
    22-05-2011 4 AM|28.72%|23.7%
    22-05-2011 3 AM|23.77%|23.83%
    22-05-2011 2 AM|29.97%|21.83%
    22-05-2011 1 AM|29.05%|24.5%
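For anyone wanting to reproduce this kind of table, here is a rough sketch of the hourly tally. It assumes each observation is an `(hour, keyword, rank)` tuple already sorted by time, that "up" means the rank number got smaller, and that percentages are taken over observations that have an earlier rank to compare against. All of that is an assumption about the method, not a description of the actual code behind the numbers above.

```python
from collections import defaultdict


def hourly_rank_moves(observations):
    """Return {hour: (pct_up, pct_down)} from time-sorted (hour, keyword, rank) tuples.

    A keyword's first sighting has nothing to compare against, so it only
    seeds the "last seen" rank and is not counted (assumption).
    """
    last_rank = {}
    counts = defaultdict(lambda: [0, 0, 0])  # [up, down, compared] per hour
    for hour, keyword, rank in observations:
        previous = last_rank.get(keyword)
        if previous is not None:
            tally = counts[hour]
            tally[2] += 1
            if rank < previous:      # smaller number = better position
                tally[0] += 1
            elif rank > previous:
                tally[1] += 1
        last_rank[keyword] = rank
    return {
        hour: (100.0 * up / compared, 100.0 * down / compared)
        for hour, (up, down, compared) in counts.items()
    }


# Made-up observations for illustration only.
sample = [
    ("1 AM", "a", 3), ("1 AM", "b", 5),
    ("2 AM", "a", 2), ("2 AM", "b", 7),
    ("2 AM", "a", 2), ("2 AM", "b", 7),
]
print(hourly_rank_moves(sample))
```

Up and down need not add to 100%, because unchanged ranks fall into neither column.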
    8:22 am on May 24, 2011 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    If you have the positions for all keywords, a basic measure would be the average absolute change in position:

    Change = (sum over all keywords of |old position - new position|) / (number of keywords)

    Of course, this is only a simple measure, because it doesn't take into account that a move near the top of the results matters more than a move of the same size further down.

    You should test different measures, e.g.

    Change = (sum over all keywords of |1/(old position) - 1/(new position)|) / (number of keywords)

    and analyze which works best.
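The two measures above could be sketched as follows. The function names and the sample pairs are made up for illustration; each pair is `(old position, new position)` for one keyword.

```python
def mean_abs_change(pairs):
    """Average of |old - new| over (old_position, new_position) pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(abs(old - new) for old, new in pairs) / len(pairs)


def mean_abs_reciprocal_change(pairs):
    """Average of |1/old - 1/new|, which weights moves near the top more heavily."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(abs(1 / old - 1 / new) for old, new in pairs) / len(pairs)


# Two one-place drops: one at the top of the results, one lower down.
top_move = [(1, 2)]
low_move = [(10, 11)]
print(mean_abs_change(top_move), mean_abs_change(low_move))
print(mean_abs_reciprocal_change(top_move), mean_abs_reciprocal_change(low_move))
```

Both moves score the same on the first measure, while the reciprocal version rates the drop from #1 to #2 far higher than the drop from #10 to #11.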
     
