WordPress hack 6 months ago still shows in SERPs

         

Western

11:55 pm on Feb 18, 2009 (gmt 0)

10+ Year Member



I'm looking into a site running a WordPress blog that was corrupted about six months ago. Links to 'bad neighborhoods' were found in the source code. I was asked to look into the site again (I haven't followed its progress for a while) after the owner noticed that site:example.com 'drugname' still returned results showing the bad neighborhood links in the cached source code.

The site seems to have been in a holding pattern for the last few months as far as Google traffic is concerned.

Interestingly, the site does well for long tail search terms and extremely well for competitive image queries (by competitive...3,000+ unique visits per day from a single query). The site isn't optimized for images and has relatively few images in proportion to the rest of the content. New content is added to the blog daily.

I'm confident that the original hack was resolved and that there have been no other WordPress corruptions since the original incident. But I don't know whether a site:example.com 'drugname' query a month, 2 months, or 3 months ago was returning the cached hack results. I imagine the query was returning this same result in the interim. That seems more likely than six-month-old cache results suddenly reappearing.

Any ideas?

[edited by: tedster at 12:07 am (utc) on Feb. 19, 2009]
[edit reason] switch to example.com - it cannot be owned [/edit]

tedster

1:02 am on Feb 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you browsed the site using a Googlebot user agent? Some of the WP hacks cloak the injected content: they hide it from ordinary browser user agents and show it only to Googlebot. I ask because 6 months is WAY too long for Google to be showing old source code.
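
For anyone who wants to script that check, here is a rough sketch in Python using only the standard library. It fetches the same URL with two user agents and lists links that appear only in the Googlebot version. The function names and the browser user-agent string are my own choices for illustration, and it only catches user-agent cloaking, not anything keyed to the requesting IP.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Example user-agent strings; the browser one is an arbitrary illustration.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def links_only_for_bot(browser_html, bot_html):
    """Return links present in the bot response but absent from the browser one."""
    return sorted(extract_links(bot_html) - extract_links(browser_html))


def fetch(url, user_agent):
    """Download a page while sending the given User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Something like `links_only_for_bot(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA))` on a suspect page should come back empty on a clean site. Pages with rotating content can produce harmless differences, so treat any output as a list to review by hand.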

Western

8:08 pm on Apr 29, 2009 (gmt 0)

10+ Year Member



Update:

Tedster was correct. The hack was cloaking for the Googlebot user agent. The code was removed in late February.

But the SERPs are still showing the same cached results for site:example.com "spam drug name".

I cannot see any of the spam URLs that Google has cached for a page when I view that page as Googlebot. Any ideas?

Receptional Andy

8:12 pm on Apr 29, 2009 (gmt 0)



What date is listed in the cached page, Western?

Western

10:26 pm on Apr 29, 2009 (gmt 0)

10+ Year Member



The latest cache date is March 25th, 2009 (for these results). Other pages on the site are showing cache dates as recent as a few days ago.

Receptional Andy

11:03 pm on Apr 29, 2009 (gmt 0)



The Google cache is pretty screwy sometimes, but the date is usually pretty reliable. Chances are high that whatever is in the cache matches what the response from your server was to googlebot on that date.

It's possible to cloak via IP, which you will not be able to detect with a browser. If the cache post-dates your fix for the hack, I would check for the hack again, looking particularly at mechanisms for IP delivery: server config (including .htaccess files) and any dynamically-generated content.
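
As a starting point for that kind of check, here is a minimal sketch in Python that walks a site directory and flags lines in .htaccess and PHP files that mention the requester's IP or user agent, or that use common obfuscation calls. The pattern list is my own guess at what is worth a look, not a definitive signature set, and plenty of legitimate WordPress code will match it.

```python
import os
import re

# Heuristic patterns (an assumption, not an exhaustive list): conditions on
# the requester's IP or user agent, plus calls often seen in obfuscated PHP.
SUSPICIOUS = re.compile(
    r"REMOTE_ADDR|HTTP_USER_AGENT|googlebot|base64_decode|gzinflate|eval\s*\(",
    re.IGNORECASE,
)


def suspicious_lines(text):
    """Return (line_number, stripped_line) pairs that match a pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]


def scan_tree(root):
    """Scan every .htaccess and .php file under root; map path -> matches."""
    hits = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name == ".htaccess" or name.endswith(".php"):
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    found = suspicious_lines(handle.read())
                if found:
                    hits[path] = found
    return hits
```

Running `scan_tree` against a copy of the web root gives a shortlist for manual review. It cannot see injected content stored in the database or rules set in the main server config outside the scanned directory.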

Western

11:11 pm on Apr 29, 2009 (gmt 0)

10+ Year Member



AFAIK the cache date is after the fix. But I need to confirm this with the site owner.

If that is not the case, what is a reasonable amount of time in which Google will update the cache? And what kind of penalty liability am I looking at for this hack?

Receptional Andy

11:28 pm on Apr 29, 2009 (gmt 0)



You should see if there are any messages within Google Webmaster Tools for the site, in which case you will be able to respond to Google directly to get the site reinstated and begin any recovery.

Whether the site has suffered at all is something you can only diagnose by its performance. If the URLs have an "expected" rank then you don't have any problem.

"what is a reasonable amount of time in which Google will update the cache?"

Never. 6 months. 2 weeks. 10 minutes ;)

It depends on the links directly to the URL, and whether Google's spidering algorithms believe it to be content worthy of checking on frequently (mostly the former if you ask me).

IMO, a telling sign is when Googlebot requests the URL but the copy it fetched is not retrievable in the results.
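
One way to check whether Googlebot has requested a URL is to look in the server's access log. The sketch below, in Python, assumes the log is in Apache's combined format; the function name and regular expression are illustrative. It matches on the user-agent string only, which anyone can spoof, so it does not prove the request really came from Google.

```python
import re
from datetime import datetime

# Matches the timestamp, request path, status and user agent of a line in
# Apache "combined" log format (assumed here; adjust for other formats).
LOG_LINE = re.compile(
    r'\[(?P<when>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_fetches(log_lines, path):
    """Return (timestamp, status) for each request to path claiming to be Googlebot."""
    fetches = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        if match.group("path") == path and "Googlebot" in match.group("agent"):
            when = datetime.strptime(match.group("when"), "%d/%b/%Y:%H:%M:%S %z")
            fetches.append((when, int(match.group("status"))))
    return fetches
```

If the log shows fetches of the page after the cleanup date but the cached copy still carries an older date, that points to Google simply not having refreshed the cache. A cached copy dated after the cleanup that still contains spam points back at the server.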

Western

11:48 pm on Apr 29, 2009 (gmt 0)

10+ Year Member



"Never. 6 months. 2 weeks. 10 minutes ;)"

I had a feeling that would be the answer. Thanks for the help, guys.

Receptional Andy

12:08 am on Apr 30, 2009 (gmt 0)



"I had a feeling that would be the answer"

It's the range of possible answers ;)

You can look at an individual URL and get a reasonable idea of how frequently you would expect it to be cached - primarily based on past spidering activity and the "strength" of links to the page. Otherwise, your guess is as good as mine :)
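
To put a rough number on past spidering activity, you could take the times at which Googlebot fetched a URL (from your logs, say) and look at the gaps between them. This is only a sketch in Python with made-up function names, and past crawl gaps are a hint at best, not a prediction of when the cache will refresh.

```python
from datetime import datetime
from statistics import median


def crawl_gaps_in_days(fetch_times):
    """Return the gaps, in days, between consecutive fetches of one URL."""
    ordered = sorted(fetch_times)
    return [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(ordered, ordered[1:])
    ]


def typical_gap_in_days(fetch_times):
    """Return the median gap in days, or None with fewer than two fetches."""
    gaps = crawl_gaps_in_days(fetch_times)
    return median(gaps) if gaps else None
```

The median is used rather than the mean so that one unusually long gap does not dominate the estimate.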