Hack Redirect Analysis from Google's John Mueller - scary stuff - Google Search and SEO forum at WebmasterWorld - WebmasterWorld

Forum Moderators: Robert Charlton & goodroi

tedster

6:17 pm on Feb 21, 2009 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Google's John Mueller (JohnMu on their webmaster forum) has spotted and blogged about a hack that may be very hard to notice. If your page has been hacked this way, a user clicking on the Google search result will be redirected to another site, but ONLY THE FIRST TIME!

Googlebot will not see the redirect when it spiders, and a direct visit to your page through the location bar or a link elsewhere will not be redirected -- only the first click by an end user on a Google result will be redirected.
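One way to check for this kind of conditional redirect is to request the same page with different Referer and User-Agent headers and compare the status codes. The sketch below is only an illustration: the header values, the function names and the injectable `fetch` callable are assumptions of mine, not anything from John's post. A hack that keys the "first time only" behaviour on cookies or IP addresses could still slip past it.

```python
import urllib.error
import urllib.request

# Hypothetical request profiles; the exact header values are assumptions.
PROFILES = {
    "direct": {"User-Agent": "Mozilla/5.0"},
    "google_click": {
        "User-Agent": "Mozilla/5.0",
        "Referer": "http://www.google.com/search?q=example",
    },
    "googlebot": {
        "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                      "+http://www.google.com/bot.html)",
    },
}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, headers):
    """Return (status_code, Location header or None) without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(request, timeout=10) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx answers arrive here as HTTPError.
        return err.code, err.headers.get("Location")


def find_conditional_redirects(url, fetch=fetch_status):
    """Report profiles that get a 3xx answer when a plain direct visit does not."""
    results = {name: fetch(url, headers) for name, headers in PROFILES.items()}
    baseline_status, _ = results["direct"]
    suspicious = {}
    for name, (status, location) in results.items():
        if 300 <= status < 400 and not 300 <= baseline_status < 400:
            suspicious[name] = (status, location)
    return suspicious
```

Calling `find_conditional_redirects("http://www.example.com/page.html")` against your own page returns an empty dict when every profile gets the same non-redirect answer. Anything else in the result is worth a closer look.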

I first heard one single report like this last fall, but apparently the hack is now growing in the wild. I wonder how many traffic anomalies we hear about are related to this type of hack. If your Google traffic seems too low for your SERP position, this is worth looking into.

John closes with this observation:

Recognizing something like this algorithmically on Google's side would be possible with the Googlebar-data. Assuming all shown URLs are recorded, they could compare the URL clicked in the search results with the URL finally shown on the user's browser (within the frames). At the same time, the setup could be used to detect almost any kind of cloaking.
[johnmu.com...]

tedster

6:25 pm on Feb 21, 2009 (gmt 0)

I'm not usually a happy camper when it comes to any organization collecting huge masses of data. But if the toolbar data can help Google spot this hack, that sounds like a very positive thing to me - especially if they send the hacked webmaster a heads-up email.

This hack is no mere defacement, and not even parasite hosting for links. It's out and out traffic theft.

rollinj

10:13 pm on Feb 21, 2009 (gmt 0)

This information is 2 years old by now (re: post date 2007). Do you think it's still relevant? If they were outed so professionally and publicly, I doubt they're still using anywhere near the same tactics... but maybe!

adamnichols45

11:21 pm on Feb 21, 2009 (gmt 0)

So basically, when using Live HTTP Headers, if I can't see a 302 then I should be OK?

gouri

12:40 am on Feb 22, 2009 (gmt 0)

Are you supposed to keep Live HTTP Headers open as a sidebar, click through to a webpage from Google, and then check whether a 302 is mentioned in the headers for the page you land on? So if it says
HTTP/1.x 200 OK
does that mean everything is ok? I am not sure which line to look at?

tedster

1:44 am on Feb 22, 2009 (gmt 0)

This information is 2 years old by now

Thanks, clearly I didn't notice that. His RSS feed just published it again for some reason, and I took off running.

tedster

2:12 am on Feb 22, 2009 (gmt 0)

So if it says
HTTP/1.x 200 OK
does that mean everything is ok? I am not sure which line to look at?

You know if your page redirects, right? So if your server should not redirect the request for your page but it does, then you've got this kind of problem.

Which line to look at in the HTTP headers? There may be lots of chatter in the HTTP headers when you click on a Google search result, depending on your toolbars, add-ons, etc. You're looking for the section with your server's response to the browser request "GET [your suspect url] Host: [your hostname]". It will come soon after (usually immediately after) the Google server chatter.

Not every related hack will use a 302; it may be a 301 or some other redirect.
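If the header capture is long, a small script can pick out your own server's responses from the surrounding chatter. This is a rough sketch that assumes the capture is plain text with a request line, request headers, then a status line and response headers for each exchange. The function name `responses_for_host` and the sample capture are made up for illustration.

```python
import re

REQUEST_LINE = re.compile(r"^(GET|POST|HEAD) (\S+) HTTP/\S+$")
STATUS_LINE = re.compile(r"^HTTP/\S+ (\d{3})\b")


def responses_for_host(capture, hostname):
    """Return (path, status, location) for each captured exchange with hostname."""
    exchanges = []
    current = None
    for raw in capture.splitlines():
        line = raw.strip()
        request = REQUEST_LINE.match(line)
        status = STATUS_LINE.match(line)
        if request:
            # A new request line starts a new exchange.
            current = {"path": request.group(2), "host": None,
                       "status": None, "location": None}
            exchanges.append(current)
        elif current is None:
            continue
        elif status:
            current["status"] = int(status.group(1))
        elif ":" in line:
            name, value = line.split(":", 1)
            name = name.strip().lower()
            if name == "host" and current["status"] is None:
                # Host is a request header, so it appears before the status line.
                current["host"] = value.strip()
            elif name == "location" and current["status"] is not None:
                # Location is a response header, so it appears after the status line.
                current["location"] = value.strip()
    return [
        (e["path"], e["status"], e["location"])
        for e in exchanges
        if e["host"] == hostname
    ]
```

Any 3xx status in the result for a page that should not redirect, together with a Location pointing at a site that is not yours, is the symptom described in this thread.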

aakk9999

3:28 am on Apr 11, 2009 (gmt 0)

I am not sure if this is the above problem but we have spotted what we consider far too many redirects on one of our sites (about 3.5% of rows in the logs are 301 redirects). Also, there are redirects for pages that should not redirect, and there are even redirects when accessing robots.txt.

What is odd is that the same page will normally not redirect, but sometimes does. For example, in the same log we can find both of the lines shown below: the first is a redirect, the second is what we would normally expect.

GET /robots.txt HTTP/1.1" 301 349 "-" "Mozilla/5.0 (compatible; Charlotte/1.1; http://www.searchme.com/support/)

GET /robots.txt HTTP/1.1" 200 2017 "-" "Mozilla/5.0 (compatible; Charlotte/1.1; http://www.searchme.com/support/)

We also have:

GET / HTTP/1.0" 301 327 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)

GET / HTTP/1.0" 200 21848 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

I have also noticed that most entries where the user agent is Slurp/3.0 are 301 redirects.

Any ideas how I can find out what is going on and what would be the best course of action to take?

Many thanks

wilderness

5:09 am on Apr 11, 2009 (gmt 0)

aakk9999,
More than likely what is happening to you is the bots are requesting www.example.com rather than www.example.com/

The subject of this thread is an entirely different issue.

aakk9999

10:38 am on Apr 11, 2009 (gmt 0)

Well, I have allowed for the trailing slash in my checks of 301s and I do not think this is the issue; I can recognise these.

The reason I posted here is:

You know if your page redirects, right? So if your server should not redirect the request for your page but it does, then you've got this kind of problem.

I have a number of GET requests in the logs where the page requested is an actual HTML page which should not redirect (and also should not have a trailing slash at the end), so it is not a trailing-slash issue, and your response does not explain the occasional 301 on robots.txt.

I know which pages should not redirect.

GET /example.html
GET /example1.html
etc...

None of these should redirect. Looking in the logs, in most cases they don't, but then I can see some GET requests where they do, and the request is exactly the same (the user agent may or may not be the same, it varies).

And such redirects are never followed by another request to our site by the same user agent.

Receptional Andy

11:55 am on Apr 11, 2009 (gmt 0)

aakk9999 - as your logs don't include the host requested, the bots may well be starting at (for example) http://example.com as opposed to http://www.example.com (the requests would appear to be the same in your logs, as they don't include the host). If you're redirecting appropriately, then you'd see this log line. Nothing to worry about.
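To see how widespread the mixed behaviour is, the log can be tallied by path and status. The sketch below assumes access-log lines shaped like the excerpts quoted above (request line, closing quote, status, size); the function name `paths_with_mixed_status` is hypothetical. It cannot tell a non-www request from a www one, because, as noted, these logs do not record the host. On Apache, adding `%{Host}i` to the `LogFormat` directive records the Host header and would settle that question.

```python
import re
from collections import Counter, defaultdict

# Matches e.g.:  GET /robots.txt HTTP/1.1" 301 349 ...
LOG_REQUEST = re.compile(r'(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+" (\d{3}) ')


def paths_with_mixed_status(lines):
    """Map each path that returned both a 200 and a 301/302 to its status counts."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LOG_REQUEST.search(line)
        if match:
            path, status = match.group(1), int(match.group(2))
            counts[path][status] += 1
    return {
        path: dict(statuses)
        for path, statuses in counts.items()
        if 200 in statuses and (301 in statuses or 302 in statuses)
    }
```

Usage would be along the lines of `paths_with_mixed_status(open("access.log"))`, where the file name is a placeholder for your own log.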