Realise I can't give specifics so this is a made-up illustration.
Suppose my site is called foo-beaches (it's not) and I might have URLs like:-
foo-beaches/Australia/Bondi
foo-beaches/Spain/CostaBrava
etc.
What I've been seeing in my server logs for 2 days is a bunch of 404s for plausible-looking (but non-existent) URLs, where the referrer is also a page that does not and never did exist on my site. Like this:
File does not exist; foo-beaches/Iceland, referrer foo-beaches/Iceland/Reykjavik
I'm getting hundreds, with numerous places and countries injected into URLs that look like my real ones but are nonsense (like beaches in Iceland, Antarctica, Vatican City - all made up but you get the idea).
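To make the pattern concrete, here is a rough Python sketch of the check described above: a referrer that names my own site but points at a page that was never there. The host name foo-beaches.example, the KNOWN_PATHS set and the function name is_phantom_referrer are all made up for illustration; a real version would load the page list from a sitemap or the CMS.

```python
from urllib.parse import urlsplit

# Hypothetical host and page list standing in for the real site (assumptions).
SITE_HOST = "foo-beaches.example"
KNOWN_PATHS = {"/", "/Australia/Bondi", "/Spain/CostaBrava"}


def is_phantom_referrer(referrer, host=SITE_HOST, known_paths=KNOWN_PATHS):
    """Return True if the referrer names this site but a page that does not exist."""
    parts = urlsplit(referrer)
    if parts.hostname != host:
        # External, empty or "-" referrer: not the pattern described here.
        return False
    # Treat "/Australia/Bondi/" and "/Australia/Bondi" as the same page.
    path = parts.path.rstrip("/") or "/"
    return path not in known_paths


if __name__ == "__main__":
    for ref in (
        "http://foo-beaches.example/Iceland/Reykjavik",
        "http://foo-beaches.example/Australia/Bondi",
        "-",
    ):
        print(ref, is_phantom_referrer(ref))
```

A 404 whose referrer passes this check cannot have come from a genuine click on the site, which is what makes these requests stand out from ordinary broken links.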
Where do I start? Anyone recognise this form of attack and know how to prevent it? Obviously my key concern is that Google is also seeing these fake URLs somehow and starting to index whatever garbage is being placed there.
Many thanks in advance - really appreciate any help I can get on this.
Cheers.
have you noticed any pattern of user agents or ip addresses in your access logs?
The 404s are coming from a range of IPs, but there is one that stands out, causing around 25% of them. Appears to be somewhere in the Philippines (not a likely market for my site). I've added a firewall rule to drop packets from that particular IP but am still seeing the 404s from other IPs.
The User Agent for *all* the requests is set to "Opera/9" (from 9.1 to 9.17).
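For anyone wanting to look for the same kind of pattern, this is roughly how the 404s could be tallied by IP and user agent. It is only a sketch and assumes the access log is in the common Apache/Nginx "combined" layout; the regular expression, the function name tally_404s and the sample lines (documentation-range IPs, invented timestamps) are illustrative, not taken from my real logs.

```python
import re
from collections import Counter

# Assumed "combined" access log layout; adjust the pattern if your format differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_404s(lines):
    """Count 404 responses per client IP and per user agent string."""
    by_ip, by_agent = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("status") != "404":
            continue  # skip unparseable lines and non-404 responses
        by_ip[match.group("ip")] += 1
        by_agent[match.group("agent")] += 1
    return by_ip, by_agent


# Made-up sample lines for illustration only.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2000:00:00:01 +0000] "GET /Iceland HTTP/1.1" 404 209 "http://foo-beaches.example/Iceland/Reykjavik" "Opera/9.10"',
    '203.0.113.7 - - [01/Jan/2000:00:00:02 +0000] "GET /Antarctica HTTP/1.1" 404 209 "http://foo-beaches.example/Antarctica/Vostok" "Opera/9.17"',
    '198.51.100.4 - - [01/Jan/2000:00:00:03 +0000] "GET /VaticanCity HTTP/1.1" 404 209 "http://foo-beaches.example/VaticanCity/Beach" "Opera/9.10"',
    '198.51.100.9 - - [01/Jan/2000:00:00:04 +0000] "GET /Australia/Bondi HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    ips, agents = tally_404s(SAMPLE)
    print(ips.most_common(10))
    print(agents.most_common(10))
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. `with open("access.log") as f: tally_404s(f)` (the file name is a placeholder).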
Still puzzling over why someone should bother - it's like a dictionary attack to guess URLs of my site but using place names; except the place names are unrelated to the subject of my site...
Have Opera users been infected with a bizarre geographical guessing redirect trojan... (;^) ? Or does this indicate some kind of distributed scraping (from multiple IPs)?
If I redirect the bogus URLs to my home page, is there a chance Google will associate whatever it might have found at those bogus URLs (assuming they ever actually got resolved) with my site?
Might there be mileage in redirecting them to a less valuable page, or even a page specifically created for that purpose that I could apply a meta noindex to?