Forum Moderators: phranque
http://example.com/porn-keyword.html
After some checking I realised the spamming site must have learned that the search form on my site doesn't filter out HTML code. It then apparently set up links to my search form that would cause Google to check it out, turning the phrase:
We didn't find "search keyword"
Into a link to their spam site.
Unfortunately I no longer have access to the source code of this script, and until it can be replaced I am considering a mod_rewrite solution.
Current candidates include:
- check for the spammer's domain and rewrite all URLs containing it to 404 (how?)
- check for "<" and ">" and rewrite them to entity references.
- check for url elements such as http or html and remove those.
What would you do?
SN
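For the first candidate, a minimal mod_rewrite sketch might look like the following. It assumes Apache 2.2 or later (for a non-redirect status code in the R flag), that the rules sit in the site's root .htaccess or in the virtual host, and it uses spammerdomain.com as a placeholder for the real domain. Rather than rewriting anything, it simply answers 404 when the request line mentions the domain or when the path form of the search contains an angle bracket.

```apache
RewriteEngine On

# Placeholder domain: answer 404 to any request line that mentions it.
# THE_REQUEST is the raw, undecoded request line, so this normally
# catches the domain in the path as well as in the query string.
RewriteCond %{THE_REQUEST} spammerdomain\.com [NC]
RewriteRule ^ - [R=404,L]

# Path form (/search/keywords): the RewriteRule pattern is matched
# against the decoded URL-path, so literal angle brackets can be
# tested directly.
RewriteRule ^/?search/.*[<>] - [R=404,L]
```

Refusing such requests is a swapped-in simplification: converting "<" and ">" to entity references inside the URL is possible but considerably fiddlier in mod_rewrite than rejecting the request outright.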
The mod_rewrite rule seems to have worked in the end.
Another complicating factor is that this search page can accept searches either via the path or via a query string. So it can be called as "/search/keywords" or as "/search?q=keywords". My mod_rewrite rules fix the first, but not the second. Can I rewrite query strings in mod_rewrite?
SN
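Query strings can be tested in mod_rewrite, though not in the RewriteRule pattern itself, which only sees the URL-path. A RewriteCond on %{QUERY_STRING} is the usual route. The sketch below is an assumption about how that could be applied here: it refuses "/search?q=..." requests whose query string carries an angle bracket, raw or percent-encoded, instead of trying to re-encode it.

```apache
RewriteEngine On

# QUERY_STRING is not URL-decoded, so match the percent-encoded
# forms (%3C, %3E) as well as raw angle brackets.
RewriteCond %{QUERY_STRING} (<|>|%3C|%3E) [NC]
RewriteRule ^/?search - [F]
```

The F flag returns 403 Forbidden; if a 404 is preferred, [R=404,L] should work the same way on Apache 2.2 or later.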
Yes, I understand Google creates a link to their site, but that's only after YOU back-check things. It doesn't do a thing for the spammer that I know of, nor does it help them in any other way, except that Google files, someplace, a few error links... So much for blackhat techniques, how stupid. This really is no more effective than referral spam, it's just plain dumb.
Here is the only part that bothers me:
If an SEO creates deceptive or misleading content on your behalf, such as doorway pages or "throwaway" domains, your site could be removed entirely from Google's index.
Might this affect you?
You did not hire them, but...
I don't know...
To be on the safe side, you may wish to file a report:
[google.com...]
It's the same scheme of referral spam (at least in the way I can see this).
But... what if my stats pages are password protected?
This site is a large business directory, running on old scripts. One of these scripts, a search, does not escape incoming search queries. It also repeats the query on the "entry not found" error page, which returns an HTTP 200 status code.
Now basically the spammer must have created links to this site (I only know indirectly from Google's indexing behavior) in this form:
[mydomain.com...] href=http://spammerdomain.com/pron-keyword.html>the</a>
This causes my search script to not find anything, and to report that it hasn't found any results for the keyword "the", where that word is hyperlinked to his site.
The purpose of this spam operation is to create backlinks to his site from this one (which is a strong PR6 with tens of thousands of pages with high PR).
I have now solved the issue by using mod_rewrite to encode the "<" and ">" characters and by deleting all occurrences of "spamdomain.com" from incoming URL strings.
The spammer must have investigated this site and deliberately targeted it after finding out how the script works.
Also, Googlebot hit it with thousands of different porn keywords as target pages, so it was a wide attack.
SN
Would the spam report still be a good place to report this? And what would I use to fill in the query-related items?
SN