
Non-existent pages indexed by search engines

How do I return the right code?



7:19 pm on Jan 24, 2012 (gmt 0)

5+ Year Member

I noticed that I have hundreds of pages indexed that don't exist on my website. For instance,


The URLs are almost always spammy keywords for drugs, games, and the like. When I click on these pages, all I see are blank pages with empty source code. When I check the server, these pages simply don't exist.

In addition, I have many other pages indexed with URLs like the ones below. When I click on them I do end up on the archive/category/post page, but they are duplicates, because ideally people should reach that content directly:




My WordPress installation is fully up to date. I asked about this a while ago elsewhere, and the answer was: "Google will index any page that returns a 200 status code. If someone links to a non-existent page on your site and it returns a 200 status code, Google will likely index it."
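A quick way to see what that advice means in practice is to request one of the phantom URLs and look at the status code the server sends back. A minimal sketch in Python, with a made-up URL standing in for one of the indexed pages:

import urllib.request
import urllib.error

# Made-up example URL; substitute one of the spammy URLs you found in the index.
url = "http://www.example.com/nonexistent-spammy-page/"

try:
    response = urllib.request.urlopen(url)
    # Reaching this point means the server answered with a 2xx status.
    print(url, "->", response.status)
except urllib.error.HTTPError as e:
    # 404, 410, 500 and so on land here; e.code is the status a crawler sees.
    print(url, "->", e.code)

If a URL that should not exist prints 200, the server is telling search engines the page is real; it should print 404 (or 410).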

This is like rocket science to me, but I went to check my 404 page and noticed that I had customized it many years ago with links to my search page and home page. Assuming that might be the problem, I cleaned all that up and inserted the Google 404 page widget JavaScript.

Will this do it? What else do I need to do? How do I make sure that whoever is linking to these pages can't achieve this result? Is there a way to test that the 404 page is compliant? And why would anyone link to these pages in the first place? I have noticed a lot of Russian and Polish spam websites linking to my plastic surgery website. Why would they do that? I thought getting links was difficult, but these people probably mean harm.


8:39 pm on Jan 24, 2012 (gmt 0)

I would check your plugins: anything you have installed from overseas developers? Also check your source code for anything weird.

I've been hacked a few times on my WordPress sites, and I suspect most of the time it was due to an installed plugin; other times it was from not keeping up to date on security.


8:52 pm on Jan 24, 2012 (gmt 0)

g1smd (WebmasterWorld Senior Member)

The Live HTTP Headers extension for Firefox will show the HTTP status code that is returned for any and all URL requests made.

The server always sends a status code back. Yes, returning 200 OK for every request is a BIG problem.
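If you would rather script that check than install a browser extension, the same information can be pulled with a few lines of Python; the host and path below are placeholders:

import http.client

# Placeholder host and path; point these at your own site and one of the phantom URLs.
host = "www.example.com"
path = "/nonexistent-spam-page/"

conn = http.client.HTTPConnection(host)
conn.request("HEAD", path)            # HEAD returns only the status line and headers
resp = conn.getresponse()

print("Status:", resp.status, resp.reason)   # a phantom page should report 404 Not Found
for name, value in resp.getheaders():        # roughly what Live HTTP Headers displays
    print(name + ":", value)
conn.close()

Some setups answer HEAD differently from GET, so if the result looks suspicious, repeat the request with "GET".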


1:12 am on Jan 25, 2012 (gmt 0)

lucy24 (WebmasterWorld Senior Member)

You need to look from the other end. What happens when someone requests a pseudo-directory (I assume you're rewriting from displayed directory to actual query) or uses a bogus query term? What should not happen is that your site goes ahead and creates a page.

A custom 404 page is very very unlikely to be the problem. Error pages are for humans. Robots just note the 404 and carry on. Unlike humans, they can even choose not to follow redirects.

Even in the URL Parameters area of Google Webmaster Tools and similar, there's no way to say "Ignore everything except..." So you need to make sure that spurious parameters, or impossible values, aren't being processed in the first place.
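That last point is the crux: the application, not the error page, decides whether a URL is real. WordPress means PHP in practice, but the idea can be sketched in a few lines of Python with made-up parameter names; the point is simply to whitelist what you accept and answer 404 for everything else:

from urllib.parse import parse_qs

# Parameters the site actually supports, each with a simple sanity check.
# These names are invented for illustration; use your own list.
ALLOWED_PARAMS = {
    "page": lambda v: v.isdigit(),
    "cat":  lambda v: v.isalnum(),
}

def status_for_query(query_string):
    """Return 200 only if every parameter is known and its value is plausible;
    otherwise 404, so bogus URLs never come back looking like real pages."""
    params = parse_qs(query_string, keep_blank_values=True)
    for name, values in params.items():
        check = ALLOWED_PARAMS.get(name)
        if check is None or not all(check(v) for v in values):
            return 404
    return 200

print(status_for_query("page=2&cat=news"))      # 200: known parameters, sane values
print(status_for_query("buy=cheap-pills-now"))  # 404: unknown parameter, rejected

Handled that way, a spammy link can still point at your site, but the response it earns is a 404, and search engines eventually drop the URL.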
