Forum Library, Charter, Moderators: phranque

Webmaster General Forum

    
Non existent pages indexed by search engines
How do I return the right code?
skweb




msg:4410388
 7:19 pm on Jan 24, 2012 (gmt 0)

I noticed that I have hundreds of pages indexed that don't exist on my website. For instance,

domain.com/blog/wp-content/keyword-keyword.html

They are almost always spammy keywords for drugs and games etc. When I click on these pages, all I see are blank pages with empty code. When I check the server, these pages simply don't exist.

In addition, I have many other pages indexed with URLs like the ones below. When I click on them I do end up on an archive, category, or post page, but these are duplicate pages, because ideally someone should be able to reach them directly:

domain.com/blog/?m=fcwzvwvr&paged=112

domain.com/blog/?cat=iukuqvfv&paged=3

domain.com/blog/?p=pwivtbwb&paged=180

My WordPress installation is totally up to date. I asked this question a while ago in another place, and the answer was: "Google will index any page that returns a 200 status code. If someone links to a non-existent page on your site and it returns a 200 status code, Google will likely index it."

This is like rocket science to me, but I still went to check my 404 page and noticed that I had customized it many years ago with links to my search page and home page. Assuming that may be the problem, I cleaned all that up and have now inserted the Google 404 page widget JavaScript.

Will this do it? What else do I need to do? How do I make sure that whoever is linking to these pages will not be able to achieve this result? Is there a way to test that the 404 page is in compliance? By the way, why would anyone try to link to these pages? I have noticed a lot of Russian and Polish spam websites linking to my plastic surgery website. Why would they do it? I thought getting links was difficult but these people probably mean harm.

 

viralvideowall




msg:4410416
 8:39 pm on Jan 24, 2012 (gmt 0)

I would check your plugins ... anything you have installed from overseas developers? Also check your source code for weird stuff.

I've been hacked a few times on my WordPress sites, and I suspect most of the time it was due to an installed plugin, and other times due to not keeping up to date on security.

g1smd




msg:4410422
 8:52 pm on Jan 24, 2012 (gmt 0)

The Live HTTP Headers extension for Firefox will show the HTTP status code that is returned for any and all URL requests made.

The server always sends a status code back. Yes, returning 200 OK for all is a BIG problem.
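
If you would rather check from a script than from a browser extension, something along these lines should work. It is only a rough sketch using Python's standard library; the function names `fetch_status` and `looks_like_soft_404` are made up for illustration. Note that `urllib` follows redirects on its own, so what you get back is the status of the final response, not of the first hop.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code the server sends for `url`.

    Redirects are followed automatically by urllib, so this is the
    status of the final response.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "status-check/0.1"}  # arbitrary UA string
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still available.
        return error.code


def looks_like_soft_404(url, timeout=10.0):
    """True when a URL that should NOT exist is answered with 200 OK."""
    return fetch_status(url, timeout) == 200
```

To use it, call `fetch_status()` on one of the spammy URLs from the search results, or call `looks_like_soft_404()` on an address you have just invented. If an invented address comes back as 200, the site is answering 200 for pages that do not exist.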

lucy24




msg:4410510
 1:12 am on Jan 25, 2012 (gmt 0)

You need to look from the other end. What happens when someone requests a pseudo-directory (I assume you're rewriting from displayed directory to actual query) or uses a bogus query term? What should not happen is that your site goes ahead and creates a page.

A custom 404 page is very very unlikely to be the problem. Error pages are for humans. Robots just note the 404 and carry on. Unlike humans, they can even choose not to follow redirects.

Even in the Parameters area of GWT (Google Webmaster Tools) and similar, there's no way to say "Ignore everything except...." So you need to make sure that spurious parameters, or impossible values, aren't being processed in the first place.
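
To make the "don't process it in the first place" idea concrete, here is a rough sketch of the logic in Python. It is not WordPress code. The rules in `ALLOWED_PARAMS` are my assumptions (that `p`, `cat` and `paged` should be plain numbers and `m` a numeric date), and `precheck_status` is a made-up name. Anything that fails the check gets a 404 before any page is built.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed rules, not taken from any real configuration: every known
# parameter must be purely numeric.
ALLOWED_PARAMS = {
    "p": re.compile(r"[0-9]+"),                     # post ID
    "cat": re.compile(r"[0-9]+"),                   # category ID
    "m": re.compile(r"[0-9]{4}(?:[0-9]{2}){0,2}"),  # YYYY, YYYYMM or YYYYMMDD
    "paged": re.compile(r"[1-9][0-9]*"),            # page number, from 1
}


def precheck_status(url):
    """Return 404 for unknown parameters or impossible values.

    A return value of 200 only means "hand the request on to the normal
    page lookup"; that lookup can still decide the page does not exist.
    """
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    for name, values in params.items():
        pattern = ALLOWED_PARAMS.get(name)
        if pattern is None:
            return 404  # parameter the site never uses
        if len(values) != 1 or not pattern.fullmatch(values[0]):
            return 404  # repeated parameter or impossible value
    return 200
```

With these assumed rules, the three example URLs from the first post (`?m=fcwzvwvr&paged=112` and so on) are all rejected, because their values are letters where only digits make sense.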


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved