
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
How to delete Joomla site search pages from Google index?
simplesimon

10+ Year Member



 
Msg#: 3720652 posted 9:57 pm on Aug 11, 2008 (gmt 0)

I have a Joomla site, and at some point in the last few months Google started including results pages from my site's own search function in its index. The searches are all on random-looking word fragments. Now I have 360 garbage pages in the index. I have since blocked them using robots.txt, but too late.

What is the best way to remove these from the index? I started to do so using the Webmaster Tools removal tool, but it will take a while.

 

tedster

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3720652 posted 12:53 am on Aug 12, 2008 (gmt 0)

The robots.txt file will kick in to prevent future indexing. If you used the URL removal tool in your Webmaster Tools account, it usually takes only a few days to see the removal. You've done what you can do.

BillyS

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3720652 posted 3:06 am on Aug 12, 2008 (gmt 0)

I use a triple play when possible:

410 them (if possible)
block with robots.txt
URL removal tool
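The first step of the list above can be sketched as follows. This is only a minimal illustration in Python, not Joomla code: it assumes the application can inspect the query string before rendering, it borrows the URL shape (component=search, searchstring=...) quoted later in this thread, and the RETIRED_SEARCHES set and the status_for function are hypothetical names invented for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical set of garbage search strings to retire (not from the thread).
RETIRED_SEARCHES = {"foo", "xqz"}


def status_for(url):
    """Return 410 (Gone) for retired search URLs and 200 for everything else."""
    query = parse_qs(urlsplit(url).query)
    is_search = query.get("component") == ["search"]
    term = query.get("searchstring", [""])[0]
    if is_search and term in RETIRED_SEARCHES:
        return 410
    return 200


print(status_for("/index.php?component=search&searchstring=foo"))
print(status_for("/index.php?component=search&searchstring=joomla"))
```

Answering 410 only for a known list of junk terms keeps the site's own search usable for visitors, while telling crawlers that those particular URLs are gone for good.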

simplesimon

10+ Year Member



 
Msg#: 3720652 posted 4:17 am on Aug 12, 2008 (gmt 0)

Thanks - glad I'm on the right track. I'll try the 410 approach if the URL removal tool doesn't do the trick.

simplesimon

10+ Year Member



 
Msg#: 3720652 posted 4:27 pm on Aug 17, 2008 (gmt 0)

The URL removal tool denied all of the removal requests for the garbage search results pages. As far as I can tell I followed the guidelines for removal, but the requests were denied anyway, with no explanation beyond pointing me back to the list of conditions I had complied with. I can't do the 410 because these are search results, not pages that have been removed. All URLs are showing up as blocked by robots.txt in Webmaster Tools. I guess all I can do is wait for them to drop from the index? I hope? Soon?

FromRocky

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3720652 posted 4:55 pm on Aug 17, 2008 (gmt 0)

Can you use a "nofollow" command on the search results pages? If you can, it will prevent any future indexing and the old pages will gradually drop out once they are re-indexed.

simplesimon

10+ Year Member



 
Msg#: 3720652 posted 12:22 am on Aug 18, 2008 (gmt 0)

The search results are displayed on the index page. The URL is something like index.php?component=search&searchstring=foo. Shouldn't the robots.txt file prevent future reindexing?

BillyS

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3720652 posted 12:40 am on Aug 18, 2008 (gmt 0)

Disallow: /index.php?component=search
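A rule like the one above can be checked locally before relying on it. The sketch below uses Python's standard urllib.robotparser; the "User-agent: *" line and the example.com host are assumptions added for illustration, and the result only reflects this library's prefix matching, which may differ in detail from how Google evaluates robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The Disallow rule from the post above, wrapped in a catch-all
# User-agent group (the wrapper line is an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?component=search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder host.
search_url = "http://example.com/index.php?component=search&searchstring=foo"
home_url = "http://example.com/index.php"

# can_fetch() returns True when the rules allow the URL for that user agent.
search_blocked = not parser.can_fetch("Googlebot", search_url)
home_blocked = not parser.can_fetch("Googlebot", home_url)
print("search URL blocked:", search_blocked)
print("home page blocked:", home_blocked)
```

The point of testing both URLs is to confirm that the rule catches the search results without also blocking index.php itself, since the search pages share that script name.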

simplesimon

10+ Year Member



 
Msg#: 3720652 posted 9:21 pm on Aug 21, 2008 (gmt 0)

Yes, the pages are blocked by robots.txt, and Webmaster Tools confirms this by showing an error for each one, i.e. "URLs restricted by robots.txt". The problem is that they have been blocked for well over a month, maybe two, yet they aren't dropping out of Google's index, and I can't remove them with the URL removal tool. Does anyone know if they will just go away by themselves, and if so, how long it will take? There are more of these garbage results in the Google index than actual valid pages on my website, which can't be a good thing.

tedster

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3720652 posted 10:41 pm on Aug 21, 2008 (gmt 0)

"I can't remove them with the URL removal tool"

There's an option in the tool to use your robots.txt rules for URL removal. Have you tried that?

simplesimon

10+ Year Member



 
Msg#: 3720652 posted 12:24 am on Aug 22, 2008 (gmt 0)

No. I did run across that recommendation in my research, but wasn't able to find the function in the removal tool and figured it was something that had been taken out. When I go to the tool I see these options:

1) Individual URLs: web pages, images, or other files. Remove outdated or blocked web pages, images, and other documents from appearing in Google search results.
2) A directory and all subdirectories on your site. Remove all files and subdirectories in a specific directory on your site from appearing in Google search results.
3) Your entire site. Remove your site from appearing in Google search results.
4) Cached copy of a Google search result. Remove the cached copy and description of a page that is either outdated or to which you've added a noarchive meta tag.

Can you point me to the function that allows me to submit robots.txt to the removal tool? Maybe I'm looking in the wrong place.

tedster

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3720652 posted 1:07 am on Aug 22, 2008 (gmt 0)

It's that first option - robots.txt or meta robots noindex create "blocked web pages". See the top page under the "Remove URLs" section for the exact description.
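For the meta robots noindex route mentioned above, a template could emit the tag only on search result pages. The following is a rough Python sketch rather than Joomla code: the robots_meta function is a hypothetical helper, and the component=search parameter is taken from the URL pattern quoted earlier in this thread.

```python
from urllib.parse import parse_qs

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta(query_string):
    """Return a noindex meta tag for search result pages, else an empty string."""
    params = parse_qs(query_string)
    if params.get("component") == ["search"]:
        return NOINDEX_TAG
    return ""


# Example: the tag a page head would receive for a search URL's query string.
print(robots_meta("component=search&searchstring=foo"))
```

Keying the decision on the component parameter means every search results page gets the tag, whatever the search string is.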


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved