
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Best way to get url out of supplemental
cheesy snacks




msg:4376105
 6:52 pm on Oct 18, 2011 (gmt 0)

Hi, I've just noticed that Google has crawled many URLs of mine which link to our merchants.

e.g.
mywebsitedotcom/linktoamerchant.php?id=LINK1
mywebsitedotcom/linktoamerchant.php?id=LINK2
mywebsitedotcom/linktoamerchant.php?id=LINK3
mywebsitedotcom/linktoamerchant.php?id=LINK4

These links appear when I use the site:mywebsite operator.

Google has crawled these links - how can I get rid of them?

301 them to pages on my own site?
Remove the URLs in Google Webmaster Tools? (This may be difficult, as I have hundreds of these URLs deep-linked in thousands of pages.)

I'm certain having a large amount of these links is not beneficial to my rankings.

So what's the best way to get them out of the 'supplemental index'?

Thanks!

 

tedster




msg:4376126
 7:17 pm on Oct 18, 2011 (gmt 0)

How about a robots.txt rule first?

Disallow: /linktoamerchant.php
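
For completeness, a robots.txt group needs a User-agent line above the Disallow rule. Assuming the script sits at the site root, the full file might look like this:

```
User-agent: *
Disallow: /linktoamerchant.php
```

That pattern also matches the parameterised variants like /linktoamerchant.php?id=LINK1, since robots.txt rules match by URL prefix.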

cheesy snacks




msg:4376187
 8:41 pm on Oct 18, 2011 (gmt 0)

Hi tedster, I thought of this, but would it actually remove them from the index (or 'supplemental index')?

cheesy snacks




msg:4376190
 8:45 pm on Oct 18, 2011 (gmt 0)

Actually, I just checked my robots.txt and I have already disallowed that particular PHP file.

However, they are still appearing in the search results for the 'site:mywebsite' query, so Google obviously knows they are there and is taking them into account.

What do you suggest?

Thanks

tedster




msg:4376221
 9:38 pm on Oct 18, 2011 (gmt 0)

Well, you have a tangle here. Because the URLs are disallowed in robots.txt, they cannot be crawled any longer. This means you could request removal - which, as you say, might be quite time-consuming.

1. If the robots.txt block is relatively new, you might just wait. I assume the URLs are not serving content but rather doing a redirect, so Google will most likely drop them in the relatively near future. At any rate, they are unlikely to show up in the SERPs except in site: operator results.

2. Another approach would be to move your PHP script into a new directory (or even just give it a new name) that you disallow in robots.txt from the start. With this approach you would also edit all the internal linking to point to the new location - do not rely on a 301 redirect here. If old-style URLs remain in your internal linking and crawling is disallowed, they may never go away.

No matter what you do here, anything except a URL Removal Request will likely take time to be reflected in the site: operator results. However, moving your link script to a new location, disallowing it in robots.txt, and changing the internal links is the best long-term solution.
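
As a quick sanity check (my sketch, not from the thread), Python's standard-library robots.txt parser can confirm that a disallow rule on the new directory actually matches the moved script's URLs. The /out/jump.php path and example.com domain are made-up illustrations of "new directory":

```python
from urllib.robotparser import RobotFileParser

# Hypothetical new home for the jump script: /out/jump.php,
# disallowed from the very first day, as suggested above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /out/",
])

# The jump-script URLs are blocked; normal content pages are not.
blocked = rp.can_fetch("*", "http://www.example.com/out/jump.php?id=LINK1")
allowed = rp.can_fetch("*", "http://www.example.com/widgets.html")
print(blocked, allowed)  # False True
```

Running a check like this before deploying avoids accidentally disallowing (or failing to disallow) more than intended.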

cheesy snacks




msg:4376409
 7:11 am on Oct 19, 2011 (gmt 0)

Thanks for that reply, Ted.

The approach you suggest in point (2) is probably the way I'll go.

I'll create a new directory for the 'jump script', immediately disallow it in robots.txt, and then put in the hard yards and amend each link on my deep pages.

Once this is done, I'll go in and submit a URL removal request for the old jump-script URL.
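
The bulk link edit can be scripted rather than done by hand. A minimal sketch, assuming the old and new jump-script paths below (both illustrative) and that the pages are plain files on disk:

```python
import re
from pathlib import Path

# Hypothetical paths: the old jump script from the thread, and an assumed
# new location in a disallowed /out/ directory.
OLD = re.compile(r"/linktoamerchant\.php\?id=")
NEW = "/out/jump.php?id="

def rewrite_links(html: str) -> str:
    """Point every old jump-script link at the new location."""
    return OLD.sub(NEW, html)

# To apply across a site directory (uncomment after testing on a copy):
# for page in Path("site").rglob("*.php"):
#     page.write_text(rewrite_links(page.read_text()))

demo = rewrite_links('<a href="/linktoamerchant.php?id=LINK1">shop</a>')
print(demo)  # <a href="/out/jump.php?id=LINK1">shop</a>
```

Running it against a copy of the site first makes it easy to diff the results before touching the live pages.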

tedster




msg:4376413
 7:24 am on Oct 19, 2011 (gmt 0)

Once this is done, I'll go in and submit a URL removal request for the old jump-script URL.

You can just make the old script 404 and remove the disallow rule from robots.txt. Google will then be able to crawl it, get the 404 response several times and then POOF! it will be gone without you needing to go through the ordeal of the URL removal tool.
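
To verify the old script really does return 404 once it's gone (my own sketch, not from the thread), a quick status check works. This toy example spins up a local stand-in server whose handler mimics the intended behaviour - 404 for the old jump-script path, 200 elsewhere - and checks both:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import urlopen

class Handler(BaseHTTPRequestHandler):
    """Stand-in for the site after the old jump script is deleted."""
    def do_GET(self):
        if self.path.startswith("/linktoamerchant.php"):
            self.send_error(404)        # old script gone: Google sees 404
        else:
            self.send_response(200)     # normal pages still serve fine
            self.end_headers()
            self.wfile.write(b"ok")
    def log_message(self, *args):       # keep the demo output quiet
        pass

def fetch_status(url):
    """Return the HTTP status code for url (urlopen raises on 4xx/5xx)."""
    try:
        return urlopen(url).status
    except HTTPError as e:
        return e.code

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

s_old = fetch_status(base + "/linktoamerchant.php?id=LINK1")
s_new = fetch_status(base + "/some-article.html")
print(s_old, s_new)  # 404 200
server.shutdown()
```

Pointing fetch_status at the live URLs (after removing the disallow rule) confirms Googlebot will get the 404 responses tedster describes.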

cheesy snacks




msg:4376424
 7:59 am on Oct 19, 2011 (gmt 0)

Thanks, Ted, that sounds like the way to go.

cheesy snacks




msg:4376432
 8:30 am on Oct 19, 2011 (gmt 0)

Just out of interest, Ted, which method do you use to assess pages of lower quality? (I guess 'supplemental index' is a phrase of yesteryear!)

I use site:mysiteaddress, click through to the last page, then click

"If you like, you can repeat the search with the omitted results included."

Once again I click through to the last page and measure the difference.

On a side note, when I use the site:mysite operator Google indicates it has found 1,100 results, but when I scroll through the pages there are only around 50 pages (10 results on each page), i.e. around 500 results. Yet Google just said it found over 1,000?

Could you clarify this for me, Ted?

Thanks


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved