Forum Moderators: Robert Charlton & goodroi

WMT shows new URLs every day even though they are blocked by robots.txt

         

fsmobilez

4:05 pm on Jan 31, 2009 (gmt 0)

10+ Year Member



Google crawled many links on my site and I removed each of them using the WMT tools, but every day when I check with a query such as

site:www.example.com inurl:mostview

new URLs are appearing, even though they are definitely blocked by robots.txt.

When I was removing these URLs, Google showed a result count like this:

Results 1 - 100 of 12000 from

But when I clicked on page 10 I could not find any further pages or URLs, so I was only able to remove 100 URLs that time.

And when I check Google using only my domain, none of these pages show up:

site:www.example.com

Is there any way to get rid of these undesirable URLs forever?

I can't add them all at once because these are extra strings added at the end of the URL,

e.g.

www.example.com/post_id=1mostviewed

*mostviewed* is the extra string.

And will Google affect my ranking for removing too many URLs? They are all undesirable URLs.

Thanks tedster for your help.

This forum really helped me.

tedster

5:46 pm on Jan 31, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google respects pattern matching in a robots.txt rule, so you can use the wildcard * (and $ to anchor the end of a URL) to build your disallow rule.

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
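To make the pattern matching concrete, here is a minimal Python sketch of how such a rule could be tested against URL paths. It assumes the commonly documented Google semantics: * matches any run of characters, a trailing $ anchors the end of the URL, and a rule otherwise matches as a prefix from the left. The function names rule_to_regex and is_blocked are made up for this illustration; this is not Google's actual matcher.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL path, and the rule is otherwise a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rule: str) -> bool:
    """Return True if the path matches the Disallow rule from its left edge."""
    return rule_to_regex(disallow_rule).match(path) is not None


if __name__ == "__main__":
    for path in ("/post_id=1mostviewed", "/post_id=1"):
        print(path, is_blocked(path, "/*mostviewed"))
```

Python's own urllib.robotparser does not interpret * inside paths, which is why the sketch builds its own regex instead.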

fsmobilez

7:19 pm on Jan 31, 2009 (gmt 0)

10+ Year Member



I'm already using this robots.txt rule:

User-agent: *
Disallow: /*mostviewed*

but the URLs are still displayed in Google.

And my main question, if you could answer it please:

Google shows me

Results 1 - 100 of 12000 from site:www.example.com inurl:mostview

but when I clicked on page 10 I could not find any further pages or URLs.

And if I vary the search operator a little, it shows me some other new URLs:

site:www.example.com mostview

How can I get a complete list of the URLs containing the mostview string?

And when I check Google using only my domain, none of these pages show up:

site:www.example.com

tedster

7:37 pm on Jan 31, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"but when I clicked on page 10 I could not find any further pages or URLs"

It happens all the time with the site: operator. The early numbers say "about" and they are often WAY wrong for what you can actually get a report about. See this thread [webmasterworld.com] for a discussion.

There is no way to get beyond 1,000 anyway - and often you can't even get close.

"when I check Google using only my domain, none of these pages show up"

Then I would not have much concern. Just watch your server logs to see if Google is sending any traffic directly to those excluded urls - they probably are not. They may just be showing in Webmaster Tools for your information. As long as they are excluded from public search results, that's what matters. Google still "knows about them." As they must!

fsmobilez

5:09 am on Feb 1, 2009 (gmt 0)

10+ Year Member



Should I request reconsideration for removing dynamic URLs which were crawled by Google?

tedster

5:46 am on Feb 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd say fix it first, then wait to see how Google adapts. If your robots.txt is sound, any problems are already on their way to being fixed.

Are you watching your server logs? If googlebot is still requesting those urls, then it sounds like something may be wrong in your robots.txt file. Webmaster Tools includes some utilities for you to check your file.
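One way to do that log check is a short script. The sketch below assumes the Apache/Nginx "combined" log format and simply looks for requests whose path contains the unwanted string and whose user agent mentions Googlebot. The names LOG_LINE, SAMPLE_LINES and googlebot_hits, and the sample log lines themselves, are invented for illustration. User agents can be spoofed, so treat hits as a hint, not proof.

```python
import re

# Assumption: "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Invented sample lines, only to show the expected input shape.
SAMPLE_LINES = [
    '66.249.66.1 - - [01/Feb/2009:05:46:00 +0000] "GET /post_id=1mostviewed HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.7 - - [01/Feb/2009:05:47:00 +0000] "GET /post_id=2mostviewed HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows; U)"',
    '66.249.66.1 - - [01/Feb/2009:05:48:00 +0000] "GET /post_id=3 HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def googlebot_hits(lines, needle="mostviewed"):
    """Return requested paths containing `needle` that a Googlebot user agent fetched."""
    hits = []
    for line in lines:
        m = LOG_LINE.search(line)
        if m and needle in m.group("path") and "Googlebot" in m.group("agent"):
            hits.append(m.group("path"))
    return hits


if __name__ == "__main__":
    for path in googlebot_hits(SAMPLE_LINES):
        print("Googlebot requested:", path)
```

To run it against a real log, pass an open file object instead of SAMPLE_LINES, since the function only iterates over lines.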

fsmobilez

5:58 am on Feb 1, 2009 (gmt 0)

10+ Year Member



Well, I'm not seeing these dynamic URLs in WMT; I'm seeing them in the Google search index. Sorry, "WMT" in the title was typed by mistake; it is actually Google search.

So should I wait for robots.txt to do its work, and should I worry if dynamic URLs are showing in Google as URL only?

Thanks tedster, you really helped me a lot, and I'm sorry for taking your precious time.

tedster

6:18 am on Feb 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"should I worry if dynamic URLs are showing in Google as URL only"

That's a sign that your robots.txt rule is being honored, but there are inbound links pointing to those urls somewhere...or possibly those urls are listed in a sitemap. But the url-only listing shows that the urls are no longer being spidered.

I'd eliminate those inbound links. Or, if those links are important for your visitors, then make sure they all have a rel="nofollow" attribute.
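To find the links in question on your own pages, a small audit script can help. The sketch below uses Python's standard html.parser to list anchors whose href contains the unwanted string and whose rel attribute lacks nofollow. The class name NofollowAudit, the helper links_missing_nofollow and the sample HTML are hypothetical; it only inspects markup you feed it and does not crawl a site.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Collect hrefs containing `needle` whose <a> tag lacks rel="nofollow"."""

    def __init__(self, needle="mostviewed"):
        super().__init__()
        self.needle = needle
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if self.needle in href and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html, needle="mostviewed"):
    """Return the matching hrefs in `html` that do not carry rel="nofollow"."""
    audit = NofollowAudit(needle)
    audit.feed(html)
    return audit.missing


if __name__ == "__main__":
    page = (
        '<a href="/post_id=1mostviewed">one</a>'
        '<a rel="nofollow" href="/post_id=2mostviewed">two</a>'
        '<a href="/about">about</a>'
    )
    print(links_missing_nofollow(page))
```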

Then after a couple weeks if your rankings do not begin to go back in the direction of wherever they used to be, look for other problems with your site.

g1smd

12:05 am on Feb 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Since robots.txt rules match URLs "from the left", a trailing * in a rule means nothing, has no effect, and should be removed.

fsmobilez

8:34 am on Feb 2, 2009 (gmt 0)

10+ Year Member



Will you please tell me exactly what you mean? This is what is currently in my robots.txt file:

User-agent: *
Disallow: /*mostviewed*

Kindly modify it accordingly, as I want these URLs fully removed from Google and I also can't add nofollow to the pages of the site.

tedster

8:45 am on Feb 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The last * is not needed, because robots.txt automatically is "wild carded" at the end.

User-agent: *
Disallow: /*mostviewed

That's all you need.
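A quick way to convince yourself that the trailing * changes nothing is to compare both forms of the rule on a few sample paths. This is a rough Python sketch that assumes rules match as prefixes with * meaning "any characters" and ignores the $ anchor. The helpers normalize_rule and matches, and the sample paths, are invented for this check.

```python
import re


def normalize_rule(rule: str) -> str:
    """Drop redundant trailing '*' characters from a robots.txt path rule."""
    stripped = rule.rstrip("*")
    # Keep the original if stripping would leave an empty rule, because an
    # empty Disallow value has a different meaning (nothing is disallowed).
    return stripped or rule


def matches(rule: str, path: str) -> bool:
    """Prefix-match `path` against `rule`, treating '*' as any characters."""
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    sample_paths = [
        "/post_id=1mostviewed",
        "/post_id=1mostviewed&page=2",
        "/post_id=1",
        "/",
    ]
    long_rule = "/*mostviewed*"
    short_rule = normalize_rule(long_rule)
    for path in sample_paths:
        print(path, matches(long_rule, path), matches(short_rule, path))
```

Both columns should agree on every path, because a prefix match already accepts anything after the last literal piece of the rule.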

fsmobilez

8:48 am on Feb 2, 2009 (gmt 0)

10+ Year Member



ok thanks

fsmobilez

8:52 am on Feb 2, 2009 (gmt 0)

10+ Year Member



Well, another problem has come up: my AdSense ads are not showing on some pages. I followed this rule:

User-agent: *
Disallow: /*mostviewed

User-agent: Mediapartners-Google
Allow: /*mostviewed*

tedster

8:55 am on Feb 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



See [webmasterworld.com...]

I use AdSense on pages that are disallowed to Googlebot (but not mediabot) and it works fine.

fsmobilez

9:00 am on Feb 2, 2009 (gmt 0)

10+ Year Member



Is my coding OK? If yes, I will wait, as I allowed Mediapartners-Google almost 20 hours ago.

g1smd

12:54 pm on Feb 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You still have an unnecessary * on the end of the other line.

fsmobilez

4:38 pm on Feb 2, 2009 (gmt 0)

10+ Year Member



OK, I'm going to remove it now, thanks.

And now I understand that URL-only entries in Google don't create any duplicate content problem, but what about dynamic URLs?

Is there any penalty for sites which generate dynamic URLs?

What if they are blocked by robots.txt and they show as URL-only entries?

g1smd

5:11 pm on Feb 2, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Same thing, same non-effect.