Forum Moderators: Robert Charlton & goodroi

Meaning of "URLs restricted by robots.txt" - in Webmaster Tools


fraudcop

11:15 pm on Nov 3, 2006 (gmt 0)

10+ Year Member



Inside Google's Webmaster Tools

I reached 9,800 URLs restricted by robots.txt.
Now, after almost two months, it shows 7,407 URLs.

Can anyone explain the meaning of the lowered number?

thanks in advance

thecoalman

5:55 am on Nov 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It goes by whatever was restricted in the last x number of days; looking at my page, it only goes back two weeks. The total will change whenever the latest days are added.

fraudcop

11:05 am on Nov 4, 2006 (gmt 0)

10+ Year Member



thanks for the answer.

I'm wondering what happens with my duplicate content pages
(30 login pages, 70 registration pages, etc.).

How long will they stay in the index before they get deleted by Google and stop doing harm?

g1smd

11:15 am on Nov 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If they are marked as Supplemental Results they can take a year to drop out.

If they have a noindex tag then they are already not causing a problem.

fraudcop

1:13 pm on Nov 4, 2006 (gmt 0)

10+ Year Member



Thanks, g1smd, for the answer.

Is this disallow inside robots.txt

User-agent: *
Disallow: /cgi-bin/Register
Disallow: /cgi-bin/Login

enough to stop it causing problems, or should I add a noindex tag inside each page's code?

g1smd

1:38 pm on Nov 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The robots.txt disallow stops the page being spidered, but if another page links to that disallowed URL then it can still appear in Google results as a URL-only entry.

The meta robots noindex tag allows the page to be spidered, but says to not allow the content to appear in the SERPs at all. Nothing about that page will appear in the SERPs.

Use whichever one is appropriate. If you use both, then Google will not ever get to the page to see the meta tag.
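The robots.txt side of this is easy to check for yourself. A minimal sketch using Python's standard-library `urllib.robotparser`, with the two Disallow rules quoted above (the example.com URLs are hypothetical):

```python
# Minimal sketch: parse the robots.txt rules quoted earlier in the
# thread and test which URLs a crawler is allowed to fetch.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /cgi-bin/Register
Disallow: /cgi-bin/Login
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Disallowed URL: crawlers must not fetch it, but if other pages link
# to it, it can still appear in results as a URL-only entry.
print(rp.can_fetch("*", "https://example.com/cgi-bin/Login"))

# Allowed URL: crawlers may fetch it; whether its content is indexed
# is then up to any meta robots noindex tag on the page itself.
print(rp.can_fetch("*", "https://example.com/index.html"))
```

This also shows why combining the two is pointless: once `can_fetch` is False, the crawler never downloads the page, so a noindex tag in its HTML is never seen.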

hooter

1:43 pm on Nov 4, 2006 (gmt 0)

10+ Year Member



Just as an aside... the results Google gives in the aforementioned "URLs restricted by robots.txt" report are completely broken and worthless. I manage several sites for clients, including through the Webmaster Tools interface. These sites are a mix of small static-page sites, larger dynamic query-URL sites, and some using mod_rewrite for their URLs. All report thousands of restricted URLs, yet in no way, shape, or form are these URLs restricted by their respective robots.txt files.

In fact, I can take any of these so-called "restricted" URLs and paste them directly into Google's OWN robots.txt analysis box under the Diagnostics tab, and they all come back as allowed.

thecoalman

6:22 pm on Nov 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a lot of supplementals and have been following the progress of them getting removed quite closely over the last month or so. Everything that shows up is legitimately blocked in my case.

One thing that set the alarm bells off for me recently was a URL that at first glance should have been indexed but was denied. Upon further investigation, it was a URL that went to a page I had removed from public view. The redirect was going to the login page, which of course is denied in robots.txt.
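That situation is easy to reproduce in a sketch: follow the redirect to its final target, then test that target against robots.txt. The paths and redirect map below are hypothetical, standing in for the removed page and login page described above:

```python
# Hypothetical sketch of the case above: a removed page redirects to a
# login page that robots.txt blocks, so the original URL is reported
# as restricted even though the original path itself is not disallowed.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /cgi-bin/Login",
])

# Hypothetical redirect map: removed page -> login page.
redirects = {"/old-public-page.html": "/cgi-bin/Login"}

def final_target(path):
    """Follow redirects until the path stops changing."""
    while path in redirects:
        path = redirects[path]
    return path

path = final_target("/old-public-page.html")
blocked = not rp.can_fetch("*", "https://example.com" + path)
print(path, blocked)
```

The original URL is allowed by robots.txt, but the crawler lands on the blocked login page, which is why the report lists it under "restricted by robots.txt".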

fraudcop

12:51 am on Nov 5, 2006 (gmt 0)

10+ Year Member




So in my case, having to remove many duplicate pages (9,000 pages) from the index, it seems that

<meta name="robots" content="noindex, nofollow">

is better than robots.txt.