Site crawled without submitting to Google, no links, nothing.

8:17 am on Jan 5, 2008 (gmt 0)

New User

5+ Year Member

joined:Nov 13, 2007
posts: 16
votes: 0

How is it possible for Google to crawl a site that was never submitted for indexing? My site has not been submitted to any search engine and has no incoming links, yet it has been crawled by Google and is now cached...
Why does this happen?
What should I do now, if I don't want Google to crawl this site again?
11:19 am on Jan 5, 2008 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton

10+ Year Member

joined:Nov 11, 2000
votes: 163

Hi darshanaasodekar - This thread might be of help...

Why is Google indexing my entire web server?
google indexing

The consensus is that you need to block your site on the server if it's under development or if you don't want it indexed. Lack of inbound links is not enough, as publicly available server logs are likely to get spidered by Googlebot. Use password protection or the noindex robots meta tag on pages you want to block.
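
If it helps, here is a rough way to confirm that a page really carries the robots meta tag before relying on it. This is only a sketch using Python's standard library; the class name `RobotsMetaParser`, the helper `has_noindex`, and the sample HTML are made up for illustration and are not from any Google tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page has a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Hypothetical page used only to exercise the check.
    sample = (
        '<html><head><meta name="robots" content="noindex, nofollow">'
        "</head><body>Under development</body></html>"
    )
    print(has_noindex(sample))
```

Note that a tag whose content is only `nofollow` would not pass this check, since it says nothing about indexing.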

2:43 pm on Jan 5, 2008 (gmt 0)

Preferred Member

5+ Year Member

joined:Sept 28, 2007
votes: 0

"This site is not having any incoming links" is not something you can always say with certainty. You may not have placed any links yourself, but others might have without your knowledge, and such links cannot always be found through a search engine index. The biggest mistake you made was not using a robots.txt file to block Google. No damage done, though. You can still tell Google to go away, and eventually it will drop all the pages.
5:39 pm on Jan 5, 2008 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 6, 2007
votes: 0

Do you have the Google Toolbar installed and showing you PageRank?
If so, that's one sure way to get Googlebot's attention.
You need to use your robots.txt to block bots for sure, or throw a redirect to an error page until you are ready to open for business. I wouldn't recommend the error page, though, unless you want to throw Google for a loop. It may never come back!

To block all (good) bots just make a robots.txt file and enter:

User-agent: *
Disallow: /
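
If you want to double-check that those two lines do what you expect before uploading them, something like the following should work. It is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the example.com URL are placeholders I made up, and this only shows how a well-behaved parser reads the rules, not how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, kept in a string so nothing is fetched over the network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; substitute a page from your own site.
    url = "http://example.com/any/page.html"
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

With the rules above, both agents should come back as not allowed, since `User-agent: *` applies to every bot that honors robots.txt.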

9:48 pm on Jan 5, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 12, 2003
votes: 0

Didn't Google become a registrar a while back? This would give them access to lists of new domain registrations, so they could go out and check them out regardless of links or submissions.
5:58 am on Jan 7, 2008 (gmt 0)

New User

5+ Year Member

joined:Nov 13, 2007
votes: 0

Thanks all. Thanks for the robots.txt tip. I have added nofollow meta tags to my site and have done the needful. Thanks for the help...:)