aristotle

msg:4256758 | 9:12 pm on Jan 22, 2011 (gmt 0) |
There are some sites, like Quantcast, that publish domain registration information. I think Googlebot may find the link to your site on those sites.
|
kahuna

msg:4256771 | 9:29 pm on Jan 22, 2011 (gmt 0) |
Not affiliated with Quantcast. Never heard of them... and I am pretty sure my company doesn't use them.. or at least I have never seen any sign that they do. <snip> I am not sure if I am allowed to post the name... so moderators, edit this if necessary. Thanks group, and mods. (Mod note: Edited hosting company name. Thanks for your understanding, kahuna.) [edited by: Robert_Charlton at 11:12 pm (utc) on Jan 22, 2011]
|
bhukkel

msg:4256774 | 9:32 pm on Jan 22, 2011 (gmt 0) |
I think Google has access to the .com zone file. As soon as your domain has NS records, Google will find it.
|
jimbeetle

msg:4256782 | 9:45 pm on Jan 22, 2011 (gmt 0) |
Not much you can do to hide from Google as it has been a registrar for a number of years and has access to domain data.
|
Samizdata

msg:4256799 | 10:42 pm on Jan 22, 2011 (gmt 0) |
| Just because I register a domain... what gives anybody the permission to index it? |
You give indexing permission by default (unless you take steps to prevent it). This is not a Google-specific question: several other bots are likely to have turned up before Googlebot, within hours (not days) of the domain going live. It has nothing to do with your hosting company or registrar. The WHOIS information is in a public database; you can use privacy services to hide your personal details, but not the fact that the domain exists. The lesson to learn is that you need to control bot access from the very start, usually with a combination of robots.txt or meta tags (for honourable bots) and .htaccess or other exclusion methods for the really creepy ones. ...
|
kahuna

msg:4256827 | 1:21 am on Jan 23, 2011 (gmt 0) |
Thanks once again, group... I haven't been very active around here for years because I have been rather busy with other matters... case in point: my blog covers the last nine years of taking care of my father, who has Alzheimer's... my rather personal experiences dealing with many situations. Seeing so many attempts to hack into the PHP side of my site(s) pisses me off. But this is not personal to me... I am sure most of you are aware of these "comment spammers" etc... the best a person can do is keep up with updates to PHP and their other scripts, so these malicious people can do no harm.
My thought is that if a domain could be private, it should be... well, I guess it "ain't" so. Yes.. we can hide the "whois" info... but I am not worried about that, because hiding that information is available. Maybe this is one of those areas where we don't want regulation of many parts of the internet... so "I" have to live with these paradoxes.
Before I set up the newest domain... I put in place scripts to catch all who enter.. well, the log files are there, but my scripts give me minute-by-minute "ease" of evaluation... and "G" was the first to make the visit. And so I posted here... I think to myself.. that indexing a site before another search engine does gives that search engine a competitive edge, and that this competitive motive is pursued without consideration for the owner of the site. I am probably in the smallest percentile possible... in that I don't want the site immediately indexed and placed in a search engine, considering all the work "we" put in to get placement.
I think that... well.. I shouldn't have even gotten a domain name... and that would have solved the problem in both of my situations. The blog I mentioned above... the domain registered this week. The blog with its domain I started years ago... and, noticing the PHP exploit attempts... I changed the directory name so only I can see and work on it. But I still see the attempts almost daily to find exploits under the actual domain name. The domain I set up this week... and with dropping in my snooping scripts... saw "G" find it in no time at all.
Thanks a bunch group :-)) you have helped me better understand what is going on... whether I like it or not... I guess I can only sum it up with a famous quote by the revered Surfing Magazine writer Don Redondo (water surfing.. you know, with salt and waves)... "Well cornflake... if you wanted to be in the water today by yourself... you should have stayed in bed." Thanks again group and this website. K.
|
TheMadScientist

msg:4256859 | 3:20 am on Jan 23, 2011 (gmt 0) |
I've had this happen too when I've registered some 'obscure' domain names, and they weren't only .com either... One thing you can do for an 'easy login' page is use the Firefox user-agent switcher to create a unique user-agent for your browser, then you can stick a PHP switch in:
<?php
// Serve the real page only to the browser sending the unique user-agent string.
if (!isset($_SERVER['HTTP_USER_AGENT']) || $_SERVER['HTTP_USER_AGENT'] !== 'YourUserAgentHere') {
    // Everyone else (bots included) gets a 403 and a noindex hint.
    header('HTTP/1.1 403 Forbidden');
    echo '<html><head>
<title>403 Forbidden</title>
<meta name="robots" content="noindex,nofollow,noarchive">
</head>
<body>You Do NOT Have Permission to Access This Page</body></html>';
} else {
    // What you see goes here
}
|
leadegroot

msg:4256927 | 10:08 am on Jan 23, 2011 (gmt 0) |
I've seen Google hit newly registered domains within minutes more times than I can count. It's just what they do. But don't think that a discovery crawl will be enough to be "placed in a search engine": it takes a lot more work than just registering the domain to get traffic :) If you really don't want your site being indexed, just put in a robots.txt entry to forbid all crawling. This won't keep out the spam crawlers, but it will keep the legit crawlers out 99.9% of the time :)
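A minimal sketch of the "forbid all crawling" robots.txt entry mentioned above, with a quick check using Python's standard `urllib.robotparser` of how a compliant crawler would interpret it. The `example.com` URL and the `is_allowed` helper are assumptions for illustration only, and this affects only bots that choose to honour robots.txt.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: it parses the rules in memory instead of
    downloading them, so it can be tried before the file is uploaded.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not one from this thread.
    for agent in ("Googlebot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "http://example.com/blog/")
        print(agent, "allowed" if allowed else "blocked")
```

Saving the two lines in `ROBOTS_TXT` as `/robots.txt` at the domain root is the actual "entry"; the Python around it only confirms how a well-behaved parser reads it.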
|
kahuna

msg:4260067 | 12:29 am on Jan 30, 2011 (gmt 0) |
Sorry to revive this topic... as all has been explained. But today I got hit by a bot called copilot.thunderstone.com. I hope I am not violating terms of service here by mentioning their name; it is a bot. It appears to be a device-driven bot... that is from their website. Maybe there is a "software" version they are running. As mentioned, this is a "private" domain... completely anonymous whois. G comes back many times a day... and now Yawhooo has made its appearance. There is nothing illegal going on from my end... and I know I can use a robots.txt file to exclude them... But it still makes me wonder how just registering a domain lets the "world" know it is out there. Especially this thunderstone bot... oh well... guess I need to go out and shovel more snow from the house instead of dwelling on these trivial matters. Thanks again group.. and my apologies for reviving this thread. K.
|
Robert Charlton

msg:4260104 | 8:45 am on Jan 30, 2011 (gmt 0) |
Why is Google indexing my entire web server? http://www.webmasterworld.com/google/3396393.htm [webmasterworld.com] Note... the Google "secret server" article is now offline, but the quoted references about public server logs and referrer logs should suffice.
|
netmeg

msg:4260153 | 2:14 pm on Jan 30, 2011 (gmt 0) |
There are plenty of sites out there that release lists every day of new domains that are registered, and new domains that are dropped. And those sites are picked up by Google. You can't control it. You can't stop it. It is what it is. You can put a lock on the front door (by making it password protected). You can NOINDEX it. Robots.txt only works for well-behaved bots (and there are plenty who ignore it). And the domain name will *still* be on those lists, so the fact that it exists will still be out there, even if people can't see the site behind the domain. Like I said, it is what it is.
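The "lock on the front door" plus NOINDEX combination could look roughly like the sketch below: a small Python WSGI app that demands HTTP Basic authentication and sends an `X-Robots-Tag: noindex, nofollow` header (the HTTP-header counterpart of the noindex meta tag) on every response. The username, password, realm name and port are all placeholders invented for this example, not anything from the thread, and a real site would normally do this in its web server or application framework.

```python
import base64
import hmac

# Hypothetical credentials for illustration only.
USERNAME = "owner"
PASSWORD = "change-me"


def _authorized(environ) -> bool:
    """Check the Basic auth header against the placeholder credentials."""
    header = environ.get("HTTP_AUTHORIZATION", "")
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, _, password = decoded.partition(":")
    # compare_digest avoids leaking information through timing differences.
    return (hmac.compare_digest(user.encode(), USERNAME.encode())
            and hmac.compare_digest(password.encode(), PASSWORD.encode()))


def app(environ, start_response):
    """WSGI app: password-protected, and marked noindex for crawlers."""
    common = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("X-Robots-Tag", "noindex, nofollow"),
    ]
    if not _authorized(environ):
        start_response(
            "401 Unauthorized",
            common + [("WWW-Authenticate", 'Basic realm="private"')],
        )
        return [b"Authentication required\n"]
    start_response("200 OK", common)
    return [b"Private blog content\n"]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Port 8000 is an arbitrary choice for local testing.
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

Neither measure hides the domain name itself, which matches the point above: the site behind the name can be locked, but the name stays on the public lists.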
|
MonkeyFace

msg:4260232 | 7:51 pm on Jan 30, 2011 (gmt 0) |
Did you sign up for Google Apps? If yes, that's the reason. I have seen the Gbot hitting new domains almost instantly when I signed up for G Apps.
|