
Alternative Search Engines Forum

    
Spider visiting if you pay Looksmart?
jlara




msg:463559
 7:51 pm on Dec 7, 2000 (gmt 0)

I have seen the Architext spider visit only one of my 700 domains in the last six months. It just happens to be the only domain I paid for a listing in Looksmart with.

Are there any other submissions to Excite that have been getting the spider to visit?

 

cirelle




msg:463560
 5:52 pm on Dec 8, 2000 (gmt 0)

ArchitextSpider has visited 3 domains in the past 24 hrs. Not one of the more active spiders.

c

cirelle




msg:463561
 8:46 pm on Dec 8, 2000 (gmt 0)

oops! forgot to add,
the sites are not paid sites

mivox




msg:463562
 9:58 pm on Dec 11, 2000 (gmt 0)

I've gotten visits from Architext, but no listing in Excite. First submitted in October...

Most of the spider visits only requested robots.txt and logged a hit on the base directory. Didn't seem to actually 'spider' anything.

budterm




msg:463563
 2:08 am on Dec 12, 2000 (gmt 0)

ArchitextSpider occasionally visits sites that have been in their database for years. It so happens that these are the only ones that I have paid for at LookSmart (but spidering was already ongoing). No new sites have been spidered, but I have not paid for any new ones yet at LookSmart.

littleman




msg:463564
 1:54 am on Jun 10, 2001 (gmt 0)

Add Google to the list of spiders that will visit a site after it gets a listing in LookSmart.

jeremy goodrich




msg:463565
 12:57 pm on Jun 10, 2001 (gmt 0)

If you're right about that, then google doesn't necessarily obey robots.txt.

That is huge. From LookSmart's Robots.txt file [looksmart.com]:
User-agent: Googlebot
Disallow:

From a log file I just looked up: "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

So if you're correct, good old google is indeed ignoring robots.txt. Unless they have a special deal with LookSmart that we all don't know about.

ihelpyou




msg:463566
 2:06 pm on Jun 10, 2001 (gmt 0)

I have thought for a while now that Google will scoot over to Looksmart to get newly listed pages. This is why I list the page first in Looksmart before anything else.

Brett_Tabke




msg:463567
 12:45 pm on Jun 11, 2001 (gmt 0)

If there were a star there jeremy, that would ban Google. No star in the disallow field means it is ok. (I think there was another thread around here where I misspoke about that).


From: the Robots Exclusion Standard [robotstxt.org]:
To allow all robots complete access:
User-agent: *
Disallow:
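
For anyone who wants to check this rather than take it on faith, here is a minimal sketch using Python's standard-library robots.txt parser. It feeds in the two-line record quoted from LookSmart earlier in the thread and the allow-all record above, and compares them with a record that uses `Disallow: /`, which is the conventional way to shut a robot out of a whole site under the Robots Exclusion Standard. The helper name `is_allowed` and the example.com URL are made up for illustration, and a real crawler's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The record quoted from LookSmart's robots.txt: an empty Disallow value.
ALLOW_GOOGLEBOT = "User-agent: Googlebot\nDisallow:\n"

# The allow-all example from the Robots Exclusion Standard.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# A slash in the Disallow field blocks every path on the site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Placeholder URL, not taken from the thread.
URL = "http://www.example.com/some/page.html"

print(is_allowed(ALLOW_GOOGLEBOT, "Googlebot/2.1", URL))  # empty Disallow: nothing is off limits
print(is_allowed(ALLOW_ALL, "Googlebot/2.1", URL))
print(is_allowed(BLOCK_ALL, "Googlebot/2.1", URL))
```

The first two calls should report that the fetch is permitted and the third that it is not, which matches the reading above: an empty Disallow field is an open door, not a ban.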

jeremy goodrich




msg:463568
 12:57 pm on Jun 11, 2001 (gmt 0)

(smacks head, looks really foolish).

Oops. (And here I was thinking I was on to something, too bad I now realize it smells like manure :)

Until recently I've never really dealt personally with robots.txt. (note to self, star means ban, and nothing means do whatever...) Gotcha.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved