
Forum Moderators: bakedjake


Spider visiting if you pay Looksmart?

     

jlara

7:51 pm on Dec 7, 2000 (gmt 0)

10+ Year Member



I have seen ArchitextSpider visit only one domain of 700 in the last six months. It just happens to be the only domain for which I paid for a listing in Looksmart.

Are there any other submissions to Excite that have been getting the spider to visit?

cirelle

5:52 pm on Dec 8, 2000 (gmt 0)

10+ Year Member



ArchitextSpider has visited 3 domains in the past 24 hrs. Not one of the more active spiders.

c

cirelle

8:46 pm on Dec 8, 2000 (gmt 0)

10+ Year Member



oops! forgot to add,
the sites are not paid sites

mivox

9:58 pm on Dec 11, 2000 (gmt 0)

WebmasterWorld Senior Member mivox is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I've gotten visits from Architext, but no listing in Excite. First submitted in October...

Most of the spider visits only requested robots.txt and logged a hit on the base directory. Didn't seem to actually 'spider' anything.

budterm

2:08 am on Dec 12, 2000 (gmt 0)

10+ Year Member



ArchitextSpider occasionally visits sites that have been in their database for years. It so happens that these are the only ones that I have paid for at LookSmart (but spidering was already ongoing). No new sites have been spidered, but I have not paid for any new ones yet at LookSmart.

littleman

1:54 am on Jun 10, 2001 (gmt 0)

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Add google to the list of spiders who will visit a site after it gets a listing in looksmart.

jeremy goodrich

12:57 pm on Jun 10, 2001 (gmt 0)

WebmasterWorld Senior Member jeremy_goodrich is a WebmasterWorld Top Contributor of All Time 10+ Year Member



If you're right about that, then google doesn't necessarily obey robots.txt.

That is huge. From LookSmart's Robots.txt file [looksmart.com]:
User-agent: Googlebot
Disallow:

From a log file I just looked up: "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

So if you're correct, good old google is indeed ignoring robots.txt. Unless they have a special deal with LookSmart that we all don't know about.

ihelpyou

2:06 pm on Jun 10, 2001 (gmt 0)

10+ Year Member



I have thought for a while now that Google will scoot over to Looksmart to get newly listed pages. This is why I list the page first in Looksmart before anything else.

Brett_Tabke

12:45 pm on Jun 11, 2001 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If there were a star there jeremy, that would ban Google. No star in the disallow field means it is ok. (I think there was another thread around here where I misspoke about that).


From: the Robots Exclusion Standard [robotstxt.org]:
To allow all robots complete access:
User-agent: *
Disallow:
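
A minimal sketch of the rule quoted above, assuming Python and its standard-library `urllib.robotparser` (neither is from this thread). The helper name `is_allowed` and the example.com URL are made up for illustration. It checks an empty `Disallow:` line against the conventional "block everything" form, `Disallow: /`, which is how the Robots Exclusion Standard expresses a full ban:

```python
from urllib.robotparser import RobotFileParser

# The record quoted from LookSmart's robots.txt earlier in the thread.
EMPTY_DISALLOW = "User-agent: Googlebot\nDisallow:"

# For contrast: the conventional way to shut a robot out completely.
DISALLOW_ROOT = "User-agent: Googlebot\nDisallow: /"


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/page.html"  # placeholder URL
    print(is_allowed(EMPTY_DISALLOW, "Googlebot/2.1", url))  # True: nothing is disallowed
    print(is_allowed(DISALLOW_ROOT, "Googlebot/2.1", url))   # False: every path starts with "/"
```

This only shows how one parser reads the two forms; it says nothing about what any particular spider actually does.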

jeremy goodrich

12:57 pm on Jun 11, 2001 (gmt 0)

WebmasterWorld Senior Member jeremy_goodrich is a WebmasterWorld Top Contributor of All Time 10+ Year Member



(smacks head, looks really foolish).

Oops. (And here I was thinking I was on to something; too bad I now realize it smells like manure :)

Until recently I've never really dealt personally with robots.txt. (note to self, star means ban, and nothing means do whatever...) Gotcha.

 
