Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Brett's IP lists
maybe some corrections needed..
PeteU

10+ Year Member



 
Msg#: 117 posted 10:21 pm on Sep 6, 2000 (gmt 0)

Hi Brett, I was looking at your IP list pages and comparing them to some of my notes. Very complete, excellent job!
There are a few things, however, that may need to be looked at.
Here you go...

Altavista:
206.53.238.34 and 203.46.136.2
These two IPs are highly suspect; I don't think they belong on the list.

Excite:
208.11.147.3 and 203.46.136.2
Suspect numbers again; one is the same IP as in AV above.
You also did not mention that Excite spiders use the generic User Agent names libwww-perl/5.47 and libwww-perl/5.10.

Lycos:
166.48.225.254 lycosinc.NorthRoyalton.cw.net - hmm???

Fast:
206.40.240.215 definitely suspect
208.186.202.21 ah-ha.com powered by Fast but...
207.138.42.105 suspect

cheers
PeteU
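
One quick way to spot the kind of overlap noted above (the same address listed under both AV and Excite) is to cross-check the per-engine lists mechanically. The following is only a minimal sketch: the IPs are the ones quoted in this post, while the dictionary layout and the `shared_ips` helper are assumptions made for illustration, not part of Brett's lists.

```python
from collections import defaultdict

# IPs exactly as quoted in the post above; the dict layout is an assumption.
ip_lists = {
    "Altavista": ["206.53.238.34", "203.46.136.2"],
    "Excite": ["208.11.147.3", "203.46.136.2"],
    "Fast": ["206.40.240.215", "208.186.202.21", "207.138.42.105"],
}


def shared_ips(lists):
    """Return {ip: sorted engine names} for IPs listed under more than one engine."""
    seen = defaultdict(set)
    for engine, ips in lists.items():
        for ip in ips:
            seen[ip].add(engine)
    return {ip: sorted(engines) for ip, engines in seen.items() if len(engines) > 1}


duplicates = shared_ips(ip_lists)
print(duplicates)  # {'203.46.136.2': ['Altavista', 'Excite']}
```

An address that shows up under two unrelated engines is not necessarily wrong, but it is a reasonable candidate for a recheck.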

 

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 117 posted 10:36 pm on Sep 6, 2000 (gmt 0)

Thanks. I hadn't yet uploaded the rechecked list from this morning. The Alta and Excite ones are fixed and uploaded. I am pretty sure the Lycos 166.48.225.254 is Lycos: some of the other stray Lycos IPs walk an almost identical traceroute. Thanks on the Fast ones; I'd thought those were all correct, but obviously there are a couple of stray ones in there.

The Looksmart ones were thrown in there because 'they were there'. That really is Looksmart, who uses Fast's spider and search engine programs.

updated lists uploaded...

eljefe3

WebmasterWorld Administrator 10+ Year Member



 
Msg#: 117 posted 3:07 am on Sep 8, 2000 (gmt 0)

Any URL where I can find this magical list? Thanks

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 117 posted 5:08 am on Sep 8, 2000 (gmt 0)

[searchengineworld.com]

eljefe3

WebmasterWorld Administrator 10+ Year Member



 
Msg#: 117 posted 2:05 pm on Sep 8, 2000 (gmt 0)

Brett,

Thanks for that.

Smokin Joe

10+ Year Member



 
Msg#: 117 posted 7:57 pm on Sep 8, 2000 (gmt 0)

I have a list that my company bought from Fantomaster which is insanely big compared to the list you posted, Brett.

Are some of those useless... redundant... obsolete?

I'm confused, as the size of my database is swelling.

What I'm saying, in short, is that I'd rather keep my database small, and if your listing is accurate enough to keep me off the SEs' poo poo list I'd be ecstatic.

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 117 posted 8:04 pm on Sep 8, 2000 (gmt 0)

Ralph does a great job with that list from what I've seen and heard. He is into it, and is covering EVERYTHING though. I'm finely targeted there on the majors only. If you go into some of the 'machine name' links under my lists, you will also find some HUGE lists (Inktomi) that list everything under the sun. The primary smaller lists are the ones that have been caught and are known 100% for sure to be running spiders from that host.

I'd just shorten the list down to the IPs of engines you know you want to target. My own list is a bit bigger than the one I put online: there are some Alta boxes of questionable origin that are not listed, and I won't cloak for Excite or Fast. So for me, that leaves primarily Ink, Alta, and some tricky link work with Google. That comes out to a pretty small list (100-125, I think).

I tell you what, you just send me that fancy list and I'll trim it right up for you (lol - only kidding).
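
Trimming a big list down to the engines you actually care about could look something like the sketch below. Everything in it is hypothetical: the `(engine, ip)` layout is an assumed format, not how Fantomaster's list or the lists here are actually stored, and the addresses come from the reserved documentation ranges rather than real spider IPs.

```python
# Hypothetical master list of (engine, ip) pairs. The addresses are from the
# reserved documentation ranges (RFC 5737), NOT real spider IPs.
master_list = [
    ("inktomi", "192.0.2.10"),
    ("altavista", "192.0.2.20"),
    ("excite", "198.51.100.5"),
    ("fast", "198.51.100.9"),
    ("google", "203.0.113.7"),
]


def trim_list(entries, wanted_engines):
    """Keep only the entries whose engine name is in wanted_engines (case-insensitive)."""
    wanted = {name.lower() for name in wanted_engines}
    return [(engine, ip) for engine, ip in entries if engine.lower() in wanted]


# Keep only the engines being targeted; everything else is dropped.
trimmed = trim_list(master_list, ["Inktomi", "AltaVista", "Google"])
for engine, ip in trimmed:
    print(engine, ip)
```

Filtering by engine name keeps the full purchased list intact as a source, so the trimmed copy can be regenerated whenever the set of target engines changes.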

lizzie

10+ Year Member



 
Msg#: 117 posted 3:50 am on Sep 18, 2000 (gmt 0)

Dumb question from an amateur:
How can I view my log files?
Thanks in advance to anyone who can help!

Air

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 117 posted 4:36 am on Sep 19, 2000 (gmt 0)

Normally the web host gives you access to the raw server logs and/or to stats derived from your logs. It is best to have access to the raw logs; that way you can determine what you want to see, rather than what the host decided to set up with their log stats program.

Not all hosts offer access to your logs, or only do so on certain price plans. Ask them about this.
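
Once you do have the raw logs, a few lines of script are enough to pull out who is visiting. This is only a rough sketch, assuming the host writes the common or combined log format that Apache-style servers typically use; the sample lines are made up for illustration (documentation-range IPs), and your host's format may differ.

```python
import re
from collections import Counter

# Pattern for common/combined log format lines; the referer and user agent
# fields are optional because the plain common format omits them.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


# Made-up sample lines; in practice, iterate over the lines of your log file.
sample = [
    '192.0.2.10 - - [18/Sep/2000:03:50:12 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "libwww-perl/5.47"',
    '192.0.2.10 - - [18/Sep/2000:03:50:15 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "libwww-perl/5.47"',
    '198.51.100.5 - - [18/Sep/2000:04:01:40 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible)"',
]

# Count hits per (IP, user agent) pair to see which visitors look like spiders.
hits = Counter()
for line in sample:
    entry = parse_line(line)
    if entry:
        hits[(entry["ip"], entry["agent"])] += 1

for (ip, agent), count in hits.most_common():
    print(ip, agent, count)
```

Counting by IP and user agent together is a deliberate choice here: a generic agent name such as libwww-perl can be sent by any script, so the address it comes from is what you would compare against an IP list.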

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved