Brett's IP lists
maybe some corrections needed..
Hi, Brett, I was looking at your IP lists pages and comparing to some of my notes, very complete, excellent job!
There are a few things, however, that may need to be looked at,
here you go..
220.127.116.11 and 18.104.22.168
these two IPs are highly suspect; I don't think they belong on the list
22.214.171.124 and 126.96.36.199
suspect numbers again, one is the same IP as in AV above
you also did not mention that Excite spiders use libwww-perl/5.47 and libwww-perl/5.10 generic User Agent names
188.8.131.52 lycosinc.NorthRoyalton.cw.net - hmm???
184.108.40.206 definitely suspect
220.127.116.11 ah-ha.com powered by Fast but...
Thanks. I hadn't yet uploaded the list I rechecked this morning. The Alta and Excite ones are fixed and uploaded. I am pretty sure the Lycos 18.104.22.168 really is Lycos: some of the other stray Lycos IPs walk an almost identical traceroute. Thanks on the Fast ones; I'd thought those were all correct, but obviously there are a couple of stray ones in there.
The Looksmart ones were thrown in there because 'they were there'. That really is Looksmart, who uses Fast's spider and search engine programs.
updated lists uploaded...
Any URL where I can find this magical list? Thanks
Thanks for that.
I have a list that my company bought from Fantomaster, which is insanely big compared to the list you posted, Brett.
Are some of those useless... redundant... obsolete?
I'm confused, as the size of my database is swelling.
What I'm saying, in short, is that I'd rather keep my database small, and if your listing is accurate enough to keep me off the SEs' poo poo list I'd be ecstatic.
Ralph does a great job with that list from what I've seen and heard. He is into it, and is covering EVERYTHING, though. I'm finely targeted on the majors only. If you go into some of the 'machine name' links under my lists, you will also find some HUGE lists (Inktomi) that list everything under the sun. The primary, smaller lists contain only hosts that have been caught, and are known for sure, to be running spiders.
I'd just shorten the list down to the IPs of the engines you know you want to target. My own list is a bit bigger than the one I put online: there are some Alta boxes of questionable origin that are not listed, and I won't cloak for Excite or Fast. So for me, that leaves primarily Ink, Alta, and some tricky link work with Google. That comes out to a pretty small list (100-125, I think).
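For anyone wanting to do that trimming by script rather than by hand, here is a minimal sketch. It assumes the big list is already loaded as (engine, IP) pairs; the engine labels and the addresses below are made-up placeholders from the documentation ranges, not real spider IPs, and trim_list is a hypothetical helper name.

```python
# Hypothetical data: engine labels and addresses are placeholders,
# not real spider IPs from any published list.
FULL_LIST = [
    ("inktomi", "192.0.2.10"),
    ("altavista", "192.0.2.20"),
    ("excite", "198.51.100.5"),
    ("fast", "198.51.100.77"),
    ("google", "203.0.113.8"),
    ("inktomi", "192.0.2.10"),  # duplicate entry, dropped below
]


def trim_list(entries, wanted_engines):
    """Keep only entries for the wanted engines, dropping duplicate IPs."""
    wanted = {name.lower() for name in wanted_engines}
    seen = set()
    trimmed = []
    for engine, ip in entries:
        if engine.lower() in wanted and ip not in seen:
            seen.add(ip)
            trimmed.append((engine, ip))
    return trimmed


if __name__ == "__main__":
    # Keep only the engines you actually want to target.
    for engine, ip in trim_list(FULL_LIST, ["inktomi", "altavista", "google"]):
        print(engine, ip)
```

The order of the original list is preserved, so a trimmed copy is easy to compare against the full one.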
I tell you what, you just send me that fancy list and I'll trim it right up for you (lol - only kidding).
dumb question from amateur:
How can I view my log files?
Thanks in advance to anyone who can
Normally the web host gives you access to the raw server logs and/or access to stats derived from your logs. It is best to have access to the raw logs; that way you can determine what you want to see, rather than what the host decided to set up with their log stats program.
Not all hosts offer access to your logs, or only do so on certain price plans. Ask them about this.
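Once you do have the raw logs, a short script can pull out what you care about. The following is only a rough sketch: it assumes the host writes the Apache/NCSA "combined" log format (other formats need a different pattern), and the sample lines, the access.log file name, and the spider_hits helper are made up for illustration. It counts hits per IP for any User-Agent containing a given string, such as the libwww-perl agents mentioned earlier in the thread.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def spider_hits(lines, agent_substring):
    """Count requests per (IP, User-Agent) where the agent contains the substring."""
    needle = agent_substring.lower()
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and needle in m.group("agent").lower():
            hits[(m.group("ip"), m.group("agent"))] += 1
    return hits


# Made-up sample lines with placeholder addresses, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 200 120 "-" "libwww-perl/5.47"',
    '192.0.2.10 - - [10/Oct/2000:13:55:40 -0700] "GET / HTTP/1.0" 200 2326 "-" "libwww-perl/5.47"',
    '198.51.100.5 - - [10/Oct/2000:14:01:02 -0700] "GET / HTTP/1.0" 200 2326 "-" "Mozilla/4.0 (compatible)"',
]

if __name__ == "__main__":
    # For a real log, pass an open file instead of SAMPLE, e.g.
    #   with open("access.log") as f: spider_hits(f, "libwww-perl")
    for (ip, agent), count in spider_hits(SAMPLE, "libwww-perl").items():
        print(ip, agent, count)
```

Lines that do not match the assumed format are silently skipped, so a count of zero may just mean the log is in a different format.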