Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 36 message thread spans 2 pages.
IP Lists
Brett_Tabke




msg:395546
 4:32 pm on Jun 10, 2002 (gmt 0)

Ran into this noncommercial site with free ip lists to the major search engines. This is something many of us have wanted for years:
[iplists.com...]

 

Nick_W




msg:395547
 4:54 pm on Jun 10, 2002 (gmt 0)

Great find!

Wonder how accurate it is?

Nick

Brett_Tabke




msg:395548
 5:07 pm on Jun 10, 2002 (gmt 0)

Oh, it's very accurate.

littleman




msg:395549
 5:12 pm on Jun 10, 2002 (gmt 0)

But a bit dated.
Still a very useful list.

Nick_W




msg:395550
 5:14 pm on Jun 10, 2002 (gmt 0)

Maybe the IPs just haven't changed since the date of the last revision?

How would you get these details? You'd have to see the IPs hit your site, right?

Nick

volatilegx




msg:395551
 5:32 pm on Jun 10, 2002 (gmt 0)

Hi Guys...

It's my site/list ... questions? Comments?

It's as up to date and accurate as I can keep it. If anybody has IPs that I don't... I'd certainly appreciate knowing about them. I post many new ones that I find here and also glean quite a bit of info here, too, as many of you already know.

Thanks!

<added>Oh yeah, there is a rearranged version located here: [iplists.com...] . The rearranged version reflects some shifts in Search Engine prominence.</added>

[edited by: volatilegx at 3:04 pm (utc) on Oct. 25, 2004]

ann




msg:395552
 9:28 pm on Jun 10, 2002 (gmt 0)

Does MSN have a spider? I looked at my logs today and MSN was all over them.

Ann

volatilegx




msg:395553
 10:13 pm on Jun 10, 2002 (gmt 0)

I believe MSN uses the Inktomi index... can you give some examples of the log entries with MSN in them?

papabaer




msg:395554
 10:38 pm on Jun 10, 2002 (gmt 0)

Nice find Brett! I've got a new toy... AND a very useful one!

nell




msg:395555
 11:09 pm on Jun 10, 2002 (gmt 0)

I've had 2 major sites trashed in the last 3 months owing to cloaking with outdated IP lists. Lists from an expensive service that were always current but outdated immediately when new spider IPs were introduced.

I asked one customer of mine if they were going to fire me because of it. They laughed. It seems I cost them so much money they couldn't afford to fire me. I'll be working it off with them over the next 6-8 years.

Job security in its basic form.

Key_Master




msg:395556
 11:23 pm on Jun 10, 2002 (gmt 0)

>>>Lists from an expensive service that were always current but outdated immediately when new spider IPs were introduced.

You mean I can make money off of IP lists? ;) I don't cloak but I knew there was a reason why I've been hoarding those IP lists away all this time. If you don't mind, I would be interested in knowing the name of the service you were dealing with. You can stickymail it if you like.

hanuman




msg:395557
 7:03 am on Jun 11, 2002 (gmt 0)

Great resource volatilegx!

Would love to see a search-by-IP option on your site. BTW, I was hit by 209.10.169.24 and 64.140.48.38 with 10K hits each! Do these IPs belong to an SE? Whois information below:
Thanks
Hanuman

209.10.169.24
^^^^^^^^^^^^^
Globix Corporation (NETBLK-GLOBIXBLK3)
295 Lafayette St- 3rd Fl

NY, NY 10012

US

Netname: GLOBIXBLK3
Netblock: 209.10.0.0 - 209.11.223.255
Maintainer: PFMC

Coordinator:
Hostmaster, Globix Corporation (GCH2-ARIN) arin-admin@GLOBIX.NET
+1-212-334-8500 (FAX) 212.334.8615

Domain System inverse mapping provided by:

Z1.NS.NYC1.GLOBIX.NET 209.10.66.55
Z1.NS.SJC1.GLOBIX.NET 209.10.34.55
Z1.NS.LHR1.GLOBIX.NET 212.111.32.38

64.140.48.30
^^^^^^^^^^^^
ICG NetAhead, Inc. (NETBLK-ICG-BLK5)
161 Inverness Dr. West

Englewood, CO 80112

US

Netname: ICG-BLK5
Netblock: 64.140.0.0 - 64.140.95.255
Maintainer: ICGN

Coordinator:
Taylor, Stacy (ST452-ARIN) abuse@icgcom.com
408-579-5000

Domain System inverse mapping provided by:

AS1.ICG.NET 170.147.45.163
AS2.ICG.NET 170.147.45.164
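
The whois output above reports each netblock as a first and last address. As a rough sketch (not something posted in the thread), Python's standard `ipaddress` module can confirm that an address falls inside such a range and convert the range to CIDR notation. The function names `in_netblock` and `netblock_to_cidrs` are made up for illustration.

```python
import ipaddress

def in_netblock(ip, first, last):
    """Return True if ip lies within the inclusive range first..last."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last)

def netblock_to_cidrs(first, last):
    """Express an inclusive address range as a minimal list of CIDR networks."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in networks]

# Ranges copied from the whois output above.
print(in_netblock("209.10.169.24", "209.10.0.0", "209.11.223.255"))
print(in_netblock("64.140.48.30", "64.140.0.0", "64.140.95.255"))
print(netblock_to_cidrs("64.140.0.0", "64.140.95.255"))
```

Note that a netblock only identifies the network owner. It says nothing about which customer of that network is running the bot.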

littleman




msg:395558
 7:11 am on Jun 11, 2002 (gmt 0)

Yeah, volatilegx, it is really good of you to make this publicly available.

volatilegx




msg:395559
 4:45 pm on Jun 11, 2002 (gmt 0)

hanuman,

I'm following up with the network owners of those IP addresses. Hopefully, I'll get some information. I'm especially curious about the bot out of Colorado, as I've seen a number of unidentified bots coming out of Englewood, CO.

I'll post what I find here.

john316




msg:395560
 4:58 pm on Jun 11, 2002 (gmt 0)

Thanks volatilegx !

volatilegx




msg:395561
 5:27 pm on Jun 11, 2002 (gmt 0)

I've added sort of a crude search feature. [iplists.com...]

You can input a list of IP Addresses, one per line, and you'll be told if they were found in my lists, and which list they were on.
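
The lookup described above (one address per line in, list names out) could be sketched roughly as follows. This is not the site's actual code: the function `find_in_lists`, the list names, and the addresses are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
def find_in_lists(query_text, lists):
    """Map each non-empty input line to the names of the lists containing it."""
    results = {}
    for line in query_text.splitlines():
        ip = line.strip()
        if not ip:
            continue  # skip blank lines in the pasted input
        results[ip] = [name for name, ips in lists.items() if ip in ips]
    return results

# Hypothetical data: list names and addresses are placeholders.
example_lists = {
    "engine_a": {"192.0.2.10", "192.0.2.11"},
    "engine_b": {"192.0.2.11", "198.51.100.7"},
}

query = "192.0.2.11\n203.0.113.5\n"
for ip, found in find_in_lists(query, example_lists).items():
    print(ip, "->", ", ".join(found) if found else "not found")
```

Exact string matching like this only finds addresses listed individually; an address covered by a listed block would need a range check instead.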

volatilegx




msg:395562
 10:17 pm on Jun 11, 2002 (gmt 0)

hanuman,

64.140.48.30, the IP from Colorado, appears to belong to the company Iparadigms.com. They are a copyright violation search company. They appear to be related to TurnItIn.com, Plagiarism.org and SlySearch. Prime banning material and definitely not a search engine. Did you happen to get the User Agent on this one? I'm betting it was the SlySearch bot. According to the documentation for the SlySearch bot, it obeys robots.txt. Docs: [slysearch.com...]

Still working on the other one.

Key_Master




msg:395563
 10:28 pm on Jun 11, 2002 (gmt 0)

hanuman,

209.10.169.24 is most likely PortalBSpider/2.0 (spider@portalb.com)

[portalb.com...]

Hope this helps.

tvr23




msg:395564
 1:33 pm on Jun 12, 2002 (gmt 0)

Not sure if this is the correct place to post this?

I am currently working on a project where I need to identify major ISPs from IPs, much like the IPlists.com site identifies the search engines.

The ISPs I need to find are AOL, BT Openworld, Freeserve, Pipex and CompuServe. Are there any sites that would give me this information?

Thanks in advance

OhMyPixel




msg:395565
 3:36 pm on Jun 12, 2002 (gmt 0)

Call me stupid, but what is the benefit of having the search engines' IPs? Would it be to see their visits in your stats? Going one step further, it would be cool to have a stats package that highlighted the hits/IPs that came from search engines.

volatilegx




msg:395566
 3:41 pm on Jun 12, 2002 (gmt 0)

One reason to collect search engine IP addresses is for cloaking purposes.

Another is to make sure you don't ban the search engine IP addresses from your website. Many people ban other bots but want to leave the SE IPs alone.

wilderness




msg:395567
 4:03 pm on Jun 12, 2002 (gmt 0)

<snip>Going one step further it would be cool to have a stats package that highlighted the Hits/IP's that came from search engines>

Analog does an excellent job of accumulating search engine stats, although it doesn't highlight them.
Analog does most anything if you can figure out how to configure it.
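
For readers without a stats package, here is a minimal sketch of the highlighting idea: count hits per address and flag the ones found on a spider list. It does not use Analog. It assumes log lines whose first field is the client address (as in Common Log Format), and the sample lines, addresses, and function names are invented.

```python
from collections import Counter

def hits_per_ip(log_lines):
    """Count requests per client address (the first field of each line)."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts

def report(counts, spider_ips):
    """Return (ip, hits, is_spider) rows, busiest address first."""
    return [(ip, n, ip in spider_ips) for ip, n in counts.most_common()]

# Invented sample lines using documentation-only addresses.
sample_log = [
    '192.0.2.10 - - [12/Jun/2002:15:36:00 +0000] "GET / HTTP/1.0" 200 1043',
    '203.0.113.5 - - [12/Jun/2002:15:36:02 +0000] "GET /a.htm HTTP/1.0" 200 512',
    '192.0.2.10 - - [12/Jun/2002:15:36:05 +0000] "GET /b.htm HTTP/1.0" 200 2048',
]

for ip, hits, is_spider in report(hits_per_ip(sample_log), {"192.0.2.10"}):
    print(ip, hits, "SPIDER" if is_spider else "")
```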

OhMyPixel




msg:395568
 4:06 pm on Jun 12, 2002 (gmt 0)

Hmm, cloaking. Is that similar to taking invisible ink and painting yourself with it? Could you point me somewhere or just give me a general description, please?

NFFC




msg:395569
 4:10 pm on Jun 12, 2002 (gmt 0)

[searchengineworld.com...] is a good starting point OhMyPixel.

volatilegx




msg:395570
 4:13 pm on Jun 12, 2002 (gmt 0)

Also look here: [webmasterworld.com...]

OhMyPixel




msg:395571
 4:36 pm on Jun 12, 2002 (gmt 0)

thank you, reading it now.

hanuman




msg:395572
 1:41 am on Jun 13, 2002 (gmt 0)

Thank you, volatilegx, for the help in the matter, and Key_Master for the information!

Hanuman

hanuman




msg:395573
 4:47 am on Jun 13, 2002 (gmt 0)

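To block 209.10.169.24 (PortalBSpider) and 64.140.48.30 (SlySearch), I added these lines to my .htaccess file:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^(SlySearch.*|PortalBSpider.*) [NC]
RewriteRule ^(.*) block.htm [L]

I would also recommend adding the following lines above the RewriteCond shown. Each carries the OR flag, so another condition must follow it; the last condition before the RewriteRule must not carry OR, or the rule will apply to every request.

RewriteCond %{HTTP_USER_AGENT} ^(-?|[A-Z]{10})$ [OR]

RewriteCond %{REMOTE_HOST} ^private$ [NC,OR]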

Thanks to the group for the kind help!
Hanuman
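
Before deploying user-agent rules like the ones above, it can help to check which strings the patterns catch. The sketch below mirrors the two user-agent patterns with Python's `re` module, written with the standard `|` alternation character. It assumes that Python's regex behaviour matches mod_rewrite's for these simple patterns, and `is_blocked` is an illustrative name, not part of Apache.

```python
import re

# Mirrors the named-bot condition; re.IGNORECASE stands in for the [NC] flag.
NAMED_BOTS = re.compile(r"^(SlySearch.*|PortalBSpider.*)", re.IGNORECASE)

# Mirrors the blank/"-"/ten-capital-letters condition; no [NC], so case-sensitive.
BLANK_OR_RANDOM = re.compile(r"^(-?|[A-Z]{10})$")

def is_blocked(user_agent):
    """True if either pattern matches, like conditions joined with [OR]."""
    return bool(NAMED_BOTS.search(user_agent) or BLANK_OR_RANDOM.search(user_agent))

for ua in ["PortalBSpider/2.0 (spider@portalb.com)", "", "-",
           "ABCDEFGHIJ", "Mozilla/4.0 (compatible; MSIE 6.0)"]:
    print(repr(ua), is_blocked(ua))
```

User-agent strings are trivial to fake, so rules like these only stop bots that identify themselves honestly.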

Friday




msg:395574
 7:09 pm on Jun 13, 2002 (gmt 0)

Thanks volatilegx!

Man, that's a LOT of IPs!

Many more than I'm currently using.

Is the old list obsolete and can be ignored, or should it be combined with the new?

Also: like I said, that's a LOT of IPs. Are you sure none of these are non-SE spiders? I mean, I've had clients get really pissed when one of their members accessed cloaked pages because I included a block of IPs that contained one their ISP was using.

:-(

Friday




msg:395575
 7:13 pm on Jun 13, 2002 (gmt 0)

Regarding my above post:

I noticed some IPs in the "New" list that were included as entire blocks in the "Old" list.

Does this mean you discovered that the entire block was not correct?

Thanks Again,
Friday
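
When comparing two lists like this, one way to see which individually listed addresses are already covered by whole blocks is sketched below. It assumes the blocks can be written in CIDR notation. The function name `covered_by_blocks` and all addresses are hypothetical and are not taken from either list.

```python
import ipaddress

def covered_by_blocks(single_ips, blocks):
    """Map each single IP to the first block (CIDR string) that contains it."""
    networks = [ipaddress.ip_network(block) for block in blocks]
    covered = {}
    for ip in single_ips:
        addr = ipaddress.ip_address(ip)
        match = next((str(net) for net in networks if addr in net), None)
        if match is not None:
            covered[ip] = match
    return covered

# Hypothetical data using documentation-only address ranges.
new_list = ["192.0.2.10", "192.0.2.200", "203.0.113.5"]
old_blocks = ["192.0.2.0/24"]

print(covered_by_blocks(new_list, old_blocks))

# A whole block admits every address in it, not just the spiders.
print(ipaddress.ip_network("192.0.2.0/24").num_addresses)
```

The address count is a reminder of the risk raised above: listing a block treats every address in it as a spider, including any that belong to ordinary visitors.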
