ia_archiver

     
7:59 pm on Jun 29, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9268
votes: 445



I'm seeing increasing hits from Chinese and even Japanese IP ranges.

Anyone have a list of valid IP ranges for ia_archiver?
8:40 pm on June 29, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14664
votes: 99


You seriously let that thing access your site?

Letting anything archive your content should be avoided; it has more pitfalls than benefits.

I'd treat a hit as valid only if it uses one of the first two user agents below and comes from AWS or archive.org IP ranges.

The good:

USER AGENT: "ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)"
IP: 174.129.228.67
IP: 204.236.235.245

USER AGENT: "ia_archiver(OS-Wayback)"
IP: 207.241.224.41
IP: 207.241.224.43
IP: 207.241.226.116
IP: 207.241.226.153
IP: 207.241.226.66
IP: 207.241.226.67
IP: 207.241.227.244
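
As a rough sketch of that rule (user agent plus source address), something like the following could work. The function name `looks_genuine` and the table layout are made up for illustration, and the table holds only the individual addresses listed above, not the full AWS or archive.org ranges, which this post does not give. Swap in CIDR ranges once you have confirmed them.

```python
import ipaddress

# User agents and addresses reported as genuine in the post above.
# Assumption: each bare address is treated as a /32 network; replace
# these with confirmed CIDR ranges if you have them.
OBSERVED_GOOD = {
    "ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)": [
        "174.129.228.67",
        "204.236.235.245",
    ],
    "ia_archiver(OS-Wayback)": [
        "207.241.224.41",
        "207.241.224.43",
        "207.241.226.116",
        "207.241.226.153",
        "207.241.226.66",
        "207.241.226.67",
        "207.241.227.244",
    ],
}

ALLOWED = {
    agent: [ipaddress.ip_network(net) for net in nets]
    for agent, nets in OBSERVED_GOOD.items()
}


def looks_genuine(user_agent: str, ip: str) -> bool:
    """Return True only if the exact user agent AND the source IP both match."""
    networks = ALLOWED.get(user_agent)
    if networks is None:
        return False  # bare "ia_archiver" and other variants fail here
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

Requiring both checks matters because the user agent alone is trivially spoofed.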

The bad and the ugly:

USER AGENT: "ia_archiver"
IP: 49.72.162.197
IP: 49.72.212.77
IP: 49.72.213.2
IP: 49.72.213.21
IP: 49.73.156.140
IP: 49.73.33.128
IP: 49.75.198.15
IP: 58.208.113.29
IP: 58.208.115.131
IP: 58.208.176.125
IP: 58.208.176.166
IP: 58.208.177.55
IP: 58.208.240.228
IP: 58.208.241.138
IP: 58.209.122.124
IP: 58.209.123.35
IP: 58.209.124.111
IP: 58.209.124.7
IP: 58.209.124.77
IP: 58.209.160.165
IP: 58.209.163.52
IP: 58.209.179.152
IP: 58.209.179.166
IP: 58.209.179.215
IP: 58.209.179.218
IP: 58.209.18.134
IP: 58.209.248.70
IP: 58.209.250.79
IP: 58.209.250.88
IP: 58.209.253.116
IP: 58.209.254.119
IP: 58.209.255.39
IP: 58.209.52.236
IP: 58.209.55.118
IP: 58.209.55.51
IP: 61.183.41.36
IP: 63.141.228.126
IP: 114.113.228.107
IP: 114.216.98.102
IP: 114.218.226.235
IP: 114.218.226.99
IP: 114.218.227.196
IP: 114.218.227.96
IP: 114.218.238.43
IP: 114.218.249.216
IP: 114.219.164.143
IP: 114.219.165.221
IP: 117.135.160.14
IP: 117.80.204.202
IP: 117.80.205.212
IP: 117.80.206.191
IP: 117.80.207.1
IP: 117.80.65.230
IP: 117.81.10.18
IP: 117.81.11.16
IP: 117.81.120.175
IP: 117.81.123.27
IP: 117.81.231.166
IP: 117.81.232.103
IP: 117.81.232.236
IP: 117.81.233.137
IP: 117.81.233.193
IP: 117.81.233.60
IP: 117.81.6.252
IP: 117.81.7.119
IP: 117.81.9.32
IP: 117.83.200.170
IP: 117.83.201.140
IP: 117.83.38.251
IP: 117.83.39.23
IP: 117.83.39.65
IP: 117.83.67.224
IP: 117.83.68.65
IP: 117.83.96.198
IP: 117.83.97.113
IP: 119.148.160.84
IP: 119.148.161.64
IP: 121.228.0.43
IP: 121.228.1.214
IP: 121.228.15.78
IP: 121.228.152.135
IP: 121.228.154.82
IP: 121.228.156.87
IP: 121.228.157.102
IP: 121.228.157.45
IP: 121.228.158.100
IP: 121.228.158.184
IP: 121.228.158.25
IP: 121.228.2.136
IP: 121.228.3.217
IP: 121.228.4.204
IP: 121.228.5.197
IP: 121.228.9.243
IP: 121.236.150.143
IP: 121.236.80.170
IP: 126.114.226.88
IP: 180.106.152.145
IP: 180.106.153.15
IP: 180.106.153.193
IP: 180.107.122.24
IP: 180.107.123.71
IP: 180.107.40.195
IP: 180.117.249.18
IP: 180.117.250.201
IP: 184.171.169.117
IP: 202.165.179.32
IP: 203.156.231.178
IP: 205.164.48.130
IP: 208.94.240.36
IP: 211.143.200.88
IP: 211.144.76.77
IP: 211.95.79.156
IP: 218.16.124.252
IP: 218.65.30.73
IP: 218.75.152.226
IP: 218.75.152.235
IP: 219.235.3.162
IP: 220.181.158.160
IP: 221.225.39.180
IP: 221.225.69.13
IP: 221.225.70.165
IP: 222.73.173.221
IP: 222.93.126.101
IP: 222.93.126.157
IP: 222.93.127.50
IP: 222.93.127.58
IP: 222.93.127.73
IP: 222.93.170.20
IP: 222.93.226.94
IP: 222.93.227.1
IP: 222.93.227.48
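
Many of the addresses above share their first two octets, so it can help to tally them by network before deciding what to block. Below is a minimal sketch that counts hits per /16. The /16 width and the helper name `count_by_slash16` are assumptions chosen for illustration, not something the list itself dictates, and blocking a whole /16 may also shut out ordinary visitors.

```python
import ipaddress
from collections import Counter


def count_by_slash16(ips):
    """Tally addresses by the /16 network that contains them."""
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its network back.
        net = ipaddress.ip_network(f"{ip}/16", strict=False)
        counts[str(net)] += 1
    return counts


# A few addresses taken from the list above.
sample = ["49.72.162.197", "49.72.212.77", "49.73.156.140", "58.208.113.29"]
for net, hits in count_by_slash16(sample).most_common():
    print(net, hits)
```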
9:22 pm on June 29, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13913
votes: 493


Astounding but true: If you e-mail them and ask, they will tell you if it's theirs. I can count on the fingers of one hand the number of times I've got an answer to a "Please identify your robot" query.
10:44 pm on June 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5460
votes: 3


Letting anything archive your content should be avoided; it has more pitfalls than benefits.


Let's not be so vague here ;)

IMO that archival also applies to major SEs and their "cache".
12:58 am on June 30, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9268
votes: 445


No, Bill, to answer your question: nothing archives my content. However, ia_archiver is also the bot Alexa uses to check your site and keep it in its index, which does have a positive effect when my advertisers check ranking prior to deciding whether to purchase one of my ad campaigns. Thanks for the IP list.
2:35 am on June 30, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14664
votes: 99


You're very welcome.

I have never permitted ia_archiver on my site, yet it has always shown up in Alexa.

I also sell advertising, never had anyone ask about Alexa, ever :)

YMMV
2:46 am on June 30, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9268
votes: 445


Sorry to say, Alexa data displays on too many domain/traffic info sites, sometimes not identified as Alexa. I've only received feedback twice about Alexa stats during biz proposals, but I assume many more potential customers consider those numbers. Happily, I think the Alexa phenom has steadily decreased in importance over the last few years, replaced by the even more perplexing phenom of social media. Regardless, my strategy of covering all my bases remains my MO.
12:28 am on July 2, 2012 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 10, 2004
posts: 425
votes: 23


Alexa is to ranking what "MIPS" is to CPU benchmarks: a meaningless indicator of performance.
6:53 am on July 2, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9268
votes: 445



@motorhaven - agreed, but countless users don't know that. I never get the chance to enlighten many potential clients; they just make their money decisions with the data at hand, valid or not.
2:48 pm on July 2, 2012 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 10, 2004
posts: 425
votes: 23


I agree with that. I recently had someone approach me about doing a major overhaul on his site and he kept bragging about his Alexa ranking. When I gave him more accurate information, he was disappointed.

I'd put Alexa up for potential advertisers, but my experience is that the benefit of blocking Alexa from my sites outweighs the potential advertising benefits.

Obviously this varies for all of us depending on niche, so it comes down to a business pro/con decision. Quite similar to Websense - sites which rely heavily on corporate traffic may not want to block it while others may find it beneficial.
3:16 pm on July 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5460
votes: 3


Obviously this varies for all of us depending on niche, so it comes down to a business pro/con decision. Quite similar to Websense - sites which rely heavily on corporate traffic may not want to block it while others may find it beneficial.


Given today's market trends and the variety of devices (PCs, cell phones, handhelds), I'm seeing legitimate users simply switching devices.

Course that's not economical for the mass harvester, nor are the bandwidth restrictions of most mobile devices.
9:20 pm on July 13, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14664
votes: 99


This obnoxious bot using the China IPs is now asking for the index page of one site almost 100 times a day.

Why do they keep coming back with such frequency?

Insanity.
9:38 pm on July 13, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9268
votes: 445



Today I started a trial period of blocking all variations of ia_archiver, Alexa, etc.

I'll be monitoring the number of advert requests in relation to any changes in Alexa ranking. Will post anything significant in a few weeks, thanks.
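
For anyone wanting to try the same blanket block, here is a minimal sketch of a user-agent test. The pattern and the name `should_block` are illustrative assumptions, not the rules actually used in this trial, and how the result gets turned into a 403 depends on your server.

```python
import re

# Assumed pattern: any user agent mentioning ia_archiver or alexa,
# in any letter case.
BLOCK_PATTERN = re.compile(r"ia_archiver|alexa", re.IGNORECASE)


def should_block(user_agent: str) -> bool:
    """Return True if the user agent matches any blocked variation."""
    return bool(BLOCK_PATTERN.search(user_agent or ""))
```

A bare "alexa" match is deliberately broad, so it may also catch any visitor whose user agent happens to contain that word. Narrow the pattern if that matters for your traffic.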
10:07 pm on July 14, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3152
votes: 4


Real ia_archiver DOES respect robots.txt - haven't seen the real one in years.
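
To see what a robots.txt rule would tell a compliant ia_archiver, Python's standard `urllib.robotparser` can evaluate it offline. This is only a sketch: the two-line rule set and the helper name `allowed` are assumptions for illustration, and it says nothing about impostors, which simply ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: shut ia_archiver out of the whole site.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if the rules above permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


print(allowed("ia_archiver"))   # the rule applies to this agent
print(allowed("SomeOtherBot"))  # no rule names this agent
```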
1:35 am on July 15, 2012 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 10, 2004
posts: 425
votes: 23


It shouldn't impact Alexa rankings. Alexa builds ranking based on hits from browsers with the plug-in installed. I blocked it for years on a site in the top 10,000, and several other sites with no negative change in Alexa rank (and wouldn't care if it did!).