Forum Moderators: Ocean10000 & incrediBILL

ia archiver

   
7:59 pm on Jun 29, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




I'm seeing increasing hits from Chinese and even Japanese IP ranges.

Anyone have a list of valid IP ranges for ia_archiver?
8:40 pm on Jun 29, 2012 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You seriously let that thing access your site?

Letting anything archive your content should be avoided; it has more pitfalls than benefits.

I'd treat a request as valid only if it uses one of the first two user agents below and it comes from AWS or archive.org IP ranges.

The good:

USER AGENT: "ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)"
IP: 174.129.228.67
IP: 204.236.235.245

USER AGENT: "ia_archiver(OS-Wayback)"
IP: 207.241.224.41
IP: 207.241.224.43
IP: 207.241.226.116
IP: 207.241.226.153
IP: 207.241.226.66
IP: 207.241.226.67
IP: 207.241.227.244

The bad and the ugly:

USER AGENT: "ia_archiver"
IP: 49.72.162.197
IP: 49.72.212.77
IP: 49.72.213.2
IP: 49.72.213.21
IP: 49.73.156.140
IP: 49.73.33.128
IP: 49.75.198.15
IP: 58.208.113.29
IP: 58.208.115.131
IP: 58.208.176.125
IP: 58.208.176.166
IP: 58.208.177.55
IP: 58.208.240.228
IP: 58.208.241.138
IP: 58.209.122.124
IP: 58.209.123.35
IP: 58.209.124.111
IP: 58.209.124.7
IP: 58.209.124.77
IP: 58.209.160.165
IP: 58.209.163.52
IP: 58.209.179.152
IP: 58.209.179.166
IP: 58.209.179.215
IP: 58.209.179.218
IP: 58.209.18.134
IP: 58.209.248.70
IP: 58.209.250.79
IP: 58.209.250.88
IP: 58.209.253.116
IP: 58.209.254.119
IP: 58.209.255.39
IP: 58.209.52.236
IP: 58.209.55.118
IP: 58.209.55.51
IP: 61.183.41.36
IP: 63.141.228.126
IP: 114.113.228.107
IP: 114.216.98.102
IP: 114.218.226.235
IP: 114.218.226.99
IP: 114.218.227.196
IP: 114.218.227.96
IP: 114.218.238.43
IP: 114.218.249.216
IP: 114.219.164.143
IP: 114.219.165.221
IP: 117.135.160.14
IP: 117.80.204.202
IP: 117.80.205.212
IP: 117.80.206.191
IP: 117.80.207.1
IP: 117.80.65.230
IP: 117.81.10.18
IP: 117.81.11.16
IP: 117.81.120.175
IP: 117.81.123.27
IP: 117.81.231.166
IP: 117.81.232.103
IP: 117.81.232.236
IP: 117.81.233.137
IP: 117.81.233.193
IP: 117.81.233.60
IP: 117.81.6.252
IP: 117.81.7.119
IP: 117.81.9.32
IP: 117.83.200.170
IP: 117.83.201.140
IP: 117.83.38.251
IP: 117.83.39.23
IP: 117.83.39.65
IP: 117.83.67.224
IP: 117.83.68.65
IP: 117.83.96.198
IP: 117.83.97.113
IP: 119.148.160.84
IP: 119.148.161.64
IP: 121.228.0.43
IP: 121.228.1.214
IP: 121.228.15.78
IP: 121.228.152.135
IP: 121.228.154.82
IP: 121.228.156.87
IP: 121.228.157.102
IP: 121.228.157.45
IP: 121.228.158.100
IP: 121.228.158.184
IP: 121.228.158.25
IP: 121.228.2.136
IP: 121.228.3.217
IP: 121.228.4.204
IP: 121.228.5.197
IP: 121.228.9.243
IP: 121.236.150.143
IP: 121.236.80.170
IP: 126.114.226.88
IP: 180.106.152.145
IP: 180.106.153.15
IP: 180.106.153.193
IP: 180.107.122.24
IP: 180.107.123.71
IP: 180.107.40.195
IP: 180.117.249.18
IP: 180.117.250.201
IP: 184.171.169.117
IP: 202.165.179.32
IP: 203.156.231.178
IP: 205.164.48.130
IP: 208.94.240.36
IP: 211.143.200.88
IP: 211.144.76.77
IP: 211.95.79.156
IP: 218.16.124.252
IP: 218.65.30.73
IP: 218.75.152.226
IP: 218.75.152.235
IP: 219.235.3.162
IP: 220.181.158.160
IP: 221.225.39.180
IP: 221.225.69.13
IP: 221.225.70.165
IP: 222.73.173.221
IP: 222.93.126.101
IP: 222.93.126.157
IP: 222.93.127.50
IP: 222.93.127.58
IP: 222.93.127.73
IP: 222.93.170.20
IP: 222.93.226.94
IP: 222.93.227.1
IP: 222.93.227.48
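
The rule described above (a known user agent plus a known source range) can be sketched in Python. This is only an illustration under stated assumptions: the function name `is_trusted_ia_archiver` is hypothetical, the /22 network is simply the smallest block that spans the 207.241.224.x to 207.241.227.x addresses listed above, and the two /32 entries are the exact AWS addresses from the list. None of these are published ranges, so check current ranges before relying on them.

```python
import ipaddress

# Assumption: user-agent strings copied from the "good" list in this thread.
TRUSTED_AGENTS = (
    "ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)",
    "ia_archiver(OS-Wayback)",
)

# Assumption: illustrative networks only, derived from the addresses
# quoted in the thread. They are not official published ranges.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("207.241.224.0/22"),    # spans 207.241.224.x-227.x
    ipaddress.ip_network("174.129.228.67/32"),   # single AWS address from the list
    ipaddress.ip_network("204.236.235.245/32"),  # single AWS address from the list
]


def is_trusted_ia_archiver(user_agent: str, remote_ip: str) -> bool:
    """Return True only when both the user agent and the source IP match."""
    if user_agent not in TRUSTED_AGENTS:
        return False
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed address in the log line: treat as untrusted.
        return False
    return any(address in network for network in TRUSTED_NETWORKS)
```

A bare "ia_archiver" user agent fails the first check regardless of where it comes from, which matches how the "bad and the ugly" entries are treated above.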
9:22 pm on Jun 29, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Astounding but true: If you e-mail them and ask, they will tell you if it's theirs. I can count on the fingers of one hand the number of times I've got an answer to a "Please identify your robot" query.
10:44 pm on Jun 29, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Letting anything archive your content should be avoided; it has more pitfalls than benefits.


Let's not be so vague here ;)

IMO that also applies to archival by the major SEs and their "cache".
12:58 am on Jun 30, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



No Bill, to answer your question: nothing archives my content. However, ia_archiver is also the bot Alexa uses to check your site and keep it in its index, which does have a positive effect when my advertisers check ranking prior to deciding whether to purchase one of my ad campaigns. Thanks for the IP list.
2:35 am on Jun 30, 2012 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You're very welcome.

I have never permitted ia_archiver on my site, yet I have always shown up in Alexa.

I also sell advertising, never had anyone ask about Alexa, ever :)

YMMV
2:46 am on Jun 30, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Sorry to say, Alexa data displays on too many domain/traffic info sites, sometimes not identified as Alexa. I've only received feedback twice about Alexa stats during biz proposals, but I assume many more potential customers consider those numbers. Happily, I think the Alexa phenom has steadily decreased in importance over the last few years, replaced by the even more perplexing phenom of social media. Regardless, my strategy of covering all my bases remains my MO.
12:28 am on Jul 2, 2012 (gmt 0)

10+ Year Member



Alexa is to ranking what "MIPS" is to CPU benchmarks: a meaningless indicator of performance.
6:53 am on Jul 2, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




@ motorhaven - agreed, but countless users don't know that. I never get the chance to enlighten many potential clients; they just make their money decisions with the data at hand, valid or not.
2:48 pm on Jul 2, 2012 (gmt 0)

10+ Year Member



I agree with that. I recently had someone approach me about doing a major overhaul on his site and he kept bragging about his Alexa ranking. When I gave him more accurate information, he was disappointed.

I'd allow Alexa for the sake of potential advertisers, but in my experience the benefits of blocking it from my sites outweigh the potential advertising benefits.

Obviously this varies for all of us depending on niche, so it comes down to a business decision, pro and con. Quite similar to Websense - sites which rely heavily on corporate traffic may not want to block it while others may find it beneficial.
3:16 pm on Jul 2, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Obviously this varies for all of us depending on niche, so it comes down to a business decision, pro and con. Quite similar to Websense - sites which rely heavily on corporate traffic may not want to block it while others may find it beneficial.


Given today's market trends and the variety of devices (PCs, cell phones, handhelds), I'm seeing legitimate users simply switching devices.

Of course, that's not economical for the mass harvester, nor are the bandwidth restrictions of most mobile devices.
9:20 pm on Jul 13, 2012 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



This obnoxious bot using the China IPs is now asking for the index page of one site almost 100 times a day.

Why do they keep coming back with such frequency?

Insanity.
9:38 pm on Jul 13, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




Today I started a trial period of blocking all variations of ia_archiver, Alexa, etc.

I'll be monitoring the number of advert requests in relation to any changes in Alexa ranking. Will post anything significant in a few weeks, thanks.
10:07 pm on Jul 14, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Real ia_archiver DOES respect robots.txt - haven't seen the real one in years.
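
For reference, here is a minimal sketch of what a robots.txt-respecting crawler would conclude from a rule that disallows ia_archiver. It uses Python's standard `urllib.robotparser`. The rule text, the `allowed` helper and the example.com URLs are illustrative assumptions, not taken from any real site. It only models a compliant crawler; an impostor that ignores robots.txt is unaffected and still has to be blocked by user agent or IP.

```python
from urllib.robotparser import RobotFileParser

# Assumption: an illustrative robots.txt that shuts out ia_archiver only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not a site from the thread.
    print(allowed("ia_archiver", "http://example.com/"))
    print(allowed("SomeOtherBot", "http://example.com/index.html"))
```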
1:35 am on Jul 15, 2012 (gmt 0)

10+ Year Member



It shouldn't impact Alexa rankings. Alexa builds ranking based on hits from browsers with the plug-in installed. I blocked it for years on a site in the top 10,000, and several other sites with no negative change in Alexa rank (and wouldn't care if it did!).