blend27

msg:3555949 | 4:13 pm on Jan 23, 2008 (gmt 0) |
banned until URL provided in UA resolves.
|
kamikaze Optimizer

msg:3557730 | 10:51 am on Jan 25, 2008 (gmt 0) |
I just blocked it. I block most all that are not from sites that send me traffic (G, Y & M).
|
Lord Majestic

msg:3557838 | 2:51 pm on Jan 25, 2008 (gmt 0) |
In all probability this is the bot of [webalta.ru...], a primarily Russian-language search site. They used to have a different user-agent though, so it could be a fake/pretender.
|
keyplyr

msg:3558283 | 11:33 pm on Jan 25, 2008 (gmt 0) |
The IP range the bot is coming from is in the same inetnum as the whois record for webalta.net, so I have taken down the blocks and am allowing it to crawl through my accounts. However, they do need to get their act together and make their info visible.
|
mcneely

msg:3566032 | 6:55 pm on Feb 4, 2008 (gmt 0) |
These guys will forever remain banned on our end. Seems all they want to do is follow site scrapers around. It's too bad really, because WebAlta is supposed to be the Russian equivalent of Google with regard to its popularity over there. We've several sites with weblogs, and that WebAlta huckster comes in within the hour to parse and/or collect the "exact" page the scraper couldn't get the first time around. The requests are perfectly aligned, in that it asks for the same thing the scraper did, right down to the anchor links. Weblogs are its favourite target, with movies, mp3s, and software following. If a scraper puts in a request for a file and gets denied, I can almost count, to within the minute, when WebAlta will show. I've been watching these guys for a good long while now, and patterns are patterns, so it's asta-le-way-you-go to WebAlta.
|
wilderness

msg:3566135 | 9:01 pm on Feb 4, 2008 (gmt 0) |
| Seems all they want to do is follow site scrapers around. |
| Have a friend who moved a widget website that had been online through three free hosts and a domain. The website was her college thesis some ten years ago. When the friend didn't renew their hosting, I offered a sub-folder. The result is that the sub-folder has much, much less restrictive access than the rest of my websites. Almost every visitor that hits the sub-folder makes multiple requests for the root (403'd) and is then followed immediately by 2-5 entirely different IPs making identical requests. I don't accumulate these requests/denials because most are from non-North American ranges, which don't get into my sites anyway (with only a few exceptions). The coincidence of these repeated patterns is simply too frequent to overlook.
|
Eric

msg:3601975 | 9:48 am on Mar 16, 2008 (gmt 0) |
At least this one obeys robots.txt.
Accept: text/html;q=1.0, text/plain;q=1.0, text/;q=0.5, */*;q=0.1
Accept-Charset: utf-8;q=1.0, windows-1251;q=0.8, cp1251;q=0.8, koi8-r;q=0.8, *;q=0.5
Accept-Encoding: gzip;q=1.0, deflate;q=1.0, identity;q=0.5, *;q=0
Host: www.some.com
User-Agent: WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)
IP: 77.91.224.16
|
newbie6

msg:3617136 | 1:55 pm on Apr 2, 2008 (gmt 0) |
Just to check, is it User-agent: WebAlta ?
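One way to try this out locally is to feed a candidate record to Python's standard `urllib.robotparser` and see whether it applies to the full User-Agent string quoted earlier in the thread. This is only a sketch: `WebAlta` as the token is an unconfirmed guess, the `is_allowed` helper is made up for illustration, and the crawler's own parser may match records differently from Python's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "WebAlta" as the token is a guess, not confirmed.
ROBOTS_TXT = """\
User-agent: WebAlta
Disallow: /
"""

# Full User-Agent string as reported earlier in this thread.
WEBALTA_UA = ("WebAlta Crawler/2.0 "
              "(http://www.webalta.net/ru/about_webmaster.html) "
              "(Windows; U; Windows NT 5.1; ru-RU)")


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if Python's parser would let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The WebAlta record applies to this UA, so "/" is disallowed.
    print(is_allowed(WEBALTA_UA, "/"))
    # No record applies to this made-up UA, so the fetch is allowed.
    print(is_allowed("SomeOtherBot/1.0", "/"))
```

Python's parser compares the record token, case-insensitively, against the part of the User-Agent before the first slash, which is why the short token matches here. A bot that ignores robots.txt altogether is of course unaffected by any of this.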
|
Loeffler

msg:3662660 | 9:22 am on May 30, 2008 (gmt 0) |
WebAlta Crawler is a harvester. He collects email addresses from web sites and adds them to spam mailing lists. He visited our web page [abx.de ] on 14th May 2008 at 08:56 CEST. His IP address: 85.17.173.8 (LeaseWeb, AMSTERDAM, Netherlands). We showed him a new email address generated only for him. We received the first spam for this email address on 30th May. Andreas.
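For anyone wanting to set up the same kind of trap, the per-visitor address idea can be sketched roughly as follows. This is not the code used on the site above; the secret key, the `example.com` domain and the function names are all placeholders chosen for illustration.

```python
import hashlib
import hmac

# Assumptions for illustration only: the secret and the domain are placeholders.
SECRET_KEY = b"change-me"
TRAP_DOMAIN = "example.com"


def trap_address(ip: str, visit_time: str) -> str:
    """Derive a unique, reproducible address for one visit (IP + timestamp)."""
    message = f"{ip}|{visit_time}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # 16 hex characters keep the local part short but still hard to guess.
    return f"trap-{digest[:16]}@{TRAP_DOMAIN}"


def record_visit(log: dict, ip: str, visit_time: str, user_agent: str) -> str:
    """Store which visit was shown which address, and return the address."""
    address = trap_address(ip, visit_time)
    log[address] = (ip, visit_time, user_agent)
    return address


if __name__ == "__main__":
    shown = {}
    addr = record_visit(shown, "85.17.173.8", "2008-05-14T08:56+02:00",
                        "WebAlta Crawler/1.3.34")
    # When spam later arrives at addr, the dictionary names the visit.
    print(addr, "->", shown[addr])
```

Because each address is shown to exactly one visit, any mail arriving at it points back to the IP and User-Agent that harvested it.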
|
blend27

msg:3662745 | 11:43 am on May 30, 2008 (gmt 0) |
I've never seen WebAlta Crawler come from LeaseWeb ranges. But I've seen plenty of harvesters/scrapers come from the range that IP is in, and they spoof UAs like there is no tomorrow.
|
Loeffler

msg:3662773 | 12:43 pm on May 30, 2008 (gmt 0) |
You are right. WebAlta visited our web page several times from IP 77.91.224... (WEBALTA-NET, Russia). He identified himself as webalta crawler/2.0. We received no spam. This time, he came from 85.17.173.8. He identified himself as WebAlta Crawler/1.3.34. We received spam for the shown email address. This is the list of the last visits:
27th jan 2008 13:09 cet IP: 77.91.224.5 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
04th feb 2008 13:51 cet IP: 77.91.224.15 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
10th feb 2008 04:52 cet IP: 77.91.224.15 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
17th feb 2008 13:55 cet IP: 77.91.224.5 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
27th mar 2008 11:09 cet IP: 77.91.224.15 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
07th may 2008 03:39 cest IP: 77.91.224.15 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
14th may 2008 08:56 cest IP: 85.17.173.8 (LeaseWeb, AMSTERDAM, Netherlands) Browser: webalta crawler/1.3.34 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
21st may 2008 12:01 cest IP: 77.91.224.15 (WEBALTA-NET Moscow, Russia) Browser: webalta crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (windows; u; windows nt 5.1; ru-ru)
Andreas
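The distinction drawn in this post (same User-Agent claim, different network) can be checked mechanically. The sketch below uses Python's `ipaddress` module and assumes, purely for illustration, that the 77.91.224.x addresses seen in this thread form a /24; the real WEBALTA-NET inetnum should be taken from whois. The `classify` function and its labels are made up for this example.

```python
import ipaddress

# Assumption: the 77.91.224.x addresses in this thread are treated as a /24.
# The real WEBALTA-NET inetnum may be wider or narrower; check whois.
ASSUMED_WEBALTA_NET = ipaddress.ip_network("77.91.224.0/24")


def classify(ip: str, user_agent: str) -> str:
    """Label a hit as 'webalta-range', 'claims-webalta-elsewhere' or 'other'."""
    claims_webalta = "webalta" in user_agent.lower()
    in_range = ipaddress.ip_address(ip) in ASSUMED_WEBALTA_NET
    if claims_webalta and in_range:
        return "webalta-range"
    if claims_webalta:
        return "claims-webalta-elsewhere"
    return "other"


if __name__ == "__main__":
    # Two of the visits listed above.
    visits = [
        ("77.91.224.5", "webalta crawler/2.0"),
        ("85.17.173.8", "webalta crawler/1.3.34"),
    ]
    for ip, ua in visits:
        print(ip, ua, classify(ip, ua))
```

A hit labelled `claims-webalta-elsewhere` is not proof of spoofing, only a reason to treat the User-Agent claim with suspicion.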
|
reddragons

msg:3695184 | 6:49 am on Jul 10, 2008 (gmt 0) |
The WebAlta bot visits my sites on a regular basis and, as stated above, the "http://www.webalta.net/ru/about_webmaster.html" URL doesn't work. But here's some info and links from the WebAlta.ru website, which seems to be a Russian portal offering email, dating and search, and which is (maybe) where the WebAlta.net bot comes from.
Main site: [webalta.ru...]
Top ranked sites in its index: [top.webalta.ru...]
Run a search query: [top.webalta.ru...]
Their marketing site?: [altastat.com...]
If you want to see the site in English, run it through Google's translator: translate.google.com
|
keyplyr

msg:3702809 | 4:46 am on Jul 20, 2008 (gmt 0) |
I decided to let WebAlta Crawler have access to a few sites on one server cluster. The very next day it came, requested robots.txt then disobeyed it. Took disallowed files and crawled through disallowed directories. Now permanently banned.
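This "fetched robots.txt, then ignored it" pattern can be picked out of a log after the fact. The following is a rough sketch using Python's `urllib.robotparser`; the `find_violations` function, the sample rules, the bot name and the 192.0.2.x documentation address are all invented for the example and are not taken from any real log.

```python
from urllib.robotparser import RobotFileParser


def find_violations(robots_txt: str, requests: list) -> list:
    """Return (client_ip, path) pairs fetched despite a Disallow rule.

    requests is a list of (client_ip, user_agent, path) tuples in log order.
    Only clients that asked for /robots.txt first are reported, since
    those are the ones that saw the rules and ignored them.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    saw_robots = set()
    violations = []
    for client_ip, user_agent, path in requests:
        if path == "/robots.txt":
            saw_robots.add(client_ip)
            continue
        if client_ip in saw_robots and not parser.can_fetch(user_agent, path):
            violations.append((client_ip, path))
    return violations


if __name__ == "__main__":
    # Made-up rules and requests, for illustration only.
    rules = "User-agent: *\nDisallow: /private/\n"
    log = [
        ("192.0.2.10", "ExampleBot/1.0", "/robots.txt"),
        ("192.0.2.10", "ExampleBot/1.0", "/index.html"),
        ("192.0.2.10", "ExampleBot/1.0", "/private/report.html"),
    ]
    print(find_violations(rules, log))
```

Keying on the client IP alone is a simplification; a bot that rotates addresses between the robots.txt request and the crawl would slip past this check.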
|
Lord Majestic

msg:3702857 | 9:36 am on Jul 20, 2008 (gmt 0) |
WebAlta stopped showing a search engine interface some months ago; the search query at top.webalta.ru is a directory search, not a web search.
|
Megaclinium

msg:3732140 | 4:50 am on Aug 27, 2008 (gmt 0) |
Nasty thing started scraping parts of my site, going way too fast at a couple of pages a second and taking ka-jillions of pages per session (that's metric for 'billions and billions'). It was coming from 77.91.224.*; a 2006 WebmasterWorld discussion had it coming from 87.224.173.*. Based on the excessive speed and the email harvesting reported above, I deep-sixed it.
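Catching that kind of speed automatically can be sketched with a sliding-window counter per client. The `RateWatcher` class and its thresholds below are arbitrary assumptions for illustration; sensible limits depend on the site.

```python
from collections import defaultdict, deque


class RateWatcher:
    """Flag clients that exceed max_hits requests within window seconds."""

    def __init__(self, max_hits: int = 10, window: float = 5.0):
        self.max_hits = max_hits
        self.window = window
        self._hits = defaultdict(deque)  # client IP -> recent timestamps

    def hit(self, client_ip: str, timestamp: float) -> bool:
        """Record one request; return True if the client is now over the limit."""
        recent = self._hits[client_ip]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) > self.max_hits


if __name__ == "__main__":
    watcher = RateWatcher(max_hits=3, window=2.0)
    # Two pages a second from one client: the fourth hit trips the limit.
    for t in (0.0, 0.5, 1.0, 1.5):
        print(t, watcher.hit("77.91.224.15", t))
```

Timestamps are passed in rather than read from the clock so the same class can be run over an existing access log as well as live traffic.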
|
thetrasher

msg:3743130 | 12:08 am on Sep 12, 2008 (gmt 0) |
Formerly known as "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)":
77.91.224.##
User-Agent: Yanga WorldSearch Bot v1.1/beta (http://www.yanga.co.uk/)
GET /robots.txt HTTP/1.1
User-agent: *
Disallow: /
GET / HTTP/1.1
|
keyplyr

msg:3743137 | 12:29 am on Sep 12, 2008 (gmt 0) |
Yanga WorldSearch Bot requested robots.txt then disobeyed it, requesting disallowed files and scraping through hundreds of webpages and image files. Now banned.
|