NetIQ - what are they looking for?

         

Mokita

10:42 pm on Oct 20, 2006 (gmt 0)

10+ Year Member



I have a really interesting situation - a site belonging to a client who died last year. He had renewed the domain for 2 years just a couple of months before he died. His wife decided, as a courtesy to his former clients and people who had bookmarked the site, to leave up one page until the domain expires, saying simply and briefly what had happened.

I changed robots.txt to block all (compliant) bots and now have the fun of seeing the non-compliant ones and cloaked bots reveal themselves.
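For anyone wanting to confirm what a compliant bot should conclude from a block-all robots.txt, here is a minimal Python sketch using the standard library's urllib.robotparser. The file contents shown and the helper name is_allowed are assumptions for illustration, not the actual file on the site.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a "block everything" robots.txt (not the site's real file).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant bot with this user agent may fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # A compliant crawler should get False for every path here.
    print(is_allowed(BLOCK_ALL, "Slurp", "/"))
```

Any bot that keeps requesting pages after reading a file like this is either non-compliant or not reading robots.txt at all.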

One persistent bot comes from an IP range belonging to NetIQ (owners, I think, of WebTrends): 63.88.212.nnn. The user agent is:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
I have blocked it using .htaccess but it still keeps returning.

Does anyone know what it is looking for?

The other thing that shows up rather starkly in the otherwise very sparse logs is Yahoo Slurp visiting four times per day without fail. It seems it just doesn't believe what it is seeing in robots.txt and has to keep rechecking and rechecking, etc. <sigh>

Mokita

1:18 am on Oct 21, 2006 (gmt 0)

10+ Year Member



Edit of above post:

Yahoo Slurp visiting four times per day without fail.

Meant to say four to eleven times per day.

wilderness

4:32 am on Oct 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mokita,
Webtrends has another range as well.
If you denied and they keep coming, there's a good chance that they are just stuck in a loop (took me a long time to understand this effect). You might try remarking out the deny line for a few of their visits and see if that stops the loop.

As far as Yahoo (and MSN), they have more bots than cockroaches.
They visit me many times during the day with just a page or two at a time. Personally, I'd rather have the major bots crawl my entire sites in a few minutes and then not return again until weeks or perhaps a month later (fat chance I'll get that).

Mokita

11:07 pm on Oct 21, 2006 (gmt 0)

10+ Year Member



Webtrends has another range as well.

Should I be blocking them in other sites too? If so, for what reason?

If you denied and they keep coming, there's a good chance that they are just stuck in a loop (took me a long time to understand this effect).

Would you explain the effect a bit more?

You might try remarking out the deny line for a few of their visits

I will, thanks for the suggestion.

However, that method doesn't work with one pest I have in several sites - Websense bot. It continually asks for pages that were long ago deleted and throws heaps of 404 errors. I have tried feeding it 403s for a long time and now I redirect it back to their own web site, in the faint hope they'll notice and do something about it.

They visit me many times during the day with just a page or two at a time.

Yahoo is far worse than MSN in this regard. Yahoo does that in all our sites, but the reason I commented is that in a very sparse raw log, the multitudinous Yahoo visits are very striking as they are not broken up by human visitors like in a normal live site. I'd guess that easily 90% or more of the entries in the raw log are Yahoo Slurp.

Personally, I'd rather have the major bots crawl my entire sites in a few minutes and then not return again until weeks or perhaps a month later

I'll heartily second that (at least for our static sites). I think there should be an option similar to the defunct meta tag: revisit-after: 14 days
It could be placed in robots.txt and function similarly to the crawl-delay directive. I won't be holding my breath waiting for it though.

jdMorgan

11:31 pm on Oct 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just a note: Websense is, by all appearances, a content filter. By blocking Websense, you may be blocking access to your site by all of their subscribers. They are probably mostly corporate subscribers, but could include some ISPs and their subscribers as well.

I'm not recommending a decision either way, but the above should inform that decision.

Jim

Mokita

11:55 pm on Oct 21, 2006 (gmt 0)

10+ Year Member



Thanks Jim - however I doubt that blocking it would adversely affect any of our sites.

But if they want to give good service to their paying clients, they need to fix their bot! I might not have noticed it but for the fact that it continually asks for long-gone pages (2 years +). It only asks for those, not any current ones except the index page.

You also would have thought they'd notice it had been eating 403s for a long time.

jdMorgan

12:11 am on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not a 'bot. It's more like a proxy. It will either request resources on behalf of their clients in real time, or it will 'check up on' resources previously requested by their clients.

Sorry, I can't be more specific; I spoke with them several months ago, and could not get any info out of them about how their filter works.

But based on how content filters work in general, their visits indicate that at least one of their clients does have (or had) an interest in your site.

On-topic: I looked into NetIQ, and they seem to be another corporate network-security-oriented firm. No telling if the NetIQ requests you're seeing are a result of their security products, or a legacy of their WebTrends purchase ca. 2004 (I believe they sold it off to someone else, and NetIQ was itself purchased -- I lost track of who owns what after just a few searches).

Jim

[edited by: jdMorgan at 12:22 am (utc) on Oct. 22, 2006]

wilderness

2:29 am on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



wilderness wrote:

If you denied and they keep coming, there's a good chance that they are just stuck in a loop (took me a long time to understand this effect).

Mokita wrote:

Would you explain the effect a bit more?

Mokita, as we speak/type, I currently have three visitors stuck in loops:

1) The newest Yahoo bot (74.6.73.***). I do NOT have Yahoo denied in my robots.txt and do have the IP range denied.
(Yahoo hasn't added a supplementary name to their many bots, therefore I'm unable to include this specific Yahoo bot in robots.txt.)

2) Two months ago the Google-Image bot began crawling my images even though there was an existing exclusion request in robots.txt for more than a few years.
I added the UA to my denies and contacted Google, and they presented a manual solution.
Later I neglected to remove the deny from my non-rewrites [deny from] (which are not allowed to read my robots.txt).
Today the Google-Image bot started crawling again, the result of somebody with a Google tool installed [72.14.194.27]. Within 20 minutes the image bot appeared [66.249.66.244].
Thus, Google-Image has continued spidering because it cannot read robots.txt (my own stupid fault).
I'm not jumping through hoops to resolve these 403s again.

3) A visitor comes to one of my pages with the referrer "#*$!X:+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++" and I have a denial exclusion in place.
The visitor goes to the website's main page (multiple times without the phony referrer, and is granted access), then goes back to the phony referrer and is denied access.

All these are stuck in loops (chasing their tail in a circle) because there is not a notification in place which will prevent them from chasing their own tail.
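Case 2 above (a denied bot that can never read the exclusion meant for it) can be pictured with a small Python sketch. This is not anyone's actual server configuration: the network range, the ALWAYS_READABLE set and the function name status_for are hypothetical, and the idea shown is simply to exempt robots.txt from the deny list so a denied crawler can still see its instructions.

```python
from ipaddress import ip_address, ip_network

# Hypothetical deny list; 192.0.2.0/24 is a documentation-only range.
DENIED_NETWORKS = [ip_network("192.0.2.0/24")]

# Paths every client may read, even when its address is on the deny list.
ALWAYS_READABLE = {"/robots.txt"}


def status_for(client_ip: str, path: str) -> int:
    """Return the HTTP status this sketch would send for a request."""
    if path in ALWAYS_READABLE:
        # Let denied bots read robots.txt so they can learn to stay away.
        return 200
    addr = ip_address(client_ip)
    if any(addr in net for net in DENIED_NETWORKS):
        return 403
    return 200


if __name__ == "__main__":
    print(status_for("192.0.2.10", "/robots.txt"))
    print(status_for("192.0.2.10", "/images/photo.jpg"))
```

Whether a given bot actually stops after reading robots.txt is, of course, up to the bot.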

In addition, today I have a 4th problem from a DE range (RIPE is denied access to my sites anyway), in which the user, with both blank referrer and UA, is denied access and has returned at least 50 times today. [88.198.43.**]

Hope this helps.

Mokita

4:51 am on Oct 22, 2006 (gmt 0)

10+ Year Member



Hope this helps.

It does. Many thanks! :)

Mokita

4:59 am on Oct 22, 2006 (gmt 0)

10+ Year Member



I looked into NetIQ...

Thanks for looking into it Jim.

-- I lost track of who owns what after just a few searches).

I did a Google search before I made my OP. I got confused about the ownership too, but was more interested in what they might be looking for. I think I'm not likely to find out.

[edited by: Mokita at 5:02 am (utc) on Oct. 22, 2006]

wilderness

5:06 am on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



First hit google

[google.com...]

Second search google:
[google.com...]

Mokita

5:24 am on Oct 22, 2006 (gmt 0)

10+ Year Member



Thanks Don - I found those sites too.

But I am none the wiser about what they are seeking in a one page site that says only that the owner died and that the site is closed.

Neither security nor market analytics would seem to be appropriate.

Even when the site was fully functional, it was just a five page static site belonging to a suburban accountant in Sydney. Not worthwhile for anyone paying anything to monitor.

wilderness

6:08 am on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Mokita,
Webtrends is marketing and marketing analysis.

This is what I read on NetIQ's pages, as I interpret it (right or wrong):

They offer monitoring of both employees and networks to make costs and production more effective. There's a handful of companies and software packages that do the same thing. I also get the impression that they filter incoming traffic (firewall) as well.

For instance, while an Aussie is at work and the solutions are in effect, the employee may not be able to get into his online TAB account or porn sites (just two examples).

thetrasher

1:45 pm on Oct 22, 2006 (gmt 0)

10+ Year Member



187 accesses by 63.88.212.164 = pdxg1n-o.netiq.com since 2004-09. UA doesn't change. It takes only the default page ("GET / HTTP/1.1"), nothing else. This bot gets a 200 together with unchanged page data and arrives 3-5 days later again.

A search for "netiq" and "IP" leads to a page that calls this server "Microsoft Exchange Spammer". What is this mail server looking for?

jdMorgan

1:59 pm on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



thetrasher,

I'm getting "pdxg1n-o.webtrends.com" as the RDNS for that IP address. So, it looks to me like this is Webtrends...
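A reverse DNS check like the one above can be scripted. The following is a minimal Python sketch using socket.gethostbyaddr from the standard library; the helper name reverse_dns and the injectable lookup parameter are assumptions added here so the function can be tried without network access. The result for any given address depends on the DNS records at the time of the query.

```python
import socket
from typing import Callable, Optional


def reverse_dns(ip: str, lookup: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, alias_list, ip_address_list).
        hostname, _aliases, _addresses = lookup(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def in_domain(hostname: Optional[str], domain: str) -> bool:
    """True if hostname is domain itself or a subdomain of it."""
    if hostname is None:
        return False
    hostname = hostname.lower().rstrip(".")
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)


if __name__ == "__main__":
    # Performs a live DNS query; the answer may differ from what the thread saw.
    print(reverse_dns("63.88.212.164"))
```

A PTR record is set by whoever controls the address block, so for anything important it is worth also resolving the returned hostname forward and checking that it maps back to the same IP.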

Jim

thetrasher

6:50 pm on Oct 22, 2006 (gmt 0)

10+ Year Member



Jim, you're right. The name changed since 2004.

63.88.212.*** - - [25/Sep/2004:**:**:** +0200] "GET / HTTP/1.1" 200 *** "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"
(...)
63.88.212.*** - - [22/Oct/2006:**:**:** +0200] "GET / HTTP/1.1" 200 *** "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"

[edited by: volatilegx at 2:47 am (utc) on Oct. 23, 2006]
[edit reason] obfuscated ip addresses [/edit]
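Counts like the "187 accesses" figure earlier in the thread can be pulled from raw logs with a short script. Below is a minimal Python sketch that assumes the logs are in Apache's combined log format, as the entries above appear to be; the sample lines, addresses and the function name hits_per_ip are made up for illustration.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "request" status size "referrer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_per_ip(lines):
    """Count requests per client IP, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


# Made-up sample entries using documentation-only addresses.
SAMPLE = [
    '192.0.2.7 - - [25/Sep/2004:10:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.7 - - [28/Sep/2004:11:30:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '198.51.100.3 - - [28/Sep/2004:12:00:00 +0200] "GET /robots.txt HTTP/1.1" 200 26 "-" "ExampleBot/1.0"',
    'not a log line',
]

if __name__ == "__main__":
    for ip, count in hits_per_ip(SAMPLE).most_common():
        print(ip, count)
```

Grouping on the "agent" or "request" groups instead of "ip" would show whether a visitor's user agent ever changes or whether it only ever asks for the default page.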