Competitor in log files big time ... advice please

Forum Moderators: DixonJones



18,000 visitor sessions in one month

mayor

6:10 pm on Jan 27, 2002 (gmt 0)

My WebTrends site report for the last 30 days reports over 18,000 visitor sessions from a competitor. That's sure more than a casual peek. Can anyone advise me as to what they might be up to?

WebTrends reported them by organization name, not by IP address, so I can't find their track details in my raw log files, even though I've searched the log file by their IP address as reported by Whois. They must be visiting my site from an IP address different than the one reported by Whois.
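One way to approach this is to reverse-resolve every distinct client address in the raw log and keep the ones whose hostname falls under the competitor's domain. The following is a minimal Python sketch, not a description of how WebTrends works: it assumes the log is in Common Log Format (client address as the first field) and that the organization name came from reverse DNS, which this thread does not confirm. `competitor.example` is a placeholder, and the `resolve` parameter is a hypothetical hook so the lookup can be swapped out or tested offline.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the reverse-DNS hostname for ip, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def ips_for_domain(
    log_lines: Iterable[str],
    domain: str,
    resolve: Callable[[str], Optional[str]] = reverse_lookup,
) -> Dict[str, str]:
    """Map each client IP whose hostname falls under `domain` to that hostname."""
    domain = domain.lower().lstrip(".")
    matches: Dict[str, str] = {}
    seen = set()
    for line in log_lines:
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        ip = parts[0]
        if ip in seen:
            continue  # resolve each address only once
        seen.add(ip)
        host = resolve(ip)
        if host is None:
            continue
        host = host.lower().rstrip(".")
        if host == domain or host.endswith("." + domain):
            matches[ip] = host
    return matches
```

Passing an open log file as `log_lines` and the competitor's domain as `domain` returns the addresses worth searching for in the raw log. Lookups are slow on a large log, which is why each address is resolved only once.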

I'm very suspicious that they're up to no good. Any sleuths who might give me an idea what to look for?

TallTroll

10:35 am on Jan 28, 2002 (gmt 0)

18k sessions in 30 days? Wow, that's like 600 sessions/day (are you sure you don't mean page views?)

I don't know too much about WebTrends, but can you get it to display a report on the activity by time for that IP?

I would be concerned to see how much time they are spending per page, i.e. are they taking a detailed look at your site, or are they constantly requesting the same page or pages over and over at peak times to try to disrupt your service (a mild DoS attack)?

Check your traffic by hour of the day. If you have a couple of big spikes, find the IP they are coming from, and block it.
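If the stats package can't break traffic down this way, the raw log can. Here is a rough Python sketch that buckets requests by date and hour and reports the heaviest client in the busiest buckets. It assumes the server writes Common Log Format timestamps like `[28/Jan/2002:10:35:00 +0000]`; the function names are made up for illustration.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Common Log Format prefix: client, ident, user, [dd/Mon/yyyy:HH:MM:SS zone]
CLF_PREFIX = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [^\]]+\]'
)


def hits_by_hour(log_lines: Iterable[str]) -> Dict[Tuple[str, str], Counter]:
    """Count requests per client for every (date, hour) bucket."""
    buckets: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for line in log_lines:
        m = CLF_PREFIX.match(line)
        if m is None:
            continue  # not a Common Log Format line; ignore it
        client, day, hour = m.groups()
        buckets[(day, hour)][client] += 1
    return buckets


def busiest_hours(
    buckets: Dict[Tuple[str, str], Counter], top_n: int = 3
) -> List[Tuple[Tuple[str, str], int, str]]:
    """Return the top_n buckets as ((date, hour), total hits, heaviest client)."""
    ranked = sorted(
        buckets.items(), key=lambda kv: sum(kv[1].values()), reverse=True
    )
    return [
        (key, sum(clients.values()), clients.most_common(1)[0][0])
        for key, clients in ranked[:top_n]
    ]
```

A single client that dominates the busiest hours is the one to look at more closely before deciding whether to block it.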

DaveN

10:52 am on Jan 28, 2002 (gmt 0)

Mayor, I think you'll find you're being web jacked. I had a site ripped about a month ago and they took everything, then reposted about 60% of it under their own banner.

DaveN

mayor

3:42 pm on Jan 28, 2002 (gmt 0)

I do mean WebTrends visitor sessions, TallTroll.

I can't actually find them in my raw logs. Again, WebTrends reported them by organization name, not by IP, and I have not been able to find an IP associated with that organization that is to be found in my raw logs.

I looked for daily traffic spikes and saw none. The only spike I saw was in geographic data, by most popular cities, and that was Reston, Virginia with over 28,000 visitor sessions (9,000 of which were reported as AOL visitor sessions). Since the named competitor is not located in Reston, or even in Virginia, I'm starting to wonder if WebTrends didn't blow a fuse somehow. If it weren't for the name of a competitor showing up, I would conclude this all to be a WebTrends glitch. But a "clever" scammer might try to camouflage themselves as AOL by using an ISP located in Reston, Virginia.

Thanks for the suggestions!

TallTroll

4:14 pm on Jan 28, 2002 (gmt 0)

Hmmm, I reckon DaveN has a very good point, and the Virginia data would support that theory.

Can you see the raw referrer URL? If you are getting hundreds of hits from the same page of your competitor's site, or they are requesting the same doc or docs over and over (maybe the same image), that would also indicate you are being pagejacked.

Have you been to look at your competitor's site to see if you can spot any links to your own site?
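The referrer check can be run straight against the raw log if the server records it. Below is a small Python sketch that counts (referring host, requested path) pairs. It assumes the log is in Combined Log Format, where the referrer and user-agent are the last two quoted fields; `www.mysite.example` stands in for your own hostname.

```python
import re
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit

# Combined Log Format tail: "request" status bytes "referrer" "user-agent"
COMBINED = re.compile(r'"(?:\S+) (\S+)[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"\s*$')


def referrer_counts(log_lines: Iterable[str], own_host: str) -> Counter:
    """Count (referring host, requested path) pairs, ignoring self-referrals."""
    counts: Counter = Counter()
    for line in log_lines:
        m = COMBINED.search(line)
        if m is None:
            continue  # line is not in Combined Log Format
        path, referrer = m.groups()
        host = urlsplit(referrer).hostname
        if host is None or host == own_host:
            continue  # "-" (no referrer) or a link from your own site
        counts[(host, path)] += 1
    return counts
```

Calling `referrer_counts(lines, "www.mysite.example").most_common(20)` lists the outside pages sending the most requests. Many hits for one image or document from a single outside host would fit the pattern described above.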

Liane

4:58 pm on Jan 28, 2002 (gmt 0)

I have been having a similar problem and don't know how to go about fixing it.

somehost.affinity.com has been hitting my site hard for several months.

Webalizer version 2.01 stats for this month so far are:

Files: 18,296
KBytes: 173,343
Visits: 1,828

Any help appreciated. I have no idea how to trace this or block it or even if it should be blocked?

(edited by: Liane at 5:06 pm (utc) on Jan. 28, 2002)

john316

5:01 pm on Jan 28, 2002 (gmt 0)

Hi Dave

I was wondering if you could tell me what you mean by webjacked? Would this be where someone rips your code and posts it on their domain? Or something else like framing your content?

Thanks

EliteWeb

5:06 pm on Jan 28, 2002 (gmt 0)

Posting it as their own, using it as their own. Hell, maybe they just want to show their clients this is OUR site and this is the COMPETITORS site and say how much better theirs is... (not saying that yours is bad, since I haven't seen it... but I've done that once before...)

Damian

5:19 pm on Jan 28, 2002 (gmt 0)

Liane, is the IP for that host 216.185.159.21 [samspade.org]?

If so, you could just block the ip with an .htaccess file.
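For reference, the Apache `mod_access` directives involved are `Order`, `Allow` and `Deny`. Here is a short Python sketch that checks each address is well formed and builds the lines to paste into `.htaccess`. It assumes an Apache server of that era with `.htaccess` overrides enabled (newer Apache releases use a different access-control syntax), and the helper name is made up for illustration.

```python
import ipaddress
from typing import Iterable


def htaccess_deny_block(addresses: Iterable[str]) -> str:
    """Build mod_access style rules that allow everyone except `addresses`."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for addr in addresses:
        ipaddress.ip_address(addr)  # raises ValueError on a malformed address
        lines.append(f"Deny from {addr}")
    return "\n".join(lines) + "\n"
```

`print(htaccess_deny_block(["216.185.159.21"]))` produces the three-line rule set for the address mentioned above. With `Order Allow,Deny`, the `Deny` line is evaluated last and wins for that address.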

Liane

5:24 pm on Jan 28, 2002 (gmt 0)

Hi Damian,

There is no IP listed ... just somehost.affinity.com

TallTroll

5:27 pm on Jan 28, 2002 (gmt 0)

>> maybe they just want to show their clients this is OUR site and this is the COMPETITORS site and say how much better theirs is

So long as it's clearly marked as someone else's site/content, that shouldn't be a problem, IMO.

It's when you are presenting someone else's work and taking the credit that you have trouble. I think the test case here was the Shetland Times vs. the Shetland News, or something similar.

Basically one had been ripping off the other's stories by pulling the content into a frame, thus chopping off all of the branding, and presenting it to the user as their own journalism. They got slammed.

rcjordan

5:44 pm on Jan 28, 2002 (gmt 0)

>I think you'll find you're being web jacked

Ditto.

>what you mean by webjacked? Would this be where someone rips your code and posts it on their domain? Or something else like framing your content?

There are pageripper scripts that will pull contents from one page and write them into the page[s] that makes the call, sort of like having SSI to anything with a URL, as I understand it. There are legitimate uses for these; I'm researching the possible use of such a script to move MLS data which I'm authorized to access into a real estate site. That said, there are plenty of opportunities for mischief. A serious jacker could use a variation to serve your page to an engine yet never have it visible to the general public.

mayor

5:47 pm on Jan 28, 2002 (gmt 0)

No, TallTroll, I can't see any raw referral url associated with them. And I've been all over their site looking for something that looks like mine and can't find anything.

However, the Google cache reports "this page cannot be found" on their search terms and the site description on the Google serps is quite different from what shows up on their pages so there is some kind of cloaking or deception going on there. Digging into the Google database for one non-competitive search term alone I find 3000 doorway pages in the Google serps, and they all take me to the same page on the competitor's site through some kind of re-direction (I have to hit my browser back button twice in quick succession to back out of the doorways' target page and get back to the Google serps). So this company does appear to be engaged in big time spamming.

I don't want to name them here because they are a reputable, well-known company with nationwide markets and recognition, and I certainly don't want to tarnish their name unless I have absolute proof that they are pagejacking, stealing and spamming. If they are, I would find it shocking.

rcjordan

6:22 pm on Jan 28, 2002 (gmt 0)

>don't want to name them here

Right. WmW doesn't want that either. No specifics even if you DO nail their carcass to the wall.

Out of curiosity, could they be running some sort of price-comparison spider?

mayor

6:30 pm on Jan 28, 2002 (gmt 0)

RCJordan, are you saying they could just cloak a page and embed some ripping code in it that reads my page(s) and when a spider came around, it would serve them the contents of my page under the ripper's URL?

Any suggestions how I could determine this if the cloaked page also had the Google no-archive tag?

What I guess I need is a way to read cloaked pages.
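One partial approach is to request the same URL twice, once announcing a crawler's User-Agent and once a browser's, and compare what comes back. The Python sketch below does that with the standard library. It only catches cloaking keyed on the User-Agent header; a site that cloaks by the spider's IP address will serve the ordinary page no matter what header is sent, and dynamic pages can differ between two fetches for innocent reasons. Both User-Agent strings are illustrative examples, not values taken from this thread.

```python
import urllib.request

# Illustrative header values; substitute the crawler and browser strings
# you actually want to compare.
SPIDER_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Prepare a GET request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as(url: str, user_agent: str, timeout: float = 10.0) -> bytes:
    """Download the page body while presenting `user_agent` to the server."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def looks_cloaked(url: str) -> bool:
    """True when the spider view and the browser view of `url` differ."""
    return fetch_as(url, SPIDER_UA) != fetch_as(url, BROWSER_UA)
```

A `True` result is a reason to diff the two responses by hand rather than proof of cloaking.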

rcjordan

6:45 pm on Jan 28, 2002 (gmt 0)

>just cloak a page and embed some ripping code

Magic 8-Ball says "A definite possibility."

I think there are less exotic ways, too -but I'm not an authority on the nitty-gritty of this. Just because they are a big name, don't dismiss the fact that they might be ripping. Some guy in the web marketing department might have his job/contract on the line and decided to go over the line.

mayor

10:59 pm on Jan 28, 2002 (gmt 0)

>>could they be running some sort of price-comparison spider?

They do have hundreds of products that need to be competitively priced. Any ideas how to tell if I'm getting visits from such a 'bot ... like does anyone know the names of some price-comparison bots that I can look for in my raw logs?
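Without a list of known bot names, one starting point is to tally the user-agent strings in the raw log and see which ones stand out. Here is a minimal Python sketch, assuming Combined Log Format (user-agent as the last quoted field). The hint substrings are an assumption about how self-identifying crawlers tend to label themselves, not a list of real price-comparison bots, and a bot that sends a browser string will not be flagged.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# In Combined Log Format the user-agent is the last quoted field on the line.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

# Assumed substrings that self-identifying crawlers often include.
BOT_HINTS = ("bot", "crawl", "spider", "fetch")


def user_agent_counts(log_lines: Iterable[str]) -> Counter:
    """Count requests per user-agent string."""
    counts: Counter = Counter()
    for line in log_lines:
        m = UA_FIELD.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def likely_bots(counts: Counter, min_hits: int = 1) -> List[Tuple[str, int]]:
    """Return (user-agent, hits) pairs that contain a bot hint, busiest first."""
    return [
        (ua, hits)
        for ua, hits in counts.most_common()
        if hits >= min_hits and any(hint in ua.lower() for hint in BOT_HINTS)
    ]
```

Reading the full `user_agent_counts(...).most_common(50)` output by eye is also worthwhile, since an unfamiliar agent with a high hit count stands out even when it matches none of the hints.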

Hannu

3:56 pm on Feb 4, 2002 (gmt 0)

> WebTrends reported them by organization name, not by IP address

You can prevent WT from resolving the IP numbers by selecting "Quick mode, using format from log file" under the profile setup.

Reanalyze and check the IP nos.

Petey

3:28 am on Feb 6, 2002 (gmt 0)

Why not simply contact them? Tell them what you know. Ask them to explain.

mayor

5:17 am on Feb 6, 2002 (gmt 0)

I dug into this for a couple of days. My latest findings: it appears WebTrends was dropping a trailing zero on an AOL IP address, and what was left just happened to be the IP address of a competitor. I'm not sure of this, but after two days of intensive sleuthing I could find no substantial evidence that my competitor was pagejacking in any way, even though I did uncover a lot of doorway spamming they were up to.

I'll revisit this in a month, but for now, life has to go on.
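The trailing-zero theory can be checked mechanically: take the address WebTrends attributed to the competitor and look for logged addresses that turn into it when a final "0" is dropped. A tiny Python sketch follows; the addresses in the usage note are documentation-range placeholders, not the real ones from this case.

```python
from typing import Iterable, List


def truncation_candidates(log_ips: Iterable[str], reported_ip: str) -> List[str]:
    """Return logged IPs that become `reported_ip` when one trailing '0' is dropped."""
    return sorted(
        {ip for ip in log_ips if ip.endswith("0") and ip[:-1] == reported_ip}
    )
```

For example, `truncation_candidates(["192.0.2.120", "192.0.2.121"], "192.0.2.12")` returns `["192.0.2.120"]`. If the real log contains such an address with a matching session count, that supports the glitch explanation.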