Alexa bot intent on harvesting email addresses

Why does Alexa want email addresses?

         

surfin2u

2:14 pm on Sep 10, 2004 (gmt 0)

10+ Year Member



Alexa crawls my site every few days and shows a strong preference for pages that give out email addresses. Their bot doesn't even bother with the majority of my site's pages, even though those pages hold the content my visitors find useful.

The funny thing is that Alexa is wasting its time, because my site will not give out email addresses to any visitor unless cookies are enabled. Alexa's bot doesn't accept cookies, so every time it tries to get an email address all it gets is a warning that cookies must be enabled.

My question is about their intended use of the email addresses they're trying to obtain. Anyone know what they're up to?
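
For anyone curious how a cookie gate like this can work, here is a minimal sketch in Python. It is not the actual code behind the site; the cookie name `cookies_ok` and the function name are made up for illustration. The idea is simply that the email page shows the address only when the visitor sends back a cookie that an earlier page set.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie name, set by an earlier page on the site.
COOKIE_NAME = "cookies_ok"


def email_page_body(cookie_header: str, email: str) -> str:
    """Return the email address only if the visitor sent our cookie back."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")  # parse the raw Cookie request header
    if COOKIE_NAME in cookies:
        return f"Email: {email}"
    # A bot that ignores cookies always ends up here.
    return "Cookies must be enabled to view this email address."
```

A crawler that never returns cookies would get the warning text on every visit, which matches what shows up in the logs.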

volatilegx

8:35 pm on Sep 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't intend to offend you, but it sounds to me like your logic needs some work. If they can't see the email addresses, how does the spider know the page contains them?

Maybe it keeps accessing the page hoping an error it is encountering will be corrected. Who knows? But it can't be targeting that page for email addresses, since there are none there (at least none that the spider is able to see).

wilderness

9:14 pm on Sep 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Surfin,
Alexa honors robots.txt, with the exception of the occasional non-identified poking around that Majestic mentioned previously.

surfin2u

11:29 pm on Sep 11, 2004 (gmt 0)

10+ Year Member



No question Alexa honors robots.txt. The page that Alexa keeps returning to has only one function: to give the email address of a business listed in my directory. It is reached through a JavaScript link on the directory listing page for that business.
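
Since the bot does honor robots.txt, one option is to keep it off the email page only and leave the rest of the site open. The sketch below uses Python's standard `urllib.robotparser` to check what such a rule would allow. The `/email/` path and the example.com URLs are hypothetical, and `ia_archiver` is assumed to be the user-agent token Alexa's crawler goes by; check your own logs before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep ia_archiver out of /email/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /email/

User-agent: *
Disallow:
"""


def is_allowed(agent: str, url: str) -> bool:
    """Check whether the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())  # parse locally, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(is_allowed("ia_archiver", "http://example.com/email/123"))
    print(is_allowed("ia_archiver", "http://example.com/listing/123"))
```

This only tests the rules; the bot still has to choose to obey them.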

wilderness

12:26 am on Sep 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Perhaps Alexa is in a loop?

An alternative may be a NOINDEX, NOFOLLOW robots meta tag in the page header?
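
As a rough sketch of that suggestion, the email page could be generated with a robots meta tag in its head. The function and parameter names below are made up for illustration. Note that a crawler still has to fetch the page to read the tag, so this affects indexing and link following rather than the visits themselves.

```python
from html import escape


def email_page_html(business_name: str, body: str) -> str:
    """Build a minimal email page whose head asks robots not to index or follow."""
    return (
        "<html><head>\n"
        f"<title>{escape(business_name)}</title>\n"
        '<meta name="robots" content="noindex, nofollow">\n'
        "</head><body>\n"
        f"{escape(body)}\n"  # escape user-visible text to keep the HTML valid
        "</body></html>"
    )
```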

surfin2u

12:45 am on Sep 12, 2004 (gmt 0)

10+ Year Member



I'm not ready to tell the Alexa bot to go away yet. I appreciate the information that they publish about my site (traffic, reviews, etc.).

What I wonder about is why they keep trying (and failing) to gather email addresses instead of taking an interest in my site's content.