ArchitextSpider

It's gone mad?

         

maccas

3:07 am on Oct 24, 2001 (gmt 0)

10+ Year Member



I am currently getting hit by ArchitextSpider, but none of the pages it is requesting resemble any of my files. It is requesting files like /folder/folder/divinelove.htm and /folder/folder/bhaktishatak/sld059.htm. Has anyone else noticed this?

jeremy goodrich

5:53 am on Oct 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Somehow, Architext cruised by and grabbed my error page...without getting the 302 that is supposed to get people there. And looking over the logs, I don't see any evidence of Architext ever getting a link to the error page...or a bad link (could be wrong, though).

I guess for me the strangeness is that it's actually spidering a site now...and it seems like this is a faster update than I've seen from it in nearly a year. Odd, that...they declare bankruptcy, and then start fixing their search engine.

maccas

1:24 pm on Nov 23, 2001 (gmt 0)

10+ Year Member



Hmm, it's still going on. ArchitextSpider has been hitting about 10 pages a day for the last few months, and they are all turning up as not found. It is trying to grab files that don't even remotely resemble any of my files.
198.3.103.49 - - [23/Nov/2001:03:51:16 -0500] "GET /users/joanr/1/smartwindow/3936/carterdodge/1/263 HTTP/1.0" 302 235 "-" "ArchitextSpider"
198.3.103.65 - - [23/Nov/2001:05:36:10 -0500] "GET /users/gchamilton/1 HTTP/1.0" 302 235 "-" "ArchitextSpider"
198.3.103.65 - - [23/Nov/2001:06:47:57 -0500] "GET /users/crochetlady1/1/data/4897 HTTP/1.0" 302 235 "-" "ArchitextSpider"
199.172.149.146 - - [23/Nov/2001:01:57:34 -0500] "GET /info/promise HTTP/1.0" 302 235 "-" "ArchitextSpider"

Of the 500 or so pages it has tried to grab over the last few months, only 3 are on my server; the rest are getting redirected to my custom 404. Is anyone else noticing this?
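For anyone wanting to tally these hits rather than eyeball them, here is a rough sketch in Python. It assumes the access log is in the Apache combined format, as the entries quoted in this post appear to be. The names `LOG_PATTERN`, `spider_hits` and `summarize` are made up for illustration, and the sample lines are just the four requests from this post.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# host ident user [time] "method path protocol" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def spider_hits(lines, agent="ArchitextSpider"):
    """Yield the parsed fields of every line whose user agent contains `agent`."""
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match and agent.lower() in match.group("agent").lower():
            yield match.groupdict()


def summarize(lines, agent="ArchitextSpider"):
    """Count the spider's requests by HTTP status code and by source IP."""
    by_status = Counter()
    by_host = Counter()
    for hit in spider_hits(lines, agent):
        by_status[hit["status"]] += 1
        by_host[hit["host"]] += 1
    return by_status, by_host


# The four requests quoted in this post, one entry per line.
SAMPLE = [
    '198.3.103.49 - - [23/Nov/2001:03:51:16 -0500] "GET /users/joanr/1/smartwindow/3936/carterdodge/1/263 HTTP/1.0" 302 235 "-" "ArchitextSpider"',
    '198.3.103.65 - - [23/Nov/2001:05:36:10 -0500] "GET /users/gchamilton/1 HTTP/1.0" 302 235 "-" "ArchitextSpider"',
    '198.3.103.65 - - [23/Nov/2001:06:47:57 -0500] "GET /users/crochetlady1/1/data/4897 HTTP/1.0" 302 235 "-" "ArchitextSpider"',
    '199.172.149.146 - - [23/Nov/2001:01:57:34 -0500] "GET /info/promise HTTP/1.0" 302 235 "-" "ArchitextSpider"',
]

if __name__ == "__main__":
    by_status, by_host = summarize(SAMPLE)
    print("Requests by status:", dict(by_status))
    print("Requests by source IP:", dict(by_host))
```

To run it against a real log, pass an open file object to `summarize` in place of `SAMPLE`. Lines that do not fit the assumed format are silently skipped, so a different log layout would need a different pattern.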

mr_dredd2

2:25 pm on Nov 25, 2001 (gmt 0)

10+ Year Member



Index pages on a load of my sites have now been picked up by Excite and are present in the results. Only index pages, mind.

Key_Master

3:17 pm on Nov 25, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



maccas,

I can track nearly every one of those requests to gencircles.com:

[google.com...]

Also:
[google.com...]

I'm curious, are you sharing an IP with them, or did you just forget those files existed? :)

anathema

11:23 am on Dec 9, 2001 (gmt 0)



I've been getting the same requests on a few of my sites for files that never existed, also in that same directory, "bhaktishatak". That's actually how I found this thread: I did a search on Google to try and track down where this "bhaktishatak" was originating from, and I found this thread instead. I still have no idea where these requests are originating from... if you figure it out, I would like to hear about it.

maccas

1:38 pm on Dec 9, 2001 (gmt 0)

10+ Year Member



I fired off an email to Excite about this 10 days ago; no reply as yet. Key_Master, no, I have my own IP and these files have never existed on my site. I would remember if I had divinelove :)

Key_Master

3:28 pm on Dec 9, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Man I'm good...#1 listing on Google and I didn't even try. "bhaktishatak" is a popular keyword...right? ;)

I still believe there is a connection. Would take more snooping to find the answer.