Altavista is looking at style sheets

Scooter is grabbing our css files

         

Son_House

4:41 am on Jan 18, 2002 (gmt 0)

10+ Year Member



Not sure if this is old news or not but I don't remember seeing it here.

Looking at yesterday's logs I noticed Altavista is grabbing css files.

trek13.sv.av.com - - [16/Jan/2002:03:20:29 -0500] "GET /css/a.css HTTP/1.0" 200 1227 "-" "Scooter-W3.1.2"
trek13.sv.av.com - - [16/Jan/2002:14:46:00 -0500] "GET /css/b.css HTTP/1.0" 200 994 "-" "Scooter-W3.1.2"
trek13.sv.av.com - - [16/Jan/2002:16:16:32 -0500] "GET /css/c.css HTTP/1.0" 200 1863 "-" "Scooter-W3.1.2"
trek13.sv.av.com - - [16/Jan/2002:18:51:18 -0500] "GET /css/a.css HTTP/1.0" 200 1227 "-" "Scooter-W3.1.2"
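
For anyone who wants to check their own logs for the same thing, here is a rough sketch that counts stylesheet requests from a given crawler. It assumes the log is in the usual combined format shown above; the function name `css_hits_by_agent` and the regex are illustrative only, not part of any tool mentioned in this thread.

```python
import re
from collections import Counter

# Matches one line of a combined-format access log (an assumption about
# the log layout, based on the sample lines quoted in the post above).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def css_hits_by_agent(lines, agent_substring="Scooter"):
    """Count .css requests per path for user agents containing agent_substring."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        is_css = match.group("path").lower().endswith(".css")
        if is_css and agent_substring in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

# The four log lines quoted above.
sample = [
    'trek13.sv.av.com - - [16/Jan/2002:03:20:29 -0500] "GET /css/a.css HTTP/1.0" 200 1227 "-" "Scooter-W3.1.2"',
    'trek13.sv.av.com - - [16/Jan/2002:14:46:00 -0500] "GET /css/b.css HTTP/1.0" 200 994 "-" "Scooter-W3.1.2"',
    'trek13.sv.av.com - - [16/Jan/2002:16:16:32 -0500] "GET /css/c.css HTTP/1.0" 200 1863 "-" "Scooter-W3.1.2"',
    'trek13.sv.av.com - - [16/Jan/2002:18:51:18 -0500] "GET /css/a.css HTTP/1.0" 200 1227 "-" "Scooter-W3.1.2"',
]

for path, count in css_hits_by_agent(sample).most_common():
    print(count, path)
```

To run it against a real log, pass an open file object instead of `sample`; the function only needs something it can iterate over line by line.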

grnidone

6:56 pm on Jan 20, 2002 (gmt 0)



Why would it do this? To check for hidden links?

rogerd

1:36 pm on Jan 22, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



It could be experimental, or it could be to look for things like hidden text or H1 tags in 8 point type. AV has been so dead lately, it's hard to imagine that they'd be investing a lot of resources in sophisticated spam detection. I think a little work on the basic algo to return better SERPs would be a better use of techie time.

william_dw

6:44 pm on Jan 22, 2002 (gmt 0)

10+ Year Member



Just a thought,
but they might be grabbing style sheets to get their enterprise product(s) in top form and then go the way of NL?

For the first time today a server we manage went down from memory overuse. The second the server came back online, a pile of connections from
trek28.sv.av.com:44285
trek28.sv.av.com:44360

etc. started coming in,
so I guess they haven't perfected the whole idea of not requesting every page from a server at the same moment in time.

Marcia

12:23 pm on Jan 28, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



64.152.75.52 - - [28/Jan/2002:05:01:21 -0500] "GET /directory-name/stylesheets/global.css HTTP/1.0" 200 939 "-" "Scooter-W3.1.2"

Marcia

6:03 pm on Jan 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Again this morning, but looked for robots.txt first

64.152.75.52 - - [30/Jan/2002:08:24:01 -0500] "GET /robots.txt HTTP/1.0" 200 1308 "-" "Scooter-W3.1.2"
64.152.75.52 - - [30/Jan/2002:08:24:01 -0500] "GET /directory-name/stylesheets/global.css HTTP/1.0" 200 939 "-" "Scooter-W3.1.2"

Alta has been very active on another site, going a few directories deep and grabbing graphics and useless HTML pages displaying full backgrounds. Just that site, not the same activity on others.

Crazy_Fool

8:16 am on Feb 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



i noticed this as well. i also noticed another new scooter UA which is grabbing images. more here [webmasterworld.com...]

Marcia

8:27 am on Feb 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Again yesterday - taking the stylesheets right along with the pages.

crash

5:29 pm on Feb 1, 2002 (gmt 0)

10+ Year Member



hm, do any of you have your style sheets disallowed in your robots.txt? just wondering if they are ignoring that. could be that their bot is on drugs and pulling every link?
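
One way to test the question above on your own site is to check what your robots.txt actually permits for that user agent, then compare the answer with the log. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt contents and the helper name `is_allowed` are made up for illustration, and it only shows what a parser following the usual rules would conclude; it says nothing about how Scooter itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a crawler from the stylesheet directory.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Scooter
Disallow: /css/
"""

def is_allowed(robots_txt, user_agent, path):
    """Return True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

# User agent string as it appears in the logs quoted in this thread.
agent = "Scooter-W3.1.2"

for path in ("/css/a.css", "/index.html"):
    verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, agent, path) else "disallowed"
    print(path, verdict)
```

If a path comes back disallowed here but still shows up in the log as a 200 for that agent, that would suggest the bot is not honoring the rule.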

Crazy_Fool

1:52 am on Feb 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if i want to hide css files from robots or anyone else i can always keep them at ../ or ../private/ (below web root) where nobody can get at them directly over the net. i don't normally do anything fancy with css so there's nothing to hide.

the one thing that bugs me though, is that despite the sheer amount of spidering from scooter, altavista doesn't update its index ....