Altered file names not of my doing.

The addition of "..." causes 404 error code?

         

pendanticist

7:14 am on Oct 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is the first time I've seen a pattern develop this way.

  • This request somehow has three dots inserted where I do not have them. (I should point out that this individual has previously gotten him/herself banned, hence the 403 rather than a 404.)

    24.173.210.90 - - [27/Oct/2003:08:25:45 -0800] "GET /Human_Resou...Labor.html HTTP/1.1" 403 480 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

    Human_Resource_Management-Organized-Labor is the correct file name.

    10/28/03 22:41:03 IP block 24.173.210.90
    Trying 24.173.210.90 at ARIN
    Trying 24.173.210 at ARIN

    OrgName: ROADRUNNER-COMMERCIAL-SOUTHWEST
    OrgID: RCSW
    Address: 13241 Woodland Park Road
    City: Herndon
    StateProv: VA
    PostalCode: 20171
    Country: US

    NetRange: 24.173.0.0 - 24.173.255.255
    CIDR: 24.173.0.0/16
    NetName: RR-COMM-SOUTHEAST
    NetHandle: NET-24-173-0-0-1
    Parent: NET-24-0-0-0-0
    NetType: Direct Allocation
    NameServer: NS1.BIZ.RR.COM
    NameServer: NS2.BIZ.RR.COM
    NameServer: DNS4.RR.COM
    Comment:
    RegDate: 2003-03-17
    Updated: 2003-07-30


  • Note how the first request is honored with a 200 code. Then, 84 seconds later, the same visitor requests a file that returns a 404, only to go on to three other files and get the same 404 served up.

    68.39.128.24 - - [28/Oct/2003:17:34:07 -0800] "GET /Aboriginal_Tribes-Councils_A-O.html HTTP/1.1" 200 20031 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    -----------------------------------------------
    68.39.128.24 - - [28/Oct/2003:17:35:31 -0800] "GET /Aboriginal_...s_A-O.html HTTP/1.1" 404 2847 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    -----------------------------------------------
    68.39.128.24 - - [27/Oct/2003:17:40:50 -0800] "GET /Aboriginal_...ation.html HTTP/1.1" 404 2847 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
    -----------------------------------------------
    68.39.128.24 - - [28/Oct/2003:19:57:48 -0800] "GET /Aboriginal_...s_P-Z.html HTTP/1.1" 404 2847 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

    The three dotted requests are odd not only because they ask for files I don't have, but because, on coming back, the visitor requests files a human would know are bound to return 404s!

    So, maybe we rule out a human?

    10/28/03 22:51:59 IP block 68.39.128.24
    Trying 68.39.128.24 at ARIN
    Trying 68.39.128 at ARIN
    Comcast Cable Communications, Inc. JUMPSTART-1 (NET-68-32-0-0-1)

    68.32.0.0 - 68.63.255.255
    Comcast Cable Communications, Inc. NJ-NORTH-8 (NET-68-39-128-0-1)

    68.39.128.0 - 68.39.255.255

    # ARIN WHOIS database, last updated 2003-10-28 19:15
    # Enter ? for additional hints on searching ARIN's WHOIS database.


  • Even ia_archiver (a bot) gets into the picture somehow.

    209.237.238.175 - - [28/Oct/2003:19:35:55 -0800] "GET /... HTTP/1.0" 404 2847 "-" "ia_archiver"

    10/28/03 23:20:47 IP block 209.237.238.175
    Trying 209.237.238.175 at ARIN
    Trying 209.237.238 at ARIN
    United Layer, Inc. UNITEDLAYER-1 (NET-209-237-224-0-1)
    209.237.224.0 - 209.237.255.255
    Alexa Internet ALEXA-INTERNET (NET-209-237-237-0-1)
    209.237.237.0 - 209.237.238.255

    # ARIN WHOIS database, last updated 2003-10-28 19:15
    # Enter ? for additional hints on searching ARIN's WHOIS database.

    Given the spread across the various ISPs and IP numbers, I can't help but wonder whether this is somehow propagating, whether among a few friends for now or more widely in the future.

    <aside>
    Some months ago I had a similar situation. During a routine inspection of my access_log files, I began to notice that all my file names were mysteriously showing up in all-lower-case form and returning 404s during failed attempts at ripping my site.

    At first it was just one IP number. Within a few weeks I was getting hit periodically from ten geographically different places, all following the same lower-case pattern. It wasn't long before I was getting hit by an increasing number of people from around the world.

    To me that suggested distribution, however faulty the database was.

    It stopped abruptly.
    </aside>

  • Where'd those three dots come from?

  • How would you explain the variety of IP Numbers in such a relatively short period of time?

  • Are these perchance little tykes exercising their newfound skills?

    If someone would enlighten me, I'd be...well, enlightened. :)

    Thanks.

    Pendanticist.

  • marcs

    7:20 am on Oct 29, 2003 (gmt 0)

    10+ Year Member



    Could this be search engines adding the "..." to the URL in their results (if it is a long URL)?

    Obviously, the actual link/URL provided by those search engines would be correct. I'm talking about the URL they would show below the result for the site.

    Maybe some bot is using that text version of the URL instead and is getting the wrong URL.
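To make that hypothesis concrete, here is a minimal Python sketch of how a careless scraper could end up with the dotted path. It collects both the `href` and the visible text of each link. The HTML snippet, the `LinkCollector` class, and the `collect_links` helper are hypothetical illustrations; nothing in the thread confirms that any page actually displays the link this way.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, visible_text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical listing: the href is correct, the displayed text is shortened.
SAMPLE_HTML = (
    '<p><a href="/Home_Schooling_Kids-Stuff.html">'
    "/Home_School...Stuff.html</a></p>"
)

for href, text in collect_links(SAMPLE_HTML):
    print("href:", href)
    print("text:", text)
```

A client that requests the `text` value instead of the `href` value would send exactly the kind of dotted request seen in the logs.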

    pendanticist

    1:23 pm on Oct 29, 2003 (gmt 0)




    <shrug> I really have no clue.

    Pendanticist.

    pendanticist

    5:37 am on Oct 30, 2003 (gmt 0)




    This new visitor sent three dots in place of part of the correct file name (Home_Schooling_Kids-Stuff.html).

    24.193.205.42 - - [29/Oct/2003:15:07:18 -0800] "GET /Home_School...Stuff.html HTTP/1.1" 404 2847 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

    Where are they getting this three-dotted thing from?

    Do tell! It seems he/she came back exactly fourteen minutes later, only this time requesting the file correctly.

    24.193.205.42 - - [29/Oct/2003:15:21:18 -0800] "GET /Home_Schooling_Kids-Stuff.html HTTP/1.1" 200 17432 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

    10/29/03 21:55:42 IP block 24.193.205.42
    Trying 24.193.205.42 at ARIN
    Trying 24.193.205 at ARIN

    OrgName: ROADRUNNER-NYC
    OrgID: RRNY
    Address: 13241 Woodland Park Road
    City: Herndon
    StateProv: VA
    PostalCode: 20171
    Country: US

    NetRange: 24.193.0.0 - 24.193.255.255
    CIDR: 24.193.0.0/16
    NetName: ROADRUNNER-NYC-3
    NetHandle: NET-24-193-0-0-1
    Parent: NET-24-0-0-0-0
    NetType: Direct Allocation
    NameServer: DNS1.RR.COM
    NameServer: DNS2.RR.COM
    NameServer: DNS3.RR.COM
    NameServer: DNS4.RR.COM
    Comment: ADDRESSES WITHIN THIS BLOCK ARE NON-PORTABLE
    RegDate: 2002-04-05
    Updated: 2002-11-25

    <snipo addies>
    # ARIN WHOIS database, last updated 2003-10-29 19:15
    # Enter ? for additional hints on searching ARIN's WHOIS database.

    Any ideas how, or why this occurs?

    Perhaps a trickle-down-effect of some kind?

    My curiosity is still piqued with respect to ia_archiver too. How would an established bot, which I assume crawls established URLs, get hold of this, much less start following it?

    Is there now an established list out there containing tons and tons of 404s that might give someone else (like Google, if it propagates any further) the wrong impression?

    Could someone be trying a re-direct of some sort?

    Is there any possibility these two events are somehow connected [webmasterworld.com]?

    <shrug> Dunno, just asking.... :)

    Thanks.

    Pendanticist.
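One way to track how far this spreads is to pull every dotted request out of the access log and group the hits by IP. The sketch below assumes the log is in the combined format quoted in this thread. `LOG_RE` and `dotted_requests` are illustrative names, and the two sample lines are the 24.193.205.42 entries from the post above.

```python
import re
from collections import defaultdict

# Pattern for one line of a combined-format access log, as quoted in the thread.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def dotted_requests(lines):
    """Map each client IP to its requests whose path contains '...'."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and "..." in match.group("path"):
            hits[match.group("ip")].append(
                (match.group("time"), match.group("path"), int(match.group("status")))
            )
    return dict(hits)


SAMPLE_LINES = [
    '24.193.205.42 - - [29/Oct/2003:15:07:18 -0800] "GET /Home_School...Stuff.html HTTP/1.1" '
    '404 2847 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"',
    '24.193.205.42 - - [29/Oct/2003:15:21:18 -0800] "GET /Home_Schooling_Kids-Stuff.html HTTP/1.1" '
    '200 17432 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"',
]

for ip, requests in dotted_requests(SAMPLE_LINES).items():
    for when, path, status in requests:
        print(ip, when, path, status)
```

Run against a real log by passing an open file object in place of `SAMPLE_LINES`. Lines that do not match the pattern are skipped silently, so a different log format would need its own expression.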

    BlueSky

    7:05 am on Oct 30, 2003 (gmt 0)

    10+ Year Member



    On my site, I shorten long URLs in a similar fashion. Some of the premade forums, CMSs, weblogs, and other scripts do the same thing. If a visitor copies the text of such a link into his browser's address bar instead of clicking on it, that would produce the same effect you're seeing. I'm not sure that is what is happening here. I recommend you search for those pages to see if anyone has linked to them and is using some sort of technique to shorten them.
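The kind of shortening described above can be sketched in a few lines of Python. This is only a guess at how such a script might work. `shorten_middle` is a hypothetical helper, and the default lengths (12 leading and 10 trailing characters) are assumptions chosen solely because they reproduce the `/Home_School...Stuff.html` request quoted earlier in the thread. The thread does not identify which script, if any, produced it.

```python
def shorten_middle(url: str, head: int = 12, tail: int = 10, marker: str = "...") -> str:
    """Replace the middle of a long URL with a marker for display purposes."""
    if len(url) <= head + tail + len(marker):
        return url  # short enough to display unchanged
    # Slice from an explicit index so that tail=0 keeps nothing at the end.
    return url[:head] + marker + url[len(url) - tail:]


real_path = "/Home_Schooling_Kids-Stuff.html"
display_text = shorten_middle(real_path)

print(display_text)  # text shown on the page
print(real_path)     # what the link's href would still point to
```

If the shortened string is only the link text and the `href` keeps the full path, clicking works fine. Only copying the visible text, or a script that reads the text instead of the `href`, would request the dotted name and get the 404.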