Odd link high up in search on All The Web...Can anyone explain?
| 4:59 am on Sep 1, 2001 (gmt 0)|
Wondering if anyone can help explain:
I've encountered an odd link returned as a search result, originally on Lycos (where it has since gone) but still present on All The Web.
The link occurred high up in a search, initially using a string consisting of the name of someone I know, who has a commercial web presence. The page is unobtainable (404).
The search result lists the file as "No Title", with the description header made up of gibberish, primarily a's, albeit with a noticeable pattern (initially 4,5,6,7,8). The reported size is 3 MB (!).
Safebrowsing simply tells me content length = 461, content type = text/html, and that the page has a robots-related tag <meta name="robots" content="noindex">.
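For illustration, a minimal sketch of how a robots meta tag like the one quoted above can be read out of a page's HTML. It uses only Python's standard library. The names `RobotsMetaParser` and `robots_directives` and the sample markup are made up for this example; the real contents of findme.htm are unknown.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def robots_directives(html):
    """Return the list of robots directives found in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample page, not the real findme.htm.
sample = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print(robots_directives(sample))
```

A "noindex" directive asks a compliant crawler not to list the page, so a page that carries it and still shows up in results is unusual in itself.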
The file is called findme.htm and is/was on a server belonging to a site design and hosting company here in the UK.
I know that findme's can be placed on the web for someone to find (message related), but have also found reference to them in an Active VB tutorial regarding a file designed to access info in a database.
Out of curiosity I continued to "search" the link, using a cumulative search string that so far consists of 144 words. These include names (e.g. mine), places (e.g. city names), adjectives, verbs and swear words. All of these only work as isolated words, not phrases, and there are no "connectives" (is, and, it, a, etc.), which presumably rules out the idea that the words I've used came from sentences on a leading page...
Oddly, the search string seems to be case sensitive. Search engine strings aren't supposed to be case sensitive, are they?
Equally oddly, the result disappears if I use an "English language only" filter (all the words I've used are English words).
I have considered contacting the sys. admin. for the host, but would like to know why this is occurring in the first place. Presumably if I tell them, they'll just remove it.
Can a page be deliberately designed to come up in a search such as the one I've described, yet be inaccessible itself? Is it most likely that it has been removed, or could it still be present somehow, just inaccessible and returning a 404 error? I have read a little about meta tagging, doorway pages, page cloaking, etc., but the iterative string I've used is ridiculously long and contains a wide variety of content...
I posted a similar query in the Foo section. Someone replied suggesting a possible cloaked page and redirect, but I need further explanation, including why those particular words are in the string (a very odd string). What's the value? And why case sensitive, etc.?
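For readers unfamiliar with the term, a minimal sketch of what "cloaking" generally means: the server inspects the requesting User-Agent and gives crawlers one response and ordinary browsers another. Everything here is an assumption for illustration (the `CRAWLER_MARKERS` list, the `respond` function, the user-agent strings and the response bodies); nothing in it is known about the actual server behind findme.htm.

```python
# Substrings assumed to identify a search engine crawler (hypothetical list).
CRAWLER_MARKERS = ("crawler", "spider", "bot")


def respond(user_agent):
    """Return (status_code, body) depending on who appears to be asking."""
    ua = user_agent.lower()
    if any(marker in ua for marker in CRAWLER_MARKERS):
        # Crawlers get a word-heavy page meant only for the index.
        return 200, "<html><body>word1 word2 word3</body></html>"
    # Everyone else is told the page does not exist.
    return 404, "Not Found"


print(respond("ExampleCrawler/1.0"))
print(respond("Mozilla/4.0 (compatible; MSIE 5.5)"))
```

Under that kind of setup a page could stay in an index while visitors only ever see a 404, though an ordinary removed page would look the same from the outside.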
This is really bothering me. Can anyone elucidate what might be happening here?
| 9:08 pm on Sep 2, 2001 (gmt 0)|
>Search engine strings aren't supposed to be case sensitive are they?
Sure, some are. Alta is very particular about case.
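One way an engine can be "particular about case", sketched under an assumed rule rather than any engine's actual code: an all-lowercase query word matches any capitalisation, while a word containing capitals must match exactly. The function name `term_matches` is made up for this example.

```python
def term_matches(query_term, indexed_term):
    """Assumed rule: lowercase queries ignore case, mixed-case queries are exact."""
    if query_term == query_term.lower():
        # All-lowercase query: compare without regard to case.
        return query_term == indexed_term.lower()
    # Query contains capitals: require an exact match.
    return query_term == indexed_term


print(term_matches("london", "London"))  # True
print(term_matches("London", "london"))  # False
```

With a rule like that, the same word can hit or miss depending purely on how it is typed.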
>particular words in the string
A web dump. Just scraping and refeeding an SE to see what "bites". It's like fishing - words are the bait.
Could it have been a web log file that is no longer on the web?
Without more info - no way to determine what it really was.