Requests for:
[error] [client 64.68.82.31] File does not exist: /public_html/maillist/maillist_signin.asp
If this is Google, have they turned their technology loose on the web to create spam mailing lists to sell, in a quest for even greater profits?
Call me paranoid or what, but this is what's out there.
What happened is simple:
- someone tried those urls while using the google toolbar
- the toolbar sent the url back to google,
- google tossed the url in the spider inbox
- attempted to spider the pages
- end of story.
eg: this was reported on webmasterworld a year ago.
[webmasterworld.com...]
eg: [webmasterworld.com...]
Most of us have been using the toolbar to "submit" new pages for over a year. Works perfectly. Simply visit the url with the toolbar and along comes googlebot. (I'm talking hundreds of cases of this where the pages are not linked anywhere).
Simply visit the url with the toolbar and along comes googlebot
I think they are only following Webmasterworld admins and mods to make them paranoid. ;)
The thread that you cite from last year is a great "must read." G-Guy has said the same thing since then.
Most of us have been using the toolbar to "submit" new pages for over a year. Works perfectly. Simply visit the url with the toolbar and along comes googlebot. (I'm talking hundreds of cases of this where the pages are not linked anywhere).
I've seen the same thing myself and tried to say so in a thread. G-guy sort of denied it. A forum member challenged me and I couldn't understand why because I (basically) found the same thing that Lisa did. I didn't know who to believe. Now I do: Lisa.
But lately I notice Googleguy using a lot of terms like "should," "ought to," "eventually" and "maybe." I understand and grasp the constraints that he operates under, but sometimes I don't see the point of chiming in just to say "don't worry, be happy" (or words to that effect.) If he cannot (in fact) reveal the Google policy at work, why say anything at all? Sometimes I feel that stuff is going on that Googleguy disagrees with, but can't say so out loud.
I have a properly constructed, non-commercial site with adequate backlinks that is five months old, and ONE page is listed in Google. For about 56 hours we had "fresh" tags and a current "cache." We still have no PR and no "links," and now we're back to a cached image circa the first week in June.
"Fresh deep bot?' Maybe if you're fighting for the top listing of selling Viagra online....
I'll say it again. :) I don't think our privacy policy prevents Google from doing this, because we are allowed to use anonymous user data to improve our search, but installing the toolbar didn't make googlebot crawl your page. See [google.com...] for some of the typical ways that urls leak. Other ways include people guessing urls, network/DNS setups, etc.
My teacher called this the "post hoc" logical fallacy ("It rained after I washed my car, so washing my car must cause it to rain!"). One of the reasons I'm here is to dispel myths; if people still want to believe myths after I've dispelled them, that's their business. ;)
Fearless, just to be extra clear: we don't do this. My personal opinion is that our privacy policy would let us. But as of right now, if someone tells you that the toolbar caused their page to be crawled, they're mistaken. Hope that's definitive enough for ya? ;)
P.S. While we're on common myths, advertising on Google doesn't cause sites to show up in the index either.
OK. I get it.
that's definitive enough for ya?
I was taught about "post hoc, ergo propter hoc" a long time ago. AND I see a lot of that confusion going on all the time in this forum. Temporality does not equal causality.
However, on my "hobby" site, I accidentally (I think) replicated Lisa's experiment and got the same result.
Many people have made posts to this forum to the effect of how their new site has been crawled and indexed in "48 hours."
Which certainly is not consistent with my experience of late. In one thread you went so far as to say "it's sort of a curve" or something to that effect. In other threads you've referred to sites linking to mine AND to "people finding your site." (You've used that phrase more than once if my feeble memory serves me right.) To me, that implies some measure of traffic.
How do you measure that?
And if you can be "definitive": do backlinks via jumpmenus count in your evaluation of a new site?
How about backlinks via php pages? How about other scripts like asp? cfm?
I'm not talking about links within my site being crawled. I mean with a new site will this type of link help us over "the curve?"
Do the hackers or Google have a magic key that gives their crawler wide open access to this /maillist/maillist_signin.asp program?
A magic key? I thought this was from a 'security news portal'?
However if I was Google (or GoogleGuy) I would certainly be concerned about the amount of negative publicity they've been attracting recently. The white knight of web search is becoming decidedly grey...
And if you can be "definitive": do backlinks via jumpmenus count in your evaluation of a new site? How about backlinks via php pages? How about other scripts like asp? cfm?
a side note: This thread title was changed (for obvious reasons), but it was changed to the same exact title as another current thread.
Referral leaking is a common way that we discover unindexed pages, but that has nothing to do with the toolbar. I still differ with you Brett, but feel free to mail me some examples (deep pages--none of this root page stuff ;).
GoogleGuy smiles and maintains his position.
I should make a t-shirt like that, but not too many people would 'get it' ;)
I understand people coming at this from a 'business' perspective when their livelihood depends on the free traffic they can muster from the SEs, but that isn't an excuse (imho) for all the whining and accusations and conspiracy theories that get bandied about.
Not all 'business' needs to be so brutal and cut-throat. Sure, some will say it's needed in a capitalistic society, but I don't think so.
Apply the integrity and 'good content' theory to your whole business perspective. It pays off, I believe.
I have to admit I didn't even read the whole article.
Anyway, that's my 2 shekels.
You mean four variables? Google will do one or two, maybe three, maybe not, but not four or more. Use mod_rewrite to make the URL into something that looks like folders instead. If any of the variables contain id= then Google will ignore the link as having a session ID.
If any of the variables contain id= then Google will ignore the link as having a session ID.
You sure? That's a bit amateurish for Googlebot - after all, 9/10 newbie database-driven pages pass "id" as a parameter referring to the ID of a row in a database,
viewproduct.asp?id=1234
"sessid" or anything with "ses" in it I can understand (or perhaps id= where the value is over a certain number of characters and contains letters aswell as numbers), but simply not crawling on "id" alone would be a bit restrictive I would have thought.
www.somedomain.com/category_list.cfm?CatID=45
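Just to illustrate the mod_rewrite suggestion above, here's a rough sketch using that category_list.cfm example. The /category/45 folder-style path and the .htaccess placement are my own assumptions for the example, not anything Google has specified:

    # .htaccess sketch (assumes Apache with mod_rewrite enabled)
    RewriteEngine On
    # map the folder-style url onto the real query-string page
    RewriteRule ^category/([0-9]+)/?$ /category_list.cfm?CatID=$1 [L]

Internal links would then point at www.somedomain.com/category/45 instead of category_list.cfm?CatID=45, so the spider only ever sees urls that look like static folders and there's no id= parameter for it to trip over.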
One of the quickest ways to find what file types Google will index backlinks from is to check pages' backlinks. They index plenty of stuff. I tend to see a much greater variety of filetypes with the different types of searches (link:, site:, etc.) rather than in regular serps.
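For anyone who wants to run the same check, the searches meant here are just the standard operators (yourdomain.com is a placeholder):

    link:www.yourdomain.com
    site:yourdomain.com

The link: results are where any script-generated or jumpmenu backlinks would show up, if Google is reporting them at all.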
That's what I was trying to say. I went back through my older established sites and checked "link" and didn't find any script or jumpmenu generated backlinks. And yet, that's clearly the future of the web.
In my case, from a "reality" perspective, those sites are the most significant ones that link to our organizational sites. Way more significant than some plain HTML links that Google does appear to pick up.
If Google is looking at backlinks from script-generated pages in their assessment of a new site (as G-guy seems to imply), they aren't showing those backlinks in the "link:" search function, at least in the sites that I checked. (And I double-checked a few known ones to make certain they were PR 4 or higher.)