Google favouritism?

Are companies getting paid or just being sensible?


Pricey

9:58 am on Jul 11, 2003 (gmt 0)

10+ Year Member



Whilst rummaging through some robots.txt files, I came across this on macromedia.com:

#
# This file is used to allow crawlers to index our site.
#
# List of all web robots: [robotstxt.org...]
#
# Check robots.txt at:
# [searchengineworld.com...]
#

# Details about Googlebot available at: [google.com...]
# The Google search engine can see everything
User-agent: Googlebot
Disallow:

# All other robots will be restricted from accessing the Google-specific index pages
User-agent: *
Disallow: /google_indexing/

What's this all about? Are Macromedia getting backhanders from Google, or vice versa, or something?

-or-

Are Macromedia just being sensible by using the biggest SE?
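
For what it's worth, here's a minimal sketch (Python 3, urllib.robotparser) of how a standards-following crawler would read those two groups; the rules are pasted inline and the paths are just made-up examples:

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /google_indexing/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot matches its own group; an empty Disallow means nothing is off limits.
print(parser.can_fetch("Googlebot", "/google_indexing/page.html"))  # True

# Every other bot falls through to the * group and is kept out of that directory.
print(parser.can_fetch("Slurp", "/google_indexing/page.html"))      # False
print(parser.can_fetch("Slurp", "/index.html"))                     # True

In other words, the file doesn't hand Google anything secret by itself; it only tells non-Google bots to stay out of /google_indexing/.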

lazerzubb

8:25 am on Jul 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not sure what that is about, but my guess would be it has something to do with the site-specific search that Macromedia has, which is powered by Google.

Powdork

8:38 am on Jul 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I read about this once in a book called Cloaking your way to PR 10. Evidently all the search engines like pepperoni pizza, except Google, which likes sausage and pepperoni.

<added>Forgot to put the sarcastic smiley</added> ;)

Pricey

11:18 am on Jul 14, 2003 (gmt 0)

10+ Year Member



lol.

Interesting though. I'm gonna look out for more sites that have this.

Herenvardo

2:44 pm on Jul 14, 2003 (gmt 0)

10+ Year Member



I've visited the Macromedia site and looked for google_indexing/. I was unable to find this folder, so it raises a question:
- Can a search engine access content that is not available to normal users?
If the answer is no, that robots.txt file is very suspicious.
If the answer is yes, I'm very troubled:
- Can a bot access the private files of my website?
- Is this legal?
- Weren't sites that created different content for the bots and for the users penalized? (like invisible links to hide from users, or JS ones to hide from SEs)

In any case, I think that Google is very suspicious. I hope GoogleGuy will answer some of these questions, unless they have no answer. I won't deny their innocence if they are not proved guilty; but, even so, too many unanswered questions will make me (and probably others) lose my confidence in Google.

scoobontour

3:18 pm on Jul 14, 2003 (gmt 0)

10+ Year Member



Could be that the Macromedia site is, of course, completely written in Flash, so search engines like Google cannot read it. Therefore Macromedia write index files into a directory that Google uses to index their pages? There aren't many sites that are completely written in Flash, so maybe Macromedia has an arrangement with Google to pick up the content via another directory? Not sure there is anything wrong with that; if I had a site as important as Macromedia's (and written in Flash), I guess I would contact them as well...

ciml

3:31 pm on Jul 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Herenvardo, search engines often get to access content that is not available to normal users. It's called cloaking, and it's often used to hide things like shopping cart 'buy now' links that would otherwise send the engines into endless loops.

> Can a bot access the private files of my website?
> Is this legal?

You mean can Googlebot crack into your hosting account and read your files? No, you can be pretty sure that Google wouldn't want to (they even adhere to the Robots Exclusion Protocol). On the other hand, I suspect that the Google Search Appliance [google.com] could be tweaked to spider whatever parts of your site you wanted it to, so that may be what Macromedia were thinking of doing.

> Weren't sites that created different content for the bots and for the users penalized?

Yes, sometimes cloaking is used to get a page indexed for words that are popular, but not directly related. This annoys search engines, but is very uncommon these days.
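
To picture the benign version of that, here's a minimal sketch (plain Python; the crawler list and function names are made up for illustration) of a server deciding by User-Agent whether to send the cart links or a stripped-down, index-friendly page:

KNOWN_CRAWLERS = ("googlebot", "slurp", "msnbot")  # illustrative list only

def render_full_page() -> str:
    # Normal visitors get the cart link with a per-visitor session id.
    return '<a href="/cart/add?item=42&sid=abc123">Buy now</a>'

def render_crawler_page() -> str:
    # Crawlers get the same product copy but no cart links, so they never
    # wander into an endless loop of session-id URLs.
    return "<p>Product 42: description for indexing.</p>"

def serve_product_page(user_agent: str) -> str:
    ua = user_agent.lower()
    if any(bot in ua for bot in KNOWN_CRAWLERS):
        return render_crawler_page()
    return render_full_page()

print(serve_product_page("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))
print(serve_product_page("Mozilla/4.0 (compatible; MSIE 6.0)"))

Swap the two page bodies for two different robots.txt files and you have the "show a different robots.txt to Google" trick mentioned later in the thread.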

pixel_juice

3:38 pm on Jul 14, 2003 (gmt 0)

10+ Year Member



This may be a silly question, but if the /google_indexing/ pages mentioned above actually existed, wouldn't they be in the index?

To me that robots file looks like a template (hence all the comments) that is now outdated, like how CNN ban a whole bunch of pages and directories that no longer exist because their robots file is so out of date.

One other thing - someone the size of Macromedia doesn't need to 'cloak' using robots.txt. They could just do IP/user-agent cloaking like everybody else. They could even show a different robots.txt to Google if they wanted.

>> I won't deny their innocence if they are not proved guilty

I don't buy the conspiracy theory. I'm struggling to find any evidence that Google is guilty of anything because of Macromedia's robots file. If Google were corrupt or favouring certain sites or companies, I'd expect them to be a little more technically accomplished than this ;)

Why would they mess around with robots files when they could just be giving out PR, or using any number of much more effective and much harder to detect measures?

Pricey

4:06 pm on Jul 14, 2003 (gmt 0)

10+ Year Member



Well, Macromedia.com is one of the very few sites that have a PR10.

Maybe this has something to do with it?

Receptional Andy

4:07 pm on Jul 14, 2003 (gmt 0)



>> Maybe this has something to do with it?

Or perhaps the fact that practically every site that uses any Flash whatsoever has a link to macromedia.com? ;)