
Subhosts and Google

Google's Spidering and Penalizing of Subhosts

         

james_a

2:06 am on Apr 7, 2003 (gmt 0)

10+ Year Member


Hi, I'm new here so bear with me.

I have a site which provides subhosting of sites (through a managed interface), so that client sites get published either to a subdomain like www.theirname.mydom.suffix or to a domain they have purchased. At the bottom of each site I place a small, very discreet "provided by..." link back to my main site.

There are some things which I'm wondering...
1. Does Google ignore subdomains, even if their content is completely different from the base site?

2. Does Google penalise the "provided by" link at the bottom if, say, 20 sites have exactly the same one, or does each count as a referring link (and increase the rating)?

3. If the link back were a small 2px image linking to the base site instead of a text link, would Google penalise me?

4. Most of the sub-sites keep their main navigation in JavaScript menus which appear when you move your mouse over an image. The code for showing/hiding the menus is in one JS file, which is called from the current HTML file, and the link names and locations are stored in arrays in a second JS file. Would these links be picked up at all, or would all these sub-links be discarded completely? Even if Google reads the JS files, would it pick them up, given that they are stored as variables?

i.e.
menuitem[0][1][1] = "index5.html";

These don't get compiled into an <a href=...> format until the page is loaded, and they aren't made visible (using div/layers) until the image fires a mouseover event. There is also a different functions script depending on the user agent, and I don't think Googlebot is listed as one...

Does anyone have any ideas how to get these links processed?
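In case it helps, here is a stripped-down sketch of the pattern (the file and variable names are simplified, not my actual code):

```javascript
// menu-data.js - link labels and destinations kept as plain array data
var menuitem = [];
menuitem[0] = [];
menuitem[0][1] = [];
menuitem[0][1][0] = "Products";    // label shown in the menu
menuitem[0][1][1] = "index5.html"; // destination page

// menu-code.js - only at page load does this turn the data into a link,
// so the <a href=...> never appears in the raw HTML that a spider sees
function buildMenuLink(item) {
    return '<a href="' + item[1] + '">' + item[0] + '</a>';
}
```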

5. Google can't read links in Flash movies, can it? So a Flash intro with no "skip intro" link essentially kills that site in terms of further spidering, doesn't it?

6. How damaging is a 404 error from a link on your site?

I haven't had much to do with search engines, apart from some feeble attempts in the past which failed unless I typed the exact site name in quotes.

Hope someone can help,
Thanks James.

jdMorgan

3:32 am on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



james_a,

Welcome to WebmasterWorld [webmasterworld.com]!

1) No. Otherwise, you might find search.google.com to have a PR0.
2) It's a link. Is it relevant to visitors? I would not do this 20 times. I might do it 5 times.
3) Can't recommend "invisible" links - Read Google's guidelines.
4a) Google doesn't process client-side scripts like JavaScript.
4b) Yes, provide a <noscript> section, and then use SSI, PHP, or whatever to reproduce the links for non-JavaScript-enabled user-agents. Turn JavaScript off in your browser, and work on it until it looks good and is usable. 10% of internet users surf with JS disabled. If you are very rich, you can afford to lose them. If not, don't.
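For example (the page names are placeholders - use your real navigation targets), the fallback can be as simple as:

```html
<!-- scripted menu for JS-enabled visitors -->
<script type="text/javascript" src="menu-code.js"></script>

<!-- plain links for spiders and non-JS user-agents -->
<noscript>
<a href="index5.html">Products</a> |
<a href="contact.html">Contact</a>
</noscript>
```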
5) If there are incoming links to the flash page and only to the flash page, then yes, that can "kill" the site. But remember that spiders look at pages, not sites. If an alternate entry point is provided, and Google finds links to that alternate entry point, then the pages that are linked from the alternate entry point will get spidered.
6) Well, that would depend on how prominent the bad link was. One bad link on a page with 4 links is pretty bad. One out of 100 is not so bad. But fix it anyway. Implement a custom 404 handler or redirect the bad link to something similar, to a site index, or to your home page. (I feel that a 404 means the webmaster forgot something. We have 301, 302, and 410 as far more useful alternatives.)
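For instance, on an Apache server (the filenames here are placeholders), both suggestions are one line each in a .htaccess file:

```
# send the known-bad URL to its nearest equivalent
Redirect 301 /old-page.html /new-page.html

# catch anything else that is missing with a friendly page
ErrorDocument 404 /siteindex.html
```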

I'm feeling rather opinionated tonight - If I don't succeed in my long-winded attempt to bump your post, dig around in the forums and the library here on WebmasterWorld to get some more information.

HTH,
Jim

takagi

3:58 am on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello James, welcome to WebmasterWorld.

> 1. Does google neglect subdomains, even if the content is completely different from the base site.
For Google, pages are more important than sites. Pages will be indexed as long as they are not identical to other pages (on the same site or not) and are linked from other pages in the index.

> 2. Does google penalise for a the provided by tag at the bottom if like 20 sites have the exact same provided by thing or does it count as a referring link? (increase rating).
Those links might be treated as internal links. External links seem to do better for PR.

> 3. If I had the link back as a small 2px image that had a link back to the base site instead of the referring link would I get penalised by google?
Google advises thinking from the user's point of view. A user will not click on a 2px image, so it is a kind of hidden link, and Google might penalise you for it - especially if you combine it with a lot of other tricks.

> 4. ... Would these links be picked up at all or would all these sublinks be discarded completely.
One reason to put these links in JS is to prevent Google from finding them. Sometimes the 'robots.txt' file has a line to prevent SEs from spidering the JS file containing the URLs - just to make sure no PR leaks to other sites.
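Such a robots.txt entry (the file name here is hypothetical) looks like:

```
User-agent: *
Disallow: /menu-data.js
```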

> 5. Google can't read links in flash movies can it. So a flash intro with no (skip intro) link essentially kills that site in terms of further spidering doesn't it?
On this forum I read something about Alltheweb being able to read some of the information stored in what you call a 'flash movie'. As far as I know, Google ignores it, in which case a Flash-only page would prevent further spidering unless there are other links on the page, outside the Flash.

> 6. How damaging is a 404 error off a link on your site.
For the user it is not good. IMHO it wouldn't hurt your page in Google that much. But outgoing links pass away some of the page's PR, so pointing one at a non-existing page is a kind of waste.

james_a

7:41 am on Apr 7, 2003 (gmt 0)

10+ Year Member



Thanks, your help's been invaluable. Just one quick question.

With the <noscript> solution: if a noscript alternative is impractical for a site (this only happens on about 1 in 20 of them), would it work to create a site index (a page with links to all pages within that site) and submit that to the search engine, allowing those sub-pages to be spidered? Or does Google get suspicious of pages made up purely of links?

Also, one more thing: does Google filter the user agents of those submitting to it (i.e. block submissions from scripts sending the POST request to the engine)?

James.

takagi

10:11 am on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello James, it is all about spreading the incoming PR across your site. If you submit the 'site index' to Google but add no links to it, Google might add it to their index (if you're lucky). However, Google will remove it after some time if links never appear. So to keep it indexed, you need internal or external links pointing to that page.

Most, if not all, inbound links (links from other sites to your site) will point to your homepage. If you make your homepage a Flash page without any links to sub-pages, the index page will keep all the PR, but the other pages will have PR0 or disappear from the index. That means those pages will never show up in the SERPs. Probably that is not OK with you, so you had better find a way to spread some of the homepage's PR to the rest of the site.

jdMorgan

3:44 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



james_a,

Yup, takagi's telling you the truth. Google won't read the links in the <script> section(s), so if you want your other pages to be indexed, you must either have a <noscript> section or duplicate the scripted links as plain HTML on the page alongside them. Using <noscript> is helpful because it avoids showing your visitors the same links twice.

Forget submitting, except to directories. All the worthwhile search engines will find your site on their own if you have incoming links. If you don't have incoming links, they'll drop you anyway. Submit to the ODP, Zeal, Yahoo, Looksmart, etc. - Whichever directories are appropriate. But search engine submission is fast becoming superfluous - Take a look at Google's submit page, where it says it is provided for those who feel they simply must submit to be happy; they say they'll find you on their own whether you submit or not as long as you have incoming links from pages already in their index, and that is true.

The only use I've found for submit recently is for disaster recovery - for example, rescuing someone's site after they have been dropped due to an error in their robots.txt file. Submitting in this case may speed recovery.

Jim

swerve

6:52 pm on Apr 7, 2003 (gmt 0)

10+ Year Member



Take a look at Google's submit page, where it says it is provided for those who feel they simply must submit to be happy; they say they'll find you on their own whether you submit or not as long as you have incoming links from pages already in their index, and that is true.

Can you point me to where this is stated? I don't see this at [google.ca...] Thanks.

jdMorgan

7:26 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



swerve,

They've rearranged things, I guess... See Section A, #2 here [google.com] and the second-to-last "Fact" here [google.com].

Jim

james_a

11:47 pm on Apr 7, 2003 (gmt 0)

10+ Year Member



Thanks for all your help, I come away from this wiser ;)

One last case. The subhosts generate their sites through an online system, and at the end of it I have developed a feature that auto-submits to these search engines. According to Google, they don't like these auto submissions. Is there another way to get these subhosts' sites spidered for the first time, without telling them to go through and manually submit to each search facility - and such that I, the service provider, don't get my base domain punished in the search listings for providing this service?

James.

jdMorgan

12:40 am on Apr 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



james_a,

From msg 6 above:

All the worthwhile search engines will find your site on their own if you have incoming links.

Last year, I put up a new non-commercial site and submitted it for review and inclusion in the Open Directory Project - dmoz.org. Within 90 days, it was in all the major search engines. That is all I did: one "submit" to a directory, nothing else. Over time, the site drew incoming links because it has worthwhile content, but it got into all the majors with one link from the ODP - including the "pay-for-inclusion" search engines.

I wouldn't suggest this minimalist approach for a commercial site, but it would probably still work if you didn't mind waiting.

Jim

james_a

1:32 am on Apr 8, 2003 (gmt 0)

10+ Year Member



Thanks for that, it's just clicked! That's the most valuable information on search engines I've ever heard!

Thanks so much,
James.