Google Search Engine Optimization 101 My list.

How would you rank the most important elements?

Widestrides

6:44 pm on Aug 19, 2003 (gmt 0)

Google Search Engine Optimization 101
Ranked by order of importance. (My best guess. Results of course will vary, as will the algo.)
Comments welcomed.
What have I missed?
What have I listed that is not at all important?
What would be YOUR order of importance?

1. Keyword in Title Meta tag
2. Keyword in Description Meta tag
3. Keyword in Body text
4. Keyword density - 1-7%(?) - No more, no less. Results will vary.
5. Page Rank/Links - best if from related sites with higher PRs
6. Sufficient Content (Google seems to like bigger sites, more content)
7. Keyword in incoming links
8. Keyword in incoming link text
9. Keyword in text surrounding incoming link text
10. Keyword in <H1> tags
11. Keyword in outgoing links - best if to related sites with higher PRs
12. Keyword in outgoing link text
13. Keyword in text surrounding outgoing link text
14. Keyword in alt image tags
15. Keyword in bold
16. Keyword in italics
17. Keyword in domain or sub domain name
18. Keyword in keywords Meta tag
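
For item 4, here is one way to put a number on keyword density. It's a minimal Python sketch, assuming "density" means the share of words on the page that exactly match a single-word keyword. The function name and the sample text are made up for illustration, and nobody outside Google knows how (or whether) Google measures this.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return the percentage of words in `text` that equal `keyword`.

    Assumption: a "word" is a run of letters, digits or apostrophes,
    and matching is case-insensitive and exact (no stemming).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


# Hypothetical page copy, used only to exercise the function.
sample = (
    "Blue widgets for sale. Our widgets ship worldwide, "
    "and every order is packed by hand."
)
density = keyword_density(sample, "widgets")
print(f"{density:.1f}% of the words are 'widgets'")
```

A short sample like this lands well above the 1-7% range, which is a reminder that the figure only means something on a full page of body text.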

GlynMusica

2:29 pm on Sep 29, 2003 (gmt 0)

But it will not follow .php links in the same way it does .html

claus

3:57 pm on Sep 29, 2003 (gmt 0)

>> it will not follow .php links in the same way it does .html

I think there is a general misunderstanding here. Php or html or asp or jsp or cgi or php3 or ... does not matter as far as i know.

The issue with dynamic pages is more about avoiding long querystrings in your url with lots of "x=this&y=that&ID=12345&ID2=987654321&general=stuff&such=nonsense" - and especially about avoiding session ID's. Here's a few quotes from Google's own webmaster pages:

"If you decide to use dynamic pages (i.e., the URL contains a '?' character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them small."

and:

"Allow search bots to crawl your sites without session ID's or arguments that track their path through the site."

Link: [google.com...]
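
To make the session ID point concrete, here is a rough Python sketch of stripping session parameters from a URL before it is shown to a spider. The parameter names in SESSION_PARAMS and the example.com URLs are assumptions for illustration only; use whatever names your own application puts in the querystring.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to your own application.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}


def strip_session_params(url: str) -> str:
    """Return `url` with any session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL with only the parameters that identify the content.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_session_params(
    "http://www.example.com/widgets.php?id=42&PHPSESSID=abc123"
))
```

The idea is that every visitor (and every spider) then sees one short URL per page instead of a fresh URL per session.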

/claus

GlynMusica

4:23 pm on Sep 29, 2003 (gmt 0)

Thanks Claus.

While your definition of differences between specific file type extensions is true, your application to Google is not.

This confusion I see quite a lot.

Glyn.

Net_Wizard

7:41 pm on Sep 29, 2003 (gmt 0)

>>But it will not follow .php links in the same way it does .html

Would you care to elaborate, please?

One of my sites is pure php...dynamically generated but with shorter URL parameters...all pages have been indexed by Google (1,300+) and every time I add new content, a few days later it's in the index.

World Wide Wibble

8:20 pm on Sep 29, 2003 (gmt 0)

Thanks for all your replies on Google and PHP... Nobody has attempted to answer why the Google spider keeps requesting the same php page (see my previous msg). I assume it is because it is trying it each time it encounters links with different variables after the question mark.

On a related topic, it occurred to me that PHP could easily be used to feed different information to googlebot crawlers than everyone else who visits your site, along these lines:

if (preg_match('/\.googlebot\.com$/', gethostbyaddr($_SERVER['REMOTE_ADDR']))) {
    // HTML PAGE FOR GOOGLEBOT HERE
} else {
    // 'REAL' HTML PAGE HERE
}

This opens up all sorts of malicious and devious possibilities, but I can see more legitimate uses for this idea too. For example, at the top of some of my pages are some tabs (Add to favorites, go to home page, etc.) and Google annoyingly almost always uses this text in its snippets. I could simply use PHP to hide these from googlebot spiders, couldn't I?

And it could be used to increase (or decrease) the apparent keyword density. It could also be one solution to all you posters who didn't want to use huge <H1> headings. Easy: just use php to feed <H1>Keyword</H1> to the googlebot (at some appropriate place on the page) and not to your visitors!

I admit this is a somewhat dishonest technique, and its usefulness may be limited, but can anyone see why it wouldn't work? Or maybe you have other ideas how it could be used positively...?

claus

8:52 pm on Sep 29, 2003 (gmt 0)

>> why the Google spider keeps requesting the same php page

The Googlebot does not know that you don't want your tell-a-friend page spidered. To avoid this, you should mention this file in your robots.txt and to get it removed from the index you should first use this tag on the tell-a-friend page:

<meta name="robots" content="noindex,nofollow">

Here's more info on the robots.txt from Google: [google.com...]
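
If you want to check a robots.txt rule before relying on it, Python's standard urllib.robotparser can evaluate one locally. This is only a sketch: the path "/tell-a-friend.php" and the example.com URLs are stand-ins for the real ones, and real spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "/tell-a-friend.php" stands in for the real script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tell-a-friend.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The rule is a path prefix, so every querystring variant is covered too.
tell_a_friend_ok = parser.can_fetch(
    "Googlebot", "http://www.example.com/tell-a-friend.php?page=widgets"
)
widgets_ok = parser.can_fetch("Googlebot", "http://www.example.com/widgets.php")

print("tell-a-friend crawlable:", tell_a_friend_ok)
print("widgets crawlable:", widgets_ok)
```

Note that once the file is blocked in robots.txt the spider no longer fetches it, so it cannot see the meta tag either; that is why the tag has to come first.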

>> it occurred to me that PHP could easily be used to feed different information

You can accomplish this using all kinds of file formats, even standard plain html - this is not specific to php at all. There's even a whole cloaking forum [webmasterworld.com] that deals with such issues.

>> more legitimate uses for this idea too

Yep, there are all kinds of good legitimate reasons as well. One related issue (that does not necessarily involve php) is rewriting your URL's to avoid those session ID's and parameters that the SE spiders tend to choke on.

>> can anyone see why it wouldn't work?

It will work nicely - meaning, you can make your pages do that easily if you know how to. Search Engines definitely don't like it, however, so if you plan on doing this you should be prepared for what happens when it gets discovered. The Google FAQ also has a topic about cloaking, and other SE's might have similar views:

[google.com...]

I'll suggest reading a bit in the cloaking forum before you go ahead with it (if that's what you want to do) - there are all kinds of things apart from Googlebot and your own pages that you would want to consider as well.

/claus

Added: Just noticed your post count, so:

Welcome to WebmasterWorld World_Wide_Wibble :)

[edited by: claus at 11:30 pm (utc) on Sep. 29, 2003]

World Wide Wibble

9:26 pm on Sep 29, 2003 (gmt 0)

>> Googlebot does not know that you don't want your tell-a-friend page spidered.

No, of course; I'm not greatly bothered if it does spider it; there's not much interesting content so it won't get high in the rankings if it's indexed, and if it does, well, that can't be a bad thing!

>> I'll suggest reading a bit in the cloaking forum before you go ahead with it (if that's what you want to do)

I doubt I will use 'cloaking', though will bear it in mind as a possible workaround for problems. Thanks for drawing my attention to the online literature, etc. I'm sure if it's a significant problem for search engines (as you suggest) Google has (or will have) an anonymous spider that checks sites out... so as you say, probably not wise to use this idea :)

GlynMusica

9:21 am on Sep 30, 2003 (gmt 0)

"Would you care to elaborate, please?"

Sure. Watching the trends of Googlebot over the past 3 years. Working on 000's of sites for SEO, this is my personal opinion, nothing more. Take it or leave it. However, if you can demonstrate otherwise, I'm no stickler for tradition and my opinion can change.

So,

If linked to page = php = then don't crawl all the links because it could be some kind of trap (with some of the amazing techniques that SEO's are using, a simple spider will treat with caution any dynamic page first time around, because spiders become easier to fool as the scripting gets more advanced...as previous posts show!). Or crawl with caution, i.e. crawl a bit this month, a bit next month.

However, if link to page = .html then crawl away like there is no tomorrow.

Exceptions:
If PR = <high or number of pages indexed = high, or page in index <x years old, treat content as "honourable", and add new pages as they go, almost like HTML.

Google is getting better at parsing variables, as it now seems to crawl a dynamic page with one variable. Two variables occasionally, and three rarer still. Please don't post your plus 4's, I'm talking about generalisations here.

Jumping left: Fantastically I took a domain the other day which was completely new. In 3 days Googlebot had captured "home". In 5 days it had crawled all the first level directory. I did no submission, just placed a link on 4 other websites. Next up who do you think got the domain? It was Inktomi, then Fast and on Altavista, I'm still waiting!

Don't know if this will help anyone but that's a small part of my Google ruleset.

Glyn.

World Wide Wibble

12:59 pm on Sep 30, 2003 (gmt 0)

On one of my sites my pages are dynamically generated and I take advantage of that to insert random elements into each of my pages, such as a recommended book (not entirely random as the algorithm selects a book from a list I have made that is relevant to the content of the page, but random in the sense that if you hit reload it may be different).

My question is... those of you who have been saying Google loves fresh updated content, do you think the fact that the page is a little different every time Google looks at it is likely to (a) improve the site's rankings and/or (b) make Google come and crawl my site more frequently? and (c) any other advantages?

Are there any disadvantages? For example, for the past few days googlebots have requested my home page about 3 times daily. Do you think the googlebot is clever enough to get suspicious and say "hold on, this is being updated too frequently, it's probably 'random content'..."?

ogletree

2:51 pm on Sep 30, 2003 (gmt 0)

How often are you getting a fresh tag? Check every day and keep a log. If it is often then it is working.

dougmcc1

2:59 pm on Sep 30, 2003 (gmt 0)

>> Sure. Watching the trends of Googlebot over the past 3 years. Working on 000's of sites for SEO this is my personal opinion, nothing more.

Would you care to elaborate MORE please? With so much experience you should be able to provide a good reason for your personal opinion. If you don't have a good reason then please don't post misleading statements for newbies. For anyone else willing to accept GlynMusica's statement, I suggest using .htaccess to convert .php extensions to .htm.

GlynMusica

10:31 am on Oct 1, 2003 (gmt 0)

Geesh! Oh no...you got me...it was all made up... heck you're good!

;)

nippi

11:02 am on Oct 1, 2003 (gmt 0)

can anyone provide an example of a site with 20,000 plus versions/different ids of the same page that has been crawled?

eg widgets.php?id=20001

My experience is not good.

CCowboy

4:13 am on Oct 2, 2003 (gmt 0)

I could tell about one, but then I would have to kill you...

nippi

4:27 am on Oct 2, 2003 (gmt 0)

haha

ok next question

do these pages have a pr of 5+?

My theory is YES you can get the pages crawled, but hard to get a high PR

Net_Wizard

10:39 pm on Oct 2, 2003 (gmt 0)

>>If linked to page = php = then don't crawl all the links because it could be some kind of trap...

Well, that could be true of any extension and is not exclusive to php. Extensions such as .asp, .cgi and even .html/.htm/.shtml can be programmed to trap a spider.

So, I have to disagree with your observation that Google avoids new php sites because they could be a spider trap.

Cheers

claus

2:20 pm on Oct 3, 2003 (gmt 0)

>> a simple spider will treat with caution any dynamic page first time around

As to Googlebot, i really don't think php is any problem at all - spidering problems have always had another reason in the cases i've seen (i gladly admit that these are not too many). Then again, the line of reasoning that Gbot should "take care out there" sounds all right to me. I suspect they use other methods than looking at file extensions though, and that they don't think php is the only tool in the box.

For the thread topic, did anyone mention this link? There's good information found on those Google webmaster pages: [google.com...]

Also - as it's a "101" - i'd like to add to the discussion that Google is a "page-based" search engine, not a "site-based" one. It's not your site that gets shown in the serps, it's your individual pages. That's important, as a sub-sub-sub page can easily perform better for some search than your index page.

/claus

nippi

3:49 am on Oct 4, 2003 (gmt 0)

I am assuming, the original adviser meant php files with delimiters, not just with php as an extension.

anallawalla

5:13 am on Oct 4, 2003 (gmt 0)

There is an older thread here [webmasterworld.com] that needs to be visited and updated in the current one or a new thread started. I liked the allocation of points in that thread so it would be interesting to see current perceptions.

Ash
(on the road in Mumbai, India)

plasma

12:28 am on Oct 8, 2003 (gmt 0)

What do you mean by "Title Meta Tag"

Is there another title tag in the meta tags?

World Wide Wibble

11:49 am on Oct 9, 2003 (gmt 0)

>> What do you mean by "Title Meta Tag"
>> Is there another title tag in the meta tags?

No, it means the <title></title> HTML tags in the header. You're right; they're not strictly META tags, but it's often called the Title Meta tag because it goes where all the Meta tags go in the header...

jk3210

12:21 pm on Oct 14, 2003 (gmt 0)

Wired Suzanne -

>>jk3210 - Do you use H3 for the text or P? And does it make any difference?<<

No. My response to you was just an attempt to answer your question as to what Trillianjedi was referring to, not advocating H1, H3, H2.

I always use <P> and I rarely use anything other than css-modified H1's. :)

davidmaster

10:27 am on Oct 18, 2003 (gmt 0)

Hey everyone... i was a "fan" of the google pagerank... but i have some sad news for you all...

all this optimization for nothing now... =(

Look at this news here...
PageRank is Dead: [snip]some blog[//snip]

Does anyone know anything more?

[edited by: heini at 11:12 am (utc) on Oct. 18, 2003]
[edit reason] please don't post urls, thank you! [/edit]

davidmaster

1:36 pm on Oct 18, 2003 (gmt 0)

How should i show it to the others that are interested in the news about it?

[edited by: heini at 11:12 am (utc) on Oct. 18, 2003]
[edit reason] please don't post urls, thank you! [/edit]

nippi

2:22 pm on Oct 18, 2003 (gmt 0)

and the article was bollocks anyway

netnerd

1:48 pm on Oct 20, 2003 (gmt 0)

On a 101 theme,

Does it look like it matters to google what your pages are called?

I.e. will www.something.com/keyword.html
rank differently than
www.something.com/nonsens.html?

I think it used to be the case, but it doesn't seem to matter any more. Also, will it affect the relevancy of the page that links to it?

i.e. Should my widgets page link to www.something.com/widgets.html or www.something.com/nonsens.html ?

This 146 message thread spans 5 pages.