Forum Moderators: phranque
Some things that I know:
I'm sure there's a lot more that I'm missing, and so I'm all ears.
PS The one exception to the "forums don't fare well on SEs" rule is, can you guess? WebmasterWorld! I tried to describe some of its techniques above... So, if anyone can share more, I'd be most obliged.
Forums that do this type of stuff, like WebmasterWorld for example, rank well; forums that don't, don't. Lots of forums do it, but most don't. IMHO nobody does it better than WebmasterWorld on a technical level, not even close, but that's because WebmasterWorld doesn't run generic forum software.
There are other hacks (for a customized version of phpbb) that will slip your forum topics into a spider friendly url (forum-topic.html), as well as slip it into the title tags etc. Really cool stuff. Would love to see that for the regular phpbb. I suppose I could hack the hack, maybe someday. hehe
What else?
PS Martinibuster - I would think that forum-topic.html is no better these days - it seems that these days all spiders can handle forum.php?t=xyz or whatever.
topic-34.html is better than topic.php?topic=34. There is no case where not using mod_rewrite is better. Why? Because with the rewrite, the spider receives a list of independent pages, e.g. topic-34.html, topic-35.html, etc. Without the rewrite, it sees a single page with different content. Does it matter? Definitely, no question. On one site I'd been lazy and built the gallery section with simple query string parameters. Google only showed the main page in the SERPs; the other pages were all indexed to some degree, but weren't performing at all. I switched it to mod_rewrite, and all the pages now perform. What a spider can handle and what is best are not the same thing. You want to give the spider unique pages so it doesn't get confused.
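For anyone who wants to see the mapping spelled out, here is a rough Python sketch of what a rewrite rule like this does. It is not actual mod_rewrite config, and the function names and the exact topic-N.html pattern are assumptions for illustration only.

```python
import re

# Assumed pattern for the static-looking URLs, e.g. "topic-34.html".
STATIC_TOPIC = re.compile(r"^topic-(\d+)\.html$")


def to_internal(path):
    """Map a static-looking path to the script URL that really serves it."""
    match = STATIC_TOPIC.match(path)
    if match is None:
        return None  # not a rewritten topic URL, so leave it alone
    return "topic.php?topic=" + match.group(1)


def to_static(topic_id):
    """Build the static-looking link to print in the forum pages."""
    return "topic-%d.html" % topic_id
```

The spider only ever sees what to_static() prints, so each topic looks like its own page; the server quietly turns it back into the query string version.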
The question is not: what is the least I can do to get some of this working, but rather: what is the most I can do to get it all working? Happily, almost everyone chooses the former.
...would think that forum-topic.html is no better these days - it seems that these days all spiders can handle forum.php?t=xyz or whatever.
Yeah, I thought so too until I saw all of the forum posts in supplemental. :(
The site map hack can be found by Googling phpbb hacks site map. It's the Advanced Site Map. I'm not endorsing it. I'm just saying what my experience with it so far was. I find that it is a simple and non-intrusive solution. My main concern is that it needs to break down the map into smaller chunks. But so far it's worked very well. Over seven thousand pages indexed. Makes me giddy to think about it.
There is an html error in the sitemap.php file, and you might want to comment out the section that displays your member list (I commented it out so that if I screwed something up I can just un-comment it). When I first set it up I received an error message but the message tells you what line is going wrong, and it was a folder location variable that needed to be fixed. The usual script thing, easy to configure.
And, a general question: many BBs slightly modify the HTML when a spider is detected. Is this bad? Would it be considered cloaking?
Re page size:
D) Page Size:
The smaller the better. Keep it under 15k if you can. The smaller the better. Keep it under 12k if you can. The smaller the better. Keep it under 10k if you can - I trust you are getting the idea here. Over 5k and under 10k. Ya - that bites - it's tough to do, but it works. It works for search engines, and it works for surfers.
Top priority is to get rid of query string sids. If you are going to run AdSense, you also need to get rid of them for logged-in users. Again, simply look at how it's done here: logins by cookie only. If you're willing to sacrifice old browser support, you can drop page size down even more.
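To make the sid point concrete, here is a minimal Python sketch of cleaning session IDs out of a link before it is printed. The parameter names in SESSION_KEYS are guesses at what common forum software uses, not taken from any particular package, so adjust them for your board.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; check what your forum software really uses.
SESSION_KEYS = {"sid", "s", "PHPSESSID"}


def strip_session_id(url):
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_KEYS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

This only cleans the links; the session itself then has to live in a cookie, as described above.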
"You said earlier that robots.txt is not enough to avoid PR leak - what about rel="nofollow"? (My guess would be that this would suffice, as the whole point is to keep any benefit from spammers etc.)"
I didn't say this was to drop PR leak, never mentioned PR.
G) Outbound Links:
From every page, link to one or two high ranking sites under that particular keyword. Use your keyword in the link text (this is ultra important for the future).
Obviously you can succeed without this, see these boards for example, but his advice was solid then, and it's solid now.
As I said, you can approach this two ways: what is the least you can get away with, or what is the most you can do to achieve long term success. You seem to be leaning towards the former.
"you can approach this two ways: what is the least you can get away with, or what is the most you can do to achieve long term success. You seem to be leaning towards the former."
Don't get me wrong, though - if you think I'm doing something wrong, by all means, please let me know!
Links are something you have to decide for yourself; there are pluses and minuses. Using nofollow of course means you can worry less about undesirable spammy links, but mods in general have to watch for spammy stuff no matter what. Another way to do it is to use link redirectors like WebmasterWorld does, then block the redirect page in robots.txt. It just depends on what you want the forums to do long term and what type of audience you'll be getting. I don't think there are any hard and fast rules re links and forums; I'd say it's a case by case thing.
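Here is a rough Python sketch of the two link approaches, just to show the difference in the HTML that ends up on the page. REDIRECT_PATH is a made-up location (you would list it under Disallow in robots.txt), and this is not how WebmasterWorld's own redirector is built, which I haven't seen.

```python
from html import escape
from urllib.parse import quote

# Hypothetical redirect script location; it would be blocked in robots.txt.
REDIRECT_PATH = "/out"


def nofollow_link(url, text):
    """Link straight to the target, but mark the link rel="nofollow"."""
    return '<a href="%s" rel="nofollow">%s</a>' % (escape(url, quote=True), escape(text))


def redirected_link(url, text):
    """Link to the local redirect script, passing the target as a parameter."""
    target = REDIRECT_PATH + "?url=" + quote(url, safe="")
    return '<a href="%s">%s</a>' % (escape(target, quote=True), escape(text))
```

Either way the link text still gets escaped, so a poster can't sneak markup in through the anchor text.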
"I'm just trying to direct my efforts as efficiently as possible, and combine SEO with functionality"
Functionality and SEO go hand in hand in my experience: the better the on-page stuff is, the better the SEO is.
A massive advantage of the mod_rewrite hack I employ on my forums (Invision) is that the URL is rewritten to http://www.example.com/boards/example_keywords_.html
[example_keywords] is the title of the thread. This has had a noticeable advantage: Google AdWords ad relevancy has increased, so I assume the page relevancy as a whole has increased. Certainly my pages are doing well in Yahoo and MSN (though they're not 'doing' at all in Google, for an unrelated reason which I'll pick your collective brains about later on).
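Roughly, turning a thread title into that kind of URL looks like the Python sketch below. This is my own guess at the general idea, not the actual Invision hack; thread_slug, thread_url and the "thread" fallback are made-up names.

```python
import re


def thread_slug(title):
    """Lowercase the title and collapse anything non-alphanumeric to "_"."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "thread"  # fallback for titles with no usable characters


def thread_url(title, base="http://www.example.com/boards/"):
    """Build the keyword URL for a thread from its title."""
    return base + thread_slug(title) + ".html"
```

Two threads with the same title would collide under this scheme, so a real version would presumably need the topic ID somewhere in the URL as well.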
I'm not certain about the benefit of the standard topic.html mod_rewrite hack, but at least it kills session IDs.
I'd be very interested to see a highly (successfully) hacked Invision 2.0 board. I probably wouldn't bother doing that with mine though, as 2.1 is fast approaching.
[edited by: Woz at 12:07 am (utc) on July 12, 2005]
[edit reason] Examplified, please see TOS#13 [/edit]