Forum Moderators: phranque
Using the sim spider - I can see the list of links on the forum home page, which take the format
http://us.mysite.com/forums/show.php/act/SF/f/10
This links to the forum page for a specific topic (this is a car owners' forum, and it links to a forum for a specific model).
All the links in this forum are also search engine friendly, e.g.
http://us.mysite.com/forums/show.php/act/ST/f/10/t/304
Checking the page through sim spider shows a load of links on the forum index to all the individual forums.
But none of the links on the individual forums (the links to each individual post) are shown.
Is this a problem with sim spider? Does it have known limitations?
The links are there - but sim spider doesn't show them.
This is difficult to explain without giving a URL so you can try - would it be against the rules if I told you what to replace mysite with in those URLS (but without posting a real URL)?
[edited by: engine at 12:20 pm (utc) on Feb. 23, 2004]
[edit reason] de-linked [/edit]
<a href='http://us.mysite.com/forums/show.php/act/ST/f/1/t/727' class='linkthru' title='This topic was started: Mar 24 2003, 11:57 AM'>Import Revolution Pictures.....what A Day!</a>
Looks clean to me!
I'm really worried that my hard work will be in vain - if Google works like sim spider, it is going to miss all the content in my forums 8(
I think a better solution is to get the urls to resemble the urls of this forum, which would require mod_rewrite [httpd.apache.org]
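For anyone wondering what that might look like, here is a rough sketch of a rule for an .htaccess file in the document root. It assumes mod_rewrite is enabled and that .htaccess overrides are allowed. The /forums/forum10/304.htm URL shape is only a hypothetical example of the style suggested above, not something the forum software produces on its own, so the links in the templates would have to be changed to match.

```apache
# Hypothetical clean URL:  /forums/forum10/304.htm
# Served internally from:  /forums/index.php?act=ST&f=10&t=304
RewriteEngine On
RewriteRule ^forums/forum([0-9]+)/([0-9]+)\.htm$ /forums/index.php?act=ST&f=$1&t=$2 [L]
```

Because this is an internal rewrite rather than a redirect, the spider should only ever see the clean URL.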
I'm going to respond to this in public in case someone else can learn from it also, however, if there's anything private you want to ask I can reply by sticky.
Make sure when you run the tests that you are logged out (i.e. looking at the page as a guest). That is how the sim spider will see it.
I noticed two possible issues Google may have with your setup. One is that the links to the individual messages still have the s=(sessionhash) on them. The second issue is that even though the links to the individual forums are in a spider-friendly format, when you actually click on a link, it seems you are forwarded to another URL... I don't know how that is happening, but you have some issue there. Hope this helps.
8)
Being a guest / logged in makes no difference to the HTML presented by the forums software - can you explain in a bit more detail why you think this is relevant?
With regard to the links, then yes, the actual links themselves are very search engine unfriendly. They include several query strings, including the search engines' worst enemy - session ids! 8)
The mod I applied to the forum gets around this by formatting the links in the manner I illustrated above.
These links are passed to a file called show.php, which changes the act/ST/f/t type links into the unfriendly URLs that Invision normally uses.
I have posted this file below - it is included in the Invision "package", and I have also posted the full "credits" - so hopefully I'm not breaking any rules by posting it (I'm sure a mod will delete it if I am - and I'll understand)
<?php
/*
Script Name: show.php
Script Author: Matt Mecham
Package: Invision Board
Date: 16th September 2002
-------------------------
What is this?
-------------------------
It's a little "add-on" that simply allows for a neater / easier to read
URL to an invision board URL.
Example: http://www.domain.com/forums/show.php/act/ST/f/3/t/45
Resolves to: http://www.domain.com/forums/index.php?act=ST&f=3&t=45
It's not used in Invision Board itself, but might come in handy in your
own projects (such as a search engine friendly menu, etc).
Probably only works with PHP 4.1+
*/
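$base_url = 'http://us.mysite.com/forums/index.php'; // Edit this to suit
$redirect = "";
// PATH_INFO holds everything after show.php, e.g. /act/ST/f/10/t/304
if ( isset( $_SERVER['PATH_INFO'] ) && $_SERVER['PATH_INFO'] != "" )
{
    $c = 0;
    foreach ( explode( "/", $_SERVER['PATH_INFO'] ) as $bit )
    {
        if ( $bit != "" )
        {
            if ( $c == 0 )
            {
                // Odd segments are parameter names
                $c++;
                $redirect .= $bit.'=';
            }
            else
            {
                // Even segments are parameter values
                $c = 0;
                $redirect .= $bit.'&';
            }
        }
    }
}
// Sends the browser (or spider) a redirect to the standard Invision URL
header("Location: $base_url?".$redirect);
exit();
?>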
This is why the page you receive is not the one referred to in the link (it is exactly the same / right page for your request - it just has a different URL).
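To make the translation easier to follow, here is a minimal sketch of just the string-building part of show.php, pulled out into a function so it can be run on its own. The function name pathInfoToQuery is made up for illustration and is not part of Invision Board. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper mirroring the loop in show.php; not part of Invision Board.
function pathInfoToQuery(string $pathInfo): string
{
    $query = '';
    $expectName = true; // segments alternate: name, value, name, value...
    foreach (explode('/', $pathInfo) as $bit) {
        if ($bit === '') {
            continue; // skip the empty piece before the leading slash
        }
        // A name is followed by '=', a value by '&'
        $query .= $bit . ($expectName ? '=' : '&');
        $expectName = !$expectName;
    }
    return $query;
}

// prints act=ST&f=10&t=304&
echo pathInfoToQuery('/act/ST/f/10/t/304'), "\n";
```

Note the trailing & on the result: every value gets one appended, including the last.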
Sim spider picks up one set of these links (the main forum page), but misses the individual forums / topics
I would love to be able to do this with mod_rewrite, but that would definitely be beyond me at the moment.
Try passing all the variables in the url, and reduce or better yet eliminate the need for session variables. You will notice much better crawling results.
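As a rough illustration of what a session-free link means in practice, the sketch below removes the s parameter from a forum URL and leaves the other variables alone. The helper name stripSessionId and the sample hash abc123 are invented for the example. It assumes PHP 7.1 or later and URLs without a #fragment.

```php
<?php
// Hypothetical helper: drops the session parameter (default "s") from a link.
function stripSessionId(string $url, string $param = 's'): string
{
    // Split into the part before '?' and the query string (empty if none)
    [$base, $query] = array_pad(explode('?', $url, 2), 2, '');
    parse_str($query, $params);   // query string -> associative array
    unset($params[$param]);       // remove the session id if present
    $query = http_build_query($params);
    return $query === '' ? $base : $base . '?' . $query;
}

// prints http://us.mysite.com/forums/index.php?act=ST&f=10&t=304
echo stripSessionId('http://us.mysite.com/forums/index.php?s=abc123&act=ST&f=10&t=304'), "\n";
```

This only cleans up a URL string. The forum itself still has to be set up so that guests are not handed session ids in the first place.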
The links I'm feeding google take the form http://us.mysite.com/forums/show.php/act/ST/f/10/t/304 and this always resolves to the same page.
The only difference is that the show.php script I posted translates this to
http://us.mysite.com/forums/index.php?act=ST&f=10&t=304&
The big problem I see is that Google will have an easier time seeing the link if you don't make the URL "neater" than if you make the URL "neater" but then do a redirect. These days Google is OK with passing one or two variables to a script. Just kill that hack and you should be fine as long as the forum names don't contain session ids.
To tell the truth, I don't know why all those forums put sessions in as a default. It's just for those pesky few people in the world who are a pain about not accepting cookies from a site. I wouldn't even want those PITAs on my site anyway. They'll always complain about something ;) (If any of you PITAs are reading this, I was just kidding. Just a joke. If none of you are reading this, then I was serious.)
P.S. I would fix this ASAP. If google didn't start the deepcrawl already, it'll start very soon.
so - I am pretty happy.
I would post screenies - but the URL that I need to show to "prove" what I am saying is in the title bar, and it's too late for me to start editing the true URL out of the image.
> P.S. I would fix this ASAP. If google didn't start the deepcrawl already, it'll start very soon.
PS - why do you think I was worrying!
8) 8) 8)