I have a weird search engine problem.
For some reason search engines are not spidering my message boards.
Google spiders these pages fine:
I have done a site: search on Google and can't find any message board links. The links should look like:
It's very weird. I thought Googlebot must not even be able to see the message board, but I found a link to my who's-online feature, which is only linked to from the message board. So Googlebot must be viewing the message boards but just not storing them in the database.
The way my site works is that the /main and /boards parts of the URL are not folders but files with no extension, and everything after that is a variable. I thought that might have something to do with it, but /boards works exactly the same way as /main, and /main pages are getting stored on Google.
I have a robots.txt file but it only has restrictions on the directories /admin/ and /mainadmin/
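One way to rule robots.txt out is to feed the rules to a parser and ask it directly. This is only a sketch: the file contents are reconstructed from the description above (assumed to apply to all user agents), and the /main/story/1 path and the is_allowed helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents, reconstructed from the description:
# only /admin/ and /mainadmin/ are disallowed, for every user agent.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /mainadmin/
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if the assumed robots.txt lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /main/story/1 is a hypothetical path used only for comparison.
    for path in ("/boards/list/s85", "/main/story/1", "/admin/login"):
        print(path, is_allowed(path))
```

If /boards paths come back as allowed here, the robots.txt rules as described are not what is keeping the boards out of the index.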
Does anyone have any clues? I am stumped!
[edited by: Woz at 11:09 am (utc) on Jan. 10, 2003]
[edit reason] TOS#13 [/edit]
I have used the Spider SIM with this URL
It returns a 302 and says it's being forwarded to
Which is weird, because it's not. I have used another spider simulator on the same URL and it doesn't get forwarded... I am confused.
Also, using the Google Toolbar and visiting any of the front pages of my message board, I get a 2/10 ranking, which is not good, and no backlinks to these URLs. This should not be the case, as every story page for each sub site links to its message board!
I know it has a lot to do with external sites linking, but an example of a highly ranked page with no external linking is:
[example.net...] which is ranked No 1 for certain keywords
I am still confused.
[edited by: Woz at 12:02 pm (utc) on Jan. 10, 2003]
I have been lurking about for a few weeks as I was drawn in to find out more about "the dance"...which I am getting to grips with.
I am still baffled at the reason I can't find any of my message boards in the database....
I have figured out that they have been indexed, as the Google Toolbar is giving them a 2/10 rank, or so I have been told, as I can't get the Google Toolbar for Moz. So I presume it has to do with my keyword density and backlinking.
I am off to play with my keywords, I think. I would love to find a message board that uses keywords well to give me a start, but I have yet to find one!
If anyone has any more information to get me started that would be great!
I'm not too sure about the whole message board indexing deal. I know that if the board is members-only it won't be indexed.
Maybe someone else can chip in a couple of points?
Would it have anything to do with this bit:
i.e. is it not indexing the individual messages because they are variables? Or because they have too many variables?
I could rewrite my code to look like this:
Would that help things?
It might do (I can't see your boards myself to check). The session ID might only be visible to you if you turn cookies off; remember that spiders don't accept cookies. It might behave like the osCommerce shopping cart, i.e. no session ID on the first hit to any message board page, but session IDs added to every link just in case cookies are turned off. Try the SIM spider at [searchengineworld.com...] to see your site the way the search engines might see it.
Google also has trouble indexing pages with name=value pairs in the URL. It seems OK with 2 name=value pairs on some ASP sites I have, but it definitely cannot cope with 3 name=value pairs, and it does not spider my CGI or PHP sites with 1 or 2 name=value pairs in the URL.
FAST spiders all of them just fine.
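To audit links for the two things mentioned above (how many name=value pairs a URL carries, and whether a session ID has crept in), a rough sketch along these lines may help. The query_report helper is made up, and the set of session parameter names is an assumption: PHPSESSID is PHP's default session name, while "sid" is only a guess to adjust for the board software in use.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed session parameter names (compared case-insensitively).
# PHPSESSID is PHP's default; "sid" is a guess and may not apply.
SESSION_PARAMS = {"phpsessid", "sid"}


def query_report(url):
    """Return (number of name=value pairs, names that look like session IDs)."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    session_names = [name for name, _ in pairs if name.lower() in SESSION_PARAMS]
    return len(pairs), session_names


if __name__ == "__main__":
    # The second URL is a hypothetical link as a cookieless spider might see it.
    print(query_report("/boards/list/s85.php?f=85"))
    print(query_report("/boards/view?f=85&t=12&PHPSESSID=abc123"))
```

Running it over the links a spider simulator reports, with cookies disabled, would show whether any link carries three or more pairs or a session ID.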
Google is getting better at spidering dynamic URLs, but there are a few rules of thumb that help.
1) The more static the URL the better. The page can still be dynamic but a static URL helps heaps.
2) The higher the Base PR of the site the better. There is a threshold below which Google stops spidering Dynamic URLs so the higher the PR of the referring page/s the more pages get indexed.
3) Related to #1, the fewer the variables the better. page?A=1 is better than page?A=1&B=2&C=3. Some things to try are: a) converting the variables into a pseudo directory structure such as a1/b2/c3/page, or b) concatenating the variables such as page?A=1xb2xc3 and then splitting the variables with x as the split point.
There are many more techniques and tips available. Do a search (link at top of page) on Dynamic URLs and you should get more ideas.
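The two ideas in point 3 can be sketched as follows. This is only an illustration of the parsing side, written in Python rather than the site's own language. The function names are made up, the letter-plus-digits segment format is an assumption taken from the a1/b2/c3/page example, and the packed form only works if no value contains the separator.

```python
import re


def parse_path_vars(path):
    """Split 'a1/b2/c3/page' into ({'a': '1', 'b': '2', 'c': '3'}, 'page')."""
    *segments, page = path.strip("/").split("/")
    variables = {}
    for segment in segments:
        # Assumed segment format: a letter name followed by a numeric value.
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", segment)
        if match is None:
            raise ValueError(f"unexpected segment: {segment!r}")
        variables[match.group(1)] = match.group(2)
    return variables, page


def split_packed(value, sep="x"):
    """Split one concatenated query value back into its parts."""
    return value.split(sep)


if __name__ == "__main__":
    print(parse_path_vars("a1/b2/c3/page"))
    print(split_packed("1xb2xc3"))
```

Either way the spider sees a URL with at most one name=value pair, while the script behind it still receives every variable.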
I am beginning to think there are a few problems with my message boards:
1. Keyword Relevance
2. Too many dynamic URLs, which I can fix with mod_rewrite
3. Back Linking...
I can deal with the first 2 problems myself, but the back linking problem is a weird one.
As every /main page has a link to the corresponding /boards page, and /main pages are being indexed with no problem at all, I would think this would be enough back linking?
Also, the posts all have a link back to the corresponding /boards page.
I have done the following things to see if this will make a difference:
1. Increased the keyword relevance on the message boards to about 10-15% in 1, 2 and 3 keyword phrases for the relevant keywords in the meta tags.
2. Increased the number of links to the message board tenfold, and the links back also. Internal only for now.
3. Changed the first link to the message board to /boards/list/s85.php instead of /boards/list/s85.php?f=85
I noticed when running the spider SIM that it was picking up the links like this: /boards/list/boards/list/s85?f=85
I have now fixed this so it picks up the links correctly.
Hopefully that should make a difference, but I am still confused as to why I can't find any references to my message boards within Google. I think it might have been the base href problem.
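The doubled /boards/list/boards/list/ path is what standard relative URL resolution gives when a link has no leading slash and the page URL has no trailing slash. A small sketch, using example.com as a stand-in host and a made-up resolve helper (the actual markup on the boards is not shown in the thread, so the link forms below are assumptions):

```python
from urllib.parse import urljoin

# Hypothetical page URL; example.com stands in for the real host.
PAGE = "http://example.com/boards/list/s85"


def resolve(href, base=PAGE):
    """Resolve a link the way a spider would, relative to `base`."""
    return urljoin(base, href)


if __name__ == "__main__":
    # Relative link with no leading slash: the path gets doubled.
    print(resolve("boards/list/s85?f=85"))
    # Root-relative link: resolves as intended.
    print(resolve("/boards/list/s85?f=85"))
    # Same relative link, resolved against a base at the site root.
    print(resolve("boards/list/s85?f=85", base="http://example.com/"))
```

The first call yields a path starting /boards/list/boards/list/, matching what the spider simulator reported. Either a leading slash on the links or a base URL at the site root avoids it.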
I rarely say the following anymore, because the PHP guys get all fired up when I say it: PHP pages rank lower than stock .html pages. Always have, probably always will. There are a lot of tangible and intangible reasons for that, from people not linking to dynamic URLs to the URLs simply being dynamic. That's true for all the nonstandard filetypes.
When's the last time you saw a .shtml page ranked under any kind of quality keyword?
It is a "fake" extension anyway, as the /boards or /main part is the file, with everything after that just being variables.
Would Google not moan if I have a .php version and a .htm version? And would I have to change all my links?