I have a site that Google knows about, and the bot visits often, but it does not crawl my internal pages. I have created a site map and a robots.txt, and done everything else I can think of to get Google to go through everything. That leaves upwards of 10,000 pages that Google is not getting to! Please help me! :)
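One quick thing worth ruling out is a robots.txt rule that blocks the dynamic URLs by accident. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the `/admin/` path are made up for illustration and are not taken from the actual site; only the `index.php?show=demo&state=AL` URL pattern comes from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption, not the real file).
robots_lines = """\
User-agent: *
Disallow: /admin/
""".splitlines()

parser = RobotFileParser()
parser.parse(robots_lines)

# A dynamic URL in the style used on the site map, plus a made-up blocked path.
state_url = "http://www.widgets.com/index.php?show=demo&state=AL"
admin_url = "http://www.widgets.com/admin/login.php"

state_allowed = parser.can_fetch("Googlebot", state_url)
admin_allowed = parser.can_fetch("Googlebot", admin_url)

print("state page allowed:", state_allowed)
print("admin page allowed:", admin_allowed)
```

If the state URL came back as disallowed under the real rules, that alone could explain pages being skipped; if it is allowed, robots.txt is probably not the culprit.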
The only problem I can think of is that my site is dynamic, using PHP with some variables in the URL, but there is no session ID, no ID=, or anything else that I have read can mess up search engines...
Will Google not go deep into a site if PR is low? (I just started my site a couple of months ago.)
It is weird, though, because even some links on my home page aren't getting followed.
So I was hoping someone would take the time and check out my site and let me know if they could see what was wrong.
I would really appreciate it!
Kevin
[edited by: deft_spyder at 8:00 pm (utc) on April 4, 2003]
Those are the links for my site map: one for each state, which then links to a listing of stuff in that state.
However, Google does not even follow this link from my site map.
Hope that info helps a little!
In the early days I had a Java/JSP site, and my host had issues with more than X pages being produced from the JVM/server in a short space of time. I saw the same behavior.
One way around this (apart from changing hosts, which I did) is to improve the links between your pages so that Google has different ways to reach each part of the site. This improves your site's resilience to failure in any one dynamic page, which is always a possibility no matter what language and methodology you use.
Curiously, I find the same thing: there are links on my front page which don't seem to get spidered. I have no idea why not. The only thing I can suggest is making sure each of your pages is linked to by at least a couple of your other pages... most importantly, this makes it easier for the user.
We are for users, right? Not for the robots...
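The "at least a couple of inbound links per page" advice above can be checked mechanically. Here is a rough sketch in Python, assuming you can already produce a mapping of each page to the pages it links to. The function name `pages_with_few_inbound_links` and the example paths are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def pages_with_few_inbound_links(link_graph, minimum=2):
    """Return pages linked from fewer than `minimum` distinct other pages."""
    inbound = defaultdict(set)
    for source, targets in link_graph.items():
        inbound[source]  # make sure every known page has an entry
        for target in targets:
            if target != source:  # a page linking to itself does not count
                inbound[target].add(source)
    return sorted(page for page, sources in inbound.items()
                  if len(sources) < minimum)

# Hypothetical internal link graph: page -> pages it links to.
example_graph = {
    "/": ["/sitemap", "/about"],
    "/sitemap": ["/", "/state/AL", "/state/AK"],
    "/about": ["/", "/sitemap"],
    "/state/AL": ["/", "/state/AK"],
    "/state/AK": ["/"],
}

weak_pages = pages_with_few_inbound_links(example_graph)
print(weak_pages)  # → ['/about', '/state/AL']
```

Pages that show up in the result have only one way in, so a single failing or unspidered page can cut them off.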
My page has under 100 links on it, and I think it is generated by the server relatively quickly.
My links look like this:
[widgets.com...]
And here are a couple of lines from my page source:
<html>
<head>
<META NAME="keywords" CONTENT="widgets">
<META NAME="description" CONTENT="widgets">
<META NAME="ROBOTS" CONTENT="ALL">
<META NAME="revisit-after" CONTENT="31 days">
<title>Site Map : widgets.com</title>
<LINK REL=STYLESHEET TYPE="text/css" HREF="main.css" TITLE="Main css">
<body>
<center>
<a href="http://www.widgets.com"><img src="logo.jpg" width="350" height="172" alt="" border="0"></a>
</center>
<div class="content"><br>
Some content<br><br>
</div>
<div class="map">
<strong>Courses:</strong><br><br>
<a href="http://www.widgets.com/index.php?show=demo&country=USA"><strong>United States:</strong></a><br>
<a href="http://www.widgets.com/index.php?show=demo&state=AL">Alabama</a><br>
<a href="http://www.widgets.com/index.php?show=demo&state=AK">Alaska</a><br>
<a href="http://www.widgets.com/index.php?show=demo&state=AZ">Arizona</a><br>
_____________________________________________________________
Google gets to this page fine, but then it does not continue on with the links. Google also doesn't spider some of the links on my main page, even though it follows other links on my index just fine!
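For anyone wanting to see the site map links the way a simple parser would, here is a small sketch using only Python's standard library. It pulls the `href` values out of two of the lines quoted above and splits each into path and query parameters. The names `LinkCollector` and `summarize_links` are made up for this example, and this is only an illustration, not a description of how Googlebot actually processes pages.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def summarize_links(html):
    """Return a (path, query parameters) pair for each link in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    summary = []
    for href in collector.links:
        parts = urlsplit(href)
        summary.append((parts.path, parse_qs(parts.query)))
    return summary

# Two lines copied from the site map source quoted above.
sample = '''
<a href="http://www.widgets.com/index.php?show=demo&state=AL">Alabama</a><br>
<a href="http://www.widgets.com/index.php?show=demo&state=AK">Alaska</a><br>
'''

link_summary = summarize_links(sample)
for path, params in link_summary:
    print(path, params)
```

Every link resolves to the same `/index.php` path and differs only in its query string, which is the part of the setup I am wondering about.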
Any further ideas or help would be greatly appreciated!
Thanks,
Kevin
Just missed it, I guess... I will take a look at the HTML validator and check that out, but like others have suggested, maybe it is just a matter of time.
Hopefully after the next update my PR will go up a little bit, and then maybe the bot will have a little more love for me and crawl the site a little deeper. It just doesn't make much sense for it to only hit pages linked from the home page and go no further than that.
Well... we will see when the update begins. Hopefully the deep crawler will go through everything next time. If not, then you will see me on here asking a bunch more questions :)
Thanks a lot!
Kevin