feeder, since when have you been seeing that? I have a site that has been public for 10 days where I see the same. Some other domains get about 100 GB hits/day, which is also only a fraction of the normal crawling.
It may be worth trying to navigate the site using a browser like lynx.
Is your index page returning a 304 status (not modified since)? I don't think Google will go any deeper if it receives this status.
Is your site new, or did you recently move to another host?
No to both.
Gbot would come on a regular basis to my index page, receive a 304 status and then go away again like clockwork.
Added a new link and some minor changes to my index and gbot crawled but only one level deep.
Waiting for G to crawl the rest of the pages the next level down.
It's normal for it to grab the index one day and then usually return a week or two later and spider everything.
If it's been going on longer than a month, then you can be sure there's something wrong.
I'm no expert, but this has been par for the course in my logs for a year.
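For anyone wondering where the 304 comes from: it is the server's answer to a conditional request, where the client sends an `If-Modified-Since` header and the server compares it with the page's last-modified time. Below is a minimal sketch of that comparison, not how any particular server implements it. The function name `conditional_status` and the dates are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the page is unchanged since the header date, else 200.

    last_modified must be a timezone-aware datetime.
    """
    if not if_modified_since:
        return 200  # plain request, no condition attached
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: just serve the full page
    if since.tzinfo is None:
        # Treat a date without a usable zone as UTC (assumption for this sketch)
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

So if the index page has not been touched since the bot's last visit, it gets a 304 with no body. Editing the index page moves its last-modified time forward, and the next conditional request gets a full 200 response with the new links in it.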
I also have a site. Googlebot visited it and fetched about 80 pages, then stopped deep crawling.
Now every day Googlebot just goes to the homepage and then leaves.
I cannot understand why Googlebot doesn't continue to crawl the remaining hundreds of pages......
do you have a sitemap page linked to your main page? if not, i'd do it... and include a link to it from every page, too...
As I say, other spiders crawl it fine. It isn't a link issue. Googlebot visits two or three times a day, grabs the first page, leaves.
Yes, the site is new.
You don't say what PR your index page has. If it's not very much that will discourage Google from going any deeper.
As tantalus said, the other problem could be if your index page isn't modified. My solution is to add a link from the index page to any new pages. The next time Google visits it sees the index page has been modified and usually within a few hours has come back to take the new pages. I remove the links when the toolbar PR for the new pages show white.
I usually get about 70% of my pages indexed every week, but I don't know whether this is because of my PR or because Google perceives the site as active.
I have many sites. I've never had any problems getting them deep crawled.
This is the first time I've seen this type of behavior. I wondered if it is something other people are seeing, or if it's just me.
Googlebot crawls higher PR pages with a greater frequency than lower PR pages. If your new site has an index page PR of 4, and inner pages that are less than that, then the bot won't hit the inner pages very often.
On a personal note: I got back to the internet several days ago after having been in places where digital, at best, means counting on your fingers. I missed Austin entirely... it didn't seem to make a lot of difference for us, but we have serps for some minor kw combos bouncing in and out of the serps with every other search. I can either dig into the WW archives for Austin, to figure this out, or just assume that Google has become slightly schizophrenic, (not that there's anything wrong with that). Anyone who feels like giving me a quick run-down on what happened gets a free underground tour in Jamaica.
I launched two sites before Austin that were deep crawled and ranked. I launched one that was "due" to be listed/ranked right around Austin, but Google won't touch it.
The previous sites are not being recrawled or updated either. It's driving me crazy, but you're not alone.
What is your site's PR?
You are not alone - a new site I launched about 8 days ago is seeing the same thing - Index page spidered and indexed for over a week, but no additional pages spidered or added... My site includes a site map, linked to from all pages, and uses only basic text links to most pages off the index as well. It is a new site, about 60 pages so far, with just index.htm listed in Google. And I am getting a fresh date for the index as well - last visit according to Google was the 19th.
Mine has been crawled by Google every day or two, but only the index page. Even when I changed other pages, Google still crawled only my index page. Also in the SERPs, I had 1st and 5th position, but over the last 4 updates it has gone. It doesn't even appear until page 6.
I'm seeing this also. It visits once a day, picks up robots.txt and the index and then leaves. This is a relatively new thing (two or three weeks; I haven't had time to plough back through my logs) and coincides with the sites in question being dropped from the SERPs.
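If you want to plough through the logs without doing it by eye, something like this would tally what Googlebot actually fetched. It is a rough sketch that assumes Apache-style combined log format and simply looks for "Googlebot" in the user-agent string (which can be spoofed). The names `LOG_LINE` and `googlebot_hits` are made up for this example.

```python
import re
from collections import Counter

# Combined log format: ip, ident, user, [time], "request", status, bytes,
# "referer", "user-agent"
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+ '
    r'"[^"]*" "([^"]*)"'
)


def googlebot_hits(lines):
    """Count Googlebot requests per (path, status) pair."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group(6):
            path, status = match.group(4), match.group(5)
            hits[(path, status)] += 1
    return hits
```

Feed it the lines of an access log, for example `googlebot_hits(open("access.log"))` with whatever your log file is called. If the only keys that come back are `/robots.txt` and `/`, you are seeing the same pattern described in this thread.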
I worried for a while that this may be caused by a poison word. I have a folder called redirects with pages that do a meta refresh to an outside page. Perhaps redirects is a poison word. I disallowed this for a while in robots.txt. I've just gone and stripped this down to just one line.
And I've changed the home page and that directory name to something less obvious. I then spent much of yesterday submitting pages that have a link to this domain to google submit in the vain hope that Googlebot might follow the backlinks and think "this site's worth crawling". PR's low, only 3, but pages from this site were previously #1 for what I would call secondary three word terms.
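One thing worth ruling out when playing with robots.txt is that a rule blocks more than intended. Python's standard `urllib.robotparser` can check a rule set against a path. The rules below are a hypothetical example, not the actual file from the post above, and `allowed` is just a helper name for this sketch. Python's parser may also not match Googlebot's own interpretation in every edge case.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, path, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical rules: block one folder, leave everything else open
rules = "User-agent: *\nDisallow: /redirects/\n"
print(allowed(rules, "/redirects/out.html"))
print(allowed(rules, "/index.htm"))
```

With these example rules the first path is blocked and the second is allowed, so a disallow on one folder should not by itself stop the rest of the site being crawled.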
Has anybody had any luck getting crawled?
I created a new site almost a month ago and linked to it from a few PR4 sites. Same story, index page gets visited every day and appears in the index. Google hasn't crawled any deeper though.
|feeder - What is your site's PR? |
I'm not sure how that's relevant. New sites don't show PR, but that doesn't stop them getting crawled.
The site has strong inbound linking.
Update: the site has been crawled, and pages included in the index. Googlebot's behaviour hasn't changed, however. It arrives, grabs the index page, leaves. Once in a blue moon it will crawl half the site.
I've noticed the very same thing with sites from PR4-PR5, new and old - but mostly new. Google grabs the robots and index page and then leaves. This goes on for weeks.
I've noticed this ever since the Florida update. GB seems to be much much slower at crawling whole sites these days.
i have a theory that google won't deepcrawl your website unless your sub pages have incoming links from external sources, like a different ip, different domain... only then will your page be worthwhile crawling..
do you concur?
"do you concur?"
re site PR.
|I'm not sure how that's relevant. |
|i have a theory that google won't deepcrawl your website unless your sub pages have incoming links from external sources |
My take on this question is that these are two aspects of the same issue. PR is very relevant.
The number of pages Google crawls is probably proportional to the PR value of the index page. If the index page is PR0 because Google had not yet determined its appropriate value, then there is little chance of being crawled.
Even when the index page is more than PR0, as Google penetrates deeper the PR decreases and at a certain point Google stops crawling. The presence of deep links boosts the PR for those pages and so Google continues crawling. However once the index page becomes a reasonable value the presence of deep links is not so necessary, at least as far as crawling goes.
I suspect another factor is that if Google does not discover any changed or new pages part way through its crawl then it abandons the crawl.
All IMHO, and I disclaim any responsibility for being wrong. :)
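To make the theory above concrete, here is a toy model of it. This is purely an illustration of the idea that a score shrinks with each level and the crawl stops below a cut-off. It is not how Google works, and the `decay` and `threshold` numbers, the function name `crawl_with_decay` and the sample link graph are all invented.

```python
from collections import deque


def crawl_with_decay(links, start, start_score, decay=0.5, threshold=1.0):
    """Breadth-first crawl that stops descending once the score passed
    down to the next level would fall below `threshold`."""
    scores = {start: start_score}
    queue = deque([start])
    crawled = []
    while queue:
        page = queue.popleft()
        crawled.append(page)
        child_score = scores[page] * decay
        if child_score < threshold:
            continue  # links on this page are not followed
        for child in links.get(page, []):
            if child not in scores:
                scores[child] = child_score
                queue.append(child)
    return crawled


# Invented three-level site
site = {"index": ["a", "b"], "a": ["a1"], "b": ["b1"], "a1": ["a2"]}
print(crawl_with_decay(site, "index", start_score=4))
print(crawl_with_decay(site, "index", start_score=1))
```

In this toy model a start score of 4 reaches two levels down and never gets to `a2`, while a start score of 1 only fetches the index page, which is the "grabs the index, leaves" pattern.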
You're wrong :)
One link can get a new site crawled, no problem.
|One link can get a new site crawled, no problem |
How many pages are we talking about here? 10? 100? 1000? And does Google return?
Hey, who cares about Google! I've just made Preferred Member. So who's going to offer me a beer?
feeder, if you guys even cared to read through my post properly you would see that I said an EXTERNAL link... without the external link your page is not worth viewing because no-one else is voting for your page...
do you concur now?
oh and feeder, just how rude are you? I'm WRONG? have some respect for other people...
|feeder, if you guys even cared to read through my post properly you would see that I said an EXTERNAL link... without the external link your page is not worth viewing because no-one else is voting for your page...do you concur now? |
I've been a full-time SEM since 2001, I know what a link is and how Google crawls. My question relates to a recent CHANGE in crawl activity. As I said in my posts, the linking structures of the site in question, both internal and external, are strong.
|oh and feeder, just how rude are you? I'm WRONG? have some respect for other people... |
I wasn't talking to you, I was talking to Harry :)
I was pointing out that I thought he was wrong in this instance. I know this because I have access to countless clients' site logs that demonstrate otherwise. I can, and do, get sites crawled with one inbound link to an index page.