The MediaBot crawls deeper into the site without issue. The site runs AdSense.
Could there be anything in the server config that is causing this? It isn't robots.txt. The index page is lo-fi and xenu crawls it fine, as does the searchengineworld sim spider.
Any ideas?
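One way to double-check the robots.txt side is to feed the rules to Python's standard `urllib.robotparser` and ask whether Googlebot may fetch a deep URL. This is only a rough sketch: the rules and the example.com URLs below are made-up placeholders, not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser


def allowed_for(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only.
SAMPLE_RULES = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "http://www.example.com/gallery/page1.html",
        "http://www.example.com/private/notes.html",
    ):
        print(url, allowed_for(SAMPLE_RULES, "Googlebot", url))
```

If the deep pages come back as allowed, robots.txt is probably not what is keeping the bot on the index page.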
Oops, I forgot why I came here. She just grabbed all the gallery pages she skipped before. I guess she just had to go back for more PR. I'm pretty stoked as my site just grew by a factor of 4 since yesterday in Google's eyes.
Index new pages that contain AdSense immediately. That's right, send mediabot out right away and index the page right away. BINGO! Puts those pages to work right away. Google begins earning revenues and the website begins earning revenues too, RIGHT AWAY.
To get their new pages indexed right away, webmasters would have a choice of paying Yahoo or putting AdSense on their new pages and receiving revenue from Google. Which do you think they will choose?
Index new pages that contain AdSense immediately. That's right, send mediabot out right away and index the page right away. BINGO! Puts those pages to work right away. Google begins earning revenues and the website begins earning revenues too, RIGHT AWAY.
Do you think they haven't considered this?
Google has consistently tried to maintain a barrier between their revenue streams and SERPs. It's an admirable goal and tends to squelch accusations of impropriety. Why would they change that policy now?
I then remembered I launched a hobby site about a month ago. It's the same story with googlebot, lots of homepage visits.
As a matter of fact I'm getting more traffic from dmoz and AV to the site than google. It's pretty sad that AV is outperforming google.
I launched a hobby site about a month ago. It's the same story with googlebot, lots of homepage visits
You don't say what PR Google has given the site, if any, or what the value of the inbound links is.
IMHO Google appears to have triggers that decide whether a site is worth crawling, and to what extent. I think that if Google has followed a link from a high PR page to get to a new site, it awards a temporary highish PR to the page it found on the new site (typically the home page). This triggers deeper crawls, and eventually Google awards a more permanent PR based on pages and inbound links found - which could be greater or less.
If it has followed a low PR link the temporary PR awarded may not be sufficient to trigger deeper crawling. Increasing the PR depends upon Google following other inbound links, but this depends on how often Google crawls the linking pages.
This also applies in reverse. If the high value inbound links start disappearing, the PR reduces, crawling reduces, and if the PR becomes low enough, the site starts to die.
I'm not sure if the above scenario is absolutely accurate, but common sense says that Google must have some mechanism in place to determine what sites are important enough to be worth crawling immediately, and those that are of less priority. It's a big web out there, and getting bigger.
It appears that google has removed a number of crawlers from service and has had to prioritize the crawls because it has fewer resources. It seems like new sites are getting ignored except for the home page. Existing sites are still getting crawled based on their PR.
I suspect that Google is getting ready to pull the trigger on a revamped crawler. There are a few threads about a test crawler crawling js files.
And who knows what else is in store? Maybe they will now be able to follow cgi/php redirects.
It should be interesting.
It would be interesting to know if other sites with the same problem are being crawled now.
to reach some articles you must click 10 times
I would suggest that could be a real problem for users, who typically have the attention span of a gnat. The golden rule is not more than 2 or 3 clicks to get anywhere. If the average user hasn't found what he wants by then, he's gone.
Your idea of adding a sitemap is a good one. Then as far as Google is concerned each page is only 2 clicks deep. This should help with crawling and pushing PR deeper into the site.
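To put rough numbers on the click-depth point, here is a small sketch that measures how many clicks each page is from the home page, with and without a sitemap linked from it. The link graph is invented for illustration (a chain where the last article is 10 clicks deep), and the page names and the `with_sitemap` helper are assumptions, not anything from the actual site.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def with_sitemap(links, home, sitemap="sitemap"):
    """Return a copy of the graph where home links to a sitemap listing every page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    updated = {page: list(targets) for page, targets in links.items()}
    updated[home] = updated.get(home, []) + [sitemap]
    updated[sitemap] = sorted(pages - {sitemap})
    return updated


# Hypothetical site: home -> p1 -> p2 -> ... -> p10
PAGES = ["home"] + [f"p{i}" for i in range(1, 11)]
CHAIN = {PAGES[i]: [PAGES[i + 1]] for i in range(10)}

if __name__ == "__main__":
    before = click_depths(CHAIN, "home")
    after = click_depths(with_sitemap(CHAIN, "home"), "home")
    print("deepest page before sitemap:", max(before.values()))
    print("deepest page after sitemap:", max(after.values()))
```

In this toy graph the deepest page goes from 10 clicks to 2 once the sitemap is linked from the home page, which is the effect described above.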
This is my first post and I would like to say hi to everyone.
I have built a small static site. Submitted it to google about 3 months ago. Took a couple of weeks to visit the first time and has then returned on the same day every month since - no deep crawl - just gets the robots.txt file and index page and leaves.
I have added a sitemap today hoping it might make a difference.
Hoping google comes 'a calling' soon for the real deal.
Adam
When I checked my logs this morning I noticed that googlebot came twice within 3 hours.
I just checked again now and I am finally getting a deeper crawl. We'll see how many pages get crawled.
These two sites have been waiting for a crawl since early February. To tell the truth I was being a little stubborn about trying some of the suggestions to encourage a deeper crawl because there was nothing different about these sites that I haven't done in the past.
Is that a good idea? I did this so there's no cross-linking and no multiple pages of the same content with different URLs. Let me know, guys.
Don't try too hard with this. It seems that Google has developed a major problem in crawling sites at the moment. No one from Google has actually said so but there are so many people affected it is hard to conclude otherwise.
My own site has not been crawled properly for weeks and despite numerous pleas to Google they have done nothing about it. I think my symptoms are similar to most of the others. For example, this morning Googlebot came along, looked at my robots.txt and index page, then left. Despite these random, brief visits I still have no PR, title, cache or descriptions on my pages.
No one knows what is going on and Google have not been tempted to comment so just don't bust a gut on this one.
"GET /robots.txt HTTP/1.0" 200 23 www. mydomain.co.uk "-" "Googlebot/2.1
"GET / HTTP/1.0" 200 44320 www. mydomain.co.uk "-" "Googlebot/2.1
I don't know a lot about how to interpret these results. Does this second entry (with no files specified) signify a deep crawl or whatever?
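For what it's worth, `GET /` is simply a request for the site's root (home) page, so two entries like these show robots.txt and the index page being fetched and nothing deeper. A rough sketch of pulling the requested paths out of such lines follows; the regular expression only assumes the quoted request, status and size fields visible in the two lines above, and the function names are made up for the example.

```python
import re

# Matches the quoted request plus status code and byte count, e.g.
# "GET /robots.txt HTTP/1.0" 200 23
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+)'
)


def googlebot_paths(log_lines):
    """Collect the paths requested on lines that mention Googlebot."""
    paths = []
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = REQUEST.search(line)
        if match:
            paths.append(match.group("path"))
    return paths


def looks_like_deep_crawl(paths):
    """True if anything beyond robots.txt and the home page was requested."""
    return any(path not in ("/", "/robots.txt") for path in paths)


# The two log lines quoted in the post, copied as-is.
SAMPLE = [
    '"GET /robots.txt HTTP/1.0" 200 23 www. mydomain.co.uk "-" "Googlebot/2.1',
    '"GET / HTTP/1.0" 200 44320 www. mydomain.co.uk "-" "Googlebot/2.1',
]

if __name__ == "__main__":
    paths = googlebot_paths(SAMPLE)
    print(paths)
    print("deep crawl?", looks_like_deep_crawl(paths))
```

A deep crawl would show up as many more Googlebot lines, each with a different inner page path after `GET`.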
1 Launched new site (within 6 weeks all pages were indexed in Google), appearing in SERPs OK
2 Added classified ads software (still OK, but some URLs appearing with no title or description)
3 Added click tracking via PHP/MySQL (over the next 2 months no title or description and no Google referrals)
PR also went down from PR4 to PR1 (which is why I thought it was some sort of penalty)
Like the fool I am, I decided it must be some sort of penalty and just left the domain sitting in limbo for 3 months before looking at it again
This is what I then found when I investigated and tried some stuff
Took the classified ads software off
Stopped using the PHP/MySQL click tracking
Over the next 6 weeks all pages came back into Google and reappeared in the SERPs
My own view is that it was the PHP click tracking that stopped Google indexing, but I'm not 100% sure on that
Don't know if the above is relevant to any of you guys, but these are just my own experiences
steve