shaunm - 10:17 am on Sep 3, 2012 (gmt 0)
I am still confused about these two terms: 'crawl' and 'index'.
This is my understanding of the two. If I am wrong, please point me in the right direction.
CRAWL - Spiders/bots visit a website/webpage and scan it for content and links. This scanning process is called crawling, am I right?
INDEX - Once done with the scanning process, they store the content in their databases so it can be displayed in the search results later. This process is indexing, right?
My questions are:
1. If I block a page as 'do not crawl', how can the spiders still index it? If they don't crawl a page, how can they index it? Crawling is the very first step to indexing, right?
2. Do the SE spiders actually care about what is in robots.txt? :(
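To make the 'do not crawl' part concrete, here is a rough sketch of the check a well-behaved spider could run before fetching a page, using Python's standard urllib.robotparser module. The robots.txt rules, the "ExampleBot" user agent, the example.com URLs and the allowed_to_crawl helper are all made up for illustration. It only shows the crawl-side decision and says nothing about how any particular search engine decides what goes into its index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this example):
# every bot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_to_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/private/page.html",
        "http://example.com/public/page.html",
    ):
        verdict = "may crawl" if allowed_to_crawl(ROBOTS_TXT, "ExampleBot", url) else "blocked"
        print(url, "->", verdict)
```

Note that the check is voluntary on the bot's side: robots.txt is only a request, and the page itself is still reachable by anything that chooses to ignore it.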