Forum Moderators: Robert Charlton & goodroi
Can you please help me out? I have a website on which I set up a 301 redirect from non-www to www, so anyone who opens abc.com is redirected to www.abc.com, because the site was not being cached under the www version. I made the change a month and a half ago, but the site is still not cached, either with or without www.
Please tell me how many more days it will take for Google to cache this site, or what I should do to get it cached? Link building and article submission work is going on at a decent pace.
Looking for your valuable feedback.
Thanks,
SandySEO
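For reference, the usual way to force the www version with a 301 on Apache is an .htaccess rule like the sketch below (a minimal sketch assuming Apache with mod_rewrite enabled; example.com stands in for the real domain):

```apache
RewriteEngine On
# If the host is the bare domain, permanently redirect to the www version
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

On an IIS/ASP.NET host the same redirect would instead be done with ISAPI_Rewrite or in application code, but the principle is identical: one permanent redirect from the non-www host to the www host.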
The first situation is something you cannot influence, as far as I know. Google stores a cached version of your URL whenever its processing schedule allows for it. The cache is apparently not stored in the same "place" in their immense server farm as the ranking data, so you can get traffic aplenty even without a cache link showing in the search results. Just make sure there are no technical errors anywhere along the line, and then it's up to Google.
No, the site is indexed in Google (checked with the command site:www.abc.com), but not cached. Moreover, a few pages are showing as supplemental results even though I deleted all of those pages from the live server 4 months ago; Google is still showing them. Earlier the pages were in HTML, and after the redesign they are in .aspx. Now the old HTML pages are also indexed as supplemental results.
Please help me understand the entire concept.
Thanks,
SandySEO
Even when pages are deleted, they stay in the supplemental index for a long, long time.
>>Earlier the pages were in HTML, and after the redesign they are in .aspx. Now the HTML pages are also indexed as supplemental results.
Now this is a different story and a different issue. All the new .aspx pages are actually duplicates of the old HTML pages, then. The HTML pages may be deleted, but they'll still hang around, so the engines have both. Have you considered 301 redirecting the old HTML pages to the new .aspx pages? Have you looked into ISAPI_Rewrite?
[edited by: Marcia at 9:54 am (utc) on Feb. 19, 2007]
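If ISAPI_Rewrite is available on the server, the old HTML URLs can be permanently redirected to their .aspx counterparts. A minimal httpd.ini sketch, assuming ISAPI_Rewrite 2.x and a one-to-one filename mapping (example.com is a placeholder for the real domain):

```ini
[ISAPI_Rewrite]
# Permanently redirect (301) every old .html URL to its .aspx equivalent
RewriteRule /(.*)\.html http\://www\.example\.com/$1.aspx [I,RP]
```

The I flag makes the match case-insensitive and RP issues a permanent (301) redirect; check the syntax against the ISAPI_Rewrite version actually installed, since it differs between releases.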
Regarding the robots.txt file I have the following text in my file:
User-agent: *
Disallow: /source/admin/
Disallow: /Database
Disallow: /Images
Disallow: /Includes
User-agent: asterias
Disallow: /
User-agent: BackDoorBot/1.0
Disallow: /
User-agent: Black Hole
Disallow: /
User-agent: BlowFish/1.0
Disallow: /
User-agent: BotALot
Disallow: /
User-agent: BuiltBotTough
Disallow: /
User-agent: Bullseye/1.0
Disallow: /
User-agent: BunnySlippers
Disallow: /
User-agent: Cegbfeieh
Disallow: /
User-agent: CheeseBot
Disallow: /
User-agent: CherryPicker
Disallow: /
User-agent: CherryPickerElite/1.0
Disallow: /
User-agent: CherryPickerSE/1.0
Disallow: /
User-agent: CopyRightCheck
Disallow: /
User-agent: cosmos
Disallow: /
User-agent: Crescent
Disallow: /
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /
User-agent: DittoSpyder
Disallow: /
User-agent: EmailCollector
Disallow: /
User-agent: EmailSiphon
Disallow: /
User-agent: EmailWolf
Disallow: /
User-agent: EroCrawler
Disallow: /
User-agent: ExtractorPro
Disallow: /
User-agent: Foobot
Disallow: /
User-agent: Harvest/1.5
Disallow: /
User-agent: hloader
Disallow: /
User-agent: httplib
Disallow: /
User-agent: humanlinks
Disallow: /
User-agent: InfoNaviRobot
Disallow: /
User-agent: JennyBot
Disallow: /
User-agent: Kenjin Spider
Disallow: /
User-agent: Keyword Density/0.9
Disallow: /
User-agent: LexiBot
Disallow: /
User-agent: libWeb/clsHTTP
Disallow: /
User-agent: LinkextractorPro
Disallow: /
User-agent: LinkScan/8.1a Unix
Disallow: /
User-agent: LinkWalker
Disallow: /
User-agent: LNSpiderguy
Disallow: /
User-agent: lwp-trivial
Disallow: /
User-agent: lwp-trivial/1.34
Disallow: /
User-agent: Mata Hari
Disallow: /
User-agent: Microsoft URL Control - 5.01.4511
Disallow: /
User-agent: Microsoft URL Control - 6.00.8169
Disallow: /
User-agent: MIIxpc
Disallow: /
User-agent: MIIxpc/4.2
Disallow: /
User-agent: Mister PiX
Disallow: /
User-agent: moget
Disallow: /
User-agent: moget/2.1
Disallow: /
User-agent: mozilla/4
Disallow: /
User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 98)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows XP)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 2000)
Disallow: /
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows ME)
Disallow: /
User-agent: mozilla/5
Disallow: /
User-agent: NetAnts
Disallow: /
User-agent: NICErsPRO
Disallow: /
User-agent: Offline Explorer
Disallow: /
User-agent: Openfind
Disallow: /
User-agent: Openfind data gathere
Disallow: /
User-agent: ProPowerBot/2.14
Disallow: /
User-agent: ProWebWalker
Disallow: /
User-agent: QueryN Metasearch
Disallow: /
User-agent: RepoMonkey
Disallow: /
User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /
User-agent: RMA
Disallow: /
User-agent: SiteSnagger
Disallow: /
User-agent: SpankBot
Disallow: /
User-agent: spanner
Disallow: /
User-agent: suzuran
Disallow: /
User-agent: Szukacz/1.4
Disallow: /
User-agent: Teleport
Disallow: /
User-agent: TeleportPro
Disallow: /
User-agent: Telesoft
Disallow: /
User-agent: The Intraformant
Disallow: /
User-agent: TheNomad
Disallow: /
User-agent: TightTwatBot
Disallow: /
User-agent: Titan
Disallow: /
User-agent: toCrawl/UrlDispatcher
Disallow: /
User-agent: True_Robot
Disallow: /
User-agent: True_Robot/1.0
Disallow: /
User-agent: turingos
Disallow: /
User-agent: URLy Warning
Disallow: /
User-agent: VCI
Disallow: /
User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /
User-agent: Web Image Collector
Disallow: /
User-agent: WebAuto
Disallow: /
User-agent: WebBandit
Disallow: /
User-agent: WebBandit/3.50
Disallow: /
User-agent: WebCopier
Disallow: /
User-agent: WebEnhancer
Disallow: /
User-agent: WebmasterWorldForumBot
Disallow: /
User-agent: WebSauger
Disallow: /
User-agent: Website Quester
Disallow: /
User-agent: Webster Pro
Disallow: /
User-agent: WebStripper
Disallow: /
User-agent: WebZip
Disallow: /
User-agent: WebZip/4.0
Disallow: /
User-agent: Wget
Disallow: /
User-agent: Wget/1.5.3
Disallow: /
User-agent: Wget/1.6
Disallow: /
User-agent: WWW-Collector-E
Disallow: /
User-agent: Xenu's
Disallow: /
User-agent: Xenu's Link Sleuth 1.1c
Disallow: /
User-agent: Zeus
Disallow: /
User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /
Please tell me, is this fine or does anything need to be updated?
Thanks,
SandySEO
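As an aside, you can sanity-check a robots.txt file offline with Python's standard urllib.robotparser, which applies the same group-matching rules most well-behaved crawlers use. A small sketch using a fragment of the file above (example.com is a placeholder domain):

```python
from urllib.robotparser import RobotFileParser

# A fragment of the robots.txt above, parsed offline (no network needed)
rules = """\
User-agent: *
Disallow: /source/admin/
Disallow: /Database
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot has no specific group in this fragment, so the * group applies
print(rp.can_fetch("Googlebot", "http://www.example.com/source/admin/users"))  # False
print(rp.can_fetch("Googlebot", "http://www.example.com/index.html"))          # True
```

This only verifies the syntax and matching logic; it cannot tell you whether blocking a given bot is a good idea, and note that bad bots ignore robots.txt entirely.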
Moreover, a few pages are showing as supplemental results, but I deleted all those pages from the live server 4 months ago, and Google is still showing them.
Since you have deleted the pages from the site, make sure the server returns a 404 header for them. If the server returns 404 but the pages are still in Google's index, you can use the Google URL Removal Tool to remove only the deleted pages, but you have to be very careful with that tool.
As long as your server sends an HTTP status code of 404 for those URLs, there is nothing else you need to do regarding Google indexing. Make sure that visitors are served a custom error page with helpful navigation to the major sections of your site.
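Once you have collected the status codes for the deleted URLs (with a header checker or crawler), a quick triage like the sketch below tells you which ones still need fixing. The URLs and status codes here are hypothetical examples:

```python
def pages_needing_attention(results):
    """Given (url, status_code) pairs for pages you have deleted,
    return the URLs that do not answer 404 or 410 and so will
    linger in the search index."""
    return [url for url, status in results if status not in (404, 410)]

# Hypothetical crawl results for pages deleted from the server
crawl = [
    ("http://www.example.com/old-page.html", 200),  # still served: fix this
    ("http://www.example.com/deleted.html", 404),   # correct: will drop out
    ("http://www.example.com/temp.html", 302),      # temporary redirect: fix this
]
print(pages_needing_attention(crawl))
# prints ['http://www.example.com/old-page.html', 'http://www.example.com/temp.html']
```

A 301 to the replacement page is also an acceptable alternative to a 404 when a direct successor exists, as discussed earlier in the thread.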