|Help! Cloaking URLs violates google's policy?|
Recently banned from Google - could it be those cloaked URLs on my site?
I'm the webmaster of my own site and have limited knowledge and experience. My website was listed in the top 10 search results for years. It got banned from Google last week. Here is what Google emailed me:
"Your page has been blocked from our index because it does not meet the quality standards necessary to assign accurate PageRank. We cannot comment on the individual reasons your page was removed. However, certain actions such as cloaking, writing text in such a way that it can be seen by search engines but not by users, or setting up pages/links with the sole purpose of fooling search engines may result in permanent removal from our index. Please read our webmaster guidelines at [google.com...] for more information."
I am affiliated with several merchants, and I have been cloaking some of these merchants' links to protect my interests. Could these cloaked links have offended Google and caused my site to be banned?
To prevent Google from indexing my cloaked files, I have just started using a robots.txt file with the following directives:
User-agent: *
Disallow: /a/
(/a/ is the directory where I have all the cloaked files). Will this robots.txt file satisfy Google's policy?
Someone wrote that the cloaking of URLs on my site is not the same as the search engine cloaking people discuss on the message board and that SE cloaking is basically presenting Googlebot with a different page than other users will see.
I have also removed all hidden dots (which I was using to position my images and text) and I have removed 5 links to friend's sites I had placed at the bottom of each of my files. I deleted all keywords I had at the bottom of my index file.
I don't know what else I can do to please google. What are my chances that google will re-index my site?
Your feedback would be GREATLY appreciated.
|5 links to friend's sites I had placed at the bottom of each of my files. I deleted all keywords I had at the bottom of my index file. |
Examine all links on your site. Remove all that do not directly serve the needs of visitors who come to your site by searching for its subject matter; i.e., links should be on-topic.
Robots.txt should contain:
User-agent: *
Disallow: /a/
<blank line>
Check it using this robots.txt checking utility [searchengineworld.com].
One "urban myth" that comes up a lot is that Google has a filter to detect any and all kinds of spamming practices. Some filters are technologically difficult, but exist nonetheless, in the form of your competitors doing a view-page-source and reporting your site to Google for spamming. Make sure you can pass this test.
Will Googlebot come back? Check your log files. In many cases, sites which have been dropped still get visits from the 'bot. If the bot doesn't come back at all, it's time for a re-design, a new host, and a new domain name. You might want to do a search here on WebmasterWorld for "dropped from google", "banned from google" and similar phrases - I'm afraid it happens quite often, and brings a lot of new members here.
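For anyone who wants to do the log check above with a script, here is a minimal Python sketch. It assumes the access log is plain text (or gzip-compressed) with the user-agent string on each line, as in the common "combined" log format. The sample lines, IP addresses and the `open_log` helper are made up for illustration. Keep in mind that a user-agent string can be faked, so a match only suggests a Googlebot visit.

```python
import gzip

def open_log(path):
    """Open a plain-text or gzip-compressed (.gz) log file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")

def count_googlebot_hits(lines):
    """Count log lines that mention Googlebot anywhere (case-insensitive)."""
    return sum(1 for line in lines if "googlebot" in line.lower())

# Hypothetical sample lines; the addresses and dates are invented.
SAMPLE_LINES = [
    '192.0.2.1 - - [01/Jan/2003:10:00:00 +0000] "GET /index.html HTTP/1.0" '
    '200 2326 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.2 - - [01/Jan/2003:10:05:00 +0000] "GET /index.html HTTP/1.0" '
    '200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

print(count_googlebot_hits(SAMPLE_LINES))
```

To run it against a real file, something like `with open_log("your_log_file.gz") as f: print(count_googlebot_hits(f))` should work, where the file name is a placeholder for your own log.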
Best of luck, and I hope this helps,
...Couple of things I forgot:
Cloaking is feeding a search engine spider something materially different from what a visitor gets in order to fool it into ranking your page higher than it would otherwise be ranked. Redirecting off-site links in order to count them or for other reasons is not necessarily cloaking.
There have been several reported cases of redemption - even for banned sites - if the webmaster made a mistake because he/she didn't know any better. The formula seems to be: Clean up your site, email Google with a confession and a statement that your site has been cleaned up, ask to be reinstated, and hope for the best. The folks at the help desk would appreciate it if you make sure you're really banned before asking for help, though. You may want to read around here for the many other reasons a site can get dropped: Server outages, GoogleGlitch, etc.
Thank you so much for your feedback Jim and for giving me hope that google may reindex my site.
The Robots.txt validator is very useful and detected no error in "user-agent: * Disallow: /a/"; however, it detected an error when I added the <blank line>.
I see that you also replied to my other post at [webmasterworld.com...] - I don't understand your message "Watch your capitalization."
I still don't know if this command: "user-agent: * Disallow: /a/" will ban robots only from indexing the "a" directory while still allowing them to index the rest of my site.
I'm trying to be careful with everything I'm doing. I've been burned too many times in the past.
Very grateful for your help.
Use the code as shown above in msg#2 with exact capitalization (i.e. "User-agent", not "user-agent"), followed by an actual blank line, not the literal text "<blank line>". It will disallow all spiders from accessing any file or subdirectory under yourdomain.com/a/ and will not disallow any robots from any other directories or files.
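If you would like to double-check that behaviour yourself, here is a small sketch using the robots.txt parser in Python's standard library (`urllib.robotparser`). It only shows how that one parser reads the two lines; it is not how Google itself processes the file. The page paths and the `allowed` helper are hypothetical, and yourdomain.com is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The two directives discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /a/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(path, agent="Googlebot"):
    """Return True if this parser lets `agent` fetch `path` on the site."""
    return parser.can_fetch(agent, "http://yourdomain.com" + path)

# Hypothetical paths: one inside /a/, the others elsewhere on the site.
for path in ["/a/merchant1.html", "/index.html", "/about/contact.html"]:
    print(path, allowed(path))
```

Under this parser, anything whose path starts with /a/ is refused and everything else is allowed, which matches the description above.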
If you have entered a blank line following "Disallow: /a/" and you are still getting errors from the validator, then look to your text-editor. Use the simplest one you have, like Windows NotePad. Some robots, but not Google, require a blank line at the end of the file.
I got it now. Indeed I didn't understand and I had literally added "<blank line>". I checked with the validator which verified that the code is correct. You've been a terrific help Jim. I'm deeply grateful.
I have downloaded my log file access_log-19.gz, but it looks like it may be a zipped file. I thought Windows XP came with an unzipping program, but I don't think I have one. Can you recommend a good unzipping program that I can download (preferably free)?
If Google decides to re-index my site, do you know whether it will keep my site unranked or give me my former ranking back?
One more thing: can you recommend a site map builder that is not too complicated to use and not too expensive or even free?