1/ I am working on a beta version of my website and will upload it to a subdomain (beta.example.com) just to try it out.
I do not want robots to crawl the site. If I put a robots.txt file in the root directory (with User-agent: * and Disallow: /), will robots ignore the site even though each file has a meta tag with CONTENT="all"?
2/ Second question. Many of my webpages have tables nested within tables. Is that safe for robots when they strip out all the HTML tags?
Thank you
[edited by: tedster at 6:43 pm (utc) on Feb. 3, 2004]
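On question 1, here is a minimal sketch of how a robots.txt-compliant crawler would read the two rules quoted in the question. It uses Python's standard urllib.robotparser; the bot name "ExampleBot" and the page path /index.html are made up for illustration. A crawler that honors the Disallow never fetches the page, so it never gets to read the meta tag inside it. Robots that ignore robots.txt are outside what this shows.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the question, as they would appear in
# http://beta.example.com/robots.txt
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /"])

# "ExampleBot" and "/index.html" are hypothetical names for illustration.
blocked = not rules.can_fetch("ExampleBot", "http://beta.example.com/index.html")

# For comparison: an empty robots.txt places no restriction on the same URL.
no_rules = RobotFileParser()
no_rules.parse([])
allowed_without_rules = no_rules.can_fetch("ExampleBot", "http://beta.example.com/index.html")

print("blocked by Disallow: / ->", blocked)
print("allowed with empty robots.txt ->", allowed_without_rules)
```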
2. If your table markup is valid, the content will get indexed by most engines. Your biggest concern would be a skipped or mangled tag -- table, row or cell.
If important text is nested deeply, it used to be devalued, but it was still indexed. I haven't taken a hard look at this issue in a few years, because I avoid nesting layout tables more than two levels deep. But I'd guess that deep nesting is not the problem it once was. I'd still keep it minimal, because you just never know.
One of the issues with complex table layouts is that text that looks connected when you view the page is sometimes not connected at all in the HTML. This can hurt proximity factors on multi-word searches.
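To make the proximity point concrete, here is a rough sketch using Python's standard html.parser. The page content is invented for illustration: "Blue" sits directly above "widgets" in the left column, so a visitor reads "Blue widgets", but a parser that strips the tags sees the words in source order, with the whole right-hand cell in between. How any particular engine measures proximity is not something this shows.

```python
from html.parser import HTMLParser


class SourceOrderText(HTMLParser):
    """Collect words in the order they appear in the HTML source."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        # Text between tags; whitespace-only chunks contribute nothing.
        self.words.extend(data.split())


# Hypothetical layout table: the left column reads "Blue" / "widgets"
# top to bottom, but the source runs row by row.
page = """
<table>
  <tr><td>Blue</td><td>Free shipping on all orders this week</td></tr>
  <tr><td>widgets</td><td>Contact us for bulk pricing</td></tr>
</table>
"""

parser = SourceOrderText()
parser.feed(page)
words = parser.words

# Number of word positions separating the two visually adjacent words.
gap = words.index("widgets") - words.index("Blue")
print(words)
print("word positions between 'Blue' and 'widgets':", gap)
```

If the two words shared one cell, the gap would be 1 instead.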