
Google News Archive Forum

    
Google and PHP
Does PHP work in Google?
Kewe

10+ Year Member



 
Msg#: 25582 posted 7:06 am on Sep 3, 2004 (gmt 0)

Hi,

I was reading the forum and found this link:
[google.com...]

And I saw the last sentence:

At Google, we are able to index most types of pages and files with very few exceptions. File types we are able to index include: pdf, asp, jsp, hdml, shtml, xml, cfm, doc, xls, ppt, rtf, wks, lwp, wri, swf.

My new site has 100 PHP pages, and .php is not included in this list. Could somebody tell me more about this?

I have never used PHP before.

Kewe.

 

tantalus

10+ Year Member



 
Msg#: 25582 posted 11:21 am on Sep 3, 2004 (gmt 0)

That's interesting that PHP is omitted.

I don't use PHP; however, Google does index PHP pages. If they suddenly stopped, this is the first place we'd hear about it.

Beware of PHP sessions, though: Google and other bots cannot index these.

Receptional Andy



 
Msg#: 25582 posted 11:25 am on Sep 3, 2004 (gmt 0)

Google are only listing the more unusual filetypes in that list - they don't include html, htm or php for instance, but they certainly support all of those ;)

charlier

10+ Year Member



 
Msg#: 25582 posted 11:48 am on Sep 3, 2004 (gmt 0)

PHP, ASP, etc. are all considered to be HTML file types, as that is the type of output they send to the browser (or spider), i.e. a MIME type of text/html.

Hugene

10+ Year Member



 
Msg#: 25582 posted 2:41 pm on Sep 3, 2004 (gmt 0)

My site is all php and gets indexed no prob.

I remember reading on Google over a year ago that their crawler might have trouble with URLs containing ? (as many PHP files do in order to pass variables). But my pages with ? get indexed (not all, though).

I just looked on Google for a few minutes and couldn't find anything about the ? issue. Maybe they fixed that.

trillianjedi

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 25582 posted 2:48 pm on Sep 3, 2004 (gmt 0)

Receptional Andy points out:-

they don't include html, htm

So I think his conclusion that the list only contains the more unusual flavours of file extension is pretty accurate.....

TJ

yangtao72

10+ Year Member



 
Msg#: 25582 posted 2:57 pm on Sep 3, 2004 (gmt 0)

I'm interested in whether Google likes 'aspx'. A lot of sites will switch to .NET, and many .NET programs send a <form> to the client even when it's not really a typical form but just some .NET server-side controls.
I don't think PHP has a problem. There's an eCompare site built with PHP and MySQL that has many affiliate sites as partners. Apart from the top of each page, which differs, all the content is the same. So in addition to the pages from www.thisecommercedomain.com, tons of pages from affiliatessitesdomain.thisecommercedomain.com were also indexed by Google. When you search for a related keyword on Google, wow: excellent rankings, of course. I have tracked this site's indexed pages for years, and the number has grown from 0.4 million to 1.5 million in just the last 6 months. So you see, Google is very generous to some sites but very ? to others...
It's not a '.PHP' issue!

Airportibo

10+ Year Member



 
Msg#: 25582 posted 3:34 pm on Sep 3, 2004 (gmt 0)

Google definitely doesn't have a problem with PHP. But to be on the very, very safe side, you could just tell your webserver to use rewrite rules, so that every .php page can also be accessed as .html.
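
As a rough illustration of what such a rewrite rule does, here is a minimal Python sketch of the mapping. It is not web server configuration, only the idea behind it: a request for a .html path is served by the .php script of the same name. The function name and the example paths are made up for this sketch.

```python
import posixpath


def rewrite_html_to_php(path: str) -> str:
    """Map a requested .html path onto the .php script of the same name.

    Hypothetical helper: it only mimics the mapping a rewrite rule
    would perform and leaves every other path untouched.
    """
    root, ext = posixpath.splitext(path)
    if ext.lower() == ".html":
        return root + ".php"
    return path


if __name__ == "__main__":
    for requested in ("/products/list.html", "/img/logo.png"):
        print(requested, "->", rewrite_html_to_php(requested))
```

The visitor (or spider) only ever sees the .html address; the .php file stays where it is on the server.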

Receptional Andy



 
Msg#: 25582 posted 3:46 pm on Sep 3, 2004 (gmt 0)

I'm interested in whether Google likes 'aspx'

An easy way to check is just to see how many urls Google has containing whichever file extension you're interested in.

[google.com...]

30,000,000 results with 'aspx' - I would guess that it isn't a problem ;)

Kewe

10+ Year Member



 
Msg#: 25582 posted 7:53 am on Sep 4, 2004 (gmt 0)

Thanks for all the reactions on my Topic.
I'm new on this forum but in the last few days I've learned a lot.

I think I'll use rewrite rules, so every .php page can also be accessed as .html.
I want to test it as well, just in case.

Kewe.

Airportibo

10+ Year Member



 
Msg#: 25582 posted 9:59 pm on Sep 4, 2004 (gmt 0)

...and when you're already implementing rewrite rules, just get rid of the cgi-parameters as well (if you're using any). It'll be worth it :-)
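
A minimal Python sketch of that idea, under assumed names: the public URL carries no query string, and the server translates it back into the parameterised script URL internally. The `/page/<number>.html` pattern, the `/index.php` script and the `page` parameter are all hypothetical choices for this example.

```python
import re
from urllib.parse import urlencode

# Hypothetical public URL scheme: /page/345.html
_CLEAN_URL = re.compile(r"^/page/(\d+)\.html$")


def to_internal_url(path: str) -> str:
    """Translate a parameter-free public path into the script URL behind it.

    Paths that do not match the assumed pattern are returned unchanged.
    """
    match = _CLEAN_URL.match(path)
    if match is None:
        return path
    return "/index.php?" + urlencode({"page": match.group(1)})


if __name__ == "__main__":
    print(to_internal_url("/page/345.html"))
    print(to_internal_url("/about.html"))
```

Only the parameter-free form needs to appear in links; the query-string form stays internal to the server.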

cayleyv

10+ Year Member



 
Msg#: 25582 posted 12:30 am on Sep 5, 2004 (gmt 0)

I promise you Google can index .php. Interestingly, they seem to believe it's so common that it's not in their extension list (.html and its variants are missing too).

They also don't list .py (Python scripting), which some of their own site is written in, and they can index their own site.

vincevincevince

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 25582 posted 1:30 pm on Sep 5, 2004 (gmt 0)

As GoogleGuy said quite a while ago (I lost the link), avoid arguments with ID in the name, as these are taken to indicate sessions. Otherwise you'll be fine.

i.e.

OUT:
test.foo/?PHP_SESSID=a56df1s
test.foo/?pageid=345

IN:
test.foo/?whoami=a56df1s
test.foo/?whichpage=345

Also, from my own observations:

Avoid ?search=, and give each page a unique <title> or <h1> at minimum.
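
The OUT/IN examples above can be turned into a quick self-check. The Python sketch below lists the query parameters whose names contain "id", which is only the rule of thumb from this post and not a documented description of how Google detects sessions. The function name is made up, and the substring test is deliberately crude (it would also flag a name like "video").

```python
from urllib.parse import parse_qsl, urlsplit


def session_like_params(url: str) -> list[str]:
    """Return query parameter names that contain 'id' (case-insensitive).

    Assumption: the substring test encodes the forum rule of thumb only.
    """
    query = urlsplit(url).query
    return [
        name
        for name, _value in parse_qsl(query, keep_blank_values=True)
        if "id" in name.lower()
    ]


if __name__ == "__main__":
    for url in (
        "test.foo/?PHP_SESSID=a56df1s",
        "test.foo/?pageid=345",
        "test.foo/?whoami=a56df1s",
        "test.foo/?whichpage=345",
    ):
        print(url, session_like_params(url))
```

An empty list means the URL passes this particular check; it says nothing about other crawling issues.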

ciml

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 25582 posted 1:55 pm on Sep 5, 2004 (gmt 0)

A while ago Google changed behaviour; it will now include unknown extensions. So as well as new-ish common extensions such as .aspx, you can have .foo, .bar or whatever.

Also, you can have something/index.php, index.shtml, index.asp, etc, which can be different from something/

You can have something/Index.html but not something/index.html, something/index.HTML but not something/index.htm

rainborick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 25582 posted 2:52 pm on Sep 5, 2004 (gmt 0)

As long as the page is sent back to Google with the header information "Content-Type: text/html", Google will treat the page as HTML and index it normally. .php, .asp, .cgi, .jsp, etc. all typically generate this header so that browsers (and search engines) will accept the output regardless of the filename extension.
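
To make the point concrete, here is a small Python sketch of the check described above: it looks only at the Content-Type header value and ignores the filename extension entirely. The function name is hypothetical, and this is a simplification of what any real crawler does.

```python
def is_html_response(content_type: str) -> bool:
    """Return True when a Content-Type header value declares text/html.

    Parameters such as '; charset=...' are ignored, and the media type
    is compared case-insensitively.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


if __name__ == "__main__":
    # The same header could come from page.php, page.asp or page.jsp.
    print(is_html_response("text/html; charset=ISO-8859-1"))
    print(is_html_response("application/pdf"))
```

Whether the URL ends in .php, .asp, .cgi or .jsp never enters into it; only the header does.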

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved