Anyways, when I type into the client program a URL that has a PHP query string, such as [exampleAddress.com...], the client often reports an HTML file from the server at www.exampleAddress.com that contains a "404 index.php not found" error.
Depending on the web site, I get either that error code or a 301 Moved Permanently response. Depending on the query and the site, I might just get a form page returned from the server saying that it is set up properly. That is really weird.
Initially I wondered if I was typing the URLs correctly. By this I mean, was I taking into account the &amp; representation of the & symbol in the URL, or the %3f representation of the ? character? Anyways, that does not seem to be the problem, as I have tried all combinations. I found one website that runs off PHP queries where my client worked fine when I entered the URL without any special &amp; or %3f stuff at all.
Question 1:
When are &amp; and %3f needed? Do you do that for all the symbols, including the / that comes before the "index.php" part of the URL and the ? character? What about the period in index.php? Why is it that in various websites' source I sometimes see the plain & and sometimes the &amp; representation of the ampersand in the HREF attributes? It seems very inconsistent to me.
Question 2:
This is the more important question, and why I am making this post. Why might my client have such a tough time with the majority of websites that have php in their URLs?
Question 3:
What is the proper "Content-type:" I should designate with my GET action? Is it necessary to specify any Content-type? Precisely how should it read? Like:
Content-type: application/x-www-form-urlencoded ?
Or one of these:
application/x-httpd-php .php .php3 .html?
Which one precisely? And if it is, say, the .php choice, would I write it application/x-.php precisely?
Any ideas? I am stuck.
Thanks.
Hm, some tough questions!
I'll try 3 first: normally PHP files will give content of the type text/html. That said, this is only the default for PHP, and PHP can be made to dish out lots of different content types. But as content types go in general, it can only dish them out properly when issuing the proper server headers. So the safest thing to do would be to see what the server sends in the 'Content-Type' header, and base your client's reaction on that.
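To make the "look at the server's Content-Type" advice concrete, here is a minimal sketch in Java (the language of the client discussed in this thread). The class and method names are made up for illustration, and it assumes the client has already read the response head into a string. Note also that a plain GET carries no body, so the client normally has no Content-Type of its own to send.

```java
public class ContentTypeSketch {
    // Hypothetical helper: returns the Content-Type value from a raw HTTP
    // response head, or null if the server did not send that header.
    static String contentType(String responseHead) {
        for (String line : responseHead.split("\r\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header section
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // e.g. the status line "HTTP/1.1 200 OK"
            }
            // Header names are case-insensitive, so compare accordingly.
            String name = line.substring(0, colon).trim();
            if (name.equalsIgnoreCase("Content-Type")) {
                return line.substring(colon + 1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String head = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: text/html; charset=UTF-8\r\n"
                    + "\r\n";
        System.out.println(contentType(head));
    }
}
```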
1. HTML is just another language that has its own way of coding things and escaping things that otherwise would have special meanings. The 'standard' and safest way of putting an & in your HTML is, I believe, to use &amp;, since this is the entity for &. The & is a 'special character' in HTML and in combination with other things can mean other characters, which could confuse browsers if those other characters are found after the &. But in the address box in the browser, and inside PHP itself, the & is used alone, since it is no longer part of an HTML file or HTML coding. I'd guess that the & is not the issue.
2. I have really no idea. Is this also happening with multi-parameter URLs in jsp, cfm, or asp? Just an initial question - I don't think I'm likely to have any meaningful answers past this stage, but others might.
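For a client that copies links out of HTML source, the point above boils down to this: the &amp; form only exists inside the HTML, and the request that goes over the wire uses the bare &. A tiny sketch, with a made-up helper name, that handles only this one entity:

```java
public class HrefSketch {
    // Hypothetical helper: turns an href value as it appears in HTML source
    // into the form sent in the request line. Only the &amp; entity is
    // handled here; a real client would decode other entities as well.
    static String hrefToRequestTarget(String href) {
        return href.replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(hrefToRequestTarget("/index.php?var1=red&amp;var2=blue"));
    }
}
```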
Good luck with your browser client thingie.
Question 2:
I have no idea :(, it shouldn't. The only thing I can think of is that your client is probably not handling the response headers well.
Question 3:
As the name suggests Content-type specifies, um ... the content type. The client sends it to help the server process the information correctly. Similarly the server sends it to help the client.
E.g. If the server sends "Content-type: application/zip" then my browser (Mozilla) opens a file download box.
If you give us more details as to how your client works, we could help you better.
Saurabh.
For background,
I am not utilizing the standard Java URL class to handle the URLs.
Instead, my client program communicates with the server through a PrintStream object created from the OutputStream of the Socket object that links the client program to the server.
If I use the URL class then it works.
I would rather not use the URL class.
I see no reason why I should not be able to do all the communication over the Socket object's OutputStream via the PrintStream created from it.
The strange thing, for one website where the client program does not work, is what happens if I substitute a %3f for the ? in the example: [exampleAddress.com...]
so that rather than sending this:
GET /index.php?var1=red&var2=blue HTTP/1.1 \r\n
I send
GET /index.php%3fvar1=red&var2=blue HTTP/1.1 \r\n
Then I get the server saying:
404 Not Found
The requested URL /index.php?var1=red&var2=blue was not found on this server
whereas if I did not substitute the ? with the %3f, as in:
GET /index.php?var1=red&var2=blue HTTP/1.1 \r\n
Then it says:
404 Not Found
The requested URL /index.php was not found on this server.
Any other substitution of the = or & characters does not help matters out.
So I wonder what is up with that.
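The difference between the two 404 messages can be reproduced without a server. A literal ? is the delimiter between path and query, while %3f is just an encoded character inside the path, so the server decodes it and then looks for a file whose name is the whole string. The sketch below uses java.net.URI to split both forms; www.example.com is a placeholder host and the class name is made up.

```java
import java.net.URI;

public class QueryDelimiterSketch {
    // Splits a request target into its decoded path and query, the way a
    // server would before looking up a file. The host is a placeholder.
    static String describe(String requestTarget) {
        URI u = URI.create("http://www.example.com" + requestTarget);
        return "path=" + u.getPath() + " query=" + u.getQuery();
    }

    public static void main(String[] args) {
        // Literal '?': the path stops at index.php, the rest is the query.
        System.out.println(describe("/index.php?var1=red&var2=blue"));
        // Encoded '%3f': there is no query; everything is one long path.
        System.out.println(describe("/index.php%3fvar1=red&var2=blue"));
    }
}
```

So the %3f form is never a drop-in replacement for the ? that starts the query string; percent-encoding is for characters that appear as data inside a parameter name or value.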
______________________
But the amazing thing is in this case from above:
GET /index.php%3fvar1=red&var2=blue HTTP/1.1 \r\n
Where I get the server saying:
404 Not Found
The requested URL /index.php?var1=red&var2=blue was not found on this server
well, that is exactly the URL I wanted it to find, and the server will indeed serve up that query to a standard web browser, or if I use the URL class provided with Java!
So maybe the problem is somewhere in how the server is processing my GET command. Or maybe in how that GET looks on the output stream, either in Java or in the native implementation of Java. Or maybe in the other info I provide in the header. It seems I must at least provide with the header the line:
Host: _clientHostIpaddress__ otherwise I get a 400 error returned from the server. If I include at least that info, I then get the 404 errors coming back from the server as described above.
Thanks.
gives this as some of the output lines of the page returned back from the server:
HTTP/1.1 404 Not Found
<p>The requested URL /index.php was not found on this server.</p>
I get this type of response from more than a single site. Depending on the site, instead of a 404, it might be a 301 Moved Permanently or 302 Moved Temporarily Error.
Again, if I utilize a standard web browser, such a query obviously works, and likewise if I utilize the URL class provided with Java, it works. I really don't understand this.
The main issue is that many websites are virtually hosted. By this it is meant that multiple internet names such as www.fictitiousName1 and www.fictitiousName2 can be mapped to a single physical computer which has only a single numerical IP address, in the form number1.number2.number3.number4 where each "number" is between 0 and 255.
It seems to me that HTTP/1.1 relies on the "Host: " header to map the URL the client requests to the virtually hosted location on the server with its single physical IP address.
The mistake I made, which led to the varying symptoms described above, is that in the "Host: " portion of the header my client sent the server the IP address, which is a number.
So after my GET line, I had a line in the header such as:
Host: number1.number2.number3.number4
If my client had been written correctly, it would have sent the website name from the URL, such as www.fictitiousName1.com, instead of that name translated to its numeric IP address.
I should have had the line written as:
Host: www.fictitiousName1.com
Since my program incorrectly sent the numeric IP address, the server could not map the request to the virtually hosted website and serve the appropriate page back to the client. This explains all the varying behavior. It seems a virtually hosted website's server relies on the "Host: " header to react appropriately to a client request for a page.
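For anyone else talking HTTP over a raw Socket, here is a minimal sketch of the corrected request. The class and method names are made up, port 80 and plain HTTP are assumed, and "Connection: close" is added (it was not part of the request discussed in this thread) so the read loop ends when the server is done. The important part is that the same host name is used both to open the socket and in the "Host: " line.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HostHeaderSketch {
    // Builds the request text. hostName must be the site's name from the
    // URL (e.g. www.fictitiousName1.com), not its numeric IP address.
    static String buildGetRequest(String hostName, String requestTarget) {
        return "GET " + requestTarget + " HTTP/1.1\r\n"
             + "Host: " + hostName + "\r\n"
             + "Connection: close\r\n" // assumption: one request per connection
             + "\r\n";                 // blank line ends the header section
    }

    // Sends the request over a plain socket and returns the raw response.
    static String fetch(String hostName, String requestTarget) throws IOException {
        try (Socket socket = new Socket(hostName, 80)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(hostName, requestTarget)
                    .getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            return response.toString("ISO-8859-1");
        }
    }

    public static void main(String[] args) {
        // Prints the request only; call fetch(...) to actually send it.
        System.out.print(buildGetRequest("www.fictitiousName1.com",
                                         "/index.php?var1=red&var2=blue"));
    }
}
```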
Thanks lazydog for the exact examples.
Such a complexity of behavior exhibited by such a mistake.....
"Location: "?
So I guess, if I understand correctly, "Location: " would list the new location, and I would create a new request whose "Host: " header line contains the host name taken from the "Location: " value?
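As a rough sketch of that idea: the host part of the Location value goes into the next "Host: " line (and is the name to connect to), and the path plus query goes into the next GET line. The class and method names are made up, and this assumes the server sent an absolute URL in Location; a relative one would have no host, and the previous host would be reused.

```java
import java.net.URI;

public class RedirectSketch {
    // Hypothetical helper: from a Location header value, returns
    // { host for the "Host: " line, request target for the GET line }.
    // Assumes an absolute URL; for a relative one the host is null.
    static String[] nextRequest(String locationValue) {
        URI location = URI.create(locationValue.trim());
        String target = location.getRawPath();
        if (target == null || target.isEmpty()) {
            target = "/"; // e.g. "http://www.example.org" with no path
        }
        if (location.getRawQuery() != null) {
            target += "?" + location.getRawQuery();
        }
        return new String[] { location.getHost(), target };
    }

    public static void main(String[] args) {
        String[] next = nextRequest("http://www.example.org/new/index.php?var1=red&var2=blue");
        System.out.println("Host: " + next[0]);
        System.out.println("GET " + next[1] + " HTTP/1.1");
    }
}
```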
______________
Back to my original problem: whenever I did receive a 302 message, maybe with an accompanying "Location: " header, that new location would just be garbage, because the server never understood my original request, because my client never filled out the "Host: " header line appropriately. So the redirection was meaningless. There was nothing appropriate to follow. The issue was not that the original page had been moved anywhere.