My web application does screen scraping to do some work.

One Monday, the returned page started to contain CR+LF+someHex+CR+LF. When I used cURL everything was fine, but when I used the Internet Commands provided by 4D, this set of characters (CR+LF+someHex+CR+LF) was inserted at seemingly random places in the returned HTML page.

What I ended up finding out was that:

1. The web site I was accessing must have enabled chunked transfer encoding on its web server on that Monday or over the weekend.

2. Chunked encoding modifies the body of a message in order to transfer it as a series of chunks.

3. Each chunk in the body is preceded by its size, written in hex (ah…), and the way I was receiving the page did not understand this encoding…
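To make the format concrete, here is a minimal sketch in Python of how a chunked body can be put back together. This is not the 4D code I was running; the function name decode_chunked is made up for illustration, and the sketch assumes the whole body is already in memory and ignores any trailer headers after the last chunk.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Reassemble a chunked HTTP body (hypothetical helper, trailers ignored)."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a size line: hex digits, optional ";extension", CRLF.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            # A zero-size chunk marks the end of the body.
            break
        decoded += raw[pos:pos + size]
        # Skip the chunk data plus the CRLF that follows it.
        pos += size + 2
    return bytes(decoded)


if __name__ == "__main__":
    sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
    print(decode_chunked(sample))
```

Read without decoding, the CRLF that ends one chunk followed by the next size line is exactly the CR+LF+someHex+CR+LF I was seeing in the middle of the page.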

Solution?

Well, I could rewrite this with cURL. That's the best way to go, and I will do that when I have a chance.

Since chunked encoding is only available in HTTP version 1.1, a server should not send a chunked response to a 1.0 client, so changing the version to 1.0 in the request solved this problem for now…
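As a rough illustration of that workaround, the sketch below builds a raw GET request that announces HTTP/1.0. It is Python rather than 4D, the function name build_http10_request and the host example.com are placeholders, and the Host header is included only because many servers with virtual hosts expect it even though HTTP/1.0 does not require it.

```python
def build_http10_request(host: str, path: str = "/") -> bytes:
    """Build a raw GET request that declares HTTP/1.0 (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.0",  # the version on this line is the part that changed
        f"Host: {host}",
        "Connection: close",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    print(build_http10_request("example.com", "/index.html").decode("ascii"))
```

The bytes returned here are what would be written to the socket. With a 1.0 request the server is expected to send the body as-is and signal the end by closing the connection.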