Wednesday, September 1. 2010
I am still working on the feature to mark some entries as mandatory. I have changed the table so that I can now define a style for each cell. I have also started to add the functions that define how to mark the fields in the table. There is "only" one thing missing: using that information when creating the table. The current code that creates the table has to be changed, because it cannot support all the features. I have thought of another approach that I will try, which should also make it possible to put more than one form field into a line.
Sunday, August 29. 2010
The next thing I want to add to the table containing the forms is the usual way to mark a field as mandatory (often a small asterisk). I don't want to hard-code that symbol, though. Instead I will supply the code with a string to display and the position where it should go. But to format it, I need some way to specify an id or a class so I can use CSS.
And that is where a longer tangent started today. I had intended to add that feature to the table right from the start, but didn't come up with a good way to do it properly. Today I found one: I have created a new class which stores the id, the list of classes and the direct styles for an element. That class can then be used to specify the style of an HTML element and also to format the attributes. I have implemented that class and also changed the table class so that I can specify a style for the table, one for the table head (if it is used) and one per table row. What is still missing is a way to specify a style per table entry. Once that is added, I can add the functions to mark form entries as mandatory in an extra column. Quite a lot of work for a small feature, but I consider the style class a very important part of my implementation, as I expect it to be useful later.
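A minimal sketch of what such a style class could look like. The name `ElementStyle` and all member names are hypothetical, the real class is not shown in this diary, and the sketch uses C++11 for brevity. It also assumes that the stored values are trusted, so no escaping is done.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: holds the id, the class list and the direct
// (inline) styles of one HTML element.
class ElementStyle {
public:
    void set_id(std::string id) { id_ = std::move(id); }
    void add_class(std::string name) { classes_.push_back(std::move(name)); }
    void add_style(std::string property, std::string value) {
        styles_.emplace_back(std::move(property), std::move(value));
    }

    // Formats the stored data as HTML attributes. Each attribute gets a
    // leading space, so the result can be appended directly to a tag name.
    std::string attributes() const {
        std::string out;
        if (!id_.empty()) out += " id=\"" + id_ + "\"";
        if (!classes_.empty()) {
            out += " class=\"";
            for (std::size_t i = 0; i < classes_.size(); ++i) {
                if (i != 0) out += ' ';
                out += classes_[i];
            }
            out += '"';
        }
        if (!styles_.empty()) {
            out += " style=\"";
            for (const auto& s : styles_) out += s.first + ": " + s.second + ";";
            out += '"';
        }
        return out;
    }

private:
    std::string id_;
    std::vector<std::string> classes_;
    std::vector<std::pair<std::string, std::string>> styles_;
};
```

A table cell could then be written as `"<td" + style.attributes() + ">"`, which yields a plain `<td>` when nothing has been set.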
Saturday, August 28. 2010
The function to turn the form data into a table is now mostly working. Most of the options are in the program: I can handle normal text fields and password fields, I know how to handle default values, and I can set the size of the text fields. The previously entered values can also be written to the table when the data has to be presented again. As part of this I have also implemented a function to turn UCS-4 into UTF-8 data.
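The UCS-4 to UTF-8 direction can be sketched as follows. This is not the code from the project: the function name is made up, and it assumes that a UCS-4 string is stored as a `std::u32string` (C++11) with one code point per element.

```cpp
#include <stdexcept>
#include <string>

// Encodes a UCS-4 string (one code point per char32_t) as UTF-8.
// Throws std::invalid_argument for surrogates and values above U+10FFFF.
std::string ucs4_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("not a valid code point");
        if (cp < 0x80) {                    // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {            // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {          // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                            // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```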
For the text fields I am only missing two more features. First, I want to be able to add the little star to mark a field as mandatory. Second, the data being returned to the user is not sanitized. Currently it would be very easy to break out of the form and inject arbitrary code into the site. I will have to add some function to fix that problem, but I haven't yet seen a complete set of rules for what to do.
Then all that is missing are more handlers for the forms, e.g. for check boxes or options. Additional buttons can also be quite useful, and I would like to be able to add JavaScript event handlers and CSS styles. And finally I would like to be able to tell the code that some form fields are supposed to be in one line of the table without having to specify the full layout.
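One common starting point for the sanitizing is to escape the five characters that are special in HTML. The sketch below is only that, with a made-up function name. It is not a complete rule set: it covers element text and quoted attribute values (such as the `value="..."` of a text field), but not data placed inside scripts, URLs or style sheets.

```cpp
#include <string>

// Escapes the characters that have a special meaning in HTML text and in
// quoted attribute values. Other contexts need different rules.
std::string escape_html(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

With this, a previously entered value like `"><script>` can no longer close the attribute and the tag it is written into.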
Sunday, August 22. 2010
I have started to implement the function to turn the form description into a table (containing the form). I will add some data to make it possible to handle more cases, but I will only handle creation of very simple tables. If something more complex is needed, it is possible to create the page by hand. The next issue is to add a function to turn UCS-4 strings into UTF-8 strings. Then I should be able to create very, very simple forms. I will have to add more classes to represent the different options possible in forms, though.
Saturday, August 21. 2010
I swapped the implementations around: instead of first writing the function to turn the data into a table, I have implemented the test server. By using parts from the last one I could quickly add the necessary data. I found and fixed a number of problems and it now seems to work. I might change the way the form data is stored, because I have the feeling I do too much copying when evaluating the data. I also still have to implement the function to turn the form data into a table.
Sunday, August 15. 2010
The code is slowly progressing. I have the evaluation implemented: the code now checks whether the values from the form are acceptable. Error reporting is still rather bad, though at least it exists. I also found out that I won't need a second class to create a table-based form from the data; a simple function in the class I am currently writing is sufficient. That way I can put arbitrary code before and after the table without any trouble. The next step will be to write that function and then implement the next test server to try this abstraction.
Saturday, August 14. 2010
I have continued with the class to aid in using forms. The easy parts are finished, so I have started with the hard part. I want to have a series of small objects I can use to define which key-value pairs the form contains. The class is then supposed to scan the body, get the values and supply them to the function handling the POST request.
I have defined a base class for those small objects, and I intend to implement a number of classes to handle the different types of values appearing in a form. I have started with a class which expects the values to be correct UCS-4 strings. I will also implement classes which check that the option which was sent is valid, or which convert a string into a number with optional checks that the value is within supplied bounds.
The class handling the forms will check whether the values it received match the specification. It will handle the cases where a key is missing, is sent multiple times, or where unexpected keys are sent by rejecting the request. I will later also implement a second class using the specification to create the form.
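The idea of small specification objects plus one checking step might look roughly like this. All names (`FieldSpec`, `OptionField`, `IntField`, `form_matches`) are invented for this sketch, and it works on plain byte strings rather than UCS-4 to stay short. The number class only accepts non-negative decimal values of at most nine digits, which is a simplification.

```cpp
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Base class for the small objects describing one expected form value.
struct FieldSpec {
    virtual ~FieldSpec() = default;
    virtual bool accepts(const std::string& value) const = 0;
};

// Accepts only one of a fixed set of options.
struct OptionField : FieldSpec {
    std::set<std::string> options;
    explicit OptionField(std::set<std::string> o) : options(std::move(o)) {}
    bool accepts(const std::string& value) const override {
        return options.count(value) != 0;
    }
};

// Accepts a non-negative decimal number within [lowest, highest].
struct IntField : FieldSpec {
    long lowest, highest;
    IntField(long lo, long hi) : lowest(lo), highest(hi) {}
    bool accepts(const std::string& value) const override {
        if (value.empty() || value.size() > 9) return false;  // no overflow
        for (char c : value)
            if (c < '0' || c > '9') return false;
        long n = std::stol(value);
        return n >= lowest && n <= highest;
    }
};

using FormSpec = std::map<std::string, std::unique_ptr<FieldSpec>>;
using FormData = std::vector<std::pair<std::string, std::string>>;

// Returns true only if every expected key appears exactly once with an
// acceptable value and no unexpected key is present.
bool form_matches(const FormSpec& spec, const FormData& data) {
    std::set<std::string> seen;
    for (const auto& kv : data) {
        auto it = spec.find(kv.first);
        if (it == spec.end()) return false;                 // unexpected key
        if (!seen.insert(kv.first).second) return false;    // sent twice
        if (!it->second->accepts(kv.second)) return false;  // bad value
    }
    return seen.size() == spec.size();                      // key missing?
}
```

Keeping the per-value checks behind one virtual function means new kinds of form values only need a new subclass, while the handling of missing, duplicate and unexpected keys stays in one place.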
Saturday, August 7. 2010
After quite a lot of tries I think I have finally found all the issues with the conversion of the form data to UCS-4. There were quite a number of stupid bugs in the conversion function. I am considering adding a second function to convert it just to UTF-8. I need both: as soon as I start to work with the strings, UCS-4 is easier to handle, but for strings I only receive and send, UTF-8 is sufficient and needs less memory in most cases. UTF-8 is a bit less efficient when receiving, as I first have to decode the data (and build the UCS-4 representation of each character) to check whether the received data is valid UTF-8, and then encode it back to UTF-8.
Sunday, August 1. 2010
I tried to find out whether the decoding works and can now say for sure that it doesn't. I still didn't get it to print the values, but the string length of the converted form parameters is always 1, and that is not correct. I will try to debug the decoding function and find out what is wrong.
Saturday, July 31. 2010
Not much to write, though it was not easy to implement. I also added decoding of the UTF-8 encoding, so now I should be able to actually read (and understand) the forms. There is only one problem: I can't check whether the decoding works. I tried to output the decoded values, which failed completely, and the debugger only shows some error message and doesn't seem to be willing to show me the values. So for now I hope it works correctly and continue as if it did.
Sunday, July 18. 2010
Today I wrote the conversion function from a URL-encoded string to an ISO 8859 string. I needed quite some code to not only convert, but also check the conversion. I want to be able to reject wrongly encoded strings, e.g. strings containing 0-bytes (which cause problems in C++-strings) or control characters. The code can handle control characters, but only when instructed to do so.
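A sketch of such a checking decoder, under a few assumptions: the function names and the `allow_control` flag are invented here, 0-bytes are always rejected, and the control character ranges are those of ISO 8859-1 (0x01 to 0x1F, 0x7F and 0x80 to 0x9F).

```cpp
#include <cstddef>
#include <string>

// Returns the value of a hex digit, or -1 if c is not one.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes a URL-encoded form value into ISO 8859-1 bytes.
// Returns false for broken %XX escapes, for 0-bytes, and for control
// characters unless allow_control is set. On failure, out is incomplete.
bool url_decode(const std::string& in, std::string& out,
                bool allow_control = false) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char byte;
        if (in[i] == '+') {
            byte = ' ';                                  // '+' means space
        } else if (in[i] == '%') {
            if (i + 2 >= in.size()) return false;        // truncated escape
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi < 0 || lo < 0) return false;          // not hex digits
            byte = static_cast<unsigned char>(hi * 16 + lo);
            i += 2;
        } else {
            byte = static_cast<unsigned char>(in[i]);
        }
        if (byte == 0) return false;                     // never accepted
        bool control = byte < 0x20 || byte == 0x7F ||
                       (byte >= 0x80 && byte <= 0x9F);
        if (control && !allow_control) return false;
        out += static_cast<char>(byte);
    }
    return true;
}
```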
I have also started to look into decoding UTF-8, and that is even worse. There are quite a lot of rules you have to obey. Also, the IETF chose to restrict UTF-8 a bit (RFC 3629): in particular, they limited characters to 21 bits, i.e. code points up to U+10FFFF, instead of the 31 bits the original definition allowed. And it is mandatory to detect and reject invalid UTF-8 strings for security reasons. But as I need to decode it, I will have to implement it.
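The main rules can be sketched in one function: continuation bytes must have the form 10xxxxxx, sequences must not be truncated, overlong encodings and surrogates are invalid, and nothing above U+10FFFF is allowed. The function name and the use of `std::u32string` for UCS-4 are assumptions of this sketch, not the project's code.

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into UCS-4 following the RFC 3629 restrictions.
// Returns false for malformed or truncated sequences, overlong forms,
// surrogates and values above U+10FFFF.
bool utf8_to_ucs4(const std::string& in, std::u32string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;   // number of continuation bytes
        char32_t lowest;     // smallest value allowed for this length
        if (b < 0x80)                { cp = b;        extra = 0; lowest = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; lowest = 0x80; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; lowest = 0x800; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; lowest = 0x10000; }
        else return false;   // stray continuation byte or invalid lead byte
        if (in.size() - i <= extra) return false;         // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) return false;         // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < lowest) return false;                    // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        if (cp > 0x10FFFF) return false;                  // beyond 21 bits
        out += cp;
        i += extra + 1;
    }
    return true;
}
```

The overlong check is the security-relevant one: without it, the two bytes 0xC0 0x80 would decode to a 0-byte and slip past any filter that only looks at the raw input.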
Saturday, July 17. 2010
Today I didn't get much done, but at least I added a handler to the page receiving the POST data that answers GET or HEAD requests by redirecting the client to the form input page. So I can now redirect the client, and I have also tested whether I am able to use a frequent pattern, where one handler (or page, which is the same in my system) formats the form and also receives the sent data. That way you can check the data and, should it be unacceptable, directly present the page again, containing the data and indicating what was wrong.
The hardest part here is security. You can't just push the data you received back out to the user, or you open up a big hole: someone could use a cross-site request (as in CSRF, cross-site request forgery) to submit the form from another site and make the client's browser execute arbitrary script code in your site's context, which amounts to a cross-site scripting (XSS) attack. So the input has to be sanitized before presenting it back to the user. You might think that accepting only POST requests and ignoring the parameters of GET requests makes this impossible, but it is fairly easy to set up a simple script which turns a GET request with parameters into a POST request, though in that case no cookies should be transmitted. I am also not 100% sure that you can't get the browser itself to send such a POST request, so it is probably better to make sure the output is properly sanitized and to add code that makes it easy to not get it wrong.
Sunday, July 11. 2010
The parsing of the body of a POST request is now also done. After the body has been parsed and therefore split into key/value pairs, it is attached to the request and sent to the normal handling functions. The parsing seems to work, at least for now. When I also tried some characters outside the ASCII set, I discovered that the browser was sending characters in ISO 8859-1 encoding and wondered why it would do that.
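The splitting step itself is small. A sketch with an invented function name, which keeps keys and values in their URL-encoded form and simply skips empty segments (a choice made for this sketch):

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Splits an application/x-www-form-urlencoded body into key/value pairs.
// Pairs are separated by '&', key and value by the first '='.
// Nothing is decoded here; the strings stay URL-encoded.
std::vector<std::pair<std::string, std::string>>
split_form_body(const std::string& body) {
    std::vector<std::pair<std::string, std::string>> pairs;
    std::size_t start = 0;
    while (start <= body.size()) {
        std::size_t end = body.find('&', start);
        if (end == std::string::npos) end = body.size();
        if (end > start) {                                // skip empty segments
            std::string item = body.substr(start, end - start);
            std::size_t eq = item.find('=');
            if (eq == std::string::npos)
                pairs.emplace_back(item, std::string());  // key without value
            else
                pairs.emplace_back(item.substr(0, eq), item.substr(eq + 1));
        }
        start = end + 1;
    }
    return pairs;
}
```

For the body `name=J%C3%BCrgen&empty=&flag` this gives three pairs, with `J%C3%BCrgen` still encoded.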
I discovered that I had forgotten a small detail: I could not set the content type in the standard and stored response. So the browser was probably guessing that the returned data might be HTML and then set the character encoding to the default value, 8859-1. I added some code so that I now have to supply the content type. I have also added a way to specify the encoding, though for the content type text/html (which tells the browser to handle the data as HTML) a default of UTF-8 is used if no encoding is specified. I think that UTF-8 is a much better default than 8859-1, as it can at least handle arbitrary characters, whereas 8859-1 is much more limited. I don't think that saving the few extra bytes you have to transmit for a character >127 is worth the hassle with 8859-1 when you suddenly discover that you need characters not contained in 8859-1 and have to switch to a different encoding.
After that change the data did indeed arrive as UTF-8 encoded values. Now I have the next problem: the form data has to be decoded. When a control character or a character not contained in ASCII is transmitted, it is encoded as hex numbers. In 8859-1 every character above 127 is sent as one byte in hex notation, in UTF-8 as multiple bytes. So to correctly decode the text I need to know the character set, and that is not available at the low level I operate on. The decoding therefore has to be postponed until the handler of the information supplies the encoding and queries the values.
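The defaulting rule described here fits in a few lines. The function name and the convention that an empty string means "no encoding specified" are assumptions of this sketch:

```cpp
#include <string>

// Builds a Content-Type header line. An empty charset means "not
// specified"; only text/html then falls back to UTF-8.
std::string content_type_header(const std::string& type,
                                const std::string& charset = "") {
    std::string value = type;
    if (!charset.empty())
        value += "; charset=" + charset;
    else if (type == "text/html")
        value += "; charset=UTF-8";
    return "Content-Type: " + value + "\r\n";
}
```

So `content_type_header("text/html")` produces `Content-Type: text/html; charset=UTF-8`, while other types without an explicit encoding get no charset parameter at all.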
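To make the difference concrete, here is a sketch of the browser side. It encodes whatever bytes it is given, so the same character "ä" becomes `%E4` when the bytes are 8859-1 and `%C3%A4` when they are UTF-8. The function name is invented; the set of characters left unencoded (ASCII letters, digits and `*-._`) is what browsers commonly use for form data.

```cpp
#include <string>

// Percent-encodes raw bytes the way browsers do for form submissions:
// ASCII letters, digits and "*-._" stay as they are, a space becomes '+',
// and every other byte becomes %XX.
std::string percent_encode(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (char ch : bytes) {
        unsigned char b = static_cast<unsigned char>(ch);
        bool plain = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') ||
                     (b >= '0' && b <= '9') ||
                     b == '*' || b == '-' || b == '.' || b == '_';
        if (plain) {
            out += ch;
        } else if (b == ' ') {
            out += '+';
        } else {
            out += '%';
            out += hex[b >> 4];      // high nibble
            out += hex[b & 0x0F];    // low nibble
        }
    }
    return out;
}
```

The encoded text alone does not say which of the two encodings was used, which is why the receiver has to be told the character set before decoding.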
Saturday, July 10. 2010
Today I continued with implementing the POST method. I thought about how to store the request body; for a number of reasons I don't want to add a second kind of request object, and I don't think I need one, either. I added a new base class defining an interface to objects containing the body. As the most important use is for HTML form elements, I started with the class for that. The code now already parses the headers, and when it finds a "content-type application/x-www-form-urlencoded" it creates an object of the new class.
I have also defined an interface to send data to this object and implemented sending the contents. What is still missing is the code to parse the data and to connect the request body object with the request object. When that is done, my test implementation should finally get the data sent by the browser when I submit a simple form.
What I still need to test is whether I can get the browser to send "content-type multipart/form-data". The description of HTML seems to suggest that I can get it to use that for regular forms if I get the browser to assume that ASCII is not a good form to send the data in. I will see whether that is possible, because then I can also implement that part. I don't want to implement uploading data yet; I first want the simpler parts to work.
Sunday, July 4. 2010
Originally I intended to implement the next part of the function to finally receive the POST requests. But I got stuck again, as the error responses still didn't work. I debugged the program for quite a while and was then able to fix the bugs. But the result was still not satisfactory, and I found out that at least Firefox doesn't like it when a response to a POST request (and I guess also a GET request) only has an error status but no content. So I added some content and now the errors created by my program finally show up. They still look rather bland, but should be good enough.
So next is actually implementing the POST request. I will have to see how to implement it; currently I lean towards adding a new request class for form data. I think it makes sense to parse the data directly while receiving it.