BOM-problems (UTF-8)
Posted: May 8th, 2014, 10:12 am
I'm having trouble with the byte order mark (BOM) at the beginning of the data files: I like to use other tools together with CSPro to manipulate my data files, but many text editing tools don't like the BOM. And according to Wikipedia, the BOM is not needed for UTF-8 (CSPro data files are UTF-8). I have found a tool to remove it, and it seems that CSPro can still read the files, but my question is: is it OK to remove it, or will I run into problems later?
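For reference, removing the UTF-8 BOM just means dropping the three leading bytes EF BB BF if they are present. Here is a minimal Python sketch of that idea, not the tool I actually used; the function name strip_utf8_bom and the file name in the usage note are made up for illustration, and it assumes the file fits in memory:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

To use it on a file, something like `open("survey.dat", "rb").read()` (hypothetical file name) could be passed through strip_utf8_bom and the result written back in binary mode. Whether CSPro is happy with the result in every situation is exactly what I'm asking about.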
(btw: Wikipedia says: "The Unicode Standard allows that the BOM 'can serve as signature for UTF-8 encoded text where the character set is unmarked'. Some software developers have adopted it for other encodings, including UTF-8, in an attempt to distinguish UTF-8 from local 8-bit code pages. However RFC 3629, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8..." Why did the US Census Bureau go against the recommendations in this case?)
Anne