Unicode Data File Error

Pierre

Unicode Data File Error

Post by Pierre »

Hello,
I am wondering whether anybody has come across problems with missing data caused by Unicode and write delays in large data files.

Here is an example

(Corrupted)
D06384  2 1WED21AUG1351START-Stock 2BANANA    FRESH  1HEN "   3.0K 2.00  $   " ? $
D06384*** 1WED21AUG1251STAVT-Stock0 0  03DRY OCONUU    !    ?   $$`  0DRY   "3FRT3.1K 1.50"?    
D06384  4 1WED21AUG1351START-Suoc{    (    `?   ! 4COOKED KUMARA    ` ?    0    0  "? COOKED"1PLATE   5.00    ? 
D06384  5 1VED21AUG1351START-Stock 5TARO $  0 (   0   0  `(  `    FBESH  5FRTS   0(    3.0K********** 
(Original)
D06384  1 1WED21AUG1351START-Stock 1KUMARA                FRESH    5.0K            5.00 
D06384  2 1WED21AUG1351START-Stock 2BANANA                 FRESH   3.0K             2.00 
D06384  3 1WED21AUG1351START-Stock 3DRY COCONUT       DRY        2.1K            1.50 
D06384  4 1WED21AUG1351START-Stock 4COOKED KUMARA   COOKED 1      PLATE   5.00
D06384  5 1WED21AUG1351START-Stock 5TARO                    FRESH    5      FRTS    5.00 
The above is a section of a data file. It gets corrupted after a write delay or for some unknown reason. The only way to fix it is to open the data file in CSPro entry and change the value of a data item that comes before the corruption; that solves the problem.

The other way is to manually replace the corrupted lines with the originals. :x

My suggestion is that if you are only dealing with ASCII characters in your data file, keep the data files in ASCII.
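To act on that suggestion you first need to know whether a data file really contains only ASCII. Below is a minimal sketch in Python, not part of CSPro; the function names `find_non_ascii_lines` and `file_is_pure_ascii` are made up for illustration. It reads the file as raw bytes, so it works whatever encoding the file was saved in.

```python
def find_non_ascii_lines(lines):
    """Return the 1-based numbers of the lines that contain non-ASCII bytes.

    `lines` is any iterable of bytes objects, e.g. a file opened in "rb" mode.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        # bytes.isascii() (Python 3.7+) is True only if every byte is < 0x80.
        if not raw.isascii():
            hits.append(number)
    return hits


def file_is_pure_ascii(path):
    """True if every byte in the file at `path` is an ASCII byte."""
    with open(path, "rb") as handle:
        return not find_non_ascii_lines(handle)
```

Calling `file_is_pure_ascii` on a copy of the data file tells you whether storing it as ASCII would lose anything; `find_non_ascii_lines` points at the records to inspect if it would.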
Gregory Martin
Posts: 1777
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Re: Unicode Data File Error

Post by Gregory Martin »

Pierre,

Does this happen consistently? I've never seen anything like this, and I've run batch edits on gigabytes worth of data. Maybe this is only something with CSEntry and large files? In any case, this is definitely an error that we need to fix, if we are able to figure out why it is happening. Unfortunately, the "keep the data files in ASCII" suggestion forces people to use versions of CSPro before 5, and they won't get any of the new features of CSPro/CSEntry. Let me know if you have any hints about why this problem occurs.
Pierre

Re: Unicode Data File Error

Post by Pierre »

Hello Greg,
It never happened when the data files were stored on a hard drive and modified during entry, but because of equipment logistics I was forced to use data saved on USB devices, which caused write delays. I also never had a problem when running batch; however, if the error is in the modified data file, batch will give you record or ID item errors. I understand that Unicode is the way to go, but in my situation it was safer for me to keep the data files in ASCII because of the risk of shifting data if a corrupted UTF character is saved. So let me take my suggestion back.
"Only use ASCII if it is really necessary" =)
Honestly, I think it is an isolated error, and I am hoping that it is, because of all the development and testing that went into CSPro 5. Another thing (about the logistics of my situation): a few of the laptops being used for heads-up keying were infected with viruses, and some of these viruses specifically infect USB thumb drives. So this might be another source of the problem, but it is unlikely. If I can reproduce the problem, I will try to isolate it and send you the details.

Thanks,
Pierre
Gregory Martin

Re: Unicode Data File Error

Post by Gregory Martin »

What's still strange, in my mind, is that any write lags should affect both ASCII and UTF-8 files, right? Wouldn't your ASCII files become corrupt? Or are you saying that even if just one character is affected, then the ramifications are more significant with a UTF-8 file than an ASCII file?
Pierre

Re: Unicode Data File Error

Post by Pierre »

Yes, you are correct. The lag will affect both ASCII and UTF files. But with ASCII only a few characters will be affected, not a block of characters. So the ramifications will definitely be more significant in a UTF file.
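One plausible way the same damage can hurt a UTF-8 file more is sketched below in Python. This is an illustration under stated assumptions, not a diagnosis of what CSPro actually did: the field layout is hypothetical, the two damaged bytes are chosen so that they happen to form a valid two-byte UTF-8 sequence, and Latin-1 stands in for a one-byte-per-character encoding such as ASCII.

```python
# Hypothetical fixed-width layout given as character offsets (start, end).
# This is NOT the real CSPro dictionary for the file shown in this thread.
FIELDS = {"id": (0, 6), "name": (6, 16), "unit": (16, 20), "price": (20, 25)}


def parse(text):
    """Slice a decoded record into fields by character position."""
    return {name: text[start:end] for name, (start, end) in FIELDS.items()}


original = b"D06384" + b"KUMARA    " + b"3.0K" + b" 5.00"

# Assumption: two adjacent padding bytes inside the name field get damaged
# and end up as 0xC3 0xA9, which is a valid two-byte UTF-8 sequence.
damaged = bytearray(original)
damaged[12:14] = b"\xc3\xa9"
damaged = bytes(damaged)

# One byte per character: the damage stays inside the name field.
single_byte = parse(damaged.decode("latin-1"))

# UTF-8: the two damaged bytes collapse into one character, so every
# field after that point is read one column too early.
utf8 = parse(damaged.decode("utf-8"))

print(single_byte)
print(utf8)
```

With the single-byte reading, `unit` and `price` still come out as `"3.0K"` and `" 5.00"`. With the UTF-8 reading the record is one character shorter, so `unit` comes out as `".0K "` and `price` as `"5.00"`: the bytes of the later fields are intact but their columns have shifted.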


Last bumped by Anonymous on April 16th, 2014, 3:12 am.