multilevel dictionary

Discussions about editing and cleaning data
AriSilva
Posts: 369
Joined: July 22nd, 2016, 3:55 pm

Post by AriSilva » June 27th, 2020, 8:06 am

Is there a paper on working with multilevel dictionaries?
I've never done that before. All the dictionaries I've worked with had a single level, even when a multilevel structure was "embedded", or hidden, as in a dwelling/household/person structure.
What I used to do was have three variables, one for each record type, occupying the same position in their respective records, to store the household number. The dwelling record would have a dummy_variable (not needed, just there to reserve that space for sorting purposes, filled with "00" for instance, supposing we use 2 positions for the household number), the household record would have H_HHNUMBER, and the person record would have a third variable, P_HHNUMBER.
That way, when processing the questionnaire for a dwelling with more than one household, we could tie the persons to their household and loop through them to do some structure checks, such as detecting more than one head of household.
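The mirrored household number described above can be modelled outside CSPro. The following is only a Python sketch of the idea, not CSPro logic: H_HHNUMBER and P_HHNUMBER come from the post, while the RELATIONSHIP field, the code 1 for "head", and the sample records are assumptions made for illustration.

```python
# Hypothetical flat (single-level) case: household and person records of one dwelling.
# RELATIONSHIP and HEAD = 1 are assumptions, not part of the original dictionary.
HEAD = 1

households = [{"H_HHNUMBER": 1}, {"H_HHNUMBER": 2}]
persons = [
    {"P_HHNUMBER": 1, "RELATIONSHIP": 1},
    {"P_HHNUMBER": 1, "RELATIONSHIP": 2},
    {"P_HHNUMBER": 2, "RELATIONSHIP": 1},
    {"P_HHNUMBER": 2, "RELATIONSHIP": 1},  # second head: a structure error
]


def heads_per_household(households, persons):
    """Count heads in each household by matching the mirrored household number."""
    counts = {hh["H_HHNUMBER"]: 0 for hh in households}
    for p in persons:
        if p["RELATIONSHIP"] == HEAD:
            counts[p["P_HHNUMBER"]] = counts.get(p["P_HHNUMBER"], 0) + 1
    return counts


def households_with_wrong_head_count(households, persons):
    """Return the household numbers that do not have exactly one head."""
    counts = heads_per_household(households, persons)
    return sorted(n for n, c in counts.items() if c != 1)
```

With the sample data, household 2 is flagged because two of its persons are coded as head. The link between a person and a household is nothing more than equality of the two mirrored fields.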
Now I'm starting to play with a multilevel dictionary, defining the first level as the dwelling and the second level as the household/person.
With that structure, the household number is the ID of the second level, and I do not have it mirrored in the two record types. So, in a batch program, with the whole questionnaire (case) in memory, how can I tie the persons to their household? Would I still need another variable to do the trick, or is there another way?
So how does the multilevel structure help me with that?

Besides that, I do not know what I'm doing wrong, but count and totocc do not seem to be working properly: both return 0 for the first case in the file and 1 for the rest of the cases. See the attached file.
Attachments
gtm_74.rar
(288.4 KiB)
Best
Ari

khurshid.arshad
Posts: 423
Joined: July 9th, 2012, 11:32 am
Location: Islamabad, Pakistan

Post by khurshid.arshad » June 28th, 2020, 5:21 am

You can find a two-level application in the examples folder:

Help->Example Folder-> 1 - Data Entry->Two Level Application

Please check this link; you can also find more information on multilevel dictionaries on this portal.

viewtopic.php?f=1&t=2183&hilit=khurshid.arshad


Best.
a.

AriSilva
Posts: 369
Joined: July 22nd, 2016, 3:55 pm

Post by AriSilva » June 28th, 2020, 7:29 am

Hi Khurshid,
Thank you for your kind answer, but the title of this topic might have misled you about my concerns with multiple levels. I'm worried about the data editing phase, not the data entry part.
I've looked at the example under Help -> Example Folder and at your link, but both focus on data entry.
I want to know how to deal with the problem of linking the persons to their households.

And, as I mentioned at the end of my post, what is happening with the totocc and count functions when using a multilevel dictionary?
Best
Ari

Gregory Martin
Posts: 1330
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Post by Gregory Martin » June 29th, 2020, 9:24 am

We (at the Census Bureau) nearly always recommend against using multiple level dictionaries. They are used widely in DHS surveys and other applications, but there are limitations to their use. Because we never use them, we don't encounter the bugs that might exist related to their use, and so there is a chance that the totocc/count issue that you're seeing is real.

If you want us to look at it, send us your batch application so we can see how you're using those functions.

AriSilva
Posts: 369
Joined: July 22nd, 2016, 3:55 pm

Post by AriSilva » June 29th, 2020, 2:45 pm

I've never used multiple levels myself. As I wrote at the beginning of this topic, even when we have a dictionary with the traditional dwelling (1), household (many), person (many) structure, we avoid the multilevel approach.
The reason I'm starting to "get acquainted" with the subject is to use it to generate Redatam databases from the csdb file. Thanks to this invention of yours, and also to Josh's help, we think we can manage the problem, but we need to define the dictionary using the multilevel facility when there is this kind of n-n relationship; otherwise the persons, in Redatam, will not belong to their appropriate houses. When the relation is 1-n there is no problem.
The attached application is just an example to get the totocc and count values. I duplicated the calls at several levels (vivienda, hogar and persona) to see what was going on. The person counts seem good: they reflect the number of persons in each household. I've looked at a couple of records in Data Viewer and they seem to be right.
However, the counts for households are always 1, no matter where I put them, and the counts at the vivienda level (the first one) seem to reflect the last questionnaire.

My conclusion is that we are able to use count or totocc at the person level for each household (I did not loop through the persons to see if I would get the right data), but I could not get the same result for the households, that is, looping through them, since the count always seems to be 1.
Attachments
conta_levels.rar
(333.61 KiB)
Best
Ari

Gregory Martin
Posts: 1330
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Post by Gregory Martin » June 29th, 2020, 5:35 pm

This is a good illustration of why to avoid multiple-level applications. You shouldn't be able to do this from your first level:
numeric hh4 = totocc(hogar);
You'll see that you get a compiler error if you do something like this at that same point of code:
PLG1 = 1; // error: Variable belongs to a record at a lower level
So the compiler is smart about some things (items) but not others (records). At some point we may make the compiler smarter to handle multiple levels, but it's not a high priority now.

Prior to CSPro 7.0, each level was read into memory as it was handled. So if you had levels A, A1, A2, when you were processing the A level, the second levels A1 and A2 weren't even in memory, which is why you couldn't refer to any data on a level until it was loaded into memory.

Starting with CSPro 7.0, the entire case is read from the disk at once, but it is still processed in a level-by-level fashion as if each case node were being read from the disk separately.

The best rule when working with multiple-level applications is to refer only to items/records at the same or a higher level than the level of the PROC.
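To make the level-by-level behaviour concrete, here is a small conceptual model in Python. It is not CSPro's engine or API, only a sketch under stated assumptions: the record names vivienda, hogar and persona follow the thread, while the field names and the sample case are hypothetical.

```python
# Conceptual model only: the whole case is in memory, but it is walked one level
# node at a time, and each procedure sees only same-or-higher-level data.
case = {
    "vivienda": {"id": 101},  # level 1 (hypothetical field name)
    "hogares": [              # level-2 nodes, one per household
        {"hogar": {"number": 1}, "personas": [{"line": 1}, {"line": 2}]},
        {"hogar": {"number": 2}, "personas": [{"line": 1}]},
    ],
}


def process_case(case):
    """Visit the case level by level and log what each procedure can see."""
    log = []
    # Level-1 procedure: only level-1 data is in scope.
    visible = {"vivienda": case["vivienda"]}
    log.append(("level 1", sorted(visible)))
    # Level-2 procedure: runs once per node, seeing level 1 plus that node only.
    for node in case["hogares"]:
        visible = {
            "vivienda": case["vivienda"],
            "hogar": node["hogar"],
            "personas": node["personas"],
        }
        log.append(("level 2", node["hogar"]["number"], len(visible["personas"])))
    return log
```

In this model each level-2 visit sees exactly one household together with its own persons, which fits the observation earlier in the thread that the person counts per household look right.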
