Friday, 23 July 2021

Recoded a variable with if_else and mutate, but the result contains more cases than should be possible

A sample of my dataset: https://i.stack.imgur.com/AYLAg.png

I want to create a recoded variable called nr-axspa. This variable is a diagnosis, a subform of spondyloarthritis, which is a rheumatic disease. The diagnosis can be inferred from the ASAS and New York classification criteria.

If a patient has 1 for ASAS but 0 for New York, he/she has nr-axspa; otherwise not (in that case it is r-axspa). I recoded everyone with nr-axspa to "1", everyone without nr-axspa to "0", and those with ASAS 0 and New York 0 to "2". This is the code I used:

df_nr_axspa <- mutate(df, nr_axspa = if_else(asas_criteria == 0 & new_york_criteria == 0, 2, 
                                             if_else(asas_criteria == 1 & new_york_criteria == 0, 1, 0)))

Interestingly enough, when I look at summary(df_nr_axspa$nr_axspa) I find that there are 1596 patients with a diagnosis. However, I would have expected there to be only 1434 cases.

When I create a 2x2 table of ASAS criteria and New York criteria, it gives me these numbers:

<table>
<tbody>
<tr>
<td>&nbsp;</td>
<td>New York</td>
<td>&nbsp;</td>
</tr>
<tr>
<td>ASAS</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>20</td>
<td>50</td>
</tr>
<tr>
<td>1</td>
<td>372</td>
<td>992</td>
</tr>
</tbody>
</table>

So according to this table, there should be 20 patients without a diagnosis (group "2"), 372 patients with diagnosis "1" (nr-axspa) and 1042 patients with "0" (r-axspa), for a total of 1434.

However, the newly coded variable has a frequency of 372 for "1" and 20 for "2", but 1204 for "0". So group "1" and group "2" have been classified correctly, but group "0" suddenly has a surplus of 162 patients.
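One way a surplus like this can arise, assuming the two criteria columns contain missing values (the post does not say whether they do): in R, `NA & FALSE` evaluates to `FALSE`, so a row with missing ASAS and New York equal to 1 fails both conditions and falls through to 0, while `table()` drops rows with `NA` by default. The sketch below reproduces that on a made-up data frame; `toy` and its values are hypothetical and not taken from the real dataset.

```r
library(dplyr)

# Hypothetical toy data, not the real dataset.
toy <- data.frame(
  asas_criteria     = c(0, 1, 1, NA, NA, 1),
  new_york_criteria = c(0, 0, 1, 1, 0, NA)
)

# Same nested if_else as in the question.
toy_recoded <- mutate(
  toy,
  nr_axspa = if_else(asas_criteria == 0 & new_york_criteria == 0, 2,
                     if_else(asas_criteria == 1 & new_york_criteria == 0, 1, 0))
)

# Row 4 (ASAS missing, New York 1): both conditions are NA & FALSE = FALSE,
# so it is coded 0. Rows 5 and 6 end up with an NA condition and stay NA.
print(toy_recoded)

# The default cross-table drops rows with NA; useNA = "ifany" keeps them.
tab_default <- table(toy$asas_criteria, toy$new_york_criteria)
tab_with_na <- table(toy$asas_criteria, toy$new_york_criteria, useNA = "ifany")
print(sum(tab_default))  # counts only rows complete in both columns
print(sum(tab_with_na))  # counts all rows
```

If missing values do turn out to be the cause, comparing `nrow(df)` with the sum of the 2x2 table should show the same gap.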

The code I used to determine the frequencies of the newly coded variable:

describe(df_nr_axspa$nr_axspa)
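`describe()` is not part of base R, and the post does not say which package it comes from. As a base-R alternative, a frequency table with `useNA = "ifany"` lists each code along with any missing values. The vector below is a hypothetical stand-in for the recoded column.

```r
# Hypothetical stand-in for the recoded column.
nr_axspa <- c(2, 1, 0, 0, NA, 1, 0)

# Frequency of each code; an NA count appears only if NAs are present.
freq <- table(nr_axspa, useNA = "ifany")
print(freq)
```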

So I am trying to figure out what the hell happened. When I look at the data manually, I can't seem to find any mistake in the way the new variable is coded. Does anyone have an explanation?

Thanks in advance!
