I'm working with a sample CSV file that lists nursing home residents' DOBs and DODs. I used those fields to calculate their age at death, and now I'm trying to create a dictionary that "bins" their age at death into groups. I'd like the bins to be 1-25, 26-50, 51-75, and 76-100.
Is there a concise way to build a Dict that maps each subject_id to its (age, age_bin) pair using "if ... else" syntax? For example: John => (76, "76-100"), Moira => (58, "51-75").
So far I have:
# Import modules
using CSV
using DataFrames
using Dates

# Map an age in years to its bin label (plain "if ... else" chain)
function age_bin(age_years)
    if age_years <= 25
        return "1-25"
    elseif age_years <= 50
        return "26-50"
    elseif age_years <= 75
        return "51-75"
    else
        # Ages above 100 would also land here
        return "76-100"
    end
end

# Open, read, write desired files
input_file = open("../data/FILE.csv", "r")
output_file = open("FILE_output.txt", "w")

# Create DateFormat once; use to calculate age
date_format = DateFormat("y-m-d")

# Desired dictionary: subject_id => (age, age_bin)
age_death_dict = Dict{String, Tuple{Int, String}}()

# Use to later skip header line
file_flag = 0
for line in readlines(input_file)
    if file_flag == 0
        global file_flag = 1
        continue
    end
    # Define what each field in FILE corresponds to
    line_array = split(line, ",")
    subject_id = line_array[2]
    gender = line_array[3]
    date_of_birth = line_array[4]
    date_of_death = line_array[5]
    # Get yyyy-mm-dd only (first ten characters) from fields 4 and 5:
    date_birth = date_of_birth[1:10]
    date_death = date_of_death[1:10]
    # Calculate age at death, rounded to whole years
    age_days = Date(date_death, date_format) - Date(date_birth, date_format)
    age_years = round(Int, Dates.value(age_days) / 365.25)
    # Store the age and its bin under the subject's ID
    age_death_dict[subject_id] = (age_years, age_bin(age_years))
end

# Close both files once the loop is done
close(input_file)
close(output_file)
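For comparison, one more compact way to express this kind of binning might be a lookup against the upper edge of each bin instead of an if/elseif chain. This is only a rough sketch under a few assumptions: `BIN_EDGES`, `BIN_LABELS`, and `bin_label` are hypothetical names that do not appear in my code above, ages are assumed to be whole years, anything above 100 is given its own "over 100" label, and the sample ages are the two made-up examples from the question rather than values read from the CSV.

```julia
# Hypothetical names throughout; only Base functions are used.
const BIN_EDGES = [25, 50, 75, 100]   # upper edge of each bin (assumed inclusive)
const BIN_LABELS = ["1-25", "26-50", "51-75", "76-100"]

# Return the label of the first bin whose upper edge is >= age.
function bin_label(age::Integer)
    i = searchsortedfirst(BIN_EDGES, age)
    return i <= length(BIN_LABELS) ? BIN_LABELS[i] : "over 100"
end

# Build subject_id => (age, age_bin) from the made-up sample ages.
ages = Dict("John" => 76, "Moira" => 58)
binned = Dict(id => (age, bin_label(age)) for (id, age) in ages)
```

Here `binned["John"]` should come out as `(76, "76-100")`. The edge vector keeps the cut points in one place, so changing the bins would not mean editing a chain of conditions.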