I have a file with a list of names, and several files that each contain some of those names. I want to create a table with the list of names as the first column and the list of files as the first row, filled with zeros and ones indicating whether each name exists in each file. Example:
names.txt:
APHSKNCTA
YPPASTTKI
HPSFARFAL
file.txt:
1 theName APHSKNCTA
1 theName YPPASTTKI
file2.txt:
1 theName APHSKNCTA
1 theName HPSFARFAL
expected output:
file.txt file2.txt
APHSKNCTA 1 1
YPPASTTKI 1 0
HPSFARFAL 0 1
My code so far is below. It gives me the intersection of each file with the list of names, but I don't know how to go from this to printing the 0s and 1s in the right places.
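import os

directory = r'/home/Run'
# Read the list of names once, outside the loop
namess = set(open("names.txt").read().split())
for entry in os.scandir(directory):
    # Set of all whitespace-separated tokens in this file
    thefiles = set(open(entry).read().split())
    inter = thefiles.intersection(namess)
    print(entry)
    print(inter)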
output:
<DirEntry 'file.txt'>
{'APHSKNCTA', 'YPPASTTKI'}
<DirEntry 'file2.txt'>
{'APHSKNCTA', 'HPSFARFAL'}
I suppose I need an if statement, but something like "if thefiles.intersection(sbpeps) = TRUE" did not work. Does anyone have an idea, or a direction to point me in? I'm lost and stuck because I don't know what to search for. Thank you!
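One possible direction, as a minimal sketch rather than a definitive answer: instead of intersecting sets, test each name for membership in each file's token set and turn the result into 0 or 1 with `int(...)`. The helper names `build_presence_table` and `print_table` are hypothetical, and the sketch assumes that names.txt lives outside the scanned directory, that names are whitespace-separated tokens, and that tab-separated columns are acceptable.

```python
import os


def build_presence_table(names_path, directory):
    """Return (file_names, rows), where rows is a list of (name, [0/1, ...])."""
    # Keep the names in a list so the output follows the order in names.txt
    # (a set would not preserve it).
    with open(names_path) as fh:
        names = fh.read().split()

    # Map each file name to the set of whitespace-separated tokens it contains.
    # Files are visited in alphabetical order so the columns are reproducible.
    tokens_by_file = {}
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.is_file():
            with open(entry.path) as fh:
                tokens_by_file[entry.name] = set(fh.read().split())

    file_names = list(tokens_by_file)
    # int(True) == 1 and int(False) == 0, so no explicit if statement is needed.
    rows = [
        (name, [int(name in tokens_by_file[f]) for f in file_names])
        for name in names
    ]
    return file_names, rows


def print_table(file_names, rows):
    """Print the header row of file names, then one row per name."""
    print("\t".join([""] + file_names))
    for name, flags in rows:
        print("\t".join([name] + [str(flag) for flag in flags]))
```

With the layout from the question, a call such as `print_table(*build_presence_table("names.txt", "/home/Run"))` would print one column per file and one row per name. If a table library is acceptable, the same rows could be loaded into one, but plain lists keep the sketch dependency-free.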