I'm writing some code to create a list of predecessor files based on a file name or directory name, but I can only get the list to work on file name OR directory name, not both. Each if statement works on its own, but the script does not complete when I ask it to look for either using elif.
Our file names are built like this:
[prefix]_[activeID]_[parentID]
where activeID identifies the file itself, and parentID identifies its immediate predecessor.
I have every file that could be a parent written to a dictionary (pdict), with its full path as the key and a list of its parts (e.g. activeID, parentID) as the value, each part stored at a fixed position.
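To make the naming scheme and the dictionary layout concrete, here is a minimal sketch. It is not the code from this post: the helper name `parse_name` and the example path are made up for illustration, and it assumes Windows-style paths (hence `ntpath`) and the same part order as the full code further down (path, name, prefix, activeID, parentID, directory name).

```python
import ntpath  # Windows path rules, usable on any platform


def parse_name(path):
    """Split a path into [path, name, prefix, activeID, parentID, dirName].

    Hypothetical helper for illustration only.
    """
    directory, filename = ntpath.split(path)
    stem = ntpath.splitext(filename)[0]           # file name without extension
    parts = stem.split("_")                       # [prefix, activeID, parentID, ...]
    prefix = parts[0]
    active_id = parts[1] if len(parts) > 1 else "none"
    parent_id = parts[2] if len(parts) > 2 else "none"
    dir_name = ntpath.basename(directory)         # immediate parent directory
    return [path, stem, prefix, active_id, parent_id, dir_name]


# Made-up example path: prefix "huc", activeID "a100", parentID "b200".
example = "c:\\data\\b200_x\\huc_a100_b200.shp"
pdict = {example: parse_name(example)}
```

With this layout, `pdict[example][3]` is the activeID and `pdict[example][4]` is the parentID, which are the positions the loop below reads.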
The for loop should do the following:
For each item on the list, it finds the file directly before it, using the dictionary created earlier. Each item in the lineageList is also a key in the dictionary, so we pull out the associated parentID (fparentID) and match it to the activeID in the larger list (kactiveID) OR the beginning of the file's directory name (kfileDirName), and add those file names to the lineageList. The process then repeats with the newly added file(s), hopefully.
    for f in lineageList:
        fparentID=pdict[f][4]
        for key, value in pdict.iteritems():
            kactiveID=pdict[key][3]
            kfileDirName=pdict[key][5]
            # Add a file to the list if its activeID matches the current parentID.
            if kactiveID==fparentID:
                lineageList.append(key)
            # If the current parentID isn't in the file name, add a file to the list if the directory starts with the parentID.
            elif kfileDirName.startswith(fparentID):
                lineageList.append(key)
As mentioned, I can get a list of any keys whose file names match the criteria (by commenting out the elif) or of any keys whose directories match the criteria (by making the elif the primary if statement and commenting out the other), but I cannot get it to work with both, regardless of order. The script just keeps running and has to be forced to quit. I am using PyScripter, and the bottom left corner says "Running" at the beginning; that eventually disappears, but the script continues to run until I stop it. Each version finishes within seconds when run separately.
Full code below:
    import os

    def getInput():
        # Tkinter for GUI file selection
        from Tkinter import Tk
        from tkFileDialog import *
        root = Tk()
        root.withdraw()
        # input file, though changed to be called infile just in case
        infile = askopenfilename(parent=root,title="Choose a file.")
        return infile

    def getLineageList(infile):
        # Tkinter didn't like to return file paths with the same slashes, so it broke everything.
        inFullPath = str(infile.lower()).replace("/","\\")
        # Parse by backslash, (basically directory) and list in reverse order, with the file itself being first in the list, then its directory, and so on up the tree.
        inFullPartsRL=list(reversed(inFullPath.rsplit("\\",5)))
        # The path to the directory that the file resides in (not used?)
        inFileDir = inFullPath.rsplit("\\",1)[0]
        # The name of the input file, minus the extension.
        inFileName= inFullPartsRL[0].rsplit(".",1)[0]
        # A list of all parts of the file name (sans extension), split by an underscore.
        inFileParts=inFileName.split("_",5)
        # The filename parts
        prefix= inFileParts[0]
        # This might get us to our HUC folder
        # HUCdir = "N:\\Wetlands\\HSSD\\HUC10\\"+prefix
        HUCdir = "C:\\svn\\FileFinder\\"+prefix
        import os.path

        # A function to split every file into its parts.
        def parsefile(pfile):
            # All parts of path, separated by "\\", up to 5 parts.
            pinFullPartsRL=list(reversed(pinFullPath.rsplit("\\",5)))
            # The first position in the above list of path pieces - the file name - with extension removed.
            pinFileName= pinFullPartsRL[0].rsplit(".",1)[0]
            # The file name itself, split into as many as 5 parts.
            pinFileParts=pinFileName.split("_",5)
            # The prefix, or 1st position on the left-hand side of the file name.
            pprefix= pinFileParts[0]
            # The directory in which the file resides:
            pinFileDirName = pinFullPartsRL[1]
            # A way to deal with parts that may or may not exist - the active and parent IDs.
            try:
                pactiveID= pinFileParts[1]
            except IndexError:
                pactiveID="none"
            try:
                pparentID= pinFileParts[2]
            except IndexError:
                pparentID="none"
            # Stores all of the parts in a list (written later to a dictionary)
            plist=[pfile, pinFileName, pprefix, pactiveID, pparentID, pinFileDirName]
            return plist

        # An empty list in which to store all the files in the given directory.
        HUCfilelist=[]
        # The piece that creates the file list (path and name) of all files in the HUC directory.
        for dirpath, dirnames, filenames in os.walk(HUCdir):
            for f in filenames:
                HUCfilelist.append(os.path.join(dirpath.lower(),f.lower()))
        pdict={}
        # The part that pulls apart every file in the list. Tried assigning a number as a key but the filepath seemed like a better idea because the value in one pair can be the key in another.
        for f in HUCfilelist:
            pinFullPath = f # Create variable for parsefile function to use (pinFullPath is in function, so set input (f) as that variable.)
            plist= parsefile(f)
            pdict[f]=(plist)
        # print pdict
        # Starts the list with the input file.
        lineageList=[inFullPath]
        # For each item on the list it finds the file directly before it. Uses the dictionary created earlier. Each item in the lineageList is also a key in the dictionary, so we pull out the associated parentID (fparentID) and match it to the activeID in the larger list (kactiveID) OR the beginning of the file directory (kfileDirName) and add those file names to the lineageList. The process then repeats with the newly added file(s), hopefully.
        for f in lineageList:
            fparentID=pdict[f][4]
            for key, value in pdict.iteritems():
                kactiveID=pdict[key][3]
                kfileDirName=pdict[key][5]
                # Add a file to the list if its activeID matches the current parentID.
                if kactiveID==fparentID:
                    lineageList.append(key)
                # If the current parentID isn't in the file name, add a file to the list if the directory starts with the parentID.
                elif kfileDirName.startswith(fparentID):
                    lineageList.append(key)
        return lineageList

    final = getLineageList(getInput())
    print final