I am stuck on one part of my code and hope someone can help me out, since I am new to programming.
I am attempting the following steps:
- Read in the xml files from a directory (Success)
- Determine which document version is used (2 choices, Success)
- Parse the contract tag data (Fails with the error: Traceback (most recent call last): File "D:/PyCharm Projects/Norway Parser/NO Parser.py", line 33, in <module>: if award.getElementsByTagName('CONTRACT_NO')[0] != -1: IndexError: list index out of range)
- Construct a csv list of the files
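For the first step, slicing the result of os.listdir() by position is fragile, because the order of the entries is arbitrary and non-XML files can end up in the list. A minimal sketch of selecting files by extension instead, assuming a hypothetical helper named list_xml_files and a throwaway directory with made-up contents:

```python
import tempfile
from pathlib import Path

def list_xml_files(directory):
    """Return the .xml files in directory, sorted by name.

    list_xml_files is a hypothetical helper name used for illustration.
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == '.xml'
    )

# Demonstration in a temporary directory; the file contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('123123-2017.xml', '055202-2017.xml', 'NO Parser.py'):
        (Path(tmp) / name).write_text('<TED_EXPORT/>')
    found = [p.name for p in list_xml_files(tmp)]

print(found)
```

With this approach the script file itself is never handed to the XML parser, whatever position it has in the directory listing.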
I am confused by the error because the code runs successfully on individual files (i.e. when the 'files' variable is set to a single name, such as files = '123123-2017.xml'), but when I loop through all of the files in my directory I get the error shown in the third step above.
Below the code, I have included one of the XML files so that you can see the structure of the documents. Thank you in advance for the help.
    import os
    from xml.etree import ElementTree
    from xml.dom import minidom

    # read the files in the directory
    files = os.listdir()[1:len(os.listdir())-1]

    # loop the files to read all of the files
    for f in files:
        xmldoc = minidom.parse(f)
        case = 0
        # determine which document version was used in the xml file
        doc_loc = xmldoc.getElementsByTagName('TED_EXPORT')
        for loc in doc_loc:
            doc = loc.getAttribute('xsi:schemaLocation')
            if doc.find('R2.0.8.S02.E01') != -1:
                case = 1
            elif doc.find('R2.0.9.S01.E01') != -1:
                case = 2
        print(f, case)
        if case == 1:
            pass
        elif case == 2:
            # scan "AWARD_CONTRACT" Tag and get data
            award_contract = xmldoc.getElementsByTagName('AWARD_CONTRACT')
            for award in award_contract:
                if award.getAttribute("ITEM") != -1:
                    item_no = award.getAttribute("ITEM")
                else:
                    item_no = 'NaN'
                if award.getElementsByTagName('CONTRACT_NO')[0] != -1:
                    contract_no = award.getElementsByTagName('CONTRACT_NO')[0]
                    print(contract_no)
                else:
                    contract_no = 'NaN'
                    print(contract_no)
                if award.getElementsByTagName('LOT_NO')[0] != -1:
                    lot = award.getElementsByTagName('LOT_NO')[0]
                else:
                    lot = 'NaN'
                if award.getElementsByTagName('P')[0] != -1:
                    title = award.getElementsByTagName('P')[0]
                else:
                    title = 'NaN'
                if award.getElementsByTagName('DATE_CONCLUSION_CONTRACT')[0] != -1:
                    date = award.getElementsByTagName('DATE_CONCLUSION_CONTRACT')[0]
                else:
                    date = 'NaN'
                contractors = xmldoc.getElementsByTagName('CONTRACTOR')
                for contractor in contractors:
                    name = contractor.getElementsByTagName('OFFICIALNAME')[0]
                    address = contractor.getElementsByTagName('ADDRESS')[0]
                    town = contractor.getElementsByTagName('TOWN')[0]
                    zip_code = contractor.getElementsByTagName('POSTAL_CODE')[0]
                    c = contractor.getElementsByTagName('COUNTRY')[0]
                    country = c.getAttribute("VALUE")
                value = award.getElementsByTagName('VAL_TOTAL')[0]
                currency = value.getAttribute("CURRENCY")
                print(item_no, ',', contract_no.firstChild.data, ',', lot.firstChild.data, ',', title.firstChild.data,
                      ',', date.firstChild.data, ',', name.firstChild.data, ',', address.firstChild.data, ',',
                      town.firstChild.data, ',', zip_code.firstChild.data, ',', country, ',', value.firstChild.data,
                      ',', currency)
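The IndexError comes from indexing [0] on the result of getElementsByTagName: when a tag is absent the result is an empty list, so the lookup fails before the != -1 comparison is ever evaluated. A minimal sketch of a safer lookup, assuming a hypothetical helper named first_text (it is not part of minidom) and a made-up two-award XML string:

```python
from xml.dom import minidom

def first_text(node, tag, default='NaN'):
    """Return the text of the first <tag> descendant of node, or default.

    first_text is a hypothetical helper for illustration, not a minidom API.
    """
    matches = node.getElementsByTagName(tag)
    # An empty result means the tag is absent; indexing [0] would raise.
    if not matches:
        return default
    child = matches[0].firstChild
    # An empty element such as <LOT_NO/> has no child node at all.
    if child is None or child.nodeType != child.TEXT_NODE:
        return default
    return child.data.strip()

# Made-up sample: the second award has no CONTRACT_NO element.
SAMPLE = """<ROOT>
  <AWARD_CONTRACT ITEM="1"><CONTRACT_NO>16/76552</CONTRACT_NO></AWARD_CONTRACT>
  <AWARD_CONTRACT ITEM="2"><TITLE><P>No number here</P></TITLE></AWARD_CONTRACT>
</ROOT>"""

xmldoc = minidom.parseString(SAMPLE)
contract_numbers = [
    first_text(award, 'CONTRACT_NO')
    for award in xmldoc.getElementsByTagName('AWARD_CONTRACT')
]
print(contract_numbers)
```

The same applies to attributes: getAttribute returns an empty string for a missing attribute, never -1, so a test such as award.getAttribute("ITEM") or 'NaN' may be closer to what was intended.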
XML Files:
<?xml version="1.0" encoding="UTF-8"?>
<TED_EXPORT EDITION="2017030" DOC_ID="055202-2017" xsi:schemaLocation="http://ift.tt/2orFFUR TED_EXPORT.xsd" VERSION="R2.0.9.S01.E01" xmlns:xlink="http://ift.tt/PGV9lw" xmlns:xsi="http://ift.tt/ra1lAU"
xmlns="http://ift.tt/2orFFUR">
<TECHNICAL_SECTION>
<RECEPTION_ID>17-057004-001</RECEPTION_ID>
<DELETION_DATE>20170521</DELETION_DATE>
<FORM_LG_LIST>EN </FORM_LG_LIST>
<COMMENTS>From Convertor</COMMENTS>
</TECHNICAL_SECTION>
<LINKS_SECTION>
<XML_SCHEMA_DEFINITION_LINK xlink:title="TED WEBSITE" xlink:type="simple" xlink:href="http://ted.europa.eu" />
<OFFICIAL_FORMS_LINK xlink:type="simple" xlink:href="http://ted.europa.eu" />
<FORMS_LABELS_LINK xlink:type="simple" xlink:href="http://ted.europa.eu" />
<ORIGINAL_CPV_LINK xlink:type="simple" xlink:href="http://ted.europa.eu" />
<ORIGINAL_NUTS_LINK xlink:type="simple" xlink:href="http://ted.europa.eu" />
</LINKS_SECTION>
<CODED_DATA_SECTION>
<REF_OJS>
<COLL_OJ>S</COLL_OJ>
<NO_OJ>30</NO_OJ>
<DATE_PUB>20170211</DATE_PUB>
</REF_OJS>
<NOTICE_DATA>
<NO_DOC_OJS>2017/S 030-055202</NO_DOC_OJS>
<URI_LIST>
<URI_DOC LG="EN">http://ift.tt/2mJF8RG</URI_DOC>
</URI_LIST>
<LG_ORIG>EN</LG_ORIG>
<ISO_COUNTRY VALUE="NO" />
<IA_URL_GENERAL>http://ift.tt/2orDfFV</IA_URL_GENERAL>
<ORIGINAL_CPV CODE="45000000">Construction work</ORIGINAL_CPV>
<ORIGINAL_NUTS CODE="NO">NORGE</ORIGINAL_NUTS>
<CA_CE_NUTS CODE="NO">NORGE</CA_CE_NUTS>
<TENDERER_NUTS CODE="NO">NORGE</TENDERER_NUTS>
<VALUES>
<VALUE CURRENCY="NOK" TYPE="PROCUREMENT_TOTAL">18249847.00</VALUE>
</VALUES>
<REF_NOTICE>
<NO_DOC_OJS>2016/S 172-310344</NO_DOC_OJS>
</REF_NOTICE>
</NOTICE_DATA>
<CODIF_DATA>
<DS_DATE_DISPATCH>20170210</DS_DATE_DISPATCH>
<AA_AUTHORITY_TYPE CODE="3">Regional or local authority</AA_AUTHORITY_TYPE>
<TD_DOCUMENT_TYPE CODE="7">Contract award notice</TD_DOCUMENT_TYPE>
<NC_CONTRACT_NATURE CODE="1">Works</NC_CONTRACT_NATURE>
<PR_PROC CODE="1">Open procedure</PR_PROC>
<RP_REGULATION CODE="B">European Economic Area (EEA), with participation by GPA countries</RP_REGULATION>
<TY_TYPE_BID CODE="9">Not applicable</TY_TYPE_BID>
<AC_AWARD_CRIT CODE="1">Lowest price</AC_AWARD_CRIT>
<MA_MAIN_ACTIVITIES CODE="S">General public services</MA_MAIN_ACTIVITIES>
<HEADING>03A03</HEADING>
<INITIATOR>03</INITIATOR>
<DIRECTIVE VALUE="2014/24/EU" />
</CODIF_DATA>
</CODED_DATA_SECTION>
<TRANSLATION_SECTION>
<ML_TITLES>
<ML_TI_DOC LG="BG">
<TI_CY>Норвегия</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Строителни и монтажни работи</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="CS">
<TI_CY>Norsko</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Stavební práce</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="DA">
<TI_CY>Norge</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Bygge- og anlægsarbejder</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="DE">
<TI_CY>Norwegen</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Bauarbeiten</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="EL">
<TI_CY>Νορβηγία</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Κατασκευαστικές εργασίες</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="EN">
<TI_CY>Norway</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Construction work</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="ES">
<TI_CY>Noruega</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Trabajos de construcción</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="ET">
<TI_CY>Norra</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Ehitustööd</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="FI">
<TI_CY>Norja</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Rakennustyöt</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="FR">
<TI_CY>Norvège</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Travaux de construction</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="GA">
<TI_CY>Iorua, an</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Construction work</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="HR">
<TI_CY>Norveška</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Građevinski radovi</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="HU">
<TI_CY>Norvégia</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Építési munkák</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="IT">
<TI_CY>Norvegia</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Lavori di costruzione</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="LT">
<TI_CY>Norvegija</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Statybos darbai</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="LV">
<TI_CY>Norvēģija</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Celtniecības darbi</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="MT">
<TI_CY>in-Norveġja</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Xogħol tal-kostruzzjoni</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="NL">
<TI_CY>Noorwegen</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Bouwwerkzaamheden</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="PL">
<TI_CY>Norwegia</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Roboty budowlane</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="PT">
<TI_CY>Noruega</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Construção</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="RO">
<TI_CY>Norvegia</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Lucrări de construcţii</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="SK">
<TI_CY>Nórsko</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Stavebné práce</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="SL">
<TI_CY>Norveška</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Gradbena dela</P>
</TI_TEXT>
</ML_TI_DOC>
<ML_TI_DOC LG="SV">
<TI_CY>Norge</TI_CY>
<TI_TOWN>Molde</TI_TOWN>
<TI_TEXT>
<P>Anläggningsarbete</P>
</TI_TEXT>
</ML_TI_DOC>
</ML_TITLES>
<ML_AA_NAMES>
<AA_NAME LG="EN">Statens vegvesen Region midt</AA_NAME>
</ML_AA_NAMES>
</TRANSLATION_SECTION>
<FORM_SECTION>
<F03_2014 LG="EN" FORM="F03" CATEGORY="ORIGINAL">
<CONTRACTING_BODY>
<ADDRESS_CONTRACTING_BODY>
<OFFICIALNAME>Statens vegvesen Region midt</OFFICIALNAME>
<NATIONALID>971032081</NATIONALID>
<ADDRESS>Fylkeshuset</ADDRESS>
<TOWN>Molde</TOWN>
<POSTAL_CODE>6404</POSTAL_CODE>
<COUNTRY VALUE="NO" />
<CONTACT_POINT>Statens vegvesen</CONTACT_POINT>
<PHONE>+47 02030</PHONE>
<E_MAIL>firmapost-midt@vegvesen.no</E_MAIL>
<NUTS CODE="NO" />
<URL_GENERAL>http://ift.tt/2mJvcHT</URL_GENERAL>
<URL_BUYER>http://ift.tt/2orA7dd</URL_BUYER>
</ADDRESS_CONTRACTING_BODY>
<CA_TYPE VALUE="REGIONAL_AUTHORITY" />
<CA_ACTIVITY VALUE="GENERAL_PUBLIC_SERVICES" />
</CONTRACTING_BODY>
<OBJECT_CONTRACT>
<TITLE>
<P>County roads 17 and 720 Dyrstad-Sprova-Malm Preparatory road contract E-1.1.</P>
</TITLE>
<CPV_MAIN>
<CPV_CODE CODE="45000000" />
</CPV_MAIN>
<TYPE_CONTRACT CTYPE="WORKS" />
<SHORT_DESCR>
<P>Preparatory road contract E-1.1 is the first construction stage in the project 17 and 720 Dyrstad — Sprova — Malm. The contract shall construct access roads to a land abutment for a new bridge, on both sides of Beitstadsundet, as well as a connection
road between Tjuin industrial area and the municipal road that goes to Strømnes. Future road development shall also be prepared in this contract by laying out bias at Østvik.</P>
<P>The contract includes cutting down forest, blasting, mass haulage, filling and levelling for roads.</P>
<P>The contract is subject to the Norwegian Parliament approving the toll money application for the project. The Norwegian Parliament's decision is (expected to be) in October or December 2016. If the application is approved, the contract will
be signed by 14.1.2017.</P>
</SHORT_DESCR>
<VAL_TOTAL CURRENCY="NOK">18249847.00</VAL_TOTAL>
<NO_LOT_DIVISION/>
<OBJECT_DESCR ITEM="1">
<CPV_ADDITIONAL>
<CPV_CODE CODE="45000000" />
</CPV_ADDITIONAL>
<NUTS CODE="NO" />
<SHORT_DESCR>
<P>The construction of access roads to a land abutment for a new bridge, on both sides of Beitstadsundet, as well as a connection road between Tjuin industrial area and the municipal road that goes to Strømnes. A future road development shall
also be prepared in this contract by laying out bias at Østvik.</P>
</SHORT_DESCR>
<AC_PRICE/>
<NO_OPTIONS/>
<NO_EU_PROGR_RELATED/>
</OBJECT_DESCR>
</OBJECT_CONTRACT>
<PROCEDURE>
<PT_OPEN/>
<CONTRACT_COVERED_GPA/>
<NOTICE_NUMBER_OJ>2016/S 172-310344</NOTICE_NUMBER_OJ>
</PROCEDURE>
<AWARD_CONTRACT ITEM="1">
<CONTRACT_NO>16/76552</CONTRACT_NO>
<TITLE>
<P>County roads 17 and 720 Dyrstad — Sprova — Malm — Preparatory road works contract E-1.1</P>
</TITLE>
<AWARDED_CONTRACT>
<DATE_CONCLUSION_CONTRACT>2017-01-25</DATE_CONCLUSION_CONTRACT>
<NB_TENDERS_RECEIVED>8</NB_TENDERS_RECEIVED>
<NO_AWARDED_TO_GROUP/>
<CONTRACTOR>
<ADDRESS_CONTRACTOR>
<OFFICIALNAME>Odd Einar Kne AS</OFFICIALNAME>
<TOWN>Steinkjer</TOWN>
<COUNTRY VALUE="NO" />
<NUTS CODE="NO" />
</ADDRESS_CONTRACTOR>
<SME/>
</CONTRACTOR>
<VAL_TOTAL CURRENCY="NOK">18249847.00</VAL_TOTAL>
</AWARDED_CONTRACT>
</AWARD_CONTRACT>
<COMPLEMENTARY_INFO>
<ADDRESS_REVIEW_BODY>
<OFFICIALNAME>Statens vegvesen Region midt</OFFICIALNAME>
<TOWN>Molde</TOWN>
<COUNTRY VALUE="NO" />
</ADDRESS_REVIEW_BODY>
<DATE_DISPATCH_NOTICE>2017-02-10</DATE_DISPATCH_NOTICE>
</COMPLEMENTARY_INFO>
</F03_2014>
</FORM_SECTION>
</TED_EXPORT>
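Note that in this sample the AWARD_CONTRACT element contains no LOT_NO, ADDRESS or POSTAL_CODE tags, so any lookup that indexes [0] on those would fail for this file as well. In the sample as shown, the version string also appears in the VERSION attribute of TED_EXPORT. A minimal sketch of the final CSV step under those observations, using a trimmed-down copy of the notice, a hypothetical helper named text_or_nan, and column names chosen only for illustration:

```python
import csv
import io
from xml.dom import minidom

# Trimmed-down copy of the notice above, keeping only the fields used here.
SAMPLE = """<TED_EXPORT VERSION="R2.0.9.S01.E01">
  <AWARD_CONTRACT ITEM="1">
    <CONTRACT_NO>16/76552</CONTRACT_NO>
    <AWARDED_CONTRACT>
      <DATE_CONCLUSION_CONTRACT>2017-01-25</DATE_CONCLUSION_CONTRACT>
      <CONTRACTOR>
        <ADDRESS_CONTRACTOR>
          <OFFICIALNAME>Odd Einar Kne AS</OFFICIALNAME>
          <TOWN>Steinkjer</TOWN>
          <COUNTRY VALUE="NO" />
        </ADDRESS_CONTRACTOR>
      </CONTRACTOR>
      <VAL_TOTAL CURRENCY="NOK">18249847.00</VAL_TOTAL>
    </AWARDED_CONTRACT>
  </AWARD_CONTRACT>
</TED_EXPORT>"""

def text_or_nan(node, tag):
    """Hypothetical helper: text of the first <tag> under node, else 'NaN'."""
    found = node.getElementsByTagName(tag)
    if found and found[0].firstChild is not None:
        return found[0].firstChild.data.strip()
    return 'NaN'

xmldoc = minidom.parseString(SAMPLE)
version = xmldoc.documentElement.getAttribute('VERSION')

rows = []
for award in xmldoc.getElementsByTagName('AWARD_CONTRACT'):
    values = award.getElementsByTagName('VAL_TOTAL')
    currency = values[0].getAttribute('CURRENCY') if values else 'NaN'
    rows.append([
        award.getAttribute('ITEM') or 'NaN',
        text_or_nan(award, 'CONTRACT_NO'),
        text_or_nan(award, 'LOT_NO'),
        text_or_nan(award, 'DATE_CONCLUSION_CONTRACT'),
        text_or_nan(award, 'OFFICIALNAME'),
        text_or_nan(award, 'ADDRESS'),
        text_or_nan(award, 'TOWN'),
        text_or_nan(award, 'VAL_TOTAL'),
        currency,
    ])

# Write to an in-memory buffer; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['item', 'contract_no', 'lot', 'date', 'name',
                 'address', 'town', 'value', 'currency'])
writer.writerows(rows)
print(version)
print(buffer.getvalue())
```

Using csv.writer rather than printing values separated by ',' means that fields containing commas, such as some contract titles, get quoted correctly.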