update dac member_name to controlled list #1285

Open
wants to merge 4 commits into
base: master
133 changes: 133 additions & 0 deletions bin/migrate/choices/dac_members_2022_05.yaml
@@ -0,0 +1,133 @@
# Departmental Audit Committee member names to be used in dropdown on recombinant templates

Alexander, Jim: Alexander, Jim
Almeida-Côté, Iris: Almeida-Côté, Iris
Anderson, Roxanne: Anderson, Roxanne
Bains, Manjit: Bains, Manjit
Baker, William: Baker, William
Bates, Paul: Bates, Paul
Bradbury, Victoria: Bradbury, Victoria
Brant, Daniel: Brant, Daniel
Cochrane, Ken: Cochrane, Ken
Cook-Bennett, Gail: Cook-Bennett, Gail
Corriveau, Pierre: Corriveau, Pierre
Costante, Kevin: Costante, Kevin
Cotton, Katherine: Cotton, Katherine
Crabbe, Ray: Crabbe, Ray
Cramer, Joy: Cramer, Joy
Dancey, Kevin: Dancey, Kevin
De La Durantaye, Josée: De La Durantaye, Josée
de Saldanha, Adrian: de Saldanha, Adrian
del Val, Helen: del Val, Helen
Denhamer, Janet: Denhamer, Janet
Dhaliwal, Maninder: Dhaliwal, Maninder
Dias D'Souza, Jeanette: Dias D'Souza, Jeanette
Dicerni, Richard: Dicerni, Richard
Domm, John: Domm, John
Dorrington, Cassandra: Dorrington, Cassandra
Drouin, Nathalie: Drouin, Nathalie
Duranceau, France-Élaine: Duranceau, France-Élaine
Edwards, Laurie: Edwards, Laurie
Ell, Lori: Ell, Lori
Ferguson, Don: Ferguson, Don
Finn, Patrick: Finn, Patrick
Forster, John: Forster, John
Fortin, Hélène: Fortin, Hélène
Fregin, Clifford: Fregin, Clifford
Gaudet, Serge: Gaudet, Serge
Gillis, Angeline: Gillis, Angeline
Girard, Aline: Girard, Aline
Godbout, Claude: Godbout, Claude
Gorbet, Frederick: Gorbet, Frederick
Gouin, Suzanne: Gouin, Suzanne
Hanlon, Robert: Hanlon, Robert
Hanna, John: Hanna, John
Harnish, Victoria: Harnish, Victoria
Harvey, Manon: Harvey, Manon
Hollins, Leah: Hollins, Leah
Jackson, Karen: Jackson, Karen
Jacobsen, Patricia: Jacobsen, Patricia
Jamal, Karim: Jamal, Karim
Janes, Glenn: Janes, Glenn
Jeraj, Shenaz: Jeraj, Shenaz
Jung, Dr. Hans: Jung, Dr. Hans
Katiya, Alan: Katiya, Alan
Kei, Wendy: Kei, Wendy
Kiamanesh, Mitra: Kiamanesh, Mitra
Kieley, Barbara: Kieley, Barbara
Kirkpatrick, Kent: Kirkpatrick, Kent
Kirvan, Myles: Kirvan, Myles
La Rose, Jean: La Rose, Jean
Lafortune, Andrée: Lafortune, Andrée
Lahey, James: Lahey, James
Lalli, Sandip: Lalli, Sandip
Lamoureux, Kenneth K.: Lamoureux, Kenneth K.
Leduc, Raymond: Leduc, Raymond
Liston-Heyes, Catherine: Liston-Heyes, Catherine
Lizotte-MacPherson, Linda: Lizotte-MacPherson, Linda
Lussier, Gaétan: Lussier, Gaétan
Maheu, Lorraine: Maheu, Lorraine
Malik, Tariq: Malik, Tariq
Maracle, Thomas: Maracle, Thomas
Marcoux, Rennie: Marcoux, Rennie
Maxwell, Dr. Linda: Maxwell, Dr. Linda
Maxwell, Neil: Maxwell, Neil
McKenzie, Michele: McKenzie, Michele
McLaughlin, Michael: McLaughlin, Michael
McLellan, George: McLellan, George
McWhinnie, John: McWhinnie, John
Meredith, Daphne: Meredith, Daphne
Miller, David: Miller, David
Minto, Shahid: Minto, Shahid
Mitchell, James (Jim): Mitchell, James (Jim)
Morris, Suzanne: Morris, Suzanne
Nadeau, Elisabeth: Nadeau, Elisabeth
Negris, Cybele: Negris, Cybele
Nelson, Michael: Nelson, Michael
Newman, Deborah: Newman, Deborah
Norminton, Monica: Norminton, Monica
Ostapchuk, Peter: Ostapchuk, Peter
Oxley, Loraine: Oxley, Loraine
Paquette, Jacques: Paquette, Jacques
Patry, Gilles: Patry, Gilles
Pelman, Dr. Alan: Pelman, Dr. Alan
Philip-Katyal, Ruby: Philip-Katyal, Ruby
Poitras, Anne-Marie: Poitras, Anne-Marie
Pollard, Carol: Pollard, Carol
Pratt, Betty-Anne: Pratt, Betty-Anne
Proulx, Robert: Proulx, Robert
Péan, Chantal: Péan, Chantal
Rahnema, Ali: Rahnema, Ali
Renaud, Anne-Marie: Renaud, Anne-Marie
Roberts, Meena: Roberts, Meena
Rossetti, John: Rossetti, John
Ruta, Basia: Ruta, Basia
Scott, Kimberly: Scott, Kimberly
Sheane, Inga: Sheane, Inga
Sheikh, Munir: Sheikh, Munir
Sinclair, Helen: Sinclair, Helen
Singleton, Jon: Singleton, Jon
Smith, David: Smith, David
Smith, Mindy: Smith, Mindy
Somani, Moyez: Somani, Moyez
Soria, Elizabeth (Liz): Soria, Elizabeth (Liz)
Stark, Dr. Deborah: Stark, Dr. Deborah
Sumara, Joyce: Sumara, Joyce
Swan, Carole: Swan, Carole
Sweetnam, Albert: Sweetnam, Albert
Talbot-Allan, Laura: Talbot-Allan, Laura
Tessier, Sylvie: Tessier, Sylvie
Thompson, John: Thompson, John
Thompson, Stanley: Thompson, Stanley
Tremblay, Pierre: Tremblay, Pierre
Turnbull, Norman: Turnbull, Norman
Van-Erum, Micheline: Van-Erum, Micheline
Voghel, Sylvie: Voghel, Sylvie
Wallace, Stephen: Wallace, Stephen
Weeks, Joanne: Weeks, Joanne
Whipp, Nancy: Whipp, Nancy
Wong-Alafriz, Kay: Wong-Alafriz, Kay
Yeates, Glenda: Yeates, Glenda
Yeates, Neil: Yeates, Neil
Zaarour, Roula: Zaarour, Roula
Zussman, Dr. David: Zussman, Dr. David
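Every entry in the choices file above maps a "Surname, Given" key to an identical label. A minimal sketch of a consistency check for such a mapping follows. The function name and the sample data are hypothetical, and the dict stands in for what `yaml.safe_load` would return for this file.

```python
def inconsistent_choices(choices):
    """Return keys whose label differs from the key, or that lack a ", " separator."""
    problems = []
    for key, label in choices.items():
        if key != label or ", " not in key or key != key.strip():
            problems.append(key)
    return sorted(problems)


# Hypothetical sample standing in for the parsed YAML mapping.
sample = {
    "Alexander, Jim": "Alexander, Jim",
    "Jung, Dr. Hans": "Jung, Dr. Hans",
    "Swan, Carole": "Swan Carole",  # label drifted from its key
    "Cassandra": "Cassandra",       # no surname, so no ", " separator
}
problems = inconsistent_choices(sample)
```

A check like this could run before the migration so that a mistyped label does not end up in the dropdown.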
123 changes: 123 additions & 0 deletions bin/migrate/migrate_dac_2022_05.py
@@ -0,0 +1,123 @@
#!/usr/bin/env python
# coding=utf-8

import codecs
from decimal import Decimal
import os
from re import sub
import sys
import unicodecsv
from unicodedata import normalize
import yaml


FIELDNAMES = [
    u'reporting_period',
    u'line_number',
    u'member_name',
    u'province',
    u'role',
    u'meeting_hours',
    u'other_hours',
    u'remuneration',
    u'travel_expenses',
    u'notes_en',
    u'notes_fr',
    u'record_created',
    u'record_modified',
    u'user_modified',
    u'owner_org',
    u'owner_org_title',
]

# the input CSV is expected to start with a UTF-8 byte order mark
assert sys.stdin.read(3) == codecs.BOM_UTF8

in_csv = unicodecsv.DictReader(sys.stdin, encoding='utf-8')
out_csv = unicodecsv.DictWriter(sys.stdout, fieldnames=FIELDNAMES, encoding='utf-8')
out_csv.writeheader()

# optional first argument: path of a CSV that collects rows with unmatched names
error_csv = None
if sys.argv[1:]:
    error_csv = unicodecsv.DictWriter(open(sys.argv[1], 'wb'),
                                      fieldnames=FIELDNAMES,
                                      encoding='utf-8')
    error_csv.writeheader()

# get DAC member names
with open(os.path.join(os.path.dirname(__file__),
                       "choices/dac_members_2022_05.yaml"), "r") as dac_member_yaml:
    members = yaml.safe_load(dac_member_yaml)

try:
    for line in in_csv:
        # change money values such as $7,300.99 to 7300.99
        line['remuneration'] = Decimal(sub(r'[^\d.]', '', line['remuneration']))
        line['travel_expenses'] = Decimal(sub(r'[^\d.]', '', line['travel_expenses']))

        # change free text member names to a controlled list
        line[u'member_name'] = line[u'member_name'].strip()

        # reset per-row state so results from a previous row cannot leak in
        matches = []
        known_typo = True

        # map member names with typos where obvious
        if line[u'member_name'] == u'Adrian deSaldanha':
            best_match = u'de Saldanha, Adrian'
        elif line[u'member_name'] == u'Angie Gillis':
            best_match = u'Gillis, Angeline'
        elif line[u'member_name'] == u'Carola Swan':
            best_match = u'Swan, Carole'
        elif line[u'member_name'] == u'Cassandra':
            best_match = u'Dorrington, Cassandra'
        elif line[u'member_name'] == u'Fredrick Gorbet':
            best_match = u'Gorbet, Frederick'
        elif line[u'member_name'] == u'Inga Shaene':
            best_match = u'Sheane, Inga'
        elif line[u'member_name'] == u'Almeida Iris-Côté':
            best_match = u'Almeida-Côté, Iris'
        elif line[u'member_name'] in [u'M. Sheikh', u'M. Shiekh', u'Munir Shiekh']:
            best_match = u'Sheikh, Munir'
        elif line[u'member_name'] == u'M. Roberts':
            best_match = u'Roberts, Meena'
        elif line[u'member_name'] == u'N. Whipp':
            best_match = u'Whipp, Nancy'
        elif line[u'member_name'] == u'Norman Turbull':
            best_match = u'Turnbull, Norman'
        elif line[u'member_name'] == u'Raymond Crabbe':
            best_match = u'Crabbe, Ray'
        elif line[u'member_name'] == u'Ruby Philip-Kaytal':
            best_match = u'Philip-Katyal, Ruby'
        elif line[u'member_name'] == u'Wendy Key':
            best_match = u'Kei, Wendy'

        else:
            # no hand-mapped typo: compare each name part, lower-cased and
            # with accents removed, against every controlled member name
            known_typo = False
            name_parts = line[u'member_name'].split(' ')
            for part in name_parts:
                part = part.strip()
                part = part.strip(',')
                part = part.lower()
                part = unicode(normalize('NFKD', part)
                               .encode('ASCII', 'ignore'),
                               'utf-8')
                if not part:
                    # an empty part would match every member name
                    continue

                for member in map(unicode, members.keys()):
                    if part in unicode(
                            normalize('NFKD', member.lower()).encode('ASCII', 'ignore'),
                            'utf-8'):
                        matches.append(member)
            if len(matches):
                best_match = max(set(matches), key=matches.count)
            else:
                best_match = 'NIL'

        # hand-mapped typos are accepted as-is; a fuzzy match is accepted
        # only when more than one name part points to the same member
        if known_typo or matches.count(best_match) > 1:
            line[u'member_name'] = best_match
        else:
            if error_csv:
                error_csv.writerow(line)

        out_csv.writerow(line)

except KeyError:
    if 'warehouse' in sys.argv:
        sys.exit(85)
    else:
        raise
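The fuzzy branch of the script above counts how many parts of the free-text name occur inside each controlled name, after lower-casing and removing accents, and accepts the top candidate only when at least two parts agree. A minimal, self-contained sketch of that idea in Python 3 follows (the script itself targets Python 2). The names `fold`, `best_member` and `min_hits`, and the short member list, are illustrative assumptions rather than part of the PR.

```python
import unicodedata
from collections import Counter


def fold(text):
    """Lower-case text and drop accents, e.g. 'Côté' becomes 'cote'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return decomposed.encode("ascii", "ignore").decode("ascii")


def best_member(free_text, members, min_hits=2):
    """Return the member matched by at least min_hits name parts, else None."""
    hits = Counter()
    for part in free_text.split():
        part = fold(part.strip(","))
        if not part:
            continue  # an empty part would match every member
        for member in members:
            if part in fold(member):  # substring test, as in the script
                hits[member] += 1
    if not hits:
        return None
    member, count = hits.most_common(1)[0]
    return member if count >= min_hits else None


# Small illustrative subset of the controlled list.
members = ["Almeida-Côté, Iris", "Fortin, Hélène", "Smith, David", "Smith, Mindy"]
```

With this subset, `best_member("Helene Fortin", members)` resolves to the accented controlled name, while a bare `"Smith"` is rejected because only one part matches each candidate. Substring matching is deliberately loose, so short parts such as initials can hit many members; the two-part threshold is what keeps such rows out of the controlled list.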
13 changes: 8 additions & 5 deletions ckanext/canada/tables/dac.yaml
@@ -68,11 +68,13 @@ resources:
obligation: Mandatory
excel_required: true
form_required: true
-format_type: Free text
+format_type: Controlled List
validation: This field must not be empty
occurrence: Single
datastore_type: text
form_attrs:
size: 60
excel_full_text_choices: True
choices_file: choices/dac_members.yaml
excel_error_formula: 'FALSE'

- datastore_id: province
label:
@@ -211,7 +213,7 @@ resources:
obligation: Mandatory
excel_required: true
form_required: true
-datastore_type: money
+datastore_type: numeric

- datastore_id: travel_expenses
label:
@@ -223,7 +225,7 @@ resources:
obligation: Mandatory
excel_required: true
form_required: true
-datastore_type: money
+datastore_type: numeric

- datastore_id: notes_en
label:
@@ -288,6 +290,7 @@ resources:
errors := errors || choice_error(NEW.reporting_period, {reporting_period}, 'reporting_period');
errors := errors || required_error(NEW.line_number, 'line_number');
errors := errors || required_error(NEW.member_name, 'member_name');
+errors := errors || choice_error(NEW.member_name, {member_name}, 'member_name');
errors := errors || required_error(NEW.province, 'province');
errors := errors || choice_error(NEW.province, {province}, 'province');
errors := errors || required_error(NEW.meeting_hours, 'meeting_hours');
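The change of `remuneration` and `travel_expenses` from `money` to `numeric` in this file goes together with the migration script's regex `[^\d.]`, which strips currency formatting before the value is parsed. A minimal Python 3 sketch of that normalization is below. The helper name `money_to_decimal` is hypothetical; the regex is the one used in the script.

```python
import re
from decimal import Decimal


def money_to_decimal(text):
    """Keep only digits and the decimal point, then parse the rest as a Decimal."""
    return Decimal(re.sub(r"[^\d.]", "", text))


amount = money_to_decimal("$7,300.99")
```

Two side effects of this regex may be worth checking against the real data: a leading minus sign is dropped, so `"-$5.00"` parses as `5.00`, and a value that uses a decimal comma, such as `"7 300,99"`, loses its separator and parses as `730099`.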
7 changes: 7 additions & 0 deletions ckanext/canada/tests/test_dac.py
@@ -6,6 +6,7 @@
 from ckan.tests.factories import Organization

 from ckanext.recombinant.tables import get_chromo
+from ckanext.recombinant.helpers import recombinant_choice_fields

 class TestDAC(FunctionalTestBase):
     def setup(self):
@@ -19,6 +20,12 @@ def setup(self):
     def test_example(self):
         lc = LocalCKAN()
         record = get_chromo('dac')['examples']['record']
+        choices_fields = recombinant_choice_fields('dac')
+        for f in choices_fields:
+            if f['datastore_id'] != 'member_name':
+                continue
+            record['member_name'] = f['choices'][0][0]
+            break
         lc.action.datastore_upsert(
             resource_id=self.resource_id,
             records=[record])
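The test change above replaces the example record's `member_name` with the first entry of the field's controlled list, so the example stays valid whatever names the choices file contains. A minimal sketch of that lookup on plain data follows. The helper `first_choice` and the sample values are hypothetical, and the shape of the data (a list of dicts with `datastore_id` and a `choices` list of value/label pairs) is an assumption inferred from how the test indexes the result of `recombinant_choice_fields`.

```python
def first_choice(choice_fields, datastore_id):
    """Return the first choice value for the named field, or None if it has none."""
    for field in choice_fields:
        if field["datastore_id"] == datastore_id and field["choices"]:
            return field["choices"][0][0]
    return None


# Hypothetical data in the shape the test appears to rely on.
fields = [
    {"datastore_id": "province", "choices": [("AB", "Alberta")]},
    {"datastore_id": "member_name",
     "choices": [("Alexander, Jim", "Alexander, Jim")]},
]
member = first_choice(fields, "member_name")
```

Returning `None` for a missing field makes the absence explicit, whereas the loop in the test silently leaves the example record unchanged in that case.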