update to last Pydelphin #4

Open
arademaker opened this issue Nov 21, 2022 · 3 comments

arademaker commented Nov 21, 2022

This is related to #2

def read_profile(f, args):
    p = itsdb.ItsdbProfile(f)
    mrs_dataspec = args
    inputs = dict((r['i-id'], r['i-input']) for r in p.read_table('item'))
    cur_id, mrss = None, []
    if p.exists('p-result'):
        rows = p.read_table('p-result')
        id_spec = 'i-id'
        mrs_spec = 'mrs'
    else:
        rows = p.join('parse', 'result')
        id_spec = 'parse:i-id'
        mrs_spec = 'result:mrs'
    for row in rows:
        mrs = simplemrs.loads_one(row[mrs_spec])
        if cur_id is None:
            cur_id = row[id_spec]
        if cur_id == row[id_spec]:
            mrss.append(mrs)
        else:
            yield (cur_id, inputs[cur_id], mrss)
            cur_id, mrss = row[id_spec], [mrs]
    if mrss:
        yield (cur_id, inputs[cur_id], mrss)

Hi @goodmami, it looks like the code for read_profile can be replaced by:

def read_profile(f):
    p = itsdb.TestSuite(f)
    cur_id, cur_input, mrss = None, None, []

    for r in tsql.select('i-id i-input mrs', p):
        mrs = simplemrs.decode(r[2])

        if cur_id is None:
            cur_id = r[0]
            cur_input = r[1]

        if cur_id == r[0]:
            mrss.append(mrs)
        else:
            yield (cur_id, cur_input, mrss)
            cur_id, cur_input, mrss = r[0], r[1], [mrs]

    if mrss:
        yield (cur_id, cur_input, mrss)

Does that make sense? I didn't find any reference to the p-result relation in the current version of the ERG gold profiles.
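
Both versions above do the same thing: they collect consecutive rows that share an i-id into one (i-id, i-input, list of MRSs) triple. As a minimal sketch of just that grouping step, here it is with itertools.groupby on plain tuples instead of a real profile. The name group_rows and the sample rows are hypothetical, and the third column is left as an undecoded string where read_profile would call the MRS decoder.

```python
from itertools import groupby
from operator import itemgetter


def group_rows(rows):
    """Yield (i_id, i_input, [mrs, ...]) for each run of consecutive rows
    that share the same i-id (first column)."""
    for i_id, run in groupby(rows, key=itemgetter(0)):
        run = list(run)
        # The input string is the same for every row of an item,
        # so take it from the first row of the run.
        yield (i_id, run[0][1], [row[2] for row in run])


# Hypothetical rows in the column order 'i-id i-input mrs'.
rows = [
    (1, "It rains.", "mrs-1a"),
    (1, "It rains.", "mrs-1b"),
    (2, "Dogs bark.", "mrs-2a"),
]
grouped = list(group_rows(rows))
```

Like the loops above, groupby only merges adjacent rows, so this assumes the rows arrive ordered by i-id.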

@goodmami
Owner

@arademaker thanks, but really this repository is not code that I maintain. It is more a record of what was used for a previous experiment. See also #3 (comment). I should archive the repo to make that clear.

PyDelphin now has a native Penman codec (actually two, one for DMRS and another for EDS). I suggest you use those for the conversion.

@arademaker
Author

arademaker commented Nov 22, 2022

Indeed, I understood that. This repo could be identified as part of https://github.com/shlurbee/dmrs-text-generation-naacl2019 and, as you said, is just code to reproduce the paper. But I would appreciate your advice on which part of the code does what.

I am assuming that, besides reading the profiles and transforming the MRSs into DMRS, the code in this repo also deals with the PENMAN linearization (Figure 2 of the paper, https://aclanthology.org/N19-1235.pdf). Am I right?

But I didn't identify the code that deals with the quotations and Wikipedia markup mentioned in the appendix of the paper. Maybe you were just reporting what you know people did when preparing the profiles that are part of WikiWoods?

@goodmami
Owner

> This repo could be identified as part of https://github.com/shlurbee/dmrs-text-generation-naacl2019

It is specified in setup.sh, line 12. The requirements.txt has PyDelphin at v0.6.2.

> I am assuming that, besides reading the profiles and transforming the MRSs into DMRS, the code in this repo also deals with the PENMAN linearization (Figure 2 of the paper, https://aclanthology.org/N19-1235.pdf). Am I right?

Yes. More recent versions of PyDelphin have support for the conversion to PENMAN, but not in the same way as was done for this experiment.

> But I didn't identify the code that deals with the quotations and Wikipedia markup mentioned in the appendix of the paper.

I think that is here: https://github.com/shlurbee/dmrs-text-generation-naacl2019/blob/master/preprocessing.py
