Olly Butters edited this page Jun 29, 2018 · 19 revisions

Almost everything is based on the data object we build early on, which is then passed from function to function. It is a list of dictionaries; each list item (i.e. each dictionary) contains all the metadata for a single paper.

[
    {
        IDs:
        {
            hash:
            PMID:
            zotero:
            DOI:
            scopus:
        }
        clean:                    <-- Cleaned, gold standard data
        {
            full_author_list:
            [
                {
                    affiliation:
                    [
                        {
                            name
                        }
                    ]
                    given
                    family
                    clean
                }
            ]
            location:
            {
                clean_institute
                latitude
                longitude
                postal_town
                country_code
            }
            clean_date:
            {
                year
                month
                day
            }
            title
            abstract
            citations:
            {
                scopus:
                {
                    count
                    date_downloaded
                }
                PMC:
                {
                    count
                    date_downloaded
                }
            }
            keywords:
            {
                mesh:
                [
                    {
                        term
                        major
                    }
                ]
                other
            }
            journal:
            {
                journal_name
                volume
                issue
            }
        }
        raw:                      <-- All the raw data, everything in clean has come from here.
        {
            zotero {}
            pubmed {}
            doi {}
            scopus {}
        }
    },
    {Next paper},
    {Next paper},
    {etc}
]
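
As a rough sketch of how this structure might be used from Python: the example below builds a one-paper list in the shape above and reads a few fields from it. This page does not show any code, so the value types, the sample values, and the helper names `titles_by_year`, `author_names` and `citation_count` are all assumptions made up for illustration. The helpers use `.get()` throughout on the assumption that any field may be missing for a given paper.

```python
# Hypothetical sample data: every value below is an invented placeholder,
# and the value types (strings vs. ints) are assumptions.
papers = [
    {
        "IDs": {
            "hash": "0123abcd",
            "PMID": "00000000",
            "zotero": "ZKEY0001",
            "DOI": "10.0000/example",
            "scopus": "0-0000000000",
        },
        "clean": {
            "full_author_list": [
                {
                    "affiliation": [{"name": "Example University"}],
                    "given": "Ada",
                    "family": "Example",
                    "clean": "Example A",
                },
            ],
            "clean_date": {"year": "2018", "month": "6", "day": "29"},
            "title": "An example title",
            "citations": {
                "scopus": {"count": 3, "date_downloaded": "2018-06-29"},
            },
            "journal": {"journal_name": "Example Journal", "volume": "1", "issue": "2"},
        },
        # Raw per-source records; left empty in this sketch.
        "raw": {"zotero": {}, "pubmed": {}, "doi": {}, "scopus": {}},
    },
]


def titles_by_year(papers):
    """Group cleaned titles by clean_date year, skipping papers with no year."""
    grouped = {}
    for paper in papers:
        clean = paper.get("clean", {})
        year = clean.get("clean_date", {}).get("year")
        if year is None:
            continue
        grouped.setdefault(year, []).append(clean.get("title"))
    return grouped


def author_names(paper):
    """Return 'given family' strings for every author in the cleaned list."""
    authors = paper.get("clean", {}).get("full_author_list", [])
    return [f"{a.get('given', '')} {a.get('family', '')}".strip() for a in authors]


def citation_count(paper, source="scopus"):
    """Return the citation count for one source, or None if it is missing."""
    citations = paper.get("clean", {}).get("citations", {})
    return citations.get(source, {}).get("count")


print(titles_by_year(papers))
print(author_names(papers[0]))
print(citation_count(papers[0]), citation_count(papers[0], "PMC"))
```

Each helper takes either the whole list or a single paper dictionary, which mirrors the idea of passing the same object from function to function. Reading only from `clean` and leaving `raw` untouched keeps the original source records available for re-cleaning.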