BioGuide Scrape #296

Open
polisciresearcher opened this issue Jun 24, 2015 · 4 comments

Comments

@polisciresearcher

Hello! Has anyone been able to scrape the BioGuide and separate out variables such as place of birth, educational institution, etc.? I have Josh's CSV files, but each biography is one continuous text variable. Separate variables for the biographical details would be wonderful to have. Is this something any of you have tried to extract?

I really appreciate any help you may be able to provide!

@konklone
Member

@polisciresearcher It's not everything, but we do have a script that attempts to walk the bioguide and extract each birthday and all name pieces. It's very imperfect and hacky, and it hasn't been touched in a while.

cc @GullicksonK

@dannguyen
Contributor

Ha, I was just thinking about this and was about to open a new issue if it hadn't already been asked... How about including the bioguide text as a field for each legislator? I know some of the text will be redundant, but for people interested in hacking together their own school/job parser (or just for quick lookup), having the bioguide text would be convenient.
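
For anyone who wants to try the "hack your own parser" route, here is a rough sketch of what that might look like. It assumes a biography is a run of semicolon-separated clauses with a "born in <place>, <date>" clause and "graduated from" / "attended" clauses; that is a simplification, and real entries will break it often. The sample entry is invented for illustration, and `parse_bio`, `BORN_RE`, `SCHOOL_RE`, and the output field names are hypothetical, not anything from this repo.

```python
import re

# Invented entry in the general style of a bioguide biography; not real data.
SAMPLE_BIO = (
    "DOE, Jane, a Representative from Ohio; born in Columbus, Franklin County, "
    "Ohio, March 3, 1950; graduated from Ohio State University, Columbus, Ohio, "
    "1972; lawyer, private practice; elected as a Democrat to the One Hundredth "
    "Congress (January 3, 1987-January 3, 1989)."
)

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Assumption: the birth clause ends with a full "Month D, YYYY" date.
BORN_RE = re.compile(
    r"^born (?:in|near) (?P<place>.+?), (?P<date>(?:%s) \d{1,2}, \d{4})$" % MONTHS
)

# Assumption: the school name is everything up to the first comma.
SCHOOL_RE = re.compile(r"^(?:graduated from|attended) (?P<school>[^,]+)")


def parse_bio(text):
    """Split a biography into clauses and pull out a few fields."""
    clauses = [c.strip() for c in text.rstrip(".").split(";")]
    record = {
        "birth_place": None,
        "birth_date": None,
        "schools": [],
        "bio_text": text,  # keep the raw text alongside the parsed fields
    }
    for clause in clauses:
        born = BORN_RE.match(clause)
        if born:
            record["birth_place"] = born.group("place")
            record["birth_date"] = born.group("date")
            continue
        school = SCHOOL_RE.match(clause)
        if school:
            record["schools"].append(school.group("school"))
    return record


if __name__ == "__main__":
    print(parse_bio(SAMPLE_BIO))
```

Keeping the raw text in the record means fields the regexes miss can still be looked up by hand. Anything beyond these few patterns (partial dates, multiple degrees in one clause, military service) would probably need a real grammar rather than more regexes.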

@JoshData
Member

JoshData commented Aug 2, 2015

I've also just been thinking about this. :)

For the last couple of weeks I've been writing a grammar to do a deep parse of bioguide entries. I should have something to commit later today.

@JoshData
Member

JoshData commented Aug 2, 2015

See #304.
