Data Science Brigade Congress unconference session

Compiled During Session One (1) - @carlvlewis and @VincentLa

Unconference Session Topics

  1. What is a data science working group? How has Code for San Francisco positioned its data science working group, and in general what has worked and what hasn't?
  • Successful data science projects start with good, close relationships with relevant stakeholders. In our case, these tend to be the local government and/or non-profit agencies that we partner with.
  • "Data science and analytics" is only 5% of the work. The other 95% is data engineering, building relationships with stakeholders, good UX/UI design, project management, and many other things.
  • Lessons from SF projects: One thing that has helped us grow beyond doing data science work at hackathons into a more sustainable working group model is focusing on processes and infrastructure. Things that have helped us:
      • Reducing "key-person" risk: having multiple leaders, well-documented repositories, and a well-managed, up-to-date task management system.
      • Focusing on building the relationship with government partners.
      • From an infrastructure/data engineering perspective, putting a lot of effort into ETL processes and storing data in an accessible manner. This could mean a centralized database and/or centralized documentation.
  1. How are we interacting more with government? How can we convince government of the value of data science?
  • It's a long-term relationship, and part of it is continuously interacting with government officials and staff to keep building it. One example discussed was a project with United Way that created data visualizations and models that were given, pro bono, to NGOs. Don't underestimate the power of getting coffee or sending an email from time to time. Remember, government officials and staff are people too, and they care about things. Anything we can do to make their lives easier, so that they don't view us as a burden, is great.
  • How many of us live in cities with a "Chief Data Officer"? The challenges of a small city vs a big city can be very different.
  • There should be a focus on educating government about what data science can do. We should understand that there's a huge amount of risk aversion in government processes. Be an advocate and give them tangible examples about how it can improve outcomes.
  • We should encourage ourselves to take on projects in a more sustainable manner. If we organize as non-profit organizations, we shouldn't be afraid to step up, respond to RFPs, and deliver great products. Maybe use hack nights as an opportunity to fill out RFPs!
  1. Where can data science go wrong and how do we guard against it?
  • Black box models are dangerous. Especially when data science is used for inference and touches people's lives, be careful about which model you choose, and be sure you can explain the factors that are important in your model.
  • Documentation is important. Any good project should be able to explain what it is doing.
  • Check out "Weapons of Math Destruction," which talks about some of the pitfalls of using big data in the wrong way (https://www.amazon.com/Weapons-Math-Destruction-Increases-Inequality/dp/0553418815).
  1. Other topics that came up but that we only briefly touched on or never got to:
  • What do we do with new members who are interested in data science when there aren't enough projects to go around? Relatedly, how do we manage a team with many different skill levels?
  • How to engage other people who aren't data scientists but can very much add value to a data science project?
  • How to get more data out of government without making expensive FOIA requests?
  • Are there opportunities for a data science "learning" group, to upskill new members?
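
The ETL advice under the first topic above (put effort into ETL processes and store data in an accessible, centralized place) can be sketched roughly as follows. This is a minimal illustration, not how any brigade actually runs its pipeline: the CSV contents, the `service_requests` table, and the column names are all hypothetical, and an in-memory SQLite database stands in for whatever centralized database a group actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this might come from a city open-data portal.
RAW_CSV = """request_id,category,opened,neighborhood
1,Pothole,2017-10-01,Mission
2,graffiti ,2017-10-02,Mission
3,Pothole,2017-10-02,
4,Streetlight,2017-10-03,Sunset
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize category labels and drop rows with no neighborhood."""
    cleaned = []
    for row in rows:
        neighborhood = row["neighborhood"].strip()
        if not neighborhood:
            continue  # skip incomplete records rather than guessing a value
        cleaned.append(
            (
                int(row["request_id"]),
                row["category"].strip().title(),
                row["opened"],
                neighborhood,
            )
        )
    return cleaned


def load(conn, records):
    """Load: write cleaned records into one shared, documented table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS service_requests ("
        "request_id INTEGER PRIMARY KEY, "
        "category TEXT NOT NULL, "
        "opened TEXT NOT NULL, "
        "neighborhood TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps the load step safe to re-run on the same data.
    conn.executemany(
        "INSERT OR REPLACE INTO service_requests VALUES (?, ?, ?, ?)", records
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

counts = conn.execute(
    "SELECT category, COUNT(*) FROM service_requests "
    "GROUP BY category ORDER BY category"
).fetchall()
print(counts)
```

Keeping extract, transform, and load as separate, documented functions is one way to reduce the "key-person" risk mentioned above: a new volunteer can re-run or modify a single step without needing the original author.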

Data science for public good

  • 'Data science' not traditionally used for civic purposes

Data democratization

  • Not just for internal analytics
  • Not just for government efficiency
  • To empower the public

Visualization as data science

  • Exploratory Data Analysis (EDA)
  • Visualization as a tool for data science insights
  • Visualization as end-product for user (government or citizen)

Data storytelling (numbers + narrative)

  • Collaborations with news organizations
  • Combining human stories with data analysis

_____ Data Project

  • Building a scalable model for cities for training

Beyond accountability

  • Data won't be opened without policy, so framing the ask as 'transparency' can create an antagonistic relationship with municipalities.

Intersectionality between data journalism and civic-tech and data science communities

Machine/algorithmic bias

  • ProPublica series on algorithmic bias

DATA SCIENCE

  • Is it statistics?
  • Is it EDA?
  • Is it B.I.?
  • Is it machine-learning?
  • Is it visualization?
  • Is it insight?

MY ANSWER: It is all of the above

Graphicacy

  • The ability to build

Visual literacy

  • The ability to read charts and obtain visual insights, and to spot charts that mislead.

Data literacy

  • Spreadsheet use
  • Mean, median, mode
  • P-Value, Z-Value, Regression

Algorithmic bias

Procurement with data science/analysis/visualization.

  • Data Scientist for LA --