Giving Voice to Digital Democracies
Latest revision as of 09:40, 30 October 2020
GVDD is a project team based at CRASSH
Brief written, waiting to confirm client: Deliberative Social Media
Potential second project: Marcus Tomalin <mt126@cam.ac.uk> suggested:
How about one of the following?
- Developing NLP-based Visualisation Tools for Data Statements -- i.e., taking the notion of data statements (e.g., Bender and Friedman 2018) as a starting point, develop a suite of NLP-based tools that would enable biases in language-based corpora to be displayed visually
- Developing Interactive Data Statements -- i.e., taking the notion of data statements (e.g., Bender and Friedman 2018) as a starting point, develop an interactive version of a data statement that enables the person using the data to ask and receive answers to (a constrained set of) questions about the data
These are both ideas that we have discussed within the GVDD group, but we haven't focused on these specific tasks yet (mainly because we were unable to hire a coder over the summer).
Both these projects could be constrained in ways that made them approachable for students, but they could also become as complex as the students wished.
Feedback:
I have already been discussing the broad area of dataset bias with a research fellow at Microsoft Research Cambridge, who is looking at global cultural and economic bias in the training of machine vision systems. Are you familiar with “model cards”, which appear similar in their intention to “data statements” as advocated in the Bender and Friedman paper? An application of the model card approach, in response to the recent “white Obama” scandal, is described here:
https://thegradient.pub/pulse-lessons/
I think there are a number of potential approaches to anticipating, illustrating and correcting dataset bias, but this is a fairly active research area, and I suspect that the specific domain of application for the dataset may produce considerable differences in the most appropriate design responses. Is there a particular area (with publicly available datasets) that you think might be appropriate for computer science undergraduates to work on?