We often hear the notion of ethics in AI invoked in research on topics such as autonomous cars and ‘the trolley problem’. One example is MIT’s Moral Machine, a platform for gathering a human perspective on moral decisions made by machine intelligence. But the reality is far from abstract: data intelligence is already used to make decisions that affect many people in their daily lives.
Wired recently reported on how doctors make life-changing decisions about patient care based on algorithms that interpret test results or weigh risks, such as whether to perform a particular procedure. Some of those formulas factor in a person’s race, meaning patients’ skin colour can affect their access to care. The article covered a new study examining a widely used formula for estimating kidney function that, by design, assigns black people healthier scores. The research found that one-third of black patients, more than 700 people, would have been placed in a more severe category of kidney disease had their kidney function been estimated with the same formula used for white patients.
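The formula in question is the 2009 CKD-EPI creatinine equation, whose published form multiplies the result by 1.159 when the patient is recorded as black. A rough sketch (using the published coefficients, for illustration only, not clinical use) shows how that single coefficient can move a patient across a disease-stage threshold:

```python
# Illustrative sketch of the 2009 CKD-EPI creatinine equation.
# Coefficients are from the published equation; this is for
# illustration only, not for any clinical use.

def egfr_ckd_epi_2009(creatinine, age, female, black):
    """Estimated GFR in mL/min per 1.73 m^2 body surface area."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141
            * min(creatinine / kappa, 1.0) ** alpha
            * max(creatinine / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159   # the race coefficient under discussion
    return egfr

# Same patient, same lab result -- the race term alone raises the
# score by ~16%, here shifting the estimate across the eGFR = 60
# boundary between disease stages.
same_labs = dict(creatinine=1.4, age=60, female=False)
print(egfr_ckd_epi_2009(black=False, **same_labs))  # ~54
print(egfr_ckd_epi_2009(black=True, **same_labs))   # ~63
```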
We see more and more stories in the news about machine learning algorithms causing real-world harm. People’s lives are affected by the decisions made by machines. Human trust in technology is based on our understanding of how it works and our assessment of its safety and reliability. To trust a decision made by a machine, we need to know that it is reliable, fair and accountable, and that it will cause no harm.
Margriet Groenendijk is a Data & AI Developer Advocate for IBM. She develops and presents talks and workshops about data science and AI, and is active in local developer communities through attending, presenting at and organising meetups. She has a background in climate science: during her PhD she explored large observational datasets of carbon uptake by forests, and as a postdoctoral fellow she worked with global-scale weather and climate models. I connected with her before her presentation at Codemotion’s online conference, the Spanish edition, to find out more.
What is AI fairness?
According to Margriet, AI fairness is “a bit of an umbrella term or toolbox you can use to make sure that AI is actually fair. This means that data is not biased, and it relies upon you knowing the source of your data and who created it. How can you explain your models? Ultimately, it goes beyond tech to bigger social issues.”
She notes that the problem extends back to legacy data, such as that used in Natural Language Processing (NLP), “some of which was only collected by two people.”
Who is responsible for ethics in AI in a company?
As many companies are reshaping their staffing configurations, merging and dividing teams, I wondered where the responsibility for the integrity and ethics of data lies in an organisation. Is it with the data scientists, or the company as a whole?
According to Margriet:
“Ultimately everybody’s responsible. Whatever step you’re working on, you need to know where your data came from and who worked on it before you. If you’re running a model, you need to know where it came from and how it has been tested, and be able to explain that to other people. It’s really everyone involved in the whole data and model pipeline.”
What is the role of IBM in data fairness?
IBM is a founding member of the LF AI & Data Foundation, an umbrella foundation of the Linux Foundation that supports open source innovation in artificial intelligence, machine learning, deep learning, and data. Its goal is to build a sustainable open source AI ecosystem that makes it easy to create AI and data products and services using open source technologies.
The Foundation facilitates an extensive range of toolkits, frameworks and platforms that endeavour to bring trust and integrity into data science. This is particularly interesting to Margriet, who asks, “How do you build new tools to actually be more aware of data and model bias before you start building them? These tools are important to help you build fair models, making sure you become more aware of what could go wrong.”
What should a developer and data scientist do if they find bias in AI?
Margriet asserts that you should always try to go back to the data source if you have access. “Try to find out if and why there is bias. In many cases, it is just that there’s structural bias, because unfortunately, that’s how society works. But be aware that your model might amplify this bias.”
Ethical tools and resources
There are several tools out there to help developers and software designers identify biases in their AI-based applications. One of them is AI Fairness 360, an extensible open source toolkit that can help you examine, report, and mitigate discrimination and bias in machine learning models throughout the AI application lifecycle.
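To give a flavour of what such a toolkit measures, here is a minimal plain-Python sketch of one metric AI Fairness 360 reports: disparate impact, the ratio of favourable-outcome rates between an unprivileged and a privileged group. Values well below 1.0 (a common rule of thumb is below 0.8) flag potential bias. The record fields and group labels below are invented for illustration.

```python
# Disparate impact: ratio of favourable-outcome rates between groups.
# This is a hand-rolled sketch of the metric, not the AIF360 API.

def favorable_rate(records, group):
    """Fraction of a group that received the favourable outcome."""
    members = [r for r in records if r["group"] == group]
    return sum(r["approved"] for r in members) / len(members)

def disparate_impact(records, privileged, unprivileged):
    return (favorable_rate(records, unprivileged)
            / favorable_rate(records, privileged))

# Toy loan-approval data: 3/4 approvals in group "a" vs 2/4 in group "b".
records = [
    {"group": "a", "approved": 1}, {"group": "a", "approved": 1},
    {"group": "a", "approved": 1}, {"group": "a", "approved": 0},
    {"group": "b", "approved": 1}, {"group": "b", "approved": 1},
    {"group": "b", "approved": 0}, {"group": "b", "approved": 0},
]
print(disparate_impact(records, privileged="a", unprivileged="b"))
# 0.5 / 0.75 ≈ 0.67, below the 0.8 rule of thumb
```

AI Fairness 360 computes this and many related metrics directly from labelled datasets, and also ships mitigation algorithms to rebalance them.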
The Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to evaluate, defend, certify and verify machine learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference.
The Not Equal Network aims to foster new collaborations in order to create the conditions for digital technology to support social justice, and develop a path for inclusive digital innovation and a fairer future for all in and through the digital economy.
The Institute for Ethical AI & Machine Learning has created the AI-RFX Procurement Framework, a set of templates to empower industry practitioners to raise the bar for AI safety, quality and performance.