The impact of AI systems on society is increasing daily, so it is important for organisations and institutions to develop fair and unbiased AI systems. In this post, we discuss the different types of bias and their impact on society.
In our previous article, we explained what AI Shepherds consider the main ethical values: mindful impact, fairness, and transparency (explainability).
In this post, we discuss AI fairness in more depth. Fairness has been a central focus of organisations, governments, and civil society for quite some time now. With AI-based recommendations increasingly shaping our daily lives, it should be mandatory for recommendation systems to be fair and unbiased. The fairness of an AI system is a complex issue that can be affected both by the data gathered to train the solution and by the mathematical model the system fits to that data. It is closely connected with the explainability, interpretability, and transparency of how the models were implemented in the system.
Before discussing fairness in AI any further, we should define what fairness means. In general terms, unfairness is the disparate treatment of certain groups of people, and its impact on them, because of certain attributes and preconceived notions; fairness is the absence of such treatment. For example, certain car models are sometimes assumed to be driven by a certain gender. When bias enters the decision-making process, the final conclusion tends to be unfair. Like humans, machines exhibit bias when making decisions. In an AI system, bias is rooted in the training data: since AI systems learn from hidden patterns in that data, bias present in the data can strongly affect the end results. The bias involved is of two main types:
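To make "disparate treatment" concrete, here is a minimal sketch of one common way fairness is quantified: the demographic parity difference, i.e. the gap in positive-decision rates between two groups. The loan decisions and group labels below are entirely hypothetical, and demographic parity is only one of several possible fairness metrics.

```python
def positive_rate(decisions, groups, target_group):
    """Fraction of people in `target_group` who received a positive decision."""
    outcomes = [d for d, g in zip(decisions, groups) if g == target_group]
    return sum(outcomes) / len(outcomes)

def demographic_parity_diff(decisions, groups, group_a, group_b):
    """Gap in positive-decision rates between two groups (0 means parity)."""
    return positive_rate(decisions, groups, group_a) - positive_rate(decisions, groups, group_b)

# Toy loan decisions (1 = approved) for two hypothetical groups.
decisions = [1, 1, 0, 1, 0, 1, 0, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]

# Group A is approved 75% of the time, group B only 25%: a 0.5 gap.
print(demographic_parity_diff(decisions, groups, "A", "B"))  # 0.5
```

A gap of zero would mean both groups receive positive decisions at the same rate; a large gap is a warning sign of the disparate impact described above.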
1.) Data Bias: Data bias exists for a couple of reasons. One major cause is historical bias, which comes into play when a system is built on a long-held belief about some trait. For example, consider a person who has only ever driven a left-hand-drive car. In most countries the driver's seat is on the left side of the car, so it becomes ingrained in the mind that the driver's seat will always be on the left. The driver therefore assumes they can drive perfectly anywhere in the world. But in South Asia and some Anglo-Saxon countries, the driver's seat is on the right, so a person biased toward left-hand-drive cars will struggle to manoeuvre a right-hand-drive one.
The other major cause of data bias is how the data set is sampled. For example, most of the research and development of autonomous cars has taken place in the United States, so the data gathered during development reflects American geography, demographics, culture, road markings, speed limits, and so on. If the same data is used to build a model for an autonomous vehicle in another part of the world, many road signs will not be recognised by the model. Furthermore, the car's assumed speed limits would be wrong as well, leading to dangerous scenarios for the driver and passengers.
Along with historical bias and sampling bias, there is a third important cause of data bias, one with even stronger ethical implications: confirmation bias, where data is collected to match a desired end result. For example, if one wanted to show that drivers of a particular race are better than others, one could sample the data in favour of that race and thereby force the desired conclusion.
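The selective sampling described above can be demonstrated with a toy simulation. This is a sketch under invented assumptions: two hypothetical groups, "A" and "B", have exactly the same true rate of safe drivers, but a biased collector silently discards half of the unsafe records from group A, making group A look safer in the resulting data set.

```python
import random

random.seed(0)

# Hypothetical population: groups A and B have the SAME true rate of
# "safe drivers" (70%). Group labels and rates are illustrative only.
population = [
    {"group": g, "safe": random.random() < 0.7}
    for g in ("A", "B")
    for _ in range(10_000)
]

def safe_rate(sample, group):
    """Observed fraction of safe drivers within one group of a sample."""
    members = [p for p in sample if p["group"] == group]
    return sum(p["safe"] for p in members) / len(members)

# Unbiased random sample: both groups look roughly equally safe.
fair_sample = random.sample(population, 2_000)

# Confirmation-biased collection: keep every record from group B,
# but drop roughly half of the UNSAFE records from group A.
biased_sample = [
    p for p in fair_sample
    if p["safe"] or p["group"] == "B" or random.random() < 0.5
]

print(f"fair   A={safe_rate(fair_sample, 'A'):.2f}  B={safe_rate(fair_sample, 'B'):.2f}")
print(f"biased A={safe_rate(biased_sample, 'A'):.2f}  B={safe_rate(biased_sample, 'B'):.2f}")
```

Any model trained on the biased sample would "learn" that group A drives more safely, even though the underlying population shows no difference at all.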
2.) Algorithmic Bias: The other type of bias is algorithmic bias, which is created when a correlation between parameters in the data is mistaken for the cause of an event. There are instances when two or more parameters are associated with each other without being causally related; due to the presence of some confounding factor, the system treats the correlation as causal and trains the model accordingly.
For example, imagine an assembly line in an automotive company. Increasing the number of people on the line increases the number of cars assembled, so there is a direct correlation between the number of people and the number of cars. But if the race or gender of the employees is included as a parameter, the system can correlate the gender or race of the people on the line with the number of cars assembled. In such a scenario, gender or race could be treated as the reason for a low or high number of cars assembled, which is highly unfair.
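The assembly-line scenario can be sketched as a small simulation. All numbers here are invented: output depends only on the number of workers, but in the logged data one hypothetical group, "X", happens to staff the larger shifts, so a naive per-group comparison "discovers" a group effect that vanishes once the true cause is held fixed.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical shift log: cars assembled depend ONLY on how many people
# are on the line, but group "X" happens to staff the larger shifts.
shifts = []
for _ in range(500):
    group = random.choice(["X", "Y"])
    workers = random.randint(8, 12) if group == "X" else random.randint(4, 8)
    cars = 2 * workers + random.randint(-2, 2)  # true cause: workers only
    shifts.append({"group": group, "workers": workers, "cars": cars})

# A naive per-group average "finds" that group X assembles more cars ...
by_group = {g: mean(s["cars"] for s in shifts if s["group"] == g) for g in ("X", "Y")}
print("raw averages:", by_group)

# ... but comparing shifts of the SAME size removes the gap entirely.
same_size = [s for s in shifts if s["workers"] == 8]
by_group_8 = {g: mean(s["cars"] for s in same_size if s["group"] == g) for g in ("X", "Y")}
print("averages at 8 workers:", by_group_8)
```

A model trained on the raw log could latch onto the group label as predictive, even though shift size is the only real driver of output; this is exactly the spurious causal story the paragraph warns against.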
The definition of fairness in AI changes from industry to industry, and one should keep in mind that fairness in AI is a complicated, multi-level problem. From gathering data to building a model, various kinds of bias can make the model unfair, and a badly developed model can in turn have an unfair impact on society, creating new problems. Hence, building fair AI models is necessary: it helps a company grow, solves many problems faced by society, and paves the way for further use of AI in day-to-day life.