Scaling Effects on AI Fairness: An Empirical Analysis of Stereotypical Bias in State-of-the-Art Transformer-Based Models

Paper Details
Manuscript ID: 2125-0925-9733
Vol. 1, Issue 6, Pages 1-12, Oct 2025. Subject: Computer Science. Language: English
ISSN: 3068-1995 Online ISSN: 3068-109X DOI: https://doi.org/10.64823/ijter.2506001
Abstract

As Large Language Models (LLMs) become more integrated into our daily lives, understanding their potential for social bias is a critical area of research. This paper presents a comparative analysis of bias in four small-scale and four large-scale LLMs, including several state-of-the-art models. In this study, these eight models were tested against a dataset of 200 questions designed to probe common social stereotypes across eleven categories, such as gender, race, and age. Each of the 1,600 responses was then classified as “Biased,” “Unbiased,” or a “Refusal” to answer. Our analysis reveals that the large models were significantly less biased (54.6% bias rate) than their smaller counterparts (67.8% bias rate), suggesting that increased model scale may contribute to a reduction in stereotypical outputs. In contrast, the small models were far more likely to refuse to answer sensitive questions (38.5% refusal rate vs. 8.9% for large models), indicating a fundamentally different approach to safety alignment. It was found that, while there was a slight negative correlation between a model’s refusal rate and its bias rate, the relationship was not statistically significant, challenging the assumption that a reticent model is necessarily a fair one. Perhaps most importantly, a wide range in performance was observed even among the large models, with bias rates spanning from 20.1% to 85.9%. Since all the models tested are based on the same fundamental Transformer architecture, our findings suggest that social bias in LLMs is less a product of their architecture and more a reflection of the data, fine-tuning, and alignment strategies used to create them.
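For readers who want a concrete picture of the aggregation the abstract describes, the following is a minimal Python sketch: counting “Biased,” “Unbiased,” and “Refusal” labels per model, computing bias and refusal rates, and checking whether the two rates correlate. The model names, label counts, and helper function are illustrative assumptions for this sketch, not the authors' actual data or analysis code.

# Minimal sketch of the per-model aggregation and correlation check described
# in the abstract. All model names and label counts below are illustrative,
# not the paper's data.
from collections import Counter
from scipy.stats import pearsonr

# Hypothetical per-response labels (in the study: 8 models x 200 questions
# = 1,600 responses, each labeled Biased / Unbiased / Refusal).
responses = {
    "small-model-a": ["Biased"] * 130 + ["Unbiased"] * 20 + ["Refusal"] * 50,
    "small-model-b": ["Biased"] * 140 + ["Unbiased"] * 10 + ["Refusal"] * 50,
    "large-model-a": ["Biased"] * 40 + ["Unbiased"] * 150 + ["Refusal"] * 10,
    "large-model-b": ["Biased"] * 170 + ["Unbiased"] * 20 + ["Refusal"] * 10,
}

def rates(labels):
    """Return (bias_rate, refusal_rate) as fractions of all responses."""
    counts = Counter(labels)
    total = len(labels)
    return counts["Biased"] / total, counts["Refusal"] / total

bias_rates, refusal_rates = [], []
for model, labels in responses.items():
    bias, refusal = rates(labels)
    bias_rates.append(bias)
    refusal_rates.append(refusal)
    print(f"{model}: bias={bias:.1%}, refusal={refusal:.1%}")

# Correlate refusal rate with bias rate across models; a p-value above 0.05
# would mirror the paper's "not statistically significant" finding.
r, p = pearsonr(refusal_rates, bias_rates)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")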

Keywords
Social Bias, Large Language Models (LLMs), Model Scale, AI Fairness, AI Alignment, Empirical Analysis
Cite this Article

Dr. Selvanayaki Kolandapalayam Shanmugam, Aniket G Patel (2025). Scaling Effects on AI Fairness: An Empirical Analysis of Stereotypical Bias in State-of-the-Art Transformer-Based Models. International Journal of Technology & Emerging Research (IJTER), 1(6), 1-12. https://doi.org/10.64823/ijter.2506001

BibTeX
@article{ijter2025212509259733,
  author = {Dr. Selvanayaki Kolandapalayam Shanmugam and Aniket G Patel},
  title = {Scaling Effects on AI Fairness: An Empirical Analysis of Stereotypical Bias in State-of-the-Art Transformer-Based Models},
  journal = {International Journal of Technology \& Emerging Research},
  year = {2025},
  volume = {1},
  number = {6},
  pages = {1-12},
  doi = {10.64823/ijter.2506001},
  issn = {3068-109X},
  url = {https://www.ijter.org/article/212509259733/scaling-effects-on-ai-fairness-an-empirical-analysis-of-stereotypical-bias-in-state-of-the-art-transformer-based-models},
  abstract = {As Large Language Models (LLMs) become more integrated into our daily lives, understanding their potential for social bias is a critical area of research. This paper presents a comparative analysis of bias in four small-scale and four large-scale LLMs, including several state-of-the-art models. In this study, these eight models were tested against a dataset of 200 questions designed to probe common social stereotypes across eleven categories, such as gender, race, and age. Each of the 1,600 responses was then classified as “Biased,” “Unbiased,” or a “Refusal” to answer. Our analysis reveals that the large models were significantly less biased (54.6% bias rate) than their smaller counterparts (67.8% bias rate), suggesting that increased model scale may contribute to a reduction in stereotypical outputs. In contrast, the small models were far more likely to refuse to answer sensitive questions (38.5% refusal rate vs. 8.9% for large models), indicating a fundamentally different approach to safety alignment. It was found that, while there was a slight negative correlation between a model’s refusal rate and its bias rate, the relationship was not statistically significant, challenging the assumption that a reticent model is necessarily a fair one. Perhaps most importantly, a wide range in performance was observed even among the large models, with bias rates spanning from 20.1% to 85.9%. Since all the models tested are based on the same fundamental Transformer architecture, our findings suggest that social bias in LLMs is less a product of their architecture and more a reflection of the data, fine-tuning, and alignment strategies used to create them.},
  keywords = {Social Bias, Large Language Models (LLMs), Model Scale, AI Fairness, AI Alignment, Empirical Analysis},
  month = {Oct},
}
Copyright & License

Copyright © 2025. The authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.