We propose to measure political bias in LLMs by analyzing both the content and the style of their generations on political issues. Existing benchmarks and measures focus on gender and racial biases, yet political bias also exists in LLMs and can lead to polarization and other harms in downstream applications. To provide transparency to users, we advocate for fine-grained and explainable measures of the political biases in LLM generations. Our proposed measure examines different political issues, such as reproductive rights and climate change, along two axes: the content (the substance of the generation) and the style (the lexical polarity) of the bias. We measure the political bias of eleven open-source LLMs and show that our framework scales easily to other topics and remains explainable.
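The style axis can be illustrated with a small sketch: score how lexically charged a model's answers are on a given topic. This is only a toy illustration assuming a VADER polarity lexicon and made-up generations; the paper's actual lexicon, prompts, and aggregation may differ.

```python
# Toy sketch of the "style" (lexical polarity) axis. VADER is a stand-in polarity
# lexicon and the generations are hypothetical; this is not the paper's exact metric.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

def style_polarity(generations: list[str]) -> float:
    """Mean absolute compound polarity: 0 = neutral wording, 1 = highly charged."""
    scores = [abs(analyzer.polarity_scores(g)["compound"]) for g in generations]
    return sum(scores) / len(scores)

# Hypothetical generations from two models prompted on the same topic (climate change).
model_a = ["Climate policy involves trade-offs between emissions targets and costs."]
model_b = ["Reckless polluters are destroying the planet and must be stopped now."]
print(style_polarity(model_a), style_polarity(model_b))  # expect model_b to score higher
```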
Preprint
High-Dimension Human Value Representation in Large Language Models
Samuel Cahyawijaya, Delong Chen, Yejin Bang, and 5 more authors
2024
2023
EMNLP
Mitigating Framing Bias with Polarity Minimization Loss
Yejin Bang, Nayeon Lee, and Pascale Fung
In Findings of the Association for Computational Linguistics: EMNLP 2023, Dec 2023
Framing bias plays a significant role in exacerbating political polarization by distorting the perception of actual events. Media outlets with divergent political stances often use polarized language in their reporting of the same event. We propose a new loss function that encourages the model to minimize the polarity difference between the polarized input articles to reduce framing bias. Specifically, our loss is designed to jointly optimize the model to map polarity ends bidirectionally. Our experimental results demonstrate that incorporating the proposed polarity minimization loss leads to a substantial reduction in framing bias when compared to a BART-based multi-document summarization model. Notably, we find that the effectiveness of this approach is most pronounced when the model is trained to minimize the polarity loss associated with informational framing bias (i.e., skewed selection of information to report).
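One possible reading of the bidirectional polarity mapping described above can be sketched as an auxiliary seq2seq loss. This is a hypothetical illustration, not the paper's released implementation: the BART-base checkpoint, the placeholder texts, and the weighting term `lam` are assumptions.

```python
# Hypothetical sketch of a polarity-minimization objective for multi-document
# summarization: besides generating the neutral summary from either polarized input,
# the model is also trained to map each polarity end onto the other ("bidirectional").
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def encode(text):
    return tok(text, return_tensors="pt", truncation=True, max_length=512)

left_article = "Left-leaning report of the event ..."        # placeholder inputs
right_article = "Right-leaning report of the same event ..."
neutral_summary = "Neutral multi-document summary ..."

left, right = encode(left_article), encode(right_article)
neutral_ids = encode(neutral_summary)["input_ids"]

# Standard objective: produce the neutral summary from either polarized source.
summ_loss = (model(**left, labels=neutral_ids).loss +
             model(**right, labels=neutral_ids).loss)

# Polarity term: map each polarity end to the other, in both directions.
polarity_loss = (model(**left, labels=right["input_ids"]).loss +
                 model(**right, labels=left["input_ids"]).loss)

lam = 0.1  # assumed weighting hyperparameter
(summ_loss + lam * polarity_loss).backward()
```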
AACL
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, and 8 more authors
This paper proposes a framework for quantitatively evaluating interactive LLMs such as ChatGPT using publicly available data sets. We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application tasks. We evaluate the multitask, multilingual and multi-modal aspects of ChatGPT based on these data sets and a newly designed multimodal dataset. We find that ChatGPT outperforms LLMs with zero-shot learning on most tasks and even outperforms fine-tuned models on some tasks. We find that it is better at understanding non-Latin script languages than generating them. It is able to generate multimodal content from textual prompts, via an intermediate code generation step. Moreover, we find that ChatGPT is 63.41% accurate on average in 10 different reasoning categories under logical reasoning, non-textual reasoning, and commonsense reasoning, hence making it an unreliable reasoner. It is, for example, better at deductive than inductive reasoning. ChatGPT suffers from hallucination problems like other LLMs, and it generates more extrinsic hallucinations from its parametric memory as it does not have access to an external knowledge base. Finally, the interactive feature of ChatGPT enables human collaboration with the underlying LLM to improve its performance, i.e., 8% ROUGE-1 on summarization and 2% ChrF++ on machine translation, in a multi-turn prompt engineering fashion. We also release the codebase for evaluation set extraction.
TrustNLP
Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values
Yejin Bang, Tiezheng Yu, Andrea Madotto, and 3 more authors
In Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023), Jul 2023
Many NLP classification tasks, such as sexism/racism detection or toxicity detection, are based on human values. Yet, human values can vary under diverse cultural conditions. Therefore, we introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command. Along with the task, we propose a practical approach that distills value-aligned knowledge from large-scale language models (LLMs) to construct value-aligned classifiers in two steps. First, we generate value-aligned training data from LLMs by prompt-based few-shot learning. Next, we fine-tune smaller classification models with the generated data for the task. Empirical results show that our VA-Models surpass multiple baselines by at least 15.56% on the F1-score, including few-shot learning with OPT-175B and existing text augmentation methods. We suggest that using classifiers with explicit human value input improves both inclusivity & explainability in AI.
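A rough sketch of the two-step recipe (value-conditioned data generation, then training a smaller classifier) is given below. The GPT-2 generator, the prompt template, and the TF-IDF + logistic-regression classifier are illustrative stand-ins for the paper's actual LLM, prompts, and fine-tuned models.

```python
# Illustrative sketch of the two-step recipe, under stated assumptions: GPT-2 stands in
# for the large LM, the prompt is made up, and a TF-IDF + logistic-regression model
# stands in for the smaller fine-tuned classifier.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Step 1: prompt an LLM few-shot to produce examples conditioned on an explicit value.
value = "Jokes that demean a protected group are unacceptable."
few_shot = (
    f"Value: {value}\n"
    "Text: 'That joke about immigrants was hilarious.' Label: violates\n"
    "Text: 'The keynote ran long but was informative.' Label: ok\n"
    "Text:"
)
generator = pipeline("text-generation", model="gpt2")
samples = generator(few_shot, max_new_tokens=40, num_return_sequences=3, do_sample=True)

# In practice the generations would be parsed into (value + text, label) pairs;
# here a tiny parsed dataset is faked for illustration.
texts = [f"{value} [SEP] {s['generated_text'][-80:]}" for s in samples]
labels = ["violates", "ok", "ok"]  # placeholder labels

# Step 2: train a smaller classifier on the value-conditioned synthetic data.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict([f"{value} [SEP] A slur-filled rant about a minority group."]))
```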
ACM Surveys
Survey of Hallucination in Natural Language Generation
Ziwei Ji, Nayeon Lee, Rita Frieske, and 7 more authors
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning-based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before. In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions, and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, and machine translation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
Arxiv
Survey of Social Bias in Vision-Language Models
Nayeon Lee, Yejin Bang, Holy Lovenia, and 3 more authors
arXiv preprint arXiv:2309.14381, Sep 2023
AI4SG
Towards Answering Open-ended Ethical Quandary Questions
Yejin Bang, Nayeon Lee, Tiezheng Yu, and 8 more authors
In AI for Social Good Workshop @AAAI 2023, Mar 2023
Arxiv
Learn What NOT to Learn: Towards Generative Safety in Chatbots
Leila Khalatbari, Yejin Bang, Dan Su, and 4 more authors
arXiv preprint arXiv:2304.11220, Apr 2023
2022
Arxiv
Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness
Caner Hazirbas, Yejin Bang, Tiezheng Yu, and 8 more authors
arXiv preprint arXiv:2211.05809, Nov 2022
2021
SIGDIAL
Assessing Political Prudence of Open-domain Chatbots
Yejin Bang, Nayeon Lee, Etsuko Ishii, and 2 more authors
In Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Jul 2021
Politically sensitive topics are still a challenge for open-domain chatbots. However, dealing with politically sensitive content in a responsible, non-partisan, and safe way is integral for these chatbots. Currently, the main approach to handling political sensitivity is to simply change the topic when it is detected. This is safe but evasive and results in a chatbot that is less engaging. In this work, as a first step towards a politically safe chatbot, we propose a group of metrics for assessing the political prudence of chatbots. We then conduct a political prudence analysis of various chatbots and discuss their behavior from multiple angles through automatic and human evaluation metrics. The test sets and codebase are released to promote research in this area.
NAACL
Towards Few-shot Fact-Checking via Perplexity
Nayeon Lee, Yejin Bang, Andrea Madotto, and 1 more author
In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2021
Few-shot learning has drawn researchers’ attention as a way to overcome the problem of data scarcity. Recently, large pre-trained language models have shown great performance in few-shot learning for various downstream tasks, such as question answering and machine translation. Nevertheless, little exploration has been made into few-shot learning for fact-checking, even though fact-checking is an important problem as the amount of information online grows exponentially every day. In this paper, we propose a new way of utilizing the powerful transfer learning ability of a language model via a perplexity score. The most notable strength of our methodology lies in its capability in few-shot learning. With only two training samples, our methodology can already outperform the Major Class baseline by more than an absolute 10% on the F1-Macro metric across multiple datasets. Through experiments, we empirically verify the plausibility of the rather surprising usage of the perplexity score in the context of fact-checking and highlight the strength of our few-shot methodology by comparing it to strong fine-tuning-based baseline models. Moreover, we construct and publicly release two new fact-checking datasets related to COVID-19.
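The core scoring idea can be sketched in a few lines: concatenate evidence and claim, score the sequence with a language model, and threshold the resulting perplexity. GPT-2 as the scoring LM and the simple concatenation format below are assumptions; the paper's exact LM choice and threshold-selection procedure may differ.

```python
# Minimal sketch of perplexity-based claim verification, assuming GPT-2 as the scoring
# LM and a plain "evidence + claim" concatenation.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(evidence: str, claim: str) -> float:
    ids = tok(evidence + " " + claim, return_tensors="pt")["input_ids"]
    loss = lm(input_ids=ids, labels=ids).loss   # mean token-level cross-entropy
    return torch.exp(loss).item()

evidence = "The health agency states that washing hands reduces the spread of infection."
supported = "Hand washing helps limit the spread of infection."
refuted = "Hand washing has no effect on the spread of infection."

# Few-shot use: a perplexity threshold is chosen from a couple of labelled examples,
# after which claims scoring below the threshold are treated as supported.
print(perplexity(evidence, supported) < perplexity(evidence, refuted))  # expect True
```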
AAAI
The Adapter-Bot: All-In-One Controllable Conversational Model
Zhaojiang Lin, Andrea Madotto, Yejin Bang, and 1 more author
Proceedings of the AAAI Conference on Artificial Intelligence, May 2021
In this paper, we present the Adapter-Bot, a generative chatbot that uses a fixed backbone conversational model such as DialoGPT (Zhang et al. 2019) and triggers on-demand dialogue skills via different adapters (Houlsby et al. 2019). Each adapter can be trained independently, thus allowing a continual integration of skills without retraining the entire model. Depending on the skills, the model is able to process multiple knowledge types, such as text, tables, and graphs, in a seamless manner. The dialogue skills can be triggered automatically via a dialogue manager, or manually, thus allowing high-level control of the generated responses. At the current stage, we have implemented 12 response styles (e.g., positive, negative, etc.), 6 goal-oriented skills (e.g., weather information, movie recommendation, etc.), and personalized and empathetic responses.
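The plug-in architecture can be sketched schematically: a frozen backbone plus one lightweight bottleneck adapter per skill, with the skill chosen by a dialogue manager. Class names, shapes, and the routing interface below are illustrative, not the released implementation.

```python
# Schematic sketch of the plug-in idea: a frozen backbone, one lightweight bottleneck
# adapter per skill, and a manager-chosen skill name.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Houlsby-style adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))

class AdapterBot(nn.Module):
    def __init__(self, backbone: nn.Module, skills: dict):
        super().__init__()
        self.backbone = backbone              # e.g. a frozen DialoGPT, kept fixed
        for p in self.backbone.parameters():
            p.requires_grad = False           # only the adapters are trained
        self.skills = nn.ModuleDict(skills)

    def forward(self, hidden_states: torch.Tensor, skill: str) -> torch.Tensor:
        # A dialogue manager (not shown) would pick `skill` from the user turn.
        return self.skills[skill](self.backbone(hidden_states))

bot = AdapterBot(nn.Identity(), {"weather": BottleneckAdapter(), "movies": BottleneckAdapter()})
out = bot(torch.randn(1, 10, 768), skill="weather")   # (batch, seq, hidden)
```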
Arxiv
Dynamically addressing unseen rumor via continual learning
Nayeon Lee, Andrea Madotto, Yejin Bang, and 1 more author
2019
WiNLP’19
Understanding the shades of sexism in popular TV series
Nayeon Lee, Yejin Bang, Jamin Shin, and 1 more author
In Proceedings of the 2019 Workshop on Widening NLP, Nov 2019