We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Computation and Language

Title: Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity

Abstract: Large Language Models (LLMs) have recently demonstrated impressive capability in generating fluent text. LLMs have also shown an alarming tendency to reproduce social biases, for example stereotypical associations between gender and occupation or race and criminal behavior. Like race and gender, morality is an important social variable; our moral biases affect how we receive other people and their arguments. I anticipate that the apparent moral capabilities of LLMs will play an important role in their effects on the human social environment. This work investigates whether LLMs reproduce the moral biases associated with political groups, a capability I refer to as moral mimicry. I explore this hypothesis in GPT-3, a 175B-parameter language model based on the Transformer architecture, using tools from Moral Foundations Theory to measure the moral content in text generated by the model following prompting with liberal and conservative political identities. The results demonstrate that large language models are indeed moral mimics; when prompted with a political identity, GPT-3 generates text reflecting the corresponding moral biases. Moral mimicry could contribute to fostering understanding between social groups via moral reframing. Worryingly, it could also reinforce polarized views, exacerbating existing social challenges. I hope that this work encourages further investigation of the moral mimicry capability, including how to leverage it for social good and minimize its risks.
Comments: 26 pgs incl. references, 11 figures
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2209.12106 [cs.CL]
  (or arXiv:2209.12106v1 [cs.CL] for this version)

Submission history

From: Gabriel Simmons [view email]
[v1] Sat, 24 Sep 2022 23:55:53 GMT (2809kb,D)

Link back to: arXiv, form interface, contact.