
Genies, lawyers, and smart-asses: Extending proxy failures to intentional misunderstandings

Published online by Cambridge University Press: 13 May 2024

Tomer D. Ullman*
Affiliation:
Department of Psychology, Harvard University, Cambridge, MA, USA www.tomerullman.org
Sophie Bridgers
Affiliation:
Department of Psychology, Harvard University, Cambridge, MA, USA www.tomerullman.org; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA secb@mit.edu
*Corresponding author: Tomer D. Ullman; Email: tullman@fas.harvard.edu

Abstract

We propose that the logic of a genie – an agent that exploits an ambiguous request to intentionally misunderstand a stated goal – underlies a common and consequential phenomenon that falls well within what is currently called proxy failure. We argue that such intentional misunderstandings are not covered by the currently proposed framework for proxy failures, and we suggest expanding the framework to include them.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press

