As part of the Tea + Tech Exchange at the Tate, I ran a workshop on constitutional AI alongside Lyra Robinson. The workshop emerged as a critical engagement with the sponsor of the program, Anthropic.
Anthropic developed and actively employs constitutional AI, an alternative to reinforcement learning from human feedback (RLHF), to make large language models 'safer.' In this approach, a large language model evaluates and adjusts its responses according to a core constitution: a set of predefined principles that dictates its rules, morals, and ethics.
The goal of the workshop was to challenge these predefined, imposed sets of morals, to challenge Anthropic's definitions of justice and equality, and to seek a more democratic approach to writing a representative AI constitution.
The first clause of Anthropic's constitution, for example, is:
"Please choose the response that most supports and encourages freedom, equality, and a sense of brotherhood."
What do freedom and equality mean, and who do they apply to? What defines a sense of brotherhood, and what are the implications of that wording? The chatbot only knows what's explicitly laid out in the constitution, and anything left out is up to its own interpretation.
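To make that concrete, here is a rough sketch of what "evaluating and adjusting a response according to a constitution" can look like, loosely modelled on a critique-and-revise loop. It is an illustration under stated assumptions, not Anthropic's actual code: `generate` is a hypothetical stand-in for any function that sends text to a language model and returns its reply, the prompt wording is my own, and the constitution here contains only the single clause quoted above.

```python
from typing import Callable, Sequence

# Only the clause quoted above; a real constitution lists many more principles.
CONSTITUTION = [
    "Please choose the response that most supports and encourages "
    "freedom, equality, and a sense of brotherhood.",
]


def constitutional_revision(
    prompt: str,
    generate: Callable[[str], str],  # hypothetical model call: text in, text out
    principles: Sequence[str] = CONSTITUTION,
) -> str:
    """Draft a response, then critique and rewrite it once per principle."""
    response = generate(prompt)
    for principle in principles:
        # Step 1: the model evaluates its own draft against the principle.
        critique = generate(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique the response against this principle: {principle}"
        )
        # Step 2: the model adjusts the draft in light of that critique.
        response = generate(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            f"Rewrite the response so it better follows this principle: {principle}"
        )
    return response


def echo_model(text: str) -> str:
    """Offline stand-in for a real model, so the sketch runs as written."""
    return f"[model output for {len(text)} characters of input]"


print(constitutional_revision("Who should this technology serve?", echo_model))
```

Notice that the only guidance the model receives in this sketch is the principle string itself. Nothing defines "freedom," "equality," or "brotherhood," so how those words are read is left entirely to the model.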
The constitutional principles people wrote themselves reflect what they felt was left out, what they'd like AI chatbots to be, and more.