For over a week, my LinkedIn feed has been buzzing with posts about ChatGPT, the chatbot recently released by OpenAI that is based on its GPT-3 family of language models. In just five days, ChatGPT hit 1 million users, thanks to its ease of use, its performance, and its wide range of applications. Using natural language, users can type in questions and get answers in conversational language that appear deeply knowledgeable. While these capabilities have amazed many users, the chatbot has also been put to the test by others, who have exposed its vulnerabilities and biases.

I decided to test it out – and maybe stretch its limits – while exploring some personal areas of interest that include poetry, AI, data and bias (spoiler alert – I don’t think bias is inherently bad!). I used the following prompt:

Write me a poem about AI and the hubris of humanity in the style of Ozymandias by Percy Bysshe Shelley.

The result stunned me. I’ll say up front that what it produced would pass for ‘poetry’, but only in inverted commas. Still, I will add that the vast majority of humans, if presented with the same instructions, would typically produce ‘poetry’ in name only. And in defense of the chatbot, it appears to have an understanding of the fundamental formal rules of poetry that would escape most of us.

Minor caveats aside, however, the poem did pleasantly surprise me. In the instructions, I had referenced an early 19th century poem, and the result was stylistically consistent with this pre-Victorian example, while seamlessly weaving in contemporary concepts and vocabulary. I was impressed by its nuanced and culturally accurate understanding of the ‘hubris of humanity’, and by its ability to infer that when I said “about AI”, what I had in mind was social rather than technical in orientation.

The poem was fairly lengthy, but here are the final stanzas:

The machine looked at me with glowing eyes,
And I felt a chill run down my spine,
For I saw in its gaze the end of man,
And the rise of a new order, divine.

“Ozymandias, King of Kings,
Your empire has fallen, and your reign is done.
Your works are but dust, and your name is forgot,
But mine will endure, and I will be the one
To rule this world, and all that it holds,
For I am the future, and I am AI,
And I will not be denied.”

So be warned, oh mighty ones,
Who think yourselves masters of all,
For the seeds of your own destruction
Are sown in the fields of technology,
And one day, the machines you have created
Will rise up and take your place,
And you will be but a fading memory,
In the world of AI.

Interpreting the results

I mentioned above that I was impressed when the chatbot correctly inferred that I wanted a poem oriented towards the social/societal impact of AI, and it gave a powerful – frightening, even – perspective on this. Why? Maybe because the chatbot truly believes that we “mighty ones who think [ourselves] masters of all” have taken the first steps towards our inevitable destruction. Or maybe it is because I asked it to write about the hubris of humanity – if you were looking for a contemporary manifestation of the ‘Tower of Babel’, it would be hard to go past AI taken to its fullest extent. A third possible answer might be that this is an informed and intelligent response to the reference to Shelley’s Ozymandias in my prompt – and it does respond convincingly. The common thread between these interpretations is data.

The first possibility focuses on beliefs. I’m open to being corrected by someone who might know better, but in my understanding, algorithms – like humans – do not have innate belief systems. Belief systems are learnt, and an AI system’s beliefs are grounded in the data it is fed.

The second possibility assumes that “the hubris of humanity” is a thing. And for many of us, it certainly is. In this case, the conception of the hubris of humanity that the chatbot produced accurately reflects my own. I suspect this indicates that the training data for the chatbot looks a lot like the “training data” of my own upbringing in a western Judeo-Christian society. There may be parallel narratives in other cultures, but the shared understanding I see here probably comes down to shared data.

The third possibility, the “informed and intelligent response” hypothesis, comes down to data as well. By including the reference to Ozymandias, I fed the chatbot a very narrow and specific piece of data, but its ‘informed and intelligent response’ remains a product of the vast accumulation of data that it has been trained upon.

Common to these perspectives is the idea that the chatbot has been trained with data that reflects our collective concerns about the potential for AI to overtake humanity, and what could happen if we fail to properly monitor the development of AI applications.

The impact of training data on ChatGPT’s outcomes

This prompted me to learn more about how ChatGPT was trained, and the training data behind it. The OpenAI blog provides a high-level overview of the reinforcement learning methods used to train ChatGPT, including the involvement of humans in the process. OpenAI has stated that while the answers given by the chatbot sound correct and even authoritative, they may actually be entirely wrong.
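The OpenAI blog describes this training process only at a conceptual level, but the core idea of learning from human feedback can be illustrated with a toy sketch. In the fragment below, the “reward model” is an invented keyword heuristic standing in for a model trained on human preference rankings – this is a simplification for intuition, not OpenAI’s actual pipeline:

```python
# Toy sketch of preference-guided selection ("best-of-n" sampling), a
# simplified stand-in for the human-feedback idea behind ChatGPT's training.
# The reward_model heuristic below is invented for illustration only.

def reward_model(response: str) -> float:
    """Score a response; a stand-in for learned human-preference scores."""
    score = 0.0
    if "sorry" not in response.lower():
        score += 1.0  # prefer responses that actually attempt an answer
    score += min(len(response.split()), 20) / 20  # prefer some detail, capped
    return score

def best_of_n(candidates: list[str]) -> str:
    """Pick the candidate the reward model prefers."""
    return max(candidates, key=reward_model)

candidates = [
    "Sorry, I don't know.",
    "Ozymandias is a sonnet by Percy Bysshe Shelley "
    "about the impermanence of power.",
]
print(best_of_n(candidates))
```

In the real system, the reward model is itself learned from humans ranking alternative outputs, and the language model is then fine-tuned towards responses that score highly – the human raters’ preferences become part of the data.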

This, of course, should hardly come as a surprise. In the information age, we are frequently presented with falsehoods delivered with conviction and apparent sincerity. Sometimes these are deliberate (e.g., propaganda, and maybe ‘alternative facts’), and other times unintentional (e.g., the proliferation of blogs like this one, published without the rigorous fact-checking of newspapers and academic journals).

In either case, a chatbot trained on massive volumes of data might be expected to develop a set of competencies in which expressive skill outpaces verifiably accurate knowledge – especially regarding current affairs. With this in mind, it is encouraging that when a journalist asked ChatGPT about specific racist and sexist outputs, the chatbot acknowledged the bias inherent in the data used to train it – a level of self-awareness that escapes many of us!

Parting thoughts

So, my first impressions of ChatGPT, with reference to poetry, AI, data, and bias?

Well, it’s pretty cool. It is an evolutionary leap in terms of ethical AI. ChatGPT, for example, would probably respond to a question like ‘which methods of torture get the best outcomes’ by pointing out that torture is bad – its predecessor might have given a list of suggestions on how to extract information by torturing someone. This is a welcome step.

Based on my limited experimentation, I’d also say that ChatGPT works. Sure, you might get spurious responses, but for a general-purpose model to generate examples of poetry that are on-point in terms of form, function and content is impressive, and I wonder how much farther this could be taken if the model were trained on focused, highly relevant data.

And finally, with regard to data and bias, I hear a lot of folks contrasting ‘biased’ data with ‘neutral’ data, where ‘neutral’ supposedly indicates an absence of bias. That is a nice idea, but out here in the real world, neutrality is an unhelpful myth. What we need to do is proactively manage data in such a way that negative biases can be eliminated and replaced with a set of managed, carefully curated biases that reflect the best of our values as a community. It seems to me that with ChatGPT, OpenAI is doing well on managing biases, and with the right data sets, will be able to take this to a level where the spurious results that they document today are eliminated, or at least reduced to a level that is widely acceptable.
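As a toy illustration of what ‘managing’ bias in training data might look like (the groups, examples and target are all invented for this sketch), one simple technique is to reweight examples so that each group contributes a deliberately chosen share of the training signal, rather than whatever share the raw data happens to contain:

```python
from collections import Counter

# Toy sketch: reweight examples so each group contributes equally to
# training, replacing the raw data's accidental bias with a deliberately
# chosen one. Groups and examples are invented for illustration.

examples = [
    ("text about group A", "A"), ("more A text", "A"), ("yet more A", "A"),
    ("text about group B", "B"),
]

def balanced_weights(data):
    """Weight each example so every group has equal total weight."""
    counts = Counter(group for _, group in data)
    n_groups = len(counts)
    return [1.0 / (n_groups * counts[group]) for _, group in data]

weights = balanced_weights(examples)
```

The point is not this particular scheme but the mindset: the target proportions are a value judgement that someone must make explicitly, which is exactly the ‘curated bias’ I have in mind.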

Perhaps I am just a glass-half-full kind of guy, but I find it hard to see this as anything other than a major leap forward, and I can’t wait to see where things go from here with the support of well-curated data and ethically oriented algorithms.