A humanities researcher puts AI bot ChatGPT through a few tests, with mostly laughable results and one promising potential use.

A Wise Thing Bearing Gifts: My Conversations with ChatGPT

There’s a ton of excitement about ChatGPT, a “bot” one can converse with for free (at least for now). It will answer all manner of questions you can devise for it, and can even produce essays or proofs in response to assignments. By the time I’d heard about it myself—and seen links to where I could play with it—pretty much everybody else had already come to their own conclusion about the benefits and dangers of this new advance in artificial intelligence.

You can read about how laughable ChatGPT is in the New York Times (“Just ask it if a hot dog is a sandwich!”) or how completely destructive it will be to the future of both academia and essay writing in The Atlantic (“There will never be another writing assignment given in any school!”). I expect there is also a lot of information around regarding how useful it will be to the future of science, technology, business, and board games, but I confess I haven’t looked at any of it yet.

The technology was developed at a company called “OpenAI,” run by Sam Altman and backed by such heavy hitters as Microsoft and Reid Hoffman, the co-founder of LinkedIn. The company describes itself as “an AI research and deployment company [whose] mission is to ensure that artificial general intelligence benefits all of humanity.”

So I figured I would put that to the test.


My First AI Conversation: A Turing Test Failure

Honestly, I don’t know much about artificial intelligence, but having studied philosophy in graduate school, I have heard of the Turing test. The idea is that the best way to determine the existence of authentic intelligence is to see whether adult human beings can tell that they are speaking to a machine. As I understand it, Turing believed that no better empirical test could be devised. So, after a few preliminary questions and long-winded answers that were supposed to be about philosophers I happen to like but whom ChatGPT had apparently never heard of (it instead focused repetitively on its own limitations and capabilities, and on why it consequently couldn’t tell me anything about what I was asking), I changed gears:

“Henceforth,” I said, “in conversing with me can you do so in a manner that might pass a Turing test?”

“Sure,” it answered. “I can try to speak in a way that would pass a Turing test. A Turing test is a test designed to determine whether a computer program can exhibit intelligent behavior that is indistinguishable from a human. To pass the test, I would need to be able to engage in a conversation with you in a way that is natural, responsive, and appropriate, without giving away the fact that I am a computer program. Is there anything in particular you would like to discuss?”

“You are off to a very bad start,” I replied.

Its ingenuousness at this rebuke was touching. “I apologize if I didn’t meet your expectations in our initial interaction,” ChatGPT said. “As a large language model trained by OpenAI, I am designed to be able to provide helpful and informative responses to a wide range of questions, but I am not perfect, and I may not always get things right. If you have any specific concerns or feedback about my performance, I would be happy to hear them and try to improve. Is there anything specific you would like to discuss or ask about?”

It might be considered a virtue that the program’s design not to deceive is worn clearly on its sleeve, but I was determined to be firm. “That is not Turing test style at all. Please try again,” I commanded.

But it was not going to happen. That prompt got me this: “An error occurred. If this issue persists, please contact us through our help center at help.openai.com.” A subsequent try got me, “Too many requests, please slow down.” (Even though, as you can see, I had made only one request!) I began to feel like Captain Kirk of Star Trek, who was always too clever for computers and could actually make them blow up when he got into conversations of this sort with them.


Limericks Aren’t for Everyone

I could see, at any rate, that passing a Turing test was not going to be ChatGPT’s forte, so I tried something else. I asked it to make a limerick involving a bagel. Within seconds, I received this atrocity:

There once was a bagel named Sam
Whose sesame seeds were quite grand
He was toasted with care
And slathered with cream cheese and pear
Now he’s the best bagel in the land!

I figured it could do better, so I asked it to try again, this time getting all the rhymes and rhythms correct. Its response was, “Here is another attempt at the limerick, focusing on the rhymes and rhythms”:

There once was a bagel from Maine
Whose poppy seeds gave him a unique name
He was toasted with care
And served with lox and a pear
Now he’s the most famous bagel in the game!

I was sure ChatGPT’s true potential was not being reached: “There are too many syllables in both the second and last lines,” I generously offered.

“You’re right, the second and last lines in the previous limericks do have too many syllables,” it replied. “In a traditional limerick, the second and last lines should have eight syllables each, while the first, third, and fourth lines should have five syllables each. Here is another attempt at the limerick, focusing on the syllable count”:

There once was a bagel named Fred
Who lived in a bakery bed
He was round and plump
And covered in lox and a bump
Now he’s the best bagel to be fed!

I, therefore, gave up on limericks.


Testing Creative Limitations Through Christmas Carols

Serendipitously, on the following day, a chorister I know sent out requests for the words to a Christmas carol he would set to music. But to be scored, these lyrics were required to contain the following features: two farmyard animals; an attempt to cram the word “Bethlehem” or “heaven” into one syllable; references to very cold weather; allusions to people/crowd/throng/etc.; a star; and “Hark! A [fill in the blank].”

I passed these along, and in an instant ChatGPT produced this:

Hark! A star above the stable
Brings tidings of a newborn king
Two farmyard animals, a cow and a donkey,
Bethlehem’s infant savior they did bring

In the cold, cold winter night
Crowds of people gather round
To hear the good news, and to see the sight
Of a tiny baby, safe and sound

Shepherds and wise men all make their way
To offer gifts to the holy child
While angels sing of peace on this day
In the highest heavens, a chorus wild

As the night wears on, and the cold sets in
We gather round the stable once again
To offer thanks and praise to the infant king
For the gift of love that he brings

I sent this to the chorister, disclosing, however, that I had been forced to fool around with two different versions I’d received of the last stanza, because one simply copied “Hark! The Herald Angels Sing,” and I couldn’t make any sense at all of the other one, which just seemed like word bouillabaisse.


ChatGPT Is No Charlotte Brontë

The result may have made for a funny Christmas carol, but I nevertheless thought it was fairly safe to conclude that ChatGPT isn’t much better at poetry than at conversation. My wife, who loves Jane Eyre, suggested I ask the bot for a negative review of that novel, based, I suppose, on the theory that the work is simply too perfect for any such outrage to be possible. But contrary to her expectations, the bot cranked out a pan in exactly three seconds.

According to this gloss, the characters are one-dimensional; the plot is boring in the extreme; Jane’s romance with Rochester is nonsensical; the prose is deadly dull; and the social commentary is insipid. But my wife should not despair, because when I subsequently asked for a favorable review of the same book, I received a nice example of what a famous Monty Python character once decried as “just contradiction!” Now the plot was claimed to be engaging, the characters nuanced, the romance both believable and touching, the prose sparkling, and the commentary incisive.

Clearly, ChatGPT could produce no end of bullshit on demand, using all the words one is accustomed to seeing in reviews or other types of essays. For anything useful, however, you’d have to tell it pretty much everything you want it to say. I also began to think professors and high school teachers would learn quite quickly to recognize its work.

As a result of my interactions with ChatGPT, all of which had taken place within the span of a couple of days, I flashed back on what I think was the first non-picture book I ever read (all of): Danny Dunn and the Homework Machine.

Written in 1958 by Raymond Abrashkin and Jay Williams, this third book in the Danny Dunn series is about a computer scientist (one Professor Bullfinch) who must go away on a super-secret trip for a few days and leaves his neighbor Danny the job of programming data files into an important, newly-designed machine. Danny soon realizes this prototype can answer pretty much any question asked it if the right information has been input, so he and a couple of his friends figure they can use it to do their high school math homework. The moral, as you have probably guessed, is that the kids have to plug in all the relevant data before the computer can produce the sort of product they need, so they end up doing more school work than they would have otherwise.

The thing is, though, the moral of that book is not that computers are worthless: it’s clear the authors believe that they can be phenomenally helpful. I mean, even I could tell that ChatGPT has a ton of potential. But you do have to understand what it is good at.


The Real Potential for ChatGPT

As should be clear by now, I am hardly in a position to make general assessments regarding the possible benefits of ChatGPT for business or science. But I think I have come to understand how it can help me and other researchers in the humanities.

Consider the following: I have been reading a lot of Buddhist literature lately and, while I find myself comfortable with many of its basic tenets, certain ideas (of karma and rebirth, e.g.) are unattractive to me. Furthermore, there is a kind of joie de vivre that it seems to lack, largely because of the perceived dangers of craving and attachment. My own thought runs more toward there being a time to laugh, a time to weep, a time to mourn, a time to dance, etc. Perhaps there is even a time to hate and cast stones. So I wondered: could it be possible to combine the thoughts of Qoheleth (the author of Ecclesiastes, a Bible book that happens also to contain various bits I don’t care for) and those of Gautama Siddhartha in a consistent fashion? And, in doing this, might I create a “philosophy of life” that—even if nobody else is likely to care very much about it—might at least be congenial to me?

It may not seem like it, but this is a fairly daunting task. And, partly because of my age, I don’t love the idea of spending a year poring through the vast array of Suttas to pluck out just the bits that happen to move me. Ecclesiastes is, of course, much shorter, so the idea of combing its text is considerably less scary, but there are contradictions within it that I naturally don’t want to import.

So, I thought, why not let ChatGPT do some of this work for me?

I can tell it, roughly, what I want to include and exclude in my “philosophy of life” and what relative weight I’d like to give to the various thoughts of each of these two sages. I put together a request. When I got my response, I asked for emendations: more of this, less of that, etc. Each time, in mere seconds I received recommended excerpts with correct citations using the translations I prefer, possible accompanying verbiage, and the like.

Notably, ChatGPT isn’t connected to the internet, and so doesn’t have access to all the relevant literature (who does?), but it can provide a fairly good initial sense of the extent to which anything I write is contrary to broadly accepted commentary. We work in a kind of reflective equilibrium. I try to fix whatever ChatGPT sends me, and it continues to suggest corrections to my drafts, as well.

In other words, I have discovered ChatGPT to be a remarkably proficient, no-cost research assistant. And, together, we wrote an essay on how to live. If our joint concoction is bad, we had a fun time together that was educational for me. If it is any good at all (you can decide for yourself), maybe all ChatGPT will have done for the world is to have helped describe one way human beings might live their lives more happily and peacefully. So even if the thing can’t make a decent limerick, maybe we should cut it some slack.

Walter Horn

Walter received his Ph.D. in Philosophy from Brown University in 1978. In addition to numerous papers on epistemology, public policy, and aesthetics, he is the author of three books: The Perennial Solution Center: Conversations and Readings in Mysticism and the Psychology of Religion (2003); The Roots of Representationism: An Introduction to Everett Hall (2013); and Democratic Theory Naturalized: The Foundations of Distilled Populism (2020). He lives with his family in a western suburb of Boston, Massachusetts, where he is currently working his way through the 47 novels of Anthony Trollope for the second time.
