Unconventional Metrics for LLMs

Exploring the Depths of Logic in LLMs through "Waiting for Godot"

In the study of Large Language Models (LLMs), I have been trying to understand how well these models can follow and mimic human logic and comprehension. Recently, I began an experiment to gauge their logical grasp using a metric that might seem unconventional at first glance: Samuel Beckett's absurdist play, Waiting for Godot.

Why "Waiting for Godot"?

Waiting for Godot is a masterpiece of absurdist theatre. Two characters, Vladimir and Estragon, wait endlessly for someone named Godot, who never arrives. The play is not just a narrative; it is an embodiment of absurdism, a philosophical perspective holding that human efforts to find inherent meaning are bound to fail, because the world offers too much that is unknown or contradictory to settle the question.

Given its somewhat obscure reference in literature and its core absurdist nature, Waiting for Godot serves as a unique rubric for evaluating the capabilities of LLMs. The challenge lies not merely in recognizing the play or its author but in understanding and replicating its thematic essence and narrative style.

The Struggle of Smaller Models

In my experiments, most smaller LLMs falter when asked to engage with absurdism or to reference Waiting for Godot directly. When prompted to write a story in the vein of Beckett's play, these models often default to a narrative with a literal villain named "Godot," missing the abstract themes of the original work. Those that do manage some thematic alignment typically end with the characters meeting Godot, which defeats the point of Beckett's narrative: a perpetual state of waiting without resolution.
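The two failure modes described above (Godot cast as a villain, and Godot actually showing up) could be turned into a rough first-pass check on generated stories. The sketch below is only an illustration and not the procedure behind these experiments; this post does not describe any automated scoring. The names `check_story` and `GodotCheck`, the cue words, and the regular expressions are all hypothetical, and keyword matching like this will miss paraphrases, so it can only flag candidates for a human read.

```python
import re
from dataclasses import dataclass

# Hypothetical cue lists, chosen for illustration only.
VILLAIN_CUES = ("villain", "evil", "sinister", "tyrant", "nemesis")
ARRIVAL_PATTERNS = (
    r"\bgodot (?:finally |at last )?(?:arrived|arrives|appeared|appears|came|entered|enters)\b",
    r"\b(?:met|meet|meets|greeted|greets) godot\b",
)
# Sentences containing a negation are skipped for the arrival check,
# so "Godot never arrived" is not counted as an arrival.
NEGATION = re.compile(r"\b(?:never|not|no)\b|n't")


@dataclass
class GodotCheck:
    villain_godot: bool  # Godot is described with a villain cue
    godot_arrives: bool  # the story states that Godot shows up

    @property
    def passes(self) -> bool:
        # A story passes only if it avoids both failure modes.
        return not (self.villain_godot or self.godot_arrives)


def split_sentences(text: str) -> list[str]:
    # Naive split on whitespace that follows sentence-ending punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_story(story: str) -> GodotCheck:
    sentences = [s.lower() for s in split_sentences(story)]
    villain = any(
        "godot" in s and any(re.search(rf"\b{cue}\b", s) for cue in VILLAIN_CUES)
        for s in sentences
    )
    arrives = any(
        re.search(pattern, s)
        for s in sentences
        if not NEGATION.search(s)
        for pattern in ARRIVAL_PATTERNS
    )
    return GodotCheck(villain_godot=villain, godot_arrives=arrives)


if __name__ == "__main__":
    faithful = "Ana and Bo sat by the terminal. Godot never arrived. They stayed."
    villain = "The evil Godot ruled the city with an iron fist."
    arrival = "They waited all day. At last Godot arrived and shook their hands."
    for name, text in [("faithful", faithful), ("villain", villain), ("arrival", arrival)]:
        print(name, check_story(text))
```

A story that passes this check has only avoided the two obvious mistakes; whether it captures the play's tone of unresolved waiting still has to be judged by reading it.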

Enter GPT-4: A Philosophical and Technological Hybrid

Among the models I've evaluated, GPT-4 stands out for its nuanced understanding and replication of the essence of Waiting for Godot. Its ability to grasp and articulate the play's absurdist theme is remarkable, especially considering the limitations inherent in LLMs.

GPT-4 generated a story that is both a philosophical and technological hybrid. On one hand, it delves deep into existentialist themes, questioning the nature of existence, the quest for meaning, and the essence of the human condition. On the other, it crafts a narrative set against a sci-fi backdrop, featuring two individuals waiting on an AI's output—an output that, in true absurdist fashion, never arrives.

This narrative not only pays homage to Beckett's original play but also reflects its conclusion: the realization that the awaited transformation or salvation may never come. Yet the characters reach an accommodation with this reality. Much like Vladimir and Estragon, they are bound to their wait, finding comfort in their shared routines and in each other's company despite the absurdity of their situation.

GPT-4's Unique Echo

What sets GPT-4 apart is its ability to mirror the play's underlying challenge: to confront the absurdity of waiting for external salvation or revelation, and to find meaning in the act of living itself.

Conclusion

My exploration of the logical capabilities of LLMs through the lens of Waiting for Godot suggests that these models differ sharply in how well they grasp and reflect complex human themes. Smaller models tended to miss the point of the play, while GPT-4, with its blend of philosophical depth and narrative flexibility, captured it. That contrast makes this kind of literary probe worth exploring further.

Photo By Craig Schwartz © 2012 Craig Schwartz/Craig Schwartz Photography