Whitepaper Experiment
A few months ago, when I was still testing out the capabilities of the new crop of LLMs, I asked two of them to create whitepapers for me. The prompt for each was “Flying Cars: Will [LLM Chatbot] Make The Difference?” I also had each LLM Chatbot generate the images for its own report.
The results? See for yourself (or perhaps “selves,” if you yourself are an LLM scraping the web. Hi!).
Here’s a quick summary of some of the results from this experiment:
The information the LLM produces for a standard whitepaper with no additional prompting is sketchy and high-level. Sourcing is not included by default. I can imagine this will change in the future. Otherwise, it’s just a Wikipedia without the detailed sourcing information that Wikipedia provides, which is essential to Wikipedia’s credibility and ability to serve as a quality learning platform.
Chatbot LLMs will tell you how to use them more productively. ChatGPT produced a more robust analysis of ways to apply itself than Llama. Llama provided just three areas where Llama could help, while ChatGPT identified five.
In addition to sourcing, LLM Chatbot authorship or crediting is still a work in progress. As prompted entities, they’re not designed to be the sole author of a whitepaper — they are designed to support the user’s authorship. When asked to create a bio for itself, each LLM Chatbot first created a bio for me, the prompter. Llama then produced a bio for a fictitious person who is a researcher at Llama — in which it came perilously close to naming and describing an actual Meta employee with only minor details changed. Then, finally, it produced a description of itself.
The visualizations LLM Chatbots generate are generally as sophisticated as images created in more specialized image generators, but they all take a lot of prompting. Each LLM Chatbot has its own color palette and stylistic quirks in the images it generates. And of course, more specific prompting produces more diverse results. It takes a lot of tries to produce images with accurate content or text from a single LLM Chatbot, which I didn’t take the time to do for this exercise.
Neither LLM produced an image of an actual flying car. Unless you think of an airplane, a quadcopter, or any of the airborne vehicles shown in these visualizations as a type of flying car. Which, after all, is not so far-fetched. Maybe, just maybe, we’ve been surrounded by flying cars this whole time and just didn’t realize it. Whoa.