There’s no denying that generative AI is becoming more capable. So capable, in fact, that many organizations are thinking it can replace their in-house teams when it comes to developing articles, brochures and other communications material. But how realistic are those expectations, given that there’s no real intelligence under the hood of any GenAI model?
We tested two of the most prominent GenAI chatbots, OpenAI’s ChatGPT and Google Gemini, to see if they can actually produce compelling content — and replace the judgment and expertise of human writers.
One advantage of writing a longer article is that you can trim it down and repurpose the content for other channels, like social media. So how well can GenAI handle this relatively straightforward copyediting task?
To find out, we uploaded into Gemini and ChatGPT a 1,500-word article on how a wireless networking technology could be used by a specific industry, then asked the chatbots to condense it to just 500 to 600 words. We purposely gave minimal instructions beyond the target length, wanting to see whether the chatbots could match a professional writer's judgment about what to keep and what to discard from the original draft. Here's what we found:
Winner: ChatGPT for getting closer to the mark in a single attempt. However, it still simply cut down the article while keeping the same subsections. Because a chatbot generates text probabilistically rather than with any kind of intelligence, it isn't capable of adapting or reinventing the article's structure to better accommodate the much shorter word count (as our team would do if given the same exercise), resulting in some ideas that weren't fully fleshed out.
We conduct a lot of interviews with subject-matter experts to get the information and messaging we need to write our content. Can a chatbot take our interview notes, isolate the most relevant details (separating the core conversation from the unnecessary sidebars) and then generate something that matches the style of an example article?
To test this, we uploaded our notes from an interview with a researcher as well as an article to use as a model. We then told the chatbots to generate a 300- to 350-word article explaining what the researcher is doing, why it matters, what’s innovative about the work and the impact it might have. Here are the results:
Winner: Gemini for more accurately presenting the information from our notes, even if its structure sounded less natural than ChatGPT's due to its strict adherence to our prompt. That said, both chatbots were missing the kind of storytelling a human writer can bring. For a more engaging read, our team would weave the elements of the prompt throughout the article instead of dealing with them in such a prescribed order.
Writing a blog post is one thing. What about copy that will go into design and layout, like a brochure promoting a service or solution? Can a chatbot replicate something with a pre-defined template and established sections for presenting features, benefits and so on?
We uploaded into both chatbots an example of how we present copy in Word for a two-page PDF brochure. We then uploaded multiple background documents, including a PowerPoint deck, to see if they could not only find the appropriate information but format it correctly as well.
Winner: ChatGPT for doing a better job replicating both the format and style of the sample. However, what was produced would not be considered client-ready, as it would still require a fair bit of polishing and editing to match the client’s tone of voice.
Finally, we wanted to get a sense of how well GenAI could copyedit existing content while adhering to the specific rules outlined in a client’s style guide.
To do so, we provided both Gemini and ChatGPT with a Word document containing draft content for a webpage, as well as a PDF version of the style guide. We instructed the chatbots to edit the content to make it easier to read for a general audience — more user-friendly, less technical — but without rewriting or restructuring the content in a significant way. Our prompt also specified which pages of the style guide were most important, reminding the chatbots to pay close attention to rules related to capitalization, punctuation, treatment of numbers and so on.
Winner: ChatGPT, but not for the quality of the edit, as the two chatbots were about equal in that regard (and still well behind the precision and consistency our editors bring to the task). Unlike Gemini, which only works within its chat window, ChatGPT can output its edits as a Word document, which is more useful for our needs. Unfortunately, it can't actually use Word's Track Changes functionality, instead indicating deletions and insertions through strikethroughs and bolding. That ultimately makes the output not very usable on our end, as we'd have to re-apply the edits manually, with the changes tracked, to produce a client-ready document.
Overall, ChatGPT seems to be better suited for generating marketing content than Google Gemini, although the gap between the two is not large.
So do we believe GenAI chatbots can replace human writers? No. That’s because while GenAI can simulate human intelligence remarkably well, it is just a simulation. In truth, GenAI models don’t do any “thinking” at all. Instead, they draw on enormous datasets to determine the most likely response to any given prompt, like your phone’s autocomplete functionality taken to the next level.
Ultimately, that means GenAI is incapable of producing anything truly new, making it a non-starter for thought leadership. It also makes it difficult for your content to stand out, as GenAI defaults to the same stock phrases and lacks both substance and storytelling. And then there's the issue of AI hallucination, where chatbots confidently produce plausible-sounding but incorrect information or outright false statements.
It’s also unclear how much time is actually saved by using GenAI. How much time will it take to test many different prompts and evaluate the outputs until you get something close to what you want? And then how long will it take to edit the output so it reflects your brand voice? Or to double-check that the chatbot accurately interpreted the information you fed into it, especially when dealing with highly technical or specialized topics?
When you add it all up, it’s better to hire a team like Ascribe to do the writing for you. Because when quality counts and your reputation is on the line, AI still isn’t worth the risk.
Contact us today to see what marketing content produced by real writers can do for your business.