BotZilla AI Newsletter

Prompt Engineering - Is it still relevant?

Models are improving in leaps and bounds with better cognitive abilities, reasoning and ability to use third-party tools. So do you still need to know prompt engineering, or was it just a passing fad?

Daren Cox
Nov 10, 2023
Medieval knight riding a robot horse. Imagined using DALL-E 3.

Hi folks, in this week’s newsletter on prompt engineering, I start by delving into OpenAI's first-ever Developer Conference.

Keynote speaker Sam Altman, CEO of OpenAI, unveiled a suite of new tools and features that promise to redefine the AI landscape.

I also discuss “Grok”, Elon Musk's latest foray into AI, with its unique blend of knowledge and wit. But is the world ready for yet another LLM frontier model?

With new GPT-4 models available in beta already, boasting larger context windows, developer access and more affordable rates, the question arises:

How will these enhancements integrate with our current digital toolkit?

As we explore the groundbreaking concept of custom ChatGPTs, or “GPTs”, we consider their potential to revolutionise how we build AI assistants tailored to specific needs.

Will this new capability democratise AI, making it accessible to those without coding expertise? And in a world where the line between human and AI interaction is increasingly blurred, we revisit the topic of prompt engineering.

Is it an arcane practice reminiscent of past aeons (so 2022;) or a necessary skill in the era of advanced AI models?

We unpack the implications of OpenAI's announcements and consider how these advancements may shape the future of AI and our interaction with it.

Don't miss out on the insights and analysis that could well inform your next digital move.

Let’s dive in! 🚀🔍


As an aside, I wrote this week’s newsletter without access to ChatGPT for the most part, so wasn’t able to try everything until the last minute. Apologies in advance for any oversights!

Thanks guys! Source: OpenAI.

Before we get into prompt engineering, let’s touch on the two big news stories this week.

OpenAI Developer Conference

First, OpenAI held its first-ever developer conference, and the live-streamed keynote by Sam Altman outlined a number of impressive new models and features.

I’ll dig deeper into these features in a future newsletter once I’ve got access to them (ChatGPT is suffering a big outage as I write this due to exceptional demand!), so for now, here are the highlights.

New GPT-4 models for developers

OpenAI announced two new versions of GPT-4 for developers to build upon: GPT-4 Turbo and GPT-4 Turbo with vision.

Both have a huge 128k context window, which is enough to consume most novels whole—roughly 300 pages of text in a single prompt.

Text input/output is also substantially cheaper than the existing GPT-4 model, and the GPT-4 Turbo with vision model adds a per-image cost (for example, a 1024 × 1024 square image in "detail: high" mode costs 765 tokens).

Source: OpenAI
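
For developers, calling the new model looks something like the sketch below. This assumes the v1 openai Python SDK and the GPT-4 Turbo preview model name announced at DevDay; treat it as illustrative rather than definitive.

```python
# Minimal sketch: a chat completion against the new GPT-4 Turbo preview model.
# Assumes `pip install openai>=1.0` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # GPT-4 Turbo preview, 128k context window
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarise the key points of this chapter: ..."},
    ],
)
print(response.choices[0].message.content)
```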

There are many more refinements to the models, which you can read about more fully here.

And, for more general details regarding the GPT-4V(ision) model, there are links here.

GPTs are Custom ChatGPTs

The other exciting announcement from OpenAI concerns custom versions of ChatGPT, called GPTs, that combine instructions, extra knowledge, and any combination of skills you might need.

GPTs are OpenAI's first step toward allowing users and enterprises to build their own AI assistants—directly within ChatGPT's interface.

This is BIG NEWS!

For anyone who follows this newsletter, you’ll know I’ve been discussing RAG (Retrieval Augmented Generation) AI architectures for some time.

Retrieval augmented generation lets you incorporate your own data into language models, like GPT-4, so that they can answer questions about your company, products or data.
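
To make the idea concrete, here is a deliberately naive RAG sketch: it embeds a couple of in-memory documents, retrieves the closest match to the user's question, and stuffs it into the prompt. The document texts, chunking and ranking are all illustrative assumptions, and it again uses the v1 openai Python SDK.

```python
# Naive RAG sketch: embed documents, retrieve the best match, prepend it to the prompt.
import numpy as np
from openai import OpenAI

client = OpenAI()

# Placeholder "company knowledge" (in practice: chunked files in a vector store)
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available 9am-5pm GMT on weekdays.",
]

def embed(texts):
    out = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in out.data])

doc_vecs = embed(docs)

def answer(question):
    q_vec = embed([question])[0]
    # Cosine similarity between the question and each document
    scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = docs[int(scores.argmax())]
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    reply = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

print(answer("How long do customers have to return an item?"))
```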

OpenAI says you will then be able to offer your GPT chatbot either internally for staff use (Enterprise plan) or externally to clients, say as a customer service bot.

Although some chatbot platform companies have provided a no-code toolkit to do this in the past, for more sophisticated implementations, you needed a whole development team to implement the complex RAG architecture.

Soon, you'll be able to do this kind of thing on the fly with GPTs. Bear in mind, though, that this is only step one of a much longer roadmap, most likely leading towards autonomous agents that undertake complex tasks for you.

In the first release, you can create and deploy fully customised versions of ChatGPT that are tailored to your own product or company, and here’s the punchline—you can do it without coding.

As it rolls out to ChatGPT (Plus and Enterprise) users over the coming few days, you can try it out at chat.openai.com/create.

You can also access the Assistants API in the Playground and via the coding API if you are a developer. The Assistants API is built on the same capabilities that enable the new GPTs.

GPTs use one of ChatGPT's two models (GPT-3.5 or GPT-4), and you get control over:

  • Behaviour: You can give it a detailed set of instructions to guide its answers, like ChatGPT’s current custom instructions.

  • Bespoke knowledge base: You can add your own company files for the AI to draw information from—aka the RAG (Retrieval Augmented Generation) approach I discussed above.

  • Capabilities: You can use OpenAI's existing capabilities (like DALL·E 3, Browse with Bing, or Advanced Data Analysis) as well as build and integrate your own custom capabilities.
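
For developers, the Assistants API mirrors these three controls fairly directly: instructions for behaviour, uploaded files for bespoke knowledge, and tools for capabilities. The sketch below is based on the beta as announced at DevDay, so parameter names may shift over time, and the file name is a hypothetical placeholder.

```python
# Sketch of the Assistants API beta: instructions (behaviour), a file (knowledge)
# and tools (capabilities), then a thread + run to hold a conversation.
from openai import OpenAI

client = OpenAI()

# Knowledge: upload a company document for retrieval (hypothetical file name)
faq = client.files.create(file=open("product_faq.pdf", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="Product Support Bot",
    instructions="Answer customer questions using only the attached FAQ. Be brief and friendly.",
    tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],  # capabilities
    model="gpt-4-1106-preview",
    file_ids=[faq.id],
)

# Conversations live in threads; a run executes the assistant on a thread.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="What is your refund policy?"
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```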

Furthermore, taking the no-code ethos to the max, using workflow and productivity software from Zapier, you can integrate over 6,000 commonly available apps directly into your custom GPT.

ChatGPT GPTs interface using Zapier Actions. Source: Zapier Actions

From a business productivity perspective, this is going to be huuge, and it puts OpenAI in direct competition with Microsoft's and Google's upcoming office-suite AIs.

Here are some GPT use cases from the OpenAI website,

GPTs. Source: OpenAI

Later in November, a new GPT Store is being rolled out to showcase the best GPTs, built by verified builders.

OpenAI also says there will be an ability to earn money based on the number of people using your GPT.

ChatGPT Plus gets a data update and is (even) simpler to use

Finally, ChatGPT Plus now includes information up to April 2023.

Also, there’s no more hopping between models; everything is in one chat tab going forward.

You will soon (perhaps now?) be able to access DALL·E 3, Browse with Bing, and Advanced Data Analysis, all without switching chat tabs.

This also includes attaching files to let ChatGPT search PDFs and other document types.

Although GPTs are great for users, they will simultaneously sound the death knell for a thousand single-use AI startups, like Chat with PDF apps.

I’m looking forward to getting access to GPTs soon to build some real automations!

To catch up on all the OpenAI DevDay announcements, read this.

Elon Musk releases “Grok”

The other big announcement this week was Elon Musk's x.ai frontier Large Language Model (LLM), called Grok.

In the words of the Grok team,

Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy, so intended to answer almost anything and, far harder, even suggest what questions to ask!

Grok is designed to answer questions with a bit of wit and has a rebellious streak, so please don’t use it if you hate humor!

A unique and fundamental advantage of Grok is that it has real-time knowledge of the world via the 𝕏 platform. It will also answer spicy questions that are rejected by most other AI systems.

Grok is still a very early beta product – the best we could do with 2 months of training – so expect it to improve rapidly with each passing week with your help.

Thank you,
the xAI Team

—Source: x.ai

Although Mr Musk often tweets up a storm about the need for AI regulation, even adding his name to letters asking to “pause AI” development, it seems he's not always sticking to the script himself.

That aside, the Grok initiative is intended to “create AI tools that assist humanity in its quest for understanding and knowledge.”

Again, in their own words, as information is limited and access to Grok currently even more limited,

By creating and improving Grok, we aim to:

Gather feedback and ensure we are building AI tools that maximally benefit all of humanity. We believe that it is important to design AI tools that are useful to people of all backgrounds and political views. We also want to empower our users with our AI tools, subject to the law. Our goal with Grok is to explore and demonstrate this approach in public.

Empower research and innovation: We want Grok to serve as a powerful research assistant for anyone, helping them to quickly access relevant information, process data, and come up with new ideas.

Our ultimate goal is for our AI tools to assist in the pursuit of understanding.

—Source: x.ai

The model is currently available on a “request only” basis.

It’ll be interesting to see what Grok offers over and above every other frontier Large Language Model (LLM) apart from a sense of humour and free access to Twitter data—because we already know the “Answer to the Ultimate Question of Life, the Universe, and Everything,” is 42.

Either way, the competition is welcome and will only improve the speed and quality of future model developments from all creators.

Now, let’s get on to the week’s main topic.


Prompt Engineering - Is it still relevant?

Considering that LLMs, such as GPT-4, which powers ChatGPT, are simply (okay, maybe not that simply) engineered to predict the next token—where a token can represent a single word such as “and”, “if”, “of”, or a fraction of an extended word, for instance, “OpenAI” being segmented into “Open" and “AI”—the outcomes are remarkable.

How tokens relate to words. Source: https://platform.openai.com/tokenizer
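
If you want to see this splitting for yourself without the web tokenizer, OpenAI's tiktoken library exposes the same encodings; a quick sketch:

```python
# Inspect how a string maps to tokens using OpenAI's tiktoken library
# (pip install tiktoken).
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
tokens = enc.encode("OpenAI")
print(tokens)                             # integer token ids
print([enc.decode([t]) for t in tokens])  # expected pieces like ['Open', 'AI']
```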

The capabilities of these models become truly impressive after they have undergone pretraining on upwards of billions of tokens, learning to determine the likelihood of the next token by analysing its predecessors within the model’s context window.

Prompt engineering is, therefore, about navigating (in a non-technical way, using language) through this vast, interconnected, multi-dimensional network to get to the nuggets of information that best answer the user's questions.

I’ve discussed prompt engineering many times during the course of this newsletter.

Prompt engineering is still necessary because LLMs, like those underpinning ChatGPT and others (Claude, PaLM, Llama-2, etc.), can't perfectly interpret what we ask them, and we have to nudge them in the right direction.

Of course, sometimes models fail to give satisfactory answers because they are not yet sophisticated enough to comprehend our requests.

But also, sometimes, our instructions are just too vague and lack sufficient context, or we are too lazy to articulate what we need to get a good result ← this!

People have made analogies between using ChatGPT and having a bright graduate come to work for you—the more context and clearer the steps you provide to help them achieve a task, the better the results you get back will likely be.

However, with the advent of bespoke chatbots through OpenAI’s new GPTs (discussed above and via other third-party chatbot platforms) and the integration of different GPT-4 models into a single ChatGPT interface (so no more hopping between models), you have to ask, is prompt engineering still relevant?

The old ChatGPT interface required users to hop between different models depending on the use case. Not any more. Source: ChatGPT Plus (pre-Nov 9th)

Well, yes and no.

For models less powerful than GPT-4 (that includes GPT-3.5, Llama-2, and most others), the answer is definitely yes: it's wise to use prompt engineering techniques in all but the simplest requests.

And, even for GPT-4, there are times when you need to dig deep into the model and specific techniques become essential.

This is especially true if you want to create a prompt that is (reasonably) repeatable in terms of the answers it gives. In that case, you will need to structure your prompt very carefully, although LLMs are, by their nature, not deterministic, so this can be tricky at times.
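
For API users, there are a couple of knobs that help here: a low temperature, plus the seed parameter announced at DevDay, which aims for best-effort reproducibility. A hedged sketch, again assuming the v1 openai Python SDK:

```python
# Nudging the API towards repeatable answers: low temperature plus the new
# (beta) seed parameter. Outputs are still not guaranteed to be deterministic.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    temperature=0,  # reduce sampling randomness
    seed=42,        # best-effort reproducibility across identical requests
    messages=[{"role": "user", "content": "List three prompt engineering techniques."}],
)
print(response.system_fingerprint)          # changes when the backend changes
print(response.choices[0].message.content)
```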

Having said that, for everyday queries or tasks you ask an LLM to complete, it’s more than likely that the results it gives back are just fine based on the Pareto principle, otherwise known as the 80/20 rule.

An interesting article by Ethan Mollick (Professor @Wharton studying AI, innovation & startups), who uses and experiments with AI extensively in his day-to-day work as an educator, outlines two ways to prompt:

  1. Conversational prompting - this is everyday prompting where you go back and forth between you and the bot until you (hopefully) get the right answer, and

  2. Structured prompting - where you take time to carefully craft and iterate over a prompt, using specific techniques and templates to elicit the best results (a rough sketch follows below).
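
As a rough illustration only (the author's own take on both approaches follows), a structured prompt is typically assembled from explicit, reusable parts rather than typed off the cuff. All of the field values below are placeholder assumptions.

```python
# Illustrative skeleton of a structured prompt: role, context, ordered steps
# and an explicit output format. Placeholder content throughout.
STRUCTURED_PROMPT = """
Role: You are an experienced marketing copywriter.

Context:
- Product: a note-taking app for students (placeholder)
- Audience: UK university students
- Tone: friendly, no jargon

Task (follow these steps in order):
1. List three benefits of the product for the audience.
2. Draft a 50-word product description using those benefits.
3. Suggest one subject line for a launch email.

Output format: numbered sections matching the steps above.
"""

print(STRUCTURED_PROMPT)  # paste into ChatGPT, or send via the API
```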

In the following, I’ll describe my take on these approaches.

Keep reading with a 7-day free trial

Subscribe to BotZilla AI Newsletter to keep reading this post and get 7 days of free access to the full post archives.
