What is it? (Skip ahead if you don’t need an introduction to Prompt Engineering.) The simple answer: it is the art or science of choosing the right words to return the results the user intended; the magic words that make an AI use its powers to do what you want.
The more complicated answer involves the long-standing debate over whether the skill of making good, rational decisions is a science or an art. Both the art and the science of decision making require many types of information to be rationally prioritized, which is a subjective process for each individual decision-maker. However, some choices are always wrong, and we (usually) all agree. We know that computers still struggle to see things that are obviously wrong. This is why we are still shown captchas asking where the stoplights are in a photo array: so the AI in self-driving cars can better mimic our driving abilities.
Other emerging and improving AI technologies also rely on human feedback to get better. These include text generation, sound and music design, image generation, 3D model and environment generation, library sciences, and more. Each piece of the puzzle is being run by Artificial Intelligences, but the context and accuracy of their results can only be tested by Human Intelligences. That’s where Prompt Engineering shows us a glimpse of the future.
For now, many Prompt Engineering industries are predatory or outright scams. There are people selling prompts for AI generators, and there are direct-to-NFT business models with a shorter life expectancy than yogurt in the sun.
However, the role that we humans play in directing AI is critical and irreplaceable. Computers simply do not know when they do not “get it.” The ability to communicate effectively with the tools of the future is essential to building that future. So, what skills make a good prompt engineer?
Prompt Engineering asks for humans who are jacks-of-all-trades, especially people who have a lot of hobbies. Neurodivergent people with varied interests (rather than a singular fixation) would excel at this type of job. People with physical disabilities that don’t impede their use of the hardware will be particularly good at this because they are experienced at navigating creatively without accommodations until the system improves with their feedback. The task asks for cross-disciplinary knowledge because each piece of media being generated has many specific terms attached to it: metadata.
For example, if a person asks for a piece of “video game concept art”, they will get a vague result. But someone who knows enough to specify the materials of the armor, the type of weapon or blade, the exact hairstyle, the precise photography term for the lighting, and so on, will get something much closer to what they had in mind. The more usable data a person is able to specify, the more useful the outcomes the computer can return.
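To make the difference between a vague and a specific prompt concrete, here is a minimal sketch in Python. It only shows how descriptive terms can be layered onto a base request. The `build_prompt` helper, the comma-separated layout, and every descriptive term in the list are my own illustrative assumptions, not the required syntax of Midjourney, DALL-E 2, or any other generator.

```python
def build_prompt(subject, details):
    """Join a base subject with optional descriptive terms, comma-separated.

    Hypothetical helper: the comma-separated layout is an assumption,
    not the documented input format of any particular image generator.
    """
    # Drop empty or whitespace-only terms so they don't leave stray commas.
    cleaned = [term.strip() for term in details if term.strip()]
    return ", ".join([subject] + cleaned)


# The vague request from the example above: no extra terms at all.
vague = build_prompt("video game concept art", [])

# The same request with made-up specifics for armor material,
# weapon type, hairstyle, and lighting.
specific = build_prompt(
    "video game concept art",
    [
        "brushed steel plate armor",
        "curved scimitar",
        "braided undercut hairstyle",
        "rim lighting",
    ],
)

print(vague)
print(specific)
```

Each added term stands in for a piece of knowledge from a different field (metalwork, weaponry, hairstyling, photography), which is why a wide range of interests pays off here.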
With current technology, a person can use a program like GPT-3 to engineer an article like this one without spending the hours of time that I did to research, write, and compile it into a legible text. A person who did that could spend more time on editing and improving the output article, or maybe move on to another task. Whether that is good or not is subjective.
I used AI to generate all of the images for this article. I mostly used Midjourney, a chatbot that runs inside Discord. I also used DALL-E 2, which required a great deal of coaxing to create Prompt Engineers who were not white men (almost all bald with unkempt beards…do with that what you will).
To learn more about the AI image generators I used in this article, see my comparison of several AI image generators including free options that are great for practicing prompt engineering so you can better communicate with the tools you’re using.