Talk of ‘Artificial Intelligence’ or AI imagery in Commercial Photography has been a little hard to escape this year. It has everyone talking, and its impact right now shouldn’t be underestimated.
We wanted to share our interaction with this new technology, and how it fits in with photography.
The term artificial intelligence is perhaps a little misleading, as most of us now know. It isn’t intelligence in the common meaning of the word. What it is, at present, underneath the catchy AI tag, is a computer- or server-based machine learning model, trained on an enormous data set, that produces an output at speeds not possible before. This data set can be a pre-existing stock library, or simply all the images on the internet. More on this a little later.
After entering written text prompts, the software/algorithm feeds the user an image that is supposed to match the input. Quite frequently, the outcome is scarily accurate!
It has the potential to completely remove tasks traditionally undertaken by people, and/or increase the speed and efficiency of the work typically carried out by humans in the creative industries.
But is it good enough, do we like what it’s producing, and is it a useful and ethical commercial tool that we can harness in our photographic industry?
Firstly, it’s fair to say here at Double Exposure we still feel in control of our image making; the dramatic displacement seen historically when machines were introduced to other industries doesn’t seem to be happening to us right now, at a mechanical level at least. The web- and program-based AI that exists at the moment can’t yet pick up our camera, compose or understand a subject, and deliver on a creative brief. This means that, at present and in our industry, artificial intelligence doesn’t manifest itself as a robot that can completely replace a photographer.
Secondly, the programs that create imagery from scratch still need a human inputting commands, and a general overseeing of the creative process. The software cannot work autonomously. It doesn’t know what a client brief is. It can’t manage the complex exchange of information needed to complete a commercial photoshoot in the advertising sector, for example.
All it can do currently is accept commands from a person, and create something visual in return. This means it’s a computer program, or algorithm; it is a creative digital tool. We believe it’s the human touch, or artistic eye, that will still be the most valuable tool (in combination with traditional art direction and creativity) that produces the best output.
However, what it creates, and how it creates it, is very impressive!
The programs we have been using are trained to link words to millions of relevant images, and to create something similar by matching an input word to an output. In technical terms, they arrange picture elements (pixels) in a certain order to make a recognisable image; an image that, more or less, could have been created by light falling on a camera’s sensor. This might sound simple, but it is incredibly quick and accurate at ‘generating’ an image that looks like what you need it to.
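To make that ‘words in, pixels out’ idea concrete, here is a minimal sketch of text-to-image generation using the open-source Hugging Face diffusers library and a Stable Diffusion model. To be clear, these are illustrative assumptions on our part, not the specific commercial tools discussed in this post.

```python
# A minimal sketch of text-to-image generation, using the open-source
# Hugging Face `diffusers` library and Stable Diffusion. These are
# assumptions for illustration, not the specific tools named in this post.
import torch
from diffusers import StableDiffusionPipeline

# Load a model that has already been trained to link words to images.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")  # a GPU makes generation practical

# The written prompt is the only creative input the model receives.
prompt = "a modern concrete room with dark wooden furniture and soft window light"

# The pipeline arranges pixels to match the prompt and returns an image.
image = pipe(prompt).images[0]
image.save("generated_background.png")
```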
Consequently, it can mimic the image-creating ability of a camera, and by direct extension the human photographer, by turning word prompts into a digital image that generally looks the same.
So should and could we use these digitally generated images, commercially?
This is best answered by looking at the imagery it can create, and by exploring AI within photography from an ethical perspective.
A note on the ethics of AI Generative Content
It’s important to understand the two main approaches of computer learning algorithms in the photography industry at present.
One is a more specific, focused application of the technology; it’s designed to speed up and assist common processes in post production, such as object removal, background extension, and generation of digital subjects that otherwise didn’t exist at the time of shooting. This ‘generative fill’ style of tool is designed to support the photographer in the creation of their own work.
The other approach is a system that has been trained on large swathes of the internet, specifically pictures, and as such functions directly as a consequence of data scraping. The programs that use this method are unethical in their approach, but understandably interesting. At present there isn’t regulation from the UK Government to allow for proper financial remuneration to the authors of the work the program is using; think of it as selling a product or service that wasn’t yours in the first place. That’s not to say that learning from what has been produced historically and re-creating it is new to any creative medium, but it raises serious questions, not only over the images being used to make new ones, but over how to correctly copyright and protect the imagery created, for both photographer and client.
The line between copyright infringement and appropriation is thin, and without understanding precisely how the programs interpret data, this grey area certainly calls for some caution at present. We feel strongly that using software to mimic an image maker’s style, i.e. ‘ripping off’ their ‘look’, is completely unacceptable. However, allowing technology to advance in a way that can understand what an artist wants to make (by learning what that image should look like) is a progressive tool. Provided the source images are approved, and the resultant image has enough ‘original’ content at a technical level, then it’s likely here to stay.
Our challenge to ourselves
This year we created two scenarios that could represent client briefs, and set ourselves the challenge of understanding the benefits and weaknesses of using AI imagery in Commercial Photography, and exploring this emerging software. Our primary (service-based) business focuses on product photography, among other sectors.
Product photography consists of two elements: the photography of the product itself, and the environment that product is in. The application of ‘AI’ tools to these two elements of our practice is, at present, staggeringly different, and the following blog explores how we got on technically with the programs we chose to try.
AI to generate the product
This isn’t a new concept. The generation of a real product with computer rendering has existed for a long time. In some industries it is mainstream practice. Many 3D renders are excellent, and have replaced photography, but there is still a large market where a camera is more cost effective, and allows for greater flexibility and realism.
Our early experience with AI imagery in commercial photography, as a generation tool for the product itself, has been fraught with challenges, and doesn’t appear worth it, at this stage.
One core reason is the copyright law that protects the intellectual rights of imagery. Programs that generate images by looking at other images appear to be prohibited from copying, or rendering/generating identical versions of, licensed works that already exist.
The other primary reason is the inability to know what the product in question is. If the software has never ‘seen’ an object (let’s assume it’s a product new to market, and there aren’t any 3D models or photographs either online or in the algorithm’s database), then how can it possibly generate an accurate visual image of it? Remember, the AI tools for image making rely on being trained on existing images; they cannot know 100% what something actually is if they have never seen it.
So, creating the object that we would typically photograph, with AI software, is currently not a viable option. The programs can’t yet create an exact copy of the item in question, and often distort the object or create something visually similar but technically incorrect.
To test the programs we created two original photographs in our studio, of a couple of common tech products.
We then tried to create the same images with AI programs. Take a look below at how Adobe Firefly did, based on commands from our team to make something visually similar. It’s clear there’s a copyright block on these consumer items, which makes sense.
We have concluded (at present) this isn’t something we would use day to day. It may transpire that in time a large model can generate accurate image versions of products, but for any image made like that to be purposeful it needs to be technically correct, and is therefore just a computer render; in other words, this already exists, and doesn’t need reinventing.
AI to generate the product’s environment
This is where things get interesting.
As with 3D modelling or computer rendering of products, blending a background with the product visual has existed for quite some time. Clipping a product in post production and inserting it into another scene is common practice – we’ve been doing it for years, and it’s established in many industries, for example automotive ‘photography’, which often isn’t photography/videography at all any more. The car is a render, and the background is either blended into the image in editing, or is an entirely digital background scene, as in the movies.
Also, with photography, using correctly licensed stock imagery to composite with product pictures is established and mainstream too.
So in that sense, nothing new is going on. The really interesting thing however, is the speed, variety and integration of these techniques with existing workflows. AI imagery in commercial photography has the potential to speed up time to market in advertising, and offer greater control and freedom with creative direction.
The base surface. The background. The mood, vibe and lighting. These are all elements of an image that AI is fast integrating itself with. Where once a stock library image was the go-to source for a different room-set, or an industrial look and feel, AI has been, and still is, replacing this workflow.
The ability to quickly command the desired look of the environment, and then speedily refine the resulting images across different versions of this software, is staggeringly useful. It is saving time, and getting very close to what’s needed, very quickly.
Once generated, these can easily be saved and inserted into the traditional Photoshop workflow as before.
Indeed, a reference mood board image from a client brief can be fed directly into the AI program to further refine this process, so the program can generate a close match without unnecessary time adjusting text prompts. Furthermore, the ability to ‘back search’ from a close match can refine a concept super fast.
Even if the final product image is manually blended, the ability to show the program the shape and camera angles is very useful; and whilst technical inputs like lens type, elevation and depth of field are still not quite as advanced as we’d like, it surely won’t be long before we can command the exact background perspectives we need.
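For the technically curious, the ‘feed in a reference image’ idea above can be sketched with the same assumed open-source tooling, using an image-to-image pipeline. Again, the model, file names and strength setting below are assumptions for illustration, not the exact software we used:

```python
# A hedged sketch of refining from a reference image (image-to-image),
# using the open-source `diffusers` library. Model and file names are
# assumptions for illustration only.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# The client's mood-board image guides composition, elevation and lighting.
reference = load_image("client_moodboard.jpg").resize((768, 512))

prompt = "industrial loft interior, warm window light, low camera angle"
# `strength` controls drift from the reference: lower values keep the
# reference's angles and lighting; higher values allow more freedom.
result = pipe(prompt=prompt, image=reference, strength=0.55).images[0]
result.save("refined_background.png")
```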
Additionally, the integration of Adobe’s Generative Fill into Photoshop itself is a complementary and exciting tool.
This algorithm (an extension of the content-aware tool) can quickly, and surprisingly accurately, extend backgrounds, remove unwanted elements and even generate props or objects that have never existed, without any learnt skill from a digital operator/retoucher. Whilst at times the result is a little crude to a seasoned editor, there’s no doubt this in-built tool will become the norm for the majority of studios from now on. The same technology is even being built into mobile phone apps, to remove parts of images in your everyday family snaps... scary, right?!
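Adobe doesn’t expose Generative Fill as public code, but the underlying technique, inpainting a masked region of a photograph, can be sketched with open-source tooling. The checkpoint and file names below are assumptions on our part, not Adobe’s implementation:

```python
# A hedged open-source analogue of 'generative fill' (inpainting) with
# the `diffusers` library. Not Adobe's implementation; the checkpoint
# and file names are assumptions for illustration.
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

# The mask marks the region to regenerate: white = replace, black = keep.
image = load_image("studio_shot.png").resize((512, 512))
mask = load_image("mask_unwanted_stand.png").resize((512, 512))

# Only the masked area is repainted; the rest of the photo is preserved.
prompt = "clean seamless grey studio background"
result = pipe(prompt=prompt, image=image, mask_image=mask).images[0]
result.save("studio_shot_filled.png")
```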
Have a look at the examples below to see how AI software can create complementary backgrounds (and unusual/unique props) that have never existed before. These are unique environments that can be created with simple commands, or generated from feeding in existing images that help the computer to understand the camera elevation, angle, and lighting.
Provided the programs we can use are regulated in due course, and rightly so, this will likely become common practice.
It doesn’t matter if the speaker in the image isn’t the right one, as our photograph of the product can overlay this object, and be blended into the composite with traditional Photoshop techniques.
We feel these are successful results; the lighting, tonal gradients and colours are well generated – this is clearly the beginning of yet another way of working, digitally.
As such, we’re excited to explore the benefits that AI imagery in commercial photography can bring to our workflows; especially the speed and creativity that are so often needed in client briefs. See specifically the new images below, which are very passable; the Sonos speaker now has a modern concrete-styled room, with some dark wooden furniture and interesting window light. This is a generated room, which has the correct elevation and lighting to match the product. The image looks fine, nothing is too odd or out of place, and to build a small set like that would otherwise take additional time and money.
It may transpire that even if AI doesn’t replace traditional set building and still life styling, it adds the ability to generate extra content from the same shoot, and extra content has been a demand in (mainly social) marketing for over a decade!
In day to day practice, it is yet to be established how the authenticity of this image creation will be protected in this instance. Is the image below a direct copy of another author’s work, or a digital interpretation drawn from thousands of other images of rooms? And how the primary image makers can be correctly acknowledged or remunerated remains to be seen.
One of the hardest challenges for our team was reverse engineering the background to match the product. It will be better for any AI concepts/briefs to start with the room/background and then commission the photography. Done this way around, we can match the angles and lighting perfectly, allowing complete freedom to generate the environment desired by the client rather than being tied to matching the background to the product later on.
With the Fitbit watch, we found the team most liked a more conceptual image – see the fluffy pink clouds image below. Whilst it is a large visual deviation from the image we made in the studio, it still works very well, as the primary idea was abstract anyway. Watches don’t float above triangular shapes any more than they float above clouds, so from a conceptual point of view it’s clear that generative programs will flourish when used to make hyper-real creative results that don’t have to have day-to-day realism.
AI as Inspiration
The other obvious use of AI imagery in commercial photography is the generation of mood boards and inspiration.
A good mood board often forms the backbone of a creative shoot, and closely informs the art direction and deliverables of a brief.
Often, these documents are time consuming to put together, and need additional explanation with text. The ability to prompt software to quickly make images that have the look and feel you’re after is encouraging.
We’ve found we can frequently get close to the colours, atmosphere and aesthetic we were asking the program for. Furthermore, it can then generate images based on what it has made, and this back-feeding promises to be quite a powerful refining tool when trying to pin down a concept.
This may prove to be a great way of creating mood boards to portray a concept or brief, and the images can be saved either as a set or individually.
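As a final hedged sketch, here is how a small set of mood-board variations might be generated in one call with the same assumed open-source setup (the prompt and file names are purely illustrative):

```python
# A hedged sketch: generating a set of mood-board variations in one call.
# Model choice, prompt and file names are assumptions for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "muted pastel studio still life, soft morning light, film grain"
# Ask for several variations of the same concept at once.
images = pipe(prompt, num_images_per_prompt=4).images

# Save the set individually; they can also be tiled into a single board.
for i, img in enumerate(images):
    img.save(f"moodboard_{i}.png")
```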
If you’d like to find out more about how to safely use AI imagery in commercial photography, and how it can help create a deck of imagery for your project, or integrate concepts you’d like to see with products you already have, you can talk to us using the details below.
Our in-house team have put in the time to ensure we understand the relevant platforms, to help AI concepts complement our practice and add value to the projects of tomorrow!