Midjourney Flips the Formula with New Image-to-Text Generator – PetaPixel

Midjourney has announced a new /describe command that allows users to leverage the powerful artificial intelligence (AI) platform to transform images into words, upending Midjourney's typical procedure of converting text to images.

Paul DelSignore describes the feature on Medium, writing that /describe has significant benefits for a wide range of use cases.

Today we're releasing a /describe command that lets you transform images-into-words. Give it a shot! We think this tool will transform your liguistic-visual process both in terms of creative power and discovery.

Midjourney (@midjourney) April 4, 2023

One of the best aspects of the /describe feature is that it should improve accessibility. For people with visual impairments, navigating the web can be challenging. It's made more accessible by alt text that describes images. Writing alt text manually is time-consuming, and Midjourney's /describe functionality may overcome this hurdle.

Improved search functionality is beneficial to nearly every internet user. Search engines can index images more effectively when they include better and more plentiful descriptions.

DelSignore also highlights the importance of captions, as detailed captions help explain images and provide more clarity to viewers.

Image-to-text generation creates an interesting feedback loop with Midjourney's text-to-image system. While Midjourney users can already generate similar images based on a selection, image-to-text tools may make it easier to develop alternate and potentially more fruitful descriptions for the text-to-image generator.

Gonna remix one of my images I created with Element 3D on AE

Using the /describe function to see what it says on #midjourney v5 is really interesting for prompt generation so will now see what they make. pic.twitter.com/BvkL3pu3SI

GooRee (@GooRee) April 3, 2023

In its current iteration, as with its text-to-image generator, Midjourney creates four different text descriptions of an uploaded image. It's also possible to generate new variations based on a selected description. To upload a photo, users type /describe into the text field, and a drag-and-drop upload field appears.

Users can then select one of the generated descriptions and remix the uploaded image using the new text prompt. The user can also edit the text prompt, adding a new element of control to the creative process.

PetaPixel tested the feature, first using a portrait captured by editor-in-chief Jaron Schneider.

Midjourney's four generated descriptions are of varying quality.

The first two descriptions are pretty good, especially the second one. It's interesting that Midjourney described a specific Voigtlander 15mm prime lens, though, for the record, the image was shot with a Tamron 35mm f/1.8 prime. Using the second description to generate a remix leads to pretty impressive results.

Using another image by Schneider, this time a landscape from Mono Lake in California, Midjourney again generates mostly useful text descriptions, though they get the location wrong.

Using the third description as a remix prompt, Midjourney delivered four very realistic new images.

Midjourney's /describe tool is intriguing, even in its early state. The tool should help creators make more detailed alt text, captions, and even different AI-generated artwork. While some parts of the descriptions are puzzling, to say the least, they show promise.

Image credits: Jaron Schneider and Midjourney
