DALL-E Metadata

DALL-E outputs several metadata fields along with the generated images. The prompts below show the fields I use most and how I ask for them explicitly.

"Use portrait format. [..] Ensure the entire canvas is filled in portrait format. Always output generation ID, parent generation ID, seed and prompt."
"use Prompt: Create a minimalist image of the word <beautiful>, emphasizing pure handlettering. [..]
Seed: 1020099057
Generation ID: hRwpMMtyPVwwjxk6
Maintain consistency with the specified seed and generation ID for style and approach. Now the word is summer"

"recreate the image, use all parameters possible to create an image similar as possible. always output generation ID, parent generation ID, seed and prompt. please tell if the seed has passed successfully."
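A follow-up prompt like the one with the seed and generation ID above can also be assembled from stored values. This is only a minimal Python sketch: the function name `build_followup_prompt` and its parameters are hypothetical, the layout simply mirrors the example prompt, and nothing here guarantees that DALL-E actually honors the seed.

```python
def build_followup_prompt(prompt: str, seed: int, generation_id: str, new_word: str) -> str:
    """Assemble a follow-up prompt in the same layout as the example prompt.

    All parameter names are illustrative; the values come from a previous
    generation whose metadata you asked DALL-E to output.
    """
    lines = [
        f"use Prompt: {prompt}",
        f"Seed: {seed}",
        f"Generation ID: {generation_id}",
        "Maintain consistency with the specified seed and generation ID "
        f"for style and approach. Now the word is {new_word}",
    ]
    return "\n".join(lines)


# Example using the values quoted in the prompt above.
followup = build_followup_prompt(
    "Create a minimalist image of the word <beautiful>, emphasizing pure handlettering.",
    1020099057,
    "hRwpMMtyPVwwjxk6",
    "summer",
)
print(followup)
```

Keeping the seed and generation ID on their own lines makes it easy to copy them from one response into the next prompt.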

  • Image Dimensions:
    1024×1024 = square images,
    1792×1024 = widescreen,
    1024×1792 = portrait
  • Prompt:
    If applicable, the user’s input or prompt provided to the model may also be included in the metadata. This helps provide context for understanding how the image was generated and what specific instructions or constraints were given to the model.
  • Seed:
    The random seed used to initialize the generation process. It influences both the uniqueness and the reproducibility of the generated image: using the same seed with the same prompt should give the same image, though this is not guaranteed.
  • Generation ID:
    This is a unique identifier assigned to each generated image. It helps track the specific instance of image generation, allowing users to reference or retrieve the image later if needed.
  • Parent Generation ID:
    When you recreate an existing image, the Generation ID of the original is stored as the Parent Generation ID. Example prompt:
    "please recreate the image, use all parameters possible to create an image similar as possible"
  • Reference image ID:
    If the generated image was based on or inspired by a reference image provided as input to the model, the reference image ID may be included in the metadata. This helps link the generated image back to its source or inspiration.
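If you ask DALL-E to print these fields with every image, it can be handy to collect them from the text of the response. The following is a minimal Python sketch, assuming the response lists each field as `Label: value` on its own line, as in the seed and generation ID example earlier. The names `FIELD_LABELS` and `parse_metadata` are hypothetical, and the real output layout may differ.

```python
# Labels as they appear in the example response, mapped to dictionary keys.
# Assumption: one "Label: value" pair per line; DALL-E's wording may differ.
FIELD_LABELS = {
    "prompt": "prompt",
    "seed": "seed",
    "generation id": "generation_id",
    "parent generation id": "parent_generation_id",
    "reference image id": "reference_image_id",
}


def parse_metadata(text: str) -> dict:
    """Extract known metadata fields from a plain-text response."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so colons inside a prompt survive.
        label, sep, value = line.partition(":")
        key = FIELD_LABELS.get(label.strip(" \t-*•").lower())
        if sep and key:
            fields[key] = value.strip()
    return fields


# Example input; "abc123" is a placeholder, not a real generation ID.
response = """Prompt: Create a minimalist image of the word summer
Seed: 1020099057
Generation ID: hRwpMMtyPVwwjxk6
Parent Generation ID: abc123"""
print(parse_metadata(response))
```

Matching on the whole label rather than a substring keeps "Generation ID" and "Parent Generation ID" from being confused with each other.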

The following metadata fields, where available, help users understand and interpret the generated images by providing context about their characteristics:

  1. Image Tags or Labels:
    These are descriptive keywords or labels that describe the content or theme of the generated image. For example, if the image depicts a cat sitting on a table, the tags might include “cat,” “table,” and “animal.”
  2. Image Description or Caption:
    A textual description or caption generated by the model that provides a summary or interpretation of the content depicted in the image.
  3. Image Category or Class:
    Specifies the category or class of the generated image, such as “animal,” “landscape,” “architecture,” etc. This helps categorize and organize the generated images for further analysis or retrieval.
  4. Image Dimensions:
    This metadata field provides information about the dimensions (width and height) of the generated image.
  5. Image Format:
    Indicates the file format of the generated image, such as JPEG, PNG, etc.
  6. Image Timestamp:
    The timestamp indicates the time when the image was generated.
  7. Image Quality or Compression Level:
    Some implementations of DALL-E may include metadata indicating the quality or compression level of the generated image.
  8. Image Resolution:
    This metadata field provides information about the resolution of the generated image, typically specified in pixels per inch (PPI) or pixels per centimeter (PPCM).

The following additional metadata fields, where available, give users more comprehensive information about the generation process and the characteristics of the output:

  1. Model Version:
    Indicates the version of the DALL-E model used to generate the image. This helps users understand the capabilities and limitations of the model.
  2. Model Configuration:
    Information about the configuration or architecture of the DALL-E model used for image generation, including details such as the number of layers, parameters, and architecture variants.
  3. Generation Time:
    Specifies the time taken by the model to generate the image. This can be useful for assessing the efficiency of the model and understanding computational requirements.
  4. Generation Parameters:
    Provides information about the parameters or settings used during image generation, such as temperature (as used when sampling from models like GPT), sampling strategy, or any other relevant settings.
  5. Model Confidence or Uncertainty:
    Some implementations may include a measure of the model’s confidence or uncertainty regarding the generated image. This can help users assess the reliability of the generated output.
  6. Generated Image URL:
    If the generated images are hosted online or accessible via a web service, the metadata may include a URL linking to the location of the generated image file.
  7. Metadata Timestamp:
    Indicates the time and date when the metadata was generated or attached to the image. This helps track the timing of image generation and metadata assignment.

Specify image aspect ratio

If you’re using DALL-E to generate images with a portrait aspect ratio (taller than wide), you could include a prompt like one of these:
"Generate an image with a portrait aspect ratio of 3:4"
"Create a portrait-oriented image with dimensions 1200x1600 pixels."

Similarly, if you’re looking for widescreen images (wider than tall), specify the desired aspect ratio or dimensions in your prompt:
"Generate a widescreen image with an aspect ratio of 16:9"
"Create a widescreen image with dimensions 1920x1080 pixels."
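Since the Image Dimensions list earlier names only three output sizes, a requested ratio such as 3:4 or 16:9 has to be matched to one of them. The Python sketch below shows one simple way to do that by orientation alone. The names `SUPPORTED_SIZES` and `pick_size` are hypothetical, and how DALL-E itself maps a requested ratio is an assumption, not documented behavior.

```python
# The three sizes from the Image Dimensions list, keyed by orientation.
SUPPORTED_SIZES = {
    "square": "1024x1024",
    "widescreen": "1792x1024",
    "portrait": "1024x1792",
}


def pick_size(width: float, height: float) -> str:
    """Return the listed size whose orientation matches width:height.

    Assumption: orientation alone decides the choice; the exact ratio
    (for example 3:4 versus 9:16) is not preserved.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width > height:
        return SUPPORTED_SIZES["widescreen"]
    if width < height:
        return SUPPORTED_SIZES["portrait"]
    return SUPPORTED_SIZES["square"]


print(pick_size(3, 4))        # portrait request
print(pick_size(1920, 1080))  # widescreen request
```

Because the ratio is reduced to an orientation, a 3:4 request and a 1200x1600 request both end up with the same portrait size.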