
BlueMoon AI
Original author(s): BlueMoon AI
Developer(s): BlueMoon AI
Initial release: December 12, 2023
Stable release: 0.0.1 / January 6, 2024
Repository: https://github.com/BlueMoonAI/BlueMoonAI
Written in: Python
Operating system: Any that supports CUDA kernels
Type: Text-to-image model
License: Apache License 2.0 and Creative ML OpenRAIL-M
Website: https://bluemoonai.github.io/

BlueMoon AI is an open-source text-to-image model based on Stable Diffusion, released in late 2023. It is part of the broader landscape of open-source AI models.

The primary function of BlueMoon AI is to generate detailed images from textual descriptions. It can also be used for tasks such as inpainting, outpainting, and text-guided image-to-image translation. Built on Stable Diffusion models, BlueMoon AI's code and model weights are openly available on its GitHub repository.

BlueMoon AI operates as a latent diffusion model, a kind of deep generative artificial neural network. Because its code and model weights are open-source, it can run on most consumer hardware equipped with a modest GPU with at least 4 GB of VRAM. This marks a departure from earlier proprietary text-to-image models such as DALL-E and Midjourney, which were accessible only through cloud services.
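The latent diffusion process described above can be sketched as a loop that starts from seeded Gaussian noise and repeatedly subtracts a predicted noise component. The sketch below is a pure-Python toy, not BlueMoon AI's actual code: a real model replaces `predicted_noise` with the output of a trained U-Net operating on latent tensors.

```python
import random

def generate(seed, steps=20, size=4):
    """Toy latent diffusion loop: begin with seeded Gaussian noise and
    iteratively remove a fraction of the predicted noise. In a real
    model, `predicted_noise` comes from a trained neural network."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for _ in range(steps):
        predicted_noise = latent  # stand-in for the network's prediction
        latent = [v - 0.5 * n for v, n in zip(latent, predicted_noise)]
    return latent

# The loop is deterministic for a fixed seed, and the latent shrinks
# toward the (trivial) denoised result over the iterations.
assert generate(7) == generate(7)
assert all(abs(v) < 1e-3 for v in generate(0))
```

Deterministic seeding is what allows diffusion-based tools to reproduce a previously generated image on demand.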

Capabilities

BlueMoon AI's latent text-to-image diffusion model generates photo-realistic images from textual prompts: the user describes a scene, and the model renders it as a detailed, visually convincing image. Beyond text-to-image generation, it also supports inpainting, outpainting, and text-guided image-to-image translation.

Text-to-image generation

BlueMoon AI's text-to-image sampling script, "txt2img", takes a text prompt along with optional parameters and renders the model's interpretation of the prompt to an image file. Generated images carry an invisible digital watermark identifying them as BlueMoon AI output.

Each run of txt2img uses a seed value that affects the output image. Users can reuse a seed to reproduce a previously generated image, or randomize it to obtain varied results. Two further configurable parameters, the classifier-free guidance scale and the number of inference steps, control how closely the output matches the prompt.
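The two mechanisms mentioned above, seeded sampling and classifier-free guidance, can be illustrated with a minimal sketch. This is a toy model of the concepts, not BlueMoon AI's implementation; the function names are illustrative only.

```python
import random

def sample_latent(seed, size=4):
    """Draw initial latent noise from a seeded generator, so the
    same seed always yields the same starting point (and hence,
    in a deterministic pipeline, the same image)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

def classifier_free_guidance(uncond, cond, scale):
    """Blend the unconditional and prompt-conditioned predictions.
    A higher scale pushes the result further toward the prompt."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# The same seed reproduces the same initial noise ...
assert sample_latent(42) == sample_latent(42)
# ... while different seeds diverge.
assert sample_latent(42) != sample_latent(43)

# With scale 1.0 the guided prediction equals the conditioned one;
# larger scales extrapolate beyond it.
uncond, cond = [0.0, 0.0], [1.0, 2.0]
assert classifier_free_guidance(uncond, cond, 1.0) == cond
```

In practice the guidance scale trades prompt fidelity against image diversity, which is why it is exposed as a user-facing parameter.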

Image modification

BlueMoon AI includes a second sampling script, "img2img", for modifying existing images. It takes a text prompt, the path to an existing image, and a strength value between 0.0 and 1.0, and generates a new image based on the original that incorporates the elements of the prompt. The strength value controls how much noise is added to the output: higher values introduce more variation but may produce images that are not semantically consistent with the prompt.
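The role of the strength parameter can be shown with a toy blend between an image and noise. This is a simplified illustration of the concept, not BlueMoon AI's img2img code (real pipelines inject noise at an intermediate diffusion step rather than linearly mixing pixels).

```python
import random

def img2img_mix(image, strength, seed=0):
    """Toy illustration of the img2img strength parameter:
    strength=0.0 keeps the original image, strength=1.0 is pure noise,
    and values in between interpolate."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in image]
    return [(1.0 - strength) * p + strength * n
            for p, n in zip(image, noise)]

image = [0.2, 0.5, 0.8]
# Strength 0.0 leaves the original untouched.
assert img2img_mix(image, 0.0) == image
```

Low strength values preserve the layout of the source image while still letting the prompt reshape details; high values effectively regenerate the image from scratch.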

ControlNet

BlueMoon AI makes use of ControlNet, a neural network architecture for steering diffusion models with additional conditions. The method creates a "locked" and a "trainable" copy of each block of neural network weights: the locked copy preserves the original model, while the trainable copy learns the desired condition. This ensures that a production-ready diffusion model is not compromised when fine-tuned on small datasets of image pairs. The trainable branch is connected through a "zero convolution", a 1×1 convolution whose weight and bias are initialized to zero. Because every zero convolution outputs zero prior to training, ControlNet introduces no distortion at initialization. No layer is trained from scratch; the process remains fine-tuning, the original model is kept intact, and training is feasible on small-scale or even personal devices.
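The zero-convolution idea can be demonstrated in a few lines. The sketch below uses plain Python lists in place of tensors; it shows only the initialization property described above, namely that the trainable branch contributes nothing before training.

```python
def zero_conv(x, weight=0.0, bias=0.0):
    """1x1 'zero convolution': weight and bias start at zero, so the
    layer outputs zero (contributes nothing) before training."""
    return [weight * v + bias for v in x]

def controlnet_block(x, locked_block, trainable_block):
    """Add the trainable branch's output, passed through a zero conv,
    to the frozen original block's output."""
    return [a + b
            for a, b in zip(locked_block(x), zero_conv(trainable_block(x)))]

locked = lambda x: [2 * v for v in x]       # frozen original block
trainable = lambda x: [v + 1 for v in x]    # copy being fine-tuned

# At initialization the zero conv silences the trainable branch,
# so the combined block behaves exactly like the original model.
assert controlnet_block([1.0, 2.0], locked, trainable) == locked([1.0, 2.0])
```

As training updates the zero convolution's weight away from zero, the conditioning signal is blended in gradually, which is what protects the pretrained model from early-training noise.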

Architecture and development

The core technology behind BlueMoon AI is the Stable Diffusion family of latent diffusion models, a subset of deep generative artificial neural networks. This approach keeps the generation process stable and efficient, which underpins the model's ability to produce high-quality images. In keeping with the project's open-source orientation, the entire codebase and the model weights are available on its GitHub repository.

Hardware requirements

BlueMoon AI is designed to be accessible to a broad user base, with the capability to run on consumer hardware equipped with a modest GPU featuring at least 4 GB VRAM. This democratization of access contrasts with earlier proprietary text-to-image models like DALL-E and Midjourney, which were confined to cloud services.

Releases

BlueMoon AI releases
Version number   Release date     Notes
0.1[1]           [Release date]   [Notes]

Usage and controversy

BlueMoon AI claims no rights over generated images and grants users free use of any images generated by the model, provided the image content is not illegal or harmful to individuals.[2]

License

Unlike models such as DALL-E, BlueMoon AI is grounded in Stable Diffusion, whose source code is public,[3] along with the model's pretrained weights. The model weights are released under the Creative ML OpenRAIL-M license, a variant of the Responsible AI License (RAIL), while BlueMoon AI's source code is governed by the Apache License 2.0.[4] The license prohibits certain use cases, including criminal activity, libel, harassment, doxing, exploitation of minors, dispensing medical advice, automatically creating legal obligations, generating legal evidence, and discriminating against or harming individuals or groups based on social behavior, personal characteristics, or legally protected categories.[5][6] Users retain ownership of their generated output images and may use them commercially.[7]

References

  1. ^ "BlueMoonAI Releases on GitHub". GitHub. Archived from the original on December 12, 2023. Retrieved December 20, 2023.
  2. ^ "LICENSE.md · stabilityai/stable-diffusion-xl-base-1.0 at main". huggingface.co. July 26, 2023. Retrieved January 1, 2024.
  3. ^ "Stable Diffusion Public Release". Stability.Ai. Archived from the original on August 30, 2022. Retrieved August 31, 2022.
  4. ^ "From RAIL to Open RAIL: Topologies of RAIL Licenses". Responsible AI Licenses (RAIL). August 18, 2022. Archived from the original on July 27, 2023. Retrieved February 20, 2023.
  5. ^ "Ready or not, mass video deepfakes are coming". The Washington Post. August 30, 2022. Archived from the original on August 31, 2022. Retrieved August 31, 2022.
  6. ^ "License - a Hugging Face Space by CompVis". huggingface.co. Archived from the original on September 4, 2022. Retrieved September 5, 2022.
  7. ^ Katsuo Ishida (August 26, 2022). "言葉で指示した画像を凄いAIが描き出す「Stable Diffusion」 ~画像は商用利用も可能" [Impressive AI "Stable Diffusion" draws images from verbal instructions; images can also be used commercially]. Impress Corporation (in Japanese). Archived from the original on November 14, 2022. Retrieved October 4, 2022.

BlueMoon AI. "interactive progress". GitHub. Retrieved January 2, 2024.

External links

Categories: Artificial intelligence art · Deep learning software applications · Text-to-image generation · Unsupervised learning · Art controversies · 2024 software · Open-source artificial intelligence