Image quality drops so much that every detail and texture is lost, as if a "plastic"-like filter were applied to everything.

#13
by Jamerrone - opened

After ComfyUI added support for this, I was finally able to try it out. However, the results are awful: every texture and fine detail is lost, as if a "plastic/smooth" filter had been applied over the whole image. The same effect is easy to see in the examples on this model's card.

Without:
ComfyUI_00137_

With:
ComfyUI_00138_

It also seems to affect colour quite a bit.

Alibaba-PAI org

Thank you for your testing. Please give me some demos and prompts so I can run more tests.

@bubbliiiing Hi, thanks for your response. I will do my best to share whatever I can, so please let me know if you need anything else.

Workflow:
https://pastebin.com/Kg2gZ8qu

Prompt used:

A photo of Lin, a 23-year-old soloist ballerina from Shanghai, captured in a moment of suspended grace against a seamless, matte grey studio backdrop. Her striking profile is a deliberate and harmonious synthesis of three specific muses: she possesses the ethereal, soft-contoured jawline and porcelain complexion of Liu Yifei, combined with the high-bridged, sculpted nose and deep-set architectural elegance of Dilraba Dilmurat, and finished with the emotive, cinematic sophistication of Tang Wei. Her dark hair is swept back into a tight, disciplined bun at the nape of her neck, revealing the graceful slope of her neck and the tension in her sternocleidomastoid muscle as she gazes upward toward her raised right hand.

She is captured from a rear three-quarter angle, emphasizing the muscular definition of her back and the elegant torque of épaulement. Her costume is a dramatic black ensemble; the bodice features short, cap-like sleeves and a scoop back constructed from sheer nude mesh that blends seamlessly with her skin tone. The black fabric of the bodice is embellished with subtle, deep red and dusty pink floral embroidery climbing up the side ribs. A voluminous, multi-layered black tulle tutu flares out from her hips, the stiff netting catching the studio light to reveal a complex texture of shadows and highlights.

Lin is poised in a relevé on her left leg, the calf muscle taut and defined, showcasing the immense strength required for the pose. Her foot is encased in a satin pointe shoe of a pale, peachy-pink hue, the fabric scuffed slightly at the box from use, with ribbons wound tightly around her ankle. Her left arm extends downward and to the side in a gentle demi-seconde, fingers softly grouped, while her right arm reaches high in a graceful curve, framing her upturned face. The lighting is cinematic and high-contrast, with a cool key light striking her from the upper right to illuminate her profile and the tops of her arms, while deep, soft shadows wrap around her lower body and the folds of the tutu, grounding the figure in three-dimensional space. The image quality is ultra-photorealistic, evoking the texture of 35mm film with fine grain, sharp focus on the subject, and a subtle fall-off in the background.

Result without ControlNet:
ComfyUI_00028_

ControlNet Input image (Pose):
Ballet+Dictionary_+Épaulment+-+Ballet+Manila+Archives

Result with ControlNet:
ComfyUI_00029_

As you can see, the skin looks much better than in my previous example. I recently updated my workflow, and the texture has definitely improved, though some images still show problems (not this one). The main issue they all share is a pixelated background that becomes obvious when you zoom in. It forms a sort of square pattern that’s far too large to be normal pixelation.
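A rough way to put a number on that square pattern, in case it helps with comparing outputs: the sketch below compares the average pixel-to-pixel jump on a candidate grid against the average jump everywhere else. It is only an illustration under assumptions. The function name `blockiness` is hypothetical, the grid is assumed to be aligned to the top-left corner of the image, and the block size has to be guessed (the real size of the pattern is not known here). A score near 1 means no visible grid at that size; a score well above 1 suggests one.

```python
import numpy as np


def blockiness(gray: np.ndarray, block: int) -> float:
    """Ratio of mean |gradient| on a block grid to mean |gradient| elsewhere.

    `gray` is a 2-D array of pixel intensities. `block` is the candidate
    square size in pixels (an assumption, try several values).
    """
    if gray.ndim != 2:
        raise ValueError("expected a 2-D grayscale array")
    if block < 2 or block >= min(gray.shape):
        raise ValueError("block must be >= 2 and smaller than the image")

    g = gray.astype(np.float64)
    # Absolute differences between neighbouring pixels.
    dx = np.abs(np.diff(g, axis=1))  # dx[:, j] = |g[:, j+1] - g[:, j]|
    dy = np.abs(np.diff(g, axis=0))  # dy[i, :] = |g[i+1, :] - g[i, :]|

    # A difference sits on the grid when it crosses a multiple of `block`.
    on_x = (np.arange(dx.shape[1]) + 1) % block == 0
    on_y = (np.arange(dy.shape[0]) + 1) % block == 0

    on_grid = np.concatenate([dx[:, on_x].ravel(), dy[on_y, :].ravel()])
    off_grid = np.concatenate([dx[:, ~on_x].ravel(), dy[~on_y, :].ravel()])

    # Small epsilon avoids division by zero on perfectly flat images.
    return float(on_grid.mean() / (off_grid.mean() + 1e-12))


# Synthetic check: an image built from flat 16x16 squares versus plain noise.
rng = np.random.default_rng(0)
blocky = np.kron(rng.random((8, 8)), np.ones((16, 16))) + 0.01 * rng.random((128, 128))
noise = rng.random((128, 128))
blocky_score = blockiness(blocky, 16)
noise_score = blockiness(noise, 16)
```

To try it on a real output, a grayscale array can be obtained with Pillow, for example `np.asarray(Image.open(path).convert("L"))`, ideally cropped to a background region, and then scored for a few candidate sizes with and without the ControlNet.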

Another recurring problem is the harsh edge where the subject meets the background. It feels like an antialiasing issue, creating a kind of hard outline around the subject. You can see a faint dark outline on the subject’s face here, although this example isn’t too bad. Overall, this image turned out reasonably well, though I suspect I got lucky with the seed.

I’ve been considering running it through a KSampler with a low denoise value to smooth things out, but I’d rather avoid that since my machine isn’t very powerful.

Oops, I forgot, here is the prompt for the first image:

A photo of Sakura, a 17-year-old high school student from Japan, captured in a candid, high-fidelity cinematic moment on a rainy evening. She is squatting low on the rain-slicked asphalt of an urban sidewalk, holding a transparent vinyl umbrella with a white handle resting over her shoulder in one hand, her other hand resting on her knee. The clear plastic canopy is streaked with rivulets of water and beaded with droplets that catch the ambient city light. A profound, silent interaction defines the scene: Sakura is looking directly downward, her expression gentle and focused, locking eyes with a small black cat sitting on the wet ground in front of her.

Sakura has long, lustrous black hair styled in a precise hime cut with blunt bangs across her forehead and sidelocks framing her cheeks, damp strands clinging subtly to her jacket, with a single red ribbon tied on the left side. Her skin has a fine, almost porcelain texture with a cool undertone, visible pores on her nose, and a soft sheen of moisture on her cheeks. She wears a dark navy sailor-style school uniform (seifuku) featuring a white collar with red linear detailing and a bright red necktie loosely knotted at the chest; a simple black choker encircles her neck. The uniform jacket has oversized sleeves. Her lower body features a short, dark pleated miniskirt that fans slightly over clean white ankle socks that provide a stark contrast to the wet asphalt, ending in dark leather loafers that gleam with moisture.

The black cat sits upright in a shallow puddle, its short fur slicked by the rain, tilting its head back to stare intently up into Sakura's face, establishing a clear line of sight. The background is anchored by a large, illuminated red vending machine standing against the darkness, its cool bluish-white interior light spilling onto Sakura’s profile and the umbrella. The ground reflects the red chassis and the neon streetlights in distorted patches on the wet pavement. Additional cool rain streaks fall through the frame, some caught in sharp focus and others blurred into vertical lines against the background lights. The scene is rendered with a wide-aperture lens creating a shallow depth of field, keeping the girl and cat in sharp focus while softening the background into gentle bokeh, with the texture of fine-grain 35mm film stock.

I will download these images and run some tests. Thank you very much for your feedback.
