Art Direction query

Hi cats. Hi kittens.

You’ve booted up your newly downloaded Ren’Py game / VN, and after the obligatory introduction the young lady who catches the protagonist’s interest (and vice versa) appears on screen for the first time. Does she look like :

A - [image]

B - [image]

C - [image]

D - [image]

E - [image]

Or F - Any of those are fine; I’d rather the programming team use the easiest technique so there are more images in total.

One gained-weight image for each option is down in the reply section below, since how the characters develop is probably also a consideration in your vote.

  • A
  • B
  • C
  • D
  • E
  • F

A little early-to-midgame update, so you’ve got two examples per option.

A : [image]

B : [image]

C : [image]

D : [image]

E : [image]

I pick the following two options:
G - no AI slop.
H - the creator chooses their own art style.
But of those presented, I guess D.

3 Likes

There’s certainly more ‘sloppiness’ here than in the intended final work. I pushed all these options through rough-draft versions of what I’d need for the full game, to confirm there weren’t hard cutoffs in the stylistic choices inherent in the modeling / prompts / LoRAs of each, but I didn’t finalize any of them, since that’d be 7x the work of what I’ll end up needing.

(If the poll results don’t end up crazy-tilted, I’ll put my thumb on the scale a little when the time comes to get really entrenched in one option.)

I’m fine with AI images in principle, but once I catch a tell, I’m inspecting the hands, eyes, and teeth in every image. It lessens the fun, but honestly I can’t stop myself.

Usually I prefer anime styles, but in the batch you presented I happened to like a realistic image the most.

Anyways, good luck with the project!

I would like to see more use of a Western comic art style, with a bit of a realistic model blended in to help with things like skin detail; the proliferation of mediocre animesque stuff has kind of blunted my enthusiasm for it. Essentially, something in between A on the first list and D on the second.

Two examples of stuff I’ve generated that come closest to what I’m getting at:

[images]

3 Likes

Those are great! Your second example pings closer to my current conception of my idealized output than any of the examples I’ve put together so far. If you’re willing to pass on the model / settings / prompt / LoRAs used, it would probably up my game.

Re : the proliferation of mediocre animesque stuff (hidden to reduce contamination of the poll; please vote before reading)

I'm not a fan of the style of example E above, myself. I included it in case I was in a wildly different headspace than the majority of folks here, which (fortunately) doesn't seem to be the case. EDIT : Huh, major swings in the voting. Now (Feb 5th) E is a close #2.

I do not support the use of AI image generators, but I cannot stop you either, so, at the very least, do not use B or D. Those faces are cobbled together from photos of real human beings, and turning real people into porn like this is pretty fucked up, to say the least.

4 Likes

Hey Fam, that’s a very good concern, and well articulated.

We’ve probably both seen those images where people just take a famous person and stretch them out in Photoshop, and it’s like - “Why, man? That’s a real person, don’t treat them like that.” So I think I’m with you, at the core of the principle.

I won’t try to sell you on Stable Diffusion (SD) as a legit method of creating media, but I do want to throw you a bit of my workflow so you know why I’m not really concerned about victimizing real people. You probably know that an SD image starts as random noise, which the model then iteratively refines according to its understanding of the prompts the user fed it. If you run a batch of thirty attempts you get thirty results, most of them rubbish, because there’s just no way to square the circle between the starting randomness and the prompt intent.
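For the curious, the generation loop is roughly the following - a bare-bones sketch using the diffusers library, not my actual setup, and the model ID and prompt are just stand-ins:

```python
# Bare-bones batch generation with Hugging Face diffusers (illustrative only;
# the model ID and prompt are stand-ins). Each seed fixes a different starting
# noise tensor, so one prompt yields thirty different candidates.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "portrait of a young woman, visual novel sprite, soft lighting"
for seed in range(30):
    noise = torch.Generator("cuda").manual_seed(seed)  # the starting randomness
    image = pipe(prompt, generator=noise).images[0]
    image.save(f"candidate_{seed:02d}.png")  # most are rubbish; keep the rare hits
```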

I use blending in my prompts - telling the system “Don’t structure this bit of the prompt 100% in way X; assemble it with a 50/50 consideration between these two disparate elements.” To maintain consistency in a character model across multiple images, most people use a name, and since the models are heavily trained on imagery available on the ’net, famous people get used a lot. My prompts are ‘blended’ versions: favoring fictional characters as portrayed by performers over the real people themselves (actors, politicians, models, whatever), and not saying “use famous sitcom character X” but “use something like famous sitcom character X, but equally character Y, [etc]”.

I did a bunch of A-B testing and found that blended prompts aggressively affect the final output in recognizable ways across any number of generated images, but have only the vaguest relationship to the individuals that make up bits of the blend - roughly, what part of the world their ancestors came from. Usually the result has different eyes, jaw structure, cheekbones, etc. Because the new blended character iterates consistently across multiple output images, I’m confident it’s not just randomly drawing features from some unmentioned person that was part of the model training: those new structural features derive from the blended origin points, but aren’t observably correlated in a way that lets me say “okay, that’s character X’s jawline.”

THEN the prompt gets a bunch of overriding modifications to individual elements - hair style, color, eye color, etc - that push it even further out of any already-not-perceptible-by-me alignment between the cited characters and the image output.
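To make the blending concrete, here’s its rough shape, assuming AUTOMATIC1111-WebUI-style prompt syntax; the character names and traits below are stand-ins, not my real prompt:

```python
# The shape of a blended prompt, assuming AUTOMATIC1111 WebUI syntax.
# (Names and traits are stand-ins, not the real prompt.)
#
# [A|B] alternates between the two tokens on every sampling step, which in
# practice averages the identities; "A :0.5 AND B :0.5" (Composable Diffusion)
# blends the two conditionings outright instead.
blend = "[famous sitcom character X|famous sitcom character Y]"

# Then the overriding element edits, which push the composite further away
# from either source: hair, eyes, build, and so on.
prompt = f"portrait of {blend}, long red hair, green eyes, freckles, tall"
```

Going “a few generations deep” just nests the same trick - blend two blends, then blend those.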

So, it’s like doing one of those “What would their adult offspring look like” photo collages, but only with images of people already made up to look like other people, running a few generations deep (“a+b=x, c+d=y, x+y=z, use z”), changing elements with a sharpie, and then trying to reconstitute the resulting image from scratch out of a pile of random connect-the-dot puzzles.

I’m fairly confident I’m not just Ship-Of-Theseus-ing an existing person with a few new planks. It’s more like an Armada-of-Theseus broke up on the reefs, and the ship we built from the rubble is half virgin wood anyway.

1 Like