Hey Fam, that’s a very good concern, and well articulated.
We’ve probably both seen those images where people take a famous person and stretch them out in Photoshop, and it’s like - “Why, man? That’s a real person, don’t treat them like that.” So I think I’m with you on the core of the principle.
I won’t try to sell you on Stable Diffusion (SD) as a legit method of creating media, but I do want to walk you through a bit of my workflow so you know why I’m not really concerned about victimizing real people. You probably know that an SD image starts with random noise, which the model then iteratively refines according to its understanding of the prompts the user fed it. If you run a batch of thirty attempts you get thirty different results, most of them rubbish, because there’s just no way to square the circle between the starting randomness and the prompt’s intent.
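To make the "start from noise, iterate toward the prompt" idea concrete, here’s a toy sketch - not real diffusion, no actual model - where each step just nudges a random starting point toward a target vector standing in for "what the model thinks the prompt means." Different seeds give different starting noise, which is why a batch of thirty runs gives thirty different results:

```python
import random

random.seed(0)  # a different seed = a different image in the batch

def toy_denoise(target, steps=30):
    """Toy stand-in for denoising: start random, step toward the target."""
    x = [random.gauss(0, 1) for _ in target]          # random starting noise
    for _ in range(steps):
        # each "denoising" step closes 20% of the remaining gap to the target
        x = [xi + 0.2 * (ti - xi) for xi, ti in zip(x, target)]
    return x

target = [1.0, 1.0, 1.0, 1.0]   # placeholder for the prompt's "intent"
result = toy_denoise(target)
gap = max(abs(r - t) for r, t in zip(result, target))
# after 30 steps the remaining gap shrinks by 0.8**30, so it's tiny
```

The point of the sketch is only the shape of the process: the output is a compromise between random initial conditions and the prompt, which is why so many batch results land wide of the mark.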
I use blending in my prompts: telling the system “don’t structure this bit of the prompt 100% in way X; assemble it with a 50/50 consideration between these two disparate elements.” To keep a character model consistent across multiple images, most people use a name, and since the models are heavily trained on imagery available on the ’net, famous people get used a lot. My prompts are ‘blended’ versions: favoring characters as portrayed over the people who portray them (or politicians, or models, or whatever), and never saying “use famous sitcom character X” but rather “use something like famous sitcom character X, but equally character Y, [etc].”

I did a bunch of A/B testing and found that blended prompts affect the final output in strongly recognizable ways across any number of generated images, but the output has only the vaguest relationship to the individuals that make up the blend - roughly what part of the world their ancestors came from. The eyes, jaw structure, cheekbones, and so on usually come out different. Because the new blended character iterates consistently across multiple output images, I’m confident it’s not just randomly drawing features from some unmentioned person in the model’s training data: those new structural features derive from the blended origin points, but they aren’t observably correlated in a way that would let me say “okay, that’s character X’s jawline.”

THEN the prompt gets a bunch of overriding modifications to individual elements - hair style, hair color, eye color, etc. - that push it even further out of any already-not-perceptible-by-me alignment between the cited characters and the image output.
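For a sense of what a blended prompt can look like mechanically, here’s a hypothetical little helper that splits attention equally between several characters using the “(term:weight)” emphasis syntax that several SD front ends support. The character names, the helper, and the 50/50 split are all placeholders, not my actual prompts:

```python
def blend(characters, weight=None):
    """Give each character an equal share of attention instead of 100%.

    Hypothetical helper: emits "(name:w)" emphasis terms, where w defaults
    to an even split across all the listed characters.
    """
    w = weight if weight is not None else round(1.0 / len(characters), 2)
    return " ".join(f"({name}:{w})" for name in characters)

prompt = blend(["sitcom character X", "character Y"])
# → "(sitcom character X:0.5) (character Y:0.5)"
```

The overriding modifications (hair style, eye color, etc.) would then just be appended to that blended base as ordinary full-weight prompt terms.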
So, it’s like doing one of those “What would their adult offspring look like” photo collages, but only with images of people already made up to look like other people, running a few generations deep (“a+b=x, c+d=y, x+y=z, use z”), changing elements with a sharpie, and then trying to reconstitute the resulting image from scratch out of a pile of random connect-the-dot puzzles.
I’m fairly confident I’m not just Ship-Of-Theseus-ing an existing person with a few new planks. It’s more like an Armada-of-Theseus broke up on the reefs, and the ship we built from the rubble is half virgin wood anyway.