Building a Realistic AI Shopping Story: Scene-by-Scene Prompts

Most AI images look good in isolation, but when you line them up, they often don’t feel connected. The face changes slightly, the mood shifts, the camera style drifts, and the “story” disappears. I wanted to test a different approach — build a simple shopping journey using tightly controlled prompts and keep the same person consistent across every frame.
Instead of chasing dramatic cinematic effects, the focus here was everyday realism: smartphone camera feel, neutral lighting, natural posture, and believable behavior. Think less photoshoot, more real-life moments.
The sequence follows one subject from outside the store to the final purchase decision.

Scene 1: Outside the Store

Starting point: street presence, relaxed posture, clear identity. This frame locks in the character and outfit so later scenes don’t drift.

Prompt Used:

💬 AI Prompt

Ultra-realistic smartphone street portrait of a young adult man leaning casually against a white urban utility pole. He stands relaxed, one hand inside his jeans pocket, the other hanging naturally while holding a smartphone without using it.

He looks directly into the camera with a cool, masculine, confident expression — calm, composed, natural, and self-assured. His presence feels relaxed and unforced.

He has short, slightly messy black hair with soft texture and light natural volume, natural strand definition, soft movement, not stiff, not overly styled.

He wears an oversized muted plum-brown plaid shirt with rolled sleeves, a dark inner t-shirt, loose washed blue jeans with subtle thigh fading, and a thin metal bracelet.

He carries a vintage-style black parachute messenger bag worn crossbody. The bag is made from thick yet flexible parachute fabric with a casual, streetwear and grunge-inspired aesthetic. It has two medium-sized front flap pockets and two open side pockets. The adjustable strap has a black buckle. The bag is positioned behind his body, subtly visible from the viewer’s angle, not dominant in the frame.

The background shows the front of a clothing store named “MADE IN PAKISTAN” in Multan, Pakistan. A large black storefront sign with a yellow leaf logo is clearly visible across the facade. The building features a façade of weathered red brick and warm terracotta tiles, with green plants hanging from the second-floor balcony and deep sheesham wood jali screens, through which drapes of richly embroidered fabrics flow gently in the breeze. A grand, arched entryway, reminiscent of Mughal architecture, is framed by vibrant Multani blue-and-white tile work.

In front of the store, instead of mannequins, are takhts (low traditional seating platforms) covered in vibrant Ajrak-patterned textiles. A pair of tall, ornate brass oil lamps (Chiragh) flank the entrance, and the air is filled with the subtle scent of jasmine and cardamom.

Several motorcycles are parked in front, one with a DK license plate. A white banner reads: “گاہک اپنی سواری اور سامان کی خود حفاظت کریں شکریہ.” Potted plants separate the parking area, and the sidewalk uses reddish paving blocks. A busy commercial street is visible in the distance.

Shot in bright daylight, minimal harsh shadows, evenly lit face, natural skin tones. Deep depth of field, no background blur. Smartphone camera look, realistic dynamic range, slightly natural HDR processing, high detail, sharp focus across the entire frame, authentic urban Multan atmosphere, photorealistic.

NEGATIVE PROMPT:
blurry background, shallow depth of field, bokeh, cinematic blur, dramatic lighting, harsh shadow on face, strong contrast shadows, underexposed, overexposed highlights, studio lighting, artificial light glow, rim light, lens flare, soft focus, gaussian blur, low resolution, noise, grainy texture, oversharpened, overprocessed skin, plastic skin, waxy face, airbrushed skin, unrealistic skin texture, beauty filter, heavy color grading, teal and orange tone, cinematic color tone, moody dark tone, vintage filter, sepia tone

hat, cap, headwear, beanie, hood covering head

stiff hair, overly styled hair, pompadour, slick back, wet hair look, exaggerated volume

exaggerated muscles, bulky bodybuilder physique, skinny disproportionate body, bad anatomy, distorted hands, extra fingers, missing fingers, deformed hands, unnatural pose

bag in front of body, dominant bag in frame, oversized bag, incorrect bag material, leather bag, shiny bag, glossy finish, tactical military bag, backpack

wrong storefront name, incorrect text, misspelled sign, unreadable text, random logo, fictional brand name

empty street, missing motorcycles, missing DK plate, missing banner text, mannequins, Balinese umbrella, blurred signage

night scene, sunset lighting, cloudy dark sky, rain, wet pavement reflections

dramatic cinematic depth, DSLR portrait effect, telephoto compression, extreme wide angle distortion

What matters here isn’t just style; it’s the consistency anchors: hair, clothing layers, bag position, and lighting type.
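
One way to keep those anchors from drifting is to store each one once and assemble every scene prompt from the same pieces. The following is a minimal Python sketch of that idea, not the workflow actually used for this post. The names `ANCHORS`, `Scene`, and `build_prompt` are hypothetical, and the anchor text is abbreviated from the Scene 1 prompt.

```python
from dataclasses import dataclass

# Descriptors that should stay word-for-word identical in every frame
# (abbreviated from the Scene 1 prompt; the dictionary keys are hypothetical).
ANCHORS = {
    "hair": "He has short, slightly messy black hair with soft texture and light natural volume.",
    "outfit": (
        "He wears an oversized muted plum-brown plaid shirt with rolled sleeves, "
        "a dark inner t-shirt, loose washed blue jeans, and a thin metal bracelet."
    ),
    "bag": (
        "He carries a vintage-style black parachute messenger bag worn crossbody, "
        "positioned behind his body."
    ),
    "camera": "Smartphone camera look, deep depth of field, sharp focus across the entire frame.",
}


@dataclass
class Scene:
    opening: str  # who is in the frame and what kind of shot it is
    action: str   # the scene-specific behavior
    setting: str  # the scene-specific environment


def build_prompt(scene: Scene) -> str:
    """Join scene-specific text with the shared anchors, one paragraph each."""
    parts = [
        scene.opening,
        ANCHORS["hair"],
        ANCHORS["outfit"],
        ANCHORS["bag"],
        scene.action,
        scene.setting,
        ANCHORS["camera"],
    ]
    return "\n\n".join(parts)


browsing = Scene(
    opening="Ultra-realistic smartphone photo of the same young adult man.",
    action="He slides one hanger forward and checks the fabric with his fingers.",
    setting="Modern retail interior only, no exterior view.",
)
prompt = build_prompt(browsing)
print(prompt)
```

Only the `Scene` fields change between frames, so a wording change to the hair or bag description is made once and reaches every prompt.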

Scene 2: Browsing the Rack

The key change is behavioral: he is no longer posing, just interacting with the clothing. Small action details make AI scenes feel less staged.

Prompt Used:

💬 AI Prompt

Ultra-realistic smartphone photo of the same young adult man with identical facial geometry and appearance across frames.
He has short, slightly messy black hair with soft texture and light natural volume, natural strand definition, soft movement, not stiff, not overly styled. No glasses.
He wears an oversized muted plum-brown plaid shirt with rolled sleeves, a dark inner t-shirt, loose washed blue jeans with subtle thigh fading, and a thin metal bracelet.
He carries a vintage-style black parachute messenger bag worn crossbody. The bag is made from thick yet flexible parachute fabric with a casual, streetwear and grunge-inspired aesthetic. It has two medium-sized front flap pockets and two open side pockets. The adjustable strap has a black buckle. The bag is positioned behind his body, subtly visible from the viewer’s angle, not dominant in the frame.
He is inside a modern clothing shop, standing beside a neatly organized clothing rack. He slides one hanger forward and checks the fabric with his fingers, inspecting quality and stitching. Relaxed buyer behavior, natural posture, not posing, attention focused on the garment.
Modern retail interior only — clean boutique layout, organized racks, folded garments on shelves, mirrors, track lighting, neutral walls, no exterior view, no storefront facade, no street elements visible.
Bright indoor retail lighting, evenly lit face, natural skin tones. Smartphone camera realism, deep depth of field, sharp focus across entire frame.

When prompts include micro-actions (touching fabric, sliding a hanger), results usually look more natural.

Scene 3: Mirror Check

Mirror shots are useful because they test identity consistency. If the reflection breaks, the prompt control isn’t tight enough.

Prompt Used:

💬 AI Prompt

Ultra-realistic smartphone photo of the same young adult man with identical facial geometry and appearance across frames.
He has short, slightly messy black hair with soft texture and light natural volume, natural strand definition, soft movement, not stiff, not overly styled.
He wears an oversized muted plum-brown plaid shirt with rolled sleeves, a dark inner t-shirt, loose washed blue jeans with subtle thigh fading, and a thin metal bracelet. No glasses.
He carries a vintage-style black parachute messenger bag worn crossbody. The bag is made from thick yet flexible parachute fabric with a casual, streetwear and grunge-inspired aesthetic. It has two medium-sized front flap pockets and two open side pockets. The adjustable strap has a black buckle. The bag is positioned behind his body, subtly visible from the viewer’s angle, not dominant in the frame.
He stands in front of a tall full-length store mirror, holding a designer shirt on a hanger at chest height and comparing it visually. He looks at the mirror reflection instead of the camera. Posture relaxed and natural.
Reflection shows correct face and outfit without distortion. Messenger bag remains behind body, only partially visible in reflection edge.
Modern boutique interior — mirror wall, clothing racks, folded stacks, fitting room corridor, modern fixtures, indoor-only environment.
Even indoor lighting, smartphone HDR realism, deep depth of field, no blur.

This scene adds a decision moment without needing dramatic expression.

Scene 4: Checkout Counter

End of the sequence: selection made, price checked, purchase likely. A quiet ending works better than an exaggerated “final pose.”

Prompt Used:

💬 AI Prompt

Ultra-realistic smartphone photo of the same young adult man with identical facial geometry and appearance across frames.

He has short, slightly messy black hair with soft texture and light natural volume, natural strand definition, soft movement, not stiff, not overly styled. No glasses.

He wears an oversized muted plum-brown plaid shirt with rolled sleeves, a dark inner t-shirt, loose washed blue jeans with subtle thigh fading, and a thin metal bracelet.

He carries a vintage-style black parachute messenger bag worn crossbody. The bag is made from thick yet flexible parachute fabric with a casual, streetwear and grunge-inspired aesthetic. It has two medium-sized front flap pockets and two open side pockets. The adjustable strap has a black buckle. The bag is positioned behind his body, subtly visible from the viewer’s angle, not dominant in the frame.

He stands at a modern checkout counter inside the clothing shop, holding the selected garment folded over his forearm while checking the price tag. Natural decision moment, calm and confident expression.

Counter shows POS terminal, folded garments, price tags, minimal retail accessories. Background contains organized racks and shelves only — no exterior visibility.

Bright indoor retail lighting, evenly exposed face, smartphone camera look, deep focus across frame.

Conclusion:

What I learned from this experiment is that realism comes more from continuity than detail. You can have ultra-sharp textures and still get a fake-looking series if identity and behavior change from frame to frame. Repeating the same physical descriptors, camera style, and lighting rules across prompts matters more than adding extra visual flair.
If you’re building prompt-based stories, plan the sequence first. Treat each image like a scene, not a standalone poster. The difference shows immediately.
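
Planning the sequence first also makes drift easy to catch before anything is generated. As a rough sketch, a small check can report which required descriptors are missing from which scene. The function name `find_missing_anchors` is hypothetical, and the prompts below are shortened stand-ins rather than the full text used in this post. The example mirrors one real difference in the prompts above: the later scenes add "No glasses." while the Scene 1 prompt does not.

```python
from __future__ import annotations


def find_missing_anchors(prompts: dict[str, str], anchors: list[str]) -> dict[str, list[str]]:
    """Return, per scene, the anchor phrases that do not appear verbatim in its prompt."""
    report: dict[str, list[str]] = {}
    for scene_name, prompt in prompts.items():
        missing = [anchor for anchor in anchors if anchor not in prompt]
        if missing:
            report[scene_name] = missing
    return report


# Phrases that every frame is expected to repeat word for word.
anchors = [
    "short, slightly messy black hair",
    "oversized muted plum-brown plaid shirt",
    "black parachute messenger bag worn crossbody",
    "No glasses.",
]

# Shortened stand-ins for two of the scene prompts (assumption: abbreviated for the example).
prompts = {
    "outside": (
        "He has short, slightly messy black hair. "
        "He wears an oversized muted plum-brown plaid shirt. "
        "He carries a black parachute messenger bag worn crossbody."
    ),
    "rack": (
        "He has short, slightly messy black hair. No glasses. "
        "He wears an oversized muted plum-brown plaid shirt. "
        "He carries a black parachute messenger bag worn crossbody."
    ),
}

report = find_missing_anchors(prompts, anchors)
print(report)  # {'outside': ['No glasses.']}
```

The check is a plain substring match, so it only catches descriptors that were dropped or reworded, which is the kind of drift that tends to change the face or outfit between frames.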

This blog post and AI prompts were created by Shahbaz Ahmad.
Follow me on TikTok @Dudefrompak for more ready-to-use prompts.
