You can produce studio-grade product photography with nothing more than a phone camera, a reference shot, and the right AI image models. The workflow replaces lights, backdrops, and most post-production with a combination of FLUX Kontext for object-level editing, gpt-image-1 for photorealistic compositions, and FLUX 1.1 Pro Ultra for high-resolution final renders.
This guide walks through the exact steps we use to go from a raw phone capture to ad-ready product images in under 30 minutes.
What You Need Before You Start
Forget ring lights and seamless paper. Your input requirements are minimal.
- One clean reference photo of the product. Shoot it on any phone against a neutral surface (white table, plain wall). Natural window light works fine. Avoid harsh overhead shadows on the product itself.
- Multiple angles if possible. Three to five angles give AI models more information about shape, texture, and material finish. One angle works for flat-lay items like packaging.
- A written product brief. Specify the material (matte plastic, brushed aluminum, glass), the color precisely (hex codes or Pantone if available), and the target context (kitchen counter, bathroom shelf, outdoor table).
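One way to keep the brief consistent across every later prompt is to store it as structured data. The sketch below is illustrative only: the `ProductBrief` class, its field names, and the sample hex value are assumptions, not part of any tool mentioned in this guide.

```python
import re
from dataclasses import dataclass

# Accepts only six-digit hex colors such as "#B5651D".
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


@dataclass(frozen=True)
class ProductBrief:
    """Hypothetical container for the written product brief."""
    name: str
    material: str
    color_hex: str
    context: str

    def __post_init__(self) -> None:
        # Reject vague color names so every prompt carries an exact value.
        if not HEX_COLOR.match(self.color_hex):
            raise ValueError(f"expected a #RRGGBB color, got {self.color_hex!r}")

    def describe(self) -> str:
        # One reusable sentence fragment for later prompts.
        return (f"{self.name}, {self.material}, "
                f"color {self.color_hex}, placed on a {self.context}")


brief = ProductBrief(
    name="250ml skincare bottle",
    material="amber glass",
    color_hex="#B5651D",  # sample value, not a measured product color
    context="bathroom shelf",
)
print(brief.describe())
```

Validating the hex code up front means a loose color word like "coral" fails early instead of reaching the model.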
Step-by-Step Workflow for AI Product Shots
Step 1: Isolate the product with background removal
Use gpt-image-1 or any segmentation tool to extract the product from your phone photo. The goal is a clean cutout with accurate edge detail. gpt-image-1 handles this well when you prompt it with "remove background, preserve all edge detail including transparent or reflective surfaces, output on transparent background."
Step 2: Generate the scene with FLUX 1.1 Pro Ultra
FLUX 1.1 Pro Ultra outputs at high native resolution (up to roughly 4 megapixels), which matters for ecommerce where buyers zoom into texture. Build your scene prompt with this structure:
[Product description] + [Surface/setting] + [Lighting direction and quality] + [Camera angle and lens] + [Mood/atmosphere]
Example prompt: "A 250ml amber glass skincare bottle on a raw travertine stone slab, soft diffused morning light from camera left, slight caustic reflections on the surface, shot at 85mm f/2.8, shallow depth of field, warm neutral tones, editorial beauty photography."
The specificity of the lighting call (direction, quality, time of day) does most of the work. Saying "studio lighting" gives you flat, generic output. Saying "soft diffused morning light from camera left" gives you dimension.
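The five-part structure above can be enforced with a small helper so no slot gets skipped. This is a minimal sketch: `build_scene_prompt` is a hypothetical function name, and it only assembles text. It does not call any image model.

```python
def build_scene_prompt(product: str, setting: str, lighting: str,
                       camera: str, mood: str) -> str:
    """Assemble a scene prompt from the five slots, rejecting empty ones."""
    slots = {
        "product": product,
        "setting": setting,
        "lighting": lighting,
        "camera": camera,
        "mood": mood,
    }
    cleaned = {}
    for slot, text in slots.items():
        # Trim whitespace and trailing punctuation so the joins stay clean.
        text = text.strip().rstrip(".,")
        if not text:
            raise ValueError(f"the {slot} slot is empty")
        cleaned[slot] = text
    return (f"{cleaned['product']} {cleaned['setting']}, "
            f"{cleaned['lighting']}, {cleaned['camera']}, {cleaned['mood']}.")


prompt = build_scene_prompt(
    product="A 250ml amber glass skincare bottle",
    setting="on a raw travertine stone slab",
    lighting="soft diffused morning light from camera left",
    camera="shot at 85mm f/2.8, shallow depth of field",
    mood="warm neutral tones, editorial beauty photography",
)
print(prompt)
```

Raising on an empty slot is a deliberate choice here: a missing lighting description is exactly the case that produces flat, generic output.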
Step 3: Swap in your actual product with FLUX Kontext
FLUX Kontext handles in-context editing and object replacement. Upload your generated scene and your isolated product cutout. Prompt Kontext to place the product into the scene while preserving the lighting environment. This keeps the shadows, reflections, and color temperature consistent with the generated background.
The key prompt detail here is to specify that lighting and shadow on the product should match the scene. Without this, you get the classic "pasted on" look where the product sits in the wrong light.
Step 4: Refine material accuracy
AI models struggle with specific material finishes. Matte black plastic often renders as glossy, brushed metal loses its directional grain, and frosted glass can come out as clear. Run a second pass in FLUX Kontext or gpt-image-1 targeting only the product surface. Prompt with exact material descriptions: "matte soft-touch finish with zero specular highlights" or "linear brushed aluminum grain running horizontally."
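If you refine several products, a lookup of material phrases keeps the wording exact from run to run. The sketch below is an assumption-heavy illustration: `MATERIAL_PHRASES` and `refinement_prompt` are hypothetical names, the two phrases come from the examples above, and the surrounding instruction text is one possible wording rather than a tested recipe.

```python
# Material phrases taken from the examples in this step.
MATERIAL_PHRASES = {
    "matte plastic": "matte soft-touch finish with zero specular highlights",
    "brushed aluminum": "linear brushed aluminum grain running horizontally",
}


def refinement_prompt(material: str) -> str:
    """Build a second-pass prompt that targets only the product surface."""
    key = material.strip().lower()
    if key not in MATERIAL_PHRASES:
        raise KeyError(f"no material phrase defined for {material!r}")
    # The framing sentences are an assumed wording, not a documented format.
    return (f"Edit only the product surface: {MATERIAL_PHRASES[key]}. "
            "Keep the scene, lighting, and shadows unchanged.")


print(refinement_prompt("Matte Plastic"))
```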
Step 5: Generate lifestyle context variants
Once your hero shot is solid, create lifestyle variants for different ad placements. Change the scene prompt while keeping the product swap step identical. A single product cutout can appear on a marble bathroom counter for Instagram, inside a gym bag for Meta feed ads, and on a clean white background for Amazon listings, all within minutes.
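Generating the variant prompts in a loop keeps the product description fixed while only the scene changes. This is a minimal sketch: the `SCENES` mapping, its keys, and `variant_prompts` are hypothetical, and the code produces prompt text only.

```python
# One scene per ad placement; keys are illustrative labels.
SCENES = {
    "instagram": "on a marble bathroom counter",
    "meta_feed": "inside an open gym bag",
    "amazon": "on a clean white background",
}


def variant_prompts(product: str, scenes: dict) -> dict:
    """Pair one unchanged product description with each placement's scene."""
    return {placement: f"{product} {scene}"
            for placement, scene in scenes.items()}


variants = variant_prompts("A 250ml amber glass skincare bottle", SCENES)
for placement, text in variants.items():
    print(f"{placement}: {text}")
```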
Common Failures and How to Avoid Them
Color drift is the most frequent problem. AI models shift product colors toward what they consider "aesthetically pleasing," which means your coral packaging becomes salmon. Always include hex color values in your prompt and run a final color correction pass against your original reference photo.
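A quick numeric check can flag drift before you correct it by eye. The sketch below compares two hex colors by straight-line distance in RGB space, which is a crude stand-in for a perceptual color difference metric. The function names and the tolerance of 20 are assumptions for illustration, and sampling the rendered color from the image is left out.

```python
import math


def hex_to_rgb(value: str) -> tuple:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected six hex digits, got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(reference: str, rendered: str) -> float:
    """Euclidean distance between two colors in RGB space."""
    return math.dist(hex_to_rgb(reference), hex_to_rgb(rendered))


def has_drifted(reference: str, rendered: str, tolerance: float = 20.0) -> bool:
    # The tolerance is an arbitrary starting point; tune it per product.
    return color_distance(reference, rendered) > tolerance


coral, salmon = "#FF7F50", "#FA8072"  # CSS named-color values
print(round(color_distance(coral, salmon), 2), has_drifted(coral, salmon))
```

For the coral-to-salmon case, the distance comes out at about 34, which exceeds the sample tolerance and would be flagged.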
Scale inconsistency happens when the product looks too large or too small relative to context objects. Include a real-world size reference in your prompt ("250ml bottle next to a standard coffee mug") to anchor proportions.
Text and label distortion remains a weak point for all current models. If your product has readable text on the label, plan to composite it in manually or regenerate until the text is acceptable. gpt-image-1 handles text better than FLUX for label-heavy products, but neither is reliable at small font sizes.
