[bug]: Flux Fill inpainting does not work #7836

Open
1 task done
pelladigabor opened this issue Mar 25, 2025 · 5 comments · May be fixed by #7843
Labels
bug Something isn't working

Comments

@pelladigabor

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

GTX 1660

GPU VRAM

6GB

Version number

5.9.0rc2-cuda

Browser

Firefox 136.0.2

Python dependencies

{
"accelerate": "1.0.1",
"compel": "2.0.2",
"cuda": "12.4",
"diffusers": "0.31.0",
"numpy": "1.26.3",
"opencv": "4.9.0.80",
"onnx": "1.16.1",
"pillow": "11.0.0",
"python": "3.11.11",
"torch": "2.4.1+cu124",
"torchvision": "0.19.1+cu124",
"transformers": "4.46.3",
"xformers": null
}

What happened

I wanted to test the new pre-release with FLUX Fill support. I downloaded the model from the starter models, but I cannot get it to work for inpainting.

When I create a raster layer and an inpainting mask layer, select the new FLUX Fill main model, draw an area to inpaint, and press Invoke, the following error is shown:

Server Error
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16

I did not use any other layers, controls, IP adapter, Lora, nothing else.
The console log:

[2025-03-25 08:46:42,143]::[InvokeAI]::ERROR --> Error while invoking session a51cf2cc-e97c-4880-9d36-8776aa61815f, invocation b8b4c3dc-47d6-4513-bf2d-6e3c50524965 (flux_denoise): mat1 and mat2 must have the same dtype, but got Float and BFloat16
[2025-03-25 08:46:42,143]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/opt/invokeai/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/baseinvocation.py", line 303, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/flux_denoise.py", line 156, in invoke
    latents = self._run_diffusion(context)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/app/invocations/flux_denoise.py", line 380, in _run_diffusion
    x = denoise(
        ^^^^^^^^
  File "/opt/invokeai/invokeai/backend/flux/denoise.py", line 75, in denoise
    pred = model(
           ^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/flux/model.py", line 110, in forward
    img = self.img_in(img)
          ^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/torch_module_autocast/custom_modules/custom_linear.py", line 82, in forward
    return self._autocast_forward(input)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/invokeai/invokeai/backend/model_manager/load/model_cache/torch_module_autocast/custom_modules/custom_linear.py", line 76, in _autocast_forward
    return torch.nn.functional.linear(input, weight, bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
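
For reference, this failure can be reproduced outside InvokeAI: torch.nn.functional.linear requires the input and the weight to share a dtype. A minimal illustrative sketch (not InvokeAI code):

import torch

weight = torch.randn(8, 4, dtype=torch.bfloat16)  # model weight loaded in bfloat16
bias = torch.zeros(8, dtype=torch.bfloat16)
x = torch.randn(2, 4, dtype=torch.float32)        # activation left in float32

# Raises a RuntimeError like the one above:
# mat1 and mat2 must have the same dtype, but got Float and BFloat16
torch.nn.functional.linear(x, weight, bias)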

What you expected to happen

The FLUX Fill model can be used for inpainting.

How to reproduce the problem

This issue always occurs when the FLUX Fill main model is selected, both on the canvas and in workflows.

Additional context

No response

Discord username

No response

pelladigabor added the bug label Mar 25, 2025
@psychedelicious
Collaborator

Please share the contents of your invokeai.yaml file

@pelladigabor
Author

This is it:

# Internal metadata - do not edit:
schema_version: 4.0.2

# Put user settings here - see https://invoke-ai.github.io/InvokeAI/configuration/:
enable_partial_loading: true
device_working_mem_gb: 3.5

@psychedelicious
Collaborator

Thanks, I suspect this is related to the GTX 1660's limited support for float16 and bfloat16 operations. These are the low-level mathematical operations used by the diffusion process.

We force the GTX 1660 to use float32 operations for compatibility, but I see that for FLUX we force the diffusion process to use bfloat16. This may be the issue, but I'm not confident about changing it.

@brandonrising Do you recall whether FLUX strictly requires bfloat16? Can we use TorchDevice.choose_torch_dtype() instead? See

inference_dtype = torch.bfloat16

for where we force it to bfloat16.
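
For illustration, a device-aware choice could look roughly like the sketch below. This is a hypothetical helper written with plain torch calls, not the actual TorchDevice.choose_torch_dtype() implementation:

import torch

def pick_inference_dtype(device: torch.device) -> torch.dtype:
    # Hypothetical sketch: prefer bfloat16 only where it is natively supported
    # (compute capability 8.0+, i.e. Ampere / RTX 30xx and newer); otherwise
    # fall back to float32, e.g. for the GTX 1660 (compute capability 7.5).
    if device.type == "cuda" and torch.cuda.get_device_capability(device) >= (8, 0):
        return torch.bfloat16
    return torch.float32

inference_dtype = pick_inference_dtype(
    torch.device("cuda" if torch.cuda.is_available() else "cpu")
)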

@pelladigabor
Author

Thanks, what you wrote should be the cause of my issue. I have checked: the NVIDIA 20xx- and 16xx-series GPUs have CUDA compute capability 7.5, but according to the NVIDIA docs, compute capability 8.0 is required for bfloat16, and that first appears in the RTX 30xx series and up.
Side note: I can run all the other FLUX models in InvokeAI with this card, even the unquantized dev model, as long as the bbox resolution is not too high.
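
For anyone who wants to verify this on their own hardware, the compute capability can be queried directly from torch (a quick check, not InvokeAI code):

import torch

props = torch.cuda.get_device_properties(0)
print(props.name, f"compute capability {props.major}.{props.minor}")
# GTX 1660 reports 7.5; native bfloat16 arrives with 8.0 (Ampere / RTX 30xx)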

psychedelicious added a commit that referenced this issue Mar 26, 2025
This resolves an issue where specifying `float32` precision causes FLUX Fill to error.

I noticed that our other customized torch modules do some dtype casting themselves, so maybe this is a fine place to do this? Maybe this could break things...

See #7836
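
The shape of the fix described in the commit message is, roughly, casting the activation to the weight's dtype inside the custom linear wrapper. A hedged sketch of the idea (not the actual PR code; the real custom_linear.py differs):

import torch

def autocast_linear_forward(input: torch.Tensor, weight: torch.Tensor,
                            bias: torch.Tensor | None = None) -> torch.Tensor:
    # Sketch: align the input dtype with the weight dtype before the matmul,
    # mirroring the dtype casting other customized torch modules already do.
    if input.dtype != weight.dtype:
        input = input.to(weight.dtype)
    return torch.nn.functional.linear(input, weight, bias)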
psychedelicious linked a pull request Mar 26, 2025 that will close this issue
@psychedelicious
Collaborator

Interesting, I would have expected all FLUX models to fail.

I was able to reproduce the issue on my 4090 by manually setting precision: float32.

That helped me understand the issue well enough to find a fix. I've drafted the fix, but I'm not confident it's the right way to handle it.
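
For anyone reproducing this on capable hardware, that presumably amounts to adding the setting to invokeai.yaml (a sketch following the config format shown earlier in the thread):

# invokeai.yaml - force full precision to reproduce the dtype mismatch
precision: float32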
