
RuntimeError: Sizes of tensors must match except in dimension 1 #1320

Open · MathiasHolmstrom opened this issue Jun 2, 2023 · 16 comments

@MathiasHolmstrom

  • PyTorch-Forecasting version: 1.0.0
  • PyTorch version: 2.0.1+cpu
  • Python version: 3.9
  • Operating System: Windows 11

Expected behavior

I executed Baseline().predict(val_dataloader, return_y=True) and did not expect any errors.

Actual behavior

I received the following error:

    return torch.cat(sequences, dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1280 but got size 42 for tensor number 14 in the list.
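
(Not part of the original report: a small illustration of where the numbers come from. With the validation dataloader's batch size of batch_size * 10 = 1280, the predictions apparently split into 14 full batches plus a ragged final batch of 42 samples, and concatenating the per-batch tensors along dim=1 then fails because dimension 0 differs.)

    import torch

    full = torch.zeros(1280, 6)  # a full batch of predictions, shaped [batch, prediction_horizon]
    last = torch.zeros(42, 6)    # the ragged final batch

    torch.cat([full, last], dim=0).shape  # works: torch.Size([1322, 6])
    torch.cat([full, last], dim=1)        # RuntimeError: Sizes of tensors must match except in dimension 1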


Code to reproduce the problem

I am running the following code on an internal dataset:

from pytorch_forecasting import Baseline, TimeSeriesDataSet

max_prediction_length = 6
max_encoder_length = 24
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[data["time_idx"] <= training_cutoff],
    group_ids=["product_number", "sku_size", "retail_sales_channel"],
    time_idx="time_idx",
    target="quantity_sold",
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,  # presumably intended, since the variable is defined above
    max_encoder_length=max_encoder_length,        # presumably intended, since the variable is defined above
    time_varying_known_reals=["time_idx", "discount_rate"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=["quantity_physical_closing"],
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)
validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
batch_size = 128  # set this between 32 and 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=0)
baseline_predictions = Baseline().predict(val_dataloader, return_y=True)

@ntlm1686

ntlm1686 commented Jun 3, 2023

I didn't run the code, but from the error it looks like len(validation) % 1280 == 42 (the validation dataloader uses batch_size * 10 = 1280), i.e. the last batch is a partial one of only 42 samples.

The code handling that case is a bit odd.

@MathiasHolmstrom
Author

So do you know what I can change to make it work?

@ntlm1686

ntlm1686 commented Jun 6, 2023

Just make the dataset length an integer multiple of the batch size.

For example, if your batch size is 64 and the dataset has 6420 samples, drop the last 20 samples.
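
A hedged sketch of one way to get the same effect at the dataloader level instead of trimming the dataset: torch's DataLoader can drop a ragged final batch, and TimeSeriesDataSet.to_dataloader appears to forward extra keyword arguments to DataLoader. Note that for a validation loader this silently excludes the leftover samples from evaluation.

    # assumption: to_dataloader forwards **kwargs (such as drop_last) to torch.utils.data.DataLoader
    val_dataloader = validation.to_dataloader(
        train=False,
        batch_size=batch_size * 10,
        num_workers=0,
        drop_last=True,  # skip the partial last batch instead of failing in concat_sequences
    )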

@MathiasHolmstrom
Author

It's the validation data that fails, so I assume I should drop samples from the validation set? I tried both, though, and neither works.

@adejumobioluwafemi

I am currently facing a similar issue when I try to evaluate the performance of the TFT model.

predictions = best_tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="cpu"))
MAE()(predictions.output, predictions.y)

Please let me know if you find a workaround.

@hippotilt

I'm having the same issue with pretty much the same code :/

@neverfox

neverfox commented Jul 5, 2023

Yes, the code in question (which produces this error) is in the TFT demand example in the documentation.

@hippotilt

hippotilt commented Jul 6, 2023

I've found a fix: modifying the concat_sequences() function in utils.py so that it pads the last sequence tensor with NaNs until its size matches the others. I'm not sure how reliable this is, but with this change my code runs.

from typing import List, Union

import torch
import torch.nn.functional as F
from torch.nn.utils import rnn


def concat_sequences(
    sequences: Union[List[torch.Tensor], List[rnn.PackedSequence]]
) -> Union[torch.Tensor, rnn.PackedSequence]:
    """
    Concatenate RNN sequences.

    Args:
        sequences (Union[List[torch.Tensor], List[rnn.PackedSequence]]): list of RNN packed sequences or tensors of which
            first index are samples and second are timesteps

    Returns:
        Union[torch.Tensor, rnn.PackedSequence]: concatenated sequence
    """
    if isinstance(sequences[0], rnn.PackedSequence):
        return rnn.pack_sequence(sequences, enforce_sorted=False)
    elif isinstance(sequences[0], torch.Tensor):
        # BEGINNING OF MODIFIED CODE
        # If the final batch is smaller than the first one, pad it with NaNs along
        # the sample dimension so that torch.cat along dim=1 no longer fails.
        if sequences[0].size(0) > sequences[-1].size(0):
            delta = sequences[0].size(0) - sequences[-1].size(0)
            sequences[-1] = F.pad(sequences[-1], pad=(0, 0, 0, delta), mode="constant", value=torch.nan)
        # END OF MODIFIED CODE
        return torch.cat(sequences, dim=1)
    elif isinstance(sequences[0], (tuple, list)):
        return tuple(
            concat_sequences([sequences[ii][i] for ii in range(len(sequences))]) for i in range(len(sequences[0]))
        )
    else:
        raise ValueError("Unsupported sequence type")
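
If you want to try this without editing the installed package, a heavily hedged monkey-patching sketch (not from this thread) could look like the following. Whether it takes effect depends on how pytorch-forecasting imports concat_sequences internally: a module that does "from pytorch_forecasting.utils import concat_sequences" binds the original function at import time and would need to be patched directly instead.

    import pytorch_forecasting.utils as pf_utils

    # replace the packaged helper with the NaN-padding version defined above
    pf_utils.concat_sequences = concat_sequences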

@DaniloMendezR

I've been struggling with a similar problem for a long time. What worked for me (I don't know if it makes mathematical sense) was to lower the batch size to the number the error reports, in your case 42.

Hope this helps

@abudis

abudis commented Jul 25, 2023

Please see my comment here - #449 (comment).

If you don't need the ys (it's easy to assemble them yourself), setting return_y=False avoids the issue.

@hippotilt thanks! I tracked down the problem to this function as well. It would be nice if something similar were merged upstream so that we don't have to patch it in our own code.
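
A minimal sketch of the return_y=False route, assuming the validation dataloader yields (x, (target, weight)) pairs and that all prediction windows have the same length so the targets stack along the batch dimension; the names below refer to the objects from the earlier comments and are illustrative, not a confirmed pytorch-forecasting recipe.

    import torch
    from pytorch_forecasting.metrics import MAE

    preds = best_tft.predict(val_dataloader, return_y=False, trainer_kwargs=dict(accelerator="cpu"))
    output = preds.output if hasattr(preds, "output") else preds  # return type may vary by version

    # assemble the targets yourself from the dataloader instead of relying on return_y
    actuals = torch.cat([y[0] for x, y in iter(val_dataloader)])
    print(MAE()(output, actuals))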

@Meet1995

Meet1995 commented Nov 12, 2023

I encountered the same error and narrowed down the issue, as mentioned by many above, to the concat_sequences function in utils.py. The following fix worked for me:

def concat_sequences(
    sequences: Union[List[torch.Tensor], List[rnn.PackedSequence]]
) -> Union[torch.Tensor, rnn.PackedSequence]:
    """
    Concatenate RNN sequences.

    Args:
        sequences (Union[List[torch.Tensor], List[rnn.PackedSequence]]): list of RNN packed sequences or tensors of which
            first index are samples and second are timesteps

    Returns:
        Union[torch.Tensor, rnn.PackedSequence]: concatenated sequence
    """
    if isinstance(sequences[0], rnn.PackedSequence):
        return rnn.pack_sequence(sequences, enforce_sorted=False)
    elif isinstance(sequences[0], torch.Tensor):
        return torch.cat(sequences, dim=0)  # changed from dim=1 to dim=0
    elif isinstance(sequences[0], (tuple, list)):
        return tuple(
            concat_sequences([sequences[ii][i] for ii in range(len(sequences))]) for i in range(len(sequences[0]))
        )
    else:
        raise ValueError("Unsupported sequence type")

Just changing the concat dimension to 0 (the axis containing the batches) fixes the error. I am not sure how this function is used elsewhere in the package and hope it does not break things in those places.

@deltawi

deltawi commented Jan 15, 2024

I am currently facing a similar issue when I try to evaluate the performance of the TFT model.

predictions = best_tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="cpu"))
MAE()(predictions.output, predictions.y)

Please let me know if you find a workaround.

Same issue here; I can't predict all my examples because their count isn't a multiple of batch_size. It would be great to have a fix for this.

@darui1223

I also encountered this problem, so how should I fix it?

@darui1223

I am currently facing a similar issue when I try to evaluate the performance of the TFT model.
predictions = best_tft.predict(val_dataloader, return_y=True, trainer_kwargs=dict(accelerator="cpu"))
MAE()(predictions.output, predictions.y)
Please let me know if you find a workaround.

Same issue here; I can't predict all my examples because their count isn't a multiple of batch_size. It would be great to have a fix for this.

Have you solved the problem?

@darui1223

Just changing the concat dimension to 0 (the axis containing the batches) fixes the error. I am not sure how this function is used elsewhere in the package and hope it does not break things in those places.

Hi, I changed the dimension to 0 following your method, but the error is still reported. Did you run it successfully?

@darui1223

I encountered the same error and narrowed down the issue, as mentioned by many above, to the concat_sequences function in utils.py. The following fix worked for me: [...] Just changing the concat dimension to 0 (the axis containing the batches) fixes the error. I am not sure how this function is used elsewhere in the package and hope it does not break things in those places.

Thank you! I solved the problem your way!
