Commit 8dc7f11

Fix performance issue introduced by torch cuda cache clear during generation
1 parent 4b4111a commit 8dc7f11

1 file changed

ldm/modules/attention.py

Lines changed: 0 additions & 1 deletion
@@ -282,7 +282,6 @@ def einsum_op_cuda(self, q, k, v):
 
     def get_attention_mem_efficient(self, q, k, v):
         if q.device.type == 'cuda':
-            torch.cuda.empty_cache()
             #print("in get_attention_mem_efficient with q shape", q.shape, ", k shape", k.shape, ", free memory is", get_mem_free_total(q.device))
             return self.einsum_op_cuda(q, k, v)
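
For context: torch.cuda.empty_cache() releases the blocks held by PyTorch's caching allocator back to the CUDA driver, so calling it on every attention step forces subsequent allocations to go through the driver again instead of being served from the cache, which is the likely source of the slowdown this commit fixes. Below is a minimal, self-contained timing sketch (not part of this commit; the tensor shapes and iteration count are arbitrary assumptions) comparing an einsum-based attention-score computation with and without a per-iteration cache clear.

import time
import torch

def time_attention_scores(iterations=50, clear_cache=False):
    # Arbitrary example shapes: (batch * heads, tokens, dim_head).
    q = torch.randn(8, 1024, 64, device='cuda')
    k = torch.randn(8, 1024, 64, device='cuda')
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iterations):
        if clear_cache:
            # Returns cached allocator blocks to the driver, so the next
            # allocation has to go through the CUDA driver again.
            torch.cuda.empty_cache()
        _ = torch.einsum('b i d, b j d -> b i j', q, k)  # attention scores
    torch.cuda.synchronize()
    return time.perf_counter() - start

if torch.cuda.is_available():
    print(f"without empty_cache(): {time_attention_scores(clear_cache=False):.3f}s")
    print(f"with empty_cache():    {time_attention_scores(clear_cache=True):.3f}s")

On a typical GPU the clear_cache=True run is noticeably slower, which is consistent with the motivation for dropping the call from get_attention_mem_efficient.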

0 commit comments