Commit ff55d01

[nofree] Restrict semantics to memory visible to caller
This patch clarifies the semantics of the nofree function attribute to make clear that it provides an "as if" semantic. That is, a nofree function is guaranteed not to free memory which existed before the call, but might allocate and then deallocate that same memory within the lifetime of the callee.

This is the result of the discussion on llvm-dev under the thread "Ambiguity in the nofree function attribute".

The most important part of this change is the LangRef wording. The rest is minor comment changes to emphasize the new semantics where code was accidentally consistent, and fix one place which wasn't consistent. That one place is currently narrowly used as it is primarily part of the ongoing (and not yet enabled) deref-at-point semantics work.

Differential Revision: https://reviews.llvm.org/D100141
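The "as if" semantic described in the commit message can be illustrated with a small C++ sketch. The helper name `sumViaScratchCopy` is hypothetical and is not part of the patch. It shows a function that frees only memory it allocated during the call, which is the behavior the clarified wording permits for a `nofree` function, while memory that existed before the call is left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not part of the patch. It never frees memory that
// existed before the call; it only frees the scratch buffer it allocated
// itself, which the clarified nofree wording permits.
// Precondition (assumed for this sketch): Data is a valid, non-null pointer.
int sumViaScratchCopy(const unsigned char *Data, std::size_t N) {
  // Allocate at least one byte so a zero-length request is still valid.
  unsigned char *Scratch =
      static_cast<unsigned char *>(std::malloc(N == 0 ? 1 : N));
  if (!Scratch)
    return -1; // allocation failure

  std::memcpy(Scratch, Data, N);

  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += Scratch[I];

  std::free(Scratch); // callee-local allocation; caller memory is untouched
  return Sum;
}
```

From the caller's point of view, a buffer passed as `Data` is exactly as dereferenceable after the call as it was before, even though the callee called `free` internally.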
1 parent 0daf273 commit ff55d01

5 files changed: 47 additions and 33 deletions

llvm/docs/LangRef.rst

Lines changed: 15 additions & 6 deletions
@@ -1598,12 +1598,21 @@ example:
     call is dead after inlining.
 ``nofree``
     This function attribute indicates that the function does not, directly or
-    indirectly, call a memory-deallocation function (free, for example). As a
-    result, uncaptured pointers that are known to be dereferenceable prior to a
-    call to a function with the ``nofree`` attribute are still known to be
-    dereferenceable after the call (the capturing condition is necessary in
-    environments where the function might communicate the pointer to another thread
-    which then deallocates the memory).
+    transitively, call a memory-deallocation function (``free``, for example)
+    on a memory allocation which existed before the call.
+
+    As a result, uncaptured pointers that are known to be dereferenceable
+    prior to a call to a function with the ``nofree`` attribute are still
+    known to be dereferenceable after the call. The capturing condition is
+    necessary in environments where the function might communicate the
+    pointer to another thread which then deallocates the memory. Alternatively,
+    ``nosync`` would ensure such communication cannot happen and even captured
+    pointers cannot be freed by the function.
+
+    A ``nofree`` function is explicitly allowed to free memory which it
+    allocated or (if not ``nosync``) arrange for another thread to free
+    memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
+    function can return a pointer to a previously deallocated memory object.
 ``noimplicitfloat``
     This attributes disables implicit floating-point instructions.
 ``noinline``

llvm/lib/IR/Value.cpp

Lines changed: 10 additions & 7 deletions
@@ -736,9 +736,18 @@ bool Value::canBeFreed() const {
 
   // Handle byval/byref/sret/inalloca/preallocated arguments. The storage
   // lifetime is guaranteed to be longer than the callee's lifetime.
-  if (auto *A = dyn_cast<Argument>(this))
+  if (auto *A = dyn_cast<Argument>(this)) {
     if (A->hasPointeeInMemoryValueAttr())
       return false;
+    // A pointer to an object in a function which neither frees, nor can arrange
+    // for another thread to free on its behalf, can not be freed in the scope
+    // of the function. Note that this logic is restricted to memory
+    // allocations in existence before the call; a nofree function *is* allowed
+    // to free memory it allocated.
+    const Function *F = A->getParent();
+    if (F->doesNotFreeMemory() && F->hasNoSync())
+      return false;
+  }
 
   const Function *F = nullptr;
   if (auto *I = dyn_cast<Instruction>(this))
@@ -749,12 +758,6 @@ bool Value::canBeFreed() const {
   if (!F)
     return true;
 
-  // A pointer to an object in a function which neither frees, nor can arrange
-  // for another thread to free on its behalf, can not be freed in the scope
-  // of the function.
-  if (F->doesNotFreeMemory() && F->hasNoSync())
-    return false;
-
   // With garbage collection, deallocation typically occurs solely at or after
   // safepoints. If we're compiling for a collector which uses the
   // gc.statepoint infrastructure, safepoints aren't explicitly present
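
The effect of moving the `nofree`/`nosync` check into the `Argument` branch can be sketched with a standalone model. This is a simplified illustration under stated assumptions, not the LLVM implementation: `ValueKind`, `PointerFacts`, and `mayBeFreedInFunction` are hypothetical names, the boolean fields stand in for the real queries, and the garbage-collection logic that follows in the real function is omitted.

```cpp
#include <cassert>

// Hypothetical stand-ins for the facts the real code queries through
// Argument::hasPointeeInMemoryValueAttr(), Function::doesNotFreeMemory()
// and Function::hasNoSync(). None of these types exist in LLVM.
enum class ValueKind { Argument, Instruction };

struct PointerFacts {
  ValueKind Kind;
  bool HasPointeeInMemoryValueAttr; // byval/byref/sret/inalloca/preallocated
  bool FunctionIsNoFree;
  bool FunctionIsNoSync;
};

// Returns false only when the pointer provably cannot be freed within the
// enclosing function. The garbage-collection checks that follow in the real
// function are not modeled; this sketch conservatively answers true instead.
bool mayBeFreedInFunction(const PointerFacts &P) {
  if (P.Kind == ValueKind::Argument) {
    // Storage for these arguments outlives the callee.
    if (P.HasPointeeInMemoryValueAttr)
      return false;
    // An argument points to memory that existed before the call, so a
    // nofree and nosync function cannot free it.
    if (P.FunctionIsNoFree && P.FunctionIsNoSync)
      return false;
  }
  // A pointer produced inside the function may refer to memory the function
  // allocated itself, which a nofree function is allowed to free.
  return true;
}
```

In this model, an instruction result inside a `nofree nosync` function is no longer treated as unfreeable, which is consistent with the test updates in this commit where loads from memory allocated inside such functions are no longer hoisted or speculated.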

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

Lines changed: 4 additions & 2 deletions
@@ -2802,8 +2802,10 @@ Instruction *InstCombinerImpl::visitFree(CallInst &FI) {
   // If we free a pointer we've been explicitly told won't be freed, this
   // would be full UB and thus we can conclude this is unreachable. Cases:
   // 1) freeing a pointer which is explicitly nofree
-  // 2) calling free from a call site marked nofree
-  // 3) calling free in a function scope marked nofree
+  // 2) calling free from a call site marked nofree (TODO: can generalize
+  // for non-arguments)
+  // 3) calling free in a function scope marked nofree (when we can prove
+  // the allocation existed before the start of the function scope)
   if (auto *A = dyn_cast<Argument>(Op->stripPointerCasts()))
     if (A->hasAttribute(Attribute::NoFree) ||
         FI.hasFnAttr(Attribute::NoFree) ||

llvm/test/Transforms/LICM/hoist-alloc.ll

Lines changed: 2 additions & 2 deletions
@@ -178,11 +178,11 @@ define i8 @test_hoist_malloc_leak() nofree nosync {
 ; CHECK-NEXT: [[A_RAW:%.*]] = call nonnull i8* @malloc(i64 32)
 ; CHECK-NEXT: call void @init(i8* [[A_RAW]])
 ; CHECK-NEXT: [[ADDR:%.*]] = getelementptr i8, i8* [[A_RAW]], i32 31
-; CHECK-NEXT: [[RES:%.*]] = load i8, i8* [[ADDR]], align 1
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: call void @unknown()
+; CHECK-NEXT: [[RES:%.*]] = load i8, i8* [[ADDR]], align 1
 ; CHECK-NEXT: call void @use(i8 [[RES]])
 ; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV_NEXT]], 200
@@ -372,11 +372,11 @@ define i8 @test_hoist_allocsize_leak() nofree nosync {
 ; CHECK-NEXT: [[A_RAW:%.*]] = call nonnull i8* @my_alloc(i64 32)
 ; CHECK-NEXT: call void @init(i8* [[A_RAW]])
 ; CHECK-NEXT: [[ADDR:%.*]] = getelementptr i8, i8* [[A_RAW]], i32 31
-; CHECK-NEXT: [[RES:%.*]] = load i8, i8* [[ADDR]], align 1
 ; CHECK-NEXT: br label [[FOR_BODY:%.*]]
 ; CHECK: for.body:
 ; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: call void @unknown()
+; CHECK-NEXT: [[RES:%.*]] = load i8, i8* [[ADDR]], align 1
 ; CHECK-NEXT: call void @use(i8 [[RES]])
 ; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
 ; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV_NEXT]], 200

llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll

Lines changed: 16 additions & 16 deletions
@@ -2299,24 +2299,24 @@ define i32 @test_allocsize(i64 %len, i1* %test_base) nofree nosync {
 ; CHECK-NEXT: [[TMP67:%.*]] = getelementptr inbounds i32, i32* [[BASE]], i64 [[TMP12]]
 ; CHECK-NEXT: [[TMP68:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 0
 ; CHECK-NEXT: [[TMP69:%.*]] = bitcast i32* [[TMP68]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP69]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP69]], i32 4, <4 x i1> [[TMP39]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP70:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 4
 ; CHECK-NEXT: [[TMP71:%.*]] = bitcast i32* [[TMP70]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD4:%.*]] = load <4 x i32>, <4 x i32>* [[TMP71]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD4:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP71]], i32 4, <4 x i1> [[TMP47]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP72:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 8
 ; CHECK-NEXT: [[TMP73:%.*]] = bitcast i32* [[TMP72]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD5:%.*]] = load <4 x i32>, <4 x i32>* [[TMP73]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD5:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP73]], i32 4, <4 x i1> [[TMP55]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP74:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 12
 ; CHECK-NEXT: [[TMP75:%.*]] = bitcast i32* [[TMP74]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, <4 x i32>* [[TMP75]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD6:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP75]], i32 4, <4 x i1> [[TMP63]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP76:%.*]] = xor <4 x i1> [[TMP39]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP77:%.*]] = xor <4 x i1> [[TMP47]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP78:%.*]] = xor <4 x i1> [[TMP55]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP79:%.*]] = xor <4 x i1> [[TMP63]], <i1 true, i1 true, i1 true, i1 true>
-; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP39]], <4 x i32> [[WIDE_LOAD]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI7:%.*]] = select <4 x i1> [[TMP47]], <4 x i32> [[WIDE_LOAD4]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI8:%.*]] = select <4 x i1> [[TMP55]], <4 x i32> [[WIDE_LOAD5]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI9:%.*]] = select <4 x i1> [[TMP63]], <4 x i32> [[WIDE_LOAD6]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP39]], <4 x i32> [[WIDE_MASKED_LOAD]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI7:%.*]] = select <4 x i1> [[TMP47]], <4 x i32> [[WIDE_MASKED_LOAD4]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI8:%.*]] = select <4 x i1> [[TMP55]], <4 x i32> [[WIDE_MASKED_LOAD5]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI9:%.*]] = select <4 x i1> [[TMP63]], <4 x i32> [[WIDE_MASKED_LOAD6]], <4 x i32> zeroinitializer
 ; CHECK-NEXT: [[TMP80]] = add <4 x i32> [[VEC_PHI]], [[PREDPHI]]
 ; CHECK-NEXT: [[TMP81]] = add <4 x i32> [[VEC_PHI1]], [[PREDPHI7]]
 ; CHECK-NEXT: [[TMP82]] = add <4 x i32> [[VEC_PHI2]], [[PREDPHI8]]
@@ -2467,24 +2467,24 @@ define i32 @test_allocsize_array(i64 %len, i1* %test_base) nofree nosync {
 ; CHECK-NEXT: [[TMP67:%.*]] = getelementptr inbounds i32, i32* [[BASE]], i64 [[TMP12]]
 ; CHECK-NEXT: [[TMP68:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 0
 ; CHECK-NEXT: [[TMP69:%.*]] = bitcast i32* [[TMP68]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP69]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP69]], i32 4, <4 x i1> [[TMP39]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP70:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 4
 ; CHECK-NEXT: [[TMP71:%.*]] = bitcast i32* [[TMP70]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD4:%.*]] = load <4 x i32>, <4 x i32>* [[TMP71]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD4:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP71]], i32 4, <4 x i1> [[TMP47]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP72:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 8
 ; CHECK-NEXT: [[TMP73:%.*]] = bitcast i32* [[TMP72]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD5:%.*]] = load <4 x i32>, <4 x i32>* [[TMP73]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD5:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP73]], i32 4, <4 x i1> [[TMP55]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP74:%.*]] = getelementptr inbounds i32, i32* [[TMP64]], i32 12
 ; CHECK-NEXT: [[TMP75:%.*]] = bitcast i32* [[TMP74]] to <4 x i32>*
-; CHECK-NEXT: [[WIDE_LOAD6:%.*]] = load <4 x i32>, <4 x i32>* [[TMP75]], align 4
+; CHECK-NEXT: [[WIDE_MASKED_LOAD6:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* [[TMP75]], i32 4, <4 x i1> [[TMP63]], <4 x i32> poison)
 ; CHECK-NEXT: [[TMP76:%.*]] = xor <4 x i1> [[TMP39]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP77:%.*]] = xor <4 x i1> [[TMP47]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP78:%.*]] = xor <4 x i1> [[TMP55]], <i1 true, i1 true, i1 true, i1 true>
 ; CHECK-NEXT: [[TMP79:%.*]] = xor <4 x i1> [[TMP63]], <i1 true, i1 true, i1 true, i1 true>
-; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP39]], <4 x i32> [[WIDE_LOAD]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI7:%.*]] = select <4 x i1> [[TMP47]], <4 x i32> [[WIDE_LOAD4]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI8:%.*]] = select <4 x i1> [[TMP55]], <4 x i32> [[WIDE_LOAD5]], <4 x i32> zeroinitializer
-; CHECK-NEXT: [[PREDPHI9:%.*]] = select <4 x i1> [[TMP63]], <4 x i32> [[WIDE_LOAD6]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> [[TMP39]], <4 x i32> [[WIDE_MASKED_LOAD]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI7:%.*]] = select <4 x i1> [[TMP47]], <4 x i32> [[WIDE_MASKED_LOAD4]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI8:%.*]] = select <4 x i1> [[TMP55]], <4 x i32> [[WIDE_MASKED_LOAD5]], <4 x i32> zeroinitializer
+; CHECK-NEXT: [[PREDPHI9:%.*]] = select <4 x i1> [[TMP63]], <4 x i32> [[WIDE_MASKED_LOAD6]], <4 x i32> zeroinitializer
 ; CHECK-NEXT: [[TMP80]] = add <4 x i32> [[VEC_PHI]], [[PREDPHI]]
 ; CHECK-NEXT: [[TMP81]] = add <4 x i32> [[VEC_PHI1]], [[PREDPHI7]]
 ; CHECK-NEXT: [[TMP82]] = add <4 x i32> [[VEC_PHI2]], [[PREDPHI8]]
