Skip to content

Improve the dump-arrays performance on Windows #66973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 16, 2023

Conversation

hjyamauchi
Copy link
Contributor

Use the HeapWalk API for heap iteration instead of the
Heap32First/Next API, which was known to be slow. Since HeapWalk only
works locally, it requires using a remote thread and a DLL.

@@ -5,6 +5,9 @@ import PackageDescription

let package = Package(
name: "swift-inspect",
products: [
.library(name: "SwiftInspectHeapWalk", type: .dynamic, targets: ["SwiftInspectHeapWalk"]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that SwiftInspectClient might be a better name? I don't think that we should assume that it is only going to be limited to HeapWalk. We may wish to extend it further.

@@ -16,12 +19,15 @@ let package = Package(
dependencies: [
"SymbolicationShims",
.product(name: "ArgumentParser", package: "swift-argument-parser"),
"SwiftInspectHeapWalk",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be conditional on when the platform is Windows.

@@ -0,0 +1,184 @@
#if defined(_WIN32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing copyright header.

return 1;
}
HeapEntry *buf = (HeapEntry *)
MapViewOfFile(hMapFile,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should follow LLVM style, perhaps clang-format the file?

Comment on lines 78 to 79
free(heaps);
heaps = (HANDLE*)malloc(heapCount * sizeof(HANDLE));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are in C++, so prefer to use delete[] and new[].

}

auto cleanup = [&] {
free(heaps);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom deleter with std::unique_ptr should avoid this right?

let sharedMemoryName = "Local\\SwiftInspectFileMapping"
let readEventName = "Local\\SwiftInspectReadEvent"
let writeEventName = "Local\\SwiftInspectWriteEvent"
let waitTimeoutMs = DWORD(30000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about extracting these constants from the C++ side into a C-only header and then using a module to share the constants across the C++ and Swift sides?

defer { CloseHandle(hWriteEvent) }

// The path to the dll assuming it's in the same directory as swift-inspect.
let dllPath: String = Bundle.main.bundlePath + "\\SwiftInspectHeapWalk.dll"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, if we use the bundle, that makes deploying this a bit more difficult as that has requirements on the layout.

return
}
// Allocate memory and write dllPath in the remote process
dllPathUtf16Len = UInt(wcslen($0) * 2) // *2 for wide chars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I would prefer using MemoryLayout rather than the hardcoded 2.

@hjyamauchi
Copy link
Contributor Author

The latest revision should address all the comments.

Copy link
Contributor

@tristanlabelle tristanlabelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind-blowing! It'd be great if the code could be broken up into a few functions so that the high-level logic can be read more easily.

hjyamauchi added a commit to hjyamauchi/swift-build that referenced this pull request Jul 3, 2023
swift-inspect will fail to build in release mode after the dump-arrays
PR (swiftlang/swift#66973) due to an llvm bug
(llvm/llvm-project#40056). This is a
workaround for it.
hjyamauchi added a commit to compnerd/swift-build that referenced this pull request Jul 3, 2023
swift-inspect will fail to build in release mode after the dump-arrays
PR (swiftlang/swift#66973) due to an llvm bug
(llvm/llvm-project#40056). This is a
workaround for it.
@hjyamauchi
Copy link
Contributor Author

@compnerd Can you take another look?

var dllPathRemote = UnsafeMutableRawPointer(bitPattern: 0)
var dllPathUtf16Len: UInt = 0
var dllPathWritten = false
dllPath.withCString(encodedAs: UTF16.self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unsafe. We have a path limitation to 261 characters. We absolutely need to do this inside the closure to ensure that we have a NT-style path ...

URL(fileURLWithPath: swiftInspectPath)
        .deletingLastPathComponent()
        .appendingPathComponent("SwiftInspectClient.dll")
        .withUnsafeFileSystemRepresentation {
  #"\\?\\#(String(decodingCString: $0!, as: UTF16.self))"#.withCString(encodedAs: UTF16.self) {
    let faAttributes: WIN32_FILE_ATTRIBUTE_DATA = .init()
    guard GetFileAttributesExW($0, GetFileExInfoStandard, &faAttributes),
          faAttributes.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT == 0 else {
      return nil
    }

    let szLength = SIZE_T(_wcslen($0) + 1)
    guard let allocation = VirtualAllocEx(self.process, nil, szLength,
                                          DWORD(MEM_COMIT), DWORD(PAGE_READWRITE)) else {
      return nil
    }

    if !WriteProcessMemory(self.process, allocation, $0, szLength, nil) {
      VirtualFreeEx(self.process, allocation, 0, DWORD(MEM_RELEASE)
      return nil
    }

    return allocation
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Can you elaborate? I understand there's a path limitation. What's a NT-style path? And how does using a closure ensure it? Or what did the original code do wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The \\?\ prefix creates a NT-style path. It is a path that bypasses the Win32 layer and goes into the kernel namespace, which allows it to bypass the 261 character limitation. The closure doesn't do that, its the interior string construction which does that. The problem is that the path must be normalised first as the normalization is something that the Win32 layer normally handles. Due to the memory handling in Foundation, we need to use a closure (withUnsafeFileSystemRepresentation). That mostly normalizes the path, and then we can create the long path.


private func unloadDllAndPathRemote(
dwProcessId: DWORD, hKernel32: HMODULE!,
dllPathRemote: UnsafeMutableRawPointer) -> Bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this formatted by swift-format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a swift-format config file that's used for the existing code? If I run it with its default config, it'd change the lines that aren't touched by this PR. Any preference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that I had saved it ... the default sounds like a good idea though. I think that there may be a few places where it will break before the first parameter which is probably the only divergence the existing code has.

DWORD(FILE_MAP_ALL_ACCESS),
0,
0,
SIZE_T(bufSize));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this formatted via swift-format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

Comment on lines 342 to 326
let hookAddr = unsafeBitCast(GetProcAddress(hKernel32, "LoadLibraryW"),
to: LPTHREAD_START_ROUTINE.self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary. It resolves to the local function LoadLibraryW, so why do the dynamic lookup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting the local function pointer LoadLibraryW to UnsafeMutableRawPointer or LPTHREAD_START_ROUTINE doesn't seem to work with Swift/Builtin.swift:95: Fatal error: Can't unsafeBitCast between types of different sizes. How?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let lpfnLoadLibraryW: @convention(c) (LPCWSTR?) -> HMODULE? = LoadLibraryW
_ = unsafeBitCast(lpfnLoadLibraryW, to: LPTHREAD_START_ROUTINE.self)

I imagine you can even combine the two lines.

@CodaFi
Copy link
Contributor

CodaFi commented Jul 7, 2023

I’m no expert in win32, but this looks pretty good. I guess my gut reaction reading all of this is that anything that touches HeapLock needs documentation about avoiding further dynamic allocations until the lock is dropped. It’s so easy to re-enter the allocator holding this thing.

HANDLE Handle;
explicit ScopedHandle(HANDLE Handle) noexcept : Handle(Handle) {}
~ScopedHandle() noexcept {
if (Handle != NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be checking for INVALID_HANDLE_VALUE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, INVALID_HANDLE_VALUE != NULL and the APIs that we use here returns NULL on failures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the values are different. I'm worried that we might try to do a CloseHandle on INVALID_HANDLE_VALUE as that is not being checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

.deletingLastPathComponent()
.appendingPathComponent("SwiftInspectClient.dll")
.withUnsafeFileSystemRepresentation {
#"\\?\\#(String(decodingCString: unsafeBitCast($0!, to: UnsafePointer<UInt8>.self), as: UTF8.self))"#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the unsafeBitCast needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cast from UnsafePointer<Int8> to UnsafePointer<UInt8> needed here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is might need to be $0.baseAddress.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't seem to work.

S:\SourceCache\swift\tools\swift-inspect\Sources\swift-inspect\WindowsRemoteProcess.swift:421:45: error: value of type 'UnsafePointer<Int8>?' has no member 'baseAddress'
        #"\\?\\#(String(decodingCString: $0.baseAddress, as: UTF8.self))"#
                                         ~~ ^~~~~~~~~~~
S:\SourceCache\swift\tools\swift-inspect\Sources\swift-inspect\WindowsRemoteProcess.swift:421:46: error: value of type 'UnsafePointer<Int8>' has no member 'baseAddress'
        #"\\?\\#(String(decodingCString: $0!.baseAddress, as: UTF8.self))"#


private func unloadDllAndPathRemote(
dwProcessId: DWORD, hKernel32: HMODULE!,
dllPathRemote: UnsafeMutableRawPointer) -> Bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that I had saved it ... the default sounds like a good idea though. I think that there may be a few places where it will break before the first parameter which is probably the only divergence the existing code has.


// Unload the dll from the remote process
let unloadHookAddr = unsafeBitCast(GetProcAddress(hKernel32, "FreeLibrary"),
to: LPTHREAD_START_ROUTINE.self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above for LoadLibraryW, we can probably avoid the dynamic resolution of FreeLibrary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    let lpfnFreeLibrary:  (HMODULE?) -> Bool = FreeLibrary
    let unloadHookAddr = unsafeBitCast(lpfnFreeLibrary, to: LPTHREAD_START_ROUTINE.self)

causes a runtime cast failure

Swift/Builtin.swift:95: Fatal error: Can't unsafeBitCast between types of different sizes
Current stack trace:
0    (null)                             0x00007fffb7581100 swift_stdlib_reportFatalErrorInFile + 132

I'm not sure why this one doesn't work when the LoadLibraryW one worked

@ahoppen ahoppen removed their request for review July 7, 2023 17:42
@hjyamauchi hjyamauchi force-pushed the dump-arrays branch 3 times, most recently from f776ae2 to 5b7f488 Compare July 10, 2023 20:36
@hjyamauchi
Copy link
Contributor Author

Addressed the comments except for the embedding the dll into the executable thing. I imagine that we need to write a windows program that adds the dll as a resource into the executable and run it as part of the swift-inspect build process. How do I do that? Any examples?

@tshortli tshortli removed their request for review July 10, 2023 20:45
Copy link
Member

@compnerd compnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nearly ready! I think that the question around the invocation of FreeLibrary is something we need to think through, but the rest is looking really good!

}

// Unload the dll from the remote process
let unloadHookAddr = unsafeBitCast(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? This is using the address of Kernel32.FreeLibrary in the swift-inspect process but invoking it in the remote process, which is not guaranteed to be loaded at the same address right? I don't think that this is safe to do. We need to have the function be bound in the process. The simplest way to do this is to have a secondary entry point that we can invoke in the injected module that can unload itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that kernel32 is special and guaranteed to be loaded at the same address across processes.

https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-entry-point-function

Because Kernel32.dll is guaranteed to be loaded in the process address space when the entry-point function is called...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I read it the same way - it is that it is loaded before the entry point executes, it doesn't guarantee that the module is loaded at the same address, ASLR can relocate the base address.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thatwas a misquote, sorry.

I'm reading that kernel32 is loaded at the same address across processes, however. For example, http://www.nynaeve.net/?p=198

Isn't that assumed and necessary when we call LoadLibraryW earlier?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a mistake as well. Discussed this offline, and we will use the toolhelp32 to snapshot and find the HMODULE associated with kernel32.dll in the image and use GetProcAddress to get the proper location.

@hjyamauchi
Copy link
Contributor Author

PTAL

Use the HeapWalk API for heap iteration instead of the
Heap32First/Next API, which was known to be slow. Since HeapWalk only
works locally, it requires using a remote thread and a DLL.
@hjyamauchi
Copy link
Contributor Author

Updated to get the remote LoadLibraryW/FreeLibrary addresses from the module snapshot and GetProcAddress() and no to rely on them being at the same addresses across processes.

@compnerd
Copy link
Member

@swift-ci please test

@compnerd compnerd merged commit aad2b6e into swiftlang:main Jul 16, 2023
compnerd added a commit to compnerd/apple-swift that referenced this pull request Jul 17, 2023
Clean up some casts that would cause warnings on Windows introduced in
PR swiftlang#66973.  Rewrite some of the path computation for the DLL path to
avoid some unnecessary conversions.  Split out some of the merged guards
to allow for split diagnostic emissions.
compnerd added a commit to compnerd/apple-swift that referenced this pull request Jul 17, 2023
…6973

Clean up the incorrect memory binding in swift-inspect introduced in
PR#66973.  This fixes the incorrect memory usage to mutate the `context`
parameter.
compnerd added a commit that referenced this pull request Jul 18, 2023
swift-inspect: correct invalid memory usage introduced in #66973
compnerd added a commit that referenced this pull request Jul 18, 2023
swift-inspect: clean up incorrect casts introduced in #66973
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants