DynamicLoaderDarwin load images in parallel with preload #110646

DmT021 · 2024-10-01T10:34:35Z

This change enables DynamicLoaderDarwin to load modules in parallel using the thread pool. This new behavior is controlled by a new setting plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load, which is enabled by default. When disabled, DynamicLoaderDarwin will load modules sequentially as before.

github-actions · 2024-10-01T10:38:16Z

✅ With the latest revision this PR passed the C/C++ code formatter.

jasonmolenda

Most of the review and discussion happened in the "non-preload" version of the PR ( #110439 ), but I agree with Dmitrii that this is the best approach to merge, this is approved. Very nice job spotting a prime opportunity for multithreading, thanks for doing the work to implement it. I think for native macOS debugging (with all binaries in the same shared cache as lldb's) we're seeing some less than optimal results so we've got a bottleneck that I'll try to look into, I suspect maybe related to how we find binaries in our shared cache. This is still a speedup or not-slower result in every use case I tried.

llvmbot · 2024-10-14T16:35:42Z

@llvm/pr-subscribers-lldb

Author: Dmitrii Galimzianov (DmT021)

Changes

This is another approach to the goal described in this PR. This PR is provided just for convenience.

Patch is 23.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110646.diff

9 Files Affected:

(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/CMakeLists.txt (+13)
(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp (+100-30)
(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.h (+17-6)
(added) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.cpp (+53)
(added) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.h (+34)
(added) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.td (+8)
(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderMacOS.cpp (+6-4)
(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderMacOSXDYLD.cpp (+10-3)
(modified) lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderMacOSXDYLD.h (+2)

diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/CMakeLists.txt b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/CMakeLists.txt
index 7308374c8bfba6..77a560541fcb1f 100644
--- a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/CMakeLists.txt
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/CMakeLists.txt
@@ -1,7 +1,16 @@
+lldb_tablegen(DynamicLoaderDarwinProperties.inc -gen-lldb-property-defs
+  SOURCE DynamicLoaderDarwinProperties.td
+  TARGET LLDBPluginDynamicLoaderDarwinPropertiesGen)
+
+lldb_tablegen(DynamicLoaderDarwinPropertiesEnum.inc -gen-lldb-property-enum-defs
+  SOURCE DynamicLoaderDarwinProperties.td
+  TARGET LLDBPluginDynamicLoaderDarwinPropertiesEnumGen)
+
 add_lldb_library(lldbPluginDynamicLoaderMacOSXDYLD PLUGIN
   DynamicLoaderMacOSXDYLD.cpp
   DynamicLoaderMacOS.cpp
   DynamicLoaderDarwin.cpp
+  DynamicLoaderDarwinProperties.cpp
 
   LINK_LIBS
     lldbBreakpoint
@@ -16,3 +25,7 @@ add_lldb_library(lldbPluginDynamicLoaderMacOSXDYLD PLUGIN
     Support
     TargetParser
   )
+
+add_dependencies(lldbPluginDynamicLoaderMacOSXDYLD
+  LLDBPluginDynamicLoaderDarwinPropertiesGen
+  LLDBPluginDynamicLoaderDarwinPropertiesEnumGen)
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
index 30242038a5f66f..c6a59d9e87a46b 100644
--- a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
@@ -8,6 +8,7 @@
 
 #include "DynamicLoaderDarwin.h"
 
+#include "DynamicLoaderDarwinProperties.h"
 #include "lldb/Breakpoint/StoppointCallbackContext.h"
 #include "lldb/Core/Debugger.h"
 #include "lldb/Core/Module.h"
@@ -31,6 +32,7 @@
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
 #include "lldb/Utility/State.h"
+#include "llvm/Support/ThreadPool.h"
 
 #include "Plugins/LanguageRuntime/ObjC/ObjCLanguageRuntime.h"
 #include "Plugins/TypeSystem/Clang/TypeSystemClang.h"
@@ -77,6 +79,17 @@ void DynamicLoaderDarwin::DidLaunch() {
   SetNotificationBreakpoint();
 }
 
+void DynamicLoaderDarwin::CreateSettings(lldb_private::Debugger &debugger) {
+  if (!PluginManager::GetSettingForDynamicLoaderPlugin(
+          debugger, DynamicLoaderDarwinProperties::GetSettingName())) {
+    const bool is_global_setting = true;
+    PluginManager::CreateSettingForDynamicLoaderPlugin(
+        debugger,
+        DynamicLoaderDarwinProperties::GetGlobal().GetValueProperties(),
+        "Properties for the DynamicLoaderDarwin plug-in.", is_global_setting);
+  }
+}
+
 // Clear out the state of this class.
 void DynamicLoaderDarwin::Clear(bool clear_process) {
   std::lock_guard<std::recursive_mutex> guard(m_mutex);
@@ -88,7 +101,7 @@ void DynamicLoaderDarwin::Clear(bool clear_process) {
 }
 
 ModuleSP DynamicLoaderDarwin::FindTargetModuleForImageInfo(
-    ImageInfo &image_info, bool can_create, bool *did_create_ptr) {
+    const ImageInfo &image_info, bool can_create, bool *did_create_ptr) {
   if (did_create_ptr)
     *did_create_ptr = false;
 
@@ -517,8 +530,8 @@ bool DynamicLoaderDarwin::JSONImageInformationIntoImageInfo(
   return true;
 }
 
-void DynamicLoaderDarwin::UpdateSpecialBinariesFromNewImageInfos(
-    ImageInfo::collection &image_infos) {
+void DynamicLoaderDarwin::UpdateSpecialBinariesFromPreloadedModules(
+    std::vector<std::pair<ImageInfo, ModuleSP>> &images) {
   uint32_t exe_idx = UINT32_MAX;
   uint32_t dyld_idx = UINT32_MAX;
   Target &target = m_process->GetTarget();
@@ -526,35 +539,34 @@ void DynamicLoaderDarwin::UpdateSpecialBinariesFromNewImageInfos(
   ConstString g_dyld_sim_filename("dyld_sim");
 
   ArchSpec target_arch = target.GetArchitecture();
-  const size_t image_infos_size = image_infos.size();
-  for (size_t i = 0; i < image_infos_size; i++) {
-    if (image_infos[i].header.filetype == llvm::MachO::MH_DYLINKER) {
+  const size_t images_size = images.size();
+  for (size_t i = 0; i < images_size; i++) {
+    const auto &image_info = images[i].first;
+    if (image_info.header.filetype == llvm::MachO::MH_DYLINKER) {
       // In a "simulator" process we will have two dyld modules --
       // a "dyld" that we want to keep track of, and a "dyld_sim" which
       // we don't need to keep track of here.  dyld_sim will have a non-macosx
       // OS.
       if (target_arch.GetTriple().getEnvironment() == llvm::Triple::Simulator &&
-          image_infos[i].os_type != llvm::Triple::OSType::MacOSX) {
+          image_info.os_type != llvm::Triple::OSType::MacOSX) {
         continue;
       }
 
       dyld_idx = i;
     }
-    if (image_infos[i].header.filetype == llvm::MachO::MH_EXECUTE) {
+    if (image_info.header.filetype == llvm::MachO::MH_EXECUTE) {
       exe_idx = i;
     }
   }
 
   // Set the target executable if we haven't found one so far.
   if (exe_idx != UINT32_MAX && !target.GetExecutableModule()) {
-    const bool can_create = true;
-    ModuleSP exe_module_sp(FindTargetModuleForImageInfo(image_infos[exe_idx],
-                                                        can_create, nullptr));
+    ModuleSP exe_module_sp = images[exe_idx].second;
     if (exe_module_sp) {
       LLDB_LOGF(log, "Found executable module: %s",
                 exe_module_sp->GetFileSpec().GetPath().c_str());
       target.GetImages().AppendIfNeeded(exe_module_sp);
-      UpdateImageLoadAddress(exe_module_sp.get(), image_infos[exe_idx]);
+      UpdateImageLoadAddress(exe_module_sp.get(), images[exe_idx].first);
       if (exe_module_sp.get() != target.GetExecutableModulePointer())
         target.SetExecutableModule(exe_module_sp, eLoadDependentsNo);
 
@@ -581,14 +593,12 @@ void DynamicLoaderDarwin::UpdateSpecialBinariesFromNewImageInfos(
   }
 
   if (dyld_idx != UINT32_MAX) {
-    const bool can_create = true;
-    ModuleSP dyld_sp = FindTargetModuleForImageInfo(image_infos[dyld_idx],
-                                                    can_create, nullptr);
+    ModuleSP dyld_sp = images[dyld_idx].second;
     if (dyld_sp.get()) {
       LLDB_LOGF(log, "Found dyld module: %s",
                 dyld_sp->GetFileSpec().GetPath().c_str());
       target.GetImages().AppendIfNeeded(dyld_sp);
-      UpdateImageLoadAddress(dyld_sp.get(), image_infos[dyld_idx]);
+      UpdateImageLoadAddress(dyld_sp.get(), images[dyld_idx].first);
       SetDYLDModule(dyld_sp);
     }
   }
@@ -642,26 +652,86 @@ ModuleSP DynamicLoaderDarwin::GetDYLDModule() {
 
 void DynamicLoaderDarwin::ClearDYLDModule() { m_dyld_module_wp.reset(); }
 
+template <typename InputIterator, typename ResultType>
+static std::vector<ResultType> parallel_map(
+    llvm::ThreadPoolInterface &threadPool, InputIterator first,
+    InputIterator last,
+    llvm::function_ref<ResultType(
+        const typename std::iterator_traits<InputIterator>::value_type &)>
+        transform) {
+  const auto size = std::distance(first, last);
+  std::vector<ResultType> results(size);
+  if (size > 0) {
+    llvm::ThreadPoolTaskGroup taskGroup(threadPool);
+    auto it = first;
+    for (ssize_t i = 0; i < size; ++i, ++it) {
+      taskGroup.async([&, i, it]() { results[i] = transform(*it); });
+    }
+    taskGroup.wait();
+  }
+  return results;
+}
+
+template <typename InputIterator, typename ResultType>
+static std::vector<ResultType>
+map(InputIterator first, InputIterator last,
+    llvm::function_ref<ResultType(
+        const typename std::iterator_traits<InputIterator>::value_type &)>
+        transform) {
+  const auto size = std::distance(first, last);
+  std::vector<ResultType> results(size);
+  auto it = first;
+  for (ssize_t i = 0; i < size; ++i, ++it) {
+    results[i] = transform(*it);
+  }
+  return results;
+}
+
+std::vector<std::pair<DynamicLoaderDarwin::ImageInfo, ModuleSP>>
+DynamicLoaderDarwin::PreloadModulesFromImageInfos(
+    const ImageInfo::collection &image_infos) {
+  auto ImageLoad = [this](const ImageInfo &image_info) {
+    return std::make_pair(
+        image_info, FindTargetModuleForImageInfo(image_info, true, nullptr));
+  };
+  bool is_parallel_load =
+      DynamicLoaderDarwinProperties::GetGlobal().GetEnableParallelImageLoad();
+  auto images = is_parallel_load
+                    ? parallel_map<ImageInfo::collection::const_iterator,
+                                   std::pair<ImageInfo, ModuleSP>>(
+                          Debugger::GetThreadPool(), image_infos.begin(),
+                          image_infos.end(), ImageLoad)
+                    : map<ImageInfo::collection::const_iterator,
+                          std::pair<ImageInfo, ModuleSP>>(
+                          image_infos.begin(), image_infos.end(), ImageLoad);
+  return images;
+}
+
 bool DynamicLoaderDarwin::AddModulesUsingImageInfos(
     ImageInfo::collection &image_infos) {
   std::lock_guard<std::recursive_mutex> guard(m_mutex);
+  auto images = PreloadModulesFromImageInfos(image_infos);
+  return AddModulesUsingPreloadedModules(images);
+}
+
+bool DynamicLoaderDarwin::AddModulesUsingPreloadedModules(
+    std::vector<std::pair<ImageInfo, ModuleSP>> &images) {
+  std::lock_guard<std::recursive_mutex> guard(m_mutex);
   // Now add these images to the main list.
   ModuleList loaded_module_list;
   Log *log = GetLog(LLDBLog::DynamicLoader);
   Target &target = m_process->GetTarget();
   ModuleList &target_images = target.GetImages();
 
-  for (uint32_t idx = 0; idx < image_infos.size(); ++idx) {
+  for (uint32_t idx = 0; idx < images.size(); ++idx) {
+    auto &image_info = images[idx].first;
+    const auto &image_module_sp = images[idx].second;
     if (log) {
       LLDB_LOGF(log, "Adding new image at address=0x%16.16" PRIx64 ".",
-                image_infos[idx].address);
-      image_infos[idx].PutToLog(log);
+                image_info.address);
+      image_info.PutToLog(log);
     }
-
-    m_dyld_image_infos.push_back(image_infos[idx]);
-
-    ModuleSP image_module_sp(
-        FindTargetModuleForImageInfo(image_infos[idx], true, nullptr));
+    m_dyld_image_infos.push_back(image_info);
 
     if (image_module_sp) {
       ObjectFile *objfile = image_module_sp->GetObjectFile();
@@ -673,7 +743,7 @@ bool DynamicLoaderDarwin::AddModulesUsingImageInfos(
               sections->FindSectionByName(commpage_dbstr).get();
           if (commpage_section) {
             ModuleSpec module_spec(objfile->GetFileSpec(),
-                                   image_infos[idx].GetArchitecture());
+                                   image_info.GetArchitecture());
             module_spec.GetObjectName() = commpage_dbstr;
             ModuleSP commpage_image_module_sp(
                 target_images.FindFirstModule(module_spec));
@@ -686,17 +756,17 @@ bool DynamicLoaderDarwin::AddModulesUsingImageInfos(
               if (!commpage_image_module_sp ||
                   commpage_image_module_sp->GetObjectFile() == nullptr) {
                 commpage_image_module_sp = m_process->ReadModuleFromMemory(
-                    image_infos[idx].file_spec, image_infos[idx].address);
+                    image_info.file_spec, image_info.address);
                 // Always load a memory image right away in the target in case
                 // we end up trying to read the symbol table from memory... The
                 // __LINKEDIT will need to be mapped so we can figure out where
                 // the symbol table bits are...
                 bool changed = false;
                 UpdateImageLoadAddress(commpage_image_module_sp.get(),
-                                       image_infos[idx]);
+                                       image_info);
                 target.GetImages().Append(commpage_image_module_sp);
                 if (changed) {
-                  image_infos[idx].load_stop_id = m_process->GetStopID();
+                  image_info.load_stop_id = m_process->GetStopID();
                   loaded_module_list.AppendIfNeeded(commpage_image_module_sp);
                 }
               }
@@ -709,14 +779,14 @@ bool DynamicLoaderDarwin::AddModulesUsingImageInfos(
       // address. We need to check this so we don't mention that all loaded
       // shared libraries are newly loaded each time we hit out dyld breakpoint
       // since dyld will list all shared libraries each time.
-      if (UpdateImageLoadAddress(image_module_sp.get(), image_infos[idx])) {
+      if (UpdateImageLoadAddress(image_module_sp.get(), image_info)) {
         target_images.AppendIfNeeded(image_module_sp);
         loaded_module_list.AppendIfNeeded(image_module_sp);
       }
 
       // To support macCatalyst and legacy iOS simulator,
       // update the module's platform with the DYLD info.
-      ArchSpec dyld_spec = image_infos[idx].GetArchitecture();
+      ArchSpec dyld_spec = image_info.GetArchitecture();
       auto &dyld_triple = dyld_spec.GetTriple();
       if ((dyld_triple.getEnvironment() == llvm::Triple::MacABI &&
            dyld_triple.getOS() == llvm::Triple::IOS) ||
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.h b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.h
index 45c693163f8105..bc5464d76b9503 100644
--- a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.h
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.h
@@ -58,6 +58,8 @@ class DynamicLoaderDarwin : public lldb_private::DynamicLoader {
 
   std::optional<lldb_private::Address> GetStartAddress() override;
 
+  static void CreateSettings(lldb_private::Debugger &debugger);
+
 protected:
   void PrivateInitialize(lldb_private::Process *process);
 
@@ -174,7 +176,7 @@ class DynamicLoaderDarwin : public lldb_private::DynamicLoader {
 
   bool UnloadModuleSections(lldb_private::Module *module, ImageInfo &info);
 
-  lldb::ModuleSP FindTargetModuleForImageInfo(ImageInfo &image_info,
+  lldb::ModuleSP FindTargetModuleForImageInfo(const ImageInfo &image_info,
                                               bool can_create,
                                               bool *did_create_ptr);
 
@@ -201,11 +203,18 @@ class DynamicLoaderDarwin : public lldb_private::DynamicLoader {
       lldb_private::StructuredData::ObjectSP image_details,
       ImageInfo::collection &image_infos);
 
-  // If image_infos contains / may contain dyld or executable image, call this
-  // method
-  // to keep our internal record keeping of the special binaries up-to-date.
-  void
-  UpdateSpecialBinariesFromNewImageInfos(ImageInfo::collection &image_infos);
+  // Finds/loads modules for a given `image_infos` and returns pairs
+  // (ImageInfo, ModuleSP).
+  // Prefer using this method rather than calling `FindTargetModuleForImageInfo`
+  // directly as this method may load the modules in parallel.
+  std::vector<std::pair<ImageInfo, lldb::ModuleSP>>
+  PreloadModulesFromImageInfos(const ImageInfo::collection &image_infos);
+
+  // If `images` contains / may contain dyld or executable image, call this
+  // method to keep our internal record keeping of the special binaries
+  // up-to-date.
+  void UpdateSpecialBinariesFromPreloadedModules(
+      std::vector<std::pair<ImageInfo, lldb::ModuleSP>> &images);
 
   // if image_info is a dyld binary, call this method
   bool UpdateDYLDImageInfoFromNewImageInfo(ImageInfo &image_info);
@@ -215,6 +224,8 @@ class DynamicLoaderDarwin : public lldb_private::DynamicLoader {
   void AddExecutableModuleIfInImageInfos(ImageInfo::collection &image_infos);
 
   bool AddModulesUsingImageInfos(ImageInfo::collection &image_infos);
+  bool AddModulesUsingPreloadedModules(
+      std::vector<std::pair<ImageInfo, lldb::ModuleSP>> &images);
 
   // Whether we should use the new dyld SPI to get shared library information,
   // or read
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.cpp b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.cpp
new file mode 100644
index 00000000000000..f4d8a071e6d5de
--- /dev/null
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.cpp
@@ -0,0 +1,53 @@
+//===-- DynamicLoaderDarwinProperties.cpp ---------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "DynamicLoaderDarwinProperties.h"
+
+using namespace lldb_private;
+
+#define LLDB_PROPERTIES_dynamicloaderdarwin_experimental
+#include "DynamicLoaderDarwinProperties.inc"
+
+enum {
+#define LLDB_PROPERTIES_dynamicloaderdarwin_experimental
+#include "DynamicLoaderDarwinPropertiesEnum.inc"
+};
+
+llvm::StringRef DynamicLoaderDarwinProperties::GetSettingName() {
+  static constexpr llvm::StringLiteral g_setting_name("darwin");
+  return g_setting_name;
+}
+
+DynamicLoaderDarwinProperties::ExperimentalProperties::ExperimentalProperties()
+    : Properties(std::make_shared<OptionValueProperties>(
+          GetExperimentalSettingsName())) {
+  m_collection_sp->Initialize(g_dynamicloaderdarwin_experimental_properties);
+}
+
+DynamicLoaderDarwinProperties::DynamicLoaderDarwinProperties()
+    : Properties(std::make_shared<OptionValueProperties>(GetSettingName())),
+      m_experimental_properties(std::make_unique<ExperimentalProperties>()) {
+  m_collection_sp->AppendProperty(
+      Properties::GetExperimentalSettingsName(),
+      "Experimental settings - setting these won't produce errors if the "
+      "setting is not present.",
+      true, m_experimental_properties->GetValueProperties());
+}
+
+bool DynamicLoaderDarwinProperties::GetEnableParallelImageLoad() const {
+  return m_experimental_properties->GetPropertyAtIndexAs<bool>(
+      ePropertyEnableParallelImageLoad,
+      g_dynamicloaderdarwin_experimental_properties
+              [ePropertyEnableParallelImageLoad]
+                  .default_uint_value != 0);
+}
+
+DynamicLoaderDarwinProperties &DynamicLoaderDarwinProperties::GetGlobal() {
+  static DynamicLoaderDarwinProperties g_settings;
+  return g_settings;
+}
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.h b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.h
new file mode 100644
index 00000000000000..4c5e800c4f3e4c
--- /dev/null
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.h
@@ -0,0 +1,34 @@
+//===-- DynamicLoaderDarwinProperties.h -------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_DYNAMICLOADER_MACOSX_DYLD_DYNAMICLOADERDARWINPROPERTIES_H
+#define LLDB_SOURCE_PLUGINS_DYNAMICLOADER_MACOSX_DYLD_DYNAMICLOADERDARWINPROPERTIES_H
+
+#include "lldb/Core/UserSettingsController.h"
+
+namespace lldb_private {
+
+class DynamicLoaderDarwinProperties : public Properties {
+public:
+  class ExperimentalProperties : public Properties {
+  public:
+    ExperimentalProperties();
+  };
+  static llvm::StringRef GetSettingName();
+  static DynamicLoaderDarwinProperties &GetGlobal();
+  DynamicLoaderDarwinProperties();
+  ~DynamicLoaderDarwinProperties() override = default;
+  bool GetEnableParallelImageLoad() const;
+
+private:
+  std::unique_ptr<ExperimentalProperties> m_experimental_properties;
+};
+
+} // namespace lldb_private
+
+#endif // LLDB_SOURCE_PLUGINS_DYNAMICLOADER_MACOSX_DYLD_DYNAMICLOADERDARWINPROPERTIES_H
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.td b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.td
new file mode 100644
index 00000000000000..c54580ce347296
--- /dev/null
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwinProperties.td
@@ -0,0 +1,8 @@
+include "../../../../include/lldb/Core/PropertiesBase.td"
+
+let Definition = "dynamicloaderdarwin_experimental" in {
+  def EnableParallelImageLoad: Property<"enable-parallel-image-load", "Boolean">,
+    Global,
+    DefaultTrue,
+    Desc<"Load images in parallel.">;
+}
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderMacOS.cpp b/lldb/source/Plugins/Dy...
[truncated]

DmT021 · 2024-10-14T19:16:00Z

@jasonmolenda All the tests are green but I don't have the merge button. Can you merge it, please?

jasonmolenda · 2024-10-14T21:28:33Z

@DmT021 I will, can you edit the Description in this PR for the commit message that will be recorded?

JDevlieghere · 2024-10-14T21:36:32Z

lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp

+template <typename InputIterator, typename ResultType>
+static std::vector<ResultType> parallel_map(
+    llvm::ThreadPoolInterface &threadPool, InputIterator first,
+    InputIterator last,
+    llvm::function_ref<ResultType(
+        const typename std::iterator_traits<InputIterator>::value_type &)>
+        transform) {
+  const auto size = std::distance(first, last);
+  std::vector<ResultType> results(size);
+  if (size > 0) {
+    llvm::ThreadPoolTaskGroup taskGroup(threadPool);
+    auto it = first;
+    for (ssize_t i = 0; i < size; ++i, ++it) {
+      taskGroup.async([&, i, it]() { results[i] = transform(*it); });
+    }
+    taskGroup.wait();
+  }
+  return results;
+}


Do we really need these map helpers? I would imagine you could achieve the same thing with something like this:

TaskGroup task_group(...); for (...) if (is_parallel) task_group.async(lambda, args); else lambda(args); task_group.wait();

We can but it may not be better. The implementations of map and parallel_map aren't important from the business logic perspective. The only important thing about PreloadModulesFromImageInfos is checking the setting.
Also, the code will be a bit more complicated than shown in the snippet you provided:

we are creating a task group and waiting for it even when we are going to load the modules sequentially.

we are checking the setting on each iteration of the loop.

This is probably almost free (I'm not sure if that's the case for task_group.wait() though) but it's still a bit of an eyesore.

But a version of this without these two disadvantages will have a repeating for loop. Something like

std::vector<std::pair<DynamicLoaderDarwin::ImageInfo, ModuleSP>> DynamicLoaderDarwin::PreloadModulesFromImageInfos( const ImageInfo::collection &image_infos) { const auto size = image_infos.size(); std::vector<std::pair<DynamicLoaderDarwin::ImageInfo, ModuleSP>> images( size); auto lambda = [&](size_t i, ImageInfo::collection::const_iterator it) { const auto &image_info = *it; images[i] = std::make_pair( image_info, FindTargetModuleForImageInfo(image_info, true, nullptr)); }; auto it = image_infos.begin(); bool is_parallel_load = DynamicLoaderDarwinProperties::GetGlobal().GetEnableParallelImageLoad(); if (is_parallel_load) { if (size > 0) { llvm::ThreadPoolTaskGroup taskGroup(Debugger::GetThreadPool()); for (size_t i = 0; i < size; ++i, ++it) { taskGroup.async(lambda, i, it); } taskGroup.wait(); } } else { for (size_t i = 0; i < size; ++i, ++it) { lambda(i, it); } } return images; }

So now we have a bigger room for error in future refactoring here and the important part (checking the setting) is less visible in the code as well.

Also, consider how would this code look like if had std::transform with ExecutionPolicy available. I think it'd look about the same as the current implementation with map and parallel_map.

I personally think the new code, even with the little bit of code duplication, reads a lot easier than the templated signature of the (parallel_)map functions. I personally wouldn't bother with the size > 0 check (I expect that creating an empty task group and waiting on an empty task group is essentially free). Anyway, I don't feel super strongly about it, so go with whatever you think is best.

FWIW I don't expect us to ever migrate this code to std::transform because we definitely want to limit the parallelism by using the debugger's thread pool.

@DmT021 I didn't want to merge the PR if you were still considering Jonas' opinion on this approach. Did you want to change the PR or are you fine with it as-is. Jonas says the choice is yours.

@jasonmolenda @JDevlieghere I decided to inline the helpers. I still think we are mixing important and utilitary code here and it's less than ideal. But at the same time, I fully understand the readability issue. What's also important here the signatures of the map functions aren't ideal to be considered "general purpose helpers". Namely, they expect std::distance to be O(1) which isn't always the case with the current template constraints. So I think if we are going to use the same technique in other places (e.g. Target::SetExecutableModule and other dynamic loader plugins) - we may want to write better (more general) implementations for these helpers in some common place.

PS Sorry for the long answer, it took me a while to come up with the decision here.

jasonmolenda · 2024-10-15T20:25:39Z

Very nice! And now we'll see what all the CI bots think of this change. :)

This change enables `DynamicLoaderDarwin` to load modules in parallel using the thread pool. This new behavior is controlled by a new setting `plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`, which is enabled by default. When disabled, DynamicLoaderDarwin will load modules sequentially as before.

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by llvm#110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks.

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by #110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. ``` Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs ``` We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks.

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by llvm#110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. ``` Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs ``` We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks.

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by #110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. ``` Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs ``` We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks.

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by llvm#110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. ``` Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs ``` We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks. (cherry picked from commit a57296a)

Create dependent modules in parallel in Target::SetExecutableModule. This change was inspired by llvm#110646 which takes the same approach when attaching. Jason suggested we could use the same approach when you create a target in LLDB. I used Slack for benchmarking, which loads 902 images. ``` Benchmark 1: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 1.225 s ± 0.003 s [User: 3.977 s, System: 1.521 s] Range (min … max): 1.220 s … 1.229 s 10 runs Benchmark 2: ./bin/lldb /Applications/Slack.app/Contents/MacOS/Slack Time (mean ± σ): 3.253 s ± 0.037 s [User: 3.013 s, System: 0.248 s] Range (min … max): 3.211 s … 3.310 s 10 runs ``` We see about a 2x speedup, which matches what Jason saw for the attach scenario. I also ran this under TSan to confirm this doesn't introduce any races or deadlocks.

Release note llvm#110646 and llvm#114507.

Release note #110646 and #114507.

This patch improves LLDB launch time on Linux machines for **preload scenarios**, particularly for executables with a lot of shared library dependencies (or modules). Specifically: * Launching a binary with `target.preload-symbols = true` * Attaching to a process with `target.preload-symbols = true`. It's completely controlled by a new flag added in the first commit `plugin.dynamic-loader.posix-dyld.parallel-module-load`, which *defaults to false*. This was inspired by similar work on Darwin #110646. Some rough numbers to showcase perf improvement, run on a very beefy machine: * Executable with ~5600 modules: baseline 45s, improvement 15s * Executable with ~3800 modules: baseline 25s, improvement 10s * Executable with ~6650 modules: baseline 67s, improvement 20s * Executable with ~12500 modules: baseline 185s, improvement 85s * Executable with ~14700 modules: baseline 235s, improvement 120s A lot of targets we deal with have a *ton* of modules, and unfortunately we're unable to convince other folks to reduce the number of modules, so performance improvements like this can be very impactful for user experience. This patch achieves the performance improvement by parallelizing `DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and `DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario. The commits have some context on their specific changes as well -- hopefully this helps the review. # More context on implementation We discovered the bottlenecks by via `perf record -g -p <lldb's pid>` on a Linux machine. With an executable known to have 1000s of shared library dependencies, I ran ``` (lldb) b main (lldb) r # taking a while ``` and showed the resulting perf trace (snippet shown) ``` Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812 Children Self Command Shared Object Symbol - 93.54% 0.00% intern-state libc.so.6 [.] clone3 clone3 start_thread lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) r std::_Function_handler<void* (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&) lldb_private::Process::RunPrivateStateThread(bool) n - lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&) - 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) - 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event*) - lldb_private::Thread::ShouldStop(lldb_private::Event*) * - 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*) t - 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*) i lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*) k lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*) b lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long, unsigned long) i DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long, unsigned lo - DynamicLoaderPOSIXDYLD::RefreshModules() O - 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const u - 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bools - lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boos - 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&) o - 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status* - 77.89% lldb_private::Module::PreloadSymbols() - 44.06% lldb_private::Symtab::PreloadSymbols() - 43.66% lldb_private::Symtab::InitNameIndexes() ... ``` We saw that majority of time was spent in `RefreshModules`, with the main culprit within it `LoadModuleAtAddress` which eventually calls `PreloadSymbols`. At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly independent -- most of it deals with different files and then getting or creating Modules from these files. The portions that aren't independent seem to deal with ModuleLists, which appear concurrency safe. There were members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely `m_loaded_modules` which `DynamicLoaderPOSIXDYLD` maintains to map its loaded modules to their link addresses. Without synchronizing this, I ran into SEGFAULTS and other issues when running `check-lldb`. I also locked the assignment and comparison of `m_interpreter_module`, which may be unnecessary. # Alternate implementations When creating this patch, another implementation I considered was directly background-ing the call to `Module::PreloadSymbol` in `Target::GetOrCreateModule`. It would have the added benefit of working across platforms generically, and appeared to be concurrency safe. It was done via `Debugger::GetThreadPool().async` directly. However, there were a ton of concurrency issues, so I abandoned that approach for now. # Testing With the feature active, I tested via `ninja check-lldb` on both Debug and Release builds several times (~5 or 6 altogether?), and didn't spot additional failing or flaky tests. I also tested manually on several different binaries, some with around 14000 modules, but just basic operations: launching, reaching main, setting breakpoint, stepping, showing some backtraces. I've also tested with the flag off just to make sure things behave properly synchronously.

@JDevlieghere

#134437) A requested follow-up from #130912 by @JDevlieghere to control Darwin parallel image loading with the same `target.parallel-module-load` that controls the POSIX dyld parallel image loading. Darwin parallel image loading was introduced by #110646. This small change: * removes `plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load` and associated code. * changes setting call site in `DynamicLoaderDarwin::PreloadModulesFromImageInfos` to use the new setting. Tested by running `ninja check-lldb` and loading some targets. Co-authored-by: Tom Yang <[email protected]>

@JDevlieghere

…-module-load (#134437) A requested follow-up from llvm/llvm-project#130912 by @JDevlieghere to control Darwin parallel image loading with the same `target.parallel-module-load` that controls the POSIX dyld parallel image loading. Darwin parallel image loading was introduced by llvm/llvm-project#110646. This small change: * removes `plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load` and associated code. * changes setting call site in `DynamicLoaderDarwin::PreloadModulesFromImageInfos` to use the new setting. Tested by running `ninja check-lldb` and loading some targets. Co-authored-by: Tom Yang <[email protected]>

DmT021 changed the title ~~Wp/dynamic loader darwin parallel image load with preload main~~ DynamicLoaderDarwin load images in parallel with preload Oct 1, 2024

DmT021 mentioned this pull request Oct 1, 2024

DynamicLoaderDarwin load images in parallel #110439

Closed

jasonmolenda self-requested a review October 6, 2024 03:12

DmT021 force-pushed the wp/DynamicLoaderDarwin_parallel_image_load_with_preload_main branch 2 times, most recently from 0c4780e to f6cffb4 Compare October 11, 2024 01:33

jasonmolenda approved these changes Oct 14, 2024

View reviewed changes

DmT021 marked this pull request as ready for review October 14, 2024 16:35

DmT021 requested a review from JDevlieghere as a code owner October 14, 2024 16:35

llvmbot added the lldb label Oct 14, 2024

DmT021 added 2 commits October 14, 2024 18:35

DynamicLoaderDarwin load images in parallel

64a4d69

DoInitialImageFetch routine preloads modules explicitly

c8f133b

DmT021 force-pushed the wp/DynamicLoaderDarwin_parallel_image_load_with_preload_main branch from f6cffb4 to c8f133b Compare October 14, 2024 16:35

JDevlieghere reviewed Oct 14, 2024

View reviewed changes

Inline map and parallel_map into PreloadModulesFromImageInfos

cea432f

jasonmolenda merged commit 5f2cf99 into llvm:main Oct 15, 2024
7 checks passed

DmT021 mentioned this pull request Oct 18, 2024

[🍒 stable/20240723] DynamicLoaderDarwin load images in parallel with preload swiftlang/llvm-project#9445

Merged

JDevlieghere mentioned this pull request Nov 1, 2024

[lldb] Create dependent modules in parallel #114507

Merged

JDevlieghere mentioned this pull request Nov 4, 2024

[lldb] Create dependent modules in parallel (#114507) swiftlang/llvm-project#9508

Merged

JDevlieghere mentioned this pull request Nov 19, 2024

Add release note for parallel module creation in LLDB #116857

Merged

JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Nov 19, 2024

Add release note for parallel module creation in LLDB

4dafd39

Release note llvm#110646 and llvm#114507.

JDevlieghere added a commit that referenced this pull request Nov 20, 2024

Add release note for parallel module creation in LLDB (#116857)

4acf935

Release note #110646 and #114507.

JDevlieghere mentioned this pull request Mar 14, 2025

Parallelize module loading in POSIX dyld code #130912

Merged

zhyty mentioned this pull request Apr 4, 2025

control Darwin parallel image loading with target.parallel-module-load #134437

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DynamicLoaderDarwin load images in parallel with preload #110646

DynamicLoaderDarwin load images in parallel with preload #110646

DmT021 commented Oct 1, 2024 •

edited

Loading

github-actions bot commented Oct 1, 2024 •

edited

Loading

jasonmolenda left a comment

llvmbot commented Oct 14, 2024

DmT021 commented Oct 14, 2024

jasonmolenda commented Oct 14, 2024

JDevlieghere Oct 14, 2024

DmT021 Oct 14, 2024

JDevlieghere Oct 14, 2024

jasonmolenda Oct 15, 2024

DmT021 Oct 15, 2024

jasonmolenda commented Oct 15, 2024

DynamicLoaderDarwin load images in parallel with preload #110646

DynamicLoaderDarwin load images in parallel with preload #110646

Conversation

DmT021 commented Oct 1, 2024 • edited Loading

github-actions bot commented Oct 1, 2024 • edited Loading

jasonmolenda left a comment

Choose a reason for hiding this comment

llvmbot commented Oct 14, 2024

DmT021 commented Oct 14, 2024

jasonmolenda commented Oct 14, 2024

JDevlieghere Oct 14, 2024

Choose a reason for hiding this comment

DmT021 Oct 14, 2024

Choose a reason for hiding this comment

JDevlieghere Oct 14, 2024

Choose a reason for hiding this comment

jasonmolenda Oct 15, 2024

Choose a reason for hiding this comment

DmT021 Oct 15, 2024

Choose a reason for hiding this comment

jasonmolenda commented Oct 15, 2024

DmT021 commented Oct 1, 2024 •

edited

Loading

github-actions bot commented Oct 1, 2024 •

edited

Loading