
Commit af34157

add xpu support
Squashed commit messages:
update typos and bug fixes
xpu seeding PR1
add seeding for pytorch utilities
mp_fabric xpu forking
xpu multiprocess pytorch
add header for xpu
rename change to lightning.pytorch
Teardown from lightning-xpu (from #PR- 3) From Lightning-AI#3
add torch.xpu.stream to ddp
update docs
update _LIGHTNING_XPU_AVAILABLE to _lightning_xpu_available
correct fabric imports.py
1. remove xpu.py from _graveyard
2. correct _lightning_xpu_available() usage
fix _try_import function not defined issue in fabric
add docs
[pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
1 parent e7afe04 commit af34157

File tree

28 files changed: +378 additions, -58 deletions


docs/source-fabric/fundamentals/launch.rst

Lines changed: 2 additions & 1 deletion
@@ -93,8 +93,9 @@ This is essentially the same as running ``python path/to/your/script.py``, but i
     itself and are expected to be parsed there.
 
     Options:
-      --accelerator [cpu|gpu|cuda|mps|tpu]
+      --accelerator [cpu|gpu|cuda|mps|tpu|xpu]
                               The hardware accelerator to run on.
+                              Install Lightning-XPU to enable ``xpu``.
       --strategy [ddp|dp|deepspeed]   Strategy for how to run across multiple
                               devices.
       --devices TEXT          Number of devices to run on (``int``), which
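
The ``xpu`` choice is only advertised when the optional Lightning-XPU plugin is installed. As a rough illustration (not part of this diff), the same accelerator can also be requested programmatically through Fabric, assuming the plugin registers the ``xpu`` name:

    # Sketch only: assumes the optional lightning-xpu plugin is installed,
    # which registers the "xpu" accelerator name used below.
    from lightning.fabric import Fabric

    fabric = Fabric(accelerator="xpu", devices=2)
    fabric.launch()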

docs/source-pytorch/common/index.rst

Lines changed: 9 additions & 1 deletion
@@ -17,7 +17,8 @@
     ../advanced/model_parallel
     Train on single or multiple GPUs <../accelerators/gpu>
     Train on single or multiple HPUs <../integrations/hpu/index>
-    Train on single or multiple IPUs <../integrations/ipu/index>
+    Train on single or multiple XPUs <../integrations/xpu/index>
+    Train on single or multiple IPUs <../accelerators/ipu>
     Train on single or multiple TPUs <../accelerators/tpu>
     Train on MPS <../accelerators/mps>
     Use a pretrained model <../advanced/pretrained>
@@ -168,6 +169,13 @@ How-to Guides
    :col_css: col-md-4
    :height: 180
 
+.. displayitem::
+   :header: Train on single or multiple XPUs
+   :description: Train models faster with XPU accelerators
+   :button_link: ../integrations/xpu/index.html
+   :col_css: col-md-4
+   :height: 180
+
 .. displayitem::
    :header: Train on single or multiple IPUs
    :description: Train models faster with IPU accelerators

docs/source-pytorch/common_usecases.rst

Lines changed: 7 additions & 0 deletions
@@ -133,6 +133,13 @@ Customize and extend Lightning for things like custom hardware or distributed st
    :button_link: integrations/hpu/index.html
    :height: 100
 
+.. displayitem::
+   :header: Train on single or multiple XPUs
+   :description: Train models faster with XPUs.
+   :col_css: col-md-12
+   :button_link: integrations/xpu/index.html
+   :height: 100
+
 .. displayitem::
    :header: Train on single or multiple IPUs
    :description: Train models faster with IPUs.

docs/source-pytorch/conf.py

Lines changed: 6 additions & 0 deletions
@@ -94,6 +94,11 @@ def _load_py_module(name: str, location: str) -> ModuleType:
     target_dir="docs/source-pytorch/integrations/hpu",
     checkout="tags/1.1.0",
 )
+assist_local.AssistantCLI.pull_docs_files(
+    gh_user_repo="Lightning-AI/lightning-XPU",
+    target_dir="docs/source-pytorch/integrations/xpu",
+    checkout="tags/1.0.0",
+)
 assist_local.AssistantCLI.pull_docs_files(
     gh_user_repo="Lightning-AI/lightning-Graphcore",
     target_dir="docs/source-pytorch/integrations/ipu",
@@ -334,6 +339,7 @@ def _load_py_module(name: str, location: str) -> ModuleType:
     "torchmetrics": ("https://torchmetrics.readthedocs.io/en/stable/", None),
     "graphcore": ("https://docs.graphcore.ai/en/latest/", None),
     "lightning_habana": ("https://lightning-ai.github.io/lightning-Habana/", None),
+    "intel-xpu": ("https://lightning-ai.github.io/lightning-XPU/", None),
     "tensorboardX": ("https://tensorboardx.readthedocs.io/en/stable/", None),
     # needed for referencing App from lightning scope
     "lightning.app": ("https://lightning.ai/docs/app/stable/", None),

docs/source-pytorch/extensions/accelerator.rst

Lines changed: 16 additions & 15 deletions
@@ -12,6 +12,7 @@ Currently there are accelerators for:
 - :doc:`TPU <../accelerators/tpu>`
 - :doc:`IPU <../integrations/ipu/index>`
 - :doc:`HPU <../integrations/hpu/index>`
+- :doc:`XPU <../integrations/xpu/index>`
 - :doc:`MPS <../accelerators/mps>`
 
 The Accelerator is part of the Strategy which manages communication across multiple devices (distributed communication).
@@ -32,16 +33,16 @@ Create a Custom Accelerator
 .. warning:: This is an :ref:`experimental <versioning:Experimental API>` feature.
 
 Here is how you create a new Accelerator.
-Let's pretend we want to integrate the fictional XPU accelerator and we have access to its hardware through a library
-``xpulib``.
+Let's pretend we want to integrate the fictional YPU accelerator and we have access to its hardware through a library
+``ypulib``.
 
 .. code-block:: python
 
-    import xpulib
+    import ypulib
 
 
-    class XPUAccelerator(Accelerator):
-        """Support for a hypothetical XPU, optimized for large-scale machine learning."""
+    class YPUAccelerator(Accelerator):
+        """Support for a hypothetical YPU, optimized for large-scale machine learning."""
 
         @staticmethod
         def parse_devices(devices: Any) -> Any:
@@ -52,29 +53,29 @@ Let's pretend we want to integrate the fictional XPU accelerator and we have acc
         @staticmethod
         def get_parallel_devices(devices: Any) -> Any:
             # Here, convert the device indices to actual device objects
-            return [torch.device("xpu", idx) for idx in devices]
+            return [torch.device("ypu", idx) for idx in devices]
 
         @staticmethod
         def auto_device_count() -> int:
             # Return a value for auto-device selection when `Trainer(devices="auto")`
-            return xpulib.available_devices()
+            return ypulib.available_devices()
 
         @staticmethod
         def is_available() -> bool:
-            return xpulib.is_available()
+            return ypulib.is_available()
 
         def get_device_stats(self, device: Union[str, torch.device]) -> Dict[str, Any]:
             # Return optional device statistics for loggers
             return {}
 
 
-Finally, add the XPUAccelerator to the Trainer:
+Finally, add the YPUAccelerator to the Trainer:
 
 .. code-block:: python
 
     from lightning.pytorch import Trainer
 
-    accelerator = XPUAccelerator()
+    accelerator = YPUAccelerator()
     trainer = Trainer(accelerator=accelerator, devices=2)
 
 
@@ -90,28 +91,28 @@ If you wish to switch to a custom accelerator from the CLI without code changes,
 
 .. code-block:: python
 
-    class XPUAccelerator(Accelerator):
+    class YPUAccelerator(Accelerator):
         ...
 
         @classmethod
         def register_accelerators(cls, accelerator_registry):
             accelerator_registry.register(
-                "xpu",
+                "ypu",
                 cls,
-                description=f"XPU Accelerator - optimized for large-scale machine learning.",
+                description=f"YPU Accelerator - optimized for large-scale machine learning.",
             )
 
 Now, this is possible:
 
 .. code-block:: python
 
-    trainer = Trainer(accelerator="xpu")
+    trainer = Trainer(accelerator="ypu")
 
 Or if you are using the Lightning CLI, for example:
 
 .. code-block:: bash
 
-    python train.py fit --trainer.accelerator=xpu --trainer.devices=2
+    python train.py fit --trainer.accelerator=ypu --trainer.devices=2
 
 
 ----------
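
As a brief follow-up to the renamed example above, here is a hedged sketch (not part of this diff) of how the ``is_available()`` hook can guard Trainer construction; ``YPUAccelerator`` and ``ypulib`` are the fictional names from the documentation, not a real library:

    # Sketch only: YPUAccelerator is the fictional example class from the docs
    # above; fall back to CPU when its hypothetical hardware is not present.
    from lightning.pytorch import Trainer

    if YPUAccelerator.is_available():
        trainer = Trainer(accelerator=YPUAccelerator(), devices=2)
    else:
        trainer = Trainer(accelerator="cpu", devices=1)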

docs/source-pytorch/glossary/index.rst

Lines changed: 8 additions & 0 deletions
@@ -18,6 +18,7 @@
    GPU <../accelerators/gpu>
    Half precision <../common/precision>
    HPU <../integrations/hpu/index>
+   XPU <../integrations/xpu/index>
    Inference <../deploy/production_intermediate>
    IPU <../integrations/ipu/index>
    Lightning CLI <../cli/lightning_cli>
@@ -161,6 +162,13 @@ Glossary
    :button_link: ../integrations/hpu/index.html
    :height: 100
 
+.. displayitem::
+   :header: XPU
+   :description: Intel® Graphics Cards for faster training
+   :col_css: col-md-12
+   :button_link: ../integrations/xpu/index.html
+   :height: 100
+
 .. displayitem::
    :header: Inference
    :description: Making predictions by applying a trained model to unlabeled examples
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+.. _xpu:
+
+Accelerator: XPU training
+=========================
+
+.. raw:: html
+
+    <div class="display-card-container">
+        <div class="row">
+
+.. Add callout items below this line
+
+.. displayitem::
+   :header: Basic
+   :description: Learn the basics of single and multi-XPU core training.
+   :col_css: col-md-4
+   :button_link: basic.html
+   :height: 150
+   :tag: basic
+
+.. displayitem::
+   :header: Intermediate
+   :description: Enable state-of-the-art scaling with advanced mixed-precision settings.
+   :col_css: col-md-4
+   :button_link: intermediate.html
+   :height: 150
+   :tag: intermediate
+
+.. displayitem::
+   :header: Advanced
+   :description: Explore state-of-the-art scaling with additional advanced configurations.
+   :col_css: col-md-4
+   :button_link: advanced.html
+   :height: 150
+   :tag: advanced
+
+.. raw:: html
+
+        </div>
+    </div>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+:orphan:
+
+######################
+Level 19: Explore XPUs
+######################
+
+Explore Intel® Graphics Cards (XPU) for model scaling.
+
+----
+
+.. raw:: html
+
+    <div class="display-card-container">
+        <div class="row">
+
+.. Add callout items below this line
+
+.. displayitem::
+   :header: Train models on XPUs
+   :description: Learn the basics of single and multi-XPU core training.
+   :col_css: col-md-6
+   :button_link: ../integrations/xpu/basic.html
+   :height: 150
+   :tag: basic
+
+.. displayitem::
+   :header: Optimize model training on XPUs
+   :description: Enable state-of-the-art scaling with advanced mixed-precision settings.
+   :col_css: col-md-6
+   :button_link: ../integrations/xpu/intermediate.html
+   :height: 150
+   :tag: intermediate
+
+.. raw:: html
+
+    </div>
+    </div>
Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
 # validation HPU connectors
 lightning-habana >=1.0.0
 lightning-graphcore >=0.1.0.rc4
+
+# validation XPU connectors
+lightning-xpu >=0.1.0

src/lightning/fabric/accelerators/__init__.py

Lines changed: 10 additions & 0 deletions
@@ -22,3 +22,13 @@
 
 ACCELERATOR_REGISTRY = _AcceleratorRegistry()
 _register_classes(ACCELERATOR_REGISTRY, "register_accelerators", sys.modules[__name__], Accelerator)
+
+from lightning.fabric.utilities.imports import _lightning_xpu_available
+
+_ACCELERATORS_BASE_MODULE = "lightning.fabric.accelerators"
+ACCELERATOR_REGISTRY = _AcceleratorRegistry()
+call_register_accelerators(ACCELERATOR_REGISTRY, _ACCELERATORS_BASE_MODULE)
+if _lightning_xpu_available() and "xpu" not in ACCELERATOR_REGISTRY:
+    from lightning_xpu.fabric import XPUAccelerator
+
+    XPUAccelerator.register_accelerators(ACCELERATOR_REGISTRY)
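
Because registration is guarded by ``_lightning_xpu_available()``, downstream code can simply test the registry for the ``xpu`` entry. A minimal sketch (not part of this commit), using the same membership check as the guard above:

    # Sketch only: "xpu" appears in the registry only when the optional
    # lightning-xpu package is importable; otherwise registration is skipped.
    from lightning.fabric.accelerators import ACCELERATOR_REGISTRY

    print("xpu" in ACCELERATOR_REGISTRY)  # True only if lightning-xpu is installed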

src/lightning/fabric/cli.py

Lines changed: 9 additions & 2 deletions
@@ -25,12 +25,15 @@
 from lightning.fabric.strategies import STRATEGY_REGISTRY
 from lightning.fabric.utilities.device_parser import _parse_gpu_ids
 from lightning.fabric.utilities.distributed import _suggested_max_num_threads
+from lightning.fabric.utilities.imports import _lightning_xpu_available
 
 _log = logging.getLogger(__name__)
 
 _CLICK_AVAILABLE = RequirementCache("click")
 
-_SUPPORTED_ACCELERATORS = ("cpu", "gpu", "cuda", "mps", "tpu")
+_SUPPORTED_ACCELERATORS = ["cpu", "gpu", "cuda", "mps", "tpu"]
+if _lightning_xpu_available():
+    _SUPPORTED_ACCELERATORS.append("xpu")
 
 
 def _get_supported_strategies() -> List[str]:
@@ -149,13 +152,17 @@ def _set_env_variables(args: Namespace) -> None:
 def _get_num_processes(accelerator: str, devices: str) -> int:
     """Parse the `devices` argument to determine how many processes need to be launched on the current machine."""
     if accelerator == "gpu":
-        parsed_devices = _parse_gpu_ids(devices, include_cuda=True, include_mps=True)
+        parsed_devices = _parse_gpu_ids(devices, include_cuda=True, include_mps=True, include_xpu=True)
     elif accelerator == "cuda":
         parsed_devices = CUDAAccelerator.parse_devices(devices)
     elif accelerator == "mps":
         parsed_devices = MPSAccelerator.parse_devices(devices)
     elif accelerator == "tpu":
         raise ValueError("Launching processes for TPU through the CLI is not supported.")
+    elif accelerator == "xpu":
+        from lightning_xpu.fabric import XPUAccelerator
+
+        parsed_devices = XPUAccelerator.parse_devices(devices)
     else:
         return CPUAccelerator.parse_devices(devices)
     return len(parsed_devices) if parsed_devices is not None else 0
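
For context, a hedged sketch of what the new ``xpu`` branch yields, assuming Lightning-XPU is installed and ``lightning_xpu.fabric.XPUAccelerator.parse_devices`` mirrors the CUDA/MPS behavior of returning one entry per requested device:

    # Sketch only: exercises the new branch of the private helper shown above;
    # the result depends on what XPUAccelerator.parse_devices reports.
    from lightning.fabric.cli import _get_num_processes

    num_procs = _get_num_processes(accelerator="xpu", devices="2")
    print(num_procs)  # expected: 2 on a machine exposing two XPU devices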
