feat: Update to Pytorch 2.0 #1528

peri044 · 2022-12-05T16:40:38Z

Signed-off-by: Dheeraj Peri [email protected]

Description

Upgrade to Pyt 2.0 (which is 1.14)

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR in so that relevant reviewers are notified

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to Python style guidelines

github-actions

Code conforms to C++ style guidelines

Signed-off-by: Dheeraj Peri <[email protected]>

peri044 · 2022-12-17T00:49:08Z

@frank-wei We are observing FX test failures with latest pytorch upgrade. Any insights into this ? Thanks !!
For test_flatten_aten tests, it looks like the op signature changed maybe ?
Error is

AssertionError: False is not true : expected ops {<OpOverload(op='aten._reshape_alias', overload='default')>}, actuall ops {<OpOverload(op='aten.view', overload='default')>}

https://github.com/pytorch/TensorRT/blob/master/py/torch_tensorrt/fx/test/converters/aten_op/test_flatten_aten.py#L30 needs to be changed ?

frank-wei · 2022-12-17T05:14:07Z

@frank-wei We are observing FX test failures with latest pytorch upgrade. Any insights into this ? Thanks !! For test_flatten_aten tests, it looks like the op signature changed maybe ? Error is
AssertionError: False is not true : expected ops {<OpOverload(op='aten._reshape_alias', overload='default')>}, actuall ops {<OpOverload(op='aten.view', overload='default')>}
https://github.com/pytorch/TensorRT/blob/master/py/torch_tensorrt/fx/test/converters/aten_op/test_flatten_aten.py#L30 needs to be changed ?

These are PT2.0 tracer based tests. The reason is that PT2.0 changed the aten._reshape_alias to aten.view. I will fix those tests.

frank-wei · 2022-12-17T06:50:54Z

Should be fixed in #1559

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2022-12-17 20:08:18.017950 +0000
+++ tools/perf/perf_run.py	2022-12-17 20:08:36.883639 +0000
@@ -614,14 +614,22 @@
                + "(among the following options vgg16|resnet50|efficientnet_b0|vit) "
                + "or provide a torch model file"
            )

        if use_dynamo and (model_torch is None):
-            raise ValueError("No Pytorch model (nn.Module) is provided for torchdynamo compilation. Please provide a pytorch model")
+            raise ValueError(
+                "No Pytorch model (nn.Module) is provided for torchdynamo compilation. Please provide a pytorch model"
+            )

        if use_dynamo and model_torch:
-            model_torch = torch.compile(model_torch, "default", dynamic=False, fullgraph=False, backend=dynamo_backend)
+            model_torch = torch.compile(
+                model_torch,
+                "default",
+                dynamic=False,
+                fullgraph=False,
+                backend=dynamo_backend,
+            )

        backends = parse_backends(params["backends"])
        truncate_long_and_double = params["truncate"]
        batch_size = params["batch_size"]
        is_trt_engine = params["is_trt_engine"]

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to Python style guidelines

- Fix compilation error for GPT-2 model arising from Byte-type inputs fed into TensorRT Engine - Update translation dictionary between Torch and TensorRT types to include `at::kByte` - Add field to PartitioningInfo specifying whether to cast Int8 inputs to TensorRT Engines to Int, to avoid error arising from Int8 inputs being fed into non-quantized engines - Add automatic detection of quantized/calibrated models and disable Int8 => Int32 casting in those cases - Fix bug where LoweringInfo target device was not being updated for Python API - Allow `castNode` to force creation of a new node and avoid searching for an existing one to convert - Add test to ensure cast is inserted in the Torch engine preceding a TensorRT engine, when the Byte tensor is an output of the Torch engine

Signed-off-by: Dheeraj Peri <[email protected]>

- Address review comments - Improve documentation and logging messages - Restructure casting function to allow for casting of variable data types - Add casting for `at::kByte` segment block inputs as well as segment block outputs

fix: Properly cast intermediate Int8 tensors to TensorRT Engines in Fallback

Signed-off-by: Torch-TensorRT Github Bot <[email protected]>

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2022-12-28 21:02:32.955162 +0000
+++ tools/perf/perf_run.py	2022-12-28 21:02:49.724464 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

github-actions

Code conforms to C++ style guidelines

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2022-12-28 21:05:04.918867 +0000
+++ tools/perf/perf_run.py	2022-12-28 21:05:22.129868 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2022-12-30 03:03:32.906616 +0000
+++ tools/perf/perf_run.py	2022-12-30 03:03:50.481014 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2023-01-03 20:57:19.467070 +0000
+++ tools/perf/perf_run.py	2023-01-03 20:57:40.994896 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2023-01-03 21:46:18.769567 +0000
+++ tools/perf/perf_run.py	2023-01-03 21:46:36.026528 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

There are some changes that do not conform to Python style guidelines:

--- tools/perf/perf_run.py	2023-01-03 21:47:55.345180 +0000
+++ tools/perf/perf_run.py	2023-01-03 21:48:18.667455 +0000
@@ -15,10 +15,11 @@
import pandas as pd

# Importing supported Backends
import torch
import torch_tensorrt as torchtrt
+
# from torch_tensorrt.fx.lower import compile
# from torch_tensorrt.fx.utils import LowerPrecision

import tensorrt as trt
from utils import (
@@ -142,11 +143,11 @@
    start_compile = time.time_ns()
    model = torchtrt.compile(
        model,
        ir="fx",
        inputs=input_tensors,
-        enabled_precisions={torch.float16 if precision=="fp16" else torch.float32},
+        enabled_precisions={torch.float16 if precision == "fp16" else torch.float32},
    )
    end_compile = time.time_ns()
    compile_time_ms = (end_compile - start_compile) / 1e6

    iters = params.get("iterations", 20)
@@ -167,16 +168,24 @@
            meas_time = end_time - start_time
            timings.append(meas_time)

    recordStats("FX-TensorRT", timings, precision, batch_size, compile_time_ms)

+
def run_dynamo(model, input_tensors, params, precision, batch_size):
    dynamo_backend = params["dynamo_backend"]
-    print("Running Dynamo with backend: ", dynamo_backend, " for precision: ", precision, " batch_size : ", batch_size)
+    print(
+        "Running Dynamo with backend: ",
+        dynamo_backend,
+        " for precision: ",
+        precision,
+        " batch_size : ",
+        batch_size,
+    )

    if precision == "fp16":
-       input_tensors = [tensor.half() for tensor in input_tensors]
+        input_tensors = [tensor.half() for tensor in input_tensors]

    fp16_mode = True if precision == "fp16" else False
    # dynamo_backend_params = {"fp16_mode" : fp16_mode}
    # model = torch.compile(
    #     model,
@@ -185,10 +194,11 @@
    #     fullgraph=False,
    #     backend=dynamo_backend,
    #     # **dynamo_backend_params
    # )
    import torch._dynamo as dynamo
+
    model = dynamo.optimize(dynamo_backend, nopython=True)(model)
    # Compile and measure the time
    with torch.no_grad():
        start_compile = time.time_ns()
        features = model(*input_tensors)
@@ -217,11 +227,14 @@
            torch.cuda.synchronize()
            end_time = timeit.default_timer()
            meas_time = end_time - start_time
            timings.append(meas_time)

-    recordStats("Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms)
+    recordStats(
+        "Dynamo-" + dynamo_backend, timings, precision, batch_size, compile_time_ms
+    )
+

def torch_dtype_from_trt(dtype):
    if dtype == trt.int8:
        return torch.int8
    elif dtype == trt.bool:

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions

Code conforms to C++ style guidelines

github-actions

Code conforms to Python style guidelines

chore: Update to Pytorch 2.0

32328c2

Signed-off-by: Dheeraj Peri <[email protected]>

facebook-github-bot added the cla signed label Dec 5, 2022

github-actions bot added component: api [Python] Issues re: Python API component: build system Issues re: Build system component: tests Issues re: Tests labels Dec 5, 2022

github-actions bot requested a review from narendasan December 5, 2022 16:41

Merge branch 'clone_cuda11.7_update' into pyt2.0

c9e957d

github-actions bot approved these changes Dec 5, 2022

View reviewed changes

chore: Fix torchvision dependency

5fc2b95

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Dec 5, 2022

View reviewed changes

Merge branch 'clone_cuda11.7_update' into pyt2.0

e1c7abf

github-actions bot approved these changes Dec 13, 2022

View reviewed changes

chore: Add dynamo support

f38eb76

Signed-off-by: Dheeraj Peri <[email protected]>

frank-wei mentioned this pull request Dec 17, 2022

Update test_flatten_aten and test_reshape_aten due to PT2.0 changed tracer behavior for these ops #1559

Merged

7 tasks

Merge branch 'clone_cuda11.7_update' into pyt2.0

3c92d82

github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: core Issues re: The core compiler component: fx documentation Improvements or additions to documentation labels Dec 17, 2022

github-actions bot approved these changes Dec 17, 2022

View reviewed changes

github-actions bot requested changes Dec 17, 2022

View reviewed changes

chore: linter fixes

9131695

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Dec 17, 2022

View reviewed changes

gs-olive and others added 7 commits December 20, 2022 20:52

chore: updates to dynamo in perf_run

925c76b

Signed-off-by: Dheeraj Peri <[email protected]>

fix: Improve logging, restructure casting function

d74e0b5

- Address review comments - Improve documentation and logging messages - Restructure casting function to allow for casting of variable data types - Add casting for `at::kByte` segment block inputs as well as segment block outputs

Merge pull request #1549 from gs-olive/gpt_2_bugfix

544654f

fix: Properly cast intermediate Int8 tensors to TensorRT Engines in Fallback

docs: [Automated] Regenerating documenation for 544654f

5fa6374

Signed-off-by: Torch-TensorRT Github Bot <[email protected]>

Merge branch 'master' into pyt2.0

09fa2ad

chore: upgrade to latest 2.0

52edcc4

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot added component: api [C++] Issues re: C++ API component: partitioning labels Dec 28, 2022

github-actions bot requested changes Dec 28, 2022

View reviewed changes

github-actions bot approved these changes Dec 28, 2022

View reviewed changes

chore: Update pytorch version in circle ci

bd7f616

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Dec 28, 2022

View reviewed changes

github-actions bot requested changes Dec 28, 2022

View reviewed changes

chore: minor update

2ace0e3

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Dec 30, 2022

View reviewed changes

github-actions bot requested changes Dec 30, 2022

View reviewed changes

chore: explicitly set cudnn version during pip installation of TRT

3a8a7ae

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Jan 3, 2023

View reviewed changes

github-actions bot requested changes Jan 3, 2023

View reviewed changes

chore: updates to latest nightly

0512141

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Jan 3, 2023

View reviewed changes

github-actions bot requested changes Jan 3, 2023

View reviewed changes

chore: unmodify docs

5923e6b

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Jan 3, 2023

View reviewed changes

github-actions bot requested changes Jan 3, 2023

View reviewed changes

chore: linter fixes

ec06d6f

Signed-off-by: Dheeraj Peri <[email protected]>

github-actions bot approved these changes Jan 3, 2023

View reviewed changes

peri044 merged commit 358fae1 into clone_cuda11.7_update Jan 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Update to Pytorch 2.0 #1528

feat: Update to Pytorch 2.0 #1528

peri044 commented Dec 5, 2022

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

peri044 commented Dec 17, 2022 •

edited

Loading

frank-wei commented Dec 17, 2022 •

edited

Loading

frank-wei commented Dec 17, 2022

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

github-actions bot left a comment

feat: Update to Pytorch 2.0 #1528

feat: Update to Pytorch 2.0 #1528

Conversation

peri044 commented Dec 5, 2022

Description

Type of change

Checklist:

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

peri044 commented Dec 17, 2022 • edited Loading

frank-wei commented Dec 17, 2022 • edited Loading

frank-wei commented Dec 17, 2022

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

peri044 commented Dec 17, 2022 •

edited

Loading

frank-wei commented Dec 17, 2022 •

edited

Loading