
Commit 1da576b

Merge branch 'main' into rs/docs-update-use-cases

2 parents: 46fd24c + 8423c8d

38 files changed: +1006 -173 lines

DEVELOPING.md (+1 -1)

@@ -16,7 +16,7 @@ limitations under the License.
 
 # Developing the DeepSparse Engine
 
-The DeepSparse Python API is developed and tested using Python 3.6-3.10.
+The DeepSparse Python API is developed and tested using Python 3.7-3.10.
 To develop the Python API, you will also need the development dependencies and to follow the styling guidelines.
 
 Here's some details to get started.
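For context on the hunk above, setting up a matching development environment is a short affair. A minimal sketch, assuming the editable `[dev]` extra and the `make` style targets that DEVELOPING.md describes elsewhere; the exact commands are assumptions, not part of this diff:

```bash
# Sketch of a dev setup per DEVELOPING.md; the [dev] extra and make
# targets are assumptions, not shown in this diff.
python3 --version                # should fall within 3.7-3.10 after this change
python3 -m venv venv && source venv/bin/activate
pip install -e "./[dev]"         # editable install with development dependencies
make style && make quality       # apply and verify the styling guidelines
```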

README.md (+11 -5)

@@ -17,7 +17,7 @@ limitations under the License.
 
 <div style="display: flex; flex-direction: column; align-items: center;">
 <h1>
-<img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" />
+<img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/old/source/icon-deepsparse.png" />
 &nbsp;&nbsp;DeepSparse
 </h1>
 <h4> An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application</h4>
@@ -56,10 +56,16 @@ limitations under the License.
 [DeepSparse](https://github.com/neuralmagic/deepsparse) is a CPU inference runtime that takes advantage of sparsity within neural networks to execute inference quickly. Coupled with [SparseML](https://github.com/neuralmagic/sparseml), an open-source optimization library, DeepSparse enables you to achieve GPU-class performance on commodity hardware.
 
 <p align="center">
-<img alt="NM Flow" src="https://github.com/neuralmagic/deepsparse/tree/main/docs/neural-magic-workflow.png" width="60%" />
+<img alt="NM Flow" src="https://github.com/neuralmagic/deepsparse/blob/7ee5e60f13b1fd321c5282c91e2873b3363ec911/docs/neural-magic-workflow.png" width="60%" />
 </p>
 
-For details of training a sparse model for deployment with DeepSparse, [check out SparseML](https://github.com/neuralmagic/sparseml).
+For details of training sparse models for deployment with DeepSparse, [check out SparseML](https://github.com/neuralmagic/sparseml).
+
+### ✨NEW✨ DeepSparse ARM Alpha 💪
+
+Neural Magic is bringing performant deep learning inference to ARM CPUs! In our recent product release, we launched alpha support for DeepSparse on AWS Graviton and Ampere. We are working towards a general release across ARM server, embedded, and mobile platforms in 2023.
+
+**If you would like to trial the alpha or want early access to the general release, [sign up for the waitlist](https://neuralmagic.com/deepsparse-arm-waitlist/).**
 
 ## Installation
 
@@ -212,13 +218,13 @@ DeepSparse offers different inference scenarios based on your use case.
 
 **Single-stream** scheduling: the latency/synchronous scenario, requests execute serially. [`default`]
 
-<img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/single-stream.png" alt="single stream diagram" />
+<img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/old/source/single-stream.png" alt="single stream diagram" />
 
 It's highly optimized for minimum per-request latency, using all of the system's resources provided to it on every request it gets.
 
 **Multi-stream** scheduling: the throughput/asynchronous scenario, requests execute in parallel.
 
-<img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/multi-stream.png" alt="multi stream diagram" />
+<img src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/old/source/multi-stream.png" alt="multi stream diagram" />
 
 The most common use cases for the multi-stream scheduler are where parallelism is low with respect to core count, and where requests need to be made asynchronously without time to batch them.
 
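For readers skimming the diff, the two scheduling modes above are selected when the engine is compiled. A minimal sketch, assuming the `scheduler` keyword of `deepsparse.compile_model` accepts the strings shown (per the 1.4.x API) and using an illustrative SparseZoo stub:

```python
# Sketch: choosing a scheduler at compile time; the stub and the scheduler
# strings are assumptions based on the DeepSparse API of this era.
import numpy as np
from deepsparse import compile_model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95-none"

# Latency scenario (the default): requests execute serially, each one
# using all cores provided to the engine.
engine = compile_model(stub, batch_size=1, scheduler="single_stream")

# Throughput scenario: uncomment to let concurrent requests run in parallel.
# engine = compile_model(stub, batch_size=1, scheduler="multi_stream")

inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32)]
outputs = engine(inputs)  # list of output numpy arrays
```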

docker/Dockerfile (+2 -2)

@@ -113,13 +113,13 @@ COPY --from=build $VENV $VENV
 COPY --from=build deepsparse deepsparse
 ENV PATH="${VENV}/bin:$PATH"
 HEALTHCHECK CMD python -c 'import deepsparse'
-ENTRYPOINT bash
+CMD bash
 
 
 FROM base as prod
 ARG VENV
 COPY --from=build $VENV $VENV
 ENV PATH="${VENV}/bin:$PATH"
 HEALTHCHECK CMD python -c 'import deepsparse'
-ENTRYPOINT bash
+CMD bash
 
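The substance of this change: a shell-form `ENTRYPOINT bash` discards whatever command is passed to `docker run`, whereas `CMD bash` is only a default that such a command replaces. A quick illustration (the image tag follows docker/README.md below):

```bash
# With CMD bash, a command passed to `docker run` replaces the default:
docker container run -it deepsparse_docker                             # interactive shell
docker container run deepsparse_docker python -c 'import deepsparse'  # runs python instead

# Under the old shell-form ENTRYPOINT bash, the python command above would
# have been ignored and the container would still have launched bash.
```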

docker/README.md (+8 -13)

@@ -7,15 +7,15 @@ This image is based off the latest official Python:3.8.16 image
 You can access the already built image detailed at https://github.com/orgs/neuralmagic/packages/container/package/deepsparse:
 
 ```bash
-docker pull ghcr.io/neuralmagic/deepsparse:1.4
-docker tag ghcr.io/neuralmagic/deepsparse:1.4 deepsparse_docker
+docker pull ghcr.io/neuralmagic/deepsparse:1.4.2
+docker tag ghcr.io/neuralmagic/deepsparse:1.4.2 deepsparse_docker
 ```
 
 ## Extend
 If you would like to customize the docker image, you can use the pre-built images as a base in your own `Dockerfile`:
 
 ```Dockerfile
-FROM ghcr.io/neuralmagic/deepsparse:1.4
+FROM ghcr.io/neuralmagic/deepsparse:1.4.2
 ...
 ```
 
@@ -34,15 +34,10 @@ docker build --build-arg BRANCH=main -t deepsparse_docker .
 We also provide the following pre-built images with all integration specific
 dependencies installed:
 
-| Image Name | Description |
-|--------------------------- |------------------------------------------------------------------------ |
-| deepsparse-base | Base DeepSparse image with no integration specific dependencies |
-| deepsparse/deepsparse-all | DeepSparse image with all major integration dependencies installed |
-| deepsparse-server | DeepSparse image with `[server]` dependencies installed |
-| deepsparse-transformers | DeepSparse image with all transformer dependencies installed |
-| deepsparse-torchvision | DeepSparse image with torchvision dependencies installed |
-| deepsparse-ultralytics | DeepSparse image with yolov5 and yolov8 dependencies installed |
-
+| Image Name | Description |
+|-------------------|----------------------------------------------------------------------|
+| deepsparse-base | Base DeepSparse image with no integration specific dependencies |
+| deepsparse | DeepSparse image with all major integration dependencies installed |
 
 To build a development image for a specific branch use the following command:
 
@@ -57,4 +52,4 @@ To run the container:
 
 ```bash
 docker container run -it deepsparse_docker
-```
+```
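To make the `Extend` hunk concrete, here is a hedged sketch of a derived image; the `[server]` extra, the `deepsparse.server` CLI flags, and the SparseZoo stub are assumptions drawn from the wider docs, not from this diff:

```Dockerfile
# Sketch: extending the pre-built 1.4.2 image; the extra, CLI flags, and
# model stub below are assumptions, not part of this diff.
FROM ghcr.io/neuralmagic/deepsparse:1.4.2
RUN pip install --no-cache-dir "deepsparse[server]"
EXPOSE 5543
# Serve an illustrative sparse question-answering model on the default port.
CMD ["deepsparse.server", "--task", "question_answering", "--model_path", "zoo:nlp/question_answering/obert-base/pytorch/huggingface/squad/pruned90_quant-none"]
```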

docs/use-cases/cv/embedding-extraction.md (+2 -1)

@@ -106,6 +106,7 @@ print(len(result["embeddings"][0][0]))
 
 ### Cross Use Case Functionality
 Check out the [Server User Guide](../../user-guide/deepsparse-server.md) for more details on configuring the Server.
+
 ## Using a Custom ONNX File
 Apart from using models from the SparseZoo, DeepSparse allows you to define custom ONNX files for embedding extraction.
 
@@ -132,4 +133,4 @@ rn50_embedding_pipeline = Pipeline.create(
 embedding = rn50_embedding_pipeline(images="lion.jpeg")
 print(len(embedding.embeddings[0][0]))
 # 2048
-```
+```
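The second hunk shows only the tail of the custom-ONNX example. As a hedged sketch of the full pattern, with the `base_task`, `emb_extraction_layer`, and local path being assumptions based on the surrounding use-case doc rather than this diff:

```python
# Sketch: CV embedding extraction from a custom ONNX file; the arguments
# below are assumptions from the use-case doc, not part of this diff.
from deepsparse import Pipeline

rn50_embedding_pipeline = Pipeline.create(
    task="embedding-extraction",
    base_task="image-classification",  # reuse image-classification pre-processing
    model_path="path/to/custom/model.onnx",
    emb_extraction_layer=-3,           # tap an intermediate layer as the embedding
)

embedding = rn50_embedding_pipeline(images="lion.jpeg")
print(len(embedding.embeddings[0][0]))
# 2048 for a ResNet-50-style backbone at this layer
```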

docs/use-cases/cv/image-classification.md (+1 -1)

@@ -284,4 +284,4 @@ pipeline = Pipeline.create(
 prediction = pipeline(images=["lion.jpeg"])
 print(prediction.labels)
 # [291]
-```
+```

docs/use-cases/cv/image-segmentation-yolact.md (+1 -1)

@@ -252,4 +252,4 @@ predictions = yolact_pipeline(images=images)
 # predictions has attributes `boxes`, `classes`, `masks` and `scores`
 predictions.classes[0]
 # [20,20, .......0, 0,24]
-```
+```

docs/use-cases/cv/object-detection-yolov5.md (+1 -2)

@@ -282,7 +282,6 @@ print(labels)
 # [['person', 'person', 'car', 'person', 'motorcycle', 'person', 'person', 'person', 'motorcycle', 'person']]
 ```
 
-
 ### Cross Use Case Functionality
 
 Check out the [Server User Guide](../../user-guide/deepsparse-server.md) for more details on configuring a Server.
@@ -321,4 +320,4 @@ print(pipeline_outputs.boxes)
 print(pipeline_outputs.labels)
 # [[[-0.8809833526611328, 5.1244752407073975, 27.885415077209473, 57.20366072654724], [-9.014896631240845, -2.4366320967674255, 21.488688468933105, 37.2245477437973], [14.241515636444092, 11.096746131777763, 30.164274215698242, 22.02291651070118], [7.107024908065796, 5.017698150128126, 15.09239387512207, 10.45704211294651]]]
 # [['8367.0', '1274.0', '8192.0', '6344.0']]
-```
+```

docs/use-cases/nlp/question-answering.md (+1 -1)

@@ -318,4 +318,4 @@ question = "What is DeepSparse?"
 output = qa_pipeline(question=question, context=q_context)
 print(output.answer)
 # sparsity-aware
-```
+```

docs/use-cases/nlp/text-classification.md (+1 -1)

@@ -439,4 +439,4 @@ sequences = ["I think DeepSparse Pipelines are awesome!"]
 prediction = pipeline(sequences)
 print(prediction)
 # labels=['LABEL_1'] scores=[0.9996163845062256]
-```
+```

docs/use-cases/nlp/token-classification.md (+2 -1)

@@ -273,6 +273,7 @@ print(resp.text)
 ### Cross Use Case Functionality
 
 Check out the [Server User Guide](../../user-guide/deepsparse-server.md) for more details on configuring the Server.
+
 ## Using a Custom ONNX File
 Apart from using models from the SparseZoo, DeepSparse allows you to deploy token classification pipelines with custom ONNX files.
 
@@ -300,4 +301,4 @@ pipeline = Pipeline.create(
 output = pipeline("Mary is flying from Nairobi to New York")
 print(output.predictions)
 # [[TokenClassificationResult(entity='B-PER', score=0.9971914291381836, word='mary', start=0, end=4, index=1, is_grouped=False), TokenClassificationResult(entity='B-LOC', score=0.9993892312049866, word='nairobi', start=20, end=27, index=5, is_grouped=False), TokenClassificationResult(entity='B-LOC', score=0.9993736147880554, word='new', start=31, end=34, index=7, is_grouped=False), TokenClassificationResult(entity='I-LOC', score=0.997299075126648, word='york', start=35, end=39, index=8, is_grouped=False)]]
-```
+```
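For context on the truncated `Pipeline.create(` call in the hunk above, a hedged sketch of a custom-ONNX token-classification pipeline; the deployment-directory layout is an assumption from the surrounding doc:

```python
# Sketch: token classification from a local deployment directory that holds
# model.onnx plus tokenizer/config files; the path is an assumption.
from deepsparse import Pipeline

pipeline = Pipeline.create(
    task="token_classification",
    model_path="./deployment",  # dir with model.onnx, tokenizer, config
)

output = pipeline("Mary is flying from Nairobi to New York")
print(output.predictions)  # per-token entities such as B-PER, B-LOC, I-LOC
```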

docs/use-cases/nlp/transformers-embedding-extraction.md (+2 -1)

@@ -209,6 +209,7 @@ print(len(result["embeddings"][0]))
 ### Cross Use Case Functionality
 
 Check out the [Server User Guide](../../user-guide/deepsparse-server.md) for more details on configuring the Server.
+
 ## Using a Custom ONNX File
 Apart from using models from the SparseZoo, DeepSparse allows you to deploy transformer embedding extraction pipelines with custom ONNX files.
 
@@ -241,4 +242,4 @@ input_sequence = "The generalized embedding extraction Pipeline is the best!"
 embedding = bert_emb_pipeline(input_sequence)
 print(len(embedding.embeddings[0]))
 # 98304
-```
+```
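As with the previous file, the `Pipeline.create(` call is truncated in the diff. A hedged sketch of the pattern, where the `extraction_strategy` value and path are assumptions; `per_token` would explain the output length of 98304 = 128 tokens x 768 dims:

```python
# Sketch: transformer embedding extraction; extraction_strategy and the
# model path are assumptions based on the use-case doc.
from deepsparse import Pipeline

bert_emb_pipeline = Pipeline.create(
    task="transformers_embedding_extraction",
    model_path="path/to/deployment",  # custom ONNX + tokenizer files
    extraction_strategy="per_token",  # 128 tokens x 768 dims = 98304 values
)

embedding = bert_emb_pipeline("The generalized embedding extraction Pipeline is the best!")
print(len(embedding.embeddings[0]))
# 98304
```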

docs/use-cases/nlp/zero-shot-text-classification.md (+1 -1)

@@ -284,4 +284,4 @@ pipeline = Pipeline.create(
 prediction = pipeline("Who are you voting for in the upcoming election")
 print(prediction)
 # sequences='Who are you voting for in the upcoming election' labels=['sports', 'poltics', 'public health'] scores=[0.35093653202056885, 0.3335352838039398, 0.31552815437316895]
-```
+```

examples/aws-lambda/README.md (-76)

This file was deleted.
