* add llama
* add test for tokenizer
* make llama 3.1 working
* update
* add shape test for 70b and 405b
* clean up
* add tests
* update
* fix error
* calculate rotary embedding in model layer
* remove rotary_emb from attention
* update feed
* update .csproj
* Update NuGet.config
* fix test
* pass device
* fix test
* update constructor
* disable 405b test
* update
* disable 70b test
* use windows only fact
* revert change
* rename test to LLaMA3_1
src/Microsoft.ML.GenAI.Core/Extension/ModuleExtension.cs (+51)
@@ -197,6 +197,57 @@ public static Dictionary<string, string> InferDeviceMapForEachLayer(
         return deviceMap;
     }

+    /// <summary>
+    /// Infer the device map for each layer in the model.
+    /// The device map is a dictionary where the key is the device id (e.g. "cuda:0") and the value is the memory size in bytes of the device.
+    /// When inferring the device map, each layer in the model will be placed on the device in the order of the devices list.
+    /// </summary>
+    /// <param name="model"></param>
+    /// <param name="numberOfLayerToBePlaced">a list of key-value pairs where the key is the device id (e.g. "cuda:0") and the value is the number of layers to be placed on the device.
+    /// If you want to place all remaining layers on the device, set that value to -1.
+    /// e.g. [{"cuda:0", 2}, {"cpu", -1}], the first 2 layers will be placed on "cuda:0" and the rest will be placed on "cpu".
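The placement rule described in the doc comment above (each device takes its configured number of layers in order, with -1 meaning "all remaining layers") can be sketched as follows. This is a hypothetical illustration in Python, not the actual Microsoft.ML.GenAI.Core implementation; the function name `infer_device_map` and its signature are made up for this sketch.

```python
def infer_device_map(layer_names, layers_per_device):
    """Assign layers to devices in order.

    layer_names: ordered list of layer identifiers.
    layers_per_device: list of (device_id, count) pairs, e.g.
        [("cuda:0", 2), ("cpu", -1)]; a count of -1 takes all remaining layers.
    Returns a dict mapping each layer name to a device id.
    """
    device_map = {}
    i = 0
    for device, count in layers_per_device:
        # -1 means: place every layer not yet assigned on this device.
        take = len(layer_names) - i if count == -1 else count
        for name in layer_names[i:i + take]:
            device_map[name] = device
        i += take
    return device_map

layers = ["layers.0", "layers.1", "layers.2", "layers.3"]
print(infer_device_map(layers, [("cuda:0", 2), ("cpu", -1)]))
# {'layers.0': 'cuda:0', 'layers.1': 'cuda:0', 'layers.2': 'cpu', 'layers.3': 'cpu'}
```

With the example from the doc comment, `[{"cuda:0", 2}, {"cpu", -1}]`, the first two layers land on "cuda:0" and the remaining layers on "cpu".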