Fixing the link for the advanced quantization tutorial and renaming the Quark link #389

Open · wants to merge 19 commits into ``develop``

Changes from all commits:
4 changes: 2 additions & 2 deletions docs/llm/high_level_python.rst
@@ -10,7 +10,7 @@
High-Level Python SDK
#####################

- A Python environment offers flexibility for experimenting with LLMs, profiling them, and integrating them into Python applications. We use the `Lemonade SDK <https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/README.md>`_ to get up and running quickly.
+ A Python environment offers flexibility for experimenting with LLMs, profiling them, and integrating them into Python applications. We use the `Lemonade SDK <https://github.com/lemonade-sdk/lemonade>`_ to get up and running quickly.

To get started, follow these instructions.

@@ -35,7 +35,7 @@ To create and set up an environment, run these commands in your terminal:

conda create -n ryzenai-llm python=3.10
conda activate ryzenai-llm
- pip install turnkeyml[llm-oga-hybrid]
+ pip install lemonade-sdk[llm-oga-hybrid]
lemonade-install --ryzenai hybrid
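
Once the environment is ready, a short smoke test confirms that hybrid execution works end to end. The sketch below follows the high-level flow this page describes; the ``from_pretrained`` import path, the ``oga-hybrid`` recipe name, and the checkpoint name are assumptions to verify against the Lemonade README.

.. code-block:: python

    # Minimal sketch: load a hybrid-optimized LLM and generate a reply.
    # Assumptions: lemonade.api.from_pretrained exists, "oga-hybrid" is a
    # valid recipe, and the checkpoint below is a placeholder for any
    # hybrid-ready model from the Ryzen AI collection.
    from lemonade.api import from_pretrained

    model, tokenizer = from_pretrained(
        "amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid",
        recipe="oga-hybrid",
    )

    input_ids = tokenizer("What is Ryzen AI?", return_tensors="pt").input_ids
    response = model.generate(input_ids, max_new_tokens=64)
    print(tokenizer.decode(response[0]))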

****************
4 changes: 2 additions & 2 deletions docs/llm/overview.rst
@@ -73,7 +73,7 @@ The Server Interface provides a convenient means to integrate with applications

To get started with the server interface, follow these instructions: :doc:`server_interface`.

- For example applications that have been tested with Lemonade Server, see the `Lemonade Server Examples <https://github.com/onnx/turnkeyml/tree/main/examples/lemonade/server>`_.
+ For example applications that have been tested with Lemonade Server, see the `Lemonade Server Examples <https://github.com/lemonade-sdk/lemonade/tree/main/docs/server/apps>`_.


OGA APIs for C++ Libraries and Python
@@ -170,7 +170,7 @@ The comprehensive set of pre-optimized models for hybrid execution used in these
- 8.9x
- 🟢

- The :ref:`ryzen-ai-oga-featured-llms` table was compiled using validation, benchmarking, and accuracy metrics as measured by the `ONNX TurnkeyML v6.1.0 <https://pypi.org/project/turnkeyml/6.1.0/>`_ ``lemonade`` commands in each example link.
+ The :ref:`ryzen-ai-oga-featured-llms` table was compiled using validation, benchmarking, and accuracy metrics as measured by the `ONNX TurnkeyML v6.1.0 <https://pypi.org/project/turnkeyml/6.1.0/>`_ ``lemonade`` commands in each example link. After this table was created, the Lemonade SDK moved to the new location found `here <https://github.com/lemonade-sdk/lemonade>`_.

Data collection details:

10 changes: 5 additions & 5 deletions docs/llm/server_interface.rst
@@ -23,10 +23,10 @@ Server Setup
Lemonade Server can be installed via the Lemonade Server Installer executable by following these steps:

1. Make sure your system has the recommended Ryzen AI driver installed as described in :ref:`install-driver`.
- 2. Download and install ``Lemonade_Server_Installer.exe`` from the `latest TurnkeyML release <https://github.com/onnx/turnkeyml/releases>`_.
+ 2. Download and install ``Lemonade_Server_Installer.exe`` from the `latest Lemonade release <https://github.com/lemonade-sdk/lemonade/releases>`_.
3. Launch the server by double-clicking the ``lemonade_server`` shortcut added to your desktop.

- See the `Lemonade Server Installation Guide <https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/lemonade_server_exe.md>`_ for more details.
+ See the `Lemonade Server README <https://github.com/lemonade-sdk/lemonade/blob/main/docs/server/README.md>`_ for more details.

************
Server Usage
@@ -38,7 +38,7 @@ The Lemonade Server provides the following OpenAI-compatible endpoints:
- POST ``/api/v0/completions`` - Text Completions (prompt to completion)
- GET ``/api/v0/models`` - List available models

- Please refer to the `server specification <https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/server_spec.md>`_ document in the Lemonade repository for details about the request and response formats for each endpoint.
+ Please refer to the `server specification <https://github.com/lemonade-sdk/lemonade/blob/main/docs/server/server_spec.md>`_ document in the Lemonade repository for details about the request and response formats for each endpoint.

The `OpenAI API documentation <https://platform.openai.com/docs/guides/streaming-responses?api-mode=chat>`_ also has code examples for integrating streaming completions into an application.
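
Because the endpoints mirror the OpenAI schema, the standard ``openai`` Python client can talk to a local Lemonade Server by pointing its ``base_url`` at the ``/api/v0`` prefix. A minimal streaming sketch, assuming the server listens on port 8000 and that the model name below is installed (both are assumptions; ``GET /api/v0/models`` lists what is actually available):

.. code-block:: python

    # Minimal sketch: stream a chat completion from a local Lemonade Server.
    # Assumptions: the server is at http://localhost:8000 and a model named
    # "Llama-3.2-1B-Instruct-Hybrid" is installed; adjust both as needed.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/api/v0", api_key="none")

    stream = client.chat.completions.create(
        model="Llama-3.2-1B-Instruct-Hybrid",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
        stream=True,  # tokens arrive incrementally, per the OpenAI streaming docs
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)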

@@ -75,8 +75,8 @@ Instructions:
Next Steps
**********

- - See `Lemonade Server Examples <https://github.com/onnx/turnkeyml/tree/main/examples/lemonade/server>`_ to find applications that have been tested with Lemonade Server.
- - Check out the `Lemonade Server specification <https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/server_spec.md>`_ to learn more about supported features.
+ - See `Lemonade Server Examples <https://github.com/lemonade-sdk/lemonade/tree/main/docs/server/apps>`_ to find applications that have been tested with Lemonade Server.
+ - Check out the `Lemonade Server specification <https://github.com/lemonade-sdk/lemonade/blob/main/docs/server/server_spec.md>`_ to learn more about supported features.
- Try out your Lemonade Server install with any application that uses the OpenAI chat completions API.


Expand Down
4 changes: 2 additions & 2 deletions docs/model_quantization.rst
@@ -45,7 +45,7 @@ For more details
~~~~~~~~~~~~~~~~
- `AMD Quark Tutorial <https://github.com/amd/RyzenAI-SW/tree/main/tutorial/quark_quantization>`_ for Ryzen AI Deployment
- Running INT8 model on NPU using :doc:`Getting Started Tutorial <getstartex>`
- - Advanced quantization techniques `Fast Finetuning and Cross Layer Equalization <https://gitenterprise.xilinx.com/VitisAI/RyzenAI-SW/blob/dev/tutorial/quark_quantization/docs/advanced_quant_readme.md>`_ for INT8 model
+ - Advanced quantization techniques `Fast Finetuning and Cross Layer Equalization <https://github.com/amd/RyzenAI-SW/blob/main/tutorial/quark_quantization/docs/advanced_quant_readme.md>`_ for INT8 model
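
For orientation, the sketch below shows the general shape of an INT8 post-training quantization flow with Quark's ONNX API, in the spirit of the tutorials linked above. The import paths, the ``XINT8`` preset name, and the ``quantize_model`` signature are assumptions to verify against the Quark documentation for your installed version.

.. code-block:: python

    # Sketch of an INT8 PTQ flow with Quark for ONNX. Assumed API surface:
    # ModelQuantizer, Config, get_default_config("XINT8"); file paths and
    # tensor names are placeholders.
    import numpy as np
    from onnxruntime.quantization import CalibrationDataReader
    from quark.onnx import ModelQuantizer
    from quark.onnx.quantization.config import Config, get_default_config

    class RandomCalibrationReader(CalibrationDataReader):
        """Feeds a few random batches; replace with real calibration data."""
        def __init__(self, num_batches=8):
            self._batches = iter(
                {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
                for _ in range(num_batches)
            )

        def get_next(self):
            return next(self._batches, None)

    quant_config = get_default_config("XINT8")  # NPU-friendly INT8 preset
    config = Config(global_quant_config=quant_config)

    quantizer = ModelQuantizer(config)
    quantizer.quantize_model(
        "resnet50_fp32.onnx",   # float input model (placeholder path)
        "resnet50_int8.onnx",   # quantized output model (placeholder path)
        RandomCalibrationReader(),
    )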


BF16 Examples
@@ -65,7 +65,7 @@ For more details
- `Image Classification <https://github.com/amd/RyzenAI-SW/tree/main/example/image_classification>`_ using ResNet50 to run BF16 model on NPU
- `Finetuned DistilBERT for Text Classification <https://github.com/amd/RyzenAI-SW/tree/main/example/DistilBERT_text_classification_bf16>`_
- `Text Embedding Model Alibaba-NLP/gte-large-en-v1.5 <https://github.com/amd/RyzenAI-SW/tree/main/example/GTE>`_
- - Advanced quantization techniques `Fast Finetuning <https://quark.docs.amd.com/latest/supported_accelerators/ryzenai/tutorial_convert_fp32_or_fp16_to_bf16.html>`_ for BF16 models.
+ - Advanced quantization techniques `FP32/FP16 to BF16 Conversion <https://quark.docs.amd.com/latest/supported_accelerators/ryzenai/tutorial_convert_fp32_or_fp16_to_bf16.html>`_ for BF16 models.
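
If the INT8 flow sketched earlier on this page is available in your Quark install, the BF16 path is typically the same flow with a different preset; the ``BF16`` preset name here is an assumption to check against the conversion tutorial linked above.

.. code-block:: python

    # Same flow as the INT8 sketch, swapping the preset (assumed name "BF16").
    from quark.onnx.quantization.config import Config, get_default_config

    quant_config = get_default_config("BF16")
    config = Config(global_quant_config=quant_config)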


..
3 changes: 1 addition & 2 deletions docs/modelrun.rst
@@ -327,15 +327,14 @@ Python example:
)


- **NOTE**: When compiling with encryptionKey, ensure that any existing cache directory (either the default cache directory or the directory specified by the ``cache_dir`` provider option) is deleted before compiling.
+ |memo| **NOTE**: When compiling with encryptionKey, ensure that any existing cache directory (either the default cache directory or the directory specified by the ``cache_dir`` provider option) is deleted before compiling.

|

**************************
Operator Assignment Report
**************************


Vitis AI EP generates a file named ``vitisai_ep_report.json`` that provides a report on model operator assignments across CPU and NPU. This file is automatically generated in the cache directory if no explicit cache location is specified in the code. The report includes information such as the total number of nodes, the list of operator types in the model, and which nodes and operators run on the NPU or on the CPU. Additionally, the report includes node statistics, such as the input to a node, the applied operation, and the output from the node.
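
For a quick look at the report, standard JSON tooling suffices; nothing beyond ordinary JSON structure is assumed here, and the path is a placeholder for wherever your cache directory lives.

.. code-block:: python

    # Sketch: peek inside vitisai_ep_report.json from the cache directory.
    # Only generic JSON structure is assumed; this lists the report's
    # top-level keys so you can see which statistics (node counts, operator
    # lists, CPU/NPU assignments) your version of Vitis AI EP emits.
    import json
    from pathlib import Path

    report_path = Path("my_cache_dir") / "vitisai_ep_report.json"  # adjust to your cache_dir
    report = json.loads(report_path.read_text())

    for key, value in report.items():
        summary = f"{len(value)} entries" if isinstance(value, (list, dict)) else value
        print(f"{key}: {summary}")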


8 changes: 5 additions & 3 deletions docs/relnotes.rst
@@ -126,9 +126,11 @@ Version 1.4

- Known Issues:

-   - LT might cause warnings or crashes when running concurrently with other MSFT Copilot apps
-   - Recall app might stop functioning; NPU driver and workloads are expected to continue to work
-   - Cocreator app does not close contexts quickly and might cause contexts to be limited due to remaining contexts still open
+   - Microsoft Windows Insider Program (WIP) users may see warnings or need to restart when running all applications concurrently.
+
+     - NPU driver and workloads will continue to work.
+
+   - Context creation may appear to be limited when some applications do not close contexts quickly.


***********