
Commit 06eee57

Python docs review (2025-03-04)
2 parents 6defd51 + 8191073

File tree

13 files changed: +61 -75 lines changed


1_python/1_getting-started/project-setup.md

Lines changed: 2 additions & 2 deletions
@@ -5,13 +5,13 @@ description: "Set up your `lmstudio-python` app or script."
 index: 2
 ---
 
-`lmstudio` is a library published on Python that allows you to use `lmstudio-python` in your own projects.
+`lmstudio` is a library published on PyPI that allows you to use `lmstudio-python` in your own projects.
 It is open source and developed on GitHub.
 You can find the source code [here](https://github.com/lmstudio-ai/lmstudio-python).
 
 ## Installing `lmstudio-python`
 
-As it is published to Python, `lmstudio-python` may be installed using `pip`
+As it is published to PyPI, `lmstudio-python` may be installed using `pip`
 or your preferred project dependency manager (`pdm` is shown, but other
 Python project management tools offer similar dependency addition commands).
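
For reference, a minimal sketch of verifying the PyPI install described in this hunk; it assumes nothing beyond the `lmstudio` package having been installed from PyPI (for example via `pip install lmstudio` or `pdm add lmstudio`):

```python
# Confirm the `lmstudio` package installed from PyPI is importable and report its version.
from importlib.metadata import version

import lmstudio  # noqa: F401  (imported only to confirm the install succeeded)

print("lmstudio version:", version("lmstudio"))
```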

1_python/1_getting-started/repl.md

Lines changed: 2 additions & 2 deletions
@@ -6,8 +6,8 @@ index: 2
 ---
 
 To enable interactive use, `lmstudio-python` offers a convenience API which manages
-its resources via `atexit` hooks, allowing the a default synchronous client session
-to be used across multiple interactive comments.
+its resources via `atexit` hooks, allowing a default synchronous client session
+to be used across multiple interactive commands.
 
 This convenience API is shown in the examples throughout the documentation as the
 `Python (convenience API)` tab (alongside the `Python (scoped resource API)` examples,
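
For reference, a sketch of what interactive use of the convenience API looks like; it assumes `lms.llm()` returns a handle to the currently loaded model through the default client session that the `atexit` hooks clean up, and that `respond()` accepts a plain string prompt:

```python
import lmstudio as lms

# Assumption: `lms.llm()` attaches to the default synchronous client session,
# so no explicit teardown is needed between interactive commands.
model = lms.llm()
print(model.respond("Hello!"))
```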

1_python/1_llm-prediction/chat-completion.md

Lines changed: 8 additions & 8 deletions
@@ -132,23 +132,23 @@ You can ask the LLM to predict the next response in the chat context using the `
 
 ```lms_code_snippet
   variants:
-    Streaming:
+    "Non-streaming":
       language: python
       code: |
         # The `chat` object is created in the previous step.
-        prediction_stream = model.respond_stream(chat)
 
-        for fragment in prediction_stream:
-            print(fragment.content, end="", flush=True)
-        print() # Advance to a new line at the end of the response
+        result = model.respond(chat)
+
+        print(result)
 
-    "Non-streaming":
+    Streaming:
       language: python
       code: |
         # The `chat` object is created in the previous step.
-        result = model.respond(chat)
+        prediction_stream = model.respond_stream(chat)
 
-        print(result)
+        for fragment in prediction_stream:
+            print(fragment.content, end="", flush=True)
+        print() # Advance to a new line at the end of the response
 ```
 
 ## Customize Inferencing Parameters

1_python/1_llm-prediction/completion.md

Lines changed: 16 additions & 15 deletions
@@ -39,23 +39,23 @@ Once you have a loaded model, you can generate completions by passing a string t
 
 ```lms_code_snippet
   variants:
-    Streaming:
+    "Non-streaming":
       language: python
       code: |
         # The `chat` object is created in the previous step.
-        prediction_stream = model.complete_stream("My name is", config={"maxTokens": 100})
+        result = model.complete("My name is", config={"maxTokens": 100})
 
-        for fragment in prediction_stream:
-            print(fragment.content, end="", flush=True)
-        print() # Advance to a new line at the end of the response
+        print(result)
 
-    "Non-streaming":
+    Streaming:
       language: python
       code: |
         # The `chat` object is created in the previous step.
-        result = model.complete("My name is", config={"maxTokens": 100})
+        prediction_stream = model.complete_stream("My name is", config={"maxTokens": 100})
 
-        print(result)
+        for fragment in prediction_stream:
+            print(fragment.content, end="", flush=True)
+        print() # Advance to a new line at the end of the response
 ```
 
 ## 3. Print Prediction Stats

@@ -64,21 +64,22 @@ You can also print prediction metadata, such as the model used for generation, n
 
 ```lms_code_snippet
   variants:
-    Streaming:
+    "Non-streaming":
       language: python
       code: |
-        # After iterating through the prediction fragments,
-        # the overall prediction result may be obtained from the stream
-        result = prediction_stream.result()
-
+        # `result` is the response from the model.
         print("Model used:", result.model_info.display_name)
         print("Predicted tokens:", result.stats.predicted_tokens_count)
         print("Time to first token (seconds):", result.stats.time_to_first_token_sec)
         print("Stop reason:", result.stats.stop_reason)
-    "Non-streaming":
+
+    Streaming:
       language: python
       code: |
-        # `result` is the response from the model.
+        # After iterating through the prediction fragments,
+        # the overall prediction result may be obtained from the stream
+        result = prediction_stream.result()
+
         print("Model used:", result.model_info.display_name)
         print("Predicted tokens:", result.stats.predicted_tokens_count)
         print("Time to first token (seconds):", result.stats.time_to_first_token_sec)

1_python/1_llm-prediction/parameters.md

Lines changed: 4 additions & 1 deletion
@@ -33,7 +33,10 @@ Set inference-time parameters such as `temperature`, `maxTokens`, `topP` and mor
 
 <!-- See [`LLMPredictionConfigInput`](./../api-reference/llm-prediction-config-input) for all configurable fields. -->
 
-Another useful inference-time configuration parameter is [`structured`](<(./structured-responses)>), which allows you to rigorously enforce the structure of the output using a JSON or Pydantic schema.
+Note that while `structured` can be set to a JSON schema definition as an inference-time configuration parameter,
+the preferred approach is to instead set the [dedicated `response_format` parameter](<(./structured-responses)>),
+which allows you to more rigorously enforce the structure of the output using a JSON or class based schema
+definition.
 
 # Load Parameters
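
For reference, a sketch of the recommendation in the reworded paragraph: sampling options go in `config`, while schema-constrained output uses the dedicated `response_format` parameter rather than `structured` inside `config`. The schema contents and the `lms.llm()` model handle are illustrative assumptions:

```python
import lmstudio as lms

model = lms.llm()  # assumption: handle to the currently loaded model

# Illustrative JSON-schema-style definition (a class based schema could be used instead).
book_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["title", "year"],
}

result = model.respond(
    "Name a classic novel.",
    config={"temperature": 0.6, "maxTokens": 50},  # inference-time parameters
    response_format=book_schema,                   # preferred over config={"structured": ...}
)
print(result.parsed)
```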

1_python/1_llm-prediction/structured-response.md

Lines changed: 9 additions & 7 deletions
@@ -130,32 +130,34 @@ schema = {
         book = result.parsed
 
         print(book)
-        # ^
+        # ^
         # Note that `book` is correctly typed as { title: string, author: string, year: number }
 
     Streaming:
       language: python
       code: |
         prediction_stream = model.respond_stream("Tell me about The Hobbit", response_format=schema)
 
-        # Optionally stream the response
-        # for fragment in prediction:
-        #     print(fragment.content, end="", flush=True)
-        # print()
+        # Stream the response
+        for fragment in prediction:
+            print(fragment.content, end="", flush=True)
+        print()
         # Note that even for structured responses, the *fragment* contents are still only text
 
         # Get the final structured result
         result = prediction_stream.result()
         book = result.parsed
 
         print(book)
-        # ^
+        # ^
         # Note that `book` is correctly typed as { title: string, author: string, year: number }
 ```
 
+<!--
+
 TODO: Info about structured generation caveats
 
-<!-- ## Overview
+## Overview
 
 Once you have [downloaded and loaded](/docs/basics/index) a large language model,
 you can use it to respond to input through the API. This article covers getting JSON structured output, but you can also

1_python/1_llm-prediction/working-with-chats.md

Lines changed: 3 additions & 1 deletion
@@ -24,7 +24,9 @@ variants:
 
 For more complex tasks, it is recommended to use the `Chat` helper class.
 It provides various commonly used methods to manage the chat.
-Here is an example with the `Chat` class.
+Here is an example with the `Chat` class, where the initial system prompt
+is supplied when initializing the chat instance, and then the initial user
+message is added via the corresponding method call.
 
 ```lms_code_snippet
   variants:
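
For reference, a sketch matching the reworded description above: the system prompt is passed to the `Chat` constructor and the first user message is added afterwards. The import path and the method name `add_user_message` are assumptions standing in for whatever "corresponding method call" the page goes on to show:

```python
from lmstudio import Chat  # import path assumed

# System prompt supplied when the chat instance is initialized...
chat = Chat("You are a helpful, concise assistant.")
# ...then the initial user message is added via the corresponding method call (name assumed).
chat.add_user_message("What is the capital of France?")
```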

1_python/2_agent/tools.md

Lines changed: 5 additions & 6 deletions
@@ -71,7 +71,11 @@ is typically going to be the most convenient):
 
 This means that your wording will affect the quality of the generation. Make sure to always provide a clear description of the tool so the model knows how to use it.
 
-When a tool call fails, the language model may be able to respond appropriately to the failure.
+The SDK does not yet automatically convert raised exceptions to text and report them
+to the language model, but it can be beneficial for tool implementations to do so.
+In many cases, when notified of an error, a language model is able to adjust its
+request to avoid the failure.
+
 
 ## Tools with External Effects (like Computer Use or API Calls)
 

@@ -103,11 +107,6 @@ can essentially turn your LLMs into autonomous agents that can perform tasks on
 
 ```
 
-The SDK does not yet automatically convert raised exceptions to text and report them
-to the language model, but it can be beneficial for tool implementations to do so.
-In many cases, when notified of an error, a language model is able to adjust its
-request to avoid the failure.
-
 ### Example code using the `create_file` tool:
 
 ```lms_code_snippet
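
For reference, a sketch of the pattern the reworded paragraph recommends: the tool itself catches exceptions and returns the error as text so the model can adjust its next call. The body shown here for the `create_file` tool is an illustrative assumption, not the exact example used later on the page:

```python
# Illustrative tool implementation: report failures back to the model as text,
# since the SDK does not yet convert raised exceptions into tool results automatically.
def create_file(name: str, content: str) -> str:
    """Create a file with the given name and content."""
    try:
        with open(name, "x", encoding="utf-8") as f:
            f.write(content)
    except OSError as exc:  # e.g. the file already exists or the path is invalid
        return f"Error creating {name}: {exc}"
    return f"File {name} created successfully."
```
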
1_python/4_tokenization/index.md

Lines changed: 3 additions & 29 deletions
@@ -8,7 +8,9 @@ Models use a tokenizer to internally convert text into "tokens" they can deal wi
 
 ## Tokenize
 
-You can tokenize a string with a loaded LLM or embedding model using the SDK. In the below examples, `llm` can be replaced with an embedding model `emb`.
+You can tokenize a string with a loaded LLM or embedding model using the SDK.
+In the below examples, the LLM reference can be replaced with an
+embedding model reference without requiring any other changes.
 
 ```lms_code_snippet
   variants:

@@ -74,31 +76,3 @@ You can determine if a given conversation fits into a model's context by doing t
         print("Fits in context:", does_chat_fit_in_context(model, chat))
 
 ```
-
-<!-- ### Context length comparisons
-
-The below examples check whether a conversation is over a LLM's context length
-(replace `llm` with `emb` to check for an embedding model).
-
-```lms_code_snippet
-  variants:
-    "Python (convenience API)":
-      language: python
-      code: |
-        import { LMStudioClient, Chat } from "@lmstudio/sdk";
-
-        const client = new LMStudioClient()
-        const llm = client.llm.model()
-
-        # To check for a string, simply tokenize
-        var tokens = llm.tokenize("Hello, world!")
-
-        # To check for a Chat, apply the prompt template first
-        const chat = Chat.createEmpty().withAppended("user", "Hello, world!")
-        const templatedChat = llm.applyPromptTemplate(chat)
-        tokens = llm.tokenize(templatedChat)
-
-        # If the prompt's length in tokens is less than the context length, you're good!
-        const contextLength = llm.getContextLength()
-        const isOkay = (tokens.length < contextLength)
-``` -->
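
For reference, a sketch of the tokenization call the updated wording describes; obtaining the model handle via `lms.llm()` is an assumption, and an embedding model reference could be substituted without any other changes:

```python
import lmstudio as lms

model = lms.llm()  # assumption: reference to the currently loaded LLM
tokens = model.tokenize("Hello, world!")
print(len(tokens), "tokens")
```
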
1_python/5_manage-models/loading.md

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,8 @@ AI models are huge. It can take a while to load them into memory. LM Studio's SD
 
 ## Get the Current Model with `.model()`
 
-If you already have a model loaded in LM Studio (either via the GUI or `lms load`), you can use it by calling `.model()` without any arguments.
+If you already have a model loaded in LM Studio (either via the GUI or `lms load`),
+you can use it by calling `.model()` without any arguments.
 
 ```lms_code_snippet
   variants:
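
For reference, a sketch of the `.model()` call described above, written against the scoped resource API; treating `lms.Client()` as a context manager and `client.llm.model()` as the accessor are assumptions here:

```python
import lmstudio as lms

# Assumption: a scoped client session; `.model()` with no arguments attaches to
# whatever model is already loaded in LM Studio.
with lms.Client() as client:
    model = client.llm.model()
    print(model.respond("Hello!"))
```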
