Commit 19b4e7d

Support parameterized views in list tables; optimize row counts via system schema; update README with test steps and lint fixes.
1 parent d42bc1d commit 19b4e7d

2 files changed: 125 additions & 98 deletions

README.md

Lines changed: 39 additions & 23 deletions
@@ -1,4 +1,5 @@
 # ClickHouse MCP Server
+
 [![PyPI - Version](https://img.shields.io/pypi/v/mcp-clickhouse)](https://pypi.org/project/mcp-clickhouse)

 An MCP server for ClickHouse.
@@ -10,22 +11,22 @@ An MCP server for ClickHouse.
 ### Tools

 * `run_select_query`
-  - Execute SQL queries on your ClickHouse cluster.
-  - Input: `sql` (string): The SQL query to execute.
-  - All ClickHouse queries are run with `readonly = 1` to ensure they are safe.
+  * Execute SQL queries on your ClickHouse cluster.
+  * Input: `sql` (string): The SQL query to execute.
+  * All ClickHouse queries are run with `readonly = 1` to ensure they are safe.

 * `list_databases`
-  - List all databases on your ClickHouse cluster.
+  * List all databases on your ClickHouse cluster.

 * `list_tables`
-  - List all tables in a database.
-  - Input: `database` (string): The name of the database.
+  * List all tables in a database.
+  * Input: `database` (string): The name of the database.

 ## Configuration

 1. Open the Claude Desktop configuration file located at:
-   - On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
-   - On Windows: `%APPDATA%/Claude/claude_desktop_config.json`
+   * On macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
+   * On Windows: `%APPDATA%/Claude/claude_desktop_config.json`

 2. Add the following:

@@ -89,7 +90,6 @@ Or, if you'd like to try it out with the [ClickHouse SQL Playground](https://sql
 }
 ```

-
 3. Locate the command entry for `uv` and replace it with the absolute path to the `uv` executable. This ensures that the correct version of `uv` is used when starting the server. On a mac, you can find this path using `which uv`.

 4. Restart Claude Desktop to apply the changes.
@@ -102,7 +102,7 @@ Or, if you'd like to try it out with the [ClickHouse SQL Playground](https://sql

 *Note: The use of the `default` user in this context is intended solely for local development purposes.*

-```
+```bash
 CLICKHOUSE_HOST=localhost
 CLICKHOUSE_PORT=8123
 CLICKHOUSE_USER=default
@@ -118,36 +118,39 @@ CLICKHOUSE_PASSWORD=clickhouse
 The following environment variables are used to configure the ClickHouse connection:

 #### Required Variables
+
 * `CLICKHOUSE_HOST`: The hostname of your ClickHouse server
 * `CLICKHOUSE_USER`: The username for authentication
 * `CLICKHOUSE_PASSWORD`: The password for authentication

-> [!CAUTION]
+> [!CAUTION]
 > It is important to treat your MCP database user as you would any external client connecting to your database, granting only the minimum necessary privileges required for its operation. The use of default or administrative users should be strictly avoided at all times.

 #### Optional Variables
+
 * `CLICKHOUSE_PORT`: The port number of your ClickHouse server
-  - Default: `8443` if HTTPS is enabled, `8123` if disabled
-  - Usually doesn't need to be set unless using a non-standard port
+  * Default: `8443` if HTTPS is enabled, `8123` if disabled
+  * Usually doesn't need to be set unless using a non-standard port
 * `CLICKHOUSE_SECURE`: Enable/disable HTTPS connection
-  - Default: `"true"`
-  - Set to `"false"` for non-secure connections
+  * Default: `"true"`
+  * Set to `"false"` for non-secure connections
 * `CLICKHOUSE_VERIFY`: Enable/disable SSL certificate verification
-  - Default: `"true"`
-  - Set to `"false"` to disable certificate verification (not recommended for production)
+  * Default: `"true"`
+  * Set to `"false"` to disable certificate verification (not recommended for production)
 * `CLICKHOUSE_CONNECT_TIMEOUT`: Connection timeout in seconds
-  - Default: `"30"`
-  - Increase this value if you experience connection timeouts
+  * Default: `"30"`
+  * Increase this value if you experience connection timeouts
 * `CLICKHOUSE_SEND_RECEIVE_TIMEOUT`: Send/receive timeout in seconds
-  - Default: `"300"`
-  - Increase this value for long-running queries
+  * Default: `"300"`
+  * Increase this value for long-running queries
 * `CLICKHOUSE_DATABASE`: Default database to use
-  - Default: None (uses server default)
-  - Set this to automatically connect to a specific database
+  * Default: None (uses server default)
+  * Set this to automatically connect to a specific database

 #### Example Configurations

 For local development with Docker:
+
 ```env
 # Required variables
 CLICKHOUSE_HOST=localhost
@@ -160,6 +163,7 @@ CLICKHOUSE_VERIFY=false
 ```

 For ClickHouse Cloud:
+
 ```env
 # Required variables
 CLICKHOUSE_HOST=your-instance.clickhouse.cloud
@@ -172,6 +176,7 @@ CLICKHOUSE_PASSWORD=your-password
 ```

 For ClickHouse SQL Playground:
+
 ```env
 CLICKHOUSE_HOST=sql-clickhouse.clickhouse.com
 CLICKHOUSE_USER=demo
@@ -204,6 +209,17 @@ You can set these variables in your environment, in a `.env` file, or in the Cla
   }
 }
 ```
+
+### Running tests
+
+```bash
+uv sync --all-extras --dev # install dev dependencies
+uv run ruff check . # run linting
+
+docker compose up -d test_services # start ClickHouse
+uv run pytest tests
+```
+
 ## YouTube Overview

 [![YouTube](http://i.ytimg.com/vi/y9biAm_Fkqw/hqdefault.jpg)](https://www.youtube.com/watch?v=y9biAm_Fkqw)
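
Reviewer note: the defaults for the optional variables touched in the README hunks above can be sketched in a few lines. This is only an illustration of the documented behaviour, not the project's `mcp_env` module: the function name `resolve_connection_settings`, the returned dict keys, and the rule that only the literal string `"true"` enables a flag are assumptions made for this sketch. The variable names and default values are taken from the README text.

```python
import os
from typing import Mapping, Optional


def resolve_connection_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper that applies the defaults listed in the README."""
    env = os.environ if env is None else env
    # CLICKHOUSE_SECURE and CLICKHOUSE_VERIFY default to "true". Treating only
    # the literal "true" (case-insensitive) as enabled is an assumption here.
    secure = env.get("CLICKHOUSE_SECURE", "true").lower() == "true"
    verify = env.get("CLICKHOUSE_VERIFY", "true").lower() == "true"
    # The default port depends on whether HTTPS is enabled.
    default_port = 8443 if secure else 8123
    return {
        "host": env["CLICKHOUSE_HOST"],  # required
        "username": env["CLICKHOUSE_USER"],  # required
        "password": env["CLICKHOUSE_PASSWORD"],  # required
        "port": int(env.get("CLICKHOUSE_PORT", default_port)),
        "secure": secure,
        "verify": verify,
        "connect_timeout": int(env.get("CLICKHOUSE_CONNECT_TIMEOUT", "30")),
        "send_receive_timeout": int(env.get("CLICKHOUSE_SEND_RECEIVE_TIMEOUT", "300")),
        "database": env.get("CLICKHOUSE_DATABASE"),  # None means server default
    }


# Local Docker setup from the README, without an explicit port.
local = resolve_connection_settings(
    {
        "CLICKHOUSE_HOST": "localhost",
        "CLICKHOUSE_USER": "default",
        "CLICKHOUSE_PASSWORD": "clickhouse",
        "CLICKHOUSE_SECURE": "false",
    }
)
print(local["port"])  # 8123, because HTTPS is disabled
```

Passing a plain mapping instead of reading `os.environ` directly keeps the sketch easy to exercise in a test without touching the real environment.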

mcp_clickhouse/mcp_server.py

Lines changed: 86 additions & 75 deletions
@@ -1,15 +1,50 @@
 import logging
-from typing import Sequence
+import json
+from typing import Optional, List, Any
 import concurrent.futures
 import atexit

 import clickhouse_connect
-from clickhouse_connect.driver.binding import quote_identifier, format_query_value
+from clickhouse_connect.driver.binding import format_query_value
 from dotenv import load_dotenv
 from mcp.server.fastmcp import FastMCP
+from dataclasses import dataclass, field, asdict, is_dataclass

 from mcp_clickhouse.mcp_env import get_config

+
+@dataclass
+class Column:
+    database: str
+    table: str
+    name: str
+    column_type: str
+    default_kind: Optional[str]
+    default_expression: Optional[str]
+    comment: Optional[str]
+
+
+@dataclass
+class Table:
+    database: str
+    name: str
+    engine: str
+    create_table_query: str
+    dependencies_database: str
+    dependencies_table: str
+    engine_full: str
+    sorting_key: str
+    primary_key: str
+    total_rows: int
+    total_bytes: int
+    total_bytes_uncompressed: int
+    parts: int
+    active_parts: int
+    total_marks: int
+    comment: Optional[str] = None
+    columns: List[Column] = field(default_factory=list)
+
+
 MCP_SERVER_NAME = "mcp-clickhouse"

 # Configure logging
@@ -34,6 +69,24 @@
 mcp = FastMCP(MCP_SERVER_NAME, dependencies=deps)


+def result_to_table(query_columns, result) -> List[Table]:
+    return [Table(**dict(zip(query_columns, row))) for row in result]
+
+
+def result_to_column(query_columns, result) -> List[Column]:
+    return [Column(**dict(zip(query_columns, row))) for row in result]
+
+
+def to_json(obj: Any) -> str:
+    if is_dataclass(obj):
+        return json.dumps(asdict(obj), default=to_json)
+    elif isinstance(obj, list):
+        return [to_json(item) for item in obj]
+    elif isinstance(obj, dict):
+        return {key: to_json(value) for key, value in obj.items()}
+    return obj
+
+
 @mcp.tool()
 def list_databases():
     """List available ClickHouse databases"""
@@ -45,85 +98,38 @@ def list_databases():


 @mcp.tool()
-def list_tables(database: str, like: str = None):
+def list_tables(
+    database: str, like: Optional[str] = None, not_like: Optional[str] = None
+):
     """List available ClickHouse tables in a database, including schema, comment,
     row count, and column count."""
     logger.info(f"Listing tables in database '{database}'")
     client = create_clickhouse_client()
-    query = f"SHOW TABLES FROM {quote_identifier(database)}"
+    query = f"SELECT database, name, engine, create_table_query, dependencies_database, dependencies_table, engine_full, sorting_key, primary_key, total_rows, total_bytes, total_bytes_uncompressed, parts, active_parts, total_marks, comment FROM system.tables WHERE database = {format_query_value(database)}"
     if like:
-        query += f" LIKE {format_query_value(like)}"
-    result = client.command(query)
+        query += f" AND name LIKE {format_query_value(like)}"

-    # Get all table comments in one query
-    table_comments_query = (
-        f"SELECT name, comment FROM system.tables WHERE database = {format_query_value(database)}"
-    )
-    table_comments_result = client.query(table_comments_query)
-    table_comments = {row[0]: row[1] for row in table_comments_result.result_rows}
-
-    # Get all column comments in one query
-    column_comments_query = f"SELECT table, name, comment FROM system.columns WHERE database = {format_query_value(database)}"
-    column_comments_result = client.query(column_comments_query)
-    column_comments = {}
-    for row in column_comments_result.result_rows:
-        table, col_name, comment = row
-        if table not in column_comments:
-            column_comments[table] = {}
-        column_comments[table][col_name] = comment
-
-    def get_table_info(table):
-        logger.info(f"Getting schema info for table {database}.{table}")
-        schema_query = f"DESCRIBE TABLE {quote_identifier(database)}.{quote_identifier(table)}"
-        schema_result = client.query(schema_query)
-
-        columns = []
-        column_names = schema_result.column_names
-        for row in schema_result.result_rows:
-            column_dict = {}
-            for i, col_name in enumerate(column_names):
-                column_dict[col_name] = row[i]
-            # Add comment from our pre-fetched comments
-            if table in column_comments and column_dict["name"] in column_comments[table]:
-                column_dict["comment"] = column_comments[table][column_dict["name"]]
-            else:
-                column_dict["comment"] = None
-            columns.append(column_dict)
-
-        # Get row count and column count from the table
-        row_count_query = (
-            f"SELECT count() FROM {quote_identifier(database)}.{quote_identifier(table)}"
-        )
-        row_count_result = client.query(row_count_query)
-        row_count = row_count_result.result_rows[0][0] if row_count_result.result_rows else 0
-        column_count = len(columns)
-
-        create_table_query = f"SHOW CREATE TABLE {database}.`{table}`"
-        create_table_result = client.command(create_table_query)
-
-        return {
-            "database": database,
-            "name": table,
-            "comment": table_comments.get(table),
-            "columns": columns,
-            "create_table_query": create_table_result,
-            "row_count": row_count,
-            "column_count": column_count,
-        }
-
-    tables = []
-    if isinstance(result, str):
-        # Single table result
-        for table in (t.strip() for t in result.split()):
-            if table:
-                tables.append(get_table_info(table))
-    elif isinstance(result, Sequence):
-        # Multiple table results
-        for table in result:
-            tables.append(get_table_info(table))
+    if not_like:
+        query += f" AND name NOT LIKE {format_query_value(not_like)}"
+
+    result = client.query(query)
+
+    # Deserialize result as Table dataclass instances
+    tables = result_to_table(result.column_names, result.result_rows)
+
+    for table in tables:
+        column_data_query = f"SELECT database, table, name, type AS column_type, default_kind, default_expression, comment FROM system.columns WHERE database = {format_query_value(database)} AND table = {format_query_value(table.name)}"
+        column_data_query_result = client.query(column_data_query)
+        table.columns = [
+            c
+            for c in result_to_column(
+                column_data_query_result.column_names,
+                column_data_query_result.result_rows,
+            )
+        ]

     logger.info(f"Found {len(tables)} tables")
-    return tables
+    return [asdict(table) for table in tables]


 def execute_query(query: str):
@@ -160,10 +166,15 @@ def run_select_query(query: str):
                 logger.warning(f"Query failed: {result['error']}")
                 # MCP requires structured responses; string error messages can cause
                 # serialization issues leading to BrokenResourceError
-                return {"status": "error", "message": f"Query failed: {result['error']}"}
+                return {
+                    "status": "error",
+                    "message": f"Query failed: {result['error']}",
+                }
             return result
         except concurrent.futures.TimeoutError:
-            logger.warning(f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds: {query}")
+            logger.warning(
+                f"Query timed out after {SELECT_QUERY_TIMEOUT_SECS} seconds: {query}"
+            )
             future.cancel()
             # Return a properly structured response for timeout errors
             return {
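
Reviewer note: the new `list_tables` path maps each result row onto a dataclass with `dict(zip(column_names, row))` and then returns `asdict(...)` output. The sketch below shows that pattern in isolation, under stated assumptions: the `Column` and `Table` classes are cut down to a few fields, `rows_to_tables` is a stand-in for `result_to_table`, and the sample rows are invented (the `None` row count stands in for a table that reports no `total_rows`). No ClickHouse client is involved.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Column:
    # Reduced version of the commit's Column dataclass (assumption).
    name: str
    column_type: str


@dataclass
class Table:
    # Reduced version of the commit's Table dataclass (assumption).
    name: str
    total_rows: Optional[int]
    columns: List[Column] = field(default_factory=list)


def rows_to_tables(column_names, rows) -> List[Table]:
    # Pair each column name with the value at the same position in the row,
    # then pass the pairs to the dataclass constructor as keyword arguments.
    return [Table(**dict(zip(column_names, row))) for row in rows]


# Invented sample data shaped like (column_names, result_rows).
tables = rows_to_tables(("name", "total_rows"), [("events", 1200), ("daily_view", None)])
tables[0].columns = [Column("id", "UInt64")]

# asdict() recurses into nested dataclasses, so the Column list is converted too.
payload = [asdict(t) for t in tables]
print(json.dumps(payload))
```

Because the keyword arguments come straight from the query's column names, the `SELECT` list and the dataclass field names have to stay in sync; a renamed or extra column raises a `TypeError` at construction time.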
