Skip to content

Commit 5a1071a

Browse files
authored
update API and configuration document (#4)
* Update seasearch_api.md * Update seasearch_api.md * Update README.md * Update seasearch_api.md * Update README.md * Update README.md * Create overview.md * Create authentication.md * Create index_management.md * Create docmuent_opreation.md * Create search_document.md * Update mkdocs.yml * Delete manual/api/seasearch_api.md * Create document_operation.md * Delete manual/api/docmuent_opreation.md * Update README.md
1 parent effac4c commit 5a1071a

File tree

8 files changed

+279
-529
lines changed

8 files changed

+279
-529
lines changed

manual/api/authentication.md

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
# API Authentication
2+
SeaSearch uses HTTP Basic Auth for authentication. API requests must include the corresponding basic auth token in the header.
3+
4+
To generate a basic auth token, combine the username and password with a colon (e.g., aladdin:opensesame), and then base64 encode the resulting string (e.g., YWxhZGRpbjpvcGVuc2VzYW1l).
5+
6+
You can generate a token using the following command, for example with aladdin:opensesame:
7+
8+
```
9+
echo -n 'aladdin:opensesame' | base64
10+
YWxhZGRpbjpvcGVuc2VzYW1l
11+
```
12+
Note: Basic auth is not secure. If you need to access SeaSearch over the public internet, it is strongly recommended to use HTTPS (e.g., via reverse proxy such as Nginx).
13+
```
14+
"Authorization": "Basic YWRtaW46MTIzNDU2Nzg="
15+
```
16+
17+
## Administrator User
18+
SeaSearch uses accounts to manage API permissions. When the program starts for the first time, an administrator account must be configured through environment variables.
19+
20+
Here is an example of setting the administrator account via shell:
21+
```
22+
set ZINC_FIRST_ADMIN_USER=admin
23+
set ZINC_FIRST_ADMIN_PASSWORD=Complexpass#123
24+
```
25+
!!! tip
26+
In most scenarios, you can use the administrator account to provide access for applications. Only when you need to integrate multiple applications with different permissions, you should create regular users.
27+
28+
29+
## Regular Users
30+
You can create/update users via the API:
31+
```
32+
[POST] /api/user
33+
{
34+
"_id": "prabhat",
35+
"name": "Prabhat Sharma",
36+
"role": "admin", // or user
37+
"password": "Complexpass#123"
38+
}
39+
```
40+
To get all users:
41+
```
42+
[GET] /api/user
43+
```
44+
To delete a user:
45+
```
46+
[DELETE] /api/user/${userId}
47+
```

manual/api/document_operation.md

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
## Document Operations
2+
An index stores multiple documents. Users can perform CRUD operations (Create, Read, Update, Delete) on documents via the API. In SeaSearch, each document has a unique ID.
3+
4+
!!! tip
5+
Due to architectural design, SeaSearch’s performance for single document CRUD operations is much lower than that of ElasticSearch. Therefore, we recommend using batch operations whenever possible.
6+
7+
ElasticSearch Document APIs contain many additional parameters that are not meaningful to SeaSearch and are not supported. All query parameters are unsupported.
8+
9+
### Create Document
10+
ElasticSearch API: [Index Document](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html)
11+
12+
### Update Document
13+
ElasticSearch’s update API supports partial updates to fields. SeaSearch only supports full document updates and does not support updating data via script or detecting if an update is a no-op.
14+
15+
If the document does not exist during an update, SeaSearch will create the corresponding document.
16+
17+
ElasticSearch API: [Update Document](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html)
18+
19+
### Delete Document
20+
Delete a document by its ID.
21+
22+
ElasticSearch API: [Delete Document](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html)
23+
24+
### Get Document by ID
25+
```
26+
[GET] /api/${indexName}/_doc/${docId}
27+
```
28+
29+
### Batch Operations
30+
It is recommended to use batch operations to update indexes.
31+
32+
ElasticSearch API: [Bulk Document API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html)

manual/api/index_management.md

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
## Index Management
2+
In SeaSearch, users can create any number of indexes. An index is a collection of documents that can be searched, and a document can contain multiple searchable fields. Users specify the fields contained in the index via mappings and can customize the analyzers available to the index through settings. Each field can specify either a built-in or custom analyzer. The analyzer is used to split the content of a field into searchable tokens.
3+
4+
### Create Index
5+
To create a SeaSearch index, you can configure the mappings and settings at the same time. For more details about mappings and settings, refer to the following sections.
6+
7+
ElasticSearch API: [Create Index](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html)
8+
9+
### Configure Mappings
10+
Mappings define the types and attributes of fields in a document. Users can configure the mapping via the API.
11+
12+
SeaSearch supports the following field types:
13+
14+
- text
15+
- keyword
16+
- numeric
17+
- bool
18+
- date
19+
- vector
20+
21+
Other types, such as flattened, object, nested, etc., are not supported, and mappings do not support modifying existing fields (new fields can be added).
22+
23+
ElasticSearch Mappings API: [Put Mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html)
24+
25+
ElasticSearch Mappings Explanation: [Mapping Types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html)
26+
27+
### Configure Settings
28+
Index settings control the properties of the index. The most commonly used property is `analysis`, which allows you to customize the analyzers for the index. The analyzers defined here can be used by fields in the mappings.
29+
30+
ElasticSearch Settings API: [Update Settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html)
31+
32+
ElasticSearch related explanation:
33+
- [Analyzer Concepts](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-concepts.html)
34+
- [Specifying Analyzers](https://www.elastic.co/guide/en/elasticsearch/reference/current/specify-analyzer.html)
35+
36+
### Analyzer Support
37+
Analyzers can be configured as default when creating an index, or they can be set for specific fields. (See the previous section for related concepts from the ES documentation.)
38+
39+
SeaSearch supports the following analyzers, which can be found here: [ZincSearch Documentation](https://zincsearch-docs.zinc.dev/api/index/analyze/). The concepts such as tokenization and token filters are consistent with ES and support most of the commonly used analyzers and tokenizers in ES.
40+
41+
### Chinese Analyzer
42+
To enable the Chinese analyzer in the system, set the environment variable `ZINC_PLUGIN_GSE_ENABLE=true`.
43+
44+
If you need more comprehensive support for Chinese word dictionaries, set `ZINC_PLUGIN_GSE_DICT_EMBED = BIG`.
45+
46+
`GSE` is a standard analyzer, so you can directly assign the Chinese analyzer to fields in the mappings:
47+
```
48+
PUT /es/my-index/_mappings
49+
{
50+
"properties": {
51+
"content": {
52+
"type": "text",
53+
"analyzer": "gse_standard"
54+
}
55+
}
56+
}
57+
```
58+
If users have custom tokenization habits, they can specify their dictionary files by setting the environment variable `ZINC_PLUGIN_GSE_DICT_PATH=${DICT_PATH}`, where `DICT_PATH` is the actual path to the dictionary files. The `user.txt` file contains the dictionary, and the `stop.txt` file contains stop words. Each line contains a single word.
59+
60+
GSE will load the dictionary and stop words from this path and use the user-defined dictionary to segment Chinese sentences.

manual/api/overview.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
# Overview
2+
SeaSearch is developed based on ZincSearch and is compatible with ElasticSearch (ES) APIs. The concepts used in the API are similar to those in ElasticSearch, so users can directly refer to the [ElasticSearch API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/rest-apis.html) and [ZincSearch API documentation](https://zincsearch-docs.zinc.dev/api-es-compatible/) for most API calls. This document introduces the commonly used APIs to help users quickly understand the main concepts and basic usage flow. It will also explain the modifications we made to the ZincSearch API and highlight the differences from the upstream API.
3+
4+
The ES-compatible APIs provided by SeaSearch can be accessed by adding the /es/ prefix in the URL. For example, the ES API URL is:
5+
```
6+
GET /my-index-000001/_search
7+
```
8+
The corresponding SeaSearch API URL is:
9+
```
10+
GET /es/my-index-000001/_search
11+
```

manual/api/search_document.md

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
## Search Documents
2+
### Query DSL
3+
To perform full-text search, use the DSL. For usage, refer to:
4+
5+
[Query DSL Documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html)
6+
7+
We do not support all query parameter options provided by ES. Unsupported parameters include: indices_boost, knn, min_score, retriever, pit, runtime_mappings, seq_no_primary_term, stats, terminate_after, version.
8+
9+
Search API: [Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html)
10+
11+
### Delete by Query
12+
To delete documents based on a query, use the delete-by-query operation. Like search, we do not support some ES parameters.
13+
14+
ElasticSearch API: [Delete by Query](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html)
15+
16+
### Multi-Search
17+
Multi-search supports searching multiple indexes and running different queries on each index.
18+
19+
ElasticSearch API: [Multi-Search API Documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html)
20+
21+
We extended the multi-search to support using the same scoring information across different indexes for more accurate score calculation. To enable this, set `unify_score=true` in the query.
22+
23+
`unify_score` is meaningful only in this scenario: when searching the same query across multiple indexes. For example, in Seafile, we create an index for each library. When globally searching across all accessible libraries, enabling unify_score ensures consistent scoring across different repositories, providing more accurate search results.
24+
```
25+
[POST] /es/_msearch?unify_score=true
26+
{"index": "t1"}
27+
{"query": {"bool": {"should": [{"match": {"filename": {"query": "数据库", "minimum_should_match": "-25%"}}}, {"match": {"filename.ngram": {"query": "数据库", "minimum_should_match": "80
28+
```

0 commit comments

Comments
 (0)