feat: autorag #541

Draft: wants to merge 27 commits into `main`
Commits (27)

ea82aa4
feat: move linked project check to requireNuxtHubLinkedProject
RihanArfan Apr 14, 2025
501caa6
feat: add hubAI().models() and .toMarkdown()
RihanArfan Apr 14, 2025
4f19cdd
feat: autorag
RihanArfan Apr 14, 2025
61a72d8
chore: upgrade dependencies
RihanArfan Apr 14, 2025
5f80eef
up
atinux Apr 14, 2025
ccbf589
up
atinux Apr 14, 2025
aed6aa4
docs: add workers changelog (#542)
atinux Apr 14, 2025
6c5143e
docs: update link
atinux Apr 15, 2025
082b165
docs: fix nitro typo (#544)
leomp12 Apr 15, 2025
5e13428
docs: update workers changelog image (#547)
HugoRCD Apr 15, 2025
9b95ec6
docs: fix mobile menu styling
atinux Apr 17, 2025
be403fd
docs: add observability changelog image (#550)
HugoRCD Apr 21, 2025
055dab2
feat: support observability and additional bindings (#549)
RihanArfan Apr 22, 2025
3f0f9da
docs: only link to feature if docs exist (#543)
RihanArfan Apr 22, 2025
1079702
chore(release): v0.8.25
atinux Apr 22, 2025
a959285
chore(deps): upgrade dependencies
RihanArfan Apr 22, 2025
238ee65
chore(deps): update autofix-ci/action digest to 551dded (#551)
renovate[bot] Apr 22, 2025
ad3f3aa
chore(deps): update all non-major dependencies (#552)
renovate[bot] Apr 22, 2025
6ab6a79
docs: update version
RihanArfan Apr 22, 2025
ff6ad22
docs: add `models()` to `hubAI()`
RihanArfan Apr 22, 2025
d9f1905
fix: exclude gateway type from `hubAI()`
RihanArfan Apr 23, 2025
48a2aab
docs: autorag mention on vectorize
RihanArfan Apr 23, 2025
eb8e7db
docs: autorag
RihanArfan Apr 23, 2025
0853188
Merge branch 'main' into feat/autorag
RihanArfan Apr 23, 2025
7127a8a
docs: change json to ts for syntax highlighting
RihanArfan Apr 23, 2025
dedbb02
docs: changelog
RihanArfan Apr 25, 2025
aa87441
docs: improvements
RihanArfan Apr 27, 2025
64 changes: 64 additions & 0 deletions docs/content/changelog/hub-autorag.md
@@ -0,0 +1,64 @@
---
title: Introducing hubAutoRAG()
description: "Create fully-managed RAG pipelines to power your AI applications with accurate and up-to-date information."
date: 2025-04-26
image: '/images/changelog/nuxthub-autorag.jpg'
authors:
- name: Rihan Arfan
avatar:
src: https://avatars.githubusercontent.com/u/20425781?v=4
to: https://bsky.app/profile/rihan.dev
username: rihan.dev
---

::tip
This feature is available from [`@nuxthub/core >= v0.8.26`](https://github.com/nuxt-hub/core/releases/tag/v0.8.26)
::


We are excited to introduce [`hubAutoRAG()`](/docs/features/autorag). Cloudflare [AutoRAG](https://developers.cloudflare.com/autorag/) lets you create fully-managed, retrieval-augmented generation pipelines that continuously update and scale on Cloudflare. With AutoRAG, you can integrate context-aware AI into your Nuxt applications without managing infrastructure.

If you are currently using [`hubVectorize()`](/docs/features/vectorize), you may be interested in switching to `hubAutoRAG()` for a simplified developer experience. AutoRAG automatically indexes your data into vector embeddings optimized for semantic search. Once a data source is connected, indexing runs continuously in the background to keep your knowledge base fresh and queryable.
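
For a sense of the difference, here is a minimal sketch comparing the two approaches (the index and instance names and the embedding model are illustrative assumptions, not part of this release):

```ts
// With Vectorize: you generate embeddings and query the index yourself
const embeddings = await hubAI().run('@cf/baai/bge-base-en-v1.5', {
  text: 'How do I create a modal?'
})
const matches = await hubVectorize('docs').query(embeddings.data[0], { topK: 4 })

// With AutoRAG: indexing runs in the background, so a single call
// retrieves relevant chunks and generates an answer
const answer = await hubAutoRAG('docs').aiSearch({
  query: 'How do I create a modal?'
})
```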

## How to use hubAutoRAG()

1. Update `@nuxthub/core` to the latest version (`v0.8.26` or later)

2. Enable `hub.ai` in your `nuxt.config.ts`

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  hub: {
    ai: true
  }
})
```

3. Create an AutoRAG instance from the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) and add your data source.

::callout{to="https://dash.cloudflare.com/?to=/:account/ai/autorag"}
Go to AutoRAG in the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag)
::

4. Start using [`hubAutoRAG()`](/docs/features/autorag) in your server routes

```ts [server/api/rag.ts]
export default eventHandler(async () => {
  const autorag = hubAutoRAG("nuxt-ui") // access AutoRAG instance
  return await autorag.aiSearch({
    query: "How do I create a modal with Nuxt UI?",
    model: "@cf/meta/llama-3.3-70b-instruct-sd",
    rewrite_query: true,
    max_num_results: 2,
    ranking_options: {
      score_threshold: 0.7,
    },
  })
})
```
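
If you only need the retrieved chunks without a generated answer, the same binding also exposes a retrieval-only `search()` method. A short sketch, under the same instance-name assumption as above:

```ts
const autorag = hubAutoRAG("nuxt-ui")
// search() returns matching chunks only, with no LLM generation step
const results = await autorag.search({
  query: "How do I create a modal with Nuxt UI?",
  rewrite_query: true,
  max_num_results: 5
})
```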

5. [Deploy your project with NuxtHub](/docs/getting-started/deploy)

::note{to="/docs/features/autorag"}
Read the documentation about `hubAutoRAG()` to learn more.
::
69 changes: 54 additions & 15 deletions docs/content/docs/2.features/ai.md
@@ -22,8 +22,8 @@ const response = await hubAI().run('@cf/runwayml/stable-diffusion-v1-5-img2img',

```ts [embeddings.ts]
// returns embeddings that can be used for vector searches in tools like Vectorize
const embeddings = await hubAI().run("@cf/baai/bge-base-en-v1.5", {
  text: "NuxtHub AI uses `hubAI()` to run models."
});
```
::
@@ -100,16 +100,55 @@ export default defineEventHandler(async () => {
::

::field{name="AI Gateway" type="object"}
Options for configuring [`AI Gateway`](#ai-gateway) - `id`, `skipCache`, and `cacheTtl`.
::
::

### `models()`

List all available models programmatically.

```ts [server/api/models-test.ts]
export default defineEventHandler(async () => {
  const ai = hubAI() // access AI bindings
  return await ai.models({ page: 2 })
})
```

#### Options

::field-group
::field{name="params" type="object"}
::collapsible
::field{name="author" type="string"}
The author of the model to filter by.

::field{name="hide_experimental" type="boolean"}
Whether to hide experimental models.

::field{name="page" type="number"}
The page of results to return.

::field{name="per_page" type="number"}
The number of results to return per page.

::field{name="search" type="string"}
A search term to filter models by.

::field{name="source" type="number"}
The source ID to filter by.

::field{name="task" type="string"}
The task name to filter by.
::
::
::
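
For example, to narrow the listing to a specific model family, a sketch assuming the `params` wrapper described by the fields above:

```ts [server/api/models-search.ts]
export default defineEventHandler(async () => {
  const ai = hubAI()
  // Illustrative filters: search for llama models, 10 results per page
  return await ai.models({
    params: {
      search: 'llama',
      per_page: 10,
      hide_experimental: true
    }
  })
})
```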

## Tools

Tools are actions that your LLM can execute to run functions or interact with external APIs. The result of these tools will be used by the LLM to generate additional responses.

This can help you supply the LLM with real-time information, save data to a KV store, or provide it with external data from your database.

With Workers AI, tools have 4 properties:
- `name`: The name of the tool
@@ -125,7 +164,7 @@ const tools = [
parameters: {
type: 'object',
properties: {
city: {
type: 'string',
description: 'The city to retrieve weather information for'
},
@@ -152,7 +191,7 @@ const tools = [
::

::field{name="parameters" type="JsonSchema7"}
The parameters the model will use to run the tool, and the options for those parameters.
::collapsible
::field{name="type" type="string"}
The type of your function's parameter. It's recommended to use an `object` so you can easily add additional properties in the future.
@@ -189,7 +228,7 @@ npx nypm i @cloudflare/ai-utils
import { runWithTools } from '@cloudflare/ai-utils'

export default defineEventHandler(async (event) => {
return await runWithTools(hubAI(), '@cf/meta/llama-3.1-8b-instruct',
{
messages: [
{ role: 'user', content: 'What is the weather in New York?' },
@@ -201,8 +240,8 @@ export default defineEventHandler(async (event) => {
parameters: {
type: 'object',
properties: {
city: {
type: 'string',
description: 'The city to retrieve weather information for'
},
},
@@ -214,7 +253,7 @@ },
},
},
]
},
{
// options
streamFinalResponse: true,
@@ -300,7 +339,7 @@ export default defineEventHandler(async () => {
::

::field{name="cacheTtl" type="number"}
Controls the [Cache TTL](https://developers.cloudflare.com/ai-gateway/configuration/caching/#cache-ttl-cf-cache-ttl), the duration (in seconds) that a cached request will be valid for. The minimum TTL is 60 seconds and maximum is one month.
::
::
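
Putting the gateway options together, a sketch of a cached request (assuming a gateway named `my-gateway` already exists in your Cloudflare account):

```ts
const response = await hubAI().run('@cf/meta/llama-3.1-8b-instruct',
  { prompt: 'Summarize AI Gateway in one sentence.' },
  {
    gateway: {
      id: 'my-gateway', // the name of your gateway
      skipCache: false, // allow cached responses when available
      cacheTtl: 3600 // keep cached responses for one hour
    }
  }
)
```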

@@ -310,7 +349,7 @@ The recommended method to handle text generation responses is streaming.

LLMs work internally by generating responses sequentially using a process of repeated inference: the full output of an LLM is essentially a sequence of hundreds or thousands of individual prediction tasks. For this reason, while it only takes a few milliseconds to generate a single token, generating the full response takes longer.

If your UI waits for the entire response to be generated, a user may see a loading spinner for several seconds before the response is displayed.

Streaming lets you start displaying the response as soon as the first tokens are generated, and append each additional token until the response is complete. This yields a much better experience for the end user. Displaying text incrementally as it’s generated not only provides instant responsiveness, but also gives the end-user time to read and interpret the text.
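
A minimal sketch of a streaming endpoint (the model name is illustrative; with `stream: true`, the binding returns a `ReadableStream` of server-sent events):

```ts [server/api/stream.ts]
export default defineEventHandler(async () => {
  const stream = await hubAI().run('@cf/meta/llama-3.1-8b-instruct', {
    prompt: 'Explain streaming inference in one paragraph.',
    stream: true
  })
  // Forward the SSE stream directly to the client
  return new Response(stream, {
    headers: { 'content-type': 'text/event-stream' }
  })
})
```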

@@ -386,7 +425,7 @@ npx nypm i ai @ai-sdk/vue workers-ai-provider

### `useChat()`

`useChat()` is a Vue composable provided by the Vercel AI SDK that handles streaming responses, API calls, and state for your chat.

It requires a `POST /api/chat` endpoint that uses the `hubAI()` server composable and returns a compatible stream for the Vercel AI SDK.
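
A minimal sketch of such an endpoint, assuming the `workers-ai-provider` package from the install step above:

```ts [server/api/chat.post.ts]
import { streamText } from 'ai'
import { createWorkersAI } from 'workers-ai-provider'

export default defineEventHandler(async (event) => {
  const { messages } = await readBody(event)
  // Wrap the hubAI() binding so the Vercel AI SDK can use Workers AI models
  const workersAI = createWorkersAI({ binding: hubAI() })
  return streamText({
    model: workersAI('@cf/meta/llama-3.1-8b-instruct'),
    messages
  }).toDataStreamResponse() // the stream format useChat() expects
})
```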

@@ -457,6 +496,6 @@ Explore open source templates made by the community:
::


## Pricing

:pricing-table{:tabs='["AI"]'}