Intro to streaming
Don't we all know what streaming is? When we watch Netflix or YouTube, the video starts playing as soon as enough of it is available, without waiting for the entire file to be downloaded and processed. That's streaming. In the context of web applications, it means parsing data and rendering it on demand. If this feels familiar, that's because it is. Streaming has always existed and is the default in browsers: it is how images are loaded, how font files are parsed, and, generally speaking, how data is received over the network. Until recently, though, it was not exposed to web developers. The Streams API changes that and gives developers access to what streaming affords.
Streaming involves breaking a resource that you want to receive over a network down into small chunks, then processing it bit by bit. Browsers already do this when receiving media assets — videos buffer and play as more of the content downloads, and sometimes you'll see images display gradually as more is loaded too.
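As a rough illustration of chunked processing, the sketch below builds a small ReadableStream with the Streams API and reads it one chunk at a time. The helper names (makeChunkedStream, collectChunks) and the sample chunks are made up for this example; a real stream would more likely come from a network response.

```typescript
// Hypothetical helper: wraps an array of strings in a ReadableStream
// so that each string is delivered as a separate chunk.
function makeChunkedStream(chunks: string[]): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk)
      controller.close()
    },
  })
}

// Hypothetical helper: reads the stream chunk by chunk and records
// each piece as it arrives, instead of waiting for the whole payload.
async function collectChunks(stream: ReadableStream<string>): Promise<string[]> {
  const reader = stream.getReader()
  const received: string[] = []
  while (true) {
    const result = await reader.read()
    if (result.done) break
    received.push(result.value)
  }
  return received
}

collectChunks(makeChunkedStream(['Hel', 'lo, ', 'wor', 'ld'])).then(chunks =>
  console.log(chunks.join('')), // logs "Hello, world"
)
```

Each call to read() resolves with the next chunk, so the consumer can act on partial data before the source has finished.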
Recursion in Next.js components
import { Suspense } from 'react'

export default function Page() {
  const i = 4
  return (
    <div>
      <h1 className='text-center text-4xl lg:text-5xl'>
        I am trying to understand this generator function component.
      </h1>
      <Suspense>
        <Dynamic num={i} />
      </Suspense>
    </div>
  )
}

async function Dynamic({ num }: { num: number }) {
  // Base condition: stop recursing and render the final value.
  if (num === 0)
    return (
      <h2 className='text-3xl lg:text-4xl font-bold text-center text-green-500 my-4'>
        Data, hey!
      </h2>
    )
  // Simulated slow work; the component suspends while this is pending.
  await new Promise(resolve => setTimeout(resolve, 2000))
  --num
  // Recursive call: the fallback is shown until the nested Dynamic resolves.
  return (
    <Suspense
      fallback={
        <h2 className='text-3xl lg:text-4xl font-bold text-center opacity-80 my-4'>
          Loading...
        </h2>
      }
    >
      <h3>text</h3>
      <Dynamic num={num} />
    </Suspense>
  )
}
The GIF shows text being loaded sequentially. The idea behind this is simple recursion in a server component.
How does it work? Like how basic recursion works:
- a recursive call,
- a base condition to terminate the recursion.
The component renders itself: that is the recursive call. The num prop introduces the base condition, so when its value reaches 0 we return a final value. Until num is 0, the component suspends, and what we show in the meantime is the value of the fallback prop of the Suspense component.
<Suspense
  fallback={
    <h2 className='text-3xl lg:text-4xl font-bold text-center opacity-80 my-4'>
      Loading...
    </h2>
  }
>
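Stripped of React, the same recursion can be traced with a plain async function. This is only a sketch of the control flow, not of how React renders: the hypothetical trace function records what the page would show at each step, and its delayMs parameter stands in for the 2000 ms timeout.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Mirrors the shape of Dynamic: a base condition, a pause, then a recursive call.
async function trace(num: number, delayMs: number, shown: string[] = []): Promise<string[]> {
  // Base condition: nothing left to wait for, so record the final value.
  if (num === 0) {
    shown.push('Data, hey!')
    return shown
  }
  await sleep(delayMs)
  // While the nested call is pending, the fallback is what the user sees.
  shown.push('Loading...')
  return trace(num - 1, delayMs, shown)
}

trace(4, 10).then(shown => console.log(shown))
```

With num set to 4, the trace holds four 'Loading...' entries followed by 'Data, hey!'.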
The text in the JSX is just there to show that the component is being rendered recursively. This is a contrived example, but the idea is that we need something sequential, like a loop, that we can keep getting data from until we have a final result. We need something that behaves like recursion but can also pause and hold data. Doesn't that sound like generators?
Streaming using Generator* functions
Generator functions are functions that, when called, return a special type of iterator called a generator. Unlike normal functions, they can be paused, can produce values in sequence, and can optionally return a final value, as return does in normal functions.
To consume a generator function, we call it to get a generator object, much like instantiating a class in JS (or TS), except that no new keyword is needed.
class MyClass {}
const instance = new MyClass()
// generator functions
function* gen() {}
const genObj = gen()
A generator is, strictly speaking, an iterator. It has a .next() method that starts (or resumes) executing the generator function; each yield keyword pauses it, and a return keyword stops it. Calling next() returns an object with two properties:
- value, the value yielded (or returned)
- done, a boolean indicating whether the generator has finished
function* gen() {
  yield 1
  yield 2
  yield 3
  return 4
}
const genObj = gen()
const val = genObj.next()
console.log(val) // { value: 1, done: false }
const val2 = genObj.next()
console.log(val2) // { value: 2, done: false }
const val3 = genObj.next()
console.log(val3) // { value: 3, done: false }
const finalVal = genObj.next()
console.log(finalVal) // { value: 4, done: true }
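Calling next() by hand gets tedious, so here is a small sketch of a loop that does the same thing. The drain helper is hypothetical and exists only for this example. It also highlights a detail that is easy to miss: built-in consumers such as Array.from (and for...of) stop when done is true and ignore the returned value, so the final 4 is only reachable through next().

```typescript
function* gen(): Generator<number, number, undefined> {
  yield 1
  yield 2
  yield 3
  return 4
}

// Hypothetical helper: keeps calling next() until the generator is done,
// keeping the yielded values and the final returned value separately.
function drain<Y, R>(g: Generator<Y, R, undefined>): { yielded: Y[]; returned: R } {
  const yielded: Y[] = []
  let step = g.next()
  while (!step.done) {
    yielded.push(step.value)
    step = g.next()
  }
  return { yielded, returned: step.value }
}

console.log(drain(gen())) // { yielded: [ 1, 2, 3 ], returned: 4 }
console.log(Array.from(gen())) // [ 1, 2, 3 ] (the returned 4 is dropped)
```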
One can say they are infinite-state machines. You can learn more about iterators and generators from MDN. Generator functions use the yield or return keyword, and they can be async too.
Using generator functions to do streaming
It's essentially the same idea as the recursive component but using generator functions. It gives us more control of our data and highlights a creative way of using generator functions.
import { Suspense } from 'react'

function* generator() {
  yield <h2>hello</h2>
  yield <h2>world</h2>
  return <h2>Hello World</h2>
}

export default function Page() {
  return (
    <div>
      <h1 className='text-center text-4xl lg:text-5xl'>
        I am trying to understand this generator function component.
      </h1>
      <Suspense>
        <GeneratorComponent generator={generator()} />
      </Suspense>
    </div>
  )
}

async function GeneratorComponent({
  generator,
}: {
  generator: Generator<JSX.Element, JSX.Element, JSX.Element>
}) {
  const { value, done } = generator.next()
  await new Promise(resolve => setTimeout(resolve, 2000))
  if (done) return <div className='text-center'>{value}</div>
  return (
    <Suspense fallback={<div className='text-center'>{value}</div>}>
      <GeneratorComponent generator={generator} />
    </Suspense>
  )
}
Some streaming, right?
This technique was inspired by Theo's talk for NextConf 2023.
Streaming data from an LLM
This technique can also be adapted to stream responses from a Large Language Model (LLM). Here's an example of how it can be done:
import { Suspense } from 'react'
import {
  EnhancedGenerateContentResponse,
  GoogleGenerativeAI,
} from '@google/generative-ai'

export default async function Page() {
  const stream = await responseStream(`What are you in 50 lines?`)
  return (
    <div>
      <h1 className='text-5xl'>
        Streaming response from an LLM in server components.
      </h1>
      <Suspense fallback={<div>Loading...</div>}>
        <StreamData generator={stream} />
      </Suspense>
    </div>
  )
}

async function responseStream(prompt: string) {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!)
  const model = genAI.getGenerativeModel({ model: `gemini-pro` })
  const result = await model.generateContentStream(prompt)
  return result.stream
}

async function StreamData({
  generator,
}: {
  generator: AsyncGenerator<EnhancedGenerateContentResponse, any, unknown>
}) {
  const response = await generator.next()
  await new Promise(resolve => setTimeout(resolve, 500))
  if (response.done) {
    return null
  }
  return (
    <>
      {response.value.text()}
      <Suspense>
        <StreamData generator={generator} />
      </Suspense>
    </>
  )
}
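To see the consumption pattern without an API key, here is a minimal sketch that swaps the LLM stream for a fake one. Everything in it is an assumption made for illustration: fakeTokenStream is a hypothetical async generator that stands in for the model's stream, and consume mirrors only the control flow of StreamData (take one chunk, then recurse for the rest), not its rendering.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
const wait = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Hypothetical stand-in for an LLM stream: yields one word at a time.
async function* fakeTokenStream(
  text: string,
  delayMs: number,
): AsyncGenerator<string, void, undefined> {
  for (const word of text.split(' ')) {
    await wait(delayMs)
    yield word
  }
}

// Mirrors StreamData: await one chunk, hand it over, then recurse.
async function consume(
  generator: AsyncGenerator<string, void, undefined>,
  onChunk: (chunk: string) => void,
): Promise<void> {
  const response = await generator.next()
  if (response.done) return
  onChunk(response.value)
  await consume(generator, onChunk)
}

const received: string[] = []
consume(fakeTokenStream('streamed one word at a time', 10), chunk => {
  received.push(chunk)
}).then(() => console.log(received.join(' '))) // logs "streamed one word at a time"
```

Because the same generator object is passed down on every call, each level of the recursion picks up where the previous one stopped.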
Why might this approach not be used? One limitation is that we can't pass data directly from the client to server components.
Usually, prompts supplied to an LLM are not known beforehand. Since we can't pass props (data) from client components to server components, that renders (pun intended) this approach a one-off: it's essentially a one-way ticket from the server to the client environment.
We can use URL state as a workaround.
The URL is accessible to both the server and the client environment. Data that needs to be passed from a client component can therefore be added to the URL as a search parameter, which the server component can then read. For example, you can append ?prompt=yourPrompt to the URL and access it in your server component.
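The round trip can be sketched with the standard URL API. Both helper names (urlWithPrompt, promptFromUrl), the example.com address, and the fallback prompt are assumptions for this example; in a Next.js page you would more likely read the value from the search params the framework hands to the page rather than parse the URL yourself.

```typescript
// Hypothetical client-side helper: encodes the prompt as a search parameter.
function urlWithPrompt(base: string, prompt: string): string {
  const url = new URL(base)
  url.searchParams.set('prompt', prompt)
  return url.toString()
}

// Hypothetical server-side helper: reads the prompt back, with a fallback
// for when the parameter is missing or blank.
function promptFromUrl(url: string, fallback = 'What are you in 50 lines?'): string {
  const prompt = new URL(url).searchParams.get('prompt')
  return prompt !== null && prompt.trim() !== '' ? prompt : fallback
}

const target = urlWithPrompt('https://example.com/chat', 'Explain streaming & generators')
console.log(promptFromUrl(target)) // logs "Explain streaming & generators"
```

Using searchParams.set and searchParams.get takes care of encoding and decoding, so characters such as & or spaces in the prompt survive the trip.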
Alternatively, the Vercel AI SDK provides various ways of streaming data from an LLM or another source.
Streaming with the Vercel AI SDK
In an earlier version, the Vercel AI SDK also briefly explored the idea of streaming in server components using generator functions. In the latest version of the library, the technique appears to have been removed, perhaps abstracted away, or replaced with streaming in Next.js server actions. If you are working with AI, the SDK provides various methods suited to different use cases. You can check the library here.
In conclusion, streaming in server components using Next.js offers a creative way to handle data efficiently. By leveraging the asynchronous nature of generator functions, developers can create on-demand experiences for their users.
Thank you for re...