Cloudflare Workers

1. Go to cloudflare.com and log in.

2. In your dashboard, go to Workers & Pages.

3. Click the Create button, find Start from a template, and pick LLM App.

4. Name your worker however you want and click Deploy.

5. Continue to the project.

6. In the Deployments tab, find the Edit code button (it should be in the top right).

7. Paste the code below in the editor.

8. Click Deploy and go back to the dashboard.

9. In the Settings tab, under Domains & Routes, copy the active worker domain (it should look something like name.email.workers.dev) and paste it into the AI Worker field under /settings.

10. That's it, you can now experiment with different models and redeploy any changes you make to the worker.

The model can be selected in /settings. Speed and quality of results will depend on the size of the model; you can find the full list of models here.
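
Once deployed, you can sanity-check the worker outside of the app with a plain fetch request. The sketch below is a minimal example assuming Node 18+ run as an ES module (e.g. node check.mjs); the worker URL and model ID are placeholders, and since CORS is only enforced by browsers, the Access-Control-Allow-Origin restriction in the worker code below won't block a server-side script.

// check.mjs - quick sanity check for the deployed worker (Node 18+)
// WORKER_URL and the model ID are placeholders; use your own *.workers.dev
// domain and any text-generation model from the Workers AI catalog.
const WORKER_URL = "https://name.email.workers.dev";

const res = await fetch(WORKER_URL, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    model: "@cf/meta/llama-3.1-8b-instruct", // example model ID
    prompt: "Explain what a Cloudflare Worker is in one sentence."
  })
});

// No `messages` array in the body, so the worker takes the non-streaming
// "AI Answer" branch and returns plain JSON.
console.log(res.status, await res.json());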

AI Worker Code

const corsHeaders = {
  'Access-Control-Allow-Headers': '*',
  'Access-Control-Allow-Methods': 'POST, OPTIONS', 
  'Access-Control-Allow-Origin': 'https://khofly.com', // Replace with your own domain if needed
};

export default {
  async fetch(request, env) {
    const { method } = request;

    if (method === "OPTIONS") {
      return new Response("OK", {
        headers: corsHeaders
      });
    }

    if (method !== "POST") {
      return new Response('Method not allowed', { 
        status: 405, 
        headers: corsHeaders 
      });
    }

    const body = await request.json(); // parses JSON body

    const model = body.model; // For everything
    const messages = body?.messages; // For AI Chat
    const max_tokens = body?.max_tokens; // For AI Chat
    const temperature = body?.temperature; // For AI Chat
    const prompt = body?.prompt; // For AI Answers
    const source_lang = body?.source_lang; // For translate
    const target_lang = body?.target_lang; // For translate

    if (!model) return new Response('Model is missing', { status: 400, headers: corsHeaders });

    const shouldStream = messages?.length; // a messages array marks an AI Chat request, which is streamed back

    try {
      // Different reqBody for different scenarios
      const reqBody = model.includes("m2m100") ? {
        // For translations
        text: prompt,
        source_lang: source_lang,
        target_lang: target_lang
      } : shouldStream ? {
        // For AI Chat
        messages: messages,
        stream: true,
        max_tokens: max_tokens || 2048,
        temperature: temperature || 0.5
      } : {
        // For AI Answer
        prompt: prompt,
        max_tokens: 512,
        temperature: 0.2
      };

      let response = await env.AI.run(model, reqBody);

      if (shouldStream) {
        return new Response(response, {
          headers: {
            ...corsHeaders,
            "content-type": "text/event-stream"
          },
        });
      } else {
        return Response.json(response, {
          headers: corsHeaders
        });
      }
    } catch (error) {
      // Handle error
      return new Response('Error: ' + error?.message, { 
        status: 500, 
        headers: corsHeaders 
      });
    }
  }
};
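
For reference, here is a rough sketch of how a client could hit the other two branches of the worker: the streaming AI Chat branch and the m2m100 translation branch. The worker URL, model IDs and response shapes below are illustrative only (check the model catalog for current IDs), and the snippet assumes Node 18+ run as an ES module.

// Sketch only: WORKER_URL and the model IDs are examples.
const WORKER_URL = "https://name.email.workers.dev";

// AI Chat: a messages array triggers the streaming branch, so the response
// body is a server-sent-event stream rather than JSON.
const chat = await fetch(WORKER_URL, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    model: "@cf/meta/llama-3.1-8b-instruct", // example chat model
    messages: [{ role: "user", content: "Hello there!" }],
    max_tokens: 256,
    temperature: 0.5
  })
});

const reader = chat.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true })); // raw SSE chunks: data: {...}
}

// Translation: a model ID containing "m2m100" makes the worker forward
// text / source_lang / target_lang instead of a chat payload.
const translated = await fetch(WORKER_URL, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    model: "@cf/meta/m2m100-1.2b", // Workers AI translation model
    prompt: "Bonjour tout le monde",
    source_lang: "french",
    target_lang: "english"
  })
});
console.log(await translated.json());

Both calls go through the single fetch handler above; the branch is chosen purely from the request body, which is why the model field is required on every request.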

Last updated: 06.04.2025
