Awentail


Knowledge Base

The knowledge base is how your AI assistant learns your content. Upload documents or scrape web pages, and Awentail’s RAG engine takes care of the rest.

Supported file formats

Format      Extension  Notes
PDF         .pdf       Text-based PDFs (OCR for scanned PDFs coming soon)
Word        .docx      Full text extraction, including tables
Plain text  .txt       Direct content loading
CSV         .csv       Row-by-row processing

Uploading documents

  1. Go to your assistant’s Knowledge Base tab
  2. Click Upload or drag and drop files
  3. Awentail automatically:
    • Extracts text from the document
    • Splits it into optimized chunks (150 words with 30-word overlap)
    • Generates vector embeddings using OpenAI text-embedding-3-small
    • Stores vectors in PostgreSQL with pgvector for fast similarity search
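
The chunking step above can be sketched roughly as follows. This is a minimal illustration, not Awentail's actual implementation: the function name `chunk_words` and the whitespace-based word splitting are assumptions, and only the 150-word size and 30-word overlap come from the description above.

```python
def chunk_words(text: str, size: int = 150, overlap: int = 30) -> list[str]:
    """Split text into word-based chunks where neighbours share `overlap` words.

    Hypothetical helper: splitting on whitespace is an assumption about how
    "words" are counted.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")

    words = text.split()
    step = size - overlap  # each new chunk starts 120 words after the previous one
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reaches the end of the document
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(300))
    for i, chunk in enumerate(chunk_words(sample)):
        print(i, len(chunk.split()), "words")
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.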

You can upload multiple files at once. Each file appears in the document list with its name, size and chunk count.

Web scraping

For content that lives on the web:

  1. Go to the Knowledge Base tab
  2. Click Scrape web
  3. Enter a URL (e.g. https://example.com/pricing)
  4. Awentail downloads the page, extracts its text content (stripping navigation, footers and scripts) and indexes it
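
The text-extraction part of step 4 might look something like the sketch below, which uses only Python's standard `html.parser` module. The class name `TextExtractor`, the function `extract_text` and the exact set of skipped tags are assumptions for illustration. Awentail's real scraper is not documented here and may work differently.

```python
from html.parser import HTMLParser

# Assumed set of elements whose content is dropped; the page only mentions
# navigation, footer and scripts ("style" is added here as a guess).
SKIPPED_TAGS = {"nav", "footer", "script", "style"}


class TextExtractor(HTMLParser):
    """Collects visible text while ignoring anything inside skipped elements."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = "<nav>Home</nav><main><h1>Pricing</h1><p>Pro costs more.</p></main>"
    print(extract_text(page))
```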

Scraping limits depend on your plan:

Plan      Scrapes per month
Free
Starter   1
Pro       3
Business  10

How RAG works

When a visitor asks a question, Awentail uses a hybrid search approach:

  1. Vector search (70% weight): finds semantically similar chunks using cosine similarity of embeddings
  2. Keyword search (30% weight): uses PostgreSQL tsvector full-text search for exact term matches
  3. LLM reranking: when the top result score is below 0.78, the LLM reranks results for better accuracy

This hybrid approach outperforms pure vector search, especially for technical or domain-specific content.
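
To make the weighting concrete, here is a minimal sketch of how the two scores could be combined and how the reranking trigger could be checked. It assumes both scores are already normalized to the 0 to 1 range and that the 0.78 threshold applies to the combined score. Neither assumption is confirmed above, and the names `Candidate`, `hybrid_score` and `rank` are hypothetical.

```python
from dataclasses import dataclass

VECTOR_WEIGHT = 0.7       # from the description above
KEYWORD_WEIGHT = 0.3      # from the description above
RERANK_THRESHOLD = 0.78   # assumed to apply to the combined score


@dataclass
class Candidate:
    chunk_id: str
    vector_score: float   # cosine similarity, assumed to lie in [0, 1]
    keyword_score: float  # full-text rank, assumed normalized to [0, 1]


def hybrid_score(c: Candidate) -> float:
    """Weighted sum of the semantic and keyword scores."""
    return VECTOR_WEIGHT * c.vector_score + KEYWORD_WEIGHT * c.keyword_score


def rank(candidates: list[Candidate]) -> tuple[list[Candidate], bool]:
    """Sort by hybrid score and report whether LLM reranking would be triggered."""
    ranked = sorted(candidates, key=hybrid_score, reverse=True)
    needs_rerank = bool(ranked) and hybrid_score(ranked[0]) < RERANK_THRESHOLD
    return ranked, needs_rerank


if __name__ == "__main__":
    results, rerank = rank([
        Candidate("faq-3", vector_score=0.90, keyword_score=0.20),
        Candidate("faq-7", vector_score=0.60, keyword_score=1.00),
    ])
    print([c.chunk_id for c in results], "rerank:", rerank)
```

In this example the chunk with an exact keyword match overtakes the one with the higher semantic score, which is the kind of case where hybrid search differs from pure vector search.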

Managing documents

Tip: Split large documents into focused topics for better search accuracy. A 5-page product FAQ will perform better than a 200-page manual.

Best practices