Knowledge Bases & File Storage

Overview

Healthcare AI systems need access to relevant, trustworthy, and up-to-date information — clinical guidelines, imaging reports, patient summaries, and research papers.

ByteEngine provides:

Knowledge Bases — vector stores optimized for healthcare data (FHIR, PDFs, text, images)
File Storage — HIPAA-compliant storage for structured and unstructured medical files

Together, they form the foundation for RAG (Retrieval-Augmented Generation) in healthcare — enabling AI Workers to ground their reasoning in accurate, context-specific information.

1. What is a Knowledge Base?

A Knowledge Base (KB) in ByteEngine is a semantic, searchable repository where you store and query unstructured healthcare content such as:

Clinical documents (PDF, TXT, DOCX)
Research papers
SOAP notes
FHIR resource text fields (e.g., Observation.note)
Image captions or radiology reports

ByteEngine automatically:

Extracts and preprocesses the text
Generates embeddings using a domain-optimized model (e.g., BioLinkBERT, PubMedBERT)
Stores the data in a vector database
Enables semantic and contextual search for your AI Workers

Knowledge Base Architecture

Flow: [PDFs / FHIR / Text] → [AI Embedding Engine] → [Vector Store] → [AI Worker + Session → Context Retrieval]
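
To make the retrieval step in this flow concrete, here is a minimal sketch of semantic search over a vector store. It is not ByteEngine's implementation: the three-dimensional vectors, the `toyStore` contents (including the `radiology-report.txt` entry), and the function names are made up for illustration, and a real embedding model produces vectors with hundreds of dimensions.

```javascript
// Toy semantic search: rank stored chunks by cosine similarity to a query vector.
// All vectors below are hand-written stand-ins for real embeddings.

function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every entry against the query and return the topK best, highest first.
function searchVectorStore(store, queryVector, topK = 2) {
  return store
    .map((entry) => ({
      source: entry.source,
      text: entry.text,
      score: cosineSimilarity(entry.vector, queryVector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const toyStore = [
  { source: 'diabetes-guidelines.pdf', text: 'First-line therapy includes Metformin', vector: [0.9, 0.1, 0.0] },
  { source: 'radiology-report.txt', text: 'Chest X-ray shows no acute findings', vector: [0.0, 0.2, 0.9] },
  { source: 'clinical-research-2024.txt', text: 'GLP-1 agonists reduce HbA1c', vector: [0.7, 0.6, 0.1] },
];

// Pretend this vector is the embedding of a diabetes treatment question.
const results = searchVectorStore(toyStore, [1, 0.2, 0]);
console.log(results.map((r) => r.source));
```

The two diabetes-related entries rank above the radiology entry because their vectors point in nearly the same direction as the query vector, which is the property a domain-tuned embedding model is meant to provide.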

2. Creating a Knowledge Base

Using the Console (No Code)

Navigate to Knowledge Bases → Create New
Name your KB (e.g., "Clinical Guidelines")
Upload files (PDF, CSV, TXT, or FHIR export)
Click "Ingest Data"

ByteEngine will preprocess and embed your files automatically.

UI Example: [Screenshot: Knowledge Base creation interface showing file upload and configuration options]

Using the API

curl -X POST "https://api.engine.boolbyte.com/api/knowledgebases" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Diabetes Knowledge Base",
    "description": "Clinical research and treatment guidelines for diabetes",
    "type": "text"
  }'

Upload Files to the Knowledge Base

curl -X POST "https://api.engine.boolbyte.com/api/knowledgebases/{kb_id}/upload" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY" \
  -F "file=@diabetes-guidelines.pdf"

Using JavaScript SDK

import { EngineClient } from '@boolbyte/engine';

const client = new EngineClient({ apiKey: process.env.BYTEENGINE_API_KEY });

// Create a knowledge base
const knowledgeBase = await client.knowledgeBase.createKnowledgeBase({
  name: 'Diabetes Knowledge Base',
  description: 'Clinical research and treatment guidelines for diabetes',
  type: 'text'
});

// Upload a file to the knowledge base (diabetesGuidelinesFile is a file you have already loaded)
const uploadResult = await client.knowledgeBase.uploadFile(knowledgeBase.data.id, {
  file: diabetesGuidelinesFile,
  name: 'diabetes-guidelines.pdf'
});

console.log('Knowledge base created:', knowledgeBase.data.id);

3. Querying a Knowledge Base

Once your KB is ready, you can run semantic searches.

Example API Query

curl -X POST "https://api.engine.boolbyte.com/api/knowledgebases/{kb_id}/search" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the treatment options for Type 2 Diabetes?"}'

Example Response

{
  "success": true,
  "data": {
    "matches": [
      {
        "score": 0.89,
        "source": "diabetes-guidelines.pdf",
        "snippet": "For Type 2 Diabetes, first-line therapy includes Metformin..."
      },
      {
        "score": 0.76,
        "source": "clinical-research-2024.txt",
        "snippet": "Studies show GLP-1 agonists reduce HbA1c by..."
      }
    ]
  }
}
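
A search response usually needs some post-processing before it is shown to a user or passed to a model. The sketch below assumes the response shape shown in the example above; the helper names and the 0.8 score cutoff are illustrative choices, not ByteEngine defaults.

```javascript
// Keep only matches above a relevance threshold and format them as citations.

function selectRelevantMatches(response, minScore = 0.8) {
  if (!response || response.success !== true) {
    throw new Error('Search request did not succeed');
  }
  const matches = (response.data && response.data.matches) || [];
  return matches
    .filter((m) => m.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

function formatCitation(match) {
  return `[${match.source}] ${match.snippet} (score ${match.score.toFixed(2)})`;
}

// Same data as the example response.
const exampleResponse = {
  success: true,
  data: {
    matches: [
      {
        score: 0.89,
        source: 'diabetes-guidelines.pdf',
        snippet: 'For Type 2 Diabetes, first-line therapy includes Metformin...',
      },
      {
        score: 0.76,
        source: 'clinical-research-2024.txt',
        snippet: 'Studies show GLP-1 agonists reduce HbA1c by...',
      },
    ],
  },
};

const relevant = selectRelevantMatches(exampleResponse);
console.log(relevant.map(formatCitation).join('\n'));
```

With the 0.8 cutoff only the first match survives; lowering `minScore` to 0.7 would keep both.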

4. Using Knowledge Bases in AI Workers

Knowledge Bases can be attached directly to Workers or Sessions, enabling RAG-powered reasoning.

YAML Example

worker:
  name: "diabetes-coach"
  model: "medgemma-27b"
  knowledge_bases:
    - "kb:diabetes-guidelines"
  context: "Use the diabetes KB to answer patient treatment queries."

Programmatic Example (JavaScript)

// Create a worker with knowledge base access
const worker = await client.worker.createWorker({
  name: 'diabetes-coach',
  defaultModelName: 'medgemma-27b',
  instructions: 'Use the diabetes knowledge base to answer patient treatment queries.',
  toolConfigs: {
    tools: [
      {
        toolName: 'knowledge_base',
        config: {
          knowledgeBaseId: 'diabetes-guidelines'
        }
      }
    ]
  }
});

// Run a task with knowledge base context
const session = await client.session.createSession({
  workerId: worker.data.id,
  metadata: { context: 'diabetes consultation' }
});

const task = await client.task.createTask(session.data.id, {
  instructions: 'Recommend medication for Type 2 diabetes based on current guidelines',
  model: 'medgemma-27b'
});

The Worker retrieves the most relevant text from your KB and includes it in the model's prompt automatically — no manual context injection needed.
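
For intuition, the automatic context injection described above can be pictured as a prompt-assembly step. The sketch below is only an assumption about the general shape of such a step; the prompt wording, the `buildGroundedPrompt` name, and the snippet limit are hypothetical and do not describe ByteEngine's internal prompt format.

```javascript
// Assemble a grounded prompt from retrieved matches and the task instructions.

function buildGroundedPrompt(instructions, matches, maxSnippets = 3) {
  const context = matches
    .slice(0, maxSnippets)
    .map((m, i) => `[${i + 1}] (${m.source}) ${m.snippet}`)
    .join('\n');

  return [
    'Answer using only the context below and cite sources by number.',
    '',
    'Context:',
    context || '(no matching context found)',
    '',
    `Task: ${instructions}`,
  ].join('\n');
}

const retrieved = [
  { source: 'diabetes-guidelines.pdf', snippet: 'For Type 2 Diabetes, first-line therapy includes Metformin...' },
  { source: 'clinical-research-2024.txt', snippet: 'Studies show GLP-1 agonists reduce HbA1c by...' },
];

const prompt = buildGroundedPrompt(
  'Recommend medication for Type 2 diabetes based on current guidelines',
  retrieved
);
console.log(prompt);
```

Numbering the snippets gives the model a simple way to cite which source supports each statement.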

5. File Storage Overview

File Storage is ByteEngine's secure, encrypted storage layer for all healthcare-related files — clinical reports, images, CSV exports, or DICOM files.

Every file you upload is:

Encrypted at rest (AES-256)
Scanned for PHI (Protected Health Information)
Indexed for AI and search
Linked to your FHIR resources where applicable

File Storage Architecture

Flow: [Upload File] → [Encryption] → [Metadata Index] → [Secure Access URL]
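
The encryption step in this flow can be illustrated with Node.js's built-in `crypto` module. This is a sketch of AES-256 encryption in general, not ByteEngine's storage code: the source states only "AES-256", so the GCM mode, the in-memory key, and the function names here are assumptions. A production system would load keys from a key management service rather than generate them inline.

```javascript
const nodeCrypto = require('node:crypto');

// Encrypt a buffer with AES-256-GCM. Returns everything needed to decrypt later.
function encryptFile(plaintext, key) {
  const iv = nodeCrypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, ciphertext, authTag: cipher.getAuthTag() };
}

// Decrypt and verify integrity; final() throws if the data or tag was altered.
function decryptFile({ iv, ciphertext, authTag }, key) {
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const key = nodeCrypto.randomBytes(32); // 256-bit key, generated here for the demo only
const original = Buffer.from('HbA1c: 7.2%', 'utf8');

const encrypted = encryptFile(original, key);
const decrypted = decryptFile(encrypted, key);
console.log(decrypted.toString('utf8'));
```

GCM is used in this sketch because it authenticates the ciphertext as well as encrypting it, so a tampered file fails to decrypt instead of yielding corrupted content.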

6. Uploading Files

Using the Console

Go to File Storage → Upload
Choose your file or drag-and-drop
Optionally attach metadata (e.g., patient ID, file type)
Click Upload

UI Example: [Screenshot: File upload interface showing drag-and-drop and metadata options]

Using the API

curl -X POST "https://api.engine.boolbyte.com/api/storage" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY" \
  -F "file=@lab-report.pdf" \
  -F "metadata={\"patient_id\":\"12345\",\"type\":\"LabReport\"}"

Example Response

{
  "success": true,
  "data": {
    "id": "file_abc123",
    "url": "https://storage.engine.boolbyte.com/file_abc123",
    "metadata": {
      "patient_id": "12345",
      "type": "LabReport"
    },
    "status": "stored",
    "createdAt": "2024-01-15T10:00:00.000Z"
  }
}
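
The curl call above escapes the metadata JSON by hand, which is easy to get wrong. As a sketch, the same multipart body can be built in JavaScript with `JSON.stringify`. This assumes Node.js 18 or later (global `FormData` and `Blob`); the `buildUploadForm` helper and the `lab-report.pdf` file name are hypothetical, and nothing is sent over the network.

```javascript
// Build the multipart form for a storage upload without hand-escaping JSON.

function buildUploadForm(fileBytes, fileName, metadata) {
  const form = new FormData();
  form.append('file', new Blob([fileBytes]), fileName);
  // The metadata field carries a JSON string, as in the curl example.
  form.append('metadata', JSON.stringify(metadata));
  return form;
}

const form = buildUploadForm(Buffer.from('example lab report'), 'lab-report.pdf', {
  patient_id: '12345',
  type: 'LabReport',
});

console.log(form.get('metadata'));
```

To send it, the form could be passed as the `body` of a `fetch` POST to the storage endpoint with the same `Authorization` header as the curl call; `fetch` sets the multipart `Content-Type` boundary itself.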

Using JavaScript SDK

// Upload a file (labReportFile is a file you have already loaded)
const file = await client.storage.uploadFile({
  file: labReportFile,
  metadata: {
    patient_id: '12345',
    type: 'LabReport',
    category: 'laboratory'
  }
});

console.log('File uploaded:', file.data.id);

7. Retrieving Files

Files can be retrieved securely using access tokens or API calls.

curl -X GET "https://api.engine.boolbyte.com/api/storage/file_abc123" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY"

Files can also be linked to FHIR DocumentReference resources for interoperability.

8. Linking Files to FHIR Resources

Example: attach a PDF to a patient's medical record.

curl -X POST "https://api.engine.boolbyte.com/api/fhir/DocumentReference" \
  -H "Authorization: Bearer $BYTEENGINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resourceType": "DocumentReference",
    "status": "current",
    "subject": {"reference": "Patient/123"},
    "content": [{
      "attachment": {
        "url": "https://storage.engine.boolbyte.com/file_abc123",
        "title": "Lab Results PDF"
      }
    }]
  }'
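
When the link is created from code, the request body can be derived from the storage upload response. The helper below is a sketch with a hypothetical name (`buildDocumentReference`); it assumes the upload response shape shown in section 6 and sets `status`, which FHIR R4 requires on every DocumentReference.

```javascript
// Build a FHIR DocumentReference that points at a stored file.

function buildDocumentReference(patientId, storedFile, title) {
  if (!storedFile || !storedFile.url) {
    throw new Error('Stored file must include a url');
  }
  return {
    resourceType: 'DocumentReference',
    status: 'current', // required element in FHIR R4
    subject: { reference: `Patient/${patientId}` },
    content: [
      {
        attachment: {
          url: storedFile.url,
          title,
        },
      },
    ],
  };
}

// Values taken from the upload response example in section 6.
const storedFile = {
  id: 'file_abc123',
  url: 'https://storage.engine.boolbyte.com/file_abc123',
};

const documentReference = buildDocumentReference('123', storedFile, 'Lab Results PDF');
console.log(JSON.stringify(documentReference, null, 2));
```

The resulting object can be serialized and sent as the body of the DocumentReference request.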

9. File-Based Triggers

You can configure subscriptions or workflows to trigger on new file uploads.

trigger:
  type: "file.uploaded"
  filter: "type == 'LabReport'"
  workflow: "notify-lab-team"
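
To show how such a filter might be evaluated, the sketch below matches an upload event against a trigger. The filter grammar is not documented in this section, so this handles only the `field == 'value'` form used above and assumes the field name refers to a key in the file's metadata. The event shape and function names are hypothetical.

```javascript
// Parse a filter of the form: field == 'value'
function parseEqualityFilter(filter) {
  const match = /^\s*([A-Za-z_][A-Za-z0-9_]*)\s*==\s*'([^']*)'\s*$/.exec(filter);
  if (!match) {
    throw new Error(`Unsupported filter expression: ${filter}`);
  }
  return { field: match[1], value: match[2] };
}

// Decide whether an event should fire the trigger's workflow.
function triggerMatches(trigger, uploadEvent) {
  if (trigger.type !== uploadEvent.type) return false;
  if (!trigger.filter) return true; // no filter means every event of this type matches
  const { field, value } = parseEqualityFilter(trigger.filter);
  return uploadEvent.metadata != null && uploadEvent.metadata[field] === value;
}

const trigger = {
  type: 'file.uploaded',
  filter: "type == 'LabReport'",
  workflow: 'notify-lab-team',
};

const labUpload = { type: 'file.uploaded', metadata: { patient_id: '12345', type: 'LabReport' } };
const imageUpload = { type: 'file.uploaded', metadata: { patient_id: '12345', type: 'Radiograph' } };

console.log(triggerMatches(trigger, labUpload));   // true
console.log(triggerMatches(trigger, imageUpload)); // false
```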

Real-world use case:

When a new lab report is uploaded, automatically run an AI Worker to summarize it and send the summary to the physician's dashboard.

10. RAG (Retrieval-Augmented Generation) in Practice

With Knowledge Bases and File Storage in place, a Worker can answer questions from retrieved documents rather than from model memory alone.

Example:

A "Clinical Summarizer" Worker that answers clinician queries based on uploaded patient files and guidelines.

Workflow Example

workflow:
  name: "clinical-summarizer"
  steps:
    - worker: "summarize-documents"
      input:
        kb: "patient-files"
        question: "{{workflow.input.query}}"

Output:

"Based on the uploaded lab report and SOAP notes, the patient's HbA1c trend indicates potential Type 2 Diabetes progression."

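The `{{workflow.input.query}}` placeholder in the workflow above is filled in when the workflow runs. As a sketch of how that kind of substitution can work, the code below resolves dotted paths against a scope object. The resolver, its error behavior, and the sample query are assumptions for illustration; ByteEngine's actual template semantics may differ.

```javascript
// Replace {{dotted.path}} placeholders with values looked up in a scope object.
function resolveTemplate(template, scope) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (placeholder, path) => {
    const value = path
      .split('.')
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), scope);
    if (value === undefined) {
      throw new Error(`Unresolved placeholder: ${placeholder}`);
    }
    return String(value);
  });
}

// Step input copied from the workflow definition.
const stepInput = {
  kb: 'patient-files',
  question: '{{workflow.input.query}}',
};

// Hypothetical runtime scope supplied when the workflow is started.
const scope = {
  workflow: { input: { query: 'Summarize the latest HbA1c results' } },
};

const resolvedInput = Object.fromEntries(
  Object.entries(stepInput).map(([key, value]) => [key, resolveTemplate(value, scope)])
);

console.log(resolvedInput);
```

Failing loudly on an unresolved placeholder is a deliberate choice in this sketch: passing a literal `{{...}}` string to a clinical summarizer would produce a confusing answer rather than an obvious error.
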
11. Best Practices

Area	Recommendation
File Naming	Use descriptive names with identifiers (e.g., Patient_123_LabReport_2024.pdf)
Knowledge Bases	Separate KBs by clinical domain for precision (e.g., Cardiology, Radiology)
Storage Security	Enable file access expiry or signed URLs for sharing
Compliance	Use EU/US data residency options for sensitive uploads
RAG Optimization	Limit retrieved context chunks to < 2KB for better LLM performance
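
The RAG Optimization row can be applied at ingestion time by splitting documents into chunks below the size limit. The sketch below reads "2KB" as 2048 bytes of UTF-8 and splits on sentence boundaries; both are assumptions, as is the `splitIntoChunks` helper. It relies on Node.js's `Buffer.byteLength`.

```javascript
const MAX_CHUNK_BYTES = 2048; // assumed meaning of "2KB"

// Greedily pack whole sentences into chunks that stay under maxBytes.
function splitIntoChunks(text, maxBytes = MAX_CHUNK_BYTES) {
  const chunks = [];
  let current = '';

  for (const sentence of text.split(/(?<=[.!?])\s+/)) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (Buffer.byteLength(candidate, 'utf8') < maxBytes) {
      current = candidate;
    } else {
      if (current) chunks.push(current);
      // Note: a single sentence longer than maxBytes is kept whole and will
      // exceed the limit; a real splitter would need a fallback for that case.
      current = sentence;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}

// Sample text: the same short sentence repeated 400 times.
const sampleText = Array(400).fill('Metformin is first-line therapy.').join(' ');
const chunks = splitIntoChunks(sampleText);

console.log(chunks.length, chunks.map((c) => Buffer.byteLength(c, 'utf8')));
```

Splitting on sentence boundaries keeps each retrieved snippet readable on its own, which matters when snippets are inserted into a model prompt.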

12. Example: AI-Powered Research Assistant

Goal: Create a Worker that helps clinicians find the latest diabetes research.

Steps:

Create a Knowledge Base → upload research PDFs
Create a Worker → attach that KB
Ask questions in natural language

// Create research assistant worker
const researchWorker = await client.worker.createWorker({
  name: 'research-assistant',
  defaultModelName: 'medgemma-27b',
  instructions: 'Help clinicians find the latest diabetes research and treatment guidelines.',
  toolConfigs: {
    tools: [
      {
        toolName: 'knowledge_base',
        config: {
          knowledgeBaseId: 'diabetes-research'
        }
      }
    ]
  }
});

// Query the research assistant
const session = await client.session.createSession({
  workerId: researchWorker.data.id,
  metadata: { context: 'research query' }
});

const task = await client.task.createTask(session.data.id, {
  instructions: 'What are the new GLP-1 therapy guidelines?',
  model: 'medgemma-27b'
});

Output:

"According to ADA 2024 guidelines, GLP-1 receptor agonists are recommended as first-line for patients with cardiovascular risk factors."

13. Coming Soon: Hybrid Knowledge Graphs

ByteEngine will soon support FHIR + unstructured knowledge graph linking, allowing automatic relationships between structured EHR data and unstructured notes, e.g.:

[Patient] → [Observation: HbA1c] → [Lab Report PDF] → [KnowledgeBase: Diabetes Guidelines]

Real-World Implementation Examples

Clinical Documentation System

// Complete clinical documentation workflow
const clinicalDocs = {
  // 1. Upload patient documents
  uploadDocument: async (patientId, documentFile) => {
    const file = await client.storage.uploadFile({
      file: documentFile,
      metadata: {
        patient_id: patientId,
        type: 'ClinicalDocument',
        category: 'progress_notes'
      }
    });
    
    // 2. Create FHIR DocumentReference
    await client.dataStore.initializeFhirStoreClient('main-fhir-server');
    const fhirClient = client.dataStore.getFhirStoreClient();
    
    await fhirClient.create({
      resource: {
        resourceType: 'DocumentReference',
        status: 'current',
        subject: { reference: `Patient/${patientId}` },
        content: [{
          attachment: {
            url: file.data.url,
            title: documentFile.name
          }
        }]
      }
    });
    
    return file.data;
  },
  
  // 3. Add to knowledge base for AI access
  addToKnowledgeBase: async (fileId, knowledgeBaseId) => {
    return await client.knowledgeBase.uploadFile(knowledgeBaseId, {
      fileId: fileId,
      name: 'clinical-document'
    });
  }
};

Research Paper Analysis

// AI-powered research analysis system
const researchAnalysis = {
  // Upload research papers
  uploadResearch: async (paperFile) => {
    const file = await client.storage.uploadFile({
      file: paperFile,
      metadata: {
        type: 'ResearchPaper',
        category: 'diabetes_research'
      }
    });
    
    // Add to research knowledge base
    await client.knowledgeBase.uploadFile('diabetes-research-kb', {
      fileId: file.data.id,
      name: paperFile.name
    });
    
    return file.data;
  },
  
  // Query research with AI
  queryResearch: async (question) => {
    const session = await client.session.createSession({
      workerId: 'research-analyst',
      metadata: { context: 'research analysis' }
    });
    
    const task = await client.task.createTask(session.data.id, {
      instructions: `Based on the research papers in the knowledge base, answer: ${question}`,
      model: 'medgemma-27b'
    });
    
    return task.data;
  }
};

Next Steps

Learn about AI Workers - Create intelligent healthcare agents
Build RAG-enabled Workflows - Automate knowledge-based processes
View REST API for File Uploads - Complete API documentation
Explore Full Example on GitHub - Real-world implementations
Quick Start Guide - Get started with ByteEngine
