SDK & Developer Tools

Build with On-Device AI

Integrate private, fast AI into your applications with our optimized mobile SDK. Run inference locally, protect user data, and deliver sub-second responses.

Get Started in Minutes

Quick Start Guide

Step 1

Install

# npm
npm install @pocketai/sdk

# or pip
pip install pocketai

Step 2

Initialize

import { PocketAI } from '@pocketai/sdk';

const ai = new PocketAI({
  apiKey: 'pk_your_api_key',
  model: 'llama-3.2-1b',
  onDevice: true
});

Step 3

Run Inference

const response = await ai.query({
  prompt: 'Summarize this doc',
  context: documentText
});

console.log(response.text);
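Per the query() description in the API reference, generation can also be tuned with a temperature and a max-token limit. A minimal sketch of that call shape — the exact option names (temperature, maxTokens) and the local stand-in for the SDK are assumptions, so check the SDK's type definitions for the real signatures:

```typescript
// Sketch of query() with generation options. The option names
// (temperature, maxTokens) and the stand-in below are assumptions;
// check the SDK's type definitions for the real signatures.
interface QueryParams {
  prompt: string;
  context?: string;
  temperature?: number; // sampling randomness; lower = more deterministic
  maxTokens?: number;   // cap on generated output length
}

interface QueryResponse {
  text: string;
}

// Local stand-in so the call shape can be exercised without the SDK.
async function mockQuery(params: QueryParams): Promise<QueryResponse> {
  return { text: `(${params.maxTokens ?? 'unbounded'}-token summary)` };
}

async function main() {
  const response = await mockQuery({
    prompt: 'Summarize this doc',
    context: 'long document text ...',
    temperature: 0.2, // keep summaries focused
    maxTokens: 256,
  });
  console.log(response.text); // → "(256-token summary)"
}

main();
```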

SDK Features

Everything You Need to Build

iOS Native

Optimized Swift SDK for iOS 16+. Leverage Core ML and Apple Neural Engine for blazing-fast on-device inference with minimal battery drain.

Android Native

Kotlin SDK with GPU and NPU acceleration. Supports Qualcomm AI Engine, MediaTek APU, and Samsung Exynos NPU for maximum performance.

React Native

Cross-platform bridge for React Native apps. Single codebase, native performance. Supports both the old and the new React Native architecture, including Turbo Modules.

Model Hub

Access 20+ optimized on-device models including Llama 3.2, Phi-3, Gemma, and Mistral. Pre-quantized for mobile with minimal quality loss.

Custom Training

Fine-tune models on your own data locally. Use federated learning for privacy-preserving training across devices without data leaving the phone.
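Federated learning of this kind typically aggregates per-device weight updates rather than raw data. A toy sketch of the federated-averaging step — a generic illustration of the technique, not PocketAI's actual training API:

```typescript
// Toy federated averaging: each device trains locally and contributes
// only a weight vector; the aggregator averages weights, weighted by
// each device's sample count, and never sees the underlying data.
// Generic illustration only — not PocketAI's training API.
function federatedAverage(deviceWeights: number[][], counts: number[]): number[] {
  const total = counts.reduce((a, b) => a + b, 0);
  const dim = deviceWeights[0].length;
  const avg = new Array(dim).fill(0);
  deviceWeights.forEach((w, i) => {
    const share = counts[i] / total; // devices with more data count more
    for (let d = 0; d < dim; d++) avg[d] += share * w[d];
  });
  return avg;
}

// Two devices with 100 and 300 local samples:
const merged = federatedAverage([[1, 0], [0, 1]], [100, 300]);
console.log(merged); // → [0.25, 0.75], weighted toward the larger device
```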

Apps Studio

Visual builder for creating LLM-powered apps. Drag-and-drop interface to chain prompts, models, and tools into production-ready workflows.

API Reference

Core Methods

Simple, powerful methods to integrate on-device AI into any application.

init(config)
Initialize the PocketAI SDK with your API key, model selection, and runtime configuration. Returns a ready-to-use client instance.

query(params)
Send a prompt and receive a complete response. Supports context injection, temperature control, and max token limits. Runs entirely on-device.

streamQuery(params)
Stream tokens in real time as the model generates them. Ideal for chat interfaces, providing a responsive user experience with token-by-token output.

loadModel(id)
Download and cache a model from the Model Hub to the device. Supports background downloads, progress callbacks, and automatic version management.

getStatus()
Retrieve current SDK status including loaded models, device capabilities, available memory, inference speed benchmarks, and battery impact metrics.
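For chat UIs, streamQuery() delivers tokens as the model generates them. A sketch of consuming such a stream, using a stubbed async iterator in place of the real SDK — the async-iterator shape and the { token } chunk field are assumptions about the real API:

```typescript
// Sketch: consume a token stream like streamQuery() would produce.
// The stub below stands in for the SDK; the async-iterator shape and
// the { token } chunk field are assumptions about the real API.
async function* stubStream(tokens: string[]): AsyncGenerator<{ token: string }> {
  for (const t of tokens) yield { token: t };
}

async function renderStream(stream: AsyncIterable<{ token: string }>): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.token;        // append as each token arrives
    // updateChatBubble(text);  // e.g. re-render the chat UI incrementally
  }
  return text;
}

renderStream(stubStream(['On', '-device ', 'AI'])).then((full) =>
  console.log(full) // → "On-device AI"
);
```

Rendering incrementally like this is what makes the interface feel responsive: the user sees the first tokens in milliseconds instead of waiting for the full completion.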
Compatibility

Supported Platforms

iOS

iOS 16+

Android

Android 12+

React Native

0.72+

Flutter

Coming Soon

Start Building Today

Get your API key and integrate on-device AI in minutes. Free tier includes 10,000 monthly inferences.