Generate text embeddings using Apple's on-device NLContextualEmbedding model with the AI SDK.
The Apple Embeddings provider uses Apple's NLContextualEmbedding to generate contextual text embeddings entirely on-device. This is Apple's implementation of a BERT-like transformer model, available on iOS 17+, that provides privacy-preserving text understanding.
NLContextualEmbedding uses a transformer-based architecture trained with masked language modeling (similar to BERT). Apple provides three optimized models grouped by writing system:
Each model is multilingual within its script family, enabling cross-lingual semantic understanding. The models are compressed and optimized for Apple's Neural Engine, and each is typically under 100 MB when downloaded.
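Semantic understanding in practice usually means comparing embedding vectors. The following is a minimal sketch of cosine similarity, assuming embeddings arrive as plain arrays of numbers. The AI SDK exports its own `cosineSimilarity` helper; the function below is written out only to keep the example self-contained, and the sample vectors are made up for illustration.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; 1 means the vectors point in the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero-length vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Toy vectors standing in for real embeddings (illustrative only).
const sameDirection = cosineSimilarity([1, 2, 3], [2, 4, 6]);
const orthogonal = cosineSimilarity([1, 0], [0, 1]);
console.log(sameDirection.toFixed(3), orthogonal.toFixed(3));
```

Texts with similar meaning should score closer to 1 than unrelated texts, as long as both were embedded with the same model.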
The embeddings model supports multiple languages. You can specify the language using ISO 639-1 codes or full names:
For a list of all supported languages, see Apple's documentation.
The default language is English.
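To make the two accepted input forms concrete, here is a minimal sketch of how a language argument could be normalized before use. `normalizeLanguage` and `LANGUAGE_CODES` are hypothetical names invented for this example, not part of the provider, and the table lists only a few illustrative entries.

```typescript
// Hypothetical lookup table: a few illustrative entries, not the full supported list.
const LANGUAGE_CODES: Record<string, string> = {
  english: "en",
  spanish: "es",
  french: "fr",
  german: "de",
};

// Hypothetical helper: accepts an ISO 639-1 code or a full English language name
// and returns the ISO 639-1 code. Falls back to English when nothing is given.
function normalizeLanguage(input?: string): string {
  if (input === undefined || input.trim() === "") {
    return "en"; // documented default
  }
  const key = input.trim().toLowerCase();
  // Two lowercase letters are passed through as a code.
  // This sketch does not verify that the code is actually supported.
  if (/^[a-z]{2}$/.test(key)) {
    return key;
  }
  const code = LANGUAGE_CODES[key];
  if (code === undefined) {
    throw new Error(`Unknown language: ${input}`);
  }
  return code;
}

console.log(normalizeLanguage("French"), normalizeLanguage("fr"), normalizeLanguage());
```

Normalizing once at the boundary keeps the rest of the code working with a single representation.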
Apple's NLContextualEmbedding requires downloading language-specific assets to the device. The provider automatically requests assets when needed, but you can also prepare them manually:
The asset management system is designed to be efficient and user-friendly. When you call prepare() for a language, the system first checks whether the required assets are already present on the device. If they are, the method resolves immediately without any network activity, so subsequent embedding operations start without a download delay.
All language models and assets are stored in Apple's system-wide assets catalog, separate from your app bundle, so they have zero impact on your app's size. Assets may already be available if another app or a system feature has previously requested them.
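The check-then-download behavior described above can be sketched with an in-memory stand-in. `AssetStore` is a hypothetical class that only simulates the pattern; it is not the provider's API and performs no real downloads.

```typescript
// Simulated asset store standing in for the system-wide catalog (an assumption
// made for illustration; the real catalog is managed by the operating system).
class AssetStore {
  private ready = new Set<string>();
  // Records every simulated download so the behavior can be observed.
  downloaded: string[] = [];

  // Resolves immediately when assets are present; otherwise fetches them once.
  async prepare(language: string): Promise<void> {
    if (this.ready.has(language)) {
      return; // already on device: no network activity
    }
    await this.download(language);
    this.ready.add(language);
  }

  private async download(language: string): Promise<void> {
    // A real implementation would request model assets here.
    this.downloaded.push(language);
  }
}

const store = new AssetStore();
store
  .prepare("en")
  .then(() => store.prepare("en")) // second call is a no-op
  .then(() => console.log("downloads:", store.downloaded.length));
```

Calling prepare() early, for example at app start, moves any download cost away from the first embedding request.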
For advanced use cases, you can access the embeddings API directly:
Performance results showing processing time in milliseconds per embedding across different text lengths:
| Device | Short (~10 tokens) | Medium (~30 tokens) | Long (~90 tokens) |
|---|---|---|---|
| iPhone 16 Pro | 19.19 | 21.53 | 33.59 |
Each category was tested with 5 consecutive runs, and the reported value is the average across those runs, which smooths out system variability.
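The averaging method can be reproduced with a small timing harness. This is a sketch under stated assumptions, not the script used to produce the table: `benchmark` and `mean` are hypothetical helpers, and `task` stands for whatever single embedding call you want to measure.

```typescript
// Arithmetic mean of a non-empty list of samples.
function mean(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("No samples to average");
  }
  return samples.reduce((sum, s) => sum + s, 0) / samples.length;
}

// Runs `task` consecutively `runs` times and returns the mean duration in ms.
// `now` is injectable so the harness can be tested with a fake clock.
// Date.now() has millisecond resolution; a higher-resolution clock such as
// performance.now() is a better fit for sub-millisecond precision.
async function benchmark(
  task: () => Promise<void> | void,
  runs: number = 5,
  now: () => number = () => Date.now(),
): Promise<number> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    await task();
    samples.push(now() - start);
  }
  return mean(samples);
}

// Placeholder workload; replace with one embedding call per invocation.
benchmark(() => {
  let x = 0;
  for (let i = 0; i < 100000; i++) x += i;
}).then((avg) => console.log(`average: ${avg} ms`));
```

Running the measurements back to back, as here, includes warm caches after the first run, so a first-call measurement would need to be taken separately.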