Posted by Caren Chang – Developer Relations Engineer, Chengji Yan – Software Engineer, Taj Darra – Product Manager
We’re excited to announce a set of on-device GenAI APIs, as part of ML Kit, to help you integrate Gemini Nano in your Android apps.
To start, we’re releasing 4 new APIs:
- Summarization: to summarize articles and conversations
- Proofreading: to polish short text
- Rewriting: to reword text in different styles
- Image Description: to provide a short description for images
Key benefits of GenAI APIs
GenAI APIs are high-level APIs that allow for easy integration, similar to existing ML Kit APIs. This means you can expect quality results out of the box without additional effort for prompt engineering or fine-tuning for specific use cases.
GenAI APIs run on-device and thus provide the following benefits:
- Input, inference, and output data are processed locally
- Functionality remains the same without a reliable internet connection
- No additional cost incurred for each API call
To prevent misuse, we also added safety protection in various layers, including base model training, safety-aware LoRA fine-tuning, input and output classifiers, and safety evaluations.
How GenAI APIs are built
There are 4 main components that make up each of the GenAI APIs.
- Gemini Nano is the base model, the foundation shared by all APIs.
- Small API-specific LoRA adapter models are trained and deployed on top of the base model to further improve the quality for each API.
- Optimized inference parameters (e.g. prompt, temperature, topK, batch size) are tuned for each API to guide the model in returning the best results.
- An evaluation pipeline ensures quality across various datasets and attributes. This pipeline consists of LLM raters, statistical metrics, and human raters.
Together, these components make up the high-level GenAI APIs that simplify the effort needed to integrate Gemini Nano in your Android app.
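To give a feel for what two of the tuned inference parameters control, here is a minimal, generic sketch of top-K sampling with temperature. It is an illustration of the general technique only, not Gemini Nano's or ML Kit's implementation; the function name `sampleTopK` and its parameters are hypothetical, and the GenAI APIs set these values for you.

```kotlin
import kotlin.math.exp
import kotlin.random.Random

// Illustrative only: a generic top-K sampler with temperature.
// Hypothetical helper, not part of ML Kit or Gemini Nano.
fun sampleTopK(
    logits: DoubleArray,
    topK: Int,
    temperature: Double,
    random: Random = Random.Default
): Int {
    require(logits.isNotEmpty()) { "logits must not be empty" }
    require(topK >= 1) { "topK must be at least 1" }
    require(temperature > 0.0) { "temperature must be positive" }

    // Keep only the indices of the K highest-scoring tokens.
    val candidates = logits.indices
        .sortedByDescending { logits[it] }
        .take(topK)

    // Softmax weights over the survivors, scaled by temperature.
    // Subtracting the max logit keeps exp() numerically stable.
    val maxLogit = logits[candidates.first()]
    val weights = candidates.map { exp((logits[it] - maxLogit) / temperature) }
    val total = weights.sum()

    // Draw one candidate in proportion to its weight.
    var threshold = random.nextDouble() * total
    for ((i, weight) in weights.withIndex()) {
        threshold -= weight
        if (threshold <= 0.0) return candidates[i]
    }
    return candidates.last() // Guards against floating-point round-off.
}
```

A smaller `topK` or a lower `temperature` makes the output more deterministic; with `topK = 1` the sampler always returns the highest-scoring token.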
Evaluating quality of GenAI APIs
For each API, we formulate a benchmark score based on the evaluation pipeline mentioned above. This score is based on attributes specific to a task. For example, when evaluating the summarization task, one of the attributes we look at is “grounding” (i.e., factual consistency of the generated summary with the source content).
To provide out-of-the-box quality for GenAI APIs, we applied feature-specific fine-tuning on top of the Gemini Nano base model. This resulted in an increase in the benchmark score of each API, as shown below:
Use case in English | Gemini Nano Base Model | ML Kit GenAI API |
---|---|---|
Summarization | 77.2 | 92.1 |
Proofreading | 84.3 | 90.2 |
Rewriting | 79.5 | 84.1 |
Image Description | 86.9 | 92.3 |
In addition, this is a quick reference of how the APIs perform on a Pixel 9 Pro:
Modality | Prefix Speed (input processing rate) | Decode Speed (output generation rate) |
---|---|---|
Text-to-text | 510 tokens/second | 11 tokens/second |
Image-to-text | 510 tokens/second + 0.8 seconds for image encoding | 11 tokens/second |
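As a back-of-the-envelope aid, the figures in the table can be turned into a rough latency estimate. The sketch below is an assumption-laden approximation: it uses only the Pixel 9 Pro numbers above, ignores any fixed startup or download overhead, and the helper name `estimateLatencySeconds` is hypothetical rather than part of ML Kit.

```kotlin
// Figures taken from the Pixel 9 Pro reference table; other devices will differ.
val PREFIX_TOKENS_PER_SECOND = 510.0
val DECODE_TOKENS_PER_SECOND = 11.0
val IMAGE_ENCODING_SECONDS = 0.8

// Hypothetical helper: rough end-to-end time for a single request.
fun estimateLatencySeconds(
    inputTokens: Int,
    outputTokens: Int,
    hasImage: Boolean = false
): Double {
    require(inputTokens >= 0 && outputTokens >= 0) { "token counts must be non-negative" }
    val prefixSeconds = inputTokens / PREFIX_TOKENS_PER_SECOND   // input processing
    val decodeSeconds = outputTokens / DECODE_TOKENS_PER_SECOND  // output generation
    val imageSeconds = if (hasImage) IMAGE_ENCODING_SECONDS else 0.0
    return prefixSeconds + decodeSeconds + imageSeconds
}
```

For example, 1020 input tokens and 33 output tokens come to roughly 2 seconds of input processing plus 3 seconds of generation, about 5 seconds in total (5.8 seconds with an image). Output length dominates, which is one reason short output formats respond quickly.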
Sample usage
This is an example of implementing the GenAI Summarization API to get a one-bullet summary of an article:
    val articleToSummarize = "We're excited to announce a set of on-device generative AI APIs..."

    // Define task with desired input and output format
    val summarizerOptions = SummarizerOptions.builder(context)
        .setInputType(InputType.ARTICLE)
        .setOutputType(OutputType.ONE_BULLET)
        .setLanguage(Language.ENGLISH)
        .build()
    val summarizer = Summarization.getClient(summarizerOptions)

    suspend fun prepareAndStartSummarization(context: Context) {
        // Check feature availability. Status will be one of the following:
        // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
        val featureStatus = summarizer.checkFeatureStatus().await()

        if (featureStatus == FeatureStatus.DOWNLOADABLE) {
            // Download feature if necessary.
            // If downloadFeature is not called, the first inference request will
            // also trigger the feature to be downloaded if it's not already
            // downloaded.
            summarizer.downloadFeature(object : DownloadCallback {
                override fun onDownloadStarted(bytesToDownload: Long) { }

                override fun onDownloadFailed(e: GenAiException) { }

                override fun onDownloadProgress(totalBytesDownloaded: Long) { }

                override fun onDownloadCompleted() {
                    startSummarizationRequest(articleToSummarize, summarizer)
                }
            })
        } else if (featureStatus == FeatureStatus.DOWNLOADING) {
            // Inference request will automatically run once feature is
            // downloaded.
            // If Gemini Nano is already downloaded on the device, the
            // feature-specific LoRA adapter model will be downloaded very
            // quickly. However, if Gemini Nano is not already downloaded,
            // the download process may take longer.
            startSummarizationRequest(articleToSummarize, summarizer)
        } else if (featureStatus == FeatureStatus.AVAILABLE) {
            startSummarizationRequest(articleToSummarize, summarizer)
        }
    }

    fun startSummarizationRequest(text: String, summarizer: Summarizer) {
        // Create task request
        val summarizationRequest = SummarizationRequest.builder(text).build()

        // Start summarization request with streaming response
        summarizer.runInference(summarizationRequest) { newText ->
            // Show new text in UI
        }

        // You can also get a non-streaming response from the request
        // val summarizationResult = summarizer.runInference(summarizationRequest)
        // val summary = summarizationResult.get().summary
    }

    // Be sure to release the resource when no longer needed
    // For example, on viewModel.onCleared() or activity.onDestroy()
    summarizer.close()
For more examples of implementing the GenAI APIs, check out the official documentation and samples on GitHub.
Use cases
Here is some guidance on how to best use the current GenAI APIs:
For Summarization, consider:
- Conversation messages or transcripts that involve 2 or more users
- Articles or documents less than 4000 tokens (or about 3000 English words). Using the first few paragraphs for summarization is usually good enough to capture the most important information.
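One way to act on the length guidance is to trim an article to its leading paragraphs before sending it for summarization. The sketch below is a heuristic under stated assumptions: it estimates tokens from word count using the 4000-tokens-to-3000-words ratio above rather than the model's real tokenizer, it assumes paragraphs are separated by blank lines, and the helpers `estimateTokens` and `takeLeadingParagraphs` are hypothetical, not ML Kit APIs.

```kotlin
// Budget from the guidance above; the 4:3 ratio is a rough English-only estimate.
val MAX_ARTICLE_TOKENS = 4000

// Hypothetical helper: approximates tokens as ceil(words * 4 / 3).
fun estimateTokens(text: String): Int {
    val words = text.split(Regex("\\s+")).count { it.isNotEmpty() }
    return (words * 4 + 2) / 3 // integer form of ceil(words * 4 / 3)
}

// Hypothetical helper: keeps whole leading paragraphs until the next one
// would push the estimate over the budget.
fun takeLeadingParagraphs(article: String, maxTokens: Int = MAX_ARTICLE_TOKENS): String {
    val kept = mutableListOf<String>()
    var used = 0
    for (paragraph in article.split(Regex("\\n\\s*\\n"))) {
        val cost = estimateTokens(paragraph)
        if (cost == 0) continue            // skip empty paragraphs
        if (used + cost > maxTokens) break // stop before exceeding the budget
        kept += paragraph.trim()
        used += cost
    }
    return kept.joinToString("\n\n")
}
```

Cutting at paragraph boundaries keeps the input readable, and it matches the observation that the opening paragraphs usually carry the most important information.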
For Proofreading and Rewriting APIs, consider using them during the content creation process for short content below 256 tokens to help with tasks such as:
- Refining messages in a particular tone, such as more formal or more casual
- Polishing personal notes for easier consumption later
For the Image Description API, consider it for:
- Generating titles of images
- Generating metadata for image search
- Utilizing descriptions of images in use cases where the images themselves cannot be displayed, such as within a list of chat messages
- Generating alternative text to help visually impaired users better understand content as a whole
GenAI API in production
Envision is an app that verbalizes the visual world to help people who are blind or have low vision lead more independent lives. A common use case in the app is for users to take a picture to have a document read out loud. Utilizing the GenAI Summarization API, Envision is now able to get a concise summary of a captured document. This significantly enhances the user experience by allowing them to quickly grasp the main points of documents and determine if a more detailed reading is desired, saving them time and effort.

Supported devices
GenAI APIs are available on Android devices using optimized MediaTek Dimensity, Qualcomm Snapdragon, and Google Tensor platforms through AICore. For a comprehensive list of devices that support GenAI APIs, refer to our official documentation.
Learn more
Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples on GitHub: AI Catalog GenAI API Samples with Compose, ML Kit GenAI APIs Quickstart.