Monday, December 23, 2024

Google’s Proofread: AI-Driven Typing Accuracy in One Tap | Synced


Gboard, Google’s keyboard for mobile devices, uses statistical decoding to provide a smooth typing experience, with both automatic and manual error correction. Large Language Models (LLMs) now make it possible to extend these corrections to the sentence and paragraph level.

In a new paper Proofread: Fixes All Errors with One Tap, a Google research team introduces Proofread, a Gboard feature powered by a server-side LLM that corrects whole sentences and paragraphs with a single tap. The feature has launched on Pixel 8 devices, where it serves thousands of users daily.

The system comprises four key components: data generation, metrics design, model tuning, and model serving.

For Data Generation: An error synthesis framework generates datasets by injecting common keyboard errors into text to simulate user input. Additional steps ensure the data distribution closely aligns with the Gboard domain.
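
The idea of synthesizing keyboard errors can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's framework: the partial `NEIGHBORS` adjacency map, the four error kinds, the `rate` parameter, and the function names `corrupt` and `make_pairs` are all hypothetical.

```python
import random

# Hypothetical, partial QWERTY adjacency map (illustration only).
NEIGHBORS = {
    "a": "qwsz", "e": "wrsd", "i": "uojk", "o": "ipkl",
    "t": "rygf", "n": "bmhj", "s": "awedxz", "r": "etdf",
}
# Assumed error kinds; the real framework may model others.
ERROR_KINDS = ["substitute", "omit", "repeat", "transpose"]


def corrupt(text, rate=0.1, seed=None):
    """Return a noisy copy of `text`, corrupting each position with probability `rate`."""
    rng = random.Random(seed)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if rng.random() >= rate:
            out.append(ch)  # keep the character as typed
            i += 1
            continue
        kind = rng.choice(ERROR_KINDS)
        if kind == "substitute" and ch.lower() in NEIGHBORS:
            out.append(rng.choice(NEIGHBORS[ch.lower()]))  # hit an adjacent key
        elif kind == "omit":
            pass  # drop the character
        elif kind == "repeat":
            out.append(ch + ch)  # double tap
        elif kind == "transpose" and i + 1 < len(text):
            out.append(text[i + 1] + ch)  # swap with the next character
            i += 1  # the next character was consumed by the swap
        else:
            out.append(ch)  # chosen error does not apply here
        i += 1
    return "".join(out)


def make_pairs(sentences, rate=0.1, seed=0):
    """Build (noisy input, clean target) training pairs."""
    rng = random.Random(seed)
    return [(corrupt(s, rate, rng.randrange(2**32)), s) for s in sentences]
```

Calling `make_pairs(["see you tomorrow at ten"])` would yield one pair whose second element is the clean sentence and whose first element is a corrupted version of it. Aligning such data with the Gboard domain, as the paper describes, is outside the scope of this sketch.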

For Metrics Design: Multiple metrics evaluate the model from different perspectives. Because longer texts admit many valid corrections, the key metrics are LLM-based checks for grammatical errors and for semantic consistency.
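
A ratio-style metric built on these two checks might be computed as in the sketch below. It assumes that an output counts as "bad" when it fails either check, which is a reading of the summary above rather than the paper's exact definition. The judges `toy_has_grammar_error` and `toy_same_meaning` are hypothetical stand-ins for the LLM-based checks.

```python
import string


def bad_ratio(pairs, has_grammar_error, same_meaning):
    """Fraction of (output, reference) pairs that fail either check.

    Assumption: "bad" means the output has a grammar error or does not
    preserve the meaning of the reference.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (output, reference) pair")
    bad = sum(
        1 for out, ref in pairs
        if has_grammar_error(out) or not same_meaning(out, ref)
    )
    return bad / len(pairs)


# Toy judges for illustration only; a real system would call an LLM.
def _normalize(text):
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


def toy_has_grammar_error(text):
    return "teh" in _normalize(text).split()


def toy_same_meaning(output, reference):
    return _normalize(output) == _normalize(reference)
```

Because the judges are passed in as callables, the same function can be reused with stricter or looser checks without changing how the ratio is computed.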

For Model Tuning: Inspired by InstructGPT, the model undergoes Supervised Fine-Tuning followed by Reinforcement Learning (RL) tuning. The RL stage employs Global Reward and Direct Reward techniques. Results indicate that RL tuning reduces grammatical errors, lowering the Bad ratio of the PaLM2-XS model by 5.74%.

For Model Serving: The model is deployed on TPU v5 in the cloud, with latency reduced through quantization, bucketing, input segmentation, and speculative decoding. Speculative decoding alone reduces median latency by 39.4%.
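
Two of these serving techniques, input segmentation and bucketing, can be sketched in a few lines. This is an illustration under stated assumptions, not the deployed system: the bucket sizes in `BUCKETS` are invented, tokens are approximated by whitespace splitting, and the helper names `pick_bucket`, `segment`, and `plan` are hypothetical.

```python
import bisect

# Hypothetical bucket lengths in tokens; a real deployment would tune these.
BUCKETS = [16, 32, 64, 128]


def pick_bucket(n_tokens, buckets=BUCKETS):
    """Return the smallest bucket that fits n_tokens, so padding stays small."""
    i = bisect.bisect_left(buckets, n_tokens)
    if i == len(buckets):
        raise ValueError("input exceeds the largest bucket; segment it first")
    return buckets[i]


def segment(text, max_tokens=BUCKETS[-1]):
    """Split text on line breaks, then into chunks of at most max_tokens tokens.

    Whitespace splitting stands in for a real tokenizer (assumption).
    """
    segments = []
    for paragraph in text.split("\n"):
        tokens = paragraph.split()
        for start in range(0, len(tokens), max_tokens):
            segments.append(tokens[start:start + max_tokens])
    return segments


def plan(text):
    """For each segment, report (token count, bucket it would be padded to)."""
    return [(len(seg), pick_bucket(len(seg))) for seg in segment(text)]
```

Short segments can then be processed independently, and each one is padded only up to its own bucket rather than to the longest possible input.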

This work shows that LLMs can improve the typing experience by providing high-quality sentence- and paragraph-level corrections, and it points to a broader role for LLMs in how users enter text on their devices.

The paper Proofread: Fixes All Errors with One Tap is on arXiv.


Author: Hecate He | Editor: Chain Zhang

