What is Gemma?
Gemma is a family of lightweight, state-of-the-art open AI models developed by Google DeepMind and other teams across Google. Built on the same research and technology as the Gemini models, it aims to help developers and researchers build responsible AI applications. The Gemma family initially comprised two sizes, Gemma 2B and Gemma 7B, each available in pretrained and instruction-tuned versions. Gemma supports multiple frameworks, including JAX, PyTorch, and TensorFlow, and runs efficiently on a range of devices. On June 28, 2024, the second-generation model, Gemma 2, was released.
The main features of Gemma
- Lightweight Architecture: The Gemma model is designed to be lightweight, making it easy to run in various computing environments, including personal computers and workstations.
- Open Model: The weights of the Gemma model are open, allowing users to use and distribute them commercially while complying with the licensing agreement.
- Pre-training and Instruction Tuning: It provides both a pretrained model and an instruction-tuned version. The latter uses Reinforcement Learning from Human Feedback (RLHF) to align the model's behavior with helpful and responsible outputs.
- Multi-framework Support: Gemma supports major AI frameworks such as JAX, PyTorch, and TensorFlow. It offers a toolchain through Keras 3.0, simplifying the inference and Supervised Fine-Tuning (SFT) processes.
- Safety and Reliability: During the design process, Gemma adheres to Google’s AI principles. It uses automated techniques to filter sensitive information in the training data and has undergone a series of security evaluations, including red team testing and adversarial testing.
- Performance Optimization: The Gemma model is optimized for hardware platforms such as NVIDIA GPUs and Google Cloud TPUs, ensuring high performance on different devices.
- Community Support: Google provides free resources on platforms like Kaggle and Colab, as well as credits for Google Cloud, encouraging developers and researchers to innovate and conduct research using Gemma.
- Cross-platform Compatibility: The Gemma model can run on various devices, including laptops, desktops, IoT devices, and the cloud, supporting a wide range of AI functions.
- Responsible AI Toolkit: Google has also released the Responsible Generative AI Toolkit to help developers build safe and responsible AI applications, including safety classifiers, debugging tools, and application guides.
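The instruction-tuned Gemma checkpoints mentioned above expect prompts in a turn-based format built from `<start_of_turn>` and `<end_of_turn>` control tokens. Below is a minimal sketch of that single-turn format; the helper name `format_gemma_prompt` is our own, not part of any Gemma library:

```python
def format_gemma_prompt(user_message: str) -> str:
    """Build a single-turn prompt in the control-token format used by
    instruction-tuned Gemma checkpoints (illustrative sketch, not an
    official API)."""
    return (
        "<start_of_turn>user\n"          # open the user's turn
        f"{user_message}<end_of_turn>\n"  # close it with the end token
        "<start_of_turn>model\n"          # model generates from here
    )

prompt = format_gemma_prompt("Explain rotary position embeddings in one sentence.")
print(prompt)
```

In practice a tokenizer's built-in chat template (where available) should be preferred over hand-building strings, since it also handles multi-turn conversations and special-token IDs.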
Key technical points of Gemma
- Model Architecture: Gemma is built on the Transformer decoder, the dominant architecture in modern natural language processing (NLP). It uses multi-head attention, which lets the model attend to multiple parts of the input simultaneously. Gemma replaces absolute position embeddings with Rotary Position Embeddings (RoPE) in each layer, substitutes the GeGLU activation for the standard ReLU non-linearity, and applies normalization to both the input and the output of each Transformer sub-layer.
- Training Infrastructure: The Gemma model is trained on Google’s TPUv5e, a high-performance computing platform specifically designed for machine learning. By performing model sharding and data replication across multiple Pods (clusters of chips), Gemma can efficiently utilize distributed computing resources.
- Pre-training Data: The Gemma model is pre-trained on a vast amount of English data (approximately 2 trillion tokens for the 2B model and 6 trillion tokens for the 7B model), primarily sourced from web documents, mathematics, and code. The pre-training data is filtered to reduce unwanted or unsafe content while ensuring data diversity and quality.
- Fine-tuning Strategy: The Gemma model is fine-tuned through Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). This includes the use of synthetic text pairs and human-generated prompt-response pairs, as well as a reward model trained on human preference data.
- Safety and Responsibility: Gemma is designed with safety and responsibility in mind, including filtering data during the pre-training phase to reduce the risk of sensitive information and harmful content. Furthermore, Gemma has undergone a series of safety evaluations, including automated benchmark testing and human evaluation, to ensure its safety in practical applications.
- Performance Evaluation: Gemma has been evaluated extensively across multiple domains, including question answering, commonsense reasoning, mathematical and scientific problem-solving, and coding. Compared with open models of similar or larger scale, such as LLaMA 2 13B and Mistral 7B, Gemma achieves superior results on 11 of 18 benchmarks, including MMLU and MBPP.
- Openness and Accessibility: Gemma is released with open weights, providing pretrained and fine-tuned checkpoints along with open-source codebases for inference and deployment. This lets researchers and developers access and build on these models, driving innovation in the AI field.
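Two of the architectural components named above, RoPE and GeGLU, are simple enough to sketch directly. The NumPy code below is a minimal illustration of the underlying math (using the common "rotate-half" pairing convention for RoPE), not Gemma's actual implementation; all function names and shapes here are our own choices:

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Dimension i is paired with dimension i + dim/2, and each pair is
    rotated by an angle pos * theta_i that grows with the position.
    Because each pair undergoes a pure rotation, vector norms are
    preserved, and position 0 (angle 0) is left unchanged.
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)       # theta_i per pair
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

def geglu(x, W, V, b, c):
    """GeGLU feed-forward activation: GELU(x @ W + b) * (x @ V + c).

    One projection is passed through GELU and acts as a gate on the
    other, replacing the plain ReLU(x @ W + b) of a standard FFN.
    """
    def gelu(z):  # tanh approximation of GELU
        return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))
    return gelu(x @ W + b) * (x @ V + c)
```

A useful property to note: because RoPE is a rotation, the dot product between a rotated query and a rotated key depends only on their relative offset, which is what lets the model generalize over positions.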