INCLUSION AI

Introducing Ming-Lite-Omni V1.5

GITHUB 🤗 Hugging Face｜ 🤖 ModelScope Overview Ming-lite-omni v1.5 is a comprehensive upgrade to the full-modal capabilities of Ming-lite-omni(Github). It significantly improves performance across tasks including image-text understanding, document understanding, video understanding, speech understanding and synthesis, and image generation and editing. Built upon Ling-lite-1.5, Ming-lite-omni v1.5 has a total of 20.3 billion parameters, with 3 billion active parameters in its MoE (Mixture-of-Experts) section. It demonstrates highly competitive results in various modal benchmarks compared to industry-leading models....

M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning

📖 Technical Report | 🤗 Hugging Face｜ 🤖 ModelScope Introduction We introduce M2-Reasoning-7B, a model designed to excel in both general and spatial reasoning. Our approach integrates two key innovations: (1) a novel data pipeline that generates 294.2K high-quality data samples (168K for cold-start fine-tuning and 126.2K for RLVR), which feature logically coherent reasoning trajectories and have undergone comprehensive assessment; and (2) a dynamic multi-task training strategy with step-wise optimization to mitigate conflicts between data, and task-specific rewards for delivering tailored incentive signals....

ABench: An Evolving Open-Source Benchmark

GITHUB 🌟 Overview ABench is an evolving open-source benchmark suite designed to rigorously evaluate and enhance Large Language Models (LLMs) on complex cross-domain tasks. By targeting current model weaknesses, ABench provides systematic challenges in high-difficulty specialized domains, including physics, actuarial science, logical reasoning, law, and psychology. 🎯 Core Objectives Address Evaluation Gaps: Design high-differentiation assessment tasks targeting underperforming question types Establish Unified Standards: Create reliable, comparable benchmarks for multi-domain LLM evaluation Expand Capability Boundaries: Drive continuous optimization of knowledge systems and reasoning mechanisms through challenging innovative problems 📊 Dataset Release Status Domain Description Status Physics 500 university/competition-level physics problems (400 static + 100 dynamic parametric variants) covering 10+ fields from classical mechanics to modern physics ✅ Released Actuary Curated actuarial exam problems covering core topics: probability statistics, financial mathematics, life/non-life insurance, actuarial models, and risk management ✅ Released Logic High-differentiation logical reasoning problems from authoritative tests (LSAT/GMAT/GRE/SBI/Chinese Civil Service Exam) 🔄 In Preparation Psychology Psychological case studies and research questions (objective/subjective) evaluating understanding of human behavior and theories 🔄 In Preparation Law Authoritative judicial exam materials covering core legal domains: criminal/civil/administrative/procedural/international law 🔄 In Preparation

AWorld: The Agent Runtime for Self-Improvement

“Self-awareness: the hardest problem isn’t solving within limits, it’s discovering the own limitations” Table of Contents News — Latest updates and announcements. Introduction — Overview and purpose of the project. Installation — Step-by-step setup instructions. Quick Start — Get started with usage examples. Architecture — Explore the multi-agent system design. Demo — See the project in action with demonstrations. Contributing — How to get involved and contribute. License — Project licensing details....

Ming-Omni: A Unified Multimodal Model for Perception and Generation

GITHUB 📑 Technical Report｜📖Project Page ｜🤗 Hugging Face｜ 🤖 ModelScope Introduction Ming-lite-omni, a light version of Ming-omni, which is derived from Ling-lite and features 2.8 billion activated parameter. Ming-lite-omni is a unified multimodal model capable of processing images, text, audio, and video, while demonstrating strong proficiency in both speech and image generation. Ming-lite-omni employs dedicated encoders to extract tokens from different modalities, which are then processed by Ling, an MoE architecture equipped with newly proposed modality-specific routers....