.. _preparation: =========== Preparation =========== Last updated: 2025-11-08 With the environment and dependencies installed, the final step is to prepare the necessary assets for training. This guide covers the two main prerequisites: downloading and converting the model weights, and formatting the training dataset. Download and Merge Model Weights ================================ Our training scripts require model weights in a “merged-expert” format for optimal performance. Before starting, you must download the standard weights and convert them. Step 1: Download Original Model ------------------------------- We provide a helper script to download the weights from Hugging Face: .. code-block:: bash # Choose a destination for the original model files python ./scripts/download_hf_model.py \ --repo_id inclusionAI/LLaDA2.0-mini-preview \ --local_dir /path/to/separate_expert_model Step 2: Convert to Merged Format -------------------------------- Run the following script to create the merged checkpoint required for training: .. code-block:: bash # Use the path from the previous step as the source python scripts/moe_convertor.py \ --input-path /path/to/separate_expert_model \ --output-path /path/to/save/merged_model \ --mode merge The directory ``/path/to/save/merged_model`` is what you will use for the training script. For more details, see `MoE Expert Merging and Splitting Utilities <#moe-expert-merging-and-splitting-utilities>`__ Prepare Training Data ===================== This tutorial uses the ``openai/gsm8k`` dataset and demonstrates how to convert it into the conversational format. Provided Script --------------- We provide an example script ``./scripts/build_gsm8k_dataset.py`` for this purpose. You can adapt this script or write your own to process other datasets. The script converts the "question" and "answer" fields into a conversational messages field. The processed dataset is saved to the ``./gsm8k_datasets/`` directory, split into: - ``train.jsonl`` - Training data - ``test.jsonl`` - Evaluation data Run the script: .. code-block:: bash python ./scripts/build_gsm8k_dataset.py