Installation

Environment Preparation

The required packages and corresponding version are shown as follows:

  • Python == 3.10

  • GCC == 10.2.0

  • MPFR == 4.1.0

  • CUDA >= 11.8

  • Pytorch >= 2.1.0

  • Transformers >= 4.28.0

  • Flash-Attention >= v2.2.1

  • Apex == 23.05

  • GPU with Ampere or Hopper architecture (such as H100, A100)

  • Linux OS

After installing the above dependencies, some system environment variables need to be updated:

export CUDA_PATH={path_of_cuda_11.8}
export GCC_HOME={path_of_gcc_10.2.0}
export MPFR_HOME={path_of_mpfr_4.1.0}
export LD_LIBRARY_PATH=${GCC_HOME}/lib64:${MPFR_HOME}/lib:${CUDA_PATH}/lib64:$LD_LIBRARY_PATH
export PATH=${GCC_HOME}/bin:${CUDA_PATH}/bin:$PATH
export CC=${GCC_HOME}/bin/gcc
export CXX=${GCC_HOME}/bin/c++

Installation

可以通过pip命令直接安装,命令如下:

pip install InternEvo==xxx (xxx是需要安装的版本号信息)

这种方式仅安装了InternEvo项目,其依赖的软件包及子模块尚未安装。

也可以通过源码安装,将项目InternEvo及其依赖子模块,从 github 仓库中 clone 下来,命令如下:

git clone git@github.com:InternLM/InternEvo.git --recurse-submodules

It is recommended to build a Python-3.10 virtual environment using conda and install the required dependencies based on the requirements/ files:

conda create --name internevo-env python=3.10 -y
conda activate internevo-env
cd InternEvo
pip install -r requirements/torch.txt
pip install -r requirements/runtime.txt

安装 flash-attention (version v2.2.1):

cd ./third_party/flash-attention
python setup.py install
cd ./csrc
cd fused_dense_lib && pip install -v .
cd ../xentropy && pip install -v .
cd ../rotary && pip install -v .
cd ../layer_norm && pip install -v .
cd ../../../../

Install Apex (version 23.05):

cd ./third_party/apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ../../

Environment Image

Users can use the provided dockerfile combined with docker.Makefile to build their own images, or obtain images with InternEvo runtime environment installed from https://hub.docker.com/r/internlm/internlm.

Image Configuration and Build

The configuration and build of the Dockerfile are implemented through the docker.Makefile. To build the image, execute the following command in the root directory of InternEvo:

make -f docker.Makefile BASE_OS=centos7

In docker.Makefile, you can customize the basic image, environment version, etc., and the corresponding parameters can be passed directly through the command line. For BASE_OS, ubuntu20.04 and centos7 are respectively supported.

Pull Standard Image

The standard image based on ubuntu and centos has been built and can be directly pulled:

# ubuntu20.04
docker pull internlm/internlm:torch1.13.1-cuda11.7.1-flashatten1.0.5-ubuntu20.04
# centos7
docker pull internlm/internlm:torch1.13.1-cuda11.7.1-flashatten1.0.5-centos7

Run Container

For the local standard image built with dockerfile or pulled, use the following command to run and enter the container:

docker run --gpus all -it -m 500g --cap-add=SYS_PTRACE --cap-add=IPC_LOCK --shm-size 20g --network=host --name myinternlm internlm/internlm:torch1.13.1-cuda11.7.1-flashatten1.0.5-centos7 bash

The default directory in the container is /InternLM, please start training according to the Usage.