最经济本地部署Deepseek-R1 32B模型方法

发布于 12 天前  72 次阅读


参考来自FUN科技

首先购买两张INTEL A770(4-9更新:MI50单卡32G即可运行)+带两个满速PCIE4.0*16主板+CPU(最经济:MZ32-AR0+EPYC 7282),不低于750W的电源,64G或更大内存,安装UBUNTU2204系统后进入操作教程:

1.开启主板BIOS内的PCIE Resizable BAR Support功能

安装驱动前先在根目录下创建一个/model/文件夹,后文中的文件需要全部放在那个文件夹内!!

 安装 Intel Out-of-Tree GPU driver

wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
sudo gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy unified" | \
sudo tee /etc/apt/sources.list.d/intel-gpu-jammy.list
sudo apt update
sudo apt install -y intel-i915-dkms intel-fw-gpu

设置用户组

sudo gpasswd -a ${USER} render
sudo reboot

•将${USER}替换为用户名

验证Intel® Arc™ A770 PCIe Configuration Space

sudo lspci | grep -i vga

输出应为:

03:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08)
04:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08)

sudo lspci -s 03:00.0 -vvv

输出应为(A770输出)

03:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08) (prog-if 00 [VGA controller])
        Subsystem: Intel Corporation Device 1020
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin ? routed to IRQ 222
        IOMMU group: 21
        Region 0: Memory at 75000000 (64-bit, non-prefetchable) [size=16M]
        Region 2: Memory at 4800000000 (64-bit, prefetchable) [size=16G]
        Expansion ROM at 76000000 [disabled] [size=2M]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
                        TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
                Address: 00000000fee00f58  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [d0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [420 v1] Physical Resizable BAR
                BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
        Capabilities: [400 v1] Latency Tolerance Reporting
                Max snoop latency: 15728640ns
                Max no snoop latency: 15728640ns
        Kernel driver in use: i915
        Kernel modules: i915, xe

3.安装Docker

sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

4. 模型路径


\\Nas-disk8\研发组\本地大语言模型\DeepSeek-R1-Distill-Qwen-32B-AWQ

5. 加载镜像

•载入Intel提供的LLM后端镜像,位置如下:

\\Nas-disk8\研发组\本地大语言模型\intel\ipex-llm-serving.tar.gz

使用以下命令载入:

sudo docker load -i ipex-llm-serving.tar.gz

•载入Intel提供的前端镜像,位置如下:

\\Nas-disk8\研发组\本地大语言模型\intel\openwebui.tar.gz

使用以下命令载入:

sudo docker load -i openwebui.tar.gz

加载成功后sudo docker images应该出现以下打印:

REPOSITORY                            TAG                IMAGE ID       CREATED        SIZE
intelanalytics/ipex-llm-serving-xpu   2.2.0-b12-client   09b8caae0ec1   3 days ago     22.5GB
ghcr.io/open-webui/open-webui         main               afe4f6c41e46   4 months ago   4.11GB

6. 启动容器

 将两个启动容器的脚本拷贝到本地/model文件夹

\\Nas-disk8\研发组\本地大语言模型\intel\create-llm.sh
\\Nas-disk8\研发组\本地大语言模型\intel\create-ui.sh

分别执行两个脚本启动前后端:

启动后端:

sudo bash create-llm.sh

启动前端:

bash create-ui.sh

7. 启动应用

执行以下命令:

docker exec -it llm-backend bash /model/ds.sh

出现以下内容,后端启动完成:

INFO 02-21 09:37:47 launcher.py:19] Available routes are:
INFO 02-21 09:37:47 launcher.py:27] Route: /openapi.json, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /docs, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /docs/oauth2-redirect, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /redoc, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /health, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /tokenize, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /detokenize, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/models, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /version, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/chat/completions, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/completions, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/embeddings, Methods: POST
INFO:     Started server process [19]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
INFO:     127.0.0.1:47794 - "GET /v1/models HTTP/1.1" 200 OK

执行以下命令验证前端启动:

docker logs llm-frontend

出现以下内容即已启动:

INFO  [open_webui.apps.audio.main] whisper_device_type: cpu

  ___                    __        __   _     _   _ ___
 / _ \ _ __   ___ _ __   \ \      / /__| |__ | | | |_ _|
| | | | '_ \ / _ \ '_ \   \ \ /\ / / _ \ '_ \| | | || |
| |_| | |_) |  __/ | | |   \ V  V /  __/ |_) | |_| || |
 \___/| .__/ \___|_| |_|    \_/\_/ \___|_.__/ \___/|___|
      |_|                                               

      
v0.3.32 - building the best open-source AI user interface.

https://github.com/open-webui/open-webui

INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)

  • alipay_img
  • wechat_img
得不到的永远在骚动,被偏爱的都有恃无恐
最后更新于 2025-04-09