参考来自FUN科技
首先购买两张INTEL A770(4-9更新:MI50单卡32G即可运行)+带两个满速PCIE4.0*16主板+CPU(最经济:MZ32-AR0+EPYC 7282),不低于750W的电源,64G或更大内存,安装UBUNTU2204系统后进入操作教程:
1.开启主板BIOS内的PCIE Resizable BAR Support功能
安装驱动前先在根目录下创建一个/model/文件夹,后文中的文件需要全部放在那个文件夹内!!
安装 Intel Out-of-Tree GPU driver
wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
sudo gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy unified" | \
sudo tee /etc/apt/sources.list.d/intel-gpu-jammy.list
sudo apt update
sudo apt install -y intel-i915-dkms intel-fw-gpu
设置用户组
sudo gpasswd -a ${USER} render
sudo reboot
•将${USER}替换为用户名
验证Intel® Arc™ A770 PCIe Configuration Space
sudo lspci | grep -i vga
输出应为:
03:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08)
04:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08)
sudo lspci -s 03:00.0 -vvv
输出应为(A770输出)
03:00.0 VGA compatible controller: Intel Corporation Device 56a0 (rev 08) (prog-if 00 [VGA controller])
Subsystem: Intel Corporation Device 1020
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin ? routed to IRQ 222
IOMMU group: 21
Region 0: Memory at 75000000 (64-bit, non-prefetchable) [size=16M]
Region 2: Memory at 4800000000 (64-bit, prefetchable) [size=16G]
Expansion ROM at 76000000 [disabled] [size=2M]
Capabilities: [40] Vendor Specific Information: Len=0c <?>
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
Address: 00000000fee00f58 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [d0] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [420 v1] Physical Resizable BAR
BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
Capabilities: [400 v1] Latency Tolerance Reporting
Max snoop latency: 15728640ns
Max no snoop latency: 15728640ns
Kernel driver in use: i915
Kernel modules: i915, xe
3.安装Docker
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
4. 模型路径
\\Nas-disk8\研发组\本地大语言模型\DeepSeek-R1-Distill-Qwen-32B-AWQ
5. 加载镜像
•载入Intel提供的LLM后端镜像,位置如下:
\\Nas-disk8\研发组\本地大语言模型\intel\ipex-llm-serving.tar.gz
使用以下命令载入:
sudo docker load -i ipex-llm-serving.tar.gz
•载入Intel提供的前端镜像,位置如下:
\\Nas-disk8\研发组\本地大语言模型\intel\openwebui.tar.gz
使用以下命令载入:
sudo docker load -i openwebui.tar.gz
加载成功后sudo docker images应该出现以下打印:
REPOSITORY TAG IMAGE ID CREATED SIZE
intelanalytics/ipex-llm-serving-xpu 2.2.0-b12-client 09b8caae0ec1 3 days ago 22.5GB
ghcr.io/open-webui/open-webui main afe4f6c41e46 4 months ago 4.11GB
6. 启动容器
将两个启动容器的脚本拷贝到本地/model文件夹
\\Nas-disk8\研发组\本地大语言模型\intel\create-llm.sh
\\Nas-disk8\研发组\本地大语言模型\intel\create-ui.sh
分别执行两个脚本启动前后端:
启动后端:
sudo bash create-llm.sh
启动前端:
bash create-ui.sh
7. 启动应用
执行以下命令:
docker exec -it llm-backend bash /model/ds.sh
出现以下内容,后端启动完成:
INFO 02-21 09:37:47 launcher.py:19] Available routes are:
INFO 02-21 09:37:47 launcher.py:27] Route: /openapi.json, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /docs, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /docs/oauth2-redirect, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /redoc, Methods: GET, HEAD
INFO 02-21 09:37:47 launcher.py:27] Route: /health, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /tokenize, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /detokenize, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/models, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /version, Methods: GET
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/chat/completions, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/completions, Methods: POST
INFO 02-21 09:37:47 launcher.py:27] Route: /v1/embeddings, Methods: POST
INFO: Started server process [19]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
INFO: 127.0.0.1:47794 - "GET /v1/models HTTP/1.1" 200 OK
执行以下命令验证前端启动:
docker logs llm-frontend
出现以下内容即已启动:
INFO [open_webui.apps.audio.main] whisper_device_type: cpu
___ __ __ _ _ _ ___
/ _ \ _ __ ___ _ __ \ \ / /__| |__ | | | |_ _|
| | | | '_ \ / _ \ '_ \ \ \ /\ / / _ \ '_ \| | | || |
| |_| | |_) | __/ | | | \ V V / __/ |_) | |_| || |
\___/| .__/ \___|_| |_| \_/\_/ \___|_.__/ \___/|___|
|_|
v0.3.32 - building the best open-source AI user interface.
https://github.com/open-webui/open-webui
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
Comments NOTHING