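Text Embedding Model Evaluation
To judge how good an embedding model is, see the MTEB Leaderboard, a Hugging Face Space by mteb (https://huggingface.co/spaces/mteb/leaderboard), a benchmark leaderboard for large-scale text representation learning methods. It evaluates text embedding models on a large collection of datasets covering text classification, clustering, reranking, retrieval, and other tasks, and reports an average score that summarizes the model's text embedding ability.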
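As a rough illustration of how a single leaderboard number can come about, the sketch below averages per-dataset scores. The dataset names and scores are made-up placeholders, not real MTEB results, and the plain arithmetic mean is an assumption about the aggregation; the leaderboard's exact procedure may differ.

```python
from statistics import mean

# Hypothetical per-dataset scores for one embedding model (placeholders,
# NOT real MTEB results), grouped by task type.
scores = {
    "classification": {"dataset_a": 0.70, "dataset_b": 0.80},
    "clustering": {"dataset_c": 0.40},
    "reranking": {"dataset_d": 0.60},
    "retrieval": {"dataset_e": 0.50},
}


def average_score(task_scores):
    """Plain mean over every dataset, regardless of task type."""
    all_scores = [s for datasets in task_scores.values() for s in datasets.values()]
    return mean(all_scores)


overall = average_score(scores)
dataset_count = sum(len(datasets) for datasets in scores.values())
print(f"average over {dataset_count} datasets: {overall:.3f}")
```

A plain mean weights every dataset equally, so task types with many datasets dominate the result; averaging per task type first would be a different, equally simple choice.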
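I. Problem list:
Problem 1: Implement a queue with two stacks (JZ9)
II. Problems
Problem 1: Implement a queue with two stacks (JZ9)
Description
Implement a queue using two stacks. With n elements, support n insertions of an integer at the tail of the queue (push) and n removals of an integer from the head of the queue (pop). The elements are of type int. All operations are guaranteed to be valid, i.e. the queue is non-empty whenever pop is called.
Data range: n ≤ 1000
Requirements: O(n) space to store the n elements; O(1) time for both insertion and deletion.
2. Code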
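# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.stack1 = []  # receives pushed elements
        self.stack2 = []  # holds elements in dequeue order

    def push(self, node):
        # Enqueue: always push onto stack1.
        self.stack1.append(node)

    def pop(self):
        # Dequeue: refill stack2 from stack1 only when stack2 is empty,
        # which reverses the order so the oldest element ends up on top.
        if not self.stack2:
            while self.stack1:
                self.stack2.append(self.stack1.pop())
        return self.stack2.pop()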
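A short usage sketch of the same idea, showing that pushes and pops can be interleaved while first-in-first-out order is preserved. The names TwoStackQueue, inbox, and outbox are illustrative renames of Solution, stack1, and stack2, not part of the original problem template.

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks (Python lists)."""

    def __init__(self):
        self.inbox = []   # receives newly pushed elements
        self.outbox = []  # serves elements in arrival order

    def push(self, node):
        self.inbox.append(node)

    def pop(self):
        # Move elements over only when outbox is empty; each element is
        # moved at most once, so pop is O(1) amortized.
        if not self.outbox:
            while self.inbox:
                self.outbox.append(self.inbox.pop())
        return self.outbox.pop()


queue = TwoStackQueue()
queue.push(1)
queue.push(2)
first = queue.pop()                 # oldest element leaves first
queue.push(3)                       # pushed while 2 is still waiting
rest = [queue.pop(), queue.pop()]
print(first, rest)
```

Because every element is appended and popped at most twice in total, a single pop can take O(n) in the worst case, but n operations together stay O(n).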
Check the GPU model: lspci | grep -i nvidia
10:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
16:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
49:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
4d:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
54:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
55:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
56:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
57:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
58:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
59:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
89:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
8e:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
c5:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
c9:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
Look up the exact model at https://admin.pci-ids.ucw.cz/read/PC/10de/20f3 (10de is the NVIDIA vendor ID, 20f3 is the device ID from the lspci output).
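When several device IDs show up, the lookup URLs can be built mechanically. The sketch below parses a few lines copied from the lspci output above and follows the URL pattern of the link in these notes; the helper name device_lookup_urls and the regular expression are illustrative assumptions.

```python
import re
from collections import Counter

# Sample lines copied from the lspci output above.
LSPCI_OUTPUT = """\
10:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
54:00.0 Bridge: NVIDIA Corporation Device 1af1 (rev a1)
89:00.0 3D controller: NVIDIA Corporation Device 20f3 (rev a1)
"""

NVIDIA_VENDOR_ID = "10de"  # vendor part of the lookup URL used in these notes
LINE_RE = re.compile(r"^(\S+) (.+?): NVIDIA Corporation Device ([0-9a-f]{4})\b")


def device_lookup_urls(lspci_text):
    """Map each distinct NVIDIA device ID to (count, lookup URL)."""
    counts = Counter()
    for line in lspci_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(3)] += 1
    return {
        dev: (n, f"https://admin.pci-ids.ucw.cz/read/PC/{NVIDIA_VENDOR_ID}/{dev}")
        for dev, n in counts.items()
    }


for dev, (count, url) in device_lookup_urls(LSPCI_OUTPUT).items():
    print(f"{dev}: {count} device(s) -> {url}")
```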
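Reference
1. https://www.zhihu.com/question/618932114/answer/3192465335
Run nvidia-smi to check whether the driver is installed.
If it is not, download the CUDA Toolkit installer, which bundles the driver and installs it along with the toolkit; pick the version that matches your system from the archive:
https://developer.nvidia.com/cuda-toolkit-archive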
Problem 1.
Driver installation error:
/var/log/nvidia-installer.log
ERROR: Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Solution:
yum install kernel-devel-$(uname -r)
sh cuda_12.2.2_535.104.05_linux.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.el7.x86_64/
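The installation environment here is offline, so download the CUDA installer beforehand; get the build that matches your operating system from the official archive: https://developer.nvidia.com/cuda-toolkit-archive
Driver and CUDA version compatibility:
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
Check the OS and kernel versions: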
cat /etc/redhat-release
uname -r
GPU
Driver version
Driver download
(Installation guide)
Download the kernel-devel and kernel-headers packages that match the system kernel version
Problems:
Error 'An NVIDIA kernel module 'nvidia' appears to already be loaded in your kernel' when trying to get GPU support in AWS EMR
sudo lsof /dev/nvidia*
Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel
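Solution: yum install elfutils-libelf-devel (CentOS 8)
- Check whether the kernel versions match. If the version reported by uname -r differs from the one under /usr/src: http://jiaocheng.bubufx.com/info-show-1012538.html
1. Install a C compiler: yum install gcc
2. Install kernel-devel: yum install kernel-devel
3. Compare the kernel and kernel-devel versions: uname -r; rpm -q kernel kernel-devel
4. If the two versions differ, upgrade: yum -y update kernel kernel-devel
5. Check the versions again; if they still differ, reboot: reboot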
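The version comparison in the steps above can be sketched as a small check: the driver installer needs a source tree under /usr/src/kernels whose name equals the running kernel release. The function names are illustrative, and the mismatching directory name in the example is hypothetical.

```python
import os
import platform


def kernel_devel_matches(running_release, installed_dirs):
    """True if a kernel source tree exists for the running kernel."""
    return running_release in installed_dirs


def check_local(kernels_root="/usr/src/kernels"):
    """Compare this machine's kernel with the trees under kernels_root."""
    running = platform.release()  # same string as `uname -r` on Linux
    installed = sorted(os.listdir(kernels_root)) if os.path.isdir(kernels_root) else []
    return running, installed, kernel_devel_matches(running, installed)


# Kernel release taken from the install command in these notes; the second,
# mismatching directory name is a hypothetical illustration.
same = kernel_devel_matches("3.10.0-1160.el7.x86_64", ["3.10.0-1160.el7.x86_64"])
differ = kernel_devel_matches("3.10.0-1160.el7.x86_64", ["3.10.0-1160.99.1.el7.x86_64"])
print(same, differ)

running, installed, ok = check_local()
print(f"running={running} installed={installed} match={ok}")
```

If the check reports a mismatch after updating both packages, the machine is most likely still running the old kernel, which is why the last step is a reboot.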
Stop the X server
Start the X server
The Nouveau kernel driver is currently in use by your system
Check
- ./NVIDIA-Linux-x86_64-495.46.run --kernel-source-path=/usr/src/kernels/$(uname -r) -k $(uname -r) --dkms -s
CUDA installation
Environment variables
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=/usr/local/cuda
Reload the shell configuration with source so the variables take effect
cuDNN installation
Verification:
conda installation
Using built-in stream user interface
-> Detected 32 CPUs online; setting concurrency level to 32.
-> The file '/tmp/.X0-lock' exists and appears to contain the process ID '2647' of a running X server.
ERROR: You appear to be running an X server; please exit X before installing. For further details, please see the section INSTALLING THE NVIDIA DRIVER in the README available on the Linux driver download page at www.nvidia.com.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Solution:
systemctl stop gdm.service
-> Detected 128 CPUs online; setting concurrency level to 32.
-> Tagging shared libraries with chcon -t textrel_shlib_t.
ERROR: An NVIDIA kernel module 'nvidia-uvm' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occurred that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Solution:
The GPU is in use; stop the processes that are using it.
Using built-in stream user interface
-> Detected 128 CPUs online; setting concurrency level to 32.
-> Tagging shared libraries with chcon -t textrel_shlib_t.
ERROR: An NVIDIA kernel module 'nvidia' appears to already be loaded in your kernel. This may be because it is in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if your kernel was configured without support for module unloading. Please be sure to exit any programs that may be using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your kernel supports module unloading, and you still receive this message, then an error may have occurred that has corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Solution:
The GPU is in use; find and stop the processes that are using it with:
sudo lsof /dev/nvidia*
kill -9 pid
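To collect the PIDs from the lsof listing without reading it by eye, a small parser is enough. This is a sketch that assumes lsof's default column layout (COMMAND first, PID second); the sample output, process names, and PIDs are made up for illustration.

```python
# Hypothetical `lsof /dev/nvidia*` output; process names and PIDs are made up.
SAMPLE_LSOF = """\
COMMAND    PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
python   12345 root    3u   CHR 195,255      0t0  527 /dev/nvidiactl
python   12345 root    4u   CHR   195,0      0t0  530 /dev/nvidia0
jupyter  23456 root    5u   CHR   195,1      0t0  531 /dev/nvidia1
"""


def gpu_pids(lsof_text):
    """Return {pid: command} for every process listed by lsof."""
    pids = {}
    for line in lsof_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            pids[int(fields[1])] = fields[0]
    return pids


for pid, command in sorted(gpu_pids(SAMPLE_LSOF).items()):
    print(f"{command} (pid {pid}) holds an NVIDIA device file")
```

A process that opens several device files appears on several lines, so the dictionary collapses it to one entry. Trying a plain kill before kill -9 gives the process a chance to release the GPU cleanly.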
sh ./cuda_11.6.0_510.39.01_linux.run
Extraction failed.
Ensure there is enough space in /tmp and that the installation package is not corrupt
Signal caught, cleaning up
Cause: no extraction tool (tar) is installed.
yum install tar
nvidia-smi lists the GPUs, but torch.cuda.is_available() produces the following error:
UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling
Error 802: system not yet initialized
Solution on CentOS 8
Verify after installation:
Install location: under /usr/local/
nvcc -V
cuDNN download page: https://developer.nvidia.com/rdp/cudnn-archive
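Selection:
(1) Choose the version that matches your CUDA version; for example, for 12.x choose
Download cuDNN v8.9.7 (December 5th, 2023), for CUDA 12.x
(2) Choose the download format; the tar package is the usual choice
Local Installer for Linux x86_64 (Tar)
In general, download CUDA from the official site: https://developer.nvidia.com/cuda-toolkit-archive
Note: during installation, do not install the CUDA driver.
After installation, switch the symlink:
sudo rm -f /usr/local/cuda  # remove the previously created symlink
sudo ln -s /usr/local/cuda-11.3 /usr/local/cuda
nvcc --version  # show the current CUDA version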
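The symlink switch can be rehearsed safely before touching /usr/local. The sketch below performs the same remove-then-link sequence inside a temporary directory; switch_cuda_link is a hypothetical helper, and the version directory names only mirror the ones mentioned in these notes.

```python
import os
import tempfile


def switch_cuda_link(root, version_dir, link_name="cuda"):
    """Point root/link_name at root/version_dir, replacing any old link."""
    link = os.path.join(root, link_name)
    target = os.path.join(root, version_dir)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    if os.path.islink(link):
        os.remove(link)  # removes only the link, not the toolkit it pointed to
    os.symlink(target, link)
    return os.path.realpath(link)


# Demonstrate in a throwaway directory instead of the real /usr/local.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "cuda-11.3"))
    os.mkdir(os.path.join(root, "cuda-12.2"))
    first = os.path.basename(switch_cuda_link(root, "cuda-12.2"))
    second = os.path.basename(switch_cuda_link(root, "cuda-11.3"))
    print(first, "->", second)
```

Resolving the link with os.path.realpath afterwards is a quick way to confirm which toolkit directory the link really points to.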
If that still does not work, set the environment variables directly:
vim ~/.bashrc
# then add
export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.3/lib64
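In one case this still did not work:
which nvcc still pointed to the old location, most likely because the environment variables had not actually changed.
Check the environment variable: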
echo $PATH
## Output: /home/centos/anaconda3/bin:/home/centos/anaconda3/condabin:/home/centos/.local/bin:/home/centos/bin:/usr/local/cuda-12.2/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/cuda/bin
## PATH still contains the hard-coded /usr/local/cuda-12.2/bin entry ahead of /usr/local/cuda/bin, so remove that entry:
export PATH=/home/centos/anaconda3/bin:/home/centos/anaconda3/condabin:/home/centos/.local/bin:/home/centos/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/cuda/bin
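Editing a long PATH by hand is error-prone, so the same cleanup can be sketched as a small function that drops entries with an unwanted prefix and removes duplicates while keeping the order. The function name drop_path_entries is illustrative; the PATH value is the one printed in these notes.

```python
def drop_path_entries(path_value, unwanted_prefix):
    """Remove entries starting with unwanted_prefix and drop duplicates, keeping order."""
    kept = []
    for entry in path_value.split(":"):
        if not entry or entry.startswith(unwanted_prefix) or entry in kept:
            continue
        kept.append(entry)
    return ":".join(kept)


# PATH value printed in the notes above.
old_path = (
    "/home/centos/anaconda3/bin:/home/centos/anaconda3/condabin:"
    "/home/centos/.local/bin:/home/centos/bin:/usr/local/cuda-12.2/bin:"
    "/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/cuda/bin"
)

new_path = drop_path_entries(old_path, "/usr/local/cuda-12.2")
print(f"export PATH={new_path}")
```

The export only affects the current shell; the hard-coded entry also has to be removed from whichever startup file added it, or it comes back at the next login.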