A Python environment has been pre-installed by a PhD student (Yifan) in /expanse/lustre/projects/csb176/yifanq/share-env/pytorch-env. Using this shared environment saves storage space because you do not need to install your own copy.
git clone https://github.com/Qiaoyf96/Expanse-Instruction.git
The clone creates an "Expanse-Instruction" directory containing two files: .bashrc and example.py. You may also get them from here (bashrc.txt is the same as .bashrc).
cd Expanse-Instruction
source .bashrc
You can activate the shared environment directly with the following Conda command (I have not verified this yet):
conda activate /expanse/lustre/projects/csb176/yifanq/share-env/pytorch-env
Alternatively, link the shared environment into your own Conda directory (I have verified this method):
mkdir -p ~/.conda/envs
cd ~/.conda/envs
ln -s /expanse/lustre/projects/csb176/yifanq/share-env/pytorch-env .
echo "$HOME/.conda/envs/pytorch-env" >> ~/.conda/environments.txt
Then activate the Conda environment:
conda activate pytorch-env
srun --pty --nodes=1 --ntasks-per-node=1 -p gpu-shared --gpus=1 -t 00:05:00 -A csb175 python example.py
Expected output: Click here.
Note that you can also submit a Python job using sbatch.
The above run loads data and writes about 66MB of temporary files under the "data" subdirectory.
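As a rough illustration of the sbatch alternative, the sketch below builds a batch script that mirrors the options of the srun command above. The function name render_sbatch_script, the job name, and the output file pattern are hypothetical choices made for this example. It also assumes the job is submitted from a shell where the Conda environment is already active, relying on Slurm's default behavior of passing the submitting shell's environment to the job; check the Expanse documentation if that does not hold for your setup.

```python
# Sketch only: generate a Slurm batch script equivalent to the srun test run.
# The job name and output file pattern are hypothetical, not part of the
# original instructions.


def render_sbatch_script(account="csb175", partition="gpu-shared",
                         time_limit="00:05:00", gpus=1,
                         command="python example.py"):
    """Return batch script text that mirrors the srun options used above."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=example",        # hypothetical job name
        "#SBATCH --output=example.%j.out",   # %j is replaced by the job ID
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks-per-node=1",
        f"#SBATCH --gpus={gpus}",
        f"#SBATCH --time={time_limit}",
        "",
        # Assumption: the Conda environment was activated before submission.
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script so it can be redirected into a file.
    print(render_sbatch_script(), end="")
```

If this is saved under a name of your choice, its printed output can be redirected into a file such as example.sbatch (also a hypothetical name) and submitted with "sbatch example.sbatch". Unlike srun --pty, the job then runs in the background and its output goes to the file named in the --output line.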
srun --pty --nodes=1 --ntasks-per-node=1 -p shared -t 00:10:00 -A csb175 /bin/bash
python
>>> from pyserini.search import SimpleSearcher
>>> searcher = SimpleSearcher.from_prebuilt_index('robust04')
>>> hits = searcher.search('hubble space telescope')
>>> for i in range(0, 10):
...     print(f'{i+1:2} {hits[i].docid:15} {hits[i].score:.5f}')
...
Click here for the detailed trace. Note that it may take a few minutes to load Pyserini and fetch the Robust04 index remotely from the Pyserini website.
Hidden temporary space used by Pyserini.
Note that the above test downloads the Robust04 index dataset into the .cache directory under
your home directory, for example /home/tyang/.cache/pyserini/.
The Robust04 index dataset takes about 2GB of space.
If you process and search other datasets pre-processed by Pyserini, this directory
will grow very quickly. You can use the Linux command "du" to check the space usage.
When appropriate, you should remove this directory, or part of it, to avoid wasting space on temporary data.
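The space check described above can also be done from Python. The following is a minimal sketch, assuming the cache lives at ~/.cache/pyserini as in the example path; the helper names directory_size_bytes and largest_subdirectories are made up for this illustration. It sums file sizes rather than disk blocks, so its totals may differ somewhat from what "du" reports.

```python
# Sketch only: report how much space a cache directory uses, similar in
# spirit to "du". Helper names are hypothetical.
import os
from pathlib import Path


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file disappeared or is unreadable; ignore it.
                pass
    return total


def largest_subdirectories(root, top=5):
    """Return up to `top` (size_in_bytes, name) pairs, largest first."""
    sizes = []
    for child in Path(root).iterdir():
        if child.is_dir() and not child.is_symlink():
            sizes.append((directory_size_bytes(child), child.name))
    sizes.sort(reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    # Assumed cache location, following the example path in the text.
    cache = Path.home() / ".cache" / "pyserini"
    if cache.is_dir():
        print(f"{cache}: {directory_size_bytes(cache) / 1e9:.2f} GB")
        for size, name in largest_subdirectories(cache):
            print(f"  {name}: {size / 1e9:.2f} GB")
    else:
        print(f"{cache} does not exist")
```

Listing the largest subdirectories first makes it easier to decide which part of the cache to remove instead of deleting the whole directory.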