dislib currently requires:
- PyCOMPSs == 2.5
- Scikit-learn >= 0.19.1
- Scipy >= 1.0.0
- NumPy >= 1.15.4
Some of the examples also require matplotlib >= 2.0.0 and pandas >= 0.20.1. numpydoc >= 0.8.0 is required to build the documentation.
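As a quick sanity check, the pip-installable requirements above can be compared against what is installed. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper names `parse_version`, `meets_minimum`, and `report` are illustrative and not part of dislib. PyCOMPSs is left out because it is installed separately.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "scikit-learn": "0.19.1",
    "scipy": "1.0.0",
    "numpy": "1.15.4",
}


def parse_version(text):
    """Turn '1.15.4' or '1.2.0rc1' into a tuple of its leading integers."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare dotted versions numerically, ignoring any suffix."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def report():
    """Print one status line per dependency."""
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        print(f"{name}: {installed} (need >= {minimum}) {status}")


if __name__ == "__main__":
    report()
```

The comparison deliberately avoids third-party packages, so it only understands plain dotted versions; pre-release suffixes are ignored.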
1. Check which PyCOMPSs version to install.
   - Latest dislib release requires PyCOMPSs 2.5 (check here for information about other releases).
2. Install PyCOMPSs following these instructions.
3. Install the latest dislib version with:
pip3 install dislib
- IMPORTANT: dislib requires the pycompss Python module. However, this command will NOT install the module automatically. The module should be available after manually installing PyCOMPSs following the instructions in step 2. For more information on this, see here.
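Because pip does not pull in the pycompss module, it can be worth confirming that both modules are importable before running anything. A minimal sketch, where the helper name `module_available` is hypothetical:

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module `name` is on the import path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    for name in ("pycompss", "dislib"):
        state = "found" if module_available(name) else "MISSING"
        print(f"{name}: {state}")
```

This only checks that the modules can be located by the interpreter you run it with; it does not verify the PyCOMPSs version.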
You can check that everything works fine by running one of our examples:
Download the latest source code here.
Extract the contents of the tar package.
tar xzvf dislib-X.Y.Z.tar.gz
Run an example application.
runcompss --python_interpreter=python3 dislib-X.Y.Z/examples/kmeans.py
1. Install Docker and docker-py
Warning: requires docker version >= 17.12.0-ce
- Follow these instructions
Be aware that for some distros the docker package has been renamed from docker to docker-ce. Make sure you install the new package.
- Add user to docker group to run dislib as a non-root user.
- Check that docker is correctly installed
docker --version
docker ps # this should be empty as no docker processes are yet running.
- Install docker-py
sudo pip3 install docker
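The version requirement from the warning above can also be checked programmatically. This is a sketch under the assumption that `docker --version` prints a line containing a dotted `major.minor.patch` number (for example `Docker version 17.12.0-ce, build ...`); the function names are illustrative.

```python
import re
import shutil
import subprocess

MINIMUM_DOCKER = (17, 12, 0)  # from the warning above


def parse_docker_version(output):
    """Extract (major, minor, patch) from `docker --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def docker_is_recent_enough():
    """Return True if a docker client >= 17.12.0 is on the PATH."""
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "--version"], capture_output=True, text=True, check=False
    )
    version = parse_docker_version(result.stdout)
    return version is not None and version >= MINIMUM_DOCKER


if __name__ == "__main__":
    if docker_is_recent_enough():
        print("docker ok")
    else:
        print("docker missing or too old")
```

Note that this inspects the client only; it does not confirm that the daemon is running or that your user is in the docker group.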
2. Install dislib
sudo pip3 install dislib
This should add the dislib executable to your path.
3. Start dislib in your development directory
Initialize dislib where your source code will be (you can re-init anytime). This will allow docker to access your local code and run it inside the container.
Note that the first time it runs, dislib needs to download the docker image from the registry, which may take a while.
# Without a path it operates on the current working directory.
dislib init
# You can also provide a path
dislib init /home/user/replace/path/
4. Running applications
Note: running dislib through docker does not work with applications that use a GUI or visual plots.
First clone dislib repo and checkout release branch vX.Y.Z (docker version and dislib code should preferably be the same to avoid inconsistencies):
git clone https://github.com/bsc-wdc/dislib.git
Init the dislib environment in the root of the repo.
Source file paths are resolved from the init directory, which can sometimes be confusing. As a rule of thumb, initialize the library in the current directory and check that the paths are correct by running the file with python3 path_to/file.py.
cd dislib
dislib init
dislib exec examples/rf_iris.py
The log files of the execution can be found at $HOME/.COMPSs.
You can also init the library inside the examples folder. This will mount the examples directory inside the container so you can execute it without adding the path:
cd dislib/examples
dislib init
dislib exec rf_iris.py
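The effect of the init directory on script paths can be illustrated with a small helper. This is only a sketch of the rule described above, not dislib's actual implementation; `resolve_from_init` and `check_script` are hypothetical names.

```python
from pathlib import Path


def resolve_from_init(init_dir, script):
    """Resolve `script` relative to the directory where init was run."""
    return (Path(init_dir) / script).resolve()


def check_script(init_dir, script):
    """Return the resolved path, or raise if the file is not there."""
    path = resolve_from_init(init_dir, script)
    if not path.is_file():
        raise FileNotFoundError(f"{script} not found under {init_dir}")
    return path


if __name__ == "__main__":
    # Initialised in the repo root: the script needs its examples/ prefix.
    print(resolve_from_init("dislib", "examples/rf_iris.py"))
    # Initialised inside examples/: the bare file name is enough.
    print(resolve_from_init("dislib/examples", "rf_iris.py"))
```

Both calls in the usage section point at the same file, which mirrors the two command sequences above.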
5. Running Jupyter notebooks
Notebooks can be run using the dislib jupyter command. Run the following snippet from the root of the project:
dislib init
dislib jupyter ./notebooks
An alternative and more flexible way of starting jupyter is using the dislib run command in the following way:
dislib run jupyter-notebook ./notebooks --ip=0.0.0.0 --allow-root
Access your notebook by ctrl-clicking or copy pasting into the browser the link shown on the CLI (e.g. http://127.0.0.1:8888/?token=TOKEN_VALUE).
If the notebook process is not properly closed, you might get the following warning when trying to start jupyter notebooks again:
The port 8888 is already in use, trying another port.
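To confirm that a leftover notebook process is what holds the port, a small probe like the following can help. It is a minimal sketch, assuming the notebook listens on 127.0.0.1; the function name `port_in_use` is illustrative.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(8888):
        print("Port 8888 is busy; a previous notebook may still be running.")
    else:
        print("Port 8888 is free.")
```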
To fix it, just restart the dislib container.
6. Adding more nodes
Note: adding more nodes is still in beta phase. Please report issues, suggestions, or feature requests on GitHub.
To add more computing nodes, you can either let docker create more workers for you or manually create and configure a custom node.
For docker just issue the desired number of workers to be added. For example, to add 2 docker workers:
dislib components add worker 2
You can check that both new computing nodes are up with:
dislib components list
If you want to add a custom node it needs to be reachable through ssh without user. Moreover, dislib will try to copy the working_dir there, so it needs write permissions for the scp.
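One way to test the ssh requirement before adding a node is to attempt a non-interactive login. The sketch below assumes that the requirement means ssh must succeed without any interactive prompt, and relies on the standard OpenSSH options `BatchMode` and `ConnectTimeout`; the helper names are hypothetical.

```python
import subprocess


def build_ssh_check(host):
    """Build an ssh command that fails instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",     # never prompt; fail if keys are missing
        "-o", "ConnectTimeout=5",  # give up quickly on unreachable hosts
        host,
        "true",                    # no-op remote command
    ]


def ssh_works_without_prompt(host):
    """Return True if `ssh host true` succeeds non-interactively."""
    try:
        result = subprocess.run(
            build_ssh_check(host), capture_output=True, check=False
        )
    except FileNotFoundError:  # no ssh client installed
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print(ssh_works_without_prompt("127.0.0.1"))
```

This does not check the write permissions needed for scp; it only covers the login step.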
For example, to add the local machine as a worker node:
dislib components add worker '127.0.0.1:6'
- ‘127.0.0.1’: is the IP used for ssh (can also be a hostname like ‘localhost’ as long as it can be resolved).
- ‘6’: desired number of available computing units for the new node.
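The two parts of the worker argument can be validated before passing it to the command. The following is an illustrative parser for the 'HOST:UNITS' form described above; `parse_worker_spec` is a hypothetical helper, not a dislib function.

```python
def parse_worker_spec(spec):
    """Split 'HOST:UNITS' into (host, units), validating the unit count."""
    host, sep, units = spec.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected 'HOST:UNITS', got {spec!r}")
    count = int(units)  # raises ValueError on non-numeric input
    if count < 1:
        raise ValueError("computing units must be a positive integer")
    return host, count


if __name__ == "__main__":
    host, units = parse_worker_spec("127.0.0.1:6")
    print(f"host={host} computing_units={units}")
```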
Please be aware that dislib components will not list your custom nodes because they are not docker processes and thus it can’t be verified if they are up and running.