Troubleshoot

Compilation

Running an example

If you get an error like Segmentation fault (core dumped) while running an example, it could be because you haven’t changed the computing service used in the example. (e.g: It uses de GPU and you don’t have one)

Also, if it is using CPU and the library has been compiled for CPU, it could be because your processor does not support AVX instructions.

OpenMP (MacOS)

On MacOS, the Clang that comes with XCode doesn’t support OpenMP. Therefore, we recommend you to install the GCC compile so that you can take advantange of OpenMP. (Note: By default, GCC is just a symlink to Clang. more_)

# Install the real GCC
brew install gcc

# Add flags to cmake
cmake .. -DCMAKE_PREFIX_PATH=$CONDA_PREFIX -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX
-DCMAKE_C_COMPILER=/usr/local/Cellar/gcc/10.2.0_3/bin/gcc-10
-DCMAKE_CXX_COMPILER=/usr/local/Cellar/gcc/10.2.0_3/bin/g++-10

As a last resort, you can always disable OpenMP and use the EDDL by making use of the cmake flag -D BUILD_OPENMP=OFF

Undefined symbols for architecture x86_64

If you cannot compile the EDDL using the distributed mode due to OpenSSL, you might try these things:

First, make you you have OpenSSL installed:

  • Ubuntu/Debian: sudo apt-get install libcrypto++-dev libssl-dev

  • MacOS: brew install openssl

If this does not work, check if the following paths are correctly setup:

# NOTE: This is a copy-paste from "brew", but for linux should be quite similar.

If you need to have openssl@1.1 first in your PATH run:
  echo 'export PATH="/usr/local/opt/openssl@1.1/bin:$PATH"' >> ~/.zshrc

For compilers to find openssl@1.1 you may need to set:
  export LDFLAGS="-L/usr/local/opt/openssl@1.1/lib"
  export CPPFLAGS="-I/usr/local/opt/openssl@1.1/include"

For pkg-config to find openssl@1.1 you may need to set:
  export PKG_CONFIG_PATH="/usr/local/opt/openssl@1.1/lib/pkgconfig"

(MacOS) Undefined symbols for architecture x86_64

First, try deleting the build/ folder and run cmake again. If this doesn’t work, try forcing a specific compiler either with the flag -DCMAKE_CXX_COMPILER or by exporting these variables to your environment:

# In MacOS we recommend CLang

# Set env variables
export CC=/usr/local/opt/llvm/bin/clang
export CXX=/usr/local/opt/llvm/bin/clang++
export LDFLAGS="-L/usr/local/opt/llvm/lib"
export CPPFLAGS="-I/usr/local/opt/llvm/include"

Memory

Memory optimization

You can change the memory consumption through three different levels of memory optimization:

  • full_mem (default): No memory bound (highest speed at the expense of the memory requirements)

  • mid_mem: Slight memory optimization (good trade-off memory-speed)

  • low_mem: Optimized for hardware with restricted memory capabilities.

Take into account that these levels respond to the typical memory-speed trade-off

Protobuf

Missing includes

If you get an error like this:

.../eddl/src/serialization/onnx/onnx.pb.h:10:10: fatal error: google/protobuf/port_def.inc: No such file or directory
#include <google/protobuf/port_def.inc>

First, make sure that you have protobuf installed and cmake is detecting the paths correctly:

-- Protobuf include: /usr/include
-- Protobuf libraries: /usr/lib/x86_64-linux-gnu/libprotobuf.so-lpthread
-- Protobuf compiler: /usr/bin/protoc

If you are using conda, first check that you have activated the environment: conda activate eddl. Then, if the error persists, check if the paths of protobuf outputed by CMake have been mixed up with the paths from the system (in case protobuf is also installed in the system) like this:

-- Protobuf dir:
-- Protobuf include: /usr/include
-- Protobuf libraries: /usr/lib/x86_64-linux-gnu/libprotobuf.so-lpthread
-- Protobuf compiler: /home/salvacarrion/anaconda3/envs/eddl/bin/protoc

You can try fixing it by forcing cmake to look into the conda env using the flags: -DCMAKE_PREFIX_PATH=$CONDA_PREFIX -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX (We recommend to delete the build/ folder to avoid cache problems)

If the error persists, use the flag -D BUILD_SUPERBUILD=ON to download all dependencies and link them automatically to the EDDL.

Missing lib

If you get an error like this:

make[2]: *** No rule to make target 'cmake/third_party/protobuf/lib/libprotobuf.a', needed by 'lib64/libeddl.so'.  Stop.

It is because, when using -DBUILD_SUPERBUILD=ON, all critical dependencies are downloaded and compiled locally. These compiled libraries can be found in eddl/build/cmake/third_party/. The problem with the protobuf static library is that, in some systems, it can be found either on protobuf/lib/ or protobuf/lib64/.

Because the EDDL looks into lib/ (by default), when the protobuf library appears in lib64/ we cannot find it. To fix this, create a symbolic link from lib64/ to lib/:

# Inside: eddl/build/cmake/third_party/protobuf/
ln -s lib64 lib

No matching function

See question below (Old version of protoc).

Old version of protoc

This is because your version of protobuf is not compatible with the ONNX files we provide (onnx.pb.h/cc and onnx.proto). We know that the current version of the EDDL (v0.7 at the moment of writing this) works with protobuf 3.11. To install it, you can either use the conda environment (recommended):

# Install dependencies
conda env create -f environment.yml
conda activate eddl

…or install protobuf manually:

# Variables
PROTOBUF_VERSION=3.11.4

# Install requirements
sudo apt-get install -y wget
sudo apt-get install -y autoconf automake libtool curl make g++ unzip

# Download source
wget https://github.com/protocolbuffers/protobuf/releases/download/v$PROTOBUF_VERSION/protobuf-cpp-$PROTOBUF_VERSION.tar.gz
tar -xf protobuf-cpp-$PROTOBUF_VERSION.tar.gz

# Build and install
cd protobuf-$PROTOBUF_VERSION
./configure
make -j$(nproc)
make install  # you may need sudo
ldconfig

If everything is correct, cmake should output something like this, and compile without problems.

-- Use Protobuf: ON
-- Protobuf dir:
-- Protobuf include: /usr/local/include
-- Protobuf libraries: /usr/local/lib/libprotobuf.so-lpthread
-- Protobuf compiler: /usr/local/bin/protoc

ONNX functions

If the ONNX functions don’t work, it might be due to a problem with protobuf, so:

  1. Make sure you have protobuf and libprotobuf installed in standard paths

  2. If you are building the EDDL from source:

    1. Make use of the cmake flag: BUILD_PROTOBUF=ON

    2. Go to src/serialization/onnx/ and delete these files: onnx.pb.cc and onnx.pb.cc

    3. Run protoc --cpp_out=. onnx.proto in the previous directory (src/serialization/onnx/) and make sure these files have been generated: onnx.pb.cc and onnx.pb.cc

Note

Additionally, we recommend making use of the Anaconda environment (see Installation section for more details).

CUDA

Unsupported GNU version

If you get an error like this:

/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!

It is because NVIDIA does not support all GNU compilers. Each new version of CUDA supports a different range of GNU compilers. The solution is to simply use a GNU C++ compiler with a version lower or equal to 8.x. You can do this by:

// Exporting these aliases to .bashrc
export CC=gcc-7
export CXX=g++-7

// Or creating a symbolic link to the CUDA GCC
sudo ln -s /usr/bin/gcc-7 /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-7 /usr/local/cuda/bin/g++

// ..or set the following flags on the cmake command
cmake .. -DCMAKE_PREFIX_PATH=$CONDA_PREFIX -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX \
-DBUILD_TARGET=CUDNN \
-DCMAKE_C_COMPILER=/usr/bin/gcc-7 \
-DCMAKE_CXX_COMPILER=/usr/bin/g++-7 \
-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
-DCMAKE_CUDA_HOST_COMPILER=/usr/bin/g++-7 \

Anyway, it is convenient to check which is the maximum GCC version that your CUDA supports.

If the problem persists, reinstall CUDA from the official site

IDEs

CLion

I usually have to set additional flags in order to make CLion able to run the EDDL smoothly:

-DBUILD_TARGET=CUDNN
-DCMAKE_C_COMPILER=/usr/bin/gcc-7
-DCMAKE_CXX_COMPILER=/usr/bin/g++-7
-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc

If you want to run it using the conda environment, add:

-DCMAKE_INSTALL_PREFIX=/path/to/dir
-DCMAKE_PREFIX_PATH=/path/to/dir

# Note:
To get the path, activate the environment and type:
echo $CONDA_PREFIX

Dealing with Numpy files

To convert EDDL files (.bin) to numpy files (.npy) or viceversa, we need to install the PyEDDL:

# Install numpy
pip install numpy

# Install PyEDDL
conda config --add channels dhealth
conda config --add channels conda-forge
conda config --set channel_priority strict
conda install -c dhealth pyeddl-cpu  # or *-gpu

- From Numpy (.npy) to EDDL (.bin) files:

from pyeddl.tensor import Tensor
import numpy as np

# Convert numpy to bin
t_npy = np.load("myarray.npy")
t_eddl = Tensor.fromarray(t_npy)
t_eddl.save("myarray.bin")

- From EDDL (.bin) files to Numpy (.npy):

from pyeddl.tensor import Tensor
import numpy as np

# Convert bin to numpy
t_bin = Tensor.load("myarray.bin")
np.save("myarray.npy", t_bin)