DeepHealth Deep Learning Dataset. More...
#include <support_eddl.h>
 
  
| Public Member Functions | |
| DLDataset (const filesystem::path &filename, const int batch_size, const DatasetAugmentations &augs, const ColorType ctype=ColorType::RGB, const ColorType ctype_gt=ColorType::GRAY, const unsigned num_workers=1, const double queue_ratio_size=1., const std::vector< bool > &drop_last={}, bool verify=false) | |
| void | ResetBatch (const ecvl::any &split=-1, bool shuffle=false) | 
| Reset the batch counter and optionally shuffle samples indices of the specified split.  More... | |
| void | ResetAllBatches (bool shuffle=false) | 
| Reset the batch counter of each split and optionally shuffle samples indices (within each split).  More... | |
| void | LoadBatch (Tensor *&images, Tensor *&labels) | 
| Load a batch into images and labels tensor.  More... | |
| void | LoadBatch (Tensor *&images) | 
| Load a batch into the images tensor. Useful for test sets when you don't have labels.  More... | |
| void | SetBatchSize (int bs) | 
| Set a new batch size inside the dataset.  More... | |
| virtual void | ProduceImageLabel (DatasetAugmentations &augs, Sample &elem) | 
| Load a sample and its label, and push them to the producers-consumer queue.  More... | |
| void | ThreadFunc (int thread_index) | 
| Function called when the threads are spawned.  More... | |
| std::tuple< std::vector< Sample >, std::shared_ptr< Tensor >, std::shared_ptr< Tensor > > | GetBatch () | 
| Pop batch_size samples from the queue and copy them into EDDL tensors.  More... | |
| void | Start (int split_index=-1) | 
| Spawn num_workers threads.  More... | |
| void | Stop () | 
| Join all the threads.  More... | |
| auto | GetQueueSize () const | 
| Get the current size of the producers-consumer queue of the dataset.  More... | |
| void | SetAugmentations (const DatasetAugmentations &da) | 
| Set the dataset augmentations.  More... | |
| const int | GetNumBatches (const ecvl::any &split=-1) | 
| Get the number of batches of the specified split.  More... | |
| void | ToTensorPlane (const std::vector< int > &label, Tensor *&tensor) | 
| Convert the sample labels into a one-hot encoded tensor and copy it to the batch tensor.  More... | |
| void | SetWorkers (const unsigned num_workers) | 
| Change the number of workers.  More... | |
| void | SetNumChannels (const int n_channels, const int n_channels_gt=1) | 
| Change the number of channels of the Image produced by ECVL and update the internal EDDL tensors shape accordingly. Useful for custom data loading.  More... | |
|  Public Member Functions inherited from ecvl::Dataset | |
| Dataset () | |
| Dataset (const filesystem::path &filename, bool verify=false) | |
| virtual | ~Dataset () | 
| std::vector< int > & | GetSplit (const ecvl::any &split=-1) | 
| Returns the image indexes of the requested split.  More... | |
| void | SetSplit (const ecvl::any &split) | 
| Set the current split.  More... | |
| void | Dump (const filesystem::path &file_path) | 
| Dump the Dataset into a YAML file following the DeepHealth Dataset Format.  More... | |
| std::vector< std::vector< filesystem::path > > | GetLocations () const | 
| Retrieve the list of all samples locations in the dataset file.  More... | |
| Static Public Member Functions | |
| static void | SetSplitSeed (unsigned seed) | 
| Set a fixed seed for the random generated values. Useful to reproduce experiments with same shuffling during training.  More... | |
| Public Attributes | |
| int | n_channels_ | 
| Number of channels of the images.  More... | |
| int | n_channels_gt_ = -1 | 
| Number of channels of the ground truth images.  More... | |
| std::vector< int > | resize_dims_ | 
| Dimensions (HxW) to which Dataset images must be resized.  More... | |
|  Public Attributes inherited from ecvl::Dataset | |
| std::string | name_ = "DeepHealth dataset" | 
| Name of the Dataset.  More... | |
| std::string | description_ = "This is the DeepHealth example dataset!" | 
| Description of the Dataset.  More... | |
| std::vector< std::string > | classes_ | 
| Vector with all the classes available in the Dataset.  More... | |
| std::vector< std::string > | features_ | 
| Vector with all the features available in the Dataset.  More... | |
| std::vector< Sample > | samples_ | 
| Vector containing all the Dataset samples. See Sample.  More... | |
| std::vector< Split > | split_ | 
| Splits of the Dataset. See Split.  More... | |
| int | current_split_ = -1 | 
| Current split from which images are loaded.  More... | |
| Task | task_ | 
| Task of the dataset.  More... | |
| Protected Member Functions | |
| void | InitTC (int split_index) | 
| Set which are the indices of the samples managed by each thread.  More... | |
| void | SetTensorsShape () | 
| Set internal EDDL tensors shape.  More... | |
|  Protected Member Functions inherited from ecvl::Dataset | |
| std::vector< ecvl::Split >::iterator | GetSplitIt (ecvl::any split) | 
| const int | GetSplitIndex (ecvl::any split) | 
| Protected Attributes | |
| int | batch_size_ | 
| Size of each dataset mini batch.  More... | |
| std::vector< int > | current_batch_ | 
| Number of batches already loaded for each split.  More... | |
| ColorType | ctype_ | 
| ecvl::ColorType of the Dataset images.  More... | |
| ColorType | ctype_gt_ | 
| ecvl::ColorType of the Dataset ground truth images.  More... | |
| DatasetAugmentations | augs_ | 
| ecvl::DatasetAugmentations to be applied to the Dataset images (and ground truth, if it exists) for each split.  More... | |
| unsigned | num_workers_ | 
| Number of parallel workers.  More... | |
| ProducersConsumerQueue | queue_ | 
| Producers-consumer queue of the dataset.  More... | |
| std::pair< std::vector< int >, std::vector< int > > | tensors_shape_ | 
| Shape of sample and label tensors.  More... | |
| std::vector< std::vector< ThreadCounters > > | splits_tc_ | 
| Each dataset split has its own vector of threads, each of which has its counters: <counter,min,max>.  More... | |
| std::vector< std::thread > | producers_ | 
| Vector of threads representing the samples producers.  More... | |
| bool | active_ = false | 
| Whether the threads have already been launched or not.  More... | |
| std::mutex | active_mutex_ | 
| Mutex for active_ variable.  More... | |
| Static Protected Attributes | |
| static std::default_random_engine | re_ | 
| Engine used for random number generation.  More... | |
| Additional Inherited Members | |
|  Static Public Attributes inherited from ecvl::Dataset | |
| static const std::regex | url_regex_ | 
DeepHealth Deep Learning Dataset.
This class extends the DeepHealth Dataset with Deep Learning specific members.
Definition at line 275 of file support_eddl.h.
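| ecvl::DLDataset::DLDataset | ( | const filesystem::path & | filename, | 
| const int | batch_size, | ||
| const DatasetAugmentations & | augs, | ||
| const ColorType | ctype = ColorType::RGB, | ||
| const ColorType | ctype_gt = ColorType::GRAY, | ||
| const unsigned | num_workers = 1, | ||
| const double | queue_ratio_size = 1., | ||
| const std::vector< bool > & | drop_last = {}, | ||
| bool | verify = false | ||
| ) | inline | 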
| [in] | filename | Path to the Dataset file. | 
| [in] | batch_size | Size of each dataset mini batch. | 
| [in] | augs | Array with DatasetAugmentations to be applied to the Dataset images (and ground truth, if it exists) for each split. If no augmentation is required, nullptr has to be passed. | 
| [in] | ctype | ecvl::ColorType of the Dataset images. Default is RGB. | 
| [in] | ctype_gt | ecvl::ColorType of the Dataset ground truth images. Default is GRAY. | 
| [in] | num_workers | Number of parallel threads spawned. | 
| [in] | queue_ratio_size | The producers-consumer queue will have a maximum size equal to \(batch\_size \times queue\_ratio\_size \times num\_workers\). | 
| [in] | drop_last | For each split, whether to drop the last samples that don't fit the batch size or not. The vector dimensions must match the number of splits. | 
| [in] | verify | If true, a list of all the images in the Dataset file which don't exist is printed with an ECVL_WARNING_MSG. | 
Definition at line 332 of file support_eddl.h.
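The queue bound given for queue_ratio_size can be checked with a few lines of arithmetic. The following is a minimal sketch, not ECVL code: `QueueCapacity` is a hypothetical helper, and truncating the product to an integer is an assumption, since the documentation above only states the formula.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ECVL): maximum queue size implied by
// batch_size * queue_ratio_size * num_workers.
// Assumption: the product is truncated toward zero when converted to an integer.
std::size_t QueueCapacity(int batch_size, double queue_ratio_size, unsigned num_workers)
{
    return static_cast<std::size_t>(batch_size * queue_ratio_size * num_workers);
}
```

With the default queue_ratio_size of 1., a batch size of 32 and 4 workers, this gives room for 128 samples, i.e. one batch per worker.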
| std::tuple<std::vector<Sample>, std::shared_ptr<Tensor>, std::shared_ptr<Tensor> > ecvl::DLDataset::GetBatch | ( | ) | 
Pop batch_size samples from the queue and copy them into EDDL tensors.
| const int ecvl::DLDataset::GetNumBatches | ( | const ecvl::any & | split = -1 | ) | 
Get the number of batches of the specified split.
If no split is provided or an illegal value is provided, the number of batches of the current split is returned.
| [in] | split | index, name or ecvl::SplitType representing the split from which to get the number of batches. | 
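How the batch count relates to the drop_last constructor option can be sketched as follows. `NumBatches` is a hypothetical stand-alone function, and the rounding rule (round up when the trailing partial batch is kept, round down when it is dropped) is an assumption derived from the description of drop_last, not taken from the ECVL sources.

```cpp
// Hypothetical helper (not part of ECVL): number of batches in a split that
// holds n_samples samples.
// Assumption: a trailing partial batch counts as a batch unless drop_last is true.
int NumBatches(int n_samples, int batch_size, bool drop_last)
{
    if (n_samples <= 0 || batch_size <= 0) {
        return 0;
    }
    const int full_batches = n_samples / batch_size;
    const bool has_partial = (n_samples % batch_size) != 0;
    return (has_partial && !drop_last) ? full_batches + 1 : full_batches;
}
```

For example, 10 samples with a batch size of 4 yield 2 batches when the last samples are dropped and 3 when they are kept.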
| auto ecvl::DLDataset::GetQueueSize | ( | ) | const | inline | 
Get the current size of the producers-consumer queue of the dataset.
Definition at line 483 of file support_eddl.h.
| void ecvl::DLDataset::InitTC | ( | int | split_index | ) | protected | 
Set which are the indices of the samples managed by each thread.
| [in] | split_index | index of the split to initialize. | 
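One way to assign sample indices to threads, in the spirit of the <counter,min,max> triples kept in splits_tc_, is an even contiguous partition. The sketch below is illustrative only: `Counters` and `PartitionSamples` are hypothetical names, and both the half-open range [min, max) and the rule that the first threads absorb the remainder are assumptions.

```cpp
#include <vector>

// Hypothetical mirror of the <counter,min,max> triple; not the ECVL type.
struct Counters {
    int counter;  // next sample position this thread will load
    int min;      // first position owned by the thread
    int max;      // one past the last position owned by the thread (assumption)
};

// Splits n_samples positions into num_workers contiguous ranges whose sizes
// differ by at most one.
std::vector<Counters> PartitionSamples(int n_samples, int num_workers)
{
    std::vector<Counters> out;
    if (n_samples < 0 || num_workers <= 0) {
        return out;
    }
    const int base = n_samples / num_workers;
    const int extra = n_samples % num_workers;
    int begin = 0;
    for (int t = 0; t < num_workers; ++t) {
        const int len = base + (t < extra ? 1 : 0);
        out.push_back({begin, begin, begin + len});
        begin += len;
    }
    return out;
}
```

With 10 samples and 3 workers, the ranges are [0, 4), [4, 7) and [7, 10).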
| void ecvl::DLDataset::LoadBatch | ( | Tensor *& | images, | 
| Tensor *& | labels | ||
| ) | 
Load a batch into images and labels tensor. 
| [out] | images | tensor which stores the batch of images. | 
| [out] | labels | tensor which stores the batch of labels. | 
| void ecvl::DLDataset::LoadBatch | ( | Tensor *& | images | ) | 
Load a batch into the images tensor. Useful for test sets when you don't have labels. 
| [out] | images | tensor which stores the batch of images. | 
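| void ecvl::DLDataset::ProduceImageLabel | ( | DatasetAugmentations & | augs, | 
| Sample & | elem | ||
| ) | virtual | 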
Load a sample and its label, and push them to the producers-consumer queue.
| [in] | elem | Sample to load and push to the queue. | 
| void ecvl::DLDataset::ResetAllBatches | ( | bool | shuffle = false | ) | 
Reset the batch counter of each split and optionally shuffle samples indices (within each split).
| [in] | shuffle | boolean which indicates whether to shuffle the samples indices or not. | 
| void ecvl::DLDataset::ResetBatch | ( | const ecvl::any & | split = -1, | 
| bool | shuffle = false | ||
| ) | 
Reset the batch counter and optionally shuffle samples indices of the specified split.
If no split is provided or an illegal value is provided, the current split is reset.
| [in] | split | index, name or SplitType of the split to reset. | 
| [in] | shuffle | boolean which indicates whether to shuffle the split samples indices or not. | 
| void ecvl::DLDataset::SetAugmentations | ( | const DatasetAugmentations & | da | ) | 
Set the dataset augmentations.
| [in] | da | DatasetAugmentations to set. | 
| void ecvl::DLDataset::SetBatchSize | ( | int | bs | ) | 
Set a new batch size inside the dataset.
Note that this does not affect the EDDL network batch size, which has to be changed too.
| [in] | bs | Value to set for the batch size. | 
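| void ecvl::DLDataset::SetNumChannels | ( | const int | n_channels, | 
| const int | n_channels_gt = 1 | ||
| ) | inline | 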
Change the number of channels of the Image produced by ECVL and update the internal EDDL tensors shape accordingly. Useful for custom data loading.
| [in] | n_channels | Number of channels of input Image. | 
| [in] | n_channels_gt | Number of channels of ground truth. | 
Definition at line 528 of file support_eddl.h.
| static void ecvl::DLDataset::SetSplitSeed | ( | unsigned | seed | ) | inline static | 
Set a fixed seed for the random generated values. Useful to reproduce experiments with same shuffling during training.
| [in] | seed | Value of the seed for the random engine. | 
Definition at line 439 of file support_eddl.h.
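Why a fixed seed makes shuffling reproducible can be seen with the standard library alone. This is a minimal sketch, not the ECVL implementation: `ShuffledIndices` is a hypothetical function, and it only assumes that a `std::default_random_engine` (the type of re_) drives the shuffle.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of ECVL): returns the indices 0..n-1 shuffled
// by an engine initialized with the given seed.
std::vector<int> ShuffledIndices(int n, unsigned seed)
{
    std::vector<int> indices(n);
    std::iota(indices.begin(), indices.end(), 0);  // 0, 1, ..., n-1
    std::default_random_engine engine(seed);
    std::shuffle(indices.begin(), indices.end(), engine);
    return indices;
}
```

Two calls with the same seed return the same order within one build of the program; the exact permutation is implementation-defined, so it may differ across standard libraries.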
| void ecvl::DLDataset::SetTensorsShape | ( | ) | inline protected | 
Set internal EDDL tensors shape.
Definition at line 300 of file support_eddl.h.
| void ecvl::DLDataset::SetWorkers | ( | const unsigned | num_workers | ) | inline | 
Change the number of workers.
| [in] | num_workers | Number of threads/workers that will be spawned. | 
Definition at line 510 of file support_eddl.h.
| void ecvl::DLDataset::Start | ( | int | split_index = -1 | ) | 
Spawn num_workers threads.
| [in] | split_index | Index of the split to use in the GetBatch function. If not specified, current split is used. | 
| void ecvl::DLDataset::Stop | ( | ) | 
Join all the threads.
| void ecvl::DLDataset::ThreadFunc | ( | int | thread_index | ) | 
Function called when the threads are spawned.
ProduceImageLabel is called for each sample assigned to the thread.
| [in] | thread_index | index of the thread. | 
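The interplay of Start, ThreadFunc, GetBatch and Stop follows the classic producers-consumer pattern. The sketch below illustrates that pattern with standard C++ only; `BoundedQueue` and `ProduceAndConsume` are hypothetical names, integers stand in for loaded samples, and nothing here is taken from the ECVL sources. For brevity each producer takes an interleaved set of indices rather than a contiguous range.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical bounded queue: Push blocks while full, Pop blocks while empty.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t max_size) : max_size_(max_size) {}

    void Push(int value)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < max_size_; });
        items_.push(value);
        not_empty_.notify_one();
    }

    int Pop()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        const int value = items_.front();
        items_.pop();
        not_full_.notify_one();
        return value;
    }

private:
    std::size_t max_size_;
    std::queue<int> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Spawns num_workers producers (the "Start" step), consumes every item on the
// calling thread (the "GetBatch" step), then joins the producers (the "Stop"
// step). Returns the sum of the consumed values.
// Preconditions: num_workers >= 1 and max_queue_size >= 1.
long ProduceAndConsume(int n_samples, int num_workers, std::size_t max_queue_size)
{
    BoundedQueue queue(max_queue_size);
    std::vector<std::thread> producers;
    for (int t = 0; t < num_workers; ++t) {
        producers.emplace_back([&queue, t, n_samples, num_workers] {
            for (int i = t; i < n_samples; i += num_workers) {
                queue.Push(i);
            }
        });
    }
    long sum = 0;
    for (int i = 0; i < n_samples; ++i) {
        sum += queue.Pop();
    }
    for (auto& producer : producers) {
        producer.join();
    }
    return sum;
}
```

The bounded capacity is what keeps fast producers from loading the whole split into memory ahead of the consumer.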
| void ecvl::DLDataset::ToTensorPlane | ( | const std::vector< int > & | label, | 
| Tensor *& | tensor | ||
| ) | 
Convert the sample labels into a one-hot encoded tensor and copy it to the batch tensor.
| [in] | label | vector of the sample labels | 
| [out] | tensor | EDDL Tensor in which to copy the labels (dimensions: [batch_size, num_classes]) | 
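The one-hot conversion can be illustrated on a plain row-major buffer of shape [batch_size, num_classes]. This is a sketch under stated assumptions, not the ECVL implementation: `OneHotRow` is a hypothetical function, a `std::vector<float>` stands in for the EDDL Tensor, and the explicit `row` argument replaces whatever internal counter the class uses to locate the sample inside the batch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ECVL): writes the one-hot encoding of
// `label` into row `row` of a row-major [batch_size, num_classes] buffer.
// Every class index listed in `label` is set to 1, all others to 0.
// Out-of-range class indices are ignored (assumption).
void OneHotRow(const std::vector<int>& label, int num_classes, int row,
               std::vector<float>& batch)
{
    const std::size_t offset = static_cast<std::size_t>(row) * num_classes;
    for (int c = 0; c < num_classes; ++c) {
        batch[offset + c] = 0.f;
    }
    for (int class_index : label) {
        if (class_index >= 0 && class_index < num_classes) {
            batch[offset + class_index] = 1.f;
        }
    }
}
```

With 3 classes, a sample labeled {2} becomes the row 0 0 1, and a multi-label sample {0, 1} becomes 1 1 0.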
| bool ecvl::DLDataset::active_ = false | protected | 
Whether the threads have already been launched or not.
Definition at line 289 of file support_eddl.h.
| std::mutex ecvl::DLDataset::active_mutex_ | protected | 
Mutex for active_ variable.
Definition at line 290 of file support_eddl.h.
| DatasetAugmentations ecvl::DLDataset::augs_ | protected | 
ecvl::DatasetAugmentations to be applied to the Dataset images (and ground truth, if it exists) for each split.
Definition at line 283 of file support_eddl.h.
| int ecvl::DLDataset::batch_size_ | protected | 
Size of each dataset mini batch.
Definition at line 279 of file support_eddl.h.
| ColorType ecvl::DLDataset::ctype_ | protected | 
ecvl::ColorType of the Dataset images.
Definition at line 281 of file support_eddl.h.
| ColorType ecvl::DLDataset::ctype_gt_ | protected | 
ecvl::ColorType of the Dataset ground truth images.
Definition at line 282 of file support_eddl.h.
| std::vector<int> ecvl::DLDataset::current_batch_ | protected | 
Number of batches already loaded for each split.
Definition at line 280 of file support_eddl.h.
| int ecvl::DLDataset::n_channels_ | 
Number of channels of the images.
Definition at line 317 of file support_eddl.h.
| int ecvl::DLDataset::n_channels_gt_ = -1 | 
Number of channels of the ground truth images.
Definition at line 318 of file support_eddl.h.
| unsigned ecvl::DLDataset::num_workers_ | protected | 
Number of parallel workers.
Definition at line 284 of file support_eddl.h.
| std::vector<std::thread> ecvl::DLDataset::producers_ | protected | 
Vector of threads representing the samples producers.
Definition at line 288 of file support_eddl.h.
| ProducersConsumerQueue ecvl::DLDataset::queue_ | protected | 
Producers-consumer queue of the dataset.
Definition at line 285 of file support_eddl.h.
| std::default_random_engine ecvl::DLDataset::re_ | static protected | 
Engine used for random number generation.
Definition at line 291 of file support_eddl.h.
| std::vector<int> ecvl::DLDataset::resize_dims_ | 
Dimensions (HxW) to which Dataset images must be resized.
Definition at line 319 of file support_eddl.h.
| std::vector<std::vector<ThreadCounters> > ecvl::DLDataset::splits_tc_ | protected | 
Each dataset split has its own vector of threads, each of which has its counters: <counter,min,max>.
Definition at line 287 of file support_eddl.h.
| std::pair<std::vector<int>, std::vector<int> > ecvl::DLDataset::tensors_shape_ | protected | 
Shape of sample and label tensors.
Definition at line 286 of file support_eddl.h.
 1.8.15