... | ... | @@ -10,118 +10,95 @@ |
|
|
4.[Network creation](manual-network)
|
|
|
|
|
|
5.[Network training](manual-training)
|
|
|
- [Input Manager ](manual-training#input-manager)
|
|
|
- [Sessions](manual-training#sessions)
|
|
|
- [Snapshots](manual-training#snapshots)
|
|
|
- [Input Manager ](manual-training#input-manager)
|
|
|
- [Snapshots](manual-training#snapshots)
|
|
|
- [Console](manual-training#console)
|
|
|
- [Plotter](manual-training#plotter)
|
|
|
- [CSV-export](manual-training#csv-export)
|
|
|
- [CSV-export](manual-training#csv-export)
|
|
|
- [Weight visualization](manual-training#weight-visualization)
|
|
|
- [Deployment](manual-training#deployment)
|
|
|
|
|
|
6.[Miscellaneous](manual-miscellaneous)
|
|
|
|
|
|
## Input Manager
|
|
|
|
|
|
Caffe allows the user to provide the test- and training-data in three different formats.
|
|
|
These formats consist of LMDB, LEVELDB and HDF5.
|
|
|
|
|
|
To help the user to manage different formats and paths **Barista** provides an input manager. With the help of the manager the user can see the properties of the database and assign it to a layer of the corresponding type.
|
|
|
## Sessions
|
|
|
|
|
|
The input manager can be found via the **"Input Manager"** entry in the **"Edit"** menu. New databases can be added through the file dialog behind the **"Add new Database"** button. The file dialog filter looks for files typical for LMDB or LEVELDB, but also for *.hdf5 / *.h5 files and *.txt files containing paths to HDF5 files. The filter can be changed from "all" to a specific database format.
|
|
|
A Barista session is a collection of a network topology (as shown in the [Node Editor](manual-network#navigation-in-the-node-editor)), all parameters set for the layers of the network (as shown in the [Layer Properties](manual-network#editing-layer-and-solver-parameters) dock), an optimization/learning method and its hyperparameters (as shown in the Solver Properties dock), and the data used for training and for testing the performance (as defined in the [Input Manager](manual-training#input-manager)).
|
|
|
|
|
|
Since Caffe expects txt files with paths to hdf5 instead of raw hdf5 files the input manager automatically creates a txt file which contains the path of the selected hdf5 file.
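The list file described here can be sketched in a few lines of Python. This is only an illustration, not Barista's actual code: the function name `write_hdf5_list` and the choice to place the txt file next to the HDF5 file are assumptions.

```python
from pathlib import Path


def write_hdf5_list(hdf5_path, txt_path=None):
    """Write a txt file containing the path of one HDF5 file.

    Hypothetical helper: by default the txt file is placed next to the
    HDF5 file and shares its base name (an assumption for this sketch).
    """
    hdf5_path = Path(hdf5_path)
    if txt_path is None:
        txt_path = hdf5_path.with_suffix(".txt")
    txt_path = Path(txt_path)
    # Caffe's HDF5 data layer reads one file path per line.
    txt_path.write_text(str(hdf5_path) + "\n")
    return txt_path
```

The resulting txt file, rather than the HDF5 file itself, is what the HDF5 data layer would use as its source.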
|
|
|
The Session List gives an overview of all currently defined Sessions:
|
|
|
|
|
|
To ensure that no database is imported twice by accident, the input manager assigns each database an ID by hashing the relevant files. For an LMDB, the data.mdb file is hashed. For a LEVELDB, all *.sst and *.ldb files in the same directory as the CURRENT file are hashed. For an HDF5 database, all *.hd5 and *.hdf5 files whose paths are specified in the selected *.txt file are hashed.
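A minimal sketch of such a content-based ID for the LMDB and LEVELDB cases is given below. The hash function (SHA-256), the sorted file order and all function names are assumptions made for illustration; the manual does not state which algorithm Barista uses.

```python
import hashlib
from pathlib import Path


def hash_files(paths, chunk_size=1 << 20):
    """Return one hex digest over the contents of all files, in sorted path order."""
    digest = hashlib.sha256()  # assumed algorithm, not confirmed by the manual
    for path in sorted(Path(p) for p in paths):
        with open(path, "rb") as handle:
            # Read in chunks so large databases do not have to fit in memory.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    return digest.hexdigest()


def lmdb_id(db_dir):
    """ID of an LMDB: only the data.mdb file is hashed."""
    return hash_files([Path(db_dir) / "data.mdb"])


def leveldb_id(db_dir):
    """ID of a LEVELDB: all *.sst and *.ldb files next to the CURRENT file."""
    db_dir = Path(db_dir)
    return hash_files(list(db_dir.glob("*.sst")) + list(db_dir.glob("*.ldb")))
```

Because the ID depends only on file contents, two copies of the same database in different directories receive the same ID.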
|
|
|
![SessionList](SessionList.png)
|
|
|
|
|
|
All paths are made relative to the *project path*. On training and testing these are changed dynamically relative to the *session path*. This includes paths inside a HDF5TXT file.
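The path handling can be illustrated with the standard library. The sketch below is an assumption about how such a conversion might look, not Barista's implementation; the function names and the example directory layout are hypothetical.

```python
import os.path


def to_project_relative(database_path, project_dir):
    """Store a database path relative to the project directory."""
    return os.path.relpath(database_path, project_dir)


def rebase_to_session(relative_path, project_dir, session_dir):
    """Re-express a project-relative path relative to a session directory."""
    absolute = os.path.join(project_dir, relative_path)
    return os.path.relpath(absolute, session_dir)
```

With a hypothetical project at `/proj` and a session folder at `/proj/sessions/s1`, the database `/proj/data/train_lmdb` is stored as `data/train_lmdb` and rewritten to `../../data/train_lmdb` for that session.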
|
|
|
Every session is represented by one entry in the list. Every item provides basic information about the session status as well as controls. The controls and indicators for a single session item are from top to bottom and left to right:
|
|
|
|
|
|
![InputManagerScreenshot_from_2017-11-29_16-33-31](/uploads/686d91742553cbd0d5808aa64ccc0663/InputManagerScreenshot_from_2017-11-29_16-33-31.png)
|
|
|
* **Remote Host** information: For remote sessions, the host name and connected port are displayed
|
|
|
* **Session ID**: A running ID of the session, this helps identifying which session a certain log-line in the console belongs to.
|
|
|
* **State Label**: A coloured marker indicating the session state. A session can be in one of the following states:
|
|
|
|
|
|
The **Input Manager** provides some useful meta information for all loaded databases. This includes the format and whether the database path is valid and the data can be read. Additional information, such as the number and dimensions of the elements inside the database, is shown as well.
|
|
|
* **WAITING**: The session was just created and is ready to run
|
|
|
* **RUNNING**: The training process is running
|
|
|
* **PAUSED**: The session was paused and can be resumed
|
|
|
* **FINISHED**: The training process was finished and the maximum iteration was reached
|
|
|
* **FAILED**: The session failed with some error, look in the error console for further information
|
|
|
* **INVALID**: Barista's internal checks found some faulty properties. More details are provided in an additional label that replaces the Progress Bar. Even more details are shown when hovering over that label.
|
|
|
* **Not Connected**: A remote session lost its connection to a host. Make sure the remote machine is still running and its network connection is still active.
|
|
|
|
|
|
Every database can be renamed inside the **Input Manager** by clicking the **pencil** symbol. This changes only the displayed name, not the database itself. The displayed information can be refreshed with the **Reload** button.
|
|
|
* **Snapshot Button**: When a session is running, the snapshot button can be used to create an unplanned (i.e. not defined in the solver properties) snapshot. Note: Your caffe version has to support the SIGHUP signal on Linux and Mac OS or SIGBREAK on Windows for this to work.
|
|
|
* **Delete Button**: This will delete a session and all its associated files and folders.
|
|
|
* **Context Menu**: More - less often used - functions like cloning and resetting sessions.
|
|
|
* **Play/Pause Button**: Start or Pause the training of a session.
|
|
|
* **Progress Bar**: Displays the iterations that have been trained so far and the maximum iterations as defined in the solver settings.
|
|
|
|
|
|
Databases can be assigned to layers with the **RIGHT** arrow assign button. The **Input Manager** automatically searches the net for **input layers** of the corresponding format and prompts the user with a list from which to select the desired layer.
|
|
|
Remote sessions are not connected automatically, hence on loading a project, they have to be imported from the appropriate remote host, using the host manager.
|
|
|
|
|
|
![LayerSessionScreenshot_from_2017-11-29_17-03-36](/uploads/32ee384a221b912875a5eccb192671ba/LayerSessionScreenshot_from_2017-11-29_17-03-36.png)
|
|
|
Once training of a session has been started, the session can no longer be edited. This ensures that the training results are always in line with the displayed network and settings. If you want to change settings for a network for which training was already started, you can either create a new session, which will have the same settings, and alter them, or you can clone pre-learned weights from an existing session to a new session using the context menu of the old session. If you know that you do not want to use the trained network state of a session, you can also select the reset option to throw away all training results and treat the session as new (please note that you will lose all training results for the selected session).
|
|
|
|
|
|
The **Input Manager** provides two additional functions for HDF5TXT databases. Clicking the **Open** button opens the txt file in the local text editor so that changes can be made by hand. Existing HDF5TXT databases can be extended by adding new HDF5 files. All added file paths are automatically converted to relative paths.
|
|
|
All files needed to train a network (except databases) and all files created during training are stored in the session folder. This is a subfolder of the project directory or, for remote sessions, a subfolder of the session directory provided when starting the host. Hence, a session folder can easily be transferred to another machine, where training can be resumed even if Barista is not installed.
|
|
|
|
|
|
The list of databases can be filtered by type. Selected databases can be deleted by clicking the **"Delete selected Database"** button. Databases can be sorted by using the **UP** and **DOWN** arrow buttons.
|
|
|
## Input Manager
|
|
|
|
|
|
## Sessions
|
|
|
After creating a network in the network designer and defining data sources with training data, you can start training the created model.
|
|
|
A model can be trained directly from within Barista. All you have to do is press '**Start Training**' in the Sessions dock.
|
|
|
In order to save storage space, databases containing training and test data are not stored within a project or session folder. However, Barista offers an Input Manager that takes care of managing all your data.
|
|
|
The Input Manager can be accessed via the menu bar (**Edit -> Input Manager**) or by pressing **Ctrl + I**.
|
|
|
|
|
|
![start_training](/uploads/109b6c39e29cf8bd15bb3cd51d52df1e/start_training.png)
|
|
|
![InputManager](InputManager.png)
|
|
|
|
|
|
You will be asked to save the project and a new session is started.
|
|
|
A **session** is a representation of a Caffe training process. One OS process is created for every session, which ensures that a crashed training process does not affect the application.
|
|
|
The training process uses the caffe tool from your Caffe installation. The tool is called with the following parameters:
|
|
|
```
|
|
|
/path/to/caffe/caffe train -solver /path/to/solver_file.prototxt
|
|
|
```
|
|
|
If you continue from a snapshot, the snapshot parameter is added to the parameter list:
|
|
|
```
|
|
|
/path/to/caffe/caffe train -solver /path/to/solver_file.prototxt -snapshot /path/to/snapshot.solverstate
|
|
|
```
|
|
|
For further information about the parameters, consult `caffe --help`.
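The two invocations above differ only in the optional snapshot argument. A minimal sketch of how such a command line could be assembled is shown below; the function name `build_train_command` is hypothetical and the code is not taken from Barista.

```python
def build_train_command(caffe_binary, solver_file, snapshot=None):
    """Assemble the argument list for a `caffe train` call.

    Only the flags shown in this manual (-solver, -snapshot) are used.
    """
    command = [caffe_binary, "train", "-solver", solver_file]
    if snapshot is not None:
        # Continue training from an existing solverstate file.
        command += ["-snapshot", snapshot]
    return command
```

The returned list can be passed to `subprocess.Popen`, which starts the tool as a separate OS process, in line with the one-process-per-session design described above.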
|
|
|
Barista supports the three main database formats used in caffe: LMDB, LEVELDB and HDF5.
|
|
|
|
|
|
Sessions can be in the following states:
|
|
|
- **WAITING**: the session was just created and is ready to run
|
|
|
- **RUNNING**: the training process is actively running
|
|
|
- **PAUSED**: the session was paused and can be resumed
|
|
|
- **FINISHED**: the training process was finished and the maximum iteration was reached
|
|
|
- **FAILED**: the session failed with some error, look in the error console for further information
|
|
|
Via **Add new Database**, new data sources can be imported via file dialog.
|
|
|
The **search and add new Databases** function recursively searches a directory for known database formats and provides a list of available data sources.
|
|
|
|
|
|
The session will be contained in a separate directory within the project directory. It will contain log files, snapshots and the net and solver prototxt files for Caffe.
|
|
|
For every database the Input Manager provides an overview of the contained data sets.
|
|
|
|
|
|
![session_directory](/uploads/1ff35e6eec697a18df986d5cd90b5012/session_directory.png)
|
|
|
For every database a unique ID is calculated by hashing its content. Hence loading the same database multiple times is prevented. Furthermore the same database located on multiple hosts can be identified.
|
|
|
|
|
|
After the session was started it will appear in the list of sessions. The list will contain all sessions which were created within the project so far and they will be sorted by their id. Make sure to filter for ALL sessions to see the new session.
|
|
|
Every entry in the Input Manager offers a number of manipulation options. From left to right, the buttons are:
|
|
|
|
|
|
![session_entry](/uploads/415553ad9d65c55b17912c8c5175292b/session_entry.png)
|
|
|
- Edit: Change the name of the database. The name only helps the user to tell databases apart and has no influence on the training.
|
|
|
- Delete: Delete the database entry. This only deletes the reference in the Input Manager, not your actual data.
|
|
|
- Reload: Reload a database and its content. This can be used if a database became unavailable/dead because a host connection was lost or a network mount failed.
|
|
|
- Move Up/Down: Change the order of the items in the list.
|
|
|
- Assign to layer: Assign the database as source to one of the available data layers in the network.
|
|
|
|
|
|
The list entry of the session contains the id, controls for starting and pausing the session, a label with the state of the session, the number of training iterations and a list of created snapshots.
|
|
|
**HDF5** database files cannot be set directly as the source of a Caffe HDF5 data layer. Barista supports creating and editing the necessary txt files, which contain links to the actual HDF5 data files. As with the databases themselves, HDF5 files can be added to these files one by one, as a group, or by recursively searching a directory.
|
|
|
|
|
|
### Snapshots
|
|
|
## Snapshots
|
|
|
|
|
|
A snapshot will be created when you press the snapshot button (camera) or when the session is paused.
|
|
|
When a session is continued the last created snapshot will be used to initialise the network weights.
|
|
|
|
|
|
It is possible to start a new session with the current network design (including all hyperparameters) from a snapshot of another session. This allows you to continue training after adjusting parameters.
|
|
|
|
|
|
There are two possible ways to do this:
|
|
|
1. Click the small arrow right next to '**Start Training**' and choose either '**Start from solverstate**' or '**Start from caffemodel**'.
|
|
|
When a session is continued, training will start from the last saved snapshot.
|
|
|
|
|
|
![start_from](/uploads/aecc869a1248ece12c1b17b7ddeac3bb/start_from.png)
|
|
|
It is possible to copy learned weights from one session to a new one to start training a network with pre-learned weights.
|
|
|
|
|
|
Then select a snapshot file. A new session will be created which uses the snapshot for weight initialisation.
|
|
|
|
|
|
2. Click the small arrow on the snapshots button of a session entry. All snapshots which were created within the session will be displayed. Choose one snapshot and a new session will be started.
|
|
|
|
|
|
![start_from_session](/uploads/2744e87210ef8cba4ea2660a609a10f9/start_from_session.png)
|
|
|
|
|
|
To **delete** one or more sessions, just select the sessions in the list and press '**Delete**'.
|
|
|
This will actually **delete the directories** of the sessions with all created snapshots and log files.
|
|
|
To do so, click on the context-menu icon (three dots) for the session to be cloned in the session list and select **clone session**. You can then choose which snapshot you want to clone and a new session will be created in that state.
|
|
|
|
|
|
## Console
|
|
|
The console is the main output of Barista. Different callers register with the console and can write custom messages and errors. The messages keep the user informed about the current state and actions. For example, sessions write the Caffe output here, the Input Manager reports errors if the selected data is faulty, and other Barista modules report their errors here as well.
|
|
|
Initially, the messages are not filtered. They can be filtered by their type ("Text" or "Error") and by caller. The filter is set with the two drop-down boxes.
|
|
|
If no filter is set, every line has a prefix that identifies the caller. It has the format: ``[HH:MM:ss, Caller] Sometext``
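The prefix format can be produced and parsed with a few lines of Python. This is a sketch under the assumption that the caller name contains no closing bracket; the function names and the caller name used below are hypothetical.

```python
import re
from datetime import datetime

# Matches "[HH:MM:ss, Caller] Sometext" and captures the three parts.
LINE_PATTERN = re.compile(r"^\[(\d{2}:\d{2}:\d{2}), ([^\]]+)\] (.*)$")


def format_line(caller, text, now=None):
    """Prefix a message with the time and the name of the caller."""
    now = now or datetime.now()
    return "[{}, {}] {}".format(now.strftime("%H:%M:%S"), caller, text)


def parse_line(line):
    """Return (time, caller, text), or None if the line has no such prefix."""
    match = LINE_PATTERN.match(line)
    return match.groups() if match else None
```

Parsing the prefix in this way makes it possible to filter an exported console log by caller outside of Barista, for example to extract all lines written by a single session.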
|
|
|
|
|
|
![console](/uploads/244e74a4c57428cbb88dcf3449e065e1/console.png)
|
|
|
|
|
|
## Plotter
|
|
|
The plotter is the main tool to visualize the current network performance during training. The Caffe training log, as seen in the console, is parsed and the information is passed to the plotter. Every output (e.g. loss, accuracy, learning rate) can be selected to be plotted independently. Outputs named `loss` will be added to the plotter automatically when a training session is started. As soon as a plottable key is written to the log of the Caffe training process, it will appear in the settings panel of the plotter.
|
|
|
|
|
|
The plotter is the main tool to visualize the feedback given by Caffe. The Caffe training log, as seen in the console, is parsed and the information is passed to the plotter. This happens dynamically: every output (e.g. loss, accuracy, learning rate) can be plotted independently. As soon as a plottable key is written to the log of the Caffe training process, it will appear in the settings panel of the plotter. The settings panel appears when clicking the button on the right-hand side of the plotter.
|
|
|
|
|
|
![plotter](/uploads/b38bb9221cf6f4915868af6aec40a8b1/plotter.png)
|
|
|
|
|
|
In the settings panel there is a list of all training **sessions** or loaded **log files**. Every session/log has a corresponding set of plottable keys, represented as check boxes in the area on the right. Each key is associated with the *train* or *test* phase. If a check box is checked, a plot line with the name 'logname.phase.key' is added to the plotter.
|
|
|
|
|
|
![plotter_settings](/uploads/faf0726ebe2aaaf4c32e3204bf53212a/plotter_settings.png)
|
|
|
In the settings panel there is a list of all training sessions or loaded log files. Every session/log has a corresponding set of plottable keys, represented as check boxes in the area on the right. Each key is associated with the train or test phase. If a check box is checked, a plot line with the name 'logname.phase.key' is added to the plotter.
|
|
|
|
|
|
The list of sessions and log files can be extended by starting a new session or by opening an existing log file via the 'open file' button. Because the logs, phases and keys can be chosen separately, training runs can easily be compared and analysed.
|
|
|
|
... | ... | @@ -129,8 +106,7 @@ For instance one can compare the loss rate and the learning rate of a training i |
|
|
|
|
|
Furthermore, the user can choose how to plot the data (linear or logarithmic) and whether to plot it against the time or the number of iterations. Both options can be set in the settings panel.
|
|
|
|
|
|
### CSV-export
|
|
|
|
|
|
## CSV-export
|
|
|
It is possible to export the plotted data in CSV format. Only the selected data (which is actually plotted) will be written to the CSV file. The file then contains one table for each session/log file and phase, each introduced by a comment for identification. To export the plots, just click on 'Export as CSV'. The exported file could look like this:
|
|
|
|
|
|
```
|
... | ... | @@ -163,33 +139,29 @@ By holding and dragging the left mouse button, the visualization can be panned i |
|
|
Using the mouse wheel alters the scale of the picture.
|
|
|
A visualization can be saved by clicking the **save** button and entering a desired name and format-suffix.
|
|
|
|
|
|
|
|
|
## Deployment
|
|
|
Barista also supports deploying your network automatically. This is handy if you are not only interested in the training results themselves, but also want to use the final net, e.g. inside an external application. As the latter usually involves inference only, some parts of the created graph are no longer needed or have to be altered.
|
|
|
|
|
|
To start the deployment process in Barista, use the corresponding entry in the top menu bar as shown below.
|
|
|
|
|
|
![barista-deployment-bar-cut](/uploads/62d016f4e3c1365971a66d78c57e0dca/barista-deployment-bar-cut.png)
|
|
|
To start the deployment process in Barista, use the corresponding entry in the top menu bar.
|
|
|
|
|
|
Now you need to specify two things:
|
|
|
- a path pointing to an existing folder on your machine
|
|
|
- and a desired snapshot that has been created inside of your Barista project (or was imported by copying the file into one of the session folders)
|
|
|
|
|
|
![deployment-dialog](/uploads/cc675ed4ebe6659c3e8f382ecbb9087c/deployment-dialog.png)
|
|
|
|
|
|
The chosen snapshot determines which version of your network you want to deploy. The given path will be used as the destination to store all generated files. These files include a copy of the snapshot's caffemodel file, which consists of all trained weights, as well as a prototxt file containing a modified definition of your network's architecture.
|
|
|
The chosen snapshot determines which state (or weight set) of your network you want to deploy. The given path will be used as the destination to store all generated files. These files include a copy of the snapshot's caffemodel file, which consists of all trained weights, as well as a prototxt file containing a modified definition of your network's architecture.
|
|
|
Before exporting those files, the following steps are performed automatically:
|
|
|
- All Data Layers are removed
|
|
|
- All layers requiring labeled data are removed, too (especially all Loss Layers as well as accuracy calculations etc.)
|
|
|
- Instead, new Input Layers with static input dimension will be added
|
|
|
- Input dimension will be determined automatically as well (as long as the former Data Layers used to have a valid data source)
|
|
|
- Blob connections are set automatically
|
|
|
- Usually, there will be only one Input Layer. Note, however, that the number of newly added input layers might be lower than the number of previously removed data layers. On the one hand, only one new input layer is added for each unique data blob name, while multiple data layers might have used the same data blob name. On the other hand, input layers are only added if at least one other layer is using the provided data.
|
|
|
- Append a new Softmax layer (only if none exists yet and a SoftmaxWithLoss layer used to be included)
|
|
|
- Usually, there will be only one Input Layer. Note, however, that the number of newly added input layers might be lower than the number of previously removed data layers. On the one hand, a new input layer is added only once for every data blob name, while multiple data layers might have used the same data blob name (in the Training and Test phases). On the other hand, input layers are only added if at least one other layer is using the provided data.
|
|
|
- Append a new Softmax layer (only if none exists yet and a SoftmaxWithLoss layer was used before).
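The rule for adding input layers can be sketched as follows. This is a simplified illustration rather than Barista's implementation: layers are represented as plain dictionaries with hypothetical `type`, `top` and `bottom` keys, only the "Data" and "HDF5Data" types are treated as data layers, and the label blob is assumed to be called "label".

```python
DATA_LAYER_TYPES = {"Data", "HDF5Data"}


def input_blobs_for_deployment(layers):
    """Return the blob names that need a new input layer, without duplicates."""
    # Collect every blob that is consumed by a non-data layer.
    consumed = set()
    for layer in layers:
        if layer["type"] not in DATA_LAYER_TYPES:
            consumed.update(layer.get("bottom", []))

    inputs = []
    for layer in layers:
        if layer["type"] not in DATA_LAYER_TYPES:
            continue
        for top in layer.get("top", []):
            # Labels are not needed for inference. A blob shared by the
            # train and test data layers yields a single input layer.
            if top != "label" and top in consumed and top not in inputs:
                inputs.append(top)
    return inputs
```

For a net with separate train and test data layers that both produce a `data` blob, the function returns `["data"]`, so a single input layer would replace both data layers.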
|
|
|
|
|
|
The above described rules are based on the information provided in the following caffe wiki page: https://github.com/BVLC/caffe/wiki/Using-a-Trained-Network:-Deploy
|
|
|
|
|
|
Finally, some restrictions apply to the deployment process:
|
|
|
- Each data layer must not have more than two top blobs (a general restriction of caffe). Raises a warning.
|
|
|
- At least the label blob name must follow the naming conventions (so it's always called "label"). Raises a warning.
|
|
|
- The shape of a Data layer can only be determined automatically, if the layer type is either "Data" (LMDB or LEVELDB) or "HDF5Data". Otherwise, a warning will inform the user about necessary manual changes.
|
|
|
|
... | ... | |