Dynamic vs Static Computational Graphs [PyTorch or TensorFlow]
Updated: Feb 22
A piece on the difference between dynamic and static computational graphs
The main difference between frameworks that use static computational graphs, like TensorFlow and CNTK, and frameworks that use dynamic computational graphs, like PyTorch and DyNet, is that the latter work as follows:
A different computation graph is constructed from scratch for each training sample, followed by forward and backward propagation. In other words, the user is free to use a different network for each input sample. This costs a little overhead, but frameworks like DyNet keep it small with an optimized C++ backend and a lightweight graph representation.
Experiments show that DyNet’s speeds are faster than or comparable with those of static declaration toolkits. In static graph frameworks, by contrast, the graph is defined once and passed to an optimizing graph compiler; training samples are then fed to the resulting optimized graph. Once compiled, large graphs can run efficiently on either a CPU or a GPU, which makes this approach ideal for large graphs with a fixed structure, where only the inputs change between instances. However, the compilation step itself can be costly, and it makes the interface more cumbersome to work with.
Now let's go into detail about the differences between the two paradigms.
Static declaration follows two steps:
Definition of a computational architecture
In this step the user defines the shape of the graph they wish to use. For example: take a 16*16 image, pass it through 10 convolution layers, calculate the loss with a chosen function and predict the class of the image.
The second step is execution: data is fed through the graph that was defined. Once defined, the graph can be reused many times at no extra construction cost, because nothing new is created for each input. This is very useful for large data-sets and can improve training and testing speed. Secondly, the static computational graph can be used to schedule computations across a pool of computational devices, so the computational cost can be shared.
Static declaration also runs into difficulties:
Variably sized inputs: Different input sizes can be a problem. For example, if your inputs are not restricted to 16*16, it becomes more difficult to define a single structure of identical computations.
Variably structured inputs/outputs: A more complicated case is when each input has not only a different size but also a different structure. For example, your data may contain images, text and structured tables.
However, these difficulties can be avoided if we declare a graph with an unspecified input size at declaration time and let the graph cope with the inputs; TensorFlow, for example, offers the dynamic_rnn operation for this. While it is possible in principle to deal with variable architectures with static declaration, it still poses some difficulties in practice:
Complexity of the computation graph implementation:
To support dynamic execution, the computation graph must be able to handle more complex data types (e.g., variable sized tensors and structured data), and operations like flow control primitives must be available as operations. This increases the complexity of computation graph formalism and implementation, and reduces opportunities for optimization.
Difficulty in debugging:
While static analysis permits some errors to be identified during declaration, many logic errors will necessarily wait to be uncovered until execution (especially when many variables are left unspecified at declaration time), which is necessarily far removed from the declaration code that gave rise to them. This separation of the location of the root cause and location of the observed crash makes for difficult debugging.
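The debugging difficulty can be made concrete with a small deferred-execution sketch. It assumes a toy graph in which a placeholder may leave its length unspecified (None). The classes Placeholder and Dot are hypothetical and do not correspond to any real framework API. The point is only that the faulty declaration succeeds and the error appears later, inside run.

```python
# Toy deferred-execution graph; every name here is hypothetical.

class Placeholder:
    def __init__(self, name, shape):
        self.name = name
        self.shape = shape      # None means "any length", decided at run time

    def run(self, feed):
        value = feed[self.name]
        declared = self.shape[0]
        if declared is not None and declared != len(value):
            raise ValueError(
                f"{self.name}: expected length {declared}, got {len(value)}"
            )
        return value


class Dot:
    def __init__(self, a, b):
        self.a, self.b = a, b   # only stores the parents, computes nothing

    def run(self, feed):
        u, v = self.a.run(feed), self.b.run(feed)
        if len(u) != len(v):
            raise ValueError(f"dot: length mismatch {len(u)} vs {len(v)}")
        return sum(p * q for p, q in zip(u, v))


# Declaration time: x has an unspecified length, so nothing can be checked yet.
x = Placeholder("x", shape=(None,))
w = Placeholder("w", shape=(3,))
score = Dot(x, w)               # builds the graph and raises nothing

# Execution time, typically in a different part of the program.
ok = score.run({"x": [1.0, 2.0, 3.0], "w": [0.5, 0.5, 0.5]})

try:
    score.run({"x": [1.0, 2.0], "w": [0.5, 0.5, 0.5]})
    run_error = None
except ValueError as err:
    # Raised inside Dot.run, far from the line where Dot(x, w) was declared.
    run_error = str(err)

print(ok, run_error)
```

In a real program the declaration and the failing run call may be in different files, which is the separation between root cause and observed crash described above.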
Dynamic declaration, by contrast, uses a single step. Recall our example of a 16*16 image that is loaded and passed through, say, 2 convolution layers, after which we either compute the loss (during training) or compute the predictive probabilities (during testing). The graph is created anew for each training instance, so it should be lightweight.
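To illustrate how a graph can be rebuilt for every instance, here is a minimal define-by-run sketch in plain Python. It is a scalar toy under stated assumptions, not the PyTorch or DyNet implementation: the Value class and the forward function are hypothetical, and a simple running sum stands in for the convolution layers. Each sample is a sequence of a different length, so each forward pass records a graph of a different size.

```python
# Toy define-by-run autodiff on scalars; all names are hypothetical.

class Value:
    """Scalar that records the operations applied to it as they execute."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None     # pushes this node's grad to its parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        """Backpropagate from this node; returns the number of graph nodes."""
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()
        return len(order)


def forward(w, sequence):
    h = Value(0.0)
    for x in sequence:              # ordinary Python loop: the graph grows per step
        h = h + w * Value(x)
    return h


results = []
for sequence in ([1.0, 2.0], [1.0, 2.0, 3.0, 4.0]):
    w = Value(0.5)
    loss = forward(w, sequence)     # a fresh graph is recorded for this sample
    node_count = loss.backward()
    results.append((len(sequence), node_count, w.grad))

print(results)
```

The longer sequence produces a larger graph because the Python loop itself decides the structure. No placeholder or control-flow operation has to be declared in advance.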
Reference: Yoav Goldberg, Neural Network Methods in Natural Language Processing, Morgan & Claypool (2017).
Synapse's own AI & Machine Learning engineer Omar Ayman on Dynamic vs Static Computational Graphs [PyTorch or TensorFlow]. Synapse Analytics