nGraph performs a combination of device-specific and non-device-specific optimizations:

  • Fusion -- Fuse multiple ops to decrease memory usage.
  • Data layout abstraction -- Make abstraction easier and faster by letting nGraph translate element order to work best on the given or available device.
  • Data reuse -- Save results and reuse them for subgraphs with the same input.
  • Graph scheduling -- Run similar subgraphs in parallel via multi-threading.
  • Graph partitioning -- Partition subgraphs to run on different devices to speed up computation; make better use of spare CPU cycles with nGraph.
  • Memory management -- Reduce peak memory usage by intercepting a graph with a "saved checkpoint," which also enables data auditing.
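
To make two of the items above concrete, here is a minimal sketch of fusion and data reuse on a toy expression graph. It does not use nGraph's API: the names Node, evaluate, unfused, and fused are hypothetical and exist only for this illustration, and the way nGraph implements these passes internally may differ.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


# --- Fusion (illustrative only) ---------------------------------------
def unfused(xs):
    """Two separate ops: the multiply materializes an intermediate list."""
    tmp = [x * 2.0 for x in xs]          # intermediate buffer
    return [t + 1.0 for t in tmp]


def fused(xs):
    """The same two ops fused into one pass with no intermediate list."""
    return [x * 2.0 + 1.0 for x in xs]


# --- Data reuse (illustrative only) -----------------------------------
@dataclass(frozen=True)
class Node:
    """Hypothetical graph node; frozen so equal subgraphs hash equally."""
    op: str                              # "input", "add", or "mul"
    args: Tuple["Node", ...] = ()
    name: str = ""                       # used only by "input" nodes


def evaluate(node: Node, inputs: Dict[str, float],
             cache: Dict[Node, float], counter: Dict[str, int]) -> float:
    """Evaluate a node, reusing cached results for identical subgraphs."""
    if node in cache:
        return cache[node]
    if node.op == "input":
        result = inputs[node.name]
    else:
        lhs, rhs = (evaluate(a, inputs, cache, counter) for a in node.args)
        counter[node.op] = counter.get(node.op, 0) + 1   # count real work
        result = lhs + rhs if node.op == "add" else lhs * rhs
    cache[node] = result
    return result


x, y = Node("input", name="x"), Node("input", name="y")
# (x + y) appears twice, built as two separate but identical subgraphs.
out = Node("mul", (Node("add", (x, y)), Node("add", (x, y))))
counter: Dict[str, int] = {}
result = evaluate(out, {"x": 2, "y": 3}, {}, counter)
print(result, counter)   # 25 {'add': 1, 'mul': 1}
```

In this sketch the add runs only once even though (x + y) appears twice, because structurally identical subgraphs compare equal and hit the cache. The fused function returns the same values as the unfused one without allocating the temporary list.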

Beta Limitations

In this Beta release, nGraph supports only just-in-time (JIT) compilation; we plan to add support for ahead-of-time (AOT) compilation in the official release of nGraph. Support for dynamic graphs is currently limited.