Nano Tensorflow API#

bigdl.nano.tf.keras#

class bigdl.nano.tf.keras.Model(*args, **kwargs)[source]#

A wrapper class for tf.keras.Model adding more functions for BigDL-Nano.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')#

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters:

Parameters
  • num_processes – when num_processes is not None, it specifies how many sub-processes to launch to run pseudo-distributed training; when num_processes is None, training runs in the current process.

  • backend – when num_processes is not None, it specifies which backend to use when launching sub-processes to run pseudo-distributed training; when num_processes is None, this parameter has no effect.
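
The call pattern described above can be sketched as a small helper. This is a minimal sketch, not part of BigDL-Nano: the helper name fit_pseudo_distributed is hypothetical, and it assumes the model passed in is an already compiled bigdl.nano.tf.keras.Model or Sequential and that the training data is a tf.data.Dataset.

```python
def fit_pseudo_distributed(model, dataset, num_processes=2, epochs=1):
    """Train `model` in `num_processes` sub-processes (hypothetical helper).

    Assumes `model` exposes the Nano `fit` signature documented above,
    i.e. it accepts the extra `num_processes` and `backend` keywords.
    Passing num_processes=None keeps training in the current process.
    """
    return model.fit(
        dataset,
        epochs=epochs,
        num_processes=num_processes,   # number of sub-processes to launch
        backend="multiprocessing",     # the default backend in the signature
    )


# Assumed usage, requires bigdl-nano with TensorFlow installed:
#   from bigdl.nano.tf.keras import Sequential
#   model = Sequential([...]); model.compile(...)
#   history = fit_pseudo_distributed(model, train_ds, num_processes=2)
```

All other keyword arguments of tf.keras.Model.fit can be forwarded the same way, since the override keeps their semantics.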

quantize(calib_dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, precision: str = 'int8', accelerator: Optional[str] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None, sample_size: int = 100, onnxruntime_session_options=None, openvino_config=None)#

Post-training quantization on a keras model.

Parameters
  • calib_dataset – An unbatched tf.data.Dataset object for calibration. Required for static quantization. It’s also used as validation dataloader.

  • precision – Global precision of quantized model, supported type: ‘int8’, defaults to ‘int8’.

  • accelerator – The accelerator to use: None, ‘onnxruntime’ or ‘openvino’. Defaults to None, which means staying in TensorFlow.

  • metric – A tensorflow.keras.metrics.Metric object for evaluation.

  • accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {‘relative’: 0.1, ‘higher_is_better’: True} allows a relative accuracy loss of 10%. accuracy_criterion = {‘absolute’: 0.99, ‘higher_is_better’: False} means the accuracy must be smaller than 0.99.

  • approach – ‘static’ or ‘dynamic’. ‘static’: post_training_static_quant, ‘dynamic’: post_training_dynamic_quant. Default: ‘static’. Only ‘static’ approach is supported now.

  • method – Quantization method. When accelerator=None, supported methods: None. When accelerator=’onnxruntime’, supported methods: ‘qlinear’ and ‘integer’, defaulting to ‘qlinear’. ‘qlinear’ is suggested for a lower accuracy drop when using static quantization. More details in https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO; leave it unchanged when using OpenVINO.

  • conf – A path to conf yaml file for quantization. Default: None, using default config.

  • tuning_strategy – ‘bayesian’, ‘basic’, ‘mse’, ‘sigopt’. Default: ‘bayesian’.

  • timeout – Tuning timeout (seconds). Default: None, which means early stop. Combine with max_trials field to decide when to exit.

  • max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combine with the timeout field to decide when to exit. “timeout=0, max_trials=1” means quantization is tried only once and the best satisfying model is returned.

  • batch – Batch size of the dataloader for calib_dataset. Defaults to None: if the dataset is not a BatchDataset, the batch size is 1; otherwise, the batch size follows dataset._batch_size.

  • inputs – A list of input names. Default: None, automatically get names from graph.

  • outputs – A list of output names. Default: None, automatically get names from graph.

  • sample_size – (optional) An int specifying how many samples are used by the Post-training Optimization Tools (POT) from the OpenVINO toolkit; only valid for accelerator=’openvino’. Defaults to 100. A larger value gives a more accurate conversion and a lower performance degradation, but takes longer.

  • onnxruntime_session_options – The session option for onnxruntime, only valid when accelerator=’onnxruntime’, otherwise will be ignored.

  • openvino_config – The config to be inputted in core.compile_model. Only valid when accelerator=’openvino’, otherwise will be ignored.

Returns

A TensorflowBaseModel for INC. If no model is found, returns None.
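
The “timeout=0, max_trials=1” combination from the parameter list can be sketched as follows. This is a minimal sketch under stated assumptions: the helper name quantize_single_trial is hypothetical, the model is assumed to be a bigdl.nano.tf.keras.Model or Sequential, and calib_dataset is assumed to be an unbatched tf.data.Dataset as the documentation requires.

```python
def quantize_single_trial(model, calib_dataset, metric=None, accelerator=None):
    """Run static int8 post-training quantization with one trial (hypothetical helper).

    Only keywords that appear in the documented `quantize` signature are used.
    """
    return model.quantize(
        calib_dataset=calib_dataset,  # unbatched dataset used for calibration
        precision="int8",             # the only supported precision
        accelerator=accelerator,      # None, 'onnxruntime' or 'openvino'
        metric=metric,                # optional tf.keras.metrics.Metric
        approach="static",            # only 'static' is supported now
        timeout=0,
        max_trials=1,                 # try quantization exactly once
    )


# Assumed usage, requires bigdl-nano with TensorFlow installed:
#   q_model = quantize_single_trial(model, calib_ds)
#   if q_model is not None:   # the documentation says None is returned if no model is found
#       preds = q_model(x_batch)
```

Checking the result against None before use follows the documented return value.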

trace(accelerator: Optional[str] = None, input_sample=None, thread_num: Optional[int] = None, onnxruntime_session_options=None, openvino_config=None)#

Trace a Keras model and convert it into an accelerated module for inference.

Parameters
  • input_sample – A set of inputs for trace, defaults to None. It should be a (tuple or list of) tf.TensorSpec or numpy array defining the shape/dtype of the input when using ‘onnxruntime’ accelerator. The parameter will be ignored if accelerator is ‘openvino’.

  • accelerator – The accelerator to use, defaults to None meaning staying in Keras backend. ‘openvino’ and ‘onnxruntime’ are supported for now.

  • thread_num – (optional) An int specifying how many threads (cores) are used for inference; only valid for accelerator=’onnxruntime’ or accelerator=’openvino’.

  • onnxruntime_session_options – The session option for onnxruntime, only valid when accelerator=’onnxruntime’, otherwise will be ignored.

  • openvino_config – The config to be inputted in core.compile_model. Only valid when accelerator=’openvino’, otherwise will be ignored.

Returns

A model accelerated by the chosen backend (OpenVINO or ONNX Runtime).
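
Because input_sample is only used by the ‘onnxruntime’ accelerator and ignored by ‘openvino’, a caller might forward it conditionally. The helper below is a hypothetical sketch of that pattern, assuming the model is a bigdl.nano.tf.keras.Model or Sequential; it is not part of BigDL-Nano.

```python
def trace_for_inference(model, accelerator=None, input_sample=None, thread_num=None):
    """Call `model.trace` with only the arguments relevant to `accelerator`
    (hypothetical helper)."""
    kwargs = {"accelerator": accelerator}
    if accelerator == "onnxruntime":
        # A (tuple or list of) tf.TensorSpec or numpy array describing the input.
        kwargs["input_sample"] = input_sample
    if accelerator in ("onnxruntime", "openvino"):
        # thread_num is only valid for these two accelerators.
        kwargs["thread_num"] = thread_num
    return model.trace(**kwargs)


# Assumed usage, requires bigdl-nano with TensorFlow and the chosen runtime installed:
#   import tensorflow as tf
#   spec = tf.TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32)
#   ort_model = trace_for_inference(model, "onnxruntime", input_sample=spec, thread_num=4)
```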

class bigdl.nano.tf.keras.Sequential(*args, **kwargs)[source]#

A wrapper class for tf.keras.Sequential adding more functions for BigDL-Nano.

Create a nano Sequential model with the same arguments as tf.keras.Sequential.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')#

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters:

Parameters
  • num_processes – when num_processes is not None, it specifies how many sub-processes to launch to run pseudo-distributed training; when num_processes is None, training runs in the current process.

  • backend – when num_processes is not None, it specifies which backend to use when launching sub-processes to run pseudo-distributed training; when num_processes is None, this parameter has no effect.

quantize(calib_dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, precision: str = 'int8', accelerator: Optional[str] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None, sample_size: int = 100, onnxruntime_session_options=None, openvino_config=None)#

Post-training quantization on a keras model.

Parameters
  • calib_dataset – An unbatched tf.data.Dataset object for calibration. Required for static quantization. It’s also used as validation dataloader.

  • precision – Global precision of quantized model, supported type: ‘int8’, defaults to ‘int8’.

  • accelerator – The accelerator to use: None, ‘onnxruntime’ or ‘openvino’. Defaults to None, which means staying in TensorFlow.

  • metric – A tensorflow.keras.metrics.Metric object for evaluation.

  • accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {‘relative’: 0.1, ‘higher_is_better’: True} allows a relative accuracy loss of 10%. accuracy_criterion = {‘absolute’: 0.99, ‘higher_is_better’: False} means the accuracy must be smaller than 0.99.

  • approach – ‘static’ or ‘dynamic’. ‘static’: post_training_static_quant, ‘dynamic’: post_training_dynamic_quant. Default: ‘static’. Only ‘static’ approach is supported now.

  • method – Quantization method. When accelerator=None, supported methods: None. When accelerator=’onnxruntime’, supported methods: ‘qlinear’ and ‘integer’, defaulting to ‘qlinear’. ‘qlinear’ is suggested for a lower accuracy drop when using static quantization. More details in https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO; leave it unchanged when using OpenVINO.

  • conf – A path to conf yaml file for quantization. Default: None, using default config.

  • tuning_strategy – ‘bayesian’, ‘basic’, ‘mse’, ‘sigopt’. Default: ‘bayesian’.

  • timeout – Tuning timeout (seconds). Default: None, which means early stop. Combine with max_trials field to decide when to exit.

  • max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combine with the timeout field to decide when to exit. “timeout=0, max_trials=1” means quantization is tried only once and the best satisfying model is returned.

  • batch – Batch size of the dataloader for calib_dataset. Defaults to None: if the dataset is not a BatchDataset, the batch size is 1; otherwise, the batch size follows dataset._batch_size.

  • inputs – A list of input names. Default: None, automatically get names from graph.

  • outputs – A list of output names. Default: None, automatically get names from graph.

  • sample_size – (optional) An int specifying how many samples are used by the Post-training Optimization Tools (POT) from the OpenVINO toolkit; only valid for accelerator=’openvino’. Defaults to 100. A larger value gives a more accurate conversion and a lower performance degradation, but takes longer.

  • onnxruntime_session_options – The session option for onnxruntime, only valid when accelerator=’onnxruntime’, otherwise will be ignored.

  • openvino_config – The config to be inputted in core.compile_model. Only valid when accelerator=’openvino’, otherwise will be ignored.

Returns

A TensorflowBaseModel for INC. If no model is found, returns None.

trace(accelerator: Optional[str] = None, input_sample=None, thread_num: Optional[int] = None, onnxruntime_session_options=None, openvino_config=None)#

Trace a Keras model and convert it into an accelerated module for inference.

Parameters
  • input_sample – A set of inputs for trace, defaults to None. It should be a (tuple or list of) tf.TensorSpec or numpy array defining the shape/dtype of the input when using ‘onnxruntime’ accelerator. The parameter will be ignored if accelerator is ‘openvino’.

  • accelerator – The accelerator to use, defaults to None meaning staying in Keras backend. ‘openvino’ and ‘onnxruntime’ are supported for now.

  • thread_num – (optional) An int specifying how many threads (cores) are used for inference; only valid for accelerator=’onnxruntime’ or accelerator=’openvino’.

  • onnxruntime_session_options – The session option for onnxruntime, only valid when accelerator=’onnxruntime’, otherwise will be ignored.

  • openvino_config – The config to be inputted in core.compile_model. Only valid when accelerator=’openvino’, otherwise will be ignored.

Returns

A model accelerated by the chosen backend (OpenVINO or ONNX Runtime).

class bigdl.nano.tf.keras.layers.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None, **kwargs)[source]#

A slightly modified version of tf.keras Embedding layer.

This embedding layer applies the regularizer only to the output of the embedding lookup, so that the gradient with respect to the embeddings stays sparse.

Create a slightly modified version of tf.keras Embedding layer.

Parameters
  • input_dim – Integer. Size of the vocabulary, i.e. maximum integer index + 1.

  • output_dim – Integer. Dimension of the dense embedding.

  • embeddings_initializer – Initializer for the embeddings matrix (see keras.initializers).

  • embeddings_regularizer – Applying a regularizer directly to the embeddings makes the sparse gradient dense and may degrade performance. We recommend using activity_regularizer instead.

  • activity_regularizer – Regularizer function applied to the output tensor after looking up the embeddings matrix.

  • embeddings_constraint – Constraint function applied to the embeddings matrix (see keras.constraints).

  • mask_zero – Boolean, whether or not the input value 0 is a special “padding” value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True, then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).

  • input_length – Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).

  • kwargs – Keyword arguments passed to tf.keras.layers.Embedding
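
Why regularizing the output keeps the gradient sparse can be illustrated without TensorFlow. The pure-Python sketch below is an illustration only, not BigDL-Nano code: both function names are hypothetical, and an L2 penalty with coefficient coeff is assumed as the regularizer.

```python
def l2_grad_on_embeddings(table, coeff):
    """Gradient of coeff * sum(W**2) over the whole embedding table.

    Every row receives an entry, so the gradient is dense.
    """
    return {i: [2 * coeff * w for w in row] for i, row in enumerate(table)}


def l2_grad_on_output(table, indices, coeff):
    """Gradient of coeff * sum(out**2), where out = rows looked up by `indices`.

    Only rows that appear in the batch receive an entry, so the gradient is sparse.
    Rows looked up more than once accumulate their contribution.
    """
    grads = {}
    for i in indices:
        row_grad = grads.setdefault(i, [0.0] * len(table[i]))
        for k, w in enumerate(table[i]):
            row_grad[k] += 2 * coeff * w
    return grads


table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # 4 tokens, dimension 2
batch = [1, 3, 1]                                         # token ids in one batch

dense = l2_grad_on_embeddings(table, coeff=0.01)
sparse = l2_grad_on_output(table, batch, coeff=0.01)
print(sorted(dense))   # all four row ids
print(sorted(sparse))  # only the row ids present in the batch
```

With a large vocabulary and small batches, the second form touches only a small fraction of the rows, which is the motivation for preferring activity_regularizer.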

bigdl.nano.tf.optimizers#

class bigdl.nano.tf.optimizers.SparseAdam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name='SparseAdam', **kwargs)[source]#

A variant of the Adam optimizer that handles sparse updates more efficiently.

The original Adam algorithm maintains two moving-average accumulators for each trainable variable; the accumulators are updated at every step. In this variant, only moments that show up in the gradient get updated, and only those portions of the gradient get applied to the parameters. Compared with the original Adam optimizer, it can provide large improvements in model training throughput for some applications.

Create a slightly modified version of tf.keras.optimizers.Adam that only updates the moving-average accumulators for the sparse variable indices that appear in the current batch.

Parameters
  • learning_rate – A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule, or a callable that takes no arguments and returns the actual value to use. The learning rate. Defaults to 0.001.

  • beta_1 – A float value or a constant float tensor, or a callable that takes no arguments and returns the actual value to use. The exponential decay rate for the 1st moment estimates. Defaults to 0.9.

  • beta_2 – A float value or a constant float tensor, or a callable that takes no arguments and returns the actual value to use. The exponential decay rate for the 2nd moment estimates. Defaults to 0.999.

  • epsilon – A small constant for numerical stability. This epsilon is “epsilon hat” in the Kingma and Ba paper (in the formula just before Section 2.1), not the epsilon in Algorithm 1 of the paper. Defaults to 1e-7.

  • amsgrad – Boolean. amsgrad is currently not supported and can only be set to False.

  • name – Optional name for the operations created when applying gradients. Defaults to “SparseAdam”.

  • kwargs – Keyword arguments. Allowed to be one of “clipnorm” or “clipvalue”. “clipnorm” (float) clips gradients by norm; “clipvalue” (float) clips gradients by value.
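
The sparse update described above can be sketched in plain Python. This is an illustration of the idea, not the BigDL-Nano implementation: the function name sparse_adam_step and the dict-of-rows gradient format are assumptions, and the step size uses the standard Adam bias correction with the “epsilon hat” form mentioned in the parameter list.

```python
import math


def sparse_adam_step(params, m, v, grad_rows, step,
                     lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """Apply one Adam step in place, touching only the rows in `grad_rows`.

    params, m, v: lists of rows (lists of floats) with identical shapes.
    grad_rows:    dict mapping row index -> gradient for that row (assumed format).
    step:         1-based global step, used for bias correction.
    """
    # Bias-corrected step size ("epsilon hat" formulation).
    lr_t = lr * math.sqrt(1 - beta_2 ** step) / (1 - beta_1 ** step)
    for i, grad in grad_rows.items():
        for k, g in enumerate(grad):
            m[i][k] = beta_1 * m[i][k] + (1 - beta_1) * g      # 1st moment
            v[i][k] = beta_2 * v[i][k] + (1 - beta_2) * g * g  # 2nd moment
            params[i][k] -= lr_t * m[i][k] / (math.sqrt(v[i][k]) + epsilon)


params = [[1.0, 1.0] for _ in range(3)]
m = [[0.0, 0.0] for _ in range(3)]
v = [[0.0, 0.0] for _ in range(3)]

# Only row 1 appears in this batch's gradient.
sparse_adam_step(params, m, v, {1: [0.5, -0.5]}, step=1)
print(params)  # rows 0 and 2 are unchanged
```

A dense Adam step would also decay the accumulators of rows 0 and 2 at every step; skipping them is what saves work when most rows are absent from a batch.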

Patch API#

bigdl.nano.tf.dispatcher.patch_tensorflow()[source]#

patch_tensorflow replaces the original TensorFlow classes with their optimized BigDL-Nano counterparts.

Optimized classes include:

1. tf.keras.Model/keras.Model -> bigdl.nano.tf.keras.Model
2. tf.keras.Sequential/keras.Sequential -> bigdl.nano.tf.keras.Sequential
3. tf.keras.layers.Embedding/keras.layers.Embedding -> bigdl.nano.tf.keras.layers.Embedding
4. tf.optimizers.Adam -> bigdl.nano.tf.optimizers.SparseAdam

bigdl.nano.tf.dispatcher.unpatch_tensorflow()[source]#

unpatch_tensorflow restores the original TensorFlow classes that patch_tensorflow replaced.
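
The general patch and restore mechanism can be sketched with stand-in objects. Everything in this block is hypothetical (the framework namespace, the two classes, and the patch and unpatch functions); it shows ordinary attribute swapping and makes no claim about how bigdl.nano.tf.dispatcher is implemented.

```python
import types


class OriginalModel:
    """Stand-in for a framework class such as tf.keras.Model."""


class OptimizedModel(OriginalModel):
    """Stand-in for an optimized replacement class."""


# Stand-in for a framework module namespace.
framework = types.SimpleNamespace(Model=OriginalModel)

_REPLACEMENTS = {"Model": OptimizedModel}  # attribute name -> replacement class
_saved = {}                                # attribute name -> original class


def patch(namespace):
    """Swap in the optimized classes, remembering the originals once."""
    for name, replacement in _REPLACEMENTS.items():
        if name not in _saved:  # keep the true original if patch is called twice
            _saved[name] = getattr(namespace, name)
        setattr(namespace, name, replacement)


def unpatch(namespace):
    """Restore every class that patch() replaced."""
    while _saved:
        name, original = _saved.popitem()
        setattr(namespace, name, original)


patch(framework)
print(framework.Model.__name__)  # OptimizedModel
unpatch(framework)
print(framework.Model.__name__)  # OriginalModel
```

With BigDL-Nano itself, the assumed usage is to call patch_tensorflow() before building models with tf.keras and unpatch_tensorflow() afterwards to get the original classes back.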