Struct AnalysisConfig

Struct Documentation

struct paddle::AnalysisConfig

Configuration manager for AnalysisPredictor.

AnalysisConfig manages the configuration of AnalysisPredictor. During the inference procedure, there are many parameters (model/params path, place of inference, etc.) to be specified, and various optimizations (subgraph fusion, memory optimization, TensorRT engine, etc.) to be applied. Users can manage these settings by creating and modifying an AnalysisConfig and loading it into an AnalysisPredictor.

Since

1.7.0
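
A minimal usage sketch is shown below. Only the AnalysisConfig calls are documented on this page; the header name and the paddle::CreatePaddlePredictor factory are assumptions about the surrounding inference API, and the model path is hypothetical.

#include "paddle_inference_api.h"  // assumed header exposing AnalysisConfig

paddle::AnalysisConfig config;
config.SetModel("./mobilenet_v1");     // hypothetical non-combined model directory
config.EnableUseGpu(100 /* MB */, 0);  // or config.DisableGpu() for CPU inference
config.SwitchIrOptim(true);            // apply IR graph optimizations
config.EnableMemoryOptim();
// Assumption: the predictor factory lives in the same API as AnalysisConfig.
auto predictor = paddle::CreatePaddlePredictor(config);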

Public Types

enum Precision

Precision of inference in TensorRT.

Values:

enumerator kFloat32 = 0

fp32

enumerator kInt8

int8

enumerator kHalf

fp16

Public Functions

AnalysisConfig() = default
AnalysisConfig(const AnalysisConfig &other)

Construct a new AnalysisConfig from another AnalysisConfig.

Parameters
  • [in] other: another AnalysisConfig to copy from.

AnalysisConfig(const std::string &model_dir)

Construct a new AnalysisConfig from a non-combined model.

Parameters
  • [in] model_dir: model directory of the non-combined model.

AnalysisConfig(const std::string &prog_file, const std::string &params_file)

Construct a new AnalysisConfig from a combined model.

Parameters
  • [in] prog_file: model file path of the combined model.

  • [in] params_file: params file path of the combined model.

void SetModel(const std::string &model_dir)

Set the non-combined model directory path.

Parameters
  • model_dir: the model directory path.

void SetModel(const std::string &prog_file_path, const std::string &params_file_path)

Set the combined model with two specific paths for the program and parameters.

Parameters
  • prog_file_path: model file path of the combined model.

  • params_file_path: params file path of the combined model.
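
For illustration, a small sketch of both SetModel overloads; the paths are hypothetical.

paddle::AnalysisConfig config;
// Non-combined model: a directory holding the program and separate parameter files.
config.SetModel("./mobilenet_v1");
// Combined model: one program file plus one merged parameters file.
config.SetModel("./model/__model__", "./model/__params__");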

void SetProgFile(const std::string &x)

Set the model file path of a combined model.

Parameters
  • x: model file path.

void SetParamsFile(const std::string &x)

Set the params file path of a combined model.

Parameters
  • x: params file path.

void SetOptimCacheDir(const std::string &opt_cache_dir)

Set the path of optimization cache directory.

Parameters
  • opt_cache_dir: the path of optimization cache directory.

const std::string &model_dir() const

Get the model directory path.

Return

const std::string& The model directory path.

const std::string &prog_file() const

Get the program file path.

Return

const std::string& The program file path.

const std::string &params_file() const

Get the combined parameters file.

Return

const std::string& The combined parameters file.

void DisableFCPadding()

Turn off FC Padding.

bool use_fc_padding() const

A boolean state telling whether fc padding is used.

Return

bool Whether fc padding is used.

void EnableUseGpu(uint64_t memory_pool_init_size_mb, int device_id = 0)

Turn on GPU.

Parameters
  • memory_pool_init_size_mb: initial size of the GPU memory pool in MB.

  • device_id: the GPU card to use (default is 0).

void DisableGpu()

Turn off GPU.

bool use_gpu() const

A boolean state telling whether the GPU is turned on.

Return

bool Whether the GPU is turned on.

int gpu_device_id() const

Get the GPU device id.

Return

int The GPU device id.

int memory_pool_init_size_mb() const

Get the initial size in MB of the GPU memory pool.

Return

int The initial size in MB of the GPU memory pool.

float fraction_of_gpu_memory_for_pool() const

Get the proportion of the initial GPU memory pool size relative to the total device memory.

Return

float The proportion of the initial memory pool size.

void EnableCUDNN()

Turn on CUDNN.

bool cudnn_enabled() const

A boolean state telling whether to use CUDNN.

Return

bool Whether to use CUDNN.

void SwitchIrOptim(int x = true)

Control whether to perform IR graph optimization. If turned off, the AnalysisConfig will act just like a NativeConfig.

Parameters
  • x: Whether the IR graph optimization is activated.

bool ir_optim() const

A boolean state telling whether the IR graph optimization is activated.

Return

bool Whether to use ir graph optimization.

void SwitchUseFeedFetchOps(int x = true)

INTERNAL Determine whether to use the feed and fetch operators. Just for internal development, not stable yet. When ZeroCopyTensor is used, this should be turned off.

Parameters
  • x: Whether to use the feed and fetch operators.

bool use_feed_fetch_ops_enabled() const

A boolean state telling whether to use the feed and fetch operators.

Return

bool Whether to use the feed and fetch operators.

void SwitchSpecifyInputNames(bool x = true)

Control whether to specify the inputs’ names. The ZeroCopyTensor type has a name member; assign it the corresponding variable name. This is used only when the input ZeroCopyTensors passed to AnalysisPredictor::ZeroCopyRun() cannot follow the order used in the training phase.

Parameters
  • x: Whether to specify the inputs’ names.
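
When inputs are fed through ZeroCopyTensor, the feed/fetch operators are typically switched off and input names switched on. A minimal configuration sketch, with a hypothetical model path and the predictor-side calls omitted:

paddle::AnalysisConfig config;
config.SetModel("./model_dir");        // hypothetical path
config.SwitchUseFeedFetchOps(false);   // required when ZeroCopyTensor is used
config.SwitchSpecifyInputNames(true);  // match inputs by the ZeroCopyTensor name member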

bool specify_input_name() const

A boolean state telling whether the specified input ZeroCopyTensor names should be used to reorder the inputs in AnalysisPredictor::ZeroCopyRun().

Return

bool Whether to specify the inputs’ names.

void EnableTensorRtEngine(int workspace_size = 1 << 20, int max_batch_size = 1, int min_subgraph_size = 3, Precision precision = Precision::kFloat32, bool use_static = false, bool use_calib_mode = true)

Turn on the TensorRT engine. The TensorRT engine will accelerate some subgraphs in the original Fluid computation graph. For models such as ResNet-50 and GoogLeNet, it yields significant performance acceleration.

Parameters
  • workspace_size: The memory size (in bytes) used for the TensorRT workspace.

  • max_batch_size: The maximum batch size of this prediction task; set it as small as possible to reduce performance loss.

  • min_subgraph_size: The minimum TensorRT subgraph size; if a subgraph is smaller than this, it will not be transferred to the TensorRT engine.

  • precision: The precision used in TensorRT.

  • use_static: Serialize optimization information to disk for reuse.

  • use_calib_mode: Use TensorRT int8 calibration (post-training quantization).
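
A sketch of enabling the TensorRT engine with the default-like values listed above; the model paths and memory pool size are illustrative assumptions.

paddle::AnalysisConfig config("./model/__model__", "./model/__params__");  // hypothetical paths
config.EnableUseGpu(100 /* MB */, 0);  // the TensorRT engine runs on GPU
config.EnableTensorRtEngine(1 << 20 /* workspace_size */,
                            1       /* max_batch_size */,
                            3       /* min_subgraph_size */,
                            paddle::AnalysisConfig::Precision::kFloat32,
                            false   /* use_static */,
                            true    /* use_calib_mode */);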

bool tensorrt_engine_enabled() const

A boolean state telling whether the TensorRT engine is used.

Return

bool Whether the TensorRT engine is used.

void SetTRTDynamicShapeInfo(std::map<std::string, std::vector<int>> min_input_shape, std::map<std::string, std::vector<int>> max_input_shape, std::map<std::string, std::vector<int>> optim_input_shape, bool disable_trt_plugin_fp16 = false)

Set min, max, opt shape for TensorRT Dynamic shape mode.

Parameters
  • min_input_shape: The min input shape of the subgraph input.

  • max_input_shape: The max input shape of the subgraph input.

  • optim_input_shape: The optimal input shape of the subgraph input.

  • disable_trt_plugin_fp16: Setting this parameter to true means that the TensorRT plugin will not run in fp16 mode.
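
A sketch of dynamic-shape configuration; the input name "image" and the shapes are hypothetical, and config is assumed to already have the TensorRT engine enabled.

std::map<std::string, std::vector<int>> min_shape{{"image", {1, 3, 112, 112}}};
std::map<std::string, std::vector<int>> max_shape{{"image", {1, 3, 448, 448}}};
std::map<std::string, std::vector<int>> opt_shape{{"image", {1, 3, 224, 224}}};
config.SetTRTDynamicShapeInfo(min_shape, max_shape, opt_shape);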

void EnableLiteEngine(AnalysisConfig::Precision precision_mode = Precision::kFloat32, const std::vector<std::string> &passes_filter = {}, const std::vector<std::string> &ops_filter = {})

Turn on the usage of Lite sub-graph engine.

Parameters
  • precision_mode: Precision used in the Lite sub-graph engine.

  • passes_filter: Set the passes used in Lite sub-graph engine.

  • ops_filter: Operators not supported by Lite.
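
A sketch of turning on the Lite sub-graph engine with the defaults listed above; the model path is hypothetical and the empty filters keep the default pass/operator selection.

paddle::AnalysisConfig config("./model_dir");  // hypothetical non-combined model
config.EnableLiteEngine(paddle::AnalysisConfig::Precision::kFloat32,
                        /*passes_filter=*/{},
                        /*ops_filter=*/{});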

bool lite_engine_enabled() const

A boolean state indicating whether the Lite sub-graph engine is used.

Return

bool whether the Lite sub-graph engine is used.

void SwitchIrDebug(int x = true)

Control whether to debug the IR graph analysis phase. This will generate DOT files for visualizing the computation graph after each analysis pass is applied.

Parameters
  • x: whether to debug IR graph analysis phase.

void EnableMKLDNN()

Turn on MKLDNN.

void SetMkldnnCacheCapacity(int capacity)

Set the cache capacity of different input shapes for MKLDNN. A default value of 0 means no shape is cached.

Parameters
  • capacity: The cache capacity.
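
A sketch of a CPU configuration combining the MKLDNN-related calls on this page; the model path, cache capacity, and thread count are illustrative.

paddle::AnalysisConfig config("./model_dir");  // hypothetical path
config.DisableGpu();                           // run inference on CPU
config.EnableMKLDNN();
config.SetMkldnnCacheCapacity(10);             // cache up to 10 input shapes
config.SetCpuMathLibraryNumThreads(4);         // CPU math library threads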

bool mkldnn_enabled() const

A boolean state telling whether to use MKLDNN.

Return

bool Whether to use MKLDNN.

void SetCpuMathLibraryNumThreads(int cpu_math_library_num_threads)

Set the number of cpu math library threads.

Parameters
  • cpu_math_library_num_threads: The number of cpu math library threads.

int cpu_math_library_num_threads() const

An int state telling how many threads are used in the CPU math library.

Return

int The number of threads used in the CPU math library.

NativeConfig ToNativeConfig() const

Transform the AnalysisConfig to NativeConfig.

Return

NativeConfig The NativeConfig transformed.

void SetMKLDNNOp(std::unordered_set<std::string> op_list)

Specify the operator type list to use MKLDNN acceleration.

Parameters
  • op_list: The operator type list.
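
A brief example; config is an existing AnalysisConfig and the operator type names are illustrative.

config.SetMKLDNNOp({"conv2d", "fc", "pool2d"});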

void EnableMkldnnQuantizer()

Turn on MKLDNN quantization.

bool mkldnn_quantizer_enabled() const

A boolean state telling whether the MKLDNN quantization is enabled.

Return

bool Whether the MKLDNN quantization is enabled.

MkldnnQuantizerConfig *mkldnn_quantizer_config() const

Get MKLDNN quantizer config.

Return

MkldnnQuantizerConfig* MKLDNN quantizer config.

void SetModelBuffer(const char *prog_buffer, size_t prog_buffer_size, const char *params_buffer, size_t params_buffer_size)

Specify the memory buffers of the program and parameters. Used when the model and params are loaded directly from memory.

Parameters
  • prog_buffer: The memory buffer of program.

  • prog_buffer_size: The size of the model data.

  • params_buffer: The memory buffer of the combined parameters file.

  • params_buffer_size: The size of the combined parameters data.
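
A sketch of loading a combined model directly from memory; reading the files into std::string buffers via a hypothetical ReadFile helper is an assumption about how the data is obtained, and the paths are illustrative.

#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper that slurps a file into a string buffer.
static std::string ReadFile(const std::string &path) {
  std::ifstream in(path, std::ios::binary);
  std::ostringstream buffer;
  buffer << in.rdbuf();
  return buffer.str();
}

std::string prog = ReadFile("./model/__model__");
std::string params = ReadFile("./model/__params__");
paddle::AnalysisConfig config;
config.SetModelBuffer(prog.data(), prog.size(), params.data(), params.size());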

bool model_from_memory() const

A boolean state telling whether the model is set from CPU memory.

Return

bool Whether model and params are loaded directly from memory.

void EnableMemoryOptim()

Turn on memory optimization. NOTE: still in development.

bool enable_memory_optim() const

A boolean state telling whether the memory optimization is activated.

Return

bool Whether the memory optimization is activated.

void EnableProfile()

Turn on profiling report. If not turned on, no profiling report will be generated.

bool profile_enabled() const

A boolean state telling whether the profiler is activated.

Return

bool Whether the profiler is activated.

void DisableGlogInfo()

Mute all logs in Paddle inference.

bool glog_info_disabled() const

A boolean state telling whether logs in Paddle inference are muted.

Return

bool Whether logs in Paddle inference are muted.

void SetInValid() const

Set the AnalysisConfig to be invalid. This is to ensure that an AnalysisConfig can only be used in one AnalysisPredictor.

bool is_valid() const

A boolean state telling whether the AnalysisConfig is valid.

Return

bool Whether the AnalysisConfig is valid.

PassStrategy *pass_builder() const

Get a pass builder for customizing the passes in the IR analysis phase. NOTE: just for developers, not an official API; it may break easily.

void PartiallyRelease()

Protected Functions

void Update()
std::string SerializeInfoCache()

Protected Attributes

std::string model_dir_
std::string prog_file_
std::string params_file_
bool use_gpu_ = {false}
int device_id_ = {0}
uint64_t memory_pool_init_size_mb_ = {100}
bool use_cudnn_ = {false}
bool use_fc_padding_ = {true}
bool use_tensorrt_ = {false}
int tensorrt_workspace_size_ = {1 << 30}
int tensorrt_max_batchsize_ = {1}
int tensorrt_min_subgraph_size_ = {3}
Precision tensorrt_precision_mode_ = {Precision::kFloat32}
bool trt_use_static_engine_ = {false}
bool trt_use_calib_mode_ = {true}
std::map<std::string, std::vector<int>> min_input_shape_ = {}
std::map<std::string, std::vector<int>> max_input_shape_ = {}
std::map<std::string, std::vector<int>> optim_input_shape_ = {}
bool disable_trt_plugin_fp16_ = {false}
bool enable_memory_optim_ = {false}
bool use_mkldnn_ = {false}
std::unordered_set<std::string> mkldnn_enabled_op_types_
bool model_from_memory_ = {false}
bool enable_ir_optim_ = {true}
bool use_feed_fetch_ops_ = {true}
bool ir_debug_ = {false}
bool specify_input_name_ = {false}
int cpu_math_library_num_threads_ = {1}
bool with_profile_ = {false}
bool with_glog_info_ = {true}
std::string serialized_info_cache_
std::unique_ptr<PassStrategy> pass_builder_
bool use_lite_ = {false}
std::vector<std::string> lite_passes_filter_
std::vector<std::string> lite_ops_filter_
Precision lite_precision_mode_
int mkldnn_cache_capacity_ = {0}
bool use_mkldnn_quantizer_ = {false}
std::shared_ptr<MkldnnQuantizerConfig> mkldnn_quantizer_config_
bool is_valid_ = {true}
std::string opt_cache_dir_

Friends

friend class ::paddle::AnalysisPredictor