Struct AnalysisConfig
Defined in File paddle_analysis_config.h
Struct Documentation
- struct paddle::AnalysisConfig

  Configuration manager for AnalysisPredictor.

  AnalysisConfig manages the configuration of AnalysisPredictor. During the inference procedure, many parameters (model/params path, place of inference, etc.) need to be specified, and various optimizations (subgraph fusion, memory optimization, TensorRT engine, etc.) can be applied. Users manage these settings by creating and modifying an AnalysisConfig and loading it into an AnalysisPredictor.

  - Since
    1.7.0
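A minimal usage sketch, assuming the Paddle inference headers and the `paddle::CreatePaddlePredictor` factory are available (header paths shown are assumptions):

```cpp
#include "paddle_analysis_config.h"  // assumed header location
#include "paddle_api.h"              // assumed: declares CreatePaddlePredictor

int main() {
  // Configure inference for a combined model (one program file, one params file).
  // The file paths below are placeholders.
  paddle::AnalysisConfig config("model/__model__", "model/params");
  config.SwitchIrOptim(true);  // enable IR graph optimizations

  // Load the config into a predictor; the config is then bound to it.
  auto predictor = paddle::CreatePaddlePredictor(config);
  return 0;
}
```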
Public Functions
- AnalysisConfig() = default

- AnalysisConfig(const AnalysisConfig &other)
  Construct a new AnalysisConfig from another AnalysisConfig.
  - Parameters
    [in] other: another AnalysisConfig.

- AnalysisConfig(const std::string &model_dir)
  Construct a new AnalysisConfig from a no-combined model.
  - Parameters
    [in] model_dir: model directory of the no-combined model.

- AnalysisConfig(const std::string &prog_file, const std::string &params_file)
  Construct a new AnalysisConfig from a combined model.
  - Parameters
    [in] prog_file: model file path of the combined model.
    [in] params_file: params file path of the combined model.
- void SetModel(const std::string &model_dir)
  Set the no-combined model dir path.
  - Parameters
    model_dir: model dir path.

- void SetModel(const std::string &prog_file_path, const std::string &params_file_path)
  Set the combined model with two specific paths for the program and parameters.
  - Parameters
    prog_file_path: model file path of the combined model.
    params_file_path: params file path of the combined model.
- void SetProgFile(const std::string &x)
  Set the model file path of a combined model.
  - Parameters
    x: model file path.

- void SetParamsFile(const std::string &x)
  Set the params file path of a combined model.
  - Parameters
    x: params file path.

- void SetOptimCacheDir(const std::string &opt_cache_dir)
  Set the path of the optimization cache directory.
  - Parameters
    opt_cache_dir: the path of the optimization cache directory.
- const std::string &model_dir() const
  Get the model directory path.
  - Return
    const std::string&: the model directory path.

- const std::string &prog_file() const
  Get the program file path.
  - Return
    const std::string&: the program file path.

- const std::string &params_file() const
  Get the combined parameters file.
  - Return
    const std::string&: the combined parameters file.

- void DisableFCPadding()
  Turn off FC padding.

- bool use_fc_padding() const
  A boolean state telling whether FC padding is used.
  - Return
    bool: whether FC padding is used.
- void EnableUseGpu(uint64_t memory_pool_init_size_mb, int device_id = 0)
  Turn on GPU.
  - Parameters
    memory_pool_init_size_mb: initial size of the GPU memory pool in MB.
    device_id: the GPU card to use (default is 0).
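A sketch of typical GPU configuration, assuming the Paddle inference header is on the include path:

```cpp
#include "paddle_analysis_config.h"  // assumed header location

// Enable GPU inference with a 100 MB initial memory pool on card 0.
void ConfigureGpu(paddle::AnalysisConfig* config) {
  config->EnableUseGpu(100 /*memory_pool_init_size_mb*/, 0 /*device_id*/);
  config->EnableCUDNN();  // optionally also turn on CUDNN
  // To fall back to CPU-only inference instead:
  // config->DisableGpu();
}
```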
- void DisableGpu()
  Turn off GPU.

- bool use_gpu() const
  A boolean state telling whether the GPU is turned on.
  - Return
    bool: whether the GPU is turned on.

- int gpu_device_id() const
  Get the GPU device id.
  - Return
    int: the GPU device id.

- int memory_pool_init_size_mb() const
  Get the initial size in MB of the GPU memory pool.
  - Return
    int: the initial size in MB of the GPU memory pool.
- float fraction_of_gpu_memory_for_pool() const
  Get the initial memory pool size as a proportion of total device memory.
  - Return
    float: the proportion of the initial memory pool size.
- void EnableCUDNN()
  Turn on CUDNN.

- bool cudnn_enabled() const
  A boolean state telling whether to use CUDNN.
  - Return
    bool: whether to use CUDNN.
- void SwitchIrOptim(int x = true)
  Control whether to perform IR graph optimization. If turned off, the AnalysisConfig will act just like a NativeConfig.
  - Parameters
    x: whether IR graph optimization is activated.

- bool ir_optim() const
  A boolean state telling whether IR graph optimization is activated.
  - Return
    bool: whether IR graph optimization is used.
- void SwitchUseFeedFetchOps(int x = true)
  INTERNAL. Determine whether to use the feed and fetch operators. Just for internal development, not stable yet. When ZeroCopyTensor is used, this should be turned off.
  - Parameters
    x: whether to use the feed and fetch operators.

- bool use_feed_fetch_ops_enabled() const
  A boolean state telling whether to use the feed and fetch operators.
  - Return
    bool: whether to use the feed and fetch operators.
- void SwitchSpecifyInputNames(bool x = true)
  Control whether to specify the inputs' names. The ZeroCopyTensor type has a name member; assign it the corresponding variable name. This is used only when the input ZeroCopyTensors passed to AnalysisPredictor.ZeroCopyRun() cannot follow the order used in the training phase.
  - Parameters
    x: whether to specify the inputs' names.

- bool specify_input_name() const
  A boolean state telling whether the specified ZeroCopyTensor input names should be used to reorder the inputs in AnalysisPredictor.ZeroCopyRun().
  - Return
    bool: whether to specify the inputs' names.
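The two switches above are typically used together when feeding inputs through ZeroCopyTensor. A sketch, assuming the Paddle inference header is available:

```cpp
#include "paddle_analysis_config.h"  // assumed header location

// Typical ZeroCopyTensor setup: feed/fetch ops off, named inputs on.
void ConfigureZeroCopy(paddle::AnalysisConfig* config) {
  config->SwitchUseFeedFetchOps(false);   // must be off when using ZeroCopyTensor
  config->SwitchSpecifyInputNames(true);  // match inputs by tensor name
}
```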
- void EnableTensorRtEngine(int workspace_size = 1 << 20, int max_batch_size = 1, int min_subgraph_size = 3, Precision precision = Precision::kFloat32, bool use_static = false, bool use_calib_mode = true)
  Turn on the TensorRT engine. The TensorRT engine will accelerate some subgraphs in the original Fluid computation graph. In some models, such as resnet50 and GoogleNet, it gains significant performance acceleration.
  - Parameters
    workspace_size: the memory size (in bytes) used for the TensorRT workspace.
    max_batch_size: the maximum batch size of this prediction task; set it as small as possible for less performance loss.
    min_subgraph_size: the minimum TensorRT subgraph size needed; if a subgraph is smaller than this, it will not be transferred to the TensorRT engine.
    precision: the precision used in TensorRT.
    use_static: serialize optimization information to disk for reuse.
    use_calib_mode: use TRT int8 calibration (post-training quantization).
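A sketch of turning on the TensorRT engine with the documented defaults; GPU must be enabled first (header path is an assumption):

```cpp
#include "paddle_analysis_config.h"  // assumed header location

// Enable the TensorRT subgraph engine on GPU card 0.
void ConfigureTensorRT(paddle::AnalysisConfig* config) {
  config->EnableUseGpu(100, 0);  // TensorRT requires GPU to be on
  config->EnableTensorRtEngine(
      1 << 20,                                      // workspace_size in bytes
      1,                                            // max_batch_size
      3,                                            // min_subgraph_size
      paddle::AnalysisConfig::Precision::kFloat32,  // precision
      false,                                        // use_static
      true);                                        // use_calib_mode
}
```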
- bool tensorrt_engine_enabled() const
  A boolean state telling whether the TensorRT engine is used.
  - Return
    bool: whether the TensorRT engine is used.
- void SetTRTDynamicShapeInfo(std::map<std::string, std::vector<int>> min_input_shape, std::map<std::string, std::vector<int>> max_input_shape, std::map<std::string, std::vector<int>> optim_input_shape, bool disable_trt_plugin_fp16 = false)
  Set the min, max, and optimal shapes for TensorRT dynamic shape mode.
  - Parameters
    min_input_shape: the minimum shape of the subgraph input.
    max_input_shape: the maximum shape of the subgraph input.
    optim_input_shape: the optimal shape of the subgraph input.
    disable_trt_plugin_fp16: setting this parameter to true means that TRT plugins will not run in fp16.
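A dynamic-shape sketch; the input name "image" and the shape values are hypothetical and must match the actual subgraph inputs of your model:

```cpp
#include <map>
#include <string>
#include <vector>
#include "paddle_analysis_config.h"  // assumed header location

// Provide min/max/optimal shapes for a hypothetical input named "image".
void ConfigureDynamicShape(paddle::AnalysisConfig* config) {
  std::map<std::string, std::vector<int>> min_shape{{"image", {1, 3, 112, 112}}};
  std::map<std::string, std::vector<int>> max_shape{{"image", {1, 3, 448, 448}}};
  std::map<std::string, std::vector<int>> opt_shape{{"image", {1, 3, 224, 224}}};
  config->SetTRTDynamicShapeInfo(min_shape, max_shape, opt_shape,
                                 /*disable_trt_plugin_fp16=*/false);
}
```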
- void EnableLiteEngine(AnalysisConfig::Precision precision_mode = Precision::kFloat32, const std::vector<std::string> &passes_filter = {}, const std::vector<std::string> &ops_filter = {})
  Turn on the usage of the Lite sub-graph engine.
  - Parameters
    precision_mode: precision used in the Lite sub-graph engine.
    passes_filter: set the passes used in the Lite sub-graph engine.
    ops_filter: operators not supported by Lite.
- bool lite_engine_enabled() const
  A boolean state indicating whether the Lite sub-graph engine is used.
  - Return
    bool: whether the Lite sub-graph engine is used.
- void SwitchIrDebug(int x = true)
  Control whether to debug the IR graph analysis phase. This will generate DOT files for visualizing the computation graph after each analysis pass is applied.
  - Parameters
    x: whether to debug the IR graph analysis phase.
- void EnableMKLDNN()
  Turn on MKLDNN.

- void SetMkldnnCacheCapacity(int capacity)
  Set the cache capacity of different input shapes for MKLDNN. The default value 0 means not caching any shape.
  - Parameters
    capacity: the cache capacity.
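A sketch of a CPU-only MKLDNN configuration; the capacity and thread counts are illustrative values, not recommendations from this document:

```cpp
#include "paddle_analysis_config.h"  // assumed header location

// CPU inference with MKLDNN, caching up to 10 distinct input shapes.
void ConfigureMkldnn(paddle::AnalysisConfig* config) {
  config->DisableGpu();
  config->EnableMKLDNN();
  config->SetMkldnnCacheCapacity(10);      // 0 would disable shape caching
  config->SetCpuMathLibraryNumThreads(4);  // CPU math library threads
}
```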
- bool mkldnn_enabled() const
  A boolean state telling whether MKLDNN is used.
  - Return
    bool: whether MKLDNN is used.
- void SetCpuMathLibraryNumThreads(int cpu_math_library_num_threads)
  Set the number of CPU math library threads.
  - Parameters
    cpu_math_library_num_threads: the number of CPU math library threads.

- int cpu_math_library_num_threads() const
  An int state telling how many threads are used in the CPU math library.
  - Return
    int: the number of threads used in the CPU math library.
- NativeConfig ToNativeConfig() const
  Transform the AnalysisConfig to NativeConfig.
  - Return
    NativeConfig: the transformed NativeConfig.
- void SetMKLDNNOp(std::unordered_set<std::string> op_list)
  Specify the operator type list to use MKLDNN acceleration.
  - Parameters
    op_list: the operator type list.

- void EnableMkldnnQuantizer()
  Turn on MKLDNN quantization.

- bool mkldnn_quantizer_enabled() const
  A boolean state telling whether MKLDNN quantization is enabled.
  - Return
    bool: whether MKLDNN quantization is enabled.

- MkldnnQuantizerConfig *mkldnn_quantizer_config() const
  Get the MKLDNN quantizer config.
  - Return
    MkldnnQuantizerConfig*: the MKLDNN quantizer config.
- void SetModelBuffer(const char *prog_buffer, size_t prog_buffer_size, const char *params_buffer, size_t params_buffer_size)
  Specify the memory buffers of the program and parameters. Used when the model and params are loaded directly from memory.
  - Parameters
    prog_buffer: the memory buffer of the program.
    prog_buffer_size: the size of the model data.
    params_buffer: the memory buffer of the combined parameters file.
    params_buffer_size: the size of the combined parameters data.

- bool model_from_memory() const
  A boolean state telling whether the model is set from CPU memory.
  - Return
    bool: whether the model and params are loaded directly from memory.
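A sketch of loading a combined model from memory; the file-reading helper is illustrative, and the buffers must stay alive until the predictor is created:

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include "paddle_analysis_config.h"  // assumed header location

// Illustrative helper: read a whole file into a string.
std::string ReadFile(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::ostringstream buf;
  buf << in.rdbuf();
  return buf.str();
}

// Hand in-memory program and parameter buffers to the config.
void ConfigureFromMemory(paddle::AnalysisConfig* config,
                         const std::string& prog_path,
                         const std::string& params_path) {
  const std::string prog = ReadFile(prog_path);
  const std::string params = ReadFile(params_path);
  config->SetModelBuffer(prog.data(), prog.size(),
                         params.data(), params.size());
}
```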
- void EnableMemoryOptim()
  Turn on memory optimization. NOTE: still in development.

- bool enable_memory_optim() const
  A boolean state telling whether memory optimization is activated.
  - Return
    bool: whether memory optimization is activated.
- void EnableProfile()
  Turn on the profiling report. If not turned on, no profiling report will be generated.

- bool profile_enabled() const
  A boolean state telling whether the profiler is activated.
  - Return
    bool: whether the profiler is activated.

- void DisableGlogInfo()
  Mute all logs in Paddle inference.

- bool glog_info_disabled() const
  A boolean state telling whether logs in Paddle inference are muted.
  - Return
    bool: whether logs in Paddle inference are muted.
- void SetInValid() const
  Set the AnalysisConfig to be invalid. This ensures that an AnalysisConfig can only be used in one AnalysisPredictor.

- bool is_valid() const
  A boolean state telling whether the AnalysisConfig is valid.
  - Return
    bool: whether the AnalysisConfig is valid.

- PassStrategy *pass_builder() const
  Get a pass builder for customizing the passes in the IR analysis phase. NOTE: just for developers, not an official API, easy to be broken.

- void PartiallyRelease()
Protected Attributes
- std::string model_dir_
- std::string prog_file_
- std::string params_file_
- bool use_gpu_ = {false}
- int device_id_ = {0}
- uint64_t memory_pool_init_size_mb_ = {100}
- bool use_cudnn_ = {false}
- bool use_fc_padding_ = {true}
- bool use_tensorrt_ = {false}
- int tensorrt_workspace_size_ = {1 << 30}
- int tensorrt_max_batchsize_ = {1}
- int tensorrt_min_subgraph_size_ = {3}
- bool trt_use_static_engine_ = {false}
- bool trt_use_calib_mode_ = {true}
- std::map<std::string, std::vector<int>> min_input_shape_ = {}
- std::map<std::string, std::vector<int>> max_input_shape_ = {}
- std::map<std::string, std::vector<int>> optim_input_shape_ = {}
- bool disable_trt_plugin_fp16_ = {false}
- bool enable_memory_optim_ = {false}
- bool use_mkldnn_ = {false}
- std::unordered_set<std::string> mkldnn_enabled_op_types_
- bool model_from_memory_ = {false}
- bool enable_ir_optim_ = {true}
- bool use_feed_fetch_ops_ = {true}
- bool ir_debug_ = {false}
- bool specify_input_name_ = {false}
- int cpu_math_library_num_threads_ = {1}
- bool with_profile_ = {false}
- bool with_glog_info_ = {true}
- std::string serialized_info_cache_
- std::unique_ptr<PassStrategy> pass_builder_
- bool use_lite_ = {false}
- std::vector<std::string> lite_passes_filter_
- std::vector<std::string> lite_ops_filter_
- int mkldnn_cache_capacity_ = {0}
- bool use_mkldnn_quantizer_ = {false}
- std::shared_ptr<MkldnnQuantizerConfig> mkldnn_quantizer_config_
- bool is_valid_ = {true}
- std::string opt_cache_dir_
Friends
- friend class ::paddle::AnalysisPredictor