参考模型

时间

The tables in this section contain inference timings for a set of representative models. The quantized models have been imported and compiled offline using SyNAP toolkit. The floating point models are benchmarked for comparison purpose with the corresponding quantized models.

The mobilenet_v1, mobilenet_v2, posenet and inception models are open-source models available in tflite format from TensorFlow Hosted Models page: https://www.tensorflow.org/lite/guide/hosted_models

yolov5 models are available from https://github.com/ultralytics/yolov5, while yolov5_face comes from https://github.com/deepcam-cn/yolov5-face.

Other models come from AI-Benchmark APK: https://ai-benchmark.com/ranking_IoT.html.

Some of the models are Synaptics proprietary, including test models, object detection (mobilenet224), super-resolution and format conversion models.

The model test_64_128x128_5_132_132 has been designed to take maximum advantage of the computational capabilities of the NPU. It has 64 5x5 convolutions with a [1, 128, 128, 132] input and output. Its execution requires 913’519’411’200 operations (0.913 TOPs). Inference time shows that in the right conditions VS640 and SL1640 achieve above 1.6 TOP/s while VS680 and SL1680 able to achieve above 7.9 TOP/s. For 16-bits inference the maximum TOP/s can be achieved with test_64_64x64_5_132_132. With this model we achieve 0.45 TOP/s on VS640/SL1640 and above 1.7 TOP/s on VS680/SL1680. For actual models used in practice it’s very difficut to get close to this level of performance and it’s hard to predict the inference time of a model from the number of operation it contains. The only reliable way is to execute the model on the platform and measure.

Remarks:

In the following tables all timing values are expressed in milliseconds
The columns Online CPU and Online NPU represent the inference time obtained by running the original tflite model directly on the board (online conversion)
Online CPU tests have been done with 4 threads (--num_threads=4) on both vs680 and vs640
Online CPU tests of floating point models on vs640 have been done in fp16 mode (--allow_fp16=true)
Online NPU tests executed with the timvx delegate (--external_delegate_path=libvx_delegate.so)
The Offline Infer column represents the inference time obtained by using a model converted offline using SyNAP toolkit (median time over 10 consecutive inferences)
The Online timings represent the minimum time measured (for both init and inference). We took minimim instead of average because this is measure less sensitive to outliers due to the test process being temporarily suspended by the CPU scheduler
Online timings, in particular for init and CPU inference, can be influenced by other processes running on the board and the total amount of free memory available. We ran all tests on Android AOSP/64bits with 4GB of memory on VS680 and 2GB on VS640. Running on Android GMS or 32-bits OS or with less memory can result in longer init and inference times
Offline tests have been done with non-contiguous memory allocation and no cache flush
Models marked with * come precompiled and preinstalled on the platform

VS680 和 SL1680 的推理时间

These tables show the inference timings for a set of models on VS680 and SL1680. All tests have been done on 64-bits OS with 4GB of memory.

Synaptics models
Model	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
convert_nv12@1920x1080_rgb@1920x1080			17.52	30.81	*
convert_nv12@1920x1080_rgb@224x224			14.25	1.27	*
convert_nv12@1920x1080_rgb@640x360			13.55	5.14	*
sr_fast_y_uv_1280x720_3840x2160	317	32.88	18.09	11.46	*
sr_fast_y_uv_1920x1080_3840x2160	776	50.56	20.40	17.50	*
sr_qdeo_y_uv_1280x720_3840x2160	149	32.49	21.84	20.59	*
sr_qdeo_y_uv_1920x1080_3840x2160	233	38.41	24.11	25.84	*
mobilenet224_full80			66.28	25.17	*
mobilenet224_full1			57.71	14.23	*
test_64_128x128_5_132_132			50.07	119.34

Open models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
inception_v4_299_quant	500.54		13502	17.80	100.79	19.59
mobilenet_v1_0.25_224_quant	3.37		166	0.81	2.61	0.77	*
mobilenet_v2_1.0_224_quant	18.60		854	1.85	6.13	1.79	*
posenet_mobilenet_075_float	34.44	61.78					*
posenet_mobilenet_075_quant	28.60		382	6.01	1.84	2.32
yolov8s-pose					14.61	30.79	*
yolov5m-640x480			6606	113.88	54.11	118.82
yolov5s-640x480			2672	72.27	22.17	75.83
yolov5s_face_640x480_onnx_mq					13.00	31.88	*

AiBenchmark 4 models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
deeplab_v3_plus_quant	231.76		4090	60.73	7.68	59.81
dped_quant	335.63		1019	8.93	4.74	8.82
inception_v3_float	370.95	436.74
inception_v3_quant	267.95		7210	9.47	59.55	10.22
mobilenet_v2_b4_quant	53.54		875	12.50	11.53	13.63
mobilenet_v2_float	20.84	35.40
mobilenet_v2_quant	18.72		886	2.02	9.27	1.98
mobilenet_v3_quant	51.89		1089	9.76	13.15	10.15
pynet_quant	976.61		3175	18.56	24.45	19.30
srgan_quant	1513.86		3517	54.58	14.72	56.95
unet_quant	265.51		487	9.34	7.73	14.80
vgg_quant	1641.18		2177	29.77	10.74	30.07

AiBenchmark 5 models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
crnn_float	240.53	217.89
crnn_quant	113.52		40641	22.54	217.19	23.03
deeplab_v3_plus_float	1181.26	1447.74
deeplab_v3_plus_quant	674.10		2381	97.98	16.68	103.05
dped_float	4043.67	2353.72
dped_instance_float	2236.11	1386.78
dped_instance_quant	3422.17		288	229.41	5.43	953.44
dped_quant	2249.53		3266	196.30	18.70	199.15
efficientnet_b4_float	432.00	579.98
efficientnet_b4_quant	228.86		9649	162.06	54.48	166.33
esrgan_float	1445.08	1522.63
esrgan_quant	770.69		2119	93.78	5.08	101.56
imdn_float	2553.12	2382.78
imdn_quant	1350.97		3215	165.92	8.63	155.46
inception_v3_float	371.39	437.60
inception_v3_quant	221.61		7254	10.26	76.98	11.38
mobilenet_v2_b8_float	152.22	203.15
mobilenet_v2_b8_quant	91.44		889	25.91	14.69	27.18
mobilenet_v2_float	20.98	36.08
mobilenet_v2_quant	12.28		968	2.11	10.04	2.07
mobilenet_v3_b4_float	351.62	461.38
mobilenet_v3_b4_quant	359.39		1497	97.86	20.36	101.09
mobilenet_v3_float	91.33	114.77
mobilenet_v3_quant	97.05		1706	19.82	15.14	20.92
mv3_depth_float	132.65	194.37
mv3_depth_quant	218.06		1513	71.09	15.74	90.96
punet_float	2612.87	1796.33
punet_quant	1660.59		2019	155.60	14.06	149.79
pynet_float	2836.85	1620.06
pynet_quant	2100.18		3441	137.39	15.80	135.94
resnet_float	0.10	2.86
resnet_quant	0.41		132	0.13	3.95	0.12
srgan_float	6192.96	2921.47
srgan_quant	4220.89		12224	200.90	29.85	208.23
unet_float	2909.00	2132.16
unet_quant	1710.41		775	69.08	19.11	95.29
vsr_float	820.35	974.12
vsr_quant	580.30		2124	155.28	20.45	133.86
xlsr_float	518.61	532.46
xlsr_quant	470.63		1700	36.20	3.93	31.38
yolo_v4_tiny_float	187.81	157.75
yolo_v4_tiny_quant	311.65		1406	6.62	4.69	6.03

VS640 和 SL1640 的推理时间

These tables show the inference timings for a set of models on VS640 and SL1640. All tests have been done on 64-bits OS with 2GB of memory.

Synaptics models
Model	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
convert_nv12@1920x1080_rgb@1920x1080			17.48	34.49	*
convert_nv12@1920x1080_rgb@224x224			15.14	1.25	*
convert_nv12@1920x1080_rgb@640x360			14.70	5.29	*
sr_fast_y_uv_1280x720_3840x2160	274	53.03	17.87	17.01	*
sr_fast_y_uv_1920x1080_3840x2160	524	86.39	20.35	25.90	*
sr_qdeo_y_uv_1280x720_3840x2160			20.33	26.16	*
sr_qdeo_y_uv_1920x1080_3840x2160			22.03	33.56	*
mobilenet224_full80			718.96	52.98	*
mobilenet224_full1			595.52	36.53	*
test_64_128x128_5_132_132			63.95	563.81

Open models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
inception_v4_299_quant	255.94		21481	54.07	127.13	53.82
mobilenet_v1_0.25_224_quant	2.80		244	1.00	4.99	0.93	*
mobilenet_v2_1.0_224_quant	12.31		1203	2.40	14.21	2.31	*
posenet_mobilenet_075_float	27.96	90.06					*
posenet_mobilenet_075_quant	18.70		565	9.76	2.48	4.13
yolov8s-pose					20.66	54.59	*
yolov5m-640x480			10657	175.90	60.64	178.00
yolov5s-640x480			4264	101.73	24.90	103.36
yolov5s_face_640x480_onnx_mq					27.31	59.63	*

AiBenchmark 4 models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
deeplab_v3_plus_quant	158.81		3679	82.47	8.31	70.85
dped_quant	134.25		1694	25.84	6.71	25.72
inception_v3_float	229.07	706.98
inception_v3_quant	146.25		11130	30.21	80.70	29.82
mobilenet_v2_b4_quant	47.85		1373	18.95	13.60	18.39
mobilenet_v2_float	18.27	52.41
mobilenet_v2_quant	12.44		1282	2.57	13.81	2.44
mobilenet_v3_quant	47.99		1593	12.57	16.38	11.91
pynet_quant	447.61		4803	57.11	31.04	56.30
srgan_quant	829.17		5232	121.92	15.97	121.75
unet_quant	159.01		745	18.58	9.93	24.20
vgg_quant	572.70		3258	103.66	10.65	102.65

AiBenchmark 5 models
Model	Online CPU Infer	Online GPU Infer	Online NPU Init	Online NPU Infer	Offline NPU Init	Offline NPU Infer
crnn_float	140.70	352.45
crnn_quant	87.40		70679	33.96	284.46	33.66
deeplab_v3_plus_float	976.50	2465.12
deeplab_v3_plus_quant	580.25		3889	137.92	19.01	133.41
dped_float	2197.13	4492.49
dped_instance_float	1458.77	2556.78
dped_instance_quant	3565.00		438	366.10
dped_quant	1166.57		4025	340.13	14.97	326.37
efficientnet_b4_float	396.99	945.97
efficientnet_b4_quant	266.18		15393	202.61	74.93	200.67
esrgan_float	971.15	2632.86
esrgan_quant	501.45		2394	147.30	5.28	147.75
imdn_float	1596.76	4181.09
imdn_quant	1123.80		5018	281.54	5.57	269.71
inception_v3_float	228.96	724.73
inception_v3_quant	157.53		11300	30.83	96.69	30.12
mobilenet_v2_b8_float	132.81	338.71
mobilenet_v2_b8_quant	88.94		1395	39.21	15.75	38.26
mobilenet_v2_float	18.09	52.04
mobilenet_v2_quant	11.68		1395	2.69	14.11	2.58
mobilenet_v3_b4_float	327.02	765.60
mobilenet_v3_b4_quant	364.18		2358	136.42	22.52	135.36
mobilenet_v3_float	90.62	181.89
mobilenet_v3_quant	100.55		2586	24.74	20.79	23.85
mv3_depth_float	189.75	331.05
mv3_depth_quant	301.72		2307	80.36	18.43	98.45
punet_float	1856.32	3173.80
punet_quant	1572.03		2736	259.45	11.74	249.44
pynet_float	2747.17	2833.27
pynet_quant	2126.74		5113	282.27	17.12	275.28
resnet_float	0.08	2.16
resnet_quant	0.04		205	0.17	5.09	0.23
srgan_float	3295.12	5420.38
srgan_quant	1740.11		17125	423.24	29.30	420.17
unet_float	1726.05	3776.94
unet_quant	1315.86		1089	155.08	14.68	195.52
vsr_float	680.60	1905.83
vsr_quant	621.03		1732	200.22	11.42	156.74
xlsr_float	595.84	987.90
xlsr_quant	570.30		2063	42.01	3.27	41.15
yolo_v4_tiny_float	125.44	254.04
yolo_v4_tiny_quant	321.50		2064	13.68	8.08	12.73

超分辨率

Synaptics provides two proprietary families of super resolution models: fast and qdeo, the former provides better inference time, the latter better upscaling quality. They can be tested using synap_cli_ip application, see synap_cli_ip 应用.

These models are preinstalled in $MODELS/image_processing/super_resolution .

Synaptics SuperResolution Models on Y+UV Channels
Name	Input Image	Ouput Image	Factor
sr_fast_y_uv_960x540_3840x2160	960x540	3840x2160	4
sr_fast_y_uv_1280x720_3840x2160	1280x720	3840x2160	3
sr_fast_y_uv_1920x1080_3840x2160	1920x1080	3840x2160	2
sr_qdeo_y_uv_960x540_3840x2160	960x540	3840x2160	4
sr_qdeo_y_uv_1280x720_3840x2160	1280x720	3840x2160	3
sr_qdeo_y_uv_1920x1080_3840x2160	1920x1080	3840x2160	2
sr_qdeo_y_uv_640x360_1920x1080	640x360	1920x1080	3

格式转换

Conversion models can be used to convert an image from NV12 format to RGB. A set of models is provided for the most commonly used resolutions. These models have been generated by taking advantage of the preprocessing feature of the SyNAP toolkit (see 预处理) and can be used to convert an image so that it can be fed to a processing model with RGB input.

These models are preinstalled in $MODELS/image_processing/preprocess and can be tested using synap_cli_ic2 application, see synap_cli_ic2 应用.

Synaptics Conversion Models NV12 to RGB 224x224
Name	Input Image (NV12)	Ouput Image (RGB)
convert_nv12@426x240_rgb@224x224	426x240	224x224
convert_nv12@640x360_rgb@224x224	640x360	224x224
convert_nv12@854x480_rgb@224x224	854x480	224x224
convert_nv12@1280x720_rgb@224x224	1280x720	224x224
convert_nv12@1920x1080_rgb@224x224	1920x1080	224x224
convert_nv12@2560x1440_rgb@224x224	2560x1440	224x224
convert_nv12@3840x2160_rgb@224x224	3840x2160	224x224
convert_nv12@7680x4320_rgb@224x224	7680x4320	224x224

Synaptics Conversion Models NV12 to RGB 640x360
Name	Input Image (NV12)	Ouput Image (RGB)
convert_nv12@426x240_rgb@640x360	426x240	640x360
convert_nv12@640x360_rgb@640x360	640x360	640x360
convert_nv12@854x480_rgb@640x360	854x480	640x360
convert_nv12@1280x720_rgb@640x360	1280x720	640x360
convert_nv12@1920x1080_rgb@640x360	1920x1080	640x360
convert_nv12@2560x1440_rgb@640x360	2560x1440	640x360
convert_nv12@3840x2160_rgb@640x360	3840x2160	640x360
convert_nv12@7680x4320_rgb@640x360	7680x4320	640x360

Synaptics Conversion Models NV12 to RGB 1920x1080
Name	Input Image (NV12)	Ouput Image (RGB)
convert_nv12@426x240_rgb@1920x1080	426x240	1920x1080
convert_nv12@640x360_rgb@1920x1080	640x360	1920x1080
convert_nv12@854x480_rgb@1920x1080	854x480	1920x1080
convert_nv12@1280x720_rgb@1920x1080	1280x720	1920x1080
convert_nv12@1920x1080_rgb@1920x1080	1920x1080	1920x1080
convert_nv12@2560x1440_rgb@1920x1080	2560x1440	1920x1080
convert_nv12@3840x2160_rgb@1920x1080	3840x2160	1920x1080
convert_nv12@7680x4320_rgb@1920x1080	7680x4320	1920x1080