@gaozhangfei gaozhangfei commented Jul 24, 2025

Add nosva support

design doc
[Tencent Docs] nosva
https://docs.qq.com/doc/DRVBtbVh6TkdaTFVB

++ echo 'test zip, sync'
++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --opt 0 --sync --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 316882.67KiB/s 316.9Kops 290.67%

++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --user --opt 0 --sync --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 498163.67KiB/s 498.2Kops 299.67%

++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --sgl --opt 0 --sync --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 417162.33KiB/s 417.2Kops 299.67%

++ echo 'test zip, async'
++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --opt 0 --async --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 415351.67KiB/s 415.4Kops 312.33%

++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --user --opt 0 --async --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 681221.00KiB/s 681.2Kops 376.00%

++ ./uadk_tool/uadk_tool benchmark --alg zlib --mode sva --sgl --opt 0 --async --pktlen 1024
algname: length: perf: iops: CPU_rate:
zlib 1024Bytes 491375.00KiB/s 491.4Kops 320.33%

digest:
uadk_tool benchmark --alg aes-128-ecb --mode sva --opt 0 \
--sync --pktlen 1024 --seconds 1 --multi 1 --thread 1
uadk_tool benchmark --alg aes-128-ecb --mode sva --opt 0 \
--async --pktlen 1024 --seconds 1 --multi 1 --thread 1

cipher:
uadk_tool benchmark --alg sm4-128-cbc --mode sva --opt 0 \
--sync --pktlen 1024 --seconds 1 --multi 1 --thread 1
uadk_tool benchmark --alg sm4-128-cbc --mode sva --opt 0 \
--async --pktlen 1024 --seconds 1 --multi 1 --thread 1

Remove the nosva limitation so that nosva can run

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
@gaozhangfei gaozhangfei force-pushed the nosva-patch-7.24 branch 4 times, most recently from 8e83665 to 589e0af Compare August 1, 2025 01:56
Add new APIs:
	wd_blkpool_new;
	wd_blkpool_delete;
	wd_blkpool_phy;
	wd_blkpool_alloc;
	wd_blkpool_free;
	wd_blkpool_setup;
	wd_blkpool_destroy_mem;
	wd_blkpool_create_sglpool;
	wd_blkpool_destroy_sglpool;

After the blkpool is set up, the app only uses two APIs:
	wd_blkpool_alloc;
	wd_blkpool_free;
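
For readers unfamiliar with block pools, here is a minimal sketch of the setup/alloc/free pattern the list above describes. It is not the wd_blkpool implementation: every `demo_` name is hypothetical, the backing memory comes from plain malloc rather than DMA-able memory, and the real signatures in this PR may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size block pool; not the UADK wd_blkpool code. */
struct demo_blkpool {
	uint8_t *base;       /* start of the contiguous backing region */
	size_t blk_size;     /* size of every block in bytes */
	size_t blk_num;      /* total number of blocks */
	size_t free_top;     /* number of entries on the free stack */
	size_t *free_stack;  /* indices of the blocks that are free */
};

struct demo_blkpool *demo_blkpool_setup(size_t blk_size, size_t blk_num)
{
	struct demo_blkpool *pool;
	size_t i;

	/* Reject empty pools and sizes that would overflow size_t. */
	if (!blk_size || !blk_num || blk_size > SIZE_MAX / blk_num ||
	    blk_num > SIZE_MAX / sizeof(size_t))
		return NULL;

	pool = calloc(1, sizeof(*pool));
	if (!pool)
		return NULL;

	pool->base = malloc(blk_size * blk_num);
	pool->free_stack = malloc(blk_num * sizeof(*pool->free_stack));
	if (!pool->base || !pool->free_stack) {
		free(pool->base);
		free(pool->free_stack);
		free(pool);
		return NULL;
	}

	pool->blk_size = blk_size;
	pool->blk_num = blk_num;
	/* Push indices in reverse so that block 0 is handed out first. */
	for (i = 0; i < blk_num; i++)
		pool->free_stack[i] = blk_num - 1 - i;
	pool->free_top = blk_num;

	return pool;
}

/* Pop one free block, or return NULL when the pool is exhausted. */
void *demo_blkpool_alloc(struct demo_blkpool *pool)
{
	if (!pool || !pool->free_top)
		return NULL;
	return pool->base + pool->free_stack[--pool->free_top] * pool->blk_size;
}

/* Push a block back; pointers that are not block starts are ignored. */
void demo_blkpool_free(struct demo_blkpool *pool, void *blk)
{
	uintptr_t addr = (uintptr_t)blk;
	uintptr_t base;
	size_t off;

	if (!pool || !blk)
		return;
	base = (uintptr_t)pool->base;
	if (addr < base)
		return;
	off = addr - base;
	if (off % pool->blk_size || off / pool->blk_size >= pool->blk_num ||
	    pool->free_top == pool->blk_num)
		return;
	pool->free_stack[pool->free_top++] = off / pool->blk_size;
}

void demo_blkpool_destroy(struct demo_blkpool *pool)
{
	if (!pool)
		return;
	free(pool->base);
	free(pool->free_stack);
	free(pool);
}
```

The free stack keeps alloc and free at O(1). The sketch does not detect double frees and is not thread-safe; a real pool shared by app and driver would need both.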

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
At init, every ctx calls wd_blkpool_new.
Only the nosva case gets a valid pointer; the sva case gets NULL.

At uninit, the blkpool and its related resources are deleted.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Add the API wd_comp_setup_blkpool.
The other alg.c files will need a wd_xxx_setup_blkpool as well,
because the app does not know the ctx.

It sets up the blkpool for ctx[0], plus an sglpool for sgl mode.
The blkpool is used by both the app and the driver.

The app needs to call wd_xxx_setup_blkpool for user pointer mode and sgl
mode. The returned blkpool is then used for wd_blkpool_alloc/free.

alloc_sess calls wd_xxx_setup_blkpool if the app has not called it.
In that case the uadk library allocates from the blkpool itself and
memcpys to and from user memory, which gives poorer performance.

The driver translates va to pa when it configures the registers.
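
As an illustration of the va to pa step, here is a minimal sketch assuming the pool is a single physically contiguous region, so translation reduces to a bounds check plus an offset. `struct demo_region`, `demo_va_to_pa` and `DEMO_PA_INVALID` are hypothetical names; the actual lookup in this PR (presumably behind wd_blkpool_phy) may work differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one physically contiguous pool region. */
struct demo_region {
	uintptr_t va_base;  /* CPU virtual address of the region start */
	uint64_t pa_base;   /* physical (DMA) address of the region start */
	size_t size;        /* region length in bytes */
};

#define DEMO_PA_INVALID UINT64_MAX

/*
 * Translate a CPU pointer into the address the hardware would be given.
 * Returns DEMO_PA_INVALID if [va, va + len) is not fully inside the region,
 * which is the case for ordinary malloc memory that never came from the pool.
 */
uint64_t demo_va_to_pa(const struct demo_region *r, const void *va, size_t len)
{
	uintptr_t addr = (uintptr_t)va;
	size_t off;

	if (!r || addr < r->va_base)
		return DEMO_PA_INVALID;
	off = addr - r->va_base;
	if (off > r->size || len > r->size - off)
		return DEMO_PA_INVALID;
	return r->pa_base + off;
}
```

The failure path is the interesting part: a buffer outside the pool cannot be handed to the hardware in the nosva case, which is why such buffers have to go through the memcpy path instead.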

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Support sync: memcpy, user pointer and sgl modes
Support async: memcpy and user pointer modes

For flat memory:
If the user calls wd_xxx_setup_blkpool, user pointer mode is used.
uadk uses the app's pointer directly, assumes it is contiguous memory,
and translates va to pa when configuring the registers.

Otherwise, alloc_sess sets up the blkpool and memcpy mode is used.
wd_comp allocates contiguous memory for the hardware, memcpys from the
src pointer into it, and memcpys the results to the dst pointer.
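
The memcpy mode described above is a classic bounce-buffer pattern. The following is a rough sketch under stated assumptions: `demo_hw_op` stands in for the hardware job and `demo_hw_copy` merely copies its input, so none of these names come from wd_comp.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a hardware job working on DMA-able buffers. */
typedef size_t (*demo_hw_op)(const void *in, size_t in_len,
			     void *out, size_t out_cap);

/* Trivial "hardware" used for illustration: copies input to output. */
size_t demo_hw_copy(const void *in, size_t in_len, void *out, size_t out_cap)
{
	size_t n = in_len < out_cap ? in_len : out_cap;

	memcpy(out, in, n);
	return n;
}

/*
 * memcpy mode: stage src in the bounce buffer, run the job on the bounce
 * buffers, then copy the result out to dst. Returns 0 on success, -1 on
 * bad arguments or when a buffer is too small.
 */
int demo_bounce_run(demo_hw_op op, void *bounce_in, void *bounce_out,
		    size_t bounce_cap, const void *src, size_t src_len,
		    void *dst, size_t dst_cap, size_t *dst_len)
{
	size_t produced;

	if (!op || !bounce_in || !bounce_out || !src || !dst || !dst_len)
		return -1;
	if (src_len > bounce_cap)
		return -1;

	memcpy(bounce_in, src, src_len);      /* extra copy 1: app to pool */
	produced = op(bounce_in, src_len, bounce_out, bounce_cap);
	if (produced > bounce_cap || produced > dst_cap)
		return -1;
	memcpy(dst, bounce_out, produced);    /* extra copy 2: pool to app */
	*dst_len = produced;
	return 0;
}
```

The two marked copies are the cost that user pointer mode avoids, which is consistent with the gap between the plain and `--user` benchmark runs in this PR.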

For sgl memory:
The app has to call wd_xxx_setup_blkpool.
wd_datalist.data has to use contiguous memory.
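
The rules for flat and sgl memory can be condensed into one decision function. This is only a sketch of the rules as stated in this commit message: the enum and `demo_pick_mode` are hypothetical, and the sva branch is an assumption based on sva ctxs getting no blkpool.

```c
#include <stdbool.h>

/* Hypothetical labels for the data paths described in this PR. */
enum demo_mem_mode {
	DEMO_MODE_SVA,       /* sva: no blkpool, hardware uses app addresses */
	DEMO_MODE_MEMCPY,    /* nosva flat: library pool plus two memcpys */
	DEMO_MODE_USER_PTR,  /* nosva flat: buffers come from the app's blkpool */
	DEMO_MODE_SGL,       /* nosva sgl: app's blkpool plus sglpool */
	DEMO_MODE_INVALID,   /* nosva sgl without an app blkpool */
};

/*
 * sva:               device supports SVA (assumed to bypass the blkpool)
 * app_setup_blkpool: app called wd_xxx_setup_blkpool before alloc_sess
 * sgl:               request uses a wd_datalist instead of flat buffers
 */
enum demo_mem_mode demo_pick_mode(bool sva, bool app_setup_blkpool, bool sgl)
{
	if (sva)
		return DEMO_MODE_SVA;
	if (sgl)
		return app_setup_blkpool ? DEMO_MODE_SGL : DEMO_MODE_INVALID;
	return app_setup_blkpool ? DEMO_MODE_USER_PTR : DEMO_MODE_MEMCPY;
}
```

Only the flat case has a fallback. For sgl there is no memcpy path, so a missing wd_xxx_setup_blkpool call is treated as an error here.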

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
sync:
memcpy mode
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--opt 0 --sync --pktlen 1024

user pointer mode, --user
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--user --opt 0 --sync --pktlen 1024

sgl mode, --sgl
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--sgl --opt 0 --sync --pktlen 1024

async:
memcpy mode
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--opt 0 --async --pktlen 1024

user pointer mode, --user
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--user --opt 0 --async --pktlen 1024

sgl mode, --sgl
./uadk_tool/uadk_tool benchmark --alg zlib --mode sva \
	--sgl --opt 0 --async --pktlen 1024

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
This patch addresses the SGL (Scatter-Gather List) next pointer
handling issue in both SVA and non-SVA scenarios.

Background:
- A single SGL can typically cover 255 SGEs (Scatter-Gather Elements) * 8MB
- Multi-SGL cases require proper next pointer handling

Implementation details:
1. In SVA (Shared Virtual Addressing) case:
   - hisi_sgl.next_dma serves both hardware and CPU access
   - Address translation is handled automatically by the IOMMU

2. In non-SVA case:
   - hisi_sgl.next_dma contains DMA address for hardware use only
   - CPU cannot directly use next_dma to access SGL members
   - Solution: Reuse one pad1 field as hisi_sgl.next for CPU access

The modification maintains the original 64B hardware SGL header size while
adding proper CPU-accessible pointer support.
This ensures correct operation in both single and multi-SGL scenarios.
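
To make the layout idea concrete, here is a sketch of an SGL header that keeps a hardware-only `next_dma` next to a CPU-usable `next` pointer stored in a former padding slot, with a compile-time check that the header stays 64 bytes. `struct demo_sgl` and its field layout are hypothetical and do not reproduce the real hisi_sgl definition. The capacity helper reads the "255 SGEs * 8MB" figure as 255 entries of at most 8 MiB each, which is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_SGE_PER_SGL   255u
#define DEMO_SGE_MAX_BYTES (8ull << 20)  /* 8 MiB per SGE, as read from the text */

/* Hypothetical 64-byte SGL header; not the real hisi_sgl layout. */
struct demo_sgl {
	uint64_t next_dma;   /* DMA address of the next SGL, for hardware only */
	uint16_t counts[4];  /* placeholder for the entry counters */
	uint64_t pad[5];     /* padding still reserved for hardware */
	union {              /* one former padding slot, reused for the CPU */
		struct demo_sgl *next;
		uint64_t next_slot;  /* keeps the slot 8 bytes on 32-bit builds */
	};
};

_Static_assert(sizeof(struct demo_sgl) == 64,
	       "SGL header must stay 64 bytes for the hardware");

/* Link two SGLs: hardware follows next_dma, the CPU follows next. */
void demo_sgl_link(struct demo_sgl *cur, struct demo_sgl *nxt, uint64_t nxt_dma)
{
	cur->next_dma = nxt_dma;
	cur->next = nxt;
}

/* Walk the chain with the CPU pointer only; next_dma is never dereferenced. */
size_t demo_sgl_chain_len(const struct demo_sgl *head)
{
	size_t n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/* Upper bound of data one SGL can describe under the assumptions above. */
uint64_t demo_sgl_capacity_bytes(void)
{
	return (uint64_t)DEMO_SGE_PER_SGL * DEMO_SGE_MAX_BYTES;
}
```

Under these assumptions one SGL covers 255 * 8 MiB = 2040 MiB, so anything larger needs a chain. In the non-SVA case that chain can only be walked by the CPU through `next`.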

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
test:

digest:
uadk_tool benchmark --alg aes-128-ecb --mode sva --opt 0 \
--sync --pktlen 1024 --seconds 1 --multi 1 --thread 1
uadk_tool benchmark --alg aes-128-ecb --mode sva --opt 0 \
--async --pktlen 1024 --seconds 1 --multi 1 --thread 1

cipher:
uadk_tool benchmark --alg sm4-128-cbc --mode sva --opt 0 \
--sync --pktlen 1024 --seconds 1 --multi 1 --thread 1
uadk_tool benchmark --alg sm4-128-cbc --mode sva --opt 0 \
--async --pktlen 1024 --seconds 1 --multi 1 --thread 1

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>