Colon Polyp Segmentation Model Based on Frequency Feature Enhancement

doi:10.12677/mos.2024.133327

Modeling and Simulation
Vol. 13 No. 03 ( 2024 ), Article ID: 88391 , 14 pages

Junhao Liu, Qi Hu

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai

Received: Apr. 24, 2024; accepted: May 23, 2024; published: May 31, 2024

ABSTRACT

Colorectal cancer mainly originates from mutated polyp tissue, so early polyp examination can effectively reduce its incidence, and automatically and accurately assisting doctors in polyp screening is of great clinical significance. However, existing deep learning-based methods do not fully cope with the variation in color, brightness, and complexity of polyp images, which makes it difficult to accurately determine polyp location and edges. To address these problems, we use the Fourier transform from traditional digital image processing to propose a frequency feature enhancement network (FFENet) for polyp segmentation. FFENet introduces a detail feature enhancement attention (DFEA) module and a global frequency feature learning module (GFFLM), both based on frequency features. DFEA recouples features from different depths and computes three types of saliency feature maps to refine the polyp region and boundary. GFFLM incorporates a learnable filtering kernel in the frequency domain to enhance the coherence between the polyp body and its boundary and to capture long-range dependencies between image pixels. Combined with an enhanced partial decoder and an adaptive feature selection module, our method performs well on small polyps and in complex environments. Extensive experiments on five public datasets demonstrate the superiority of the proposed method over several state-of-the-art (SOTA) methods. In particular, on the ETIS dataset the large model FFENet-L improves Dice and IoU by 4% and 5.5%, respectively, over other SOTA models, while the small model FFENet-S maintains comparable accuracy with only 6.2M parameters.

Keywords: Colorectal Cancer, Polyp Segmentation, Deep Learning, Fourier Transform

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

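1. Introduction

Current research indicates that colorectal cancer is one of the most common malignant tumors worldwide, ranking between second and fourth in incidence, and that its incidence is rising. An estimated 40% of individuals over 50 years of age have at least one adenomatous polyp, and 95% of colorectal cancers originate from the malignant transformation of such polyps. Early detection and screening of polyps therefore play a key role in reducing the incidence of colorectal cancer [1] [2]. Colonoscopy is the gold standard for detecting and screening colorectal polyps [3]. However, manual detection during colonoscopy suffers from a high rate of missed and incorrect diagnoses, especially for polyps smaller than 10 mm [4], and detection accuracy depends heavily on the physician's expertise and subjective judgment. Computer-vision techniques that help physicians identify potential precancerous lesions are therefore of great importance.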

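Because colon polyps vary widely in shape and size, visually assessing them during colonoscopy is a considerable challenge [5]. Segmentation performance is also affected by variable image quality, including chromaticity, brightness, and the cleanliness of the bowel. Many state-of-the-art (SOTA) deep learning methods for polyp segmentation adopt strategies such as multi-scale feature fusion, contextual feature aggregation, and attention mechanisms to improve accuracy. Works such as [6] - [11] strengthen segmentation by computing reverse feature maps or by adding foreground and uncertain-region feature maps. However, computing saliency feature maps by image thresholding can blur some boundary details, causing edge information to be lost or irrelevant data to be introduced during training. Moreover, these methods attend only to the edge and part of the body, which leaves the body and edge information inconsistent and degrades segmentation performance [12] [13]. To better integrate body and edge information, [12] [13] [14] propose decoupling the body and edge information of an image. These methods, however, optimize the two decoupled parts separately, which leads to feature aliasing when the body and edge parts are coupled again.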

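To address these problems, this paper proposes a novel detail feature enhancement attention (DFEA) module. It uses the Fourier transform to decouple low- and high-frequency information in the frequency domain, obtaining separate body and edge information, and applies a dedicated optimization to further strengthen the edge representation. The processed deep body component is then combined with the shallow edge component to produce an enhanced detail feature map. Fusing features across levels according to their frequency relationship strengthens the consistency between polyp body and edge information, improves the edge information, and lowers the misidentification rate of the polyp body. In addition, we design a frequency-based feature learning module, the global frequency feature learning module (GFFLM), which applies a learnable global filter in the frequency domain to capture both long-range and short-range interactions. Notably, GFFLM can also play the role of a self-attention mechanism, with lower computational complexity and fewer parameters than self-attention.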

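2. Method

2.1. Overall Architecture

FFENet is built on PraNet [10], as shown in Figure 1. It consists of the following modules: a backbone network (Backbone), a partial decoder (PD), detail feature enhancement attention (DFEA), an adaptive feature selection module (AFSM), and a global frequency feature learning module (GFFLM).

Figure 1. The overall architecture of our proposed FFENet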

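2.2. Frequency Feature Learning Modules

The proposed DFEA and GFFLM are built on the discrete Fourier transform (DFT). Analyzing and processing images in the frequency domain with the DFT is a fundamental tool of image processing and the basis of many other methods. Given a uniformly sampled sequence $x[n]$ with $N$ samples, its one-dimensional DFT $X[k]$ and the inverse transform $x[n]$ are given by equations (1) and (2), where $W_{N}^{kn} = e^{-j 2 \pi k n / N}$ and $W_{N}^{-kn} = e^{j 2 \pi k n / N}$ are complex conjugates of each other and $j$ is the imaginary unit.

$X[k] = \sum_{n = 0}^{N - 1} x[n] W_{N}^{kn}, \quad k \in \{0, \dots, N - 1\}$ (1)

$x[n] = \frac{1}{N} \sum_{k = 0}^{N - 1} X[k] W_{N}^{-kn}, \quad n \in \{0, \dots, N - 1\}$ (2)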
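Equations (1) and (2) can be checked numerically with a direct evaluation. The following is a minimal sketch using only the Python standard library; the function names `dft` and `idft` and the sample signal are illustrative choices, not part of the original work, and the direct sums are the O(N²) form rather than a fast algorithm.

```python
import cmath


def dft(x):
    """Direct evaluation of equation (1): X[k] = sum_n x[n] * W_N^{kn}."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def idft(X):
    """Direct evaluation of equation (2): x[n] = (1/N) * sum_k X[k] * W_N^{-kn}."""
    n_pts = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


signal = [1.0, 2.0, 0.0, -1.0]   # arbitrary example sequence
spectrum = dft(signal)           # spectrum[0] is the sum of the samples
recovered = idft(spectrum)       # matches `signal` up to rounding error
```

Applying `idft` to the output of `dft` returns the original sequence, which is the property the later modules rely on when moving features between the spatial and frequency domains.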

Thanks to the complex-conjugate property of the twiddle factors, the DFT can be computed with a fast algorithm that reduces its complexity from $O(N^{2})$ to $O(N \log N)$, which greatly lowers the cost of the frequency transform. In some cases it can even be faster than conventional convolution, which makes frequency feature learning practical. Extending the one-dimensional DFT to two dimensions yields the DFT of a 2D image. Given a 2D image $x[m, n]$, its 2D DFT and inverse transform are given by equations (3) and (4), where $W_{M}^{m \mu} = e^{-j 2 \pi m \mu / M}$ and $W_{M}^{-m \mu} = e^{j 2 \pi m \mu / M}$, and likewise $W_{N}^{n \nu} = e^{-j 2 \pi n \nu / N}$ and $W_{N}^{-n \nu} = e^{j 2 \pi n \nu / N}$, form complex-conjugate pairs, $j$ is the imaginary unit, and $M$ and $N$ are the width and height of the image.

$X[\mu, \nu] = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} x[m, n] W_{M}^{m \mu} W_{N}^{n \nu}, \quad \mu \in \{0, \dots, M - 1\}, \ \nu \in \{0, \dots, N - 1\}$ (3)

$x[m, n] = \frac{1}{M N} \sum_{\mu = 0}^{M - 1} \sum_{\nu = 0}^{N - 1} X[\mu, \nu] W_{M}^{-m \mu} W_{N}^{-n \nu}, \quad m \in \{0, \dots, M - 1\}, \ n \in \{0, \dots, N - 1\}$ (4)

Figure 2. The architecture of detail feature enhancement attention (DFEA)

2.2.1. Detail Feature Enhancement Attention

Figure 2 shows the structure of DFEA. The module first transforms the feature $f_{backbone}$ from the Backbone, together with either the feature $f_{pd}$ from the PD or the feature $f_{gfflm}$ from the preceding GFFLM, into the frequency domain, as in equation (5). Here $f_{low} \in \mathbb{R}^{B \times C \times H \times W}$ and $F_{low} \in \mathbb{C}^{B \times C \times H \times W'}$ ($W' = \lceil W / 2 \rceil$) denote $f_{backbone}$ and its frequency feature, $f_{high} \in \mathbb{R}^{B \times C \times H \times W}$ and $F_{high} \in \mathbb{C}^{B \times C \times H \times W'}$ denote $f_{pd}$ or $f_{gfflm}$ and its frequency feature, and $F[\cdot]$ denotes the 2D DFT of equation (3).

$F_{low} = F[f_{low}], \quad F_{high} = F[f_{high}]$ (5)

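The low-frequency components of an image mainly carry body information, whereas the high-frequency components carry more edge information [14] [15]. Figure 3 shows that as the network deepens and the feature scale shrinks, the model focuses more on deep semantic information, i.e. the polyp body (low frequency), while the shallow layers contain richer edge information (high frequency). By recoupling high- and low-frequency information from different levels, the model can better delineate polyp edges and reduce localization errors. The module therefore uses frequency filters to separate the frequency features, retaining the low-frequency signal of the deep features and the high-frequency signal of the shallow features. The frequency-domain filter kernels are given by equations (6) and (7), where $fl_{low} \in \mathbb{C}^{1 \times 1 \times H \times W'}$ and $fl_{high} \in \mathbb{C}^{1 \times 1 \times H \times W'}$ are the low-pass and high-pass filters in the frequency domain, respectively.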

$fl_{low} = \begin{cases} 1, & \frac{H}{4} \leq h < \frac{3 H}{4}, \ \frac{W'}{2} \leq w < \frac{5 W'}{8} \\ 0, & \text{otherwise} \end{cases}$ (6)

$fl_{high} = \begin{cases} 0, & \frac{H}{4} \leq h < \frac{3 H}{4}, \ \frac{W'}{2} \leq w < \frac{5 W'}{8} \\ 1, & \text{otherwise} \end{cases}$ (7)

Figure 3. Visualization of feature maps between different stages. (a) is Stage 1 with an image scale of 44 × 44, (b) is Stage 2 with an image scale of 22 × 22, and (c) is Stage 3 with an image scale of 11 × 11
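The two masks of equations (6) and (7) are complementary binary maps over the $H \times W'$ spectrum. The sketch below builds them as nested lists, assuming zero-based indices $h$ and $w$; the function name `frequency_masks` is hypothetical, and how the spectrum is laid out (for example, whether it is shifted before masking) is not stated in the text, so the code only reproduces the index ranges as written.

```python
def frequency_masks(height, width_half):
    """Build the binary masks of equations (6) and (7).

    fl_low is 1 inside the rectangle H/4 <= h < 3H/4, W'/2 <= w < 5W'/8
    and 0 elsewhere; fl_high is its complement.
    """
    fl_low = [
        [
            1.0
            if (height / 4 <= h < 3 * height / 4
                and width_half / 2 <= w < 5 * width_half / 8)
            else 0.0
            for w in range(width_half)
        ]
        for h in range(height)
    ]
    fl_high = [[1.0 - value for value in row] for row in fl_low]
    return fl_low, fl_high


# Small illustrative size: H = 8 rows and W' = 8 frequency columns.
fl_low, fl_high = frequency_masks(height=8, width_half=8)
```

Because the two masks sum to one at every position, filtering the same spectrum with both and adding the results would return the original spectrum; DFEA instead applies them to features from different depths.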

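High-pass filtering is then performed by element-wise multiplication, and the result is transformed back to the spatial domain, as in equations (8) and (9). Here $⊙$ denotes element-wise multiplication, $F'_{high} \in \mathbb{C}^{B \times C \times H \times W'}$ is the frequency feature $F_{high}$ after high-pass filtering, and $F^{-1}[\cdot]$ denotes the inverse 2D DFT of equation (4).

$F'_{high} = F_{high} ⊙ fl_{high}$ (8)

$f'_{high} = F^{-1}[F'_{high}]$ (9)

The high-frequency edge information is then enhanced as in equation (10), where $f'_{high} \in \mathbb{R}^{B \times C \times H \times W}$ on the left-hand side denotes the enhanced high-frequency spatial feature and $Sigmoid(x) = 1 / (1 + e^{-x})$.

$f'_{high} = (0.5 - abs(Sigmoid(f'_{high}) - 0.5)) ⊙ f'_{high}$ (10)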
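To see what the weighting in equation (10) does, the sketch below applies it element-wise to a few scalar responses. The helper names `sigmoid` and `enhance` and the sample values are illustrative assumptions; the actual module operates on full feature tensors.

```python
import math


def sigmoid(x):
    """Logistic function, Sigmoid(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def enhance(values):
    """Apply equation (10) element-wise: (0.5 - |Sigmoid(v) - 0.5|) * v."""
    return [(0.5 - abs(sigmoid(v) - 0.5)) * v for v in values]


responses = [-4.0, -0.5, 0.0, 0.5, 4.0]   # arbitrary high-frequency responses
weights = [0.5 - abs(sigmoid(v) - 0.5) for v in responses]
enhanced = enhance(responses)
```

The weight equals 0.5 where the sigmoid output is exactly 0.5 and shrinks symmetrically toward 0 as the response grows in magnitude, so responses near the decision boundary receive the largest relative weight.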

Next, the frequency feature obtained in equation (5) is low-pass filtered and recoupled with the high-frequency feature from equation (10), as in equations (11) and (12). Here $F'_{low} \in \mathbb{C}^{B \times C \times H \times W'}$ is the low-pass filtered version of $F_{low}$, and $f_{re} \in \mathbb{R}^{B \times C \times H \times W}$ is the spatial-domain feature output after recoupling.

$F'_{low} = F_{low} ⊙ fl_{low}$ (11)

$f_{re} = F^{-1}[F'_{low} + F[f'_{high}]]$ (12)
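The recoupling in equations (8) to (12) can be illustrated on a tiny single-channel map. This is a simplified sketch under several assumptions that differ from the described module: it uses a direct full-spectrum 2D DFT (so the frequency width equals $W$ rather than $W'$), a toy low-pass mask that keeps only the zero-frequency term instead of the rectangle of equation (6), the same input for both branches, and no enhancement step. The names `dft2` and `recouple` are hypothetical.

```python
import cmath


def dft2(x, inverse=False):
    """Direct 2D DFT of a nested list (equations (3)/(4)); for tiny inputs only."""
    rows, cols = len(x), len(x[0])
    sign = 1 if inverse else -1
    scale = 1.0 / (rows * cols) if inverse else 1.0
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            acc = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = 2 * cmath.pi * (m * u / rows + n * v / cols)
                    acc += x[m][n] * cmath.exp(sign * 1j * angle)
            out_row.append(acc * scale)
        out.append(out_row)
    return out


def recouple(f_low, f_high_enhanced, mask_low):
    """Equations (11)-(12): low-pass f_low, add the spectrum of f'_high, invert."""
    F_low = dft2(f_low)
    F_high = dft2(f_high_enhanced)
    rows, cols = len(f_low), len(f_low[0])
    mixed = [
        [F_low[u][v] * mask_low[u][v] + F_high[u][v] for v in range(cols)]
        for u in range(rows)
    ]
    # Keep the real part; a real-input inverse FFT would return real values directly.
    return [[value.real for value in row] for row in dft2(mixed, inverse=True)]


feature = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [0.0, 1.0, 0.0, 1.0],
           [5.0, 5.0, 5.0, 5.0]]
# Toy low-pass mask (assumption): keep only the zero-frequency coefficient.
mask_low = [[1.0 if (u, v) == (0, 0) else 0.0 for v in range(4)] for u in range(4)]

# Equations (8)-(9) with the complementary mask, enhancement step skipped.
spectrum = dft2(feature)
high_spectrum = [[spectrum[u][v] * (1.0 - mask_low[u][v]) for v in range(4)]
                 for u in range(4)]
high_part = [[c.real for c in row] for row in dft2(high_spectrum, inverse=True)]

rebuilt = recouple(feature, high_part, mask_low)
```

With identical inputs and no enhancement, the low-pass and high-pass parts add back to the original map, which confirms that the masks partition the spectrum. In DFEA the two branches come from different depths and the high-frequency branch is reweighted first, so the output is a new, recoupled feature rather than a copy of either input.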

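Next, three attention maps (foreground, background, and uncertain region) are computed from the input prediction map $map$ received by DFEA, as in equation (13), where $m_{f}$ is the enhanced foreground attention map, $m_{b}$ is the enhanced background attention map, and $m_{u}$ is the enhanced uncertain-edge attention map.

$\begin{array}{l} m_{f} = \max(Sigmoid(map) - 0.5, 0) \\ m_{b} = \max(0.5 - Sigmoid(map), 0) \\ m_{u} = 0.5 - abs(Sigmoid(map) - 0.5) \end{array}$ (13)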
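The three maps of equation (13) can be computed per pixel from a single logit. The sketch below is illustrative only; the function name `ternary_attention` and the sample logits are assumptions.

```python
import math


def ternary_attention(logit):
    """Return (m_f, m_b, m_u) of equation (13) for one pixel logit."""
    s = 1.0 / (1.0 + math.exp(-logit))   # Sigmoid(map)
    m_f = max(s - 0.5, 0.0)              # foreground: confident positive
    m_b = max(0.5 - s, 0.0)              # background: confident negative
    m_u = 0.5 - abs(s - 0.5)             # uncertain: close to 0.5
    return m_f, m_b, m_u


logits = [-3.0, -0.2, 0.0, 0.2, 3.0]     # arbitrary example logits
maps = [ternary_attention(v) for v in logits]
```

By construction $m_{f} + m_{b} = |Sigmoid(map) - 0.5|$, so the three maps always sum to 0.5 at every pixel, and at most one of $m_{f}$ and $m_{b}$ is non-zero.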

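The three attention maps are multiplied element-wise with the recoupled feature, and the results are concatenated along the channel dimension to obtain the final DFEA output $f_{dfea} \in \mathbb{R}^{B \times C \times H \times W}$, as in equation (14).

$f_{dfea} = Concat(m_{f} ⊙ f_{re} + m_{b} ⊙ f_{re} + m_{u} ⊙ f_{re}) + f_{re}$ (14)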

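2.2.2. Global Frequency Feature Learning Module

Figure 4. The architecture of the global frequency feature learning module (GFFLM)

Figure 4 shows the proposed global frequency feature learning module (GFFLM). With Res2Net-style architectures, stacking many small convolution kernels can leave the receptive field on the feature map too small, so that global information is lost, small polyps are more often misidentified, and segmentation accuracy drops. GFFLM therefore first converts spatial features into frequency features with the DFT (in the PyTorch framework, the RFFT2D operator) and then multiplies them element-wise by a learnable filter kernel in the frequency domain to learn global frequency features. Because every frequency component depends on the whole input, learning in the frequency domain effectively enlarges the receptive field in the spatial domain. In addition, since the kernel is applied by element-wise multiplication, its parameter count depends only on the height, width, and number of channels of the input tensor, which reduces the number of parameters and the computational complexity to some extent. GFFLM further incorporates the SimCSPSPPF module [16], which aggregates features at different scales through three cascaded max-pooling layers and strengthens the multi-scale representation of the model. As shown in Figure 4, GFFLM is designed so that any number of layers can be stacked, and the last layer outputs the prediction feature map.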
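The claim that an element-wise product in the frequency domain gives a global receptive field follows from the convolution theorem. The one-dimensional sketch below, written with the standard library only, compares filtering in the frequency domain with an explicit circular convolution. It is an illustration of the principle, not the GFFLM implementation: the kernel values are hypothetical stand-ins for learned weights, and the actual module works on 2D feature maps.

```python
import cmath


def dft(x):
    """Direct 1D DFT, equation (1)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]


def idft(X):
    """Direct 1D inverse DFT, equation (2)."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
            for n in range(n_pts)]


def global_filter(x, kernel_freq):
    """Multiply the spectrum of x element-wise by a frequency-domain kernel."""
    X = dft(x)
    filtered = idft([X[k] * kernel_freq[k] for k in range(len(x))])
    return [value.real for value in filtered]


def circular_conv(x, h):
    """Circular convolution: y[n] = sum_m x[m] * h[(n - m) mod N]."""
    n_pts = len(x)
    return [sum(x[m] * h[(n - m) % n_pts] for m in range(n_pts)) for n in range(n_pts)]


signal = [1.0, 0.0, 2.0, -1.0, 0.5, 3.0]
spatial_kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.25]   # hypothetical kernel values
kernel_freq = dft(spatial_kernel)

via_frequency = global_filter(signal, kernel_freq)
via_convolution = circular_conv(signal, spatial_kernel)
```

The two results agree up to rounding error. Since GFFLM learns the kernel directly in the frequency domain, the equivalent spatial kernel is free to span the entire feature map, while the cost of applying it stays at one multiplication per frequency coefficient.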

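2.3. Partial Decoder and Adaptive Feature Selection Module

Figure 5 shows the improved partial decoder (PD) and the adaptive feature selection module (AFSM). A SimCSPSPPF module is inserted after each of the three PD inputs $\{f_{backbone}^{1}, f_{backbone}^{2}, f_{backbone}^{3}\}$ to further strengthen their multi-scale representation. After features from different levels are aggregated, a multi-scale attention module (EMA) [17] recalibrates the combined multi-scale features along the spatial and channel dimensions. Finally, the PD outputs the feature $f_{pd}$ and the prediction feature map $f_{pd\_map}$. AFSM is inspired by [7]; based on our experiments, we removed its Non-Local operation, which brought no accuracy gain, to further reduce computational complexity. A Squeeze-and-Excite operation is applied to the aggregated features to recalibrate their channel responses, emphasizing informative features and suppressing uninformative ones.

Figure 5. The architecture of partial decoder (PD) and adaptive feature selection module (AFSM)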

3. Experiments and Results

3.1. Datasets

Table 1. Details of the adopted datasets

As shown in Table 1, we selected the five most widely used datasets: CVC-ClinicDB (also known as CVC-612) [18], CVC-ColonDB [19], CVC-300 [20], ETIS [21], and Kvasir [22].

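3.2. Evaluation Metrics

To evaluate the effectiveness of the proposed FFENet, we adopt six widely used metrics: Dice, IoU, mean absolute error (MAE), weighted F-measure ($F_{\beta}^{\omega}$), S-measure ($S_{\alpha}$), and E-measure ($E_{\phi}^{\max}$). To define the metrics, we use true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) to describe the relationship between polyp and background regions. The three most important metrics are given by equations (15), (16), and (17).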

$Dice = \frac{2 | A \cap B |}{| A | + | B |} = \frac{2 TP}{2 TP + FP + FN}$ (15)

$IoU = \frac{| A \cap B |}{| A \cup B |} = \frac{TP}{TP + FP + FN}$ (16)

$MAE = \frac{1}{M N} \sum_{m = 1}^{M} \sum_{n = 1}^{N} | A(m, n) - B(m, n) |$ (17)
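Equations (15) to (17) translate directly into a few lines of code. The sketch below assumes that $A$ is the predicted mask and $B$ the ground-truth mask (the formulas are symmetric in the two), that masks are nested lists, and that an empty prediction compared with an empty ground truth scores 1.0; the function names are illustrative.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two binary masks given as nested lists."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred, target):
    """Equation (15): 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0   # both masks empty (assumed convention)


def iou(pred, target):
    """Equation (16): TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0       # both masks empty (assumed convention)


def mae(pred, target):
    """Equation (17): mean absolute difference over all pixels."""
    diffs = [abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
scores = (dice(pred, target), iou(pred, target), mae(pred, target))
```

For this example TP = 2, FP = 1, and FN = 1, giving Dice = 2/3, IoU = 1/2, and MAE = 1/3. The two overlap metrics are related by Dice = 2 IoU / (1 + IoU), so they always rank a pair of predictions on the same image in the same order.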

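3.3. Loss Function

For consistency with previous work [8] [10], we use the widely adopted weighted BCE and weighted IoU terms as the loss function: $Loss = Loss_{BCE}^{\omega} + Loss_{IoU}^{\omega}$. As shown in Figure 1, a deep supervision strategy is adopted, and the final loss is given by equation (18).

$Loss_{total} = \sum_{i = 1}^{4} Loss_{i}$ (18)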
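The text does not spell out the weighting scheme, so the following is only a sketch of one common formulation of a weighted BCE plus weighted IoU loss, summed over four supervised outputs as in equation (18). The per-pixel `weights` argument, the smoothing constant 1 in the IoU term, the clamping value `eps`, and the function names are all assumptions made for illustration; with the default uniform weights the loss reduces to plain BCE plus a soft IoU term.

```python
import math


def weighted_bce_iou(probs, targets, weights=None, eps=1e-7):
    """Weighted BCE + weighted soft-IoU loss over flat lists of pixels."""
    if weights is None:
        weights = [1.0] * len(probs)          # uniform weights (assumption)
    bce_sum = inter = total = 0.0
    for p, t, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)       # avoid log(0)
        bce_sum -= w * (t * math.log(p) + (1 - t) * math.log(1 - p))
        inter += w * p * t
        total += w * (p + t)
    loss_bce = bce_sum / sum(weights)
    loss_iou = 1.0 - (inter + 1.0) / (total - inter + 1.0)
    return loss_bce + loss_iou


def total_loss(side_outputs, targets):
    """Equation (18): sum the loss over all supervised outputs."""
    return sum(weighted_bce_iou(probs, targets) for probs in side_outputs)


targets = [1, 0, 1, 0]
good = [1.0, 0.0, 1.0, 0.0]                   # matches the ground truth
poor = [0.2, 0.8, 0.3, 0.6]                   # mostly wrong
loss_good = weighted_bce_iou(good, targets)
loss_poor = weighted_bce_iou(poor, targets)
loss_deep = total_loss([poor, poor, poor, poor], targets)
```

A prediction that matches the ground truth drives both terms toward zero, while a poor prediction is penalized by both. Under deep supervision, each of the four outputs contributes its own term, so every supervised stage receives a direct gradient signal.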

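3.4. Implementation Details

FFENet is trained with the PyTorch framework on a single NVIDIA GeForce RTX 3090 GPU. Following previous work [8] [10], input images are fixed at 352 × 352. To improve robustness, we apply data augmentation including random flipping, random rotation, erosion, and dilation, and we additionally crop or enlarge input images to 0.75 to 1.25 times their size. The model is trained for 240 epochs with a batch size of 32. The three model sizes (FFENet-S, FFENet-M, and FFENet-L) use MobileNetV3, Res2Net50, and Res2Net101 as the Backbone, respectively.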

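3.5. Ablation Study

Table 2. Ablation study of the proposed method on five testing datasets

Table 3. Ablation study for feature recoupling (FR) on five testing datasets

To verify the effectiveness of FFENet, we conduct extensive ablation experiments with FFENet-M, covering both the contribution of each module and the recoupling scheme inside DFEA, as reported in Tables 2 and 3. Table 2 shows that adding DFEA yields a substantial improvement on CVC-ColonDB, CVC-300, and ETIS, the three datasets not used in training, and that adding GFFLM improves overall performance further. Compared with the model that uses only PD and AFSM, adding DFEA and GFFLM raises Dice by 13% and IoU by 10% on CVC-ClinicDB, and raises Dice by 11.3% and IoU by 11.7% on ETIS.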

We also ablate the feature recoupling (FR) method in DFEA. In Table 3, FFENet (w/o FR) denotes DFEA without FR, where the two input features are directly concatenated along the channel dimension, and FFENet (Concat) denotes replacing the feature coupling step of FR, i.e. equation (12), with channel-wise concatenation. Table 3 and Figure 6 both show that FFENet (w/o FR) and FFENet (Concat) each incur some performance loss.

Figure 6. Heatmaps for DFEA and GFFLM. (a) Backbone + PD + AFSM, (b) Backbone + PD + AFSM + DFEA, (c) Backbone + PD + AFSM + DFEA + GFFLM (FFENet), (d) FFENet (w/o FR), and (e) FFENet (Concat)

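3.6. Comparison with SOTA Models

We compare FFENet with previous SOTA models on the five datasets, both qualitatively and quantitatively. The compared models are U-Net [23], U-Net++ [24], ResUNet++ [25], SFA [26], PraNet, UACANet, SANet [27], MSNet [28], M²SNet [29], LDNet [30], Polyp-Mixer [31], CFA-Net [32], and MMFIL-Net [33]. Detailed comparisons are given in Table 4 and Figure 7. Since these models all follow the PraNet training protocol, we directly use the results reported in their papers as reference.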

3.6.1. Quantitative Analysis

Table 4. Quantitative comparison with SOTA methods on the CVC-ClinicDB and Kvasir-SEG datasets

Figure 7. Quantitative comparison with SOTA methods on the CVC-300, CVC-ColonDB, and ETIS datasets

As shown in Table 4, the proposed FFENet outperforms the other methods in learning ability on the Kvasir-SEG and CVC-ClinicDB datasets. Figure 7 shows that FFENet also outperforms existing SOTA methods on the unseen datasets CVC-ColonDB, ETIS, and CVC-300, where FFENet-L and FFENet-M rank first and second, respectively. Compared with the recent CFA-Net, FFENet-S leads by 2.9% in Dice and 1.3% in IoU on ETIS while using about 0.25 times its parameters and MACs. The Precision-Recall and F-measure curves in Figure 8 likewise confirm that FFENet outperforms the other methods.

Figure 8. Precision-Recall and F-measure curves of our method and other methods on the CVC-300, CVC-ColonDB, and ETIS datasets

Figure 9. Qualitative comparison with SOTA methods on five datasets

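3.6.2. Qualitative Analysis

Figures 9 and 10 present a visual comparison of inference results on the five datasets. FFENet is visibly better than the other methods: it recovers more accurate location and edge information under varied image conditions, including differences in brightness, color, texture, and polyp position. Even on ETIS, which contains complex images and tiny polyps, the proposed method remains precise and correctly identifies the location and size of the polyps. Overall, FFENet shows better generalization and robustness than the other methods.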

Figure 10. Qualitative comparison with SOTA methods on five datasets

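3.6.3. Complexity Analysis

Table 5 lists the parameter counts and MACs of the different models. FFENet surpasses the other SOTA methods in segmentation quality while remaining competitive in parameters and computational cost; FFENet-S requires only 6.20M parameters and 3.83G MACs.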

Table 5. Complexity analysis among different methods

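4. Conclusion

This paper presents FFENet, a new polyp image segmentation model that uses DFEA and GFFLM to strengthen the representation and capture of edge features in the frequency domain, effectively improving the robustness of polyp segmentation. Although FFENet performs well on high-resolution images, the computational burden of the Fourier transform remains a challenge. Future work will aim to reduce this overhead and to explore the model's use in clinical settings to ease the burden on physicians.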

Cite This Article

Liu, J. and Hu, Q. (2024) Colon Polyp Segmentation Model Based on Frequency Feature Enhancement. Modeling and Simulation, 13(03), 3593-3606. https://doi.org/10.12677/mos.2024.133327

References

1. Sawicki, T., Ruszkowska, M., Danielewicz, A., Niedźwiedzka, E., Arłukowicz, T. and Przybyłowicz, K.E. (2021) A Review of Colorectal Cancer in Terms of Epidemiology, Risk Factors, Development, Symptoms and Diagnosis. Cancers, 13, Article No. 2025. https://doi.org/10.3390/cancers13092025

2. Simon, K. (2016) Colorectal Cancer Development and Advances in Screening. Clinical Interventions in Aging, 11, 967-976. https://doi.org/10.2147/CIA.S109285

3. Muhammad, Z.-U.-D., Huang, Z., Gu, N. and Muhammad, U. (2022) DCANet: Deep Context Attention Network for Automatic Polyp Segmentation. Visual Computer, 39, 5513-5525. https://doi.org/10.1007/s00371-022-02677-x

4. Zimmermann-Fraedrich, K., et al. (2019) Right-Sided Location Not Associated with Missed Colorectal Adenomas in an Individual-Level Reanalysis of Tandem Colonoscopy Studies. Gastroenterology, 157, 660-671. https://doi.org/10.1053/j.gastro.2019.05.011

5. Djinbachian, R., et al. (2023) Comparing Size Measurement of Colorectal Polyps Using a Novel Virtual Scale Endoscope, Endoscopic Ruler or Forceps: A Preclinical Randomized Trial. Endoscopy International Open, 11, E128-E135. https://doi.org/10.1055/a-2005-7548

6. Lou, A., Guan, S. and Loew, M.H. (2023) CaraNet: Context Axial Reverse Attention Network for Segmentation of Small Medical Objects. Journal of Medical Imaging, 10, Article ID: 014005. https://doi.org/10.1117/1.JMI.10.1.014005

7. Zhang, R., Li, G., Li, Z., Cui, S., Qian, D. and Yu, Y. (2020) Adaptive Context Selection for Polyp Segmentation. Medical Image Computing and Computer Assisted Intervention—MICCAI 2020, Lima, 4-8 October 2020, 253-262. https://doi.org/10.1007/978-3-030-59725-2_25

8. Kim, T., Lee, H. and Kim, D. (2021) UACANet: Uncertainty Augmented Context Attention for Polyp Segmentation. Proceedings of the 29th ACM International Conference on Multimedia (MM '21), 20-24 October 2021, 2167-2175. https://doi.org/10.1145/3474085.3475375

9. Zhu, J., Ge, M., Chang, Z. and Dong, W. (2023) CRCNet: Global-Local Context and Multi-Modality Cross Attention for Polyp Segmentation. Biomedical Signal Processing and Control, 83, Article ID: 104593. https://doi.org/10.1016/j.bspc.2023.104593

10. Fan, D.-P., Ji, G.-P., Zhou, T., Chen, G., Fu, H., Shen, J. and Shao, L. (2020) PraNet: Parallel Reverse Attention Network for Polyp Segmentation. Medical Image Computing and Computer Assisted Intervention—MICCAI 2020, Lima, 4-8 October 2020, 263-273. https://doi.org/10.1007/978-3-030-59725-2_26

11. Song, P., Li, J. and Fan, H. (2022) Attention Based Multi-Scale Parallel Network for Polyp Segmentation. Computers in Biology and Medicine, 146, Article ID: 105476. https://doi.org/10.1016/j.compbiomed.2022.105476

12. Shan, L., Li, X. and Wang, W. (2021) Decouple the High-Frequency and Low-Frequency Information of Images for Semantic Segmentation. ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, 6-11 June 2021, 1805-1809. https://doi.org/10.1109/ICASSP39728.2021.9414019

13. Su, Y., et al. (2023) FeDNet: Feature Decoupled Network for Polyp Segmentation from Endoscopy Images. Biomedical Signal Processing and Control, 83, Article ID: 104699. https://doi.org/10.1016/j.bspc.2023.104699

14. Tang, X., Peng, J., Zhong, B., Li, J. and Yan, Z. (2021) Introducing Frequency Representation into Convolution Neural Networks for Medical Image Segmentation via Twin-Kernel Fourier Convolution. Computer Methods and Programs in Biomedicine, 205, Article ID: 106110. https://doi.org/10.1016/j.cmpb.2021.106110

15. Chi, L., Jiang, B. and Mu, Y. (2020) Fast Fourier Convolution. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, 6-12 December 2020, 4479-4488.

16. Li, C., Li, L., Geng, Y., Jiang, H., Cheng, M., Zhang, B., Ke, Z., Xu, X. and Chu, X. (2023) YOLOv6 V3.0: A Full-Scale Reloading.

17. Ouyang, D., He, S., Zhan, J., Guo, H., Huang, Z., Luo, M. and Zhang, G. (2023) Efficient Multi-Scale Attention Module with Cross-Spatial Learning. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, 4-10 June 2023, 1-5. https://doi.org/10.1109/ICASSP49357.2023.10096516

18. Bernal, J., Sánchez, F.J., Fernández-Esparrach, G., Gil, D., Rodríguez, C. and Vilariño, F. (2015) WM-DOVA Maps for Accurate Polyp Highlighting in Colonoscopy: Validation vs. Saliency Maps from Physicians. Computerized Medical Imaging and Graphics, 43, 99-111. https://doi.org/10.1016/j.compmedimag.2015.02.007

19. Bernal, J., Sánchez, F.J. and Vilariño, F. (2012) Towards Automatic Polyp Detection with a Polyp Appearance Model. Pattern Recognition, 45, 3166-3182. https://doi.org/10.1016/j.patcog.2012.03.002

20. Vázquez, D., et al. (2017) A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images. Journal of Healthcare Engineering, 2017, Article ID: 4037190. https://doi.org/10.1155/2017/4037190

21. Silva, J., Histace, A., Romain, O., Dray, X. and Granado, B. (2014) Toward Embedded Detection of Polyps in WCE Images for Early Diagnosis of Colorectal Cancer. International Journal of Computer Assisted Radiology and Surgery, 9, 283-293. https://doi.org/10.1007/s11548-013-0926-3

22. Jha, D., Smedsrud, P.H., Riegler, M., Halvorsen, P., De Lange, T., Johansen, D. and Johansen, H.D. (2019) Kvasir-SEG: A Segmented Polyp Dataset. Conference on Multimedia Modeling, Daejeon, 5-8 January 2020, 451-462. https://doi.org/10.1007/978-3-030-37734-2_37

23. Ronneberger, O., Fischer, P. and Brox, T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. https://doi.org/10.1007/978-3-319-24574-4_28

24. Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N. and Liang, J. (2018) UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In: Stoyanov, D., et al., Eds., Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer, Berlin, 3-11. https://doi.org/10.1007/978-3-030-00889-5_1

25. Jha, D., Smedsrud, P.H., Riegler, M., Johansen, D., De Lange, T., Halvorsen, P. and Johansen, H.D. (2019) ResUNet++: An Advanced Architecture for Medical Image Segmentation. 2019 IEEE International Symposium on Multimedia (ISM), San Diego, 9-11 December 2019, 225-2255. https://doi.org/10.1109/ISM46123.2019.00049

26. Fang, Y.Q., Chen, C., Yuan, Y.X. and Tong, K.-Y. (2019) Selective Feature Aggregation Network with Area-Boundary Constraints for Polyp Segmentation. In: Medical Image Computing and Computer Assisted Intervention MICCAI 2019, Springer-Verlag, Berlin, 302-310. https://doi.org/10.1007/978-3-030-32239-7_34

27. Wei, J., Hu, Y., Zhang, R., Li, Z., Zhou, S. and Cui, S. (2021) Shallow Attention Network for Polyp Segmentation. https://doi.org/10.1007/978-3-030-87193-2_66

28. Zhao, X., Zhang, L. and Lu, H. (2021) Automatic Polyp Segmentation via Multi-Scale Subtraction Network. International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, 27 September-1 October 2021, 120-130. https://doi.org/10.1007/978-3-030-87193-2_12

29. Zhao, X., Jia, H., Pang, Y., Lv, L., Tian, F., Zhang, L., Sun, W. and Lu, H. (2023) M2SNet: Multi-Scale in Multi-Scale Subtraction Network for Medical Image Segmentation.

30. Zhang, R., Lai, P., Wan, X., Fan, D., Gao, F., Wu, X. and Li, G. (2023) Lesion-Aware Dynamic Kernel for Polyp Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, Singapore, 18-22 September 2022, 99-109. https://doi.org/10.1007/978-3-031-16437-8_10

31. Shi, J., Zhang, Q., Tang, Y. and Zhang, Z. (2023) Polyp-Mixer: An Efficient Context-Aware MLP-Based Paradigm for Polyp Segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 33, 30-42. https://doi.org/10.1109/TCSVT.2022.3197643

32. Zhou, T., Zhou, Y., He, K., Gong, C., Yang, J., Fu, H. and Shen, D. (2023) Cross-Level Feature Aggregation Network for Polyp Segmentation. Pattern Recognition, 140, Article ID: 109555. https://doi.org/10.1016/j.patcog.2023.109555

33. Muhammad, Z., Usman, M., Huang, Z. and Gu, N. (2024) MMFIL-Net: Multi-Level and Multi-Source Feature Interactive Lightweight Network for Polyp Segmentation. Displays, 81, Article ID: 102600. https://doi.org/10.1016/j.displa.2023.102600
