二阶锥规划(SOCP)
二阶锥规划(Second-order cone program),是指目标函数为线性函数,约束条件中的决策变量在二阶锥中。
SOCP的原始和对偶问题
SOCP的原始问题为:
min f ⊤ x s . t . ∥ A i x + b i ∥ 2 ≤ c i ⊤ x + d i , i = 1 , … , m \begin{array}{ll}
\min & \boldsymbol{f}^\top\boldsymbol{x}\\
\mathrm{s.~t.} & \|\boldsymbol{A}_i\boldsymbol{x} + \boldsymbol{b}_i\|_2 \leq \boldsymbol{c}_i^\top \boldsymbol{x} + d_i, \quad i=1,\ldots,m
\end{array}
min s. t. f ⊤ x ∥ A i x + b i ∥ 2 ≤ c i ⊤ x + d i , i = 1 , … , m
其中A i ∈ R n i × n , b i ∈ R n i , c i ∈ R n , d i ∈ R , x ∈ R n , i = 1 , … , m A_i\in\mathbb{R}^{n_i\times n},\boldsymbol{b}_i\in\mathbb{R}^{n_i},\boldsymbol{c}_i\in\mathbb{R}^n,d_i\in\mathbb{R},\boldsymbol{x}\in\mathbb{R}^n, i=1,\ldots,m A i ∈ R n i × n , b i ∈ R n i , c i ∈ R n , d i ∈ R , x ∈ R n , i = 1 , … , m 。
对偶问题为:
max − ∑ i = 1 m ( u i ⊤ b i + v i d i ) s . t . ∑ i = 1 m A i ⊤ u i + ∑ i = 1 m v i c i = f , i = 1 , … , m , ∥ u i ∥ 2 ≤ v i , i = 1 , … , m , \begin{array}{ll}
\max & -\sum_{i=1}^m(\boldsymbol{u}_i^\top \boldsymbol{b}_i + v_id_i)\\
\mathrm{s.~t.} & \sum_{i=1}^m \boldsymbol{A}_i^\top\boldsymbol{u}_i + \sum_{i=1}^mv_i\boldsymbol{c}_i = \boldsymbol{f},\quad i=1,\ldots,m, \\
&\|\boldsymbol{u}_i\|_2\leq v_i, \quad i=1,\ldots,m,
\end{array}
max s. t. − ∑ i = 1 m ( u i ⊤ b i + v i d i ) ∑ i = 1 m A i ⊤ u i + ∑ i = 1 m v i c i = f , i = 1 , … , m , ∥ u i ∥ 2 ≤ v i , i = 1 , … , m ,
其中,u i ∈ R n \boldsymbol{u}_i\in\mathbb{R}^n u i ∈ R n ,v i ∈ R v_i\in\mathbb{R} v i ∈ R ,i = 1 , … , m i=1,\ldots,m i = 1 , … , m 。
证明:
令y i = A i x + b i \boldsymbol{y}_i = \boldsymbol{A}_i\boldsymbol{x} + \boldsymbol{b}_i y i = A i x + b i ,t i = c i ⊤ + d i t_i = \boldsymbol{c}_i^\top + d_i t i = c i ⊤ + d i ,则SOCP原问题等价于:
min f ⊤ x s . t . ∥ y i ∥ 2 ≤ t i , i = 1 , … , m y i = A i ⊤ x + b i , i = 1 , … , m t i = c i ⊤ x + d i , i = 1 , … , m \begin{array}{ll}
\min & \boldsymbol{f}^\top\boldsymbol{x}\\
\mathrm{s.~t.} & \|\boldsymbol{y}_i\|_2 \leq t_i, \quad i=1,\ldots,m \\
& \boldsymbol{y}_i = \boldsymbol{A}_i^\top\boldsymbol{x} + \boldsymbol{b}_i, \quad i=1,\ldots,m \\
& t_i = \boldsymbol{c}_i^\top\boldsymbol{x}+d_i, \quad i=1,\ldots,m \\
\end{array}
min s. t. f ⊤ x ∥ y i ∥ 2 ≤ t i , i = 1 , … , m y i = A i ⊤ x + b i , i = 1 , … , m t i = c i ⊤ x + d i , i = 1 , … , m
引入拉格朗日乘子λ ∈ R m , μ i ∈ R n i , i = 1 , … , m , ν ∈ R m \boldsymbol{\lambda}\in\mathbb{R}^m,\boldsymbol{\mu}_i\in\mathbb{R}^{n_i},i=1,\ldots,m,\boldsymbol{\nu}\in\mathbb{R}^m λ ∈ R m , μ i ∈ R n i , i = 1 , … , m , ν ∈ R m ,则可以写出上述约束优化问题的拉格朗日函数:
L ( x , y , t , λ , μ , ν ) = f ⊤ x + ∑ i = 1 m λ i ( ∥ y i ∥ 2 − t i ) + ∑ i = 1 m μ i ⊤ ( y i − A i ⊤ x − b i ) + ∑ i = 1 m ν i ( t i − c i ⊤ x ) = ( f − ∑ i = 1 m A i ⊤ − ∑ i = 1 m u i c i ) ⊤ x + ∑ i = 1 m ( λ i ∥ y i ∥ 2 + μ i ⊤ y i ) + ∑ i = 1 m ( − λ i + ν i ) t i − ∑ i = 1 m ( μ i ⊤ b i + ν i d i ) \begin{aligned}
L(\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{t}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\nu})& = \boldsymbol{f}^\top\boldsymbol{x} + \sum_{i=1}^m\lambda_i(\|\boldsymbol{y}_i\|_2 -t_i) + \sum_{i=1}^m\boldsymbol{\mu}_i^\top(\boldsymbol{y}_i-\boldsymbol{A}_i^\top\boldsymbol{x}-\boldsymbol{b}_i) + \sum_{i=1}^m\nu_i(t_i - \boldsymbol{c}_i^\top\boldsymbol{x})\\
&= (\boldsymbol{f}-\sum_{i=1}^m\boldsymbol{A}_i^\top -\sum_{i=1}^m\boldsymbol{u}_i\boldsymbol{c}_i)^\top\boldsymbol{x} + \sum_{i=1}^m(\lambda_i\|\boldsymbol{y}_i\|_2+\boldsymbol{\mu}_i^\top\boldsymbol{y}_i) \\
&\quad+ \sum_{i=1}^m(-\lambda_i+\nu_i)t_i-\sum_{i=1}^m(\boldsymbol{\mu}_i^\top\boldsymbol{b}_i+\nu_id_i)
\end{aligned}
L ( x , y , t , λ , μ , ν ) = f ⊤ x + i = 1 ∑ m λ i ( ∥ y i ∥ 2 − t i ) + i = 1 ∑ m μ i ⊤ ( y i − A i ⊤ x − b i ) + i = 1 ∑ m ν i ( t i − c i ⊤ x ) = ( f − i = 1 ∑ m A i ⊤ − i = 1 ∑ m u i c i ) ⊤ x + i = 1 ∑ m ( λ i ∥ y i ∥ 2 + μ i ⊤ y i ) + i = 1 ∑ m ( − λ i + ν i ) t i − i = 1 ∑ m ( μ i ⊤ b i + ν i d i )
原始问题的极小极大表示为:
p ⋆ = min x , y , t max λ , μ , ν L ( x , y , t , λ , μ , ν ) \begin{aligned}
p^\star = \min_{\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{t}}\max_{\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\nu}} L(\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{t}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\nu})
\end{aligned}
p ⋆ = x , y , t min λ , μ , ν max L ( x , y , t , λ , μ , ν )
那么对偶问题的极大极小表示为:
d ⋆ = max λ , μ , ν min x , y , t L ( x , y , t , λ , μ , ν ) \begin{aligned}
d^\star = \max_{\boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\nu}}\min_{\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{t}} L(\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{t}, \boldsymbol{\lambda}, \boldsymbol{\mu}, \boldsymbol{\nu})
\end{aligned}
d ⋆ = λ , μ , ν max x , y , t min L ( x , y , t , λ , μ , ν )
针对内层关于x \boldsymbol{x} x 的min \min min 问题,当且仅当满足:
f − ∑ i = 1 m A i ⊤ − ∑ i = 1 m u i c i = 0 , \boldsymbol{f}-\sum_{i=1}^m\boldsymbol{A}_i^\top -\sum_{i=1}^m\boldsymbol{u}_i\boldsymbol{c}_i = 0,
f − i = 1 ∑ m A i ⊤ − i = 1 ∑ m u i c i = 0 ,
该问题才有下界。
针对内层关于y i \boldsymbol{y}_i y i 的min \min min 问题,我们可以得到:
min y λ i ∥ y i ∥ 2 + μ i ⊤ y i = { 0 ∥ μ i ∥ 2 ≤ λ i − ∞ o t h e r w i s e . \min_{\boldsymbol{y}} \lambda_i\|\boldsymbol{y}_i\|_2+\boldsymbol{\mu}_i^\top\boldsymbol{y}_i = \left\{\begin{array}{ll}{0}&{\|\boldsymbol{\mu}_{i}\|_{2}\leq\lambda_{i}}\\{-\infty}&{\mathrm{otherwise}.}\end{array}\right.
y min λ i ∥ y i ∥ 2 + μ i ⊤ y i = { 0 − ∞ ∥ μ i ∥ 2 ≤ λ i otherwise .
当且仅当λ i = ν i \lambda_i = \nu_i λ i = ν i ,内层关于t \boldsymbol{t} t 的min \min min 才有下界。
因此,SOCP的对偶问题等价于:
max − ∑ i = 1 m ( μ i ⊤ b i + ν i d i ) s . t . f − ∑ i = 1 m A i ⊤ μ i − ∑ i = 1 m ν i c i = 0 , i = 1 , … , m , ∥ μ i ∥ ≤ λ i , i = 1 , … , m , − λ i + ν i = 0 , i = 1 , … , m , \begin{array}{ll}
\max & -\sum_{i=1}^m(\boldsymbol{\mu}_i^\top\boldsymbol{b}_i+\nu_id_i)\\
\mathrm{s.~t.} & \boldsymbol{f} - \sum_{i=1}^m \boldsymbol{A}_i^\top\boldsymbol{\mu}_i - \sum_{i=1}^m\nu_i\boldsymbol{c}_i =0,\quad i=1,\ldots,m, \\
&\|\boldsymbol{\mu}_i\|\leq \lambda_i, \quad i=1,\ldots,m,\\
&-\lambda_i+\nu_i=0, \quad i=1,\ldots,m,
\end{array}
max s. t. − ∑ i = 1 m ( μ i ⊤ b i + ν i d i ) f − ∑ i = 1 m A i ⊤ μ i − ∑ i = 1 m ν i c i = 0 , i = 1 , … , m , ∥ μ i ∥ ≤ λ i , i = 1 , … , m , − λ i + ν i = 0 , i = 1 , … , m ,
令μ i = u i , λ i = ν i = v i \boldsymbol{\mu}_i=\boldsymbol{u}_i, \lambda_i=\nu_i=v_i μ i = u i , λ i = ν i = v i ,则可以得到SOCP的对偶问题。
示例
基于椭球不确定集合的鲁棒线性规划是一个典型的二阶锥规划问题。下面列举一个鲁棒对应(RC)的形式:
min c ⊤ x s . t . ( a i + u i ) T x ≤ b i for all ∥ u i ∥ 2 ≤ 1 , i = 1 , … , m , \begin{aligned}&\min\quad c^\top x\\
&\mathrm{s.~t.}\quad(a_i+u_i)^Tx\leq b_i\text{ for all }\|u_i\|_2\leq1,\quad i=1,\ldots,m,
\end{aligned} min c ⊤ x s. t. ( a i + u i ) T x ≤ b i for all ∥ u i ∥ 2 ≤ 1 , i = 1 , … , m ,
其中,a i ∈ R n a_i\in\mathbb{R}^n a i ∈ R n ,b i ∈ R b_i\in\mathbb{R} b i ∈ R ,u i ∈ R n u_i\in\mathbb{R}^n u i ∈ R n ,c ∈ R n c\in\mathbb{R}^n c ∈ R n 。
不确定参数u i u_i u i 的集合被限制为一个ℓ 2 \ell_2 ℓ 2 范数构成的椭球集合,即∥ u i ∥ 2 ≤ 1 \|u_i\|_2\leq1 ∥ u i ∥ 2 ≤ 1 。
通过对偶转换可以得到RC的可处理形式:
min c T x s . t . a i T x + ∥ x ∥ 2 ≤ b i , i = 1 , … , m , \begin{aligned}&\min\quad c^Tx\\&\mathrm{s.~t.}\quad a_i^Tx+\|x\|_2\leq b_i,\quad i=1,\ldots,m,\end{aligned}
min c T x s. t. a i T x + ∥ x ∥ 2 ≤ b i , i = 1 , … , m ,
Python实现求解
CVXPY库可以直接求解SOCP。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 import cvxpy as cpimport numpy as npm = 3 n = 5 np.random.seed(2 ) c = np.random.random(n) A = np.random.random((m, n)) x0 = np.random.random(n) b = A@x0 x = cp.Variable(n) obj = cp.Minimize(c @ x) soc_constraints = [ cp.SOC(b[i] - A[i]@x, x) for i in range (m) ] prob = cp.Problem(obj, soc_constraints) prob.solve() print ("status:" , prob.status)print ("optimal value" , prob.value)print ("optimal var" , x.value)for i in range (m): print ("soc constraint" , soc_constraints[i].dual_value)
求解结果如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 status: optimal optimal value -2.487555966760468 optimal var [-1.37881038 -0.81856901 -1.7036923 -1.23789512 -0.9273785 ] soc constraint [array([3.71192161]), array([[1.82467162], [1.08325999], [2.25507434], [1.63818725], [1.22738155]])] soc constraint [array([1.2546655]), array([[0.61681542], [0.36620225], [0.76212621], [0.55379853], [0.41483849]])] soc constraint [array([3.29587551e-08]), array([[1.10774611e-08], [8.15413818e-10], [1.51970753e-08], [1.14059208e-08], [1.06823249e-08]])]