The Action Replay Process

Preface

A commonly used inequality

$$-x > \ln(1 - x), \quad 0 < x < 1$$

Proof: Let $f(x) = \ln(1 - x) + x$ for $0 \le x < 1$, so that $f(0) = 0$. For $0 < x < 1$,

$$f'(x) = \frac{-1}{1 - x} + 1 = \frac{x}{x - 1} < 0$$

so $f$ is strictly decreasing on $[0, 1)$ and $f(x) < f(0) = 0$. Hence $-x > \ln(1 - x)$ for $0 < x < 1$. Q.E.D.


Fundamental Theorem

If $a_n > -1$, then

$$\prod_{n=1}^\infty (1 + a_n) = 0 \Leftrightarrow \sum_{n=1}^\infty \ln(1 + a_n) = -\infty$$

Proof: Let $P_k = \prod_{n=1}^k (1 + a_n)$, then

$$\ln P_k = \ln\left(\prod_{n=1}^k (1 + a_n)\right) = \sum_{n=1}^k \ln(1 + a_n)$$

Thus,

$$\sum_{n=1}^\infty \ln(1 + a_n) = -\infty \Leftrightarrow \lim_{k \to \infty} \sum_{n=1}^k \ln(1 + a_n) = -\infty \Leftrightarrow \lim_{k \to \infty} \ln P_k = -\infty \Leftrightarrow \lim_{k \to \infty} P_k = 0$$

Q.E.D.


Corollary

If $0 \le b_n < 1$ and $\sum_{n=1}^\infty b_n = +\infty$, then

$$\prod_{n=1}^\infty (1 - b_n) = 0$$

Proof: Consider the subsequence $\{b_{n_k}\}$ consisting of the non-zero $b_n$. Since $-b_{n_k} > -1$, the fundamental theorem gives:

$$\prod_{n=1}^\infty (1 - b_n) = \prod_{k=1}^\infty (1 - b_{n_k}) = 0 \Leftrightarrow \sum_{k=1}^\infty \ln(1 - b_{n_k}) = -\infty$$

We now show $\sum_{k=1}^\infty \ln(1 - b_{n_k}) = -\infty$.

Given $0 < 1 - b_{n_k} < 1$, we have $\ln(1 - b_{n_k}) < 0$, and dropping the zero terms does not change the sum, so $\sum_{k=1}^\infty b_{n_k} = +\infty$. The claim is not immediately obvious, so we proceed by contradiction:

Assume $\sum_{k=1}^\infty \ln(1 - b_{n_k}) \ne -\infty$. Since each term is negative, the partial sums are decreasing, so this implies convergence, i.e.,

$$\sum_{k=1}^\infty \ln(1 - b_{n_k}) > -\infty$$

But by the inequality in the Preface, $-b_{n_k} > \ln(1 - b_{n_k})$ term by term, so $\sum_{k=1}^\infty (-b_{n_k}) = -\infty \ge \sum_{k=1}^\infty \ln(1 - b_{n_k}) > -\infty$, a contradiction.

Therefore, $\sum_{k=1}^\infty \ln(1 - b_{n_k}) = -\infty$, and so

$$\prod_{n=1}^\infty (1 - b_n) = 0$$

Q.E.D.
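As a quick sanity check of the corollary, the sketch below computes partial products exactly with rational arithmetic. The two sequences are illustrative choices, not taken from the text above: $b_n = 1/(n+1)$ has a divergent sum, while $b_n = 1/(n+1)^2$ has a convergent one. The helper name `partial_product` is hypothetical.

```python
from fractions import Fraction


def partial_product(b, count):
    """Exact value of prod_{n=1}^{count} (1 - b(n))."""
    product = Fraction(1)
    for n in range(1, count + 1):
        product *= 1 - b(n)
    return product


def harmonic(n):
    # Illustrative choice: sum of 1/(n+1) diverges.
    return Fraction(1, n + 1)


def inverse_square(n):
    # Illustrative choice: sum of 1/(n+1)^2 converges.
    return Fraction(1, (n + 1) ** 2)


if __name__ == "__main__":
    for count in (10, 100, 1000):
        print(count,
              float(partial_product(harmonic, count)),
              float(partial_product(inverse_square, count)))
```

The first product telescopes to $1/(N+1)$ and tends to 0, as the corollary predicts. The second telescopes to $(N+2)/(2(N+1))$ and tends to $1/2$, which shows that the divergence of $\sum b_n$ cannot simply be dropped.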


The Essence of Mathematical Truth: Induction

Observe a linear-looking relation, fantasize wildly, then coldly examine whether it is truly valid.

Given $X_1$ and the recursive formula:

$$X_{n+1} = X_n + \beta_n(\xi_n - X_n) = (1 - \beta_n)X_n + \beta_n \xi_n$$

Show that

$$X_{n+1} = \sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j}^{n-1} (1 - \beta_{i+1}) + X_1 \prod_{i=1}^n (1 - \beta_i)$$

Proof:

  1. Base case: $n = 1$

    $$X_2 = (1 - \beta_1)X_1 + \beta_1 \xi_1 = \xi_1 \beta_1 + X_1 (1 - \beta_1)$$

    holds.

  2. Inductive step: assume the formula is true for $n$, and prove it for $n+1$:

    $$X_{n+2} = (1 - \beta_{n+1})X_{n+1} + \beta_{n+1} \xi_{n+1}$$

    Plug in the inductive hypothesis:

    $$= (1 - \beta_{n+1})\left[\sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j}^{n-1} (1 - \beta_{i+1}) + X_1 \prod_{i=1}^n (1 - \beta_i)\right] + \beta_{n+1} \xi_{n+1}$$

    The factor $(1 - \beta_{n+1})$ extends each product by one term, and $\beta_{n+1} \xi_{n+1}$ is the $j = n+1$ term of the sum, whose product is empty and equals 1:

    $$= \sum_{j=1}^{n+1} \xi_j \beta_j \prod_{i=j}^{n} (1 - \beta_{i+1}) + X_1 \prod_{i=1}^{n+1} (1 - \beta_i)$$

  3. By induction, the formula holds for all positive integers $n$. Q.E.D.
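The closed form can also be checked numerically against the recursion. The following is a minimal sketch with arbitrary test values; the function names `recursive_x` and `closed_form_x` are hypothetical, and the lists are 0-indexed, so `betas[0]` plays the role of $\beta_1$.

```python
import random


def recursive_x(x1, betas, xis):
    """Iterate X_{n+1} = (1 - beta_n) X_n + beta_n xi_n, starting from X_1."""
    x = x1
    for beta, xi in zip(betas, xis):
        x = (1 - beta) * x + beta * xi
    return x


def closed_form_x(x1, betas, xis):
    """Evaluate sum_j xi_j beta_j prod_{i>j} (1 - beta_i) + X_1 prod_i (1 - beta_i)."""
    n = len(betas)
    total = 0.0
    for j in range(n):
        weight = betas[j]
        for i in range(j + 1, n):
            weight *= 1 - betas[i]
        total += xis[j] * weight
    decay = 1.0
    for beta in betas:
        decay *= 1 - beta
    return total + x1 * decay


if __name__ == "__main__":
    rng = random.Random(0)
    betas = [rng.uniform(0.0, 0.99) for _ in range(20)]
    xis = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    print(recursive_x(3.0, betas, xis), closed_form_x(3.0, betas, xis))
```

Both functions return $X_{n+1}$ for $n$ equal to the length of the input lists, and they should agree up to floating-point rounding.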


Now, let's relax for a while --- it's movie time.


1. Definition of Action Replay Process

Consider an $n$-step finite MDP with a possibly varying learning rate $\alpha$: in step $i$, the agent is in state $x_i$, takes action $a_i$, receives a random reward $r_i$, and transitions to a new state $y_i$.

The Action Replay Process (ARP) is a re-examination of state $x$ and action $a$ within a given MDP.

Suppose we focus on state $x$ and action $a$, and consider an MDP consisting of $n$ steps.

We add a step 0 in which the agent immediately terminates and receives reward $Q_0(x,a)$.

During steps 1 to $n$, due to the randomness of the MDP, the agent may take action $a$ in state $x$ at time steps $1 \le n^{i_1}, n^{i_2}, \ldots, n^{i_*} \le n$.

If action $a$ is never taken at $x$ in this episode, the only option left is step 0.

When $i_* \ge 1$, to determine ARP's next reward and state, we sample an index $n^{i_e}$ as follows:

$$n^{i_e} = \begin{cases} n^{i_*}, & \text{with probability } \alpha_{n^{i_*}} \\ n^{i_{*-1}}, & \text{with probability } (1 - \alpha_{n^{i_*}})\,\alpha_{n^{i_{*-1}}} \\ \vdots \\ 0, & \text{with probability } \prod_{i=1}^{i_*}(1 - \alpha_{n^i}) \end{cases}$$

Then, after one ARP step, the state $\langle x, n \rangle$ transitions to $\langle y_{n^{i_e}}, n^{i_e} - 1 \rangle$, and the reward is $r_{n^{i_e}}$.

Clearly, $n^{i_e} - 1 < n$, so the level strictly decreases at every step and ARP terminates with probability 1. Thus, ARP is a finite process almost surely.

To summarize, the core transition formula is:

$$\langle x, n \rangle \overset{a}{\rightarrow} \langle y_{n^{i_e}}, n^{i_e} - 1 \rangle, \quad \text{reward } r_{n^{i_e}}$$
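The sampling rule above can be sketched in a few lines. This is only an illustration under stated assumptions: `visit_steps` is a hypothetical increasing list of the steps at which $(x, a)$ was taken, `alpha` is a hypothetical mapping from step to learning rate, and neither name comes from the text.

```python
import random


def replay_index_distribution(visit_steps, alpha):
    """Return {index: probability} for the replayed index n^{i_e}.

    The most recent visit is accepted with probability alpha; otherwise
    we fall through to the previous visit, and finally to step 0.
    """
    distribution = {}
    stay = 1.0  # probability that every later visit was rejected
    for step in reversed(visit_steps):
        distribution[step] = stay * alpha[step]
        stay *= 1.0 - alpha[step]
    distribution[0] = stay
    return distribution


def sample_replay_index(visit_steps, alpha, rng):
    """Draw one replayed index by walking back through the visits."""
    for step in reversed(visit_steps):
        if rng.random() < alpha[step]:
            return step
    return 0


if __name__ == "__main__":
    visits = [2, 5, 7]
    rates = {2: 0.5, 5: 0.25, 7: 0.5}
    print(replay_index_distribution(visits, rates))
    print(sample_replay_index(visits, rates, random.Random(0)))
```

Walking backwards and accepting each visit with its own learning rate reproduces the probabilities in the case distinction, and the probabilities add up to 1 by construction.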


2. Properties of the Action Replay Process

We now examine ARP's properties, particularly in comparison to MDPs. Given an MDP rule and a (non-terminating) instance, we can construct an ARP accordingly.

Property 1

$$\forall n, x, a,\quad Q^*_{ARP}(\langle x, n \rangle, a) = Q_n(x, a)$$

Proof:

Using mathematical induction on $n$:

  1. Base case $n = 1$:

    • If the MDP did not take $a$ at $x$ in step 1, ARP gives reward $Q_0(x,a) = 0 = Q_1(x,a)$.

    • If $(x,a) = (x_1, a_1)$, then:

      $$Q^*_{ARP}(\langle x, 1 \rangle, a) = \alpha_1 r_1 + (1 - \alpha_1) Q_0(x,a) = \alpha_1 r_1 = Q_1(x,a)$$

  2. Inductive step: Assume $Q^*_{ARP}(\langle x, k-1 \rangle, a) = Q_{k-1}(x,a)$, and show the claim for $k$:

    • If $(x,a) \ne (x_k, a_k)$, then:

      $$Q_k(x,a) = Q_{k-1}(x,a) = Q^*_{ARP}(\langle x, k \rangle, a)$$

    • If $(x,a) = (x_k, a_k)$, then:

      $$Q^*_{ARP}(\langle x, k \rangle, a) = \alpha_k \left[ r_k + \gamma \max_{a'} Q_{k-1}(y_k, a') \right] + (1 - \alpha_k) Q_{k-1}(x,a) = Q_k(x,a)$$

  3. Therefore, $Q^*_{ARP}(\langle x, n \rangle, a) = Q_n(x,a)$. Q.E.D.
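Property 1 can be checked numerically on a recorded trajectory. The sketch below is an illustration under assumptions that are not in the text: the trajectory is a list of hypothetical tuples `(x, a, r, y, alpha)` drawn at random, $Q_0 \equiv 0$, and the ARP value is computed by averaging over the replay-index distribution from the definition. All function names are made up for this example.

```python
import random
from functools import lru_cache


def q_learning_tables(traj, n_states, n_actions, gamma):
    """Return [Q_0, Q_1, ..., Q_n] produced by the Q-learning update, with Q_0 = 0."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    tables = [[row[:] for row in q]]
    for x, a, r, y, alpha in traj:
        target = r + gamma * max(q[y])  # q still holds Q_{k-1} here
        q[x][a] = (1 - alpha) * q[x][a] + alpha * target
        tables.append([row[:] for row in q])
    return tables


def make_arp_q(traj, n_actions, gamma):
    """Build a memoized function computing Q*_ARP(<x, n>, a)."""

    @lru_cache(maxsize=None)
    def arp_q(x, n, a):
        value, stay = 0.0, 1.0  # stay: probability all later visits were rejected
        for k in range(n, 0, -1):
            xk, ak, rk, yk, alpha_k = traj[k - 1]
            if (xk, ak) == (x, a):
                v_next = max(arp_q(yk, k - 1, b) for b in range(n_actions))
                value += stay * alpha_k * (rk + gamma * v_next)
                stay *= 1 - alpha_k
        return value  # step 0 would add stay * Q_0(x, a), which is 0 here

    return arp_q


def random_trajectory(n_steps, n_states, n_actions, rng):
    """Hypothetical recorded steps (x, a, r, y, alpha); only used as test data."""
    return [(rng.randrange(n_states), rng.randrange(n_actions), rng.random(),
             rng.randrange(n_states), rng.uniform(0.1, 0.9))
            for _ in range(n_steps)]


if __name__ == "__main__":
    traj = random_trajectory(30, 3, 2, random.Random(0))
    tables = q_learning_tables(traj, 3, 2, 0.9)
    arp_q = make_arp_q(traj, 2, 0.9)
    gap = max(abs(arp_q(x, n, a) - tables[n][x][a])
              for n in range(31) for x in range(3) for a in range(2))
    print("largest difference:", gap)
```

The two computations should agree up to rounding for every level $n$, state and action, which is exactly what the induction proves.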


Property 2

In the ARP, for all $l, s, \epsilon > 0$, there exists $h > l$ such that for all $n_1 > h$,

$$P(n_{s+1} < l) < \epsilon$$

Proof:

Let us first consider the final step, that is, the case where the sampled index satisfies $n^{i_e} < n^{i_l}$ or lies even lower.

In the ARP, starting from $\langle x, h \rangle$ and taking action $a$, the probability of reaching a level lower than $l$ in one step is:

$$\begin{aligned}
\sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_h} (1 - \alpha_{n^k}) \right]
&= \sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_l - 1} (1 - \alpha_{n^k}) \right] \left[ \prod_{i=i_l}^{i_h} (1 - \alpha_{n^i}) \right] \\
&= \left[ \prod_{i=i_l}^{i_h} (1 - \alpha_{n^i}) \right] \sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_l - 1} (1 - \alpha_{n^k}) \right]
\end{aligned}$$

But note that, with the convention $\alpha_{n^0} = 1$ for the added step 0, the remaining sum telescopes:

$$\sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_l - 1} (1 - \alpha_{n^k}) \right] = 1$$

Therefore, using $1 - x < e^{-x}$ from the Preface,

$$\sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_h} (1 - \alpha_{n^k}) \right] = \prod_{i=i_l}^{i_h} (1 - \alpha_{n^i}) < e^{-\sum_{i=i_l}^{i_h} \alpha_{n^i}}$$

As long as the learning rates along the visits to $(x, a)$ have a divergent sum, $\sum_i \alpha_{n^i} = +\infty$, then as $h \to \infty$:

$$\sum_{j=0}^{i_l - 1} \left[ \alpha_{n^j} \prod_{k=j+1}^{i_h} (1 - \alpha_{n^k}) \right] = \prod_{i=i_l}^{i_h} (1 - \alpha_{n^i}) < e^{-\sum_{i=i_l}^{i_h} \alpha_{n^i}} \to 0$$
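The exponential bound on the product is easy to check numerically. The sketch below uses $\alpha_i = 1/(i+1)$ purely as an illustrative learning-rate schedule with a divergent sum; the function name `product_and_bound` is hypothetical.

```python
import math


def product_and_bound(alphas):
    """Return (prod (1 - alpha_i), exp(-sum alpha_i)) for the given rates."""
    product = math.prod(1 - alpha for alpha in alphas)
    bound = math.exp(-sum(alphas))
    return product, bound


if __name__ == "__main__":
    for count in (10, 100, 1000, 10000):
        # Illustrative schedule: alpha_i = 1 / (i + 1), whose sum diverges.
        alphas = [1.0 / (i + 1) for i in range(1, count + 1)]
        product, bound = product_and_bound(alphas)
        print(count, product, bound)
```

For this schedule the product equals $1/(N+1)$, which stays below the exponential bound, and both tend to 0 as more visits are included.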

Moreover, since the MDP is finite, we have:

$$\forall l_j \in \mathbb{N}^*,\ \forall \eta_j > 0,\ \exists M_j > 0,\ \forall n_j > M_j,\ \forall X_j, a_j,$$

starting from $\langle X_j, n_j \rangle$ and taking action $a_j$,

$$P(n_{j+1} \ge l_j) \ge 1 - \eta_j$$

Using the index $j$, we recursively apply this conclusion from step $s$ back to step 1. Then, the probability of reaching at least $l = l_s$ is at least:

$$\prod_{j=1}^{s} (1 - \eta_j) = 1 - \epsilon$$

where the $\eta_j$ are chosen so that the product equals $1 - \epsilon$, $n_{j+1} \ge l_j$, and $\langle X_{j+1}, n_{j+1} \rangle$ is reached from $\langle X_j, n_j \rangle$ after executing $a_j$. Q.E.D.

Now, define:

$$P_{xy}^{(n)}[a] = \sum_{m=1}^{n-1} P_{\langle x,n \rangle, \langle y,m \rangle}^{ARP}[a]$$

Lemma:

Let $\{\xi_n\}$ be a sequence of bounded random variables with expectation $\mathfrak{E}$, and let $0 \le \beta_n < 1$ satisfy $\sum_{i=1}^{\infty} \beta_i = +\infty$ and $\sum_{i=1}^{\infty} \beta_i^2 < +\infty$.

Define the sequence $X_{n+1} = X_n + \beta_n(\xi_n - X_n)$. Then:

$$P\left( \lim_{n \to \infty} X_n = \mathfrak{E} \right) = 1$$
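A small simulation illustrates what the lemma claims; it is not a proof. The sketch assumes i.i.d. samples $\xi_n$ uniform on $[0, 1]$ (so $\mathfrak{E} = 0.5$) and the schedule $\beta_n = 1/(n+1)$, which satisfies both summability conditions. These choices and the function name `stochastic_average` are assumptions made for the example only.

```python
import random


def stochastic_average(x1, n_steps, rng):
    """Iterate X_{n+1} = X_n + beta_n (xi_n - X_n) with beta_n = 1 / (n + 1)."""
    x = x1
    for n in range(1, n_steps + 1):
        beta = 1.0 / (n + 1)   # sum of beta diverges, sum of beta^2 converges
        xi = rng.random()      # assumed: i.i.d. uniform on [0, 1], mean 0.5
        x += beta * (xi - x)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    for n_steps in (100, 10_000, 1_000_000):
        print(n_steps, stochastic_average(10.0, n_steps, rng))
```

Even from a starting value far from the mean, the iterates drift towards 0.5, because the influence of $X_1$ is damped by $\prod_i (1 - \beta_i)$, which goes to 0.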

My attempt:

$$X_{n+1} = X_n + \beta_n(\xi_n - X_n) = (1 - \beta_n) X_n + \beta_n \xi_n$$

By induction, we obtain:

$$X_{n+1} = \sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j}^{n-1} (1 - \beta_{i+1}) + X_1 \prod_{i=1}^{n} (1 - \beta_i)$$

By the corollary of the fundamental theorem above:

$$\prod_{i=1}^{\infty} (1 - \beta_i) = 0$$

Hence:

$$\begin{aligned}
\lim_{n \to \infty} X_n &= \lim_{n \to \infty} \sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j+1}^{n} (1 - \beta_i) = \frac{ \lim_{n \to \infty} \sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j+1}^{n} (1 - \beta_i) }{1 - 0} \\
&= \frac{ \lim_{n \to \infty} \sum_{j=1}^{n} \xi_j \beta_j \prod_{i=j+1}^{n} (1 - \beta_i) }{1 - \prod_{i=1}^{\infty} (1 - \beta_i)} = \lim_{n \to \infty} \sum_{j=1}^{n} \xi_j \frac{ \beta_j \prod_{i=j+1}^{n} (1 - \beta_i) }{1 - \prod_{i=1}^{n} (1 - \beta_i)}
\end{aligned}$$

The weights in the last sum are nonnegative and add up to 1 for each $n$, since $\sum_{j=1}^{n} \beta_j \prod_{i=j+1}^{n} (1 - \beta_i) = 1 - \prod_{i=1}^{n} (1 - \beta_i)$, so the right-hand side is a weighted average of the $\xi_j$.

Property 3

$$P\left\{ \lim_{n \to \infty} P_{xy}^{(n)}[a] = P_{xy}[a] \right\} = 1, \quad P\left\{ \lim_{n \to \infty} \mathfrak{R}_{x}^{(n)}(a) = \mathfrak{R}_{x}(a) \right\} = 1$$
