Effective convergence of coranks of random R\'edei matrices

We give effective estimates for the $l^1$-distance between the corank distribution of $r \times r$ R\'edei matrices and the measure predicted by the Cohen--Lenstra heuristics. To this end we pinpoint a class of stochastic processes, which we call $c$-transitioning. These stochastic processes are well approximated by Markov processes, and we give an effective ergodic theorem for such processes. With this tool we make effective a theorem of Gerth \cite{Gerth} that initiated the study of the Cohen--Lenstra heuristics for $p = 2$. Gerth's work triggered a series of developments that has recently culminated in the breakthrough of Smith \cite{Smith}. The present work will be used in upcoming work of the authors on further applications of Smith's ideas to the arithmetic of quadratic fields. To this end we extend our main result to several other families of matrix spaces that occur in the study of integral points on the equation $x^2 - dy^2 = l$ as $d$ varies.


Introduction
In 1983 Cohen and Lenstra [2] put forth a systematic set of conjectures on the distribution of the $p$-Sylow of class groups of quadratic fields, for an odd prime $p$. These conjectures postulate that these $p$-Sylows should behave as randomly as possible in the following natural sense. Each finite abelian $p$-group $A$ is conjectured to appear as the $p$-Sylow of the class group of a quadratic field with probability proportional to $1/\#\mathrm{Aut}(A)$ in the imaginary case. A substantial amount of effort has subsequently been invested in detecting a possible source of randomness in the behavior of class groups of quadratic fields. In 1987 Friedman and Washington [8] considered the case of quadratic function fields and observed that the Cohen-Lenstra prediction could be reformulated in terms of random matrices. Indeed, they suggested as a potential source of randomness the behavior of the Frobenius operator acting on the Tate module of the corresponding hyperelliptic curve.
This led to a reinterpretation of the Cohen-Lenstra heuristics in terms of corank statistics of large random matrices, a point of view that has been further explored in the work of Wood [24] and proved fruitful in her recent extension of the conjectures of Cohen and Lenstra to a non-abelian setting [23]. Incidentally, the matrices occurring over function fields are constrained to respect a symplectic pairing (the Weil pairing), but it can be shown that this does not affect the limiting distribution, a feature that also plays a major role in the present work. In the context of quadratic function fields Ellenberg, Venkatesh and Westerland [4] were able to make substantial progress by relating these conjectures to the homological stability of Hurwitz spaces. They then used topological methods to make progress on the latter.
For quadratic number fields the situation is currently much more mysterious if $p$ is an odd prime, apart from the average 3-torsion of quadratic fields [1,3]. As we shall now explain, the story for $p = 2$ is entirely different.
In 1801 Gauss [9] gave a description of the 2-torsion of the narrow class group $\mathrm{Cl}(\mathbb{Q}(\sqrt{d}))$ of a quadratic field $\mathbb{Q}(\sqrt{d})$. In particular Gauss showed that $\dim_{\mathbb{F}_2} \mathrm{Cl}(\mathbb{Q}(\sqrt{d}))[2] = \omega(\Delta_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}}) - 1$, where $\omega(\cdot)$ denotes the number of distinct prime divisors and $\Delta_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}}$ denotes the discriminant of $\mathbb{Q}(\sqrt{d})$. Recently, the authors [16] investigated the 2-torsion of the narrow class group of multiquadratic fields $\mathbb{Q}(\sqrt{d_1}, \ldots, \sqrt{d_n})$. Gauss's description readily shows that the 2-Sylow of the class group of a quadratic field is not a random finite abelian 2-group in the sense of Cohen-Lenstra. A natural guess, which can be found implicitly in [10] and explicitly in [12], is that instead the group $2\mathrm{Cl}(\mathbb{Q}(\sqrt{d}))[2^\infty]$ is a random finite abelian 2-group.
In 1984 Gerth [10] gave the first evidence for the correctness of this guess. During the proof he made use of an explicit description of $2\mathrm{Cl}(\mathbb{Q}(\sqrt{d}))[4]$ due to Rédei [19]. Rédei was able to relate the dimension of the space $2\mathrm{Cl}(\mathbb{Q}(\sqrt{d}))[4]$ to the corank of a certain $r \times r$ matrix constructed out of the mutual Legendre symbols of the prime divisors of $d$, where This matrix is now commonly referred to as the Rédei matrix of the field $\mathbb{Q}(\sqrt{d})$. Due to quadratic reciprocity the Rédei matrix is far from being a random $r \times r$ matrix. Gerth showed that when one fixes $r$, the Rédei matrices of $\mathbb{Q}(\sqrt{d})$, as $d$ varies, equidistribute in the space of $r \times r$ matrices satisfying the constraints of quadratic reciprocity. He then showed that the corresponding densities converge to the limiting distribution predicted by the above guess as $r$ goes to infinity, despite the constrained shape of the matrices. Gerth also extended his work to cyclic degree $p$ extensions [11] and formulated a conjecture for the distribution of their $p$-Sylows, which is, in case $p = 2$, precisely the modified version of the Cohen-Lenstra conjectures mentioned above. This then became known as Gerth's conjecture or the Cohen-Lenstra-Gerth heuristic. Despite Gerth's progress, the distribution of the 4-torsion as $d$ varies among all squarefree integers was still an open problem. In 2006 Fouvry and Klüners [5,6], using a different approach, were able to solve this problem and showed that the 4-rank of class groups of quadratic fields has the limiting distribution predicted by the Cohen-Lenstra-Gerth heuristic. Instead of directly trying to handle the randomness of the Rédei matrices, they expressed the 4-rank of $\mathbb{Q}(\sqrt{d})$ as a sum of Legendre symbols. They proved oscillation of this sum by using ideas introduced in the seminal work of Heath-Brown on 2-Selmer groups [13].
On the one hand this method has proved to be very robust for the 4-torsion and analogous statistics: a similar line of attack has subsequently been used for cyclic degree $p$ fields [14], for ray class groups of imaginary quadratic fields [18] and very recently by Fouvry and the authors [7] for the 4-rank of $\mathrm{Cl}(\mathbb{Q}(i, \sqrt{d}))$. On the other hand the situation for the 8-torsion and higher powers remained mysterious until very recently.
In 2017 Smith [21], improving on his own earlier work on the 8-torsion [20], was able to prove Gerth's conjecture for imaginary quadratic fields in its entirety. In 2018 this was extended by the authors [15] to all cyclic degree $p$ extensions, conditional on GRH.
In Smith's work the description of the 4-torsion in terms of Rédei matrices again becomes central. In [21] he manages to prove that the Rédei matrices attached to squarefree integers $d$ are, for the purposes of the corank statistics, equidistributed in the space of all possible Rédei matrices, when one lets $d$ run through all squarefree integers. In this context he claims that the rate of convergence in the main result of Gerth [10, Theorem 4.3] can be made effective. The main result of the present work fills this gap in the literature by showing that this is indeed the case.
We prove effective convergence of the corank distribution of a large random Rédei matrix to the probability distribution predicted by Cohen-Lenstra-Gerth. For an integer $0 \leq \kappa \leq r$, let $X_r(\text{Rédei}, \kappa)$ be the random variable that computes the probability that a uniformly chosen $r \times r$ matrix $A$ with coefficients in $\mathbb{F}_2$ satisfying has corank equal to $j$. For an integer $r \geq 0$ denote by $\mu_r(\text{Rédei})$ the probability distribution on $\mathbb{Z}_{\geq 0}$ given by the corank of an $r \times r$ random Rédei matrix, i.e.
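Denote by $\pi_{C.L.} : \mathbb{Z}_{\geq 0} \to \mathbb{R}_{\geq 0}$ the distribution of the rank of the 2-torsion of random abelian 2-groups, in the sense of Cohen-Lenstra-Gerth. Writing $\eta_k := \prod_{i=1}^{k} (1 - 2^{-i})$ for $k \in \mathbb{Z}_{\geq 0} \cup \{\infty\}$, we have the explicit formula $\pi_{C.L.}(j) = \frac{\eta_\infty}{2^{j^2} \eta_j^2}$ (1.1), which equals the probability that a uniformly chosen random $r \times r$ matrix with coefficients in $\mathbb{F}_2$ has corank $j$, as $r$ goes to infinity.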
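The limiting corank law can be evaluated numerically. The following is a minimal sketch, assuming the standard closed form $2^{-j^2} \prod_{i > j} (1 - 2^{-i}) / \prod_{i \leq j} (1 - 2^{-i})$ for the limiting corank distribution of random square matrices over $\mathbb{F}_2$; the helper names `eta` and `pi_cl` and the truncation of the infinite product at 200 factors are choices made here for illustration and are not taken from the paper.

```python
def eta(k, terms=200):
    """Return prod_{i=1}^{k} (1 - 2^{-i}); k=None stands for the infinite
    product, truncated after `terms` factors (an illustrative choice)."""
    upper = terms if k is None else k
    value = 1.0
    for i in range(1, upper + 1):
        value *= 1.0 - 2.0 ** (-i)
    return value


def pi_cl(j):
    """Assumed closed form of the limiting probability of corank j."""
    return eta(None) / (2.0 ** (j * j) * eta(j) ** 2)


if __name__ == "__main__":
    for j in range(5):
        print(j, pi_cl(j))
```

Under this assumption the mass is concentrated on small coranks: the first three values are roughly 0.289, 0.578 and 0.128.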
As explained in Remark 3, Theorem 1.1 is an effective version of Gerth's main result [10, Theorem 4.3]. The method of proof of Theorem 1.1, which we summarize below, can be adapted to other matrix spaces. For instance, in Theorem 4.8 we extend this result to the case of spaces of Rédei matrices occurring in the study of the solubility of the equation $x^2 - dy^2 = l$. Here $l$ is fixed and $d$ varies over squarefree integers divisible by $l$.
We remark that a similar analysis is not required in the case of cyclic degree $p$ fields, where $p$ is an odd prime. The difference is explained by quadratic reciprocity, which strongly constrains the space of Rédei matrices of quadratic fields. However, the key point, already present in Gerth's work [10], is the following: for most $r \times r$ Rédei matrices $A$, when one adds a random row and column to $A$ in such a way that the resulting $(r + 1) \times (r + 1)$ matrix is still Rédei, the corank transitions according to the same probabilistic rules as for a random matrix. At this point Gerth proceeds with a detailed analysis of the transition rules for the exceptions and constructs several explicit Markov processes that allow him to obtain the desired limiting distribution by an intricate approximation argument.
We completely bypass this intricate step and take the following route instead. We pinpoint the general class of $c$-transitioning processes, which are well approximated by a Markov process in a precise quantitative sense. Then we give an effective ergodic theorem for such processes in Theorem 3.1 with uniform error term. After that, we rapidly deduce Theorem 1.1 and several analogues.

An ergodic theorem for Markov chains
We shall need to work with Markov chains of considerable generality to prove our main theorems. Fortunately, the relevant Markov chains are still simple enough that we shall not need too much machinery from measure theory. Let $\Omega$ be a countable set, which we view as a measurable space by equipping it with the $\sigma$-algebra consisting of all subsets of $\Omega$. In this way we can think of measures simply as functions $\Omega \to \mathbb{R}_{\geq 0}$ and we shall often do so implicitly.
For every $x \in \Omega$, there is a natural random variable $X(x)$ on $\Omega$, which assigns probability 1 to $x$ and 0 to all other points. Let $P : \Omega \times \Omega \to \mathbb{R}_{\geq 0}$ be a function such that $\{x : P(x, y) > 0\}$ is finite for all $y \in \Omega$, $\{y : P(x, y) > 0\}$ is finite for all $x \in \Omega$ and $\sum_{y \in \Omega} P(x, y) = 1$ for all $x \in \Omega$. We can think of $P$ as an infinite matrix with only finitely many non-zero entries in each row and column such that the sum of the entries in every row is 1, and we call such a $P$ a transition matrix. A probability measure on $\Omega$, viewed as a row vector, can be multiplied on the right by $P$ to obtain another probability measure. The Markov chains that we shall encounter will start with a random variable $X(x)$ for some $x \in \Omega$, and the next random variables are obtained by repeated application of the same $P$ satisfying the assumptions above. For $A \subseteq \Omega$, we write $P^n(x, A)$ for the probability that the Markov chain is in $A$ after $n$ steps, assuming that the Markov chain starts in $x$, i.e. with the random variable $X(x)$.
Let $\psi : \Omega \to \mathbb{R}_{\geq 0}$ be a measure such that $\psi(\Omega) > 0$. We say that $P$ is $\psi$-irreducible if for every $x \in \Omega$ and every $A \subseteq \Omega$ with $\psi(A) > 0$, there is some positive integer $n$ such that $P^n(x, A) > 0$. We say that $P$ is aperiodic if $\gcd(\{n \geq 1 : P^n(x, x) > 0\}) = 1$ for all $x \in \Omega$ and extremely aperiodic if $P(x, x) > 0$ for all $x \in \Omega$.
Certainly, if $P$ is extremely aperiodic, then it is also aperiodic. We say that a function for all but finitely many $x \in \Omega$ and furthermore for every real number $r$. Here $PV$ denotes the function obtained by right multiplying $P$ with $V$.
Theorem 2.1. Let $\Omega$ be a countable set, and let $P : \Omega \times \Omega \to \mathbb{R}_{\geq 0}$ be a transition matrix. Assume that $P$ is $\psi$-irreducible and extremely aperiodic. Then if $V : \Omega \to \mathbb{R}_{\geq 1}$ is a drift function, there is a unique probability measure $\pi : \Omega \to \mathbb{R}_{\geq 0}$ such that $\pi P = \pi$. Furthermore, there are constants $R > 0$ and $\rho < 1$ such that for every $x \in \Omega$.
Proof. Let $C$ be the finite set of exceptions to the inequality Since our Markov chain is extremely aperiodic, it follows that any finite subset of $\Omega$ is petite, see Section 5.5 of [17] for the definition of petite. In particular, $C$ is petite. Then condition (iii) of Theorem 15.0.1 in [17] is satisfied (in their notation $\Delta V(x) := PV(x) - V(x)$). It follows from [17, Theorem 15.0.1] that $\pi$ exists, and that there are constants $R > 0$ and $r > 1$ such that Here $\| \cdot \|_V$ is by definition Note that equation (2.1) implies that there are $R' > 0$ and $\rho < 1$ such that This proves the theorem by choosing $f$ to be the function with $|f| = 1$ and $f(y) > 0$ if and only if $P^n(x, y) - \pi(y) > 0$.
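The geometric convergence to equilibrium asserted by Theorem 2.1 can be observed on a toy example. The sketch below is not taken from the paper: it uses a hypothetical three-state lazy random walk `P_TOY`, which is irreducible and extremely aperiodic, with stationary measure `PI_TOY`. Measures are treated as row vectors, as in the text.

```python
def step(mu, P):
    """One step of the chain on measures: return the row vector mu * P."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def l1_distance_to_equilibrium(P, pi, start, n_steps):
    """l^1 distance between the law after n_steps (started at the point
    mass in `start`) and the measure pi."""
    mu = [1.0 if x == start else 0.0 for x in range(len(P))]
    for _ in range(n_steps):
        mu = step(mu, P)
    return sum(abs(a - b) for a, b in zip(mu, pi))


# Hypothetical toy chain: every diagonal entry is positive (extremely aperiodic).
P_TOY = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
PI_TOY = [0.25, 0.5, 0.25]  # satisfies PI_TOY * P_TOY = PI_TOY

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, l1_distance_to_equilibrium(P_TOY, PI_TOY, 0, n))
```

For this particular chain the eigenvalues are 1, 1/2 and 0, so the distance from the point mass at state 0 halves at every step; this plays the role of the factor $\rho^n$ in the theorem.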

The equilibrium for almost transitioning processes
Let $Q$ be a transition matrix on $\mathbb{Z}_{\geq 0}$. Let $d$ be a positive real number. We say that $Q$ is $d$-driftable in case $x \mapsto d^x$, viewed as a map from $\mathbb{Z}_{\geq 0}$ to $\mathbb{R}_{\geq 0}$, is a drift function for $Q$. We further assume that $Q$ is $\psi$-irreducible for some non-trivial measure $\psi$ on $\mathbb{Z}_{\geq 0}$ and that $Q$ is extremely aperiodic. Observe that multiplication by $Q$, $v \mapsto vQ$, gives a bounded linear operator on $l^1(\mathbb{Z}_{\geq 0})$ with $\|Q\|_1 = 1$. Indeed, given $v \in l^1(\mathbb{Z}_{\geq 0})$ with $\|v\|_1 = 1$, we have $\|vQ\|_1 \leq \|wQ\|_1$, writing $w$ for the vector obtained from $v$ by taking absolute values componentwise, since the entries of $Q$ are non-negative. On the other hand $\|wQ\|_1 = 1$ because $Q$ is a transition matrix. Let $(A, \mathbb{P})$ be a probability space (the $\sigma$-algebra will not play a role and hence we do not introduce notation for it). Let $c$ be a real number in $(0, 1)$. Suppose that we have for each integer $i \geq 0$ a random variable $X_i : A \to \{0, \ldots, i\}$.
We say that the sequence $\{X_i\}_{i \in \mathbb{Z}_{\geq 0}}$ is $c$-transitioning with $Q$ in case there exists a sequence of random variables for each $s \in \{0, \ldots, i\}$ and $j \in \{0, \ldots, i + 1\}$. For a random variable $X : A \to \mathbb{Z}_{\geq 0}$, write $\mu_X$ for the vector in $l^1(\mathbb{Z}_{\geq 0})$ given by the distribution of $X$, that is $\mu_X(j) := \mathbb{P}(X = j)$.
Theorem 3.1. Let $d \in \mathbb{R}_{\geq 0}$ and let $c \in (0, 1)$. Let $Q$ be a $d$-driftable transition matrix on $\mathbb{Z}_{\geq 0}$. Let $\{X_i\}_{i \in \mathbb{Z}_{\geq 0}}$ be a sequence of random variables, with $X_i$ taking values in $\{0, \ldots, i\}$. Suppose that $\{X_i\}_{i \in \mathbb{Z}_{\geq 0}}$ is $c$-transitioning with $Q$. Then there exists a unique probability measure $\pi$ on $\mathbb{Z}_{\geq 0}$ with $\pi Q = \pi$ and constants and for each $j \in \mathbb{Z}_{\geq 0}$. A simple calculation, using that our process transitions correctly when Applying this identity iteratively we find that for each $i \in \mathbb{Z}_{\geq 0}$ and each $h$ Let us now pick $\epsilon$ in $(0, 1)$ and write $g(r) := \lfloor \epsilon \cdot r \rfloor$ and $h(r) := r - g(r)$. By the triangle inequality we have Thanks to Theorem 2.1, there are $R \in \mathbb{R}_{\geq 0}$ and $\rho \in (0, 1)$ depending only on $Q$ and $d$ such that the first summand is bounded by Now choose $\epsilon \in (0, 1)$ so that $d^{\epsilon} \rho^{1 - \epsilon} < 1$. This choice makes the expression smaller than $C \cdot (\rho')^r$ for some $\rho'$ depending only on $Q$ and $d$. Now we focus on Using that $\|Q\|_1 = 1$ and that we find the upper bound This gives the desired conclusion.
Remark 1. From the proof, it is clear that we reach the same conclusion as in Theorem 3.1 on $\|\mu_{X_r} - \pi\|_1$ as long as the definition of $c$-transitioning holds for all indices up to $r$, the indices after $r$ being irrelevant for the estimate at stage $r$. This point will be important in the proof of Theorem 4.6.

Corank distributions of matrix spaces
We now study the rank distribution in a number of matrix spaces. Such spaces, often occurring in arithmetic applications, arise by randomly adding to a given matrix a row and a column subject to certain rules. We formalize this notion as follows. Write $\mathbb{F}_2^{[n]}$ for the free $\mathbb{F}_2$-vector space over $[n]$. For a subset $S \subseteq [n] := \{1, \ldots, n\}$, we write $\pi_S$ for the natural projection map.
Definition 4.1. A rule is a product space where $S_i$ is a non-empty subset of We can view each $S_i$ as a probability space with the uniform probability distribution and as a discrete topological space. Thanks to the Kolmogorov extension theorem [22, Theorem 2.4.3], this naturally endows a rule $S$ with the structure of a probability space (where the $\sigma$-algebra is the one generated by the open sets of $S$, viewed as a profinite space).
We shall often use the following construction. To each point $B \in S$ one can naturally attach an infinite matrix together with its sequence of top left minors $\omega_i(B)$. This gives a sequence of random variables given by $B \mapsto \text{co-rk}(\omega_i(B))$.
Below we consider several different rules $S$. These rules will give rather different matrix spaces. However, the rules have in common that we are able to find a generic class of matrices such that the corank transitions precisely as in the simplest case, namely the class of random matrices. This is given by the transition matrix $Q_{C.L.}(j, j - 1) = (1 - 2^{-j})^2$, $Q_{C.L.}(j, j) = 2^{1 - j} - 3 \cdot 2^{-2j - 1}$, $Q_{C.L.}(j, j + 1) = 2^{-2j - 1}$ and zero otherwise. The matrix $Q_{C.L.}$ is $\psi$-irreducible for the function $\psi(x) = 1$, and extremely aperiodic. Furthermore, $Q_{C.L.}$ is 2-driftable with the exceptional states being $\{0, 1, 2\}$. This matrix will play the role of $Q$ in Theorem 3.1. The rule $S$ will play the role of $A$ from Section 3, and the variables $Z_i$ from that same section will precisely be the detector of genericity, which we define in each space. A direct computation using equation (1.1) shows that $\pi_{C.L.} Q_{C.L.} = \pi_{C.L.}$.
In this way the effective convergence will follow as a formal consequence of Theorem 3.1.
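The stationarity $\pi_{C.L.} Q_{C.L.} = \pi_{C.L.}$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's computation: it assumes that the corank of a random square matrix over $\mathbb{F}_2$ moves from $j$ to $j - 1$, $j$, $j + 1$ with probabilities $(1 - 2^{-j})^2$, $2^{1-j} - 3 \cdot 2^{-2j-1}$, $2^{-2j-1}$ (obtained here from a direct rank count), and that $\pi_{C.L.}(j) = 2^{-j^2} \prod_{i>j}(1 - 2^{-i}) / \prod_{i \leq j}(1 - 2^{-i})$. The function names and the truncation parameters are hypothetical.

```python
def eta(k, terms=200):
    """prod_{i=1}^{k} (1 - 2^{-i}); k=None means the truncated infinite product."""
    upper = terms if k is None else k
    value = 1.0
    for i in range(1, upper + 1):
        value *= 1.0 - 2.0 ** (-i)
    return value


def pi_cl(j):
    """Assumed limiting probability of corank j."""
    return eta(None) / (2.0 ** (j * j) * eta(j) ** 2)


def q_cl(j, k):
    """Assumed probability that the corank moves from j to k in one step."""
    if k == j - 1:
        return (1.0 - 2.0 ** (-j)) ** 2
    if k == j:
        return 2.0 ** (1 - j) - 3.0 * 2.0 ** (-2 * j - 1)
    if k == j + 1:
        return 2.0 ** (-2 * j - 1)
    return 0.0


def stationarity_defect(n_states=20):
    """Largest |(pi Q)(k) - pi(k)| over k < n_states, using states 0..n_states."""
    pi = [pi_cl(j) for j in range(n_states + 1)]
    return max(
        abs(sum(pi[j] * q_cl(j, k) for j in range(n_states + 1)) - pi[k])
        for k in range(n_states)
    )


if __name__ == "__main__":
    print(stationarity_defect())
```

Because the chain only moves between neighbouring states, the identity reduces to the detailed balance relation $\pi(j) Q(j, j + 1) = \pi(j + 1) Q(j + 1, j)$, which is what the numerical defect reflects.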

Rank transition in row-column extension of a matrix
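Let $n$ be a positive integer and let $A$ be an $n \times n$ matrix with coefficients in $\mathbb{F}_2$. We denote by $\langle \cdot, \cdot \rangle$ the standard inner product of two vectors in $\mathbb{F}_2^{[n]}$. A vector $w$ is in $\mathrm{Im}(A^T)$ if and only if it is in $\ker(A)^{\perp}$. Therefore, if $v$ is in $\mathrm{Im}(A)$ and $w$ is in $\mathrm{Im}(A^T)$, the set $\langle A^{-1}(v), w \rangle$ consists of a unique number, which by abuse of notation we also denote as 2 and $c \in \mathbb{F}_2$, we denote by $A(v, w, c)$ the matrix in Mat The following fact describes the corank transition $\text{co-rk}(A) \to \text{co-rk}(A(v, w, c))$.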
2 and $c \in \mathbb{F}_2$. Then one has the following: Proof. This follows from basic linear algebra, see Gerth [10].
We write $H_n$ for the vector in $\mathbb{F}_2^{[n]}$ with all entries equal to 1.
). Denote by $j := \text{co-rk}(A)$. Then we have the following.
(1) Picking $(v, w, c)$ uniformly at random in F (3) Picking $(v, c)$ uniformly at random in $\mathbb{F}_2^{[n]} \times \mathbb{F}_2$, the random variable $\text{co-rk}(A(v, v + H_n, c))$ takes the values $\{j - 1, j, j + 1\}$ with probability given respectively by Proof. This is a simple consequence of Proposition 4.2 and the next remark.
Remark 2. We have Indeed, this follows immediately after applying $\perp$ and using that $\mathrm{Im}(A^T) = \ker(A)^{\perp}$.
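For small $n$ the corank transition under a row-column extension can be checked exhaustively. The following is a minimal sketch under assumptions made here: matrices over $\mathbb{F}_2$ are stored as lists of integer bitmasks (bit $k$ of row $i$ is the entry in column $k$), the extension places one vector as a new last column, the other as a new last row and $c$ in the corner, and the function names are hypothetical. The exact placement convention of the paper is not recoverable from the text above, but the counts are symmetric in the two vectors.

```python
def rank_f2(rows):
    """Rank over F_2 of a matrix given as a list of row bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def extend(rows, v, w, c):
    """Append v as a new last column, w as a new last row, c as the corner."""
    n = len(rows)
    new_rows = [rows[i] | (((v >> i) & 1) << n) for i in range(n)]
    new_rows.append(w | (c << n))
    return new_rows


def transition_counts(rows):
    """Count, over all (v, w, c), the corank of the extended matrix."""
    n = len(rows)
    j = n - rank_f2(rows)
    counts = {j - 1: 0, j: 0, j + 1: 0}
    for v in range(2 ** n):
        for w in range(2 ** n):
            for c in (0, 1):
                counts[(n + 1) - rank_f2(extend(rows, v, w, c))] += 1
    return j, counts


if __name__ == "__main__":
    # diag(1, 1, 1, 0): a 4 x 4 matrix of corank 1
    print(transition_counts([0b0001, 0b0010, 0b0100, 0b0000]))
```

For the diagonal example of corank $j = 1$, the 512 triples split as 128, 320 and 64 over the coranks 0, 1 and 2, which matches the proportions $(1 - 2^{-j})^2$, $2^{1-j} - 3 \cdot 2^{-2j-1}$ and $2^{-2j-1}$ for a uniformly random extension.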

Random matrices
Let us consider the rule $S_{\text{mat}}$ defined by and let $X_i(\text{mat}) : S_{\text{mat}} \to \{0, \ldots, i\}$ be the corresponding sequence of random variables. In this case there are also explicit formulas available for $\mu_{X_r(\text{mat})}$, which also allow one to deduce Theorem 4.4. We have included this case as the simplest illustration of the methods used here.

Alternating matrices
Let us consider the rule $S_{\text{alt}}$ defined by and let $X_i(\text{alt}) : S_{\text{alt}} \to \{0, \ldots, i\}$ be the corresponding sequence of random variables.

Rédei matrices
Let us consider the rule $S_{\text{Rédei}}(\kappa)$ defined by This gives a sequence of random variables given by $B \mapsto \text{co-rk}(\omega_i(B))$.
We now prove the following. Proof. Fix $\epsilon \in (0, \tfrac{1}{2})$. Then we have Thanks to Hoeffding's inequality, the contribution from the last two summands is no more than which is certainly within the bound. Hence it is enough to show that there are Therefore we now focus on showing the existence of such $C'$ and $\rho'$. To this end we start by defining for each $\kappa \in \mathbb{Z}_{\geq 1}$ the following sequence of random variables Let $B$ be in $S_{\text{Rédei}}(\kappa)$. If $i \leq \kappa$ we put $Z_i(\kappa)(B) = 0$ in case $\omega_i(B) \cdot H_i \neq 0$ and we put $Z_i(\kappa)(B) = 1$ otherwise. For $i > \kappa$ we instead put $Z_i(\kappa)(B) = 0$ in case the last $i - \kappa$ columns of $\omega_i(B)$ are linearly independent and the following additional condition is satisfied. We ask that for each vector x ∈ F 2 . Put $q := \min(i, \kappa)$. Then we always have the inclusion But since the last $i - \kappa$ columns are linearly independent, it follows that the projection map on the last $i - \kappa$ coordinates remains surjective when restricted to Im(ω for some $x_{q+1}, \ldots, x_i \in \mathbb{F}_2$. After applying $\perp$ and Remark 2, we see that this is excluded by our assumptions on $H_i$ and $H_\kappa$, which establishes our claim. Hence parts (2) and (3) of Proposition 4.3 give for each $k \in \{0, \ldots, i\}$ and $j \in \{0, \ldots, i + 1\}$.
Let us now bound $\mathbb{P}(Z_i(\kappa) = 1)$. In case $i \leq \kappa$ this probability clearly equals $1/2^i$. For $i > \kappa$ we use the union bound, applied to the $2^{i - \kappa}$ candidate vectors $x$, each happening with probability at most $1/2^i$, to deduce that we can bound the two summands as follows. The first summand is smaller than .
This last expression can be bounded as $c_\epsilon^r \leq c_\epsilon^i$, for a constant $c_\epsilon \in (0, 1)$ depending only on $\epsilon$. Keeping in mind Remark 1 we invoke Theorem 3.1 and obtain precisely the desired uniform upper bound.
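For small $r$ the corank distribution in a Rédei-type matrix space can be computed exactly by enumeration. The sketch below rests on an assumption about the constraint, whose display is missing above: it takes the space to consist of the $r \times r$ matrices over $\mathbb{F}_2$ with $A(i, j) = A(j, i) + 1$ for $1 \leq i < j \leq \kappa$, $A(i, j) = A(j, i)$ for all other $i < j$, and free diagonal, which is the shape suggested by the later description of $\omega_i(B) + \omega_i(B)^T$. All function names are hypothetical.

```python
from itertools import product


def rank_f2(rows):
    """Rank over F_2 of a matrix given as a list of row bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def redei_type_matrices(r, kappa):
    """Yield all matrices satisfying the assumed constraint (0-indexed:
    A[j][i] = A[i][j] + 1 when i < j < kappa, A[j][i] = A[i][j] otherwise)."""
    pairs = [(i, j) for i in range(r) for j in range(i, r)]
    for bits in product((0, 1), repeat=len(pairs)):
        rows = [0] * r
        for (i, j), b in zip(pairs, bits):
            if b:
                rows[i] |= 1 << j
            if i != j:
                lower = b ^ 1 if j < kappa else b
                if lower:
                    rows[j] |= 1 << i
        yield rows


def corank_distribution(r, kappa):
    """Exact corank distribution of a uniform matrix in the assumed space."""
    counts = [0] * (r + 1)
    for rows in redei_type_matrices(r, kappa):
        counts[r - rank_f2(rows)] += 1
    total = sum(counts)
    return [count / total for count in counts]


if __name__ == "__main__":
    print(corank_distribution(4, 2))
```

The enumeration grows like $2^{r(r+1)/2}$, so it is only practical for $r$ up to about 5; its purpose is to make the object of the theorem concrete, not to test the rate of convergence.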

Rédei matrices in Pellian families
We now examine Rédei matrices that occur in the study of the solubility of the following Pellian equations. Fix an integer $l$ such that $|l|$ is a prime congruent to 3 modulo 4. One then looks at the solubility of $x^2 - dy^2 = l$ (4.1) with $x, y \in \mathbb{Z}$ as $d$ varies among squarefree positive integers with $l \mid d$.
Certainly for equation (4.1) to be soluble, there needs to be a solution with $x, y \in \mathbb{Q}$. Here we will study only solubility with $x, y \in \mathbb{Q}$; the transition to $\mathbb{Z}$ is made in upcoming work of the authors. As we explain below, the Rédei matrix attached to $d$ is more constrained than those appearing in Section 4.4. We divide the discussion according to $\mathrm{sgn}(l)$ and $\gcd(2, \Delta_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}})$, and explain for each possibility which type of matrices occur and parametrize (in a rank preserving manner) each space with a rule. Finally, for the corresponding random variables we prove the analogue of Theorem 4.6 in each of these cases.
For nonnegative integers $\kappa \leq s$, we denote by $H_s(\kappa)$ the vector of $\mathbb{F}_2^{[s]}$ whose first $\kappa$ entries are ones and the remaining entries are zeroes. Before we proceed, we shall define the Rédei matrix attached to a squarefree integer $d$. Definition 4.7. Let $d$ be a squarefree integer and let $D$ be the discriminant of $\mathbb{Q}(\sqrt{d})$. Write $q_1, \ldots, q_t$ for the prime divisors of $D$. Then we can uniquely decompose where $\chi_i$ is a character with conductor a power of $q_i$. If $q_i$ is an odd prime, we have that $\chi_i$ is the quadratic character of $\mathbb{Q}(\sqrt{q_i^*})$, where $q_i^*$ is the unique integer satisfying $|q_i^*| = q_i$ and $q_i^* \equiv 1 \bmod 4$. If instead $q_i = 2$, we have that $\chi_i$ is the quadratic character of The diagonal entries are determined by the rule that the sum of every row is zero.
Remark 3. In case we fix the number of prime divisors of the discriminant, it is a fact that almost all discriminants are odd. Furthermore, in case $d < 0$ and $d \equiv 1 \bmod 4$, we know that the sum of every column is also zero. Then removing a random row and the corresponding column from the Rédei matrix gives a matrix satisfying the constraints described in Subsection 4.4: this follows from quadratic reciprocity. Gerth [10] proves equidistribution in this space of matrices and proves convergence to $\pi_{C.L.}$ as the number of prime divisors goes to infinity. With these remarks we directly see that Theorem 4.6 is an effective version of Gerth's result. However, if we consider all squarefree integers simultaneously, one also needs to consider even discriminants.

Auxiliary matrix spaces
In this subsection we define several matrix spaces. In the remaining paragraphs of this subsection we motivate these definitions by showing that the Rédei matrix of $d$, such that equation (4.1) is soluble over $\mathbb{Q}$, naturally gives a point in one of these spaces. The matrix space depends on the value of $l$ and the parity of the discriminant of $\mathbb{Q}(\sqrt{d})$. Let $s$ be a positive integer and $\kappa \leq s$ be a nonnegative integer.
We let $\mathrm{Pell}_1(s, \kappa)$ be the space of $(s + 1) \times (s + 1)$ matrices $A$ with coefficients in $\mathbb{F}_2$ satisfying the following constraints. The first row of $A$ must be 0 and the sum of all the columns of $A$ equals 0. Furthermore, for $1 \leq i < j \leq \kappa + 1$ we demand that $A(i, j) = A(j, i) + 1$, while for $\kappa + 1 < j \leq s + 1$ and $1 \leq i \leq s + 1$ we demand that $A(i, j) = A(j, i)$.
We next put $\mathrm{Pell}_2(s, \kappa)$ to be the space of $(s + 1) \times (s + 1)$ matrices $A$ with coefficients in $\mathbb{F}_2$ satisfying the following constraints. The first row of $A$ must be $H_{s+1}(\kappa + 1)$ and the sum of all the columns of $A$ equals 0. Finally, we ask for $1 \leq i < j \leq \kappa + 1$ that $A(i, j) = A(j, i) + 1$, while we ask for $\kappa + 1 < j \leq s + 1$ and $1 \leq i \leq s + 1$ that $A(i, j) = A(j, i)$.
We set $\mathrm{Pell}'_1(s, \kappa)$ to be the space of $(s + 2) \times (s + 2)$ matrices $A$ with coefficients in $\mathbb{F}_2$ satisfying the following constraints. The first row of $A$ is 0 and the sum of all the columns is 0. For $1 \leq i < j \leq \kappa + 2$ with $i, j \neq 2$ we require that $A(i, j) = A(j, i) + 1$ and $A(i, 2) = A(2, i) + \kappa + 1$, while we require for $\kappa + 2 < j \leq s + 2$ and $1 \leq i \leq s + 2$ that $A(i, j) = A(j, i)$.
Let now $(a, b)$ be in $\mathbb{F}_2^2$. Finally, we put $\mathrm{Pell}_3(s, \kappa, (a, b))$ to be the space of $(s + 2) \times (s + 2)$ matrices $A$ with coefficients in $\mathbb{F}_2$ satisfying the following constraints. The first row of $A$ equals $H_{s+2}(\kappa + 2)$. The projection on the first two entries of the second row equals $(a, b)$. The projection on the last $s$ entries of the second column equals $H_s(\kappa)$. For each $1 \leq i < j \leq \kappa + 2$ with $i, j \neq 2$ we have that If instead $\kappa + 2 < j \leq s + 2$ and $1 \leq i \leq s + 2$, we have that $A(i, j) = A(j, i)$.
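The constraints defining these spaces translate directly into membership tests. As an illustration, the sketch below checks the three conditions stated for $\mathrm{Pell}_1(s, \kappa)$: zero first row, columns summing to the zero vector (equivalently, every row sum is even), and the two symmetry conditions. The function name `in_pell1` and the representation as a list of 0/1 rows with 0-based indices are choices made here, not notation from the paper.

```python
def in_pell1(A, s, kappa):
    """Check the stated constraints of Pell_1(s, kappa) for a 0/1 matrix A
    given as a list of rows. Indices are 0-based, so the paper's condition
    1 <= i < j <= kappa + 1 becomes 0 <= i < j <= kappa."""
    n = s + 1
    if len(A) != n or any(len(row) != n for row in A):
        return False
    if any(A[0]):  # the first row must be zero
        return False
    if any(sum(row) % 2 for row in A):  # the columns must sum to zero
        return False
    for j in range(n):
        for i in range(n):
            if i < j <= kappa:
                if A[i][j] != (A[j][i] + 1) % 2:
                    return False
            elif j > kappa:
                if A[i][j] != A[j][i]:
                    return False
    return True


if __name__ == "__main__":
    example = [
        [0, 0, 0],
        [1, 0, 1],
        [0, 1, 1],
    ]
    print(in_pell1(example, 2, 1))
```

The example matrix with $s = 2$ and $\kappa = 1$ satisfies all three conditions, whereas the identity matrix already fails the condition on the first row.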

Positive l, odd discriminant
Enumerate the odd prime divisors of $d$ different from $l$ as $q_1, \ldots, q_s$ such that precisely the first $\kappa$ of the $q_i$ are congruent to 3 modulo 4. Represent the Rédei matrix, $\text{Rédei}(d)$, with the prime $l$ being the first row and the character $\chi_{-l}$ being the first column. The remaining $s$ rows and columns are numbered precisely as the $q_i$. Later, we shall also have to deal with even discriminants, in which case we always put the prime 2 in the second row and second column. With this convention the equation is soluble over $\mathbb{Q}$ if and only if the first row of $\text{Rédei}(d)$ is 0. The sum of all the columns will be zero as this is true for any Rédei matrix. Keeping in mind quadratic reciprocity we conclude that $\text{Rédei}(d) \in \mathrm{Pell}_1(s, \kappa)$.

Negative l, odd discriminant
Maintain the notation of the previous subsection for $d, l, s, \kappa, q_1, \ldots, q_s$. Now the solubility of equation (4.1) over $\mathbb{Q}$ becomes equivalent to the first row being $H_{s+1}(\kappa + 1)$. Invoking quadratic reciprocity once more, we conclude that $\text{Rédei}(d) \in \mathrm{Pell}_2(s, \kappa)$.
To each $B \in S_{\text{Pell},1}(\kappa)$ corresponds a sequence of matrices $\omega_i(B)$. We define a corresponding sequence of random variables given by the assignment $B \mapsto \text{co-rk}(\omega_i(B))$.
As in Subsection 4.4, we put for each integer r ≥ 0.

About Pell 2 (s, κ)
Let $\kappa \leq s$ be positive integers with $\kappa$ even. Let $A$ be in $\mathrm{Pell}_2(s, \kappa)$. We see that we can eliminate the second column and row of $A$ to obtain a matrix whose corank equals $\text{co-rk}(A) - 1$. This is an $s \times s$ matrix whose first column is $e_1$, whose first row is $H_s(\kappa)$ and whose bottom right minor is a matrix arising as $\omega_{s-1}(B)$ with $B \in S_{\text{Rédei}}(\kappa - 1)$. This corresponds to the rule where for $1 \leq i \leq \kappa$ we have that while for $i > \kappa$ we have that and finally $S_0(\mathrm{Pell}_2, \kappa) = \{1\}$. As before we let

About Pell
By throwing away the first row and the second column, and then adding up all the other rows to the second, we again get the rule $S_{\text{Pell},1}(\kappa)$ and random variables $X_r(\mathrm{Pell}_1, \kappa)$. Hence this case does not give any new sequence of random variables.
for $i > \kappa$. As before we put for each integer $r \geq 0$ and $a, b \in \mathbb{F}_2$.

Effective convergence in the Pellian families
We now state and prove our final result.
(1) We have for all integers $r \geq 0$ and for all $j \in \{1, 2\}$ Proof. The proof is the same as the proof of Theorem 4.6 except for the choice of the variables $Z_i$. We shall only focus on this aspect, provide the upper bound for $\mathbb{P}(Z_i = 1)$ in each case, and then refer to the proof of Theorem 4.6. Let us first prove (1). We start with the case $j = 1$. If $i \leq \kappa$, we put $Z_i(\kappa)(B) = 0$ in case $\omega_i(B) \cdot H_i \neq 0$ and we put $Z_i(\kappa)(B) = 1$ otherwise. For $i > \kappa$ we instead put $Z_i(\kappa)(B) = 0$ in case the bottom right $(i - 1) \times (i - \kappa)$ submatrix of $\omega_i(B)$ has full rank and furthermore for each vector x ∈ F 2 . However we also want to guarantee that $\mathrm{Im}(\omega_i(B)) \cap \mathrm{Im}(\omega_i(B)^T) \not\subseteq \ker(\pi_1)$.
Indeed, this ensures that we get the same transitioning probabilities as in the proof of Theorem 4.6 when we restrict to vectors whose first component is fixed. Taking $\perp$ we see that the condition is equivalent to $e_1 \notin \ker(\omega_i(B)) + \ker(\omega_i(B)^T)$.
Suppose $e_1 = t_1 + t_2$ with $t_1 \in \ker(\omega_i(B)^T)$ and $t_2 \in \ker(\omega_i(B))$. If we apply $\omega_i(B)$ to the above equality, we obtain But this last equality is impossible since $\mathrm{Im}(\omega_i(B) + \omega_i(B)^T) \subseteq \ker(\pi_1)$, while $H_i(\kappa)$ has first coordinate non-zero. Hence this condition is automatically satisfied, and we may proceed as in Theorem 4.6.
Let us now consider $j = 2$. With the same definition of $Z_i$ as in the case $j = 1$, we still have $\mathrm{Im}(\omega_i(B)) + \mathrm{Im}(\omega_i(B)^T) = \mathbb{F}_2^{[i]}$. However, we also want to guarantee that $e_1 \notin \ker(\omega_i(B)) + \ker(\omega_i(B)^T)$. This time the equality becomes $e_1 = (\omega_i(B) + \omega_i(B)^T) t_1$. (4.2) The matrix $\omega_i(B) + \omega_i(B)^T$ has as top left $\min(i, \kappa) \times \min(i, \kappa)$ minor the matrix with zeroes on the diagonal and ones everywhere else. All other entries of the matrix $\omega_i(B) + \omega_i(B)^T$ are zero.
In case $\min(i, \kappa)$ is odd, equation (4.2) is impossible, since the image of $\omega_i(B) + \omega_i(B)^T$ is in that case contained in the sum zero space. In case $\min(i, \kappa)$ is even, we conclude that $\pi_{[\min(i, \kappa)]}(t_1) = (0, H_{\min(i, \kappa) - 1})$.
Hence it is sufficient to further demand that $\omega_i(B)^T x \neq 0$ for every vector $x$ whose projection on the first $\min(i, \kappa)$ coordinates equals $(0, H_{\min(i, \kappa) - 1})$. The probability that this fails is still at most $1/2^i$ for $i \leq \kappa$ and by the union bound no more than $1/2^\kappa$ for $i > \kappa$. Hence with this small modification, one can proceed as in the proof of Theorem 4.6.
For the proof of part (2), we additionally fix the second row and then bound the conditional probabilities using the same considerations as in part (1) for $j = 2$.