Designs of the divider and special multiplier optimizing T and CNOT gates

Quantum circuits for multiplication and division are necessary for scientific computing on quantum computers. Clifford + T circuits are widely used in fault-tolerant realizations. T gates are more expensive than other gates in Clifford + T circuits, but neglecting the cost of CNOT gates may lead to a significant underestimation. Moreover, the small number of qubits available in existing quantum devices is another constraint on quantum circuits. As a result, reducing T-count, T-depth, CNOT-count, CNOT-depth, and circuit width has become an important optimization goal. We use 3-bit Hermitian gates to design basic arithmetic operations. Then, we present a special multiplier and a divider using these basic arithmetic operations, where 'special' means that one of the two operands of the multiplication is non-zero. Next, we use new rules to optimize the Clifford + T circuits of the special multiplier and divider in terms of T-count, T-depth, CNOT-count, CNOT-depth, and circuit width. Comparative analysis shows that the proposed multiplier and divider have lower T-count, T-depth, CNOT-count, and CNOT-depth than current works. For instance, the proposed 32-bit divider achieves improvement ratios of 40.41 percent, 31.64 percent, 45.27 percent, and 65.93 percent in terms of T-count, T-depth, CNOT-count, and CNOT-depth compared to the best current work. Further, the circuit widths of the proposed n-bit multiplier and divider are 3n. That is, our multiplier and divider reach the minimum width of multipliers and dividers that keep an operand unchanged.


Introduction
One of the most significant challenges in quantum computing is the realization of quantum computers [1]. The quantum circuit model is a realistic quantum computer model [2] and promotes the efficient implementation of quantum algorithms such as quantum image processing [3,4], quantum transforms [5,6], and quantum amplitude estimation [7].
Quantum circuits for arithmetic operations, a vital part of a quantum computer's reversible arithmetic logic unit, can be realized by quantum gates [8,9]. For instance, Noorallahzadeh et al. used elementary quantum gates in the NCV (NOT, CNOT, Controlled-V, and Controlled-V+) library to design quantum multipliers [10][11][12]. These multipliers have low quantum cost, few garbage outputs, and few constant inputs. The fault-tolerant implementation of quantum gates is needed for robust quantum computing in the presence of noise. Clifford + T circuits are widely accepted solutions for fault-tolerant implementation [13,14]. The instruction set {H, S, S†, CNOT, T, T†} can be used to implement quantum gates [15].
The T gates are more expensive than other gates in terms of space and time cost due to their increased tolerance to noise errors [16,17]. Since the CNOT gate is the only two-qubit gate in the Clifford + T gate set, neglecting the cost of the CNOT gates may lead to a significant underestimation [18,19]. Therefore, the number of T gates (T-count), the maximum number of T gates in any circuit path (T-depth), CNOT-count, and CNOT-depth are the main performance indicators of Clifford + T circuits.
Toffoli, Fredkin, Peres, and TR gates are typical for the design of quantum arithmetic operations [20][21][22]. Clifford + T circuits with T-depth 3 and T-count 7 for Toffoli, Fredkin, Peres, and TR gates have been proposed without ancillae [17,18,23]. Their CNOT-counts are 7, 9, 6, and 6, respectively. Compared with Peres and TR gates, the Hermitian Toffoli gate has better symmetry. Four 3-bit Hermitian gates were presented in [24], whose T-depth, T-count, and CNOT-count are 3, 7, and 6, respectively. Therefore, the Hermitian gates in [24] have better symmetry than Peres or TR gates and a smaller CNOT-count than the Toffoli gate. So, we use these Hermitian gates to design the divider and special multiplier in this paper.
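The symmetry argument can be illustrated at the level of classical truth tables. The sketch below assumes the standard definitions Toffoli: (a, b, c) → (a, b, c ⊕ ab) and Peres: (a, b, c) → (a, a ⊕ b, c ⊕ ab); the function names are ours, and the LI gates of [24] are not modeled because their definitions are not reproduced in this text. It checks that applying the Toffoli gate twice restores every input, whereas the Peres gate is not its own inverse.

```python
from itertools import product

def toffoli(a, b, c):
    # Standard Toffoli action on classical bits: flip c when a = b = 1.
    return a, b, c ^ (a & b)

def peres(a, b, c):
    # Standard Peres action: a Toffoli followed by a CNOT from a to b.
    return a, a ^ b, c ^ (a & b)

def is_self_inverse(gate):
    # A gate is self-inverse when applying it twice restores all 8 inputs.
    return all(gate(*gate(*bits)) == bits for bits in product((0, 1), repeat=3))

print(is_self_inverse(toffoli))  # True
print(is_self_inverse(peres))    # False
```

Applying the Peres map twice sends c to c ⊕ a, so it fails the check for a = 1; this is the kind of asymmetry that a Hermitian gate avoids.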
Thapliyal et al. used ancillae to realize n-bit multipliers with two unchanged operands, so their circuit width is 4n + 1 [34,35]. To reduce the T-count and circuit width, Li et al. utilized Peres and TR gates to design a multiplier with two unchanged operands and circuit width 4n [23]. Two multipliers with an unchanged operand and circuit width 3n + 1 were implemented using approximate Toffoli, Peres, and TR gates [36]. They have smaller T-counts, T-depths, and circuit widths than the other multipliers proposed in [23,34,35]. A quantum divider based on the quantum Fourier transform was designed with circuit width 4n [37]. Thapliyal et al. replaced the complex quantum Fourier transform with Toffoli gates to realize two dividers with fewer qubits [38]. The two dividers have circuit widths 3n + 3 and 3n + 2. To further reduce circuit width, Li et al. designed a divider with circuit width 3n [36]. The above multipliers and dividers do not consider optimizing CNOT-count and CNOT-depth.
Quantum algorithms are likely to be implemented on noisy intermediate-scale quantum (NISQ) devices [39]. Some algorithms, such as a quantum convolutional neural network, have been proposed for NISQ devices [40,41]. A large circuit width blocks the application of algorithms on NISQ devices, so a small circuit width is crucial. Since an n-bit divider with an unchanged operand needs at least 3n qubits to store the n-bit operand and the 2n-bit result, we use 3n qubits to realize a multiplier and a divider, respectively. In this paper, we design some basic arithmetic operations such as the modular adder and controlled modular adder. Then, we propose a special multiplier, where 'special' means that one of the two operands of the multiplication is not equal to zero, such as the multiplication s = a × b with a ≠ 0. Since the division s/a is the inverse operation of the special multiplication s = a × b, we can obtain a divider circuit by modifying the proposed special multiplier. Finally, the optimized Clifford + T circuits of the divider and special multiplier are presented.
The rest of this paper is organized as follows. In Sect. 2, we review the background knowledge. Section 3 presents basic arithmetic operations. In Sects. 4 and 5, we propose the special multiplier and divider. Comparative analysis and conclusions are given in Sects. 6 and 7.

Background
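For clarity, we briefly introduce the 3-bit Hermitian gates in [24] and the approximate Toffoli gates in [36]. The matrix forms of six Clifford + T gates are defined by

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},\quad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix},\quad S^{\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix},\quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix},\quad T^{\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\pi/4} \end{pmatrix},\quad \mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

We use the instruction set {H, X, S, S†, CNOT, T, T†} to realize the Clifford + T circuits for arithmetic operations in this paper. For instance, the implementations of the Toffoli, Fredkin, Peres, and TR gates are illustrated in Fig. 1 [15,36]. Figure 1 reveals that the Peres and TR gates each consist of a Toffoli gate and a CNOT gate. Similarly, four Hermitian gates are constructed with Toffoli and CNOT gates [24]. The four Hermitian gates are denoted as LI1, LI2, LI3, and LI4, with LI = {LI1, LI2, LI3, LI4}. Their Clifford + T circuits are presented in Fig. 2.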
Figure 2 reveals that the LI1 and LI2 gates are essentially the same: we can get LI1 from LI2 by swapping the lines of operands A and B. Similarly, we obtain the gate in Fig. 2(e) by swapping the lines of operands C and B for the LI2 gate.
The optimization rule for two LI2 gates is illustrated in Fig. 3. The implementations of the Toffoli and approximate Toffoli gates are shown in Fig. 4. Compared to the corresponding Toffoli gate in Fig. 4(a), the approximate Toffoli gate in Fig. 4(b) differs in that it maps $|111\rangle$ to $-|111\rangle$ [36]. In addition, we give Clifford + T circuits for two variants of the approximate Toffoli gate in Fig. 4(c) and (d).

Quantum circuits of basic arithmetic operations
We substitute LI2 gates for Peres and TR gates to realize the modular adder (MA) and modular subtractor (MS) in [23]. Their circuits, presented in Fig. 5, implement the following n-bit operations:

$$\mathrm{MA}: |a\rangle|b\rangle \to |a\rangle|(a+b) \bmod 2^{n}\rangle, \qquad \mathrm{MS}: |a\rangle|b\rangle \to |a\rangle|(b-a) \bmod 2^{n}\rangle,$$

where $a = a_{n-1}a_{n-2}\ldots a_1a_0$, $b = b_{n-1}b_{n-2}\ldots b_1b_0$, $a_k, b_k \in \{0, 1\}$, and $k \in \{0, 1, \ldots, n-1\}$. Compared to MA, MS consists of the same gates in inverted order. This shows that the inverse of a basic arithmetic operation based on LI gates can be realized by inverting the circuit order of the corresponding operation. We use LI2 gates to design quantum circuits for the modular adder-subtractor (MAS) and the controlled modular adder (CMA) in Fig. 6, which implement the corresponding controlled n-bit operations. When $a_{n-1}a_{n-2}\ldots a_1a_0$ is equal to $0a_{n-2}\ldots a_1a_0$ for MA and MAS in Fig. 5 and Fig. 6, we can omit the most significant bit of $0a_{n-2}\ldots a_1a_0$ and design the special modular adder (SMA) and special modular adder-subtractor (SMAS) to reduce circuit width. Compared to the basic arithmetic operations based on Peres and TR gates in [23], the above arithmetic operations have an advantage: each operation and its inverse can be realized using the same gates. Compared to the basic arithmetic operations based on Toffoli gates in [32,36,38], the above arithmetic operations have lower CNOT-counts. For instance, the n-bit CMA in Fig. 6(b) saves approximately 2.5n CNOT gates compared with the controlled modular adders based on Toffoli gates.
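The claim that a subtractor is obtained by reversing the gate order of an adder holds for any circuit built only from self-inverse gates. The sketch below illustrates this on classical bits with a Cuccaro-style MAJ/UMA ripple adder made of CNOT and Toffoli gates. This is an assumption made for illustration: it is not the LI2-based circuit of Fig. 5, it uses one clean ancilla that the proposed circuits do not need, and all names (`apply`, `modular_adder`, `run`) are hypothetical.

```python
from itertools import product

def apply(circuit, bits):
    """Apply CNOT/Toffoli gates, given as (control, ..., target) tuples, to a bit list."""
    bits = list(bits)
    for *controls, target in circuit:
        if all(bits[c] for c in controls):
            bits[target] ^= 1
    return bits

def modular_adder(n):
    """Ripple adder without carry-out: b <- (a + b) mod 2^n, a unchanged."""
    a = [1 + i for i in range(n)]        # wires of operand a (little-endian)
    b = [1 + n + i for i in range(n)]    # wires of operand b
    carry_in = [0] + a[:-1]              # wire 0 is a clean ancilla; a[i] carries into bit i+1
    circuit = []
    for i in range(n):                   # MAJ blocks compute the carries
        c = carry_in[i]
        circuit += [(a[i], b[i]), (a[i], c), (c, b[i], a[i])]
    for i in reversed(range(n)):         # UMA blocks restore a and write the sum bits
        c = carry_in[i]
        circuit += [(c, b[i], a[i]), (a[i], c), (c, b[i])]
    return circuit, a, b

def run(circuit, a_val, b_val, n, a, b):
    """Load integers onto the wires, run the circuit, and read (a, b, ancilla) back."""
    bits = [0] * (2 * n + 1)
    for i in range(n):
        bits[a[i]] = (a_val >> i) & 1
        bits[b[i]] = (b_val >> i) & 1
    out = apply(circuit, bits)
    value = lambda wires: sum(out[w] << i for i, w in enumerate(wires))
    return value(a), value(b), out[0]

n = 3
adder, a_wires, b_wires = modular_adder(n)
subtractor = adder[::-1]                 # same gates, reversed order
for x, y in product(range(2 ** n), repeat=2):
    assert run(adder, x, y, n, a_wires, b_wires) == (x, (x + y) % 2 ** n, 0)
    assert run(subtractor, x, y, n, a_wires, b_wires) == (x, (y - x) % 2 ** n, 0)
```

Because every CNOT and Toffoli gate is its own inverse, reversing the list inverts the whole circuit; no gate has to be replaced by a different one, which is the property the LI-based operations share.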

Optimized Clifford + T circuits for basic arithmetic operations
We present four rules for LI gates in Fig. 8 to optimize Clifford + T circuits, with T-count, T-depth, CNOT-count, and CNOT-depth selected as optimization goals. Rule 2 is given in Fig. 8(a) to optimize CNOT gates. Rules 3 and 4 are obtained by modifying rule 1 in Fig. 3. Rule 5 shows that the Clifford + T circuit in box 2 implements the circuit in box 1. Furthermore, for each iteration of the circuit in box 2, the gate in box 3 is multiplied by $T^{\dagger}$. Therefore, the gate in box 3 becomes $(T^{\dagger})^{n}$ after the circuit in box 2 is iterated n + 2 times. We obtain $(T^{\dagger})^{4} = Z$, $(T^{\dagger})^{7} = T$, and $(T^{\dagger})^{8} = I$, where the matrix forms of the Clifford gates I and Z are

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

We can use rules 3, 4, and 5 to optimize Clifford + T circuits for basic arithmetic operations. For instance, the optimized Clifford + T circuits for the 4-bit SMA, 4-bit SMSA, and 3-bit CMA are presented in Fig. 9 and Fig. 10. Circuits in dashed boxes are the iterative circuits of SMA, SMSA, and CMA. Each additional bit of the SMA adds 8 to the T-count, 2 to the T-depth, 13 to the CNOT-count, and 6 to the CNOT-depth. Therefore, the n-bit SMA has T-count 8(n - 3) + 8 + 7 = 8n - 9, T-depth 2(n - 3) + 5 = 2n - 1, CNOT-count 13(n - 3) + 17 = 13n - 22, and CNOT-depth 6(n - 3) + 10 = 6n - 8. Similarly, we can calculate the performance indexes of the other basic arithmetic operations listed in Table 1.
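Two of the calculations in this paragraph can be checked mechanically. The sketch below first verifies the phase identities for powers of T† (only the lower-right diagonal entry is tracked, since the upper-left entry of these diagonal gates is always 1), and then accumulates the per-bit increments of the SMA and compares them with the closed forms. The base values for n = 3 are read off the formulas in the text; treating n = 3 as the smallest valid size is our assumption, and the function names are hypothetical.

```python
import cmath

T_DAGGER_PHASE = cmath.exp(-1j * cmath.pi / 4)   # lower-right entry of T^dagger

def t_dagger_power(k):
    """Lower-right entry of (T^dagger)^k; the upper-left entry stays 1."""
    return T_DAGGER_PHASE ** k

# Per-bit increments and n = 3 base values of the SMA, taken from the text.
STEP = {"T-count": 8, "T-depth": 2, "CNOT-count": 13, "CNOT-depth": 6}
BASE_N3 = {"T-count": 8 + 7, "T-depth": 5, "CNOT-count": 17, "CNOT-depth": 10}

def sma_costs(n):
    """Accumulate the increments one bit at a time (assumes n >= 3)."""
    if n < 3:
        raise ValueError("this sketch assumes n >= 3")
    costs = dict(BASE_N3)
    for _ in range(n - 3):
        for key in costs:
            costs[key] += STEP[key]
    return costs

def sma_closed_form(n):
    """Closed forms stated in the text."""
    return {"T-count": 8 * n - 9, "T-depth": 2 * n - 1,
            "CNOT-count": 13 * n - 22, "CNOT-depth": 6 * n - 8}

assert cmath.isclose(t_dagger_power(4), -1)                           # Z
assert cmath.isclose(t_dagger_power(7), cmath.exp(1j * cmath.pi / 4)) # T
assert cmath.isclose(t_dagger_power(8), 1)                            # I
assert all(sma_costs(n) == sma_closed_form(n) for n in range(3, 65))
```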

The design of the special multiplier
In this section, we design a special multiplier using LI and approximate Toffoli gates.

The circuit for the special multiplier optimizing CNOT gates
The special multiplication s = ab can be expressed by Eq. (3), where $a = 0a_{n-1}a_{n-2}\ldots a_0$ and $a \neq 0$. Since $a + b = (a + b) \bmod 2^{n+1}$ holds for any two n-bit positive integers, an n-bit addition can be realized by (n + 1)-bit modular adders. This is why the n-bit integer a in (3) is expressed as $0a_{n-1}a_{n-2}\ldots a_0$.
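The identity used here follows from a one-line bound, spelled out below for completeness; no symbols beyond a, b, and n are introduced.

```latex
0 \le a,\, b \le 2^{n}-1
\;\Longrightarrow\;
0 \le a+b \le 2^{n+1}-2 < 2^{n+1}
\;\Longrightarrow\;
(a+b) \bmod 2^{n+1} = a+b .
```

For example, with n = 3 the largest possible sum is 7 + 7 = 14, which is below 2^4 = 16, so a 4-bit modular adder never wraps around.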
Equation (3) is rewritten as Eq. (4), where $a = 0a_{n-1}a_{n-2}\ldots a_0$, $\bar{b}_0 = 1 - b_0$, and $a \neq 0$. The special arithmetic operations in Fig. 7 and their inverses can be adopted for the operand $0a_{n-1}a_{n-2}\ldots a_0$. That is, we can omit the most significant bit of $0a_{n-1}a_{n-2}\ldots a_0$ using special arithmetic operations. Thus, we can use a CMS, (n - 1) SMSAs, and an SMA to realize the special multiplication in (4). The circuits for the 2-bit and 3-bit special multipliers are shown in Fig. 11. Swap gates between basic arithmetic operations are used to shift quantum lines. The output $|s_5s_4s_3s_2s_1s_0\rangle$ equals $|ba\rangle$.
The CMS in Fig. 11 performs the operation $(0 - \bar{b}_0 a) \bmod 2^{n}$, which can be rewritten as Eq. (5), with $a = a_{n-1}\ldots a_1a_0$ and $a_{n-1}, \ldots, a_1, a_0, b_0 \in \{0, 1\}$. The circuits for the operation $(0 - \bar{b}_0 a) \bmod 2^{n}$ with n = 2, 3 are presented in Fig. 12. We replace the CMS in the first column with the circuits in the second column to realize (5). Since the LI2 gate changes $|a_1\rangle|0\rangle|a_0\rangle$ into $|a_1 \oplus a_0\rangle|0\rangle|a_0\rangle$, we substitute CNOT gates for LI2 gates to obtain the circuits in the third column. Finally, we give circuits based on LI and approximate Toffoli gates in the fourth column. The circuit for $(0 - \bar{b}_0 a) \bmod 2^{n}$ is presented in Fig. 13. We give the circuits on the left of Fig. 14(b) and at the top of Fig. 14(c) by shifting lines directly to replace Swap gates (see Fig. 14(a)), substituting the circuits in Fig. 12 for the CMS, and eliminating some CNOT gates. Then, we obtain the circuits optimizing CNOT gates in Fig. 14 for the 2-bit and 3-bit special multipliers using rule 3.
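The classical effect of this controlled subtraction can be sketched in a few lines. The code assumes that the barred bit is the complement 1 - b_0, which is consistent with the caption of Fig. 13, where the output is stated to equal (b_0 a - a) mod 2^n. The function name `cms` is ours, and the sketch models only the arithmetic, not the circuit.

```python
def cms(b0, a, n):
    """Classical value of (0 - (1 - b0) * a) mod 2^n for a control bit b0."""
    not_b0 = 1 - b0                      # assumed meaning of the barred bit
    return (0 - not_b0 * a) % (2 ** n)

for n in (2, 3, 4):
    for a in range(2 ** n):
        assert cms(1, a, n) == 0                          # b0 = 1: register stays 0
        assert cms(0, a, n) == (2 ** n - a) % (2 ** n)    # b0 = 0: two's complement of a
        for b0 in (0, 1):
            assert cms(b0, a, n) == (b0 * a - a) % (2 ** n)
```

So the module either leaves the zero register untouched or loads the two's complement of a into it, depending on the least significant bit of b.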
For clarity, we give the circuit for the 4-bit special multiplier in Fig. 15(a). The circuits in the dashed boxes 1, 2, and 3 are named the first module of the multiplier (FMM), the iterative module of the multiplier (IMM), and the last module of the multiplier (LMM), respectively. Next, we use these modules to design the n-bit special multiplier in Fig. 15(b).

Clifford + T circuits for the special multiplier
We give the Clifford + T circuit for the 2-bit special multiplier in Fig. 16 using rules 3 and 4. The circuit in dashed box 1 is provided using the Clifford + T circuit for the approximate Toffoli gate. We propose Clifford + T circuits of the IMM and LMM for the 4-bit special multiplier in Fig. 18 using rules 3 and 4. Analyzing the iterative circuits in dashed boxes, we calculate the performance indexes of the IMM and LMM. For clarity, we give the performance indexes of the 2-bit special multiplier, the n-bit special multiplier (n ≥ 3), and its modules in Table 2.

The design of the divider
In this section, we design a divider using LI and approximate Toffoli gates.

The circuit for the divider optimizing CNOT gates
The division s/a is the inverse operation of the special multiplication s = ba with a ≠ 0. We obtain a divider by reversing the circuit order of the special multiplier in Fig. 11. Therefore, we can use an SMS, (n - 1) SMASs, and a CMA to realize the divider. For instance, the 2-bit and 3-bit dividers are presented in Fig. 19.
Figure 19 The circuits for (a) a 2-bit divider and (b) a 3-bit divider. The outputs $|q_1q_0\rangle$ and $|q_2q_1q_0\rangle$ are quotients of the division s/a; $|r_1r_0\rangle$ and $|r_2r_1r_0\rangle$ are remainders of the division s/a
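The module sequence of one subtractor, (n - 1) adder-subtractors, and one controlled adder has the same shape as classical non-restoring division. The sketch below is a classical analogue under that assumption, not a transcription of the circuit in Fig. 19: it divides an n-bit dividend s by a non-zero n-bit divisor a, and records which kind of step is taken. The function name and the step labels are hypothetical.

```python
def nonrestoring_divide(s, a, n):
    """Return (quotient, remainder, steps) for n-bit s and non-zero n-bit a."""
    if not (0 <= s < 2 ** n and 0 < a < 2 ** n):
        raise ValueError("need 0 <= s < 2^n and 0 < a < 2^n")
    r, q, steps = 0, 0, []
    for i in reversed(range(n)):
        bit = (s >> i) & 1               # next dividend bit, most significant first
        if r >= 0:
            r = 2 * r + bit - a          # partial remainder non-negative: subtract
            steps.append("sub")
        else:
            r = 2 * r + bit + a          # partial remainder negative: add back
            steps.append("add")
        q = (q << 1) | (r >= 0)          # quotient bit is 1 when r is non-negative
    if r < 0:                            # final conditional correction of the remainder
        r += a
        steps.append("restore")
    return q, r, steps

for n in (1, 2, 3, 4):
    for s in range(2 ** n):
        for a in range(1, 2 ** n):
            q, r, steps = nonrestoring_divide(s, a, n)
            assert (q, r) == divmod(s, a)
            assert steps[0] == "sub"     # the first step is always a subtraction

print(nonrestoring_divide(7, 2, 3)[:2])  # (3, 1)
```

The first step is unconditionally a subtraction, the next n - 1 steps add or subtract depending on a sign bit, and the last step is a conditional addition. The requirement a ≠ 0 is the same restriction that makes the multiplier 'special'.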
We design circuits in Fig. 20 for the special operation in (9) using approximate Toffoli gates in Fig. 4(c) and (d).
Firstly, we substitute circuits in Fig. 20 for SMS and eliminate some CNOT gates. Then, we use rule 2 to design circuits of the 2-bit and 3-bit dividers optimizing CNOT gates in

Clifford + T circuits for the divider
We use rule 5 and the Clifford + T circuits for approximate Toffoli gates in Fig. 4 to design the Clifford + T circuit of the 2-bit divider in Fig. 22. The Clifford + T circuits in dashed boxes 1 and 2 in Fig. 22 correspond to the circuits in dashed boxes 1 and 2 in Fig. 21(a). Similarly, we propose the Clifford + T circuit for the FMD in Fig. 23(a) using the approximate Toffoli gates in Fig. 4. The Clifford + T circuit for the IMD is presented in Fig. 23(b) using rule 5. We calculate the performance indexes of the FMD and IMD by analyzing the iterative circuits in dashed boxes in Fig. 23. The Clifford + T circuit for the LMD is presented in Fig. 24 using rule 5. From Fig. 24, we obtain the performance indexes of the LMD. Furthermore, we give the performance indexes of the 2-bit divider, the n-bit divider (n ≥ 3), and its modules in Table 3.

Comparative analysis

Comparisons of basic arithmetic operators
The section Introduction shows that the adder in [30] has the best T-count, 4n - 4. However, that adder needs 2n - 2 measurements, so its T-count is not directly comparable. Considering the circuit width, T-count, T-depth, CNOT-count, and CNOT-depth, we compare the proposed works with the remaining basic arithmetic operators in [23,29,33,36,38]. The results are presented in Table 4, which shows that the proposed basic arithmetic operators are superior to the others.
Objectively, compared with the operators in [36], the proposed basic arithmetic operations have only a slight advantage regarding performance indexes. However, the proposed basic arithmetic operators have a significant advantage: they are more convenient for designing the multiplier and divider. For instance, the proposed controlled modular adder can be used to realize the LMD of the divider with excellent performance indexes (see Fig. 24).

Method comparisons with previous works
The main contributions of this paper are the designs of the new multiplier and divider, so we describe what is new about them in this section. Compared to multipliers based on the measure-and-fixup approach [42,43], our method for the multiplier differs in that it does not require quantum measurements. Method comparisons with previous works [23,35,36] are presented in Table 5, which lists the realization formulas of the multipliers; there, $\bar{b}_k$ is equal to $1 - b_k$ with $k \in \{1, 2, \ldots, n-1\}$. Table 5 shows that the proposed method differs from the methods in [23,35] in terms of realization formulas, unchanged operands, and optimization objects. From the realization formulas in Table 5, the methods in [23,35] each use n controlled modular adders to implement the multiplier. Our method adopts the circuit in Fig. 13, (n - 1) special modular subtractor-adders, and a special modular adder to implement the multiplier. Compared to the method in [36], our method uses a different realization formula, optimization objects, optimization rules, and realization of $-b_0 a$. According to the realization formula in [36], the multiplier is implemented by a circuit named SM, (n - 1) modular adder-subtractors, and a modular adder, where the SM consists of a controlled modular subtractor and other gates to realize $[-b_0 + (-1)^{b_1}]a$. The proposed multiplier reduces the circuit width, T-count, T-depth, CNOT-count, and CNOT-depth because the method in this paper uses a different realization formula, optimization objects, and optimization rules.
Due to the Hermitian property of LI gates, we can reverse the circuit order of the special multiplier to obtain the divider. The method for the divider in this paper is significantly different from the methods for the dividers in [36,38].

Performance comparisons of multipliers
Multipliers based on the measure-and-fixup approach have small T-counts. For instance, the T-counts of the two multipliers in [42,43] are $6n^2 + O(n)$ and $8n^2 - 4n$, respectively. However, the two multipliers also require $O(n^2)$ quantum measurements. We therefore compare the proposed multiplier against recent works without quantum measurements [23,35,36]. Two multipliers are proposed in [36], and the second has better performance indexes than the first, so we only compare the proposed multiplier with the second multiplier in [36]. The results are presented in Table 6 and Table 7, which show that the proposed multiplier is superior to the others on all five performance indexes. For instance, the CNOT-count of the proposed 32-bit multiplier achieves improvement ratios of 50.20 percent, 54.35 percent, and 26.96 percent compared to the works presented in [23,35,36], respectively. The caveat is that the proposed multiplier can only realize the special multiplication ab with a ≠ 0; the other three multipliers do not have this limitation. The proposed multiplier based on LI gates has another advantage: it can be easily used to design dividers.
Note: The performance indexes of the multiplier in [36] are miscalculated. We recalculate them as follows. The n-bit multiplier consists of an SM module, n - 1 modular adder-subtractors ((n + 1)-bit), an (n + 1)-bit modular adder, and an Aswap module. The performance indexes of the modular adder and modular adder-subtractor in [36] are listed in Table 4. The SM module has T-count 20n - 9, T-depth 5n - 1, CNOT-count 33n - 23, and CNOT-depth 21n - 21. The Aswap module has T-count 4n, T-depth 2, CNOT-count 6n, and CNOT-depth 5. The performance indexes of the n-bit multiplier are calculated from these module values.

Performance comparisons of dividers
We compare the proposed divider against recent works [36,38]. Thapliyal et al. designed restoring and non-restoring dividers [38]. The non-restoring divider has a smaller T-count and T-depth than the restoring divider, so we only compare the proposed divider with the non-restoring divider. The results in Table 8 and Table 9 show that the proposed divider is superior to the others in terms of T-count, T-depth, CNOT-count, and CNOT-depth. For instance, the proposed 32-bit divider achieves improvement ratios of 40.41 percent, 31.64 percent, 45.27 percent, and 65.93 percent in terms of T-count, T-depth, CNOT-count, and CNOT-depth compared to the work presented in [36]. Meanwhile, the proposed 32-bit divider reduces T-count by 45.54 percent, T-depth by 67.62 percent, CNOT-count by 47.36 percent, and CNOT-depth by 68.85 percent when compared with the work in [38]. The n-bit division requires at least 3n qubits to store the quotient, remainder, and operand; thus, the proposed divider has the minimum circuit width 3n for the n-bit division keeping an operand unchanged.
Note: Thapliyal et al. designed a non-restoring divider to realize the positive 2's complement division [38]. An n-bit positive integer can be changed into its complement representation by adding a binary 0 before the high bit. Therefore, a complement operand of the n-bit positive integer division requires m = n + 1 bits. The n-bit divider consists of an m-bit modular subtractor named Subtraction, (m - 1) m-bit modular adder-subtractors named Ctrl-AddSub, and an (m - 1)-bit controlled modular adder named Ctrl-AddNOP [38].
The performance indexes of the three modules can be found in Table 4. The performance indexes of the n-bit divider are then calculated from these module values.

Conclusions and future works
In this paper, we have proposed a special multiplier and a divider based on LI and approximate Toffoli gates. We designed circuits of the basic arithmetic operations used in the proposed multiplier and divider, such as the modular adder, modular adder-subtractor, controlled modular adder, special modular adder, special modular adder-subtractor, and their inverses. These basic arithmetic operations based on LI gates have the advantage that their inverses can be realized by inverting the circuit order of the corresponding arithmetic operations. We have proposed new rules for LI gates to design Clifford + T circuits of the proposed multiplier and divider, optimizing T-count, T-depth, CNOT-count, and CNOT-depth. The Clifford + T circuits of the proposed multiplier and divider are superior to existing multipliers and dividers in terms of T-count, T-depth, CNOT-count, and CNOT-depth. Furthermore, the circuit widths of the proposed n-bit multiplier and divider are 3n. That is, our multiplier and divider have reached the minimum width of multipliers and dividers that keep an operand unchanged. As future work, it will be interesting to apply the proposed multiplier and divider in quantum image processing, for example in the quantum bilinear interpolation algorithm.

Figure 1 Implementation circuits for (a) the Toffoli gate, (b) the Fredkin gate, (c) the Peres gate, and (d) an inverse-Peres gate named TR. Note: The circuit of the Toffoli gate has an error in [36], so we have corrected the error in the dashed box 1 in (a)

Figure 2 Clifford + T circuits for (a) the LI1 gate, (b) the LI2 gate, (c) the LI3 gate, (d) the LI4 gate, and (e) the variant of the LI2 gate

Figure 3 Rule 1 of two LI gates proposed in [36].U is the combination of LI and Clifford gates

Figure 4 Clifford + T circuits for (a) the Toffoli gate, (b) the approximate Toffoli gate, and the variants of the approximate Toffoli gate in (c) and (d)

Figure 6 Quantum circuits for (a) the modular adder-subtractor (MAS) and (b) the controlled modular adder (CMA). Since $X^n = I$ for even n and $X^n = X$ for odd n, the circuit in box 1 consists of $\lceil n/2 \rceil + 1$ CNOT gates, where $\lceil \cdot \rceil$ is the round-up symbol

Figure 10 The 3-bit CMA and its optimized Clifford + T circuit

Figure 11 The circuits for (a) a 2-bit special multiplier and (b) a 3-bit special multiplier. The outputs $|s_3s_2s_1s_0\rangle$ and $|s_5s_4s_3s_2s_1s_0\rangle$ equal $|ba\rangle$, respectively

Figure 13 The n-bit circuit for the operation in (5). $|t_{n-1}\ldots t_1t_0\rangle$ equals $|(b_0a - a) \bmod 2^{n}\rangle$
The iterative circuit of the FMM is given in the dashed box. Figure 17 reveals that each additional bit of the FMM adds 18 to the T-count, 4 to the T-depth, 21 to the CNOT-count, and 9 to the CNOT-depth. Therefore, we obtain the performance indexes of the FMM for the n-bit special multiplier.

Figure 15 The circuits optimizing CNOT gates for (a) the 4-bit special multiplier and (b) the n-bit multiplier

Figure 18 Clifford + T circuits of IMM and LMM for the 4-bit special multiplier including (a) IMM and (b) LMM

Fig. 21(a) and (b). Circuits in dashed boxes in Fig. 21(b) are named the first module of the divider (FMD), the iterative module of the divider (IMD), and the last module of the divider (LMD), respectively. Finally, we use the three modules to design the n-bit divider in Fig. 21(c).

Figure 23 Clifford + T circuits for (a) the FMD and (b) the IMD.Circuits in dashed boxes are iterative circuits of the FMD and IMD, respectively

Table 1
Performance indexes for n-bit basic arithmetic operations.HS-count denotes the total number of H and S gates

Table 2
Performance indexes of the special multiplier and its modules with n ≥ 3

Table 3
Performance indexes of the divider and its modules with n ≥ 3

Table 4
Comparisons of basic arithmetic operators

Table 5
Method comparisons of multipliers with previous works.RF, UO, OO, and OR denote realization formulas, unchanged operands, optimization objects, and optimization rules

Table 6
Comparisons of multipliers for n ≥ 3

Table 7
Comparisons of multipliers by increasing n from 4 to 32. C-count and C-depth denote CNOT-count and CNOT-depth, respectively

Table 8
Comparisons of dividers for n ≥ 3

Table 9
Comparisons of dividers by increasing n from 4 to 32. C-count and C-depth denote CNOT-count and CNOT-depth, respectively