Quantum pricing with a smile: implementation of local volatility model on quantum computer
EPJ Quantum Technology volume 9, Article number: 7 (2022)
Abstract
Quantum algorithms for the pricing of financial derivatives have been discussed in recent papers. However, the pricing model discussed in those papers is too simple for practical purposes. This motivates us to consider how to implement the more complex models used in financial institutions. In this paper, we consider the local volatility (LV) model, in which the volatility of the underlying asset price depends on the price and time. As in previous studies, we use quantum amplitude estimation (QAE) as the main source of quantum speedup and discuss the state preparation step of the QAE, or equivalently, the implementation of the asset price evolution. We compare two types of state preparation. One is the amplitude encoding (AE) type, where the probability distribution of the derivative’s payoff is encoded in the probability amplitudes. The other is the pseudorandom number (PRN) type, where sequences of PRNs are used to simulate the asset price evolution as in classical Monte Carlo simulation. We present detailed circuit diagrams for implementing these preparation methods in fault-tolerant quantum computation and roughly estimate the required resources, such as the number of qubits and the T-count.
Introduction
With recent advances in quantum computing technologies, researchers are beginning to consider how to utilize them in industry. Finance is one of the major targets [1]. Because financial institutions perform an enormous amount of numerical calculation in their daily work, speeding up such calculation will bring them significant benefits. One such task is the pricing of financial derivatives [2–4]. Financial derivatives, or simply derivatives, are contracts in which payoffs are determined in reference to the prices of underlying assets at some fixed times.
In derivative pricing, movements of underlying asset prices are represented by stochastic processes, and a derivative price is written as the expected value of the sum of payoffs discounted by the risk-free interest rate. Monte Carlo simulation is often used to compute the derivative price, but it takes a long computation time. Quantum algorithms for Monte Carlo integration [5, 6] bring a quadratic speedup compared with classical Monte Carlo algorithms, and several previous studies discuss their application to derivative pricing [7–10]. Although previous studies consider the Black-Scholes (BS) model [11, 12], which is the pioneering model for derivative pricing, it is inappropriate as an application target of Monte Carlo in practical business for the following reasons. First, the actual market prices of derivatives are inconsistent with the BS model. This phenomenon is called the volatility smile, which we explain in Sect. 2. To price derivatives precisely, financial firms often use more complicated models than the BS model. Second, the BS model is so simple that analytic formulae are available, and thus Monte Carlo simulation is not necessary. In fact, banks use Monte Carlo simulation mainly for complex models that can take volatility smiles into account. These points motivate us to consider advanced models in quantum algorithms.
This paper focuses on one of the advanced models, the local volatility (LV) model [13]. In the LV model, the volatility of an asset price depends on the price itself and time. The BS model is a special case of the LV model. Because the LV model can make derivative prices consistent with volatility smiles, it is widely used for pricing derivatives, especially exotic derivatives, which have complex transaction terms such as early redemption. To price a derivative by Monte Carlo simulation, we generate random trajectories (paths) of the time evolution of asset prices, then calculate the expectation value of the sum of discounted payoffs over the paths. In this paper, we focus on implementing such a time evolution in the LV model on fault-tolerant quantum computers in order to apply quantum algorithms for Monte Carlo simulation.
We consider two quantum integration algorithms based on quantum amplitude estimation (QAE): the amplitude encoding (AE) type method [5] and the pseudorandom number (PRN) type method [6]. These algorithms are the same in that we prepare a quantum state encoding the integrand and estimate the integral from the state by the QAE. The difference between them is whether the probability distribution is encoded in the amplitudes of a quantum state. In the AE-type method, which is adopted in previous studies [7–9], the probability distribution of the payoff is fully encoded in the probability amplitudes [14]. In other words, this method takes account of all possible paths in calculating the expectation value. A problematic point of the AE-type method is that the number of qubits grows with the dimension of the integrand. In the pricing task, the number of qubits is proportional to the total number of random variables, which equals the length of the path times the number of underlying assets.^{Footnote 1} Because the length of the path, i.e., the number of time steps, can be large for derivatives with a long maturity, and there can be multiple underlying assets, the AE-type method will require many qubits. Consider a common situation: the number of assets is \(\mathcal{O}(10)\), the number of time steps is \(\mathcal{O}(10^{2})\), and each register for a random variable consists of \(\mathcal{O}(10)\) qubits. Then, the total number of qubits for the derivative pricing becomes \(\mathcal{O}(10^{4})\). Since making a logical qubit might incur a large overhead of physical qubits (see Ref. [15] and references therein), calculations with a large number of logical qubits might be prohibitive.
The PRN-type method was originally proposed in Ref. [6] to reduce the number of qubits for integrating multivariate functions. In the PRN-type method, we do not encode the probability distribution in the probability amplitudes; instead, we use PRNs whose empirical distribution reproduces the desired probability distribution, as in classical Monte Carlo simulation. Although this method introduces an additional error in the estimation, we can reduce the error by increasing the number of sampled paths. As shown in Ref. [6], we can achieve the quadratic speedup by appropriately choosing the number of sampled paths. Moreover, this approach allows us to sequentially update PRNs at each time step. In other words, we do not need to hold multiple random variables simultaneously. In the PRN-type method, a separate quantum register is not assigned to each random variable; instead, a single register is used to generate a sequence of PRNs. Thus, the number of qubits is independent of the number of random variables, which is the advantage of the PRN-type method. On the other hand, its drawback is the increase in circuit depth. More concretely, the circuit depth is proportional to the number of random variables. When it comes to the LV model, the circuit depth is proportional to the number of time steps in both methods, and thus the drawback of the PRN-type method is alleviated. This is different from the situation in credit portfolio risk management [16], where the AE-type method reduces the circuit depth.
Furthermore, we design quantum circuits implementing the above state preparation methods on a fault-tolerant quantum computer by using several quantum circuits for elementary arithmetic. We then estimate the number of logical qubits^{Footnote 2} and the T-count [17, 18] of the proposed quantum circuits. Because the qubit number in the PRN-type method is independent of the number of time steps, it is much smaller than that in the AE-type method. On the other hand, the T-count is proportional to the number of time steps in both methods, although the T-count of the PRN-type method is larger than that of the AE-type method by a factor of \(\mathcal{O}(1)\).
The rest of this paper is organized as follows. Sections 2 and 3 are preliminary: the former briefly explains the LV model, and the latter reviews the quantum algorithms for Monte Carlo simulation. In Sect. 4, we present quantum circuits for the state preparation in the two methods. In Sect. 5, we estimate the qubit number and T-count of the proposed circuits. Section 6 gives a summary.
Local volatility model
This section is devoted to defining the LV model.
Pricing of derivatives
We consider the single-asset case, but it is straightforward to extend the discussion in this paper to the multi-asset case. Consider a party A involved in a derivative contract written on some asset. Let \(S_{t}\) be a stochastic process representing the asset price at time t. We assume that payoffs arise at multiple times \(t^{\mathrm{pay}}_{i}\), \(i=1,2,\dots \), and that the ith payoff is given by \(f^{\mathrm{pay}}_{i} (S_{t^{\mathrm{pay}}_{i}} ) \in \mathbb{R}\). Here, a positive payoff means that A receives money from the counterparty, and a negative one means the reverse. For example, when A buys a European call option with strike K, the payoff is given by
$$\begin{aligned} f^{\mathrm{pay}}_{1} (S_{t^{\mathrm{pay}}_{1}} ) = \max \bigl\{ S_{t^{\mathrm{pay}}_{1}} - K, 0 \bigr\} \end{aligned}$$ (1)
with a single payment date \(t^{\mathrm{pay}}_{1}\). Note that this type of derivative contract is too simple to cover all trades in financial markets. For example, callable contracts, in which either party has the right to terminate the contract at some time, are widely traded. In this paper, we consider only derivatives of the above type, such as Eq. (1), and leave the study of exotic derivatives for future work.
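Following the theory of arbitrage-free pricing [3, 4], the price V of the contract for A is given by
$$\begin{aligned} V = \mathbb{E} \biggl[ \sum_{i} f^{\mathrm{pay}}_{i} (S_{t^{\mathrm{pay}}_{i}} ) \biggr], \end{aligned}$$ (2)
where \(\mathbb{E} [\cdots ]\) represents the expectation value under a risk-neutral measure. We assume that the risk-free interest rate is 0 for simplicity, so no discount factor appears.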
LV model and volatility smile
In the LV model, the evolution of the asset price is modeled by the following stochastic differential equation:
$$\begin{aligned} dS_{t} = \sigma (t,S_{t})\,dW_{t} \end{aligned}$$ (3)
in the risk-neutral measure,^{Footnote 3} where \(W_{t}\) is the Wiener process which drives \(S_{t}\). \(dX_{t}\) is the increment of a stochastic process \(X_{t}\) over an infinitesimal time interval dt, and \(\sigma (t,S_{t})\) (≥0) represents the local volatility. The BS model corresponds to the case where
$$\begin{aligned} \sigma (t,S) = \sigma _{\mathrm{BS}} S \end{aligned}$$ (4)
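with a positive constant \(\sigma _{\mathrm{BS}}\), which we call a BS volatility. In the BS model, the price at \(t=0\) of a European call option with strike K and maturity T is given by the following formula:
$$\begin{aligned} V_{\mathrm{call},\mathrm{BS}}(T,K,\sigma _{\mathrm{BS}}) &= S_{0}\Phi _{\mathrm{SN}}(d_{1}) - K\Phi _{\mathrm{SN}}(d_{2}), \\ d_{1} &= \frac{1}{\sigma _{\mathrm{BS}}\sqrt{T}} \biggl( \ln \frac{S_{0}}{K} + \frac{1}{2}\sigma _{\mathrm{BS}}^{2} T \biggr), \qquad d_{2} = d_{1} - \sigma _{\mathrm{BS}}\sqrt{T}, \end{aligned}$$ (5)
where \(S_{0}\) is the asset price at \(t=0\) and \(\Phi _{\mathrm{SN}}\) is the cumulative distribution function (CDF) of the standard normal distribution. If the BS volatility is given, we can price the option by the above equations. Conversely, we can calculate the BS volatility from the market price of the option \(V_{\mathrm{call},\mathrm{mkt}}(T,K)\). The BS volatility determined from the market price is called the implied volatility. That is, the implied volatility \(\sigma _{\mathrm{IV}}(T,K)\) is defined through
$$\begin{aligned} V_{\mathrm{call},\mathrm{BS}}\bigl(T,K,\sigma _{\mathrm{IV}}(T,K)\bigr) = V_{\mathrm{call},\mathrm{mkt}}(T,K). \end{aligned}$$ (6)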
If the market is described well by the BS model, \(\sigma _{\mathrm{IV}}(T,K)\) depends on neither K nor T. However, \(\sigma _{\mathrm{IV}}(T,K)\) varies with K and T in many markets. If \(\sigma _{\mathrm{IV}}(T,K)\) obtained from the market depends on K, we say that a volatility smile is observed in the market. A volatility smile implies that the possible scenarios of asset price evolution in the BS model do not match those which market participants consider. It arises when, for example, market participants think that extreme scenarios, such as big crashes or sharp rises, occur more frequently than the BS model predicts.
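The inversion from a price to an implied volatility can be sketched as follows. This is only an illustration: it assumes the standard BS call formula with zero interest rate (as assumed in this paper), and the function names `bs_call` and `implied_vol` as well as the choice of bisection are ours, not part of the original text.

```python
import math
from statistics import NormalDist


def bs_call(s0, strike, maturity, sigma_bs):
    """BS call price at t=0 with zero interest rate (hypothetical helper)."""
    phi = NormalDist().cdf
    vol_sqrt_t = sigma_bs * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + 0.5 * sigma_bs**2 * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return s0 * phi(d1) - strike * phi(d2)


def implied_vol(price, s0, strike, maturity, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert bs_call in sigma by bisection (the price increases with sigma)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, strike, maturity, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price an option with a known volatility, then recover it.
market_price = bs_call(100.0, 110.0, 1.0, 0.2)
sigma_iv = implied_vol(market_price, 100.0, 110.0, 1.0)
```

Repeating the inversion over a grid of strikes and maturities of market prices would give the surface \(\sigma _{\mathrm{IV}}(T,K)\); a flat surface corresponds to the BS model.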
The LV model allows the pricing of a European option to be consistent with a market price as long as there is no arbitrage in the market. This is because, in the LV model, the local volatility \(\sigma (t,S)\) has enough degrees of freedom to reproduce the two-dimensional function \(V_{\mathrm{call},\mathrm{mkt}}(T,K)\). In fact, if \(V_{\mathrm{call},\mathrm{mkt}}(T,K)\) is given for any T and K, we can determine the local volatility as described in Ref. [13]. However, in reality, the market option prices are available only for limited strikes and maturities. Therefore, in practical situations, we assume the functional form of \(\sigma (t,S)\) as follows. We set \(n_{\mathrm{t}}+1 \) grid points in the time axis, \(t_{0}:=0< t_{1}<\cdots <t_{n_{\mathrm{t}}}\), and set \(n_{S}\) grid points in the asset price axis for each time grid point, \(-\infty < s_{i,1}<\cdots <s_{i,n_{S}}<\infty \). Then, \(\sigma (t,S)\) is set as a piecewise-linear function of S:
$$\begin{aligned} \sigma (t,S) = a_{i,j}S + b_{i,j} \quad \text{for } t_{i} \le t < t_{i+1},\ s_{i,j-1} \le S < s_{i,j}, \end{aligned}$$ (7)
where \(s_{i,0}:=-\infty \), \(s_{i,n_{S}+1}:=\infty \), and \(a_{i,j}\) and \(b_{i,j}\) are constants, which we assume to be predetermined in this paper.
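A piecewise-linear local volatility of this kind can be evaluated on one time slice with a simple interval lookup. The sketch below is illustrative only: the function name `local_vol`, the 0-based indexing of the \(n_{S}+1\) intervals, and the numerical grid and coefficients are assumptions made for the example.

```python
from bisect import bisect_right


def local_vol(s, grid, a, b):
    """Evaluate sigma = a[k] * s + b[k] on the k-th price interval.

    grid: sorted grid points s_1 < ... < s_{n_S} for one time slice
    a, b: n_S + 1 coefficients, one pair per interval (both tails included)
    """
    # bisect_right counts grid points <= s, which is the 0-based index of the
    # half-open interval [grid[k-1], grid[k]) that contains s.
    k = bisect_right(grid, s)
    return a[k] * s + b[k]


# Hypothetical coefficients chosen so that sigma is continuous in s.
grid = [80.0, 100.0, 120.0]
a = [0.0, 0.0, 0.3, 0.0]
b = [20.0, 20.0, -10.0, 26.0]
sigma_at_110 = local_vol(110.0, grid, a, b)
```

Continuity at the grid points is not required by the definition itself, but the example coefficients respect it because the PRN-type circuit discussed later relies on it.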
Monte Carlo simulation
We here describe how to estimate the derivative price (2) by Monte Carlo simulation. First, we discretize time into sufficiently small meshes, because we can deal with a continuous variable on neither classical nor quantum computers. For simplicity, we set the time grid points to \(\{t_{i}\}_{i=0}^{n_{\mathrm{t}}}\). Then, the time evolution (3) is approximated as
$$\begin{aligned} S_{t_{i+1}} = S_{t_{i}} + \sigma (t_{i},S_{t_{i}}) \sqrt{\Delta t_{i}}\, w_{i+1}, \end{aligned}$$ (8)
where \(\Delta t_{i} :=t_{i+1}-t_{i}\), and \(w_{1},\dots ,w_{n_{\mathrm{t}}}\) are mutually independent standard normal random numbers (SNRNs). Among various ways to discretize the stochastic differential equation, we here adopt the Euler-Maruyama method [19].
Second, we discretize the SNRNs. Since a discretized SNRN takes only a finite number of values, we denote its mth value by \(w^{(m)}\). The associated probability mass \(p_{m}\) is defined as the probability that a standard normal random variable falls in the small interval between the two adjacent grid points associated with \(w^{(m)}\). Then, we can approximate Eq. (2) as
$$\begin{aligned} V \approx \sum_{\boldsymbol{m}} \Biggl( \prod_{k=1}^{n_{\mathrm{t}}} p_{m_{k}} \Biggr) \sum_{i} f^{\mathrm{pay}}_{i} \bigl(S^{(\boldsymbol{m})}_{t^{\mathrm{pay}}_{i}} \bigr), \end{aligned}$$ (9)
where \(\boldsymbol{m}:=(m_{1},\dots ,m_{n_{\mathrm{t}}})\) and \(S^{(\boldsymbol{m})}_{t}\) is the asset price at time t when SNRNs take values \(w_{1}^{(m_{1})},\dots ,w_{n_{\mathrm{t}}}^{(m_{n_{\mathrm{t}}})}\).
There are several ways to calculate the right-hand side of Eq. (9). The simplest way is brute-force calculation, but it takes an exponentially long calculation time. In fact, if we take M grid points to discretize each SNRN, the total number of grid points is \(M^{n_{\mathrm{t}}}\). To overcome this problem, the Monte Carlo method is usually used. In Monte Carlo simulation, we generate finitely many but sufficiently many discretized samples of SNRNs \((w_{1}^{(n)},\dots ,w_{n_{\mathrm{t}}}^{(n)})\) and use them to generate sample paths of the asset price evolving according to Eq. (8). Then, Eq. (2) is approximated by the average of the sums of payoffs over the sample paths:
$$\begin{aligned} V \approx \frac{1}{N_{\mathrm{samp}}} \sum_{n=1}^{N_{\mathrm{samp}}} \sum_{i} f^{\mathrm{pay}}_{i} \bigl(S^{(n)}_{t^{\mathrm{pay}}_{i}} \bigr). \end{aligned}$$ (10)
Here, \(S^{(n)}_{t}\) is the value of the asset price at time t on the nth sample path, and \(N_{\mathrm{samp}}\) denotes the number of sample paths.
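The classical Monte Carlo procedure described above (Euler-Maruyama paths followed by a payoff average) can be sketched as follows. This is a minimal classical illustration, not the quantum algorithm: the function name `mc_price`, the use of Python's `random.gauss` for the SNRNs, a single payoff at maturity, and all numerical parameters are assumptions made for the example.

```python
import math
import random


def mc_price(payoff, sigma, s0, dts, n_samp, seed=0):
    """Average a terminal payoff over Euler-Maruyama sample paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samp):
        s = s0
        for i, dt in enumerate(dts):
            w = rng.gauss(0.0, 1.0)                # one SNRN per time step
            s += sigma(i, s) * math.sqrt(dt) * w   # Euler-Maruyama update
        total += payoff(s)
    return total / n_samp


# BS-like local volatility sigma(t, S) = 0.2 * S and an at-the-money call.
n_t = 20
dts = [1.0 / n_t] * n_t
bs_like = lambda i, s: 0.2 * s
call = lambda s: max(s - 100.0, 0.0)
price = mc_price(call, bs_like, 100.0, dts, 20000)
```

With these parameters the estimate should be close to the analytic BS value of about 7.97, up to a statistical error that shrinks as \(N_{\mathrm{samp}}^{-1/2}\) and a small discretization bias.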
Quantum algorithm for Monte Carlo simulation
In this section, we review two quantum methods for Monte Carlo simulation. We consider the problem of numerically estimating a weighted average of a given function \(f(s)\), that is, \(V:=\sum_{m} p_{m} f(s_{m})\). Here, \(s_{m}\) represents the mth value of a discretized random variable, and \(p_{m}\) is the probability that the variable takes the value \(s_{m}\). Equation (9) is a special case of this problem, where the integrand is \(f(\cdot )= \sum_{i} f^{\mathrm{pay}}_{i} (\cdot )\).
AE-type method
We first review the AE-type method discussed in Ref. [5], which directly encodes \(p_{m}\) in the amplitudes. It consists of the following three steps. First, we create a superposition of the inputs and the integrand values with amplitudes \(\sqrt{p_{m}}\), that is, \(\sum_{m}\sqrt{p_{m}}|s_{m}\rangle |f(s_{m})\rangle \). This step is called the state preparation step, and we need an oracle calculating \(f(s)\). Second, the integrand values are encoded in the amplitudes of an ancillary qubit by a controlled rotation. Assuming that f is normalized so that \(0\le f(s_{m})\le 1\), the quantum state is transformed as follows:
$$\begin{aligned} \sum_{m}\sqrt{p_{m}}|s_{m}\rangle |f(s_{m})\rangle |0\rangle \to \sum_{m}\sqrt{p_{m}}|s_{m}\rangle |f(s_{m})\rangle \bigl( \sqrt{1-f(s_{m})}|0\rangle + \sqrt{f(s_{m})}|1\rangle \bigr). \end{aligned}$$ (11)
Here, the first, second, and third kets refer to the random number register, the integrand register, and the ancilla, respectively. Finally, quantum amplitude estimation [20–24] on the ancilla gives an approximation of the desired value V. We note that the AE-type method does not directly use a classical Monte Carlo approximation like Eq. (10); its estimation error is instead induced by the QAE.
In this method, the number of calls to an oracle calculating \(f(s)\) is \(\mathcal{O}(\epsilon ^{-1})\) for an estimation error \(\epsilon >0\). Thus, the quantum algorithm is quadratically faster than the classical Monte Carlo algorithm, which requires \(\mathcal{O}(\epsilon ^{-2})\) calls. In the case of a multivariate integrand, the AE-type method requires as many random number registers as input variables. Thus, the number of qubits grows with the dimension of the integrand.
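To see why the ancilla carries the desired value, one can track the amplitudes classically for a small example. The sketch below assumes an integrand normalized to \([0,1]\) and a controlled rotation that puts \(\sqrt{f(s_{m})}\) on the ancilla state 1; the function name and the numbers are hypothetical, and no amplitude estimation is performed here.

```python
import math


def controlled_rotation(p, f_vals):
    """Return (amplitude on ancilla 0, amplitude on ancilla 1) for each m."""
    amps = []
    for p_m, f_m in zip(p, f_vals):
        root_p = math.sqrt(p_m)
        amps.append((root_p * math.sqrt(1.0 - f_m), root_p * math.sqrt(f_m)))
    return amps


p = [0.25, 0.5, 0.25]       # probabilities p_m
f_vals = [0.0, 0.4, 1.0]    # integrand values f(s_m), assumed in [0, 1]
amps = controlled_rotation(p, f_vals)

# Probability of measuring 1 on the ancilla and total norm of the state.
prob_one = sum(a1 * a1 for _, a1 in amps)
norm = sum(a0 * a0 + a1 * a1 for a0, a1 in amps)
```

Here `prob_one` coincides with \(V=\sum_{m} p_{m} f(s_{m})\), which is the quantity that the QAE estimates.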
PRN-type method
We here review the PRN-type quantum method for Monte Carlo integration [6], where we prepare a state different from Eq. (11) by using a PRN generator. We first consider the case of a univariate integrand. Let \(\{x_{j} \}_{j=0}^{\infty }\) be a PRN sequence in which the relative frequency of \(x_{j}=s_{m}\) equals \(p_{m}\). Since a PRN sequence usually corresponds to the uniform distribution, we use a transformation technique such as inverse transform sampling if necessary. Then, V can be approximated as \(V \approx \tilde{V}:=N_{\mathrm{samp}}^{-1}\sum_{j=1}^{N_{ \mathrm{samp}}} f(x_{j}) \) by Monte Carlo sampling. The error of the approximation scales as \(N_{\mathrm{samp}}^{-1/2}\). This approximation is the core of the PRN-type method, which estimates \(\tilde{V}\) by the QAE instead of directly estimating V. In the PRN-type method, we prepare a quantum state encoding \(f(x_{j}) \):
$$\begin{aligned} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{j=1}^{N_{\mathrm{samp}}} |j\rangle |x_{j}\rangle |f(x_{j})\rangle , \end{aligned}$$ (12)
and apply a controlled rotation and the QAE as in the AE-type method. Although there are two error sources in the PRN-type method, namely the sampling error and the QAE error, by setting the number of samples to \(N_{\mathrm{samp}}=\mathcal{O}(\epsilon ^{-2})\), we obtain a quadratic speedup over classical Monte Carlo integration.
In contrast to the AE-type method, the number of qubits does not depend on the dimension of the integrand in the PRN-type method. Let us consider a multivariate function \(f(s_{1},\dots , s_{n})\). We assume that we can calculate \(f(s_{1},\dots , s_{n})\) sequentially, that is, \(y_{n}=f(s_{1},\dots , s_{n})\) is calculated as
$$\begin{aligned} y_{j} = f_{j}(y_{j-1}, s_{j}), \quad j=1,\dots ,n, \end{aligned}$$ (13)
with \(y_{0}=0\) and two-variable functions \(\{f_{j}\}_{j=1}^{n}\). In calculating a sequential function, we do not need to simultaneously keep the input values \(s_{1},\dots , s_{n}\). The PRN-type method utilizes this property to reduce the number of qubits in integrating the multivariate function. To calculate the integral of \(f(s_{1},\dots , s_{n})\) with PRNs, we divide a PRN sequence \(\{x_{j} \}_{j=0}^{\infty }\) into \(N_{\mathrm{samp}}\) subsequences of length n, i.e., \(\{x_{j}\}_{j=1}^{n},\dots , \{x_{j}\}_{j=n(N_{\mathrm{samp}}-1)+1}^{nN_{ \mathrm{samp}}}\). Then, the integral is calculated as
$$\begin{aligned} \frac{1}{N_{\mathrm{samp}}}\sum_{i=1}^{N_{\mathrm{samp}}} y^{(i)}_{n}, \end{aligned}$$ (14)
where \(y^{(i)}_{j} = f_{j}(y^{(i)}_{j-1}, x_{(i-1)n+j})\) and i is the label of the subsequence. To realize sequential calculation in the PRN-type state preparation, we replace the random number register with two registers \(R_{\mathrm{samp}}\) and \(R_{\mathrm{PRN}}\), where \(R_{\mathrm{samp}}\) stores the label of the subsequence and \(R_{\mathrm{PRN}}\) stores an element of the subsequence, i.e., a PRN. We note that the number of qubits in \(R_{\mathrm{samp}}\) and \(R_{\mathrm{PRN}}\) is independent of n, i.e., the dimension of the integrand f. Then, the PRN-type state preparation with sequential calculation is as follows:

1. Create an equiprobable superposition of the labels of subsequences on \(R_{\mathrm{samp}}\): \(|0\rangle \to N_{\mathrm{samp}}^{-1/2}\sum_{i}|i\rangle \).
2. Generate a PRN on \(R_{\mathrm{PRN}}\): \(|i\rangle |0\rangle \to |i\rangle |x_{(i-1)n+1}\rangle \).
3. Calculate \(f_{1}\) and write its value to the integrand register: \(|x_{(i-1)n+1}\rangle |0\rangle \to |x_{(i-1)n+1}\rangle |y_{1}^{(i)}\rangle \).
4. Update the PRN: \(|x_{(i-1)n+1}\rangle \to |x_{(i-1)n+2}\rangle \).
5. Iterate operations 3 and 4 for \(j=1,\dots , n\).
Finally, we obtain the desired state:
Since no additional error to the aforementioned PRN-type method arises in this sequential calculation, this method provides a quadratic speedup compared to the classical calculation and uses a smaller number of qubits than the AE-type algorithm, which needs \(\mathcal{O}(n)\) qubits. The drawback of the PRN-type method is the \(\mathcal{O}(n)\) increase in circuit depth.
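The sequential evaluation over PRN subsequences can be emulated classically as below. The sketch keeps only one PRN variable and one integrand variable at a time, mirroring the two registers; the loop over labels stands in for the superposition over \(R_{\mathrm{samp}}\). The generator is a plain linear congruential generator chosen for illustration, and all names are hypothetical.

```python
M = 2**31


def lcg_next(x, a=1103515245, c=12345):
    """One step of a plain linear congruential generator (illustrative)."""
    return (a * x + c) % M


def sequential_prn_average(f_steps, n_samp, seed=1):
    """Average y_n over n_samp consecutive PRN subsequences of length n."""
    x = seed          # single PRN variable, updated in place
    total = 0.0
    for _ in range(n_samp):          # label i of the subsequence
        y = 0.0                      # y_0 = 0
        for f_j in f_steps:          # y_j = f_j(y_{j-1}, x_{(i-1)n+j})
            x = lcg_next(x)
            y = f_j(y, x / M)        # uniform PRN in [0, 1)
        total += y
    return total / n_samp


# Integrand: the mean of n uniform variables, whose expectation is 0.5.
n = 4
f_steps = [lambda y, u: y + u / n] * n
estimate = sequential_prn_average(f_steps, 4000)
```

The memory used does not grow with `n`, while the inner loop length does, which is the qubit versus depth trade-off described above.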
Remarks
To conclude this section, we give some remarks. Even in classical computation, we can achieve a quadratic speedup over classical Monte Carlo integration by using low-discrepancy sequences instead of (pseudo)random numbers. This algorithm is known as the quasi-Monte Carlo method and is used in some pricing tasks. Thus, we cannot say that quantum algorithms for integration are better than the best classical algorithm in terms of the asymptotic behavior of the estimation error with respect to the calculation time. However, the dependence of the complexity on the dimension of the function is known to be worse in the quasi-Monte Carlo method than in the ordinary Monte Carlo method. For this reason, we expect that quantum algorithms are beneficial for the integration of high-dimensional functions such as Eq. (2).
After the first version of this paper appeared as a preprint, Refs. [10, 25] pointed out that the Grover-Rudolph method [14] for preparing distributions as amplitudes can eliminate the quadratic speedup. The AE-type method for the LV model with the Grover-Rudolph state preparation might face a similar obstacle.^{Footnote 4} On the other hand, the PRN-type method is free from such a problem because it does not encode the probability distribution.
Quantum circuits for the LV model
This section presents quantum circuits for the state preparation in two methods: the PRN-type and the AE-type methods.
Elementary gate
Before presenting our proposals, we list the elementary gates used in the following discussion:

Adder: \(|x\rangle |y\rangle \rightarrow |x+y\rangle |y\rangle \)

Controlled Adder: \(|c\rangle |x\rangle |y\rangle \rightarrow \begin{cases} |c\rangle |x+y\rangle |y\rangle & \text{for } c=1, \\ |c\rangle |x\rangle |y\rangle & \text{for } c=0 \end{cases}\)

Multiplier: \(|x\rangle |y\rangle |z\rangle \rightarrow |x\rangle |y\rangle |z+xy\rangle \)

Divider: \(|x\rangle |y\rangle |0\rangle \rightarrow |x\rangle |y\rangle |x/y\rangle \)
Implementations of these elementary arithmetic operations are studied in many works [26–46]. With these gates, we can construct the other arithmetic operations we use. For example, subtraction \(|x\rangle |y\rangle \rightarrow |x-y\rangle |y\rangle \) can be done as addition of the 2’s complement of y. The 2’s complement of an n-bit number y is defined as \(2^{n}-y\), which is equivalent to −y modulo \(2^{n}\). Moreover, comparison \(|x\rangle |y\rangle |z\rangle \rightarrow |x\rangle |y\rangle |z\oplus (x>y)\rangle \) can be done as subtraction in the 2’s complement method, since the most significant bit represents whether the result of the subtraction is positive or negative. Thus, a comparator is constructed from two adders, including uncomputation.
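The two constructions just described, subtraction via the 2’s complement and comparison via the most significant bit, can be checked with ordinary integer arithmetic. The sketch below is a classical illustration with hypothetical function names; it uses one extra bit for the comparison so that the sign of the difference is always representable.

```python
def sub_twos_complement(x, y, n):
    """Compute (x - y) mod 2^n by adding the 2's complement of y."""
    mask = (1 << n) - 1
    neg_y = (~y + 1) & mask      # 2^n - y, i.e. -y modulo 2^n
    return (x + neg_y) & mask


def greater_than(x, y, n):
    """Return 1 if x > y for unsigned n-bit x and y, else 0.

    The sign bit of y - x in (n + 1)-bit 2's complement is set exactly
    when y - x is negative.
    """
    diff = sub_twos_complement(y, x, n + 1)
    return (diff >> n) & 1


example_diff = sub_twos_complement(3, 5, 4)   # -2 modulo 16
example_cmp = greater_than(5, 3, 4)
```

On a quantum register the comparison result would be copied to the target qubit and the subtraction then undone, which is why two adders are counted.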
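We also note that the above multiplier uses two registers as operands and outputs the product into another register. However, we need a self-update type of multiplier, which updates one of the input registers with the product. Such an operation is realized by the following trick:
$$\begin{aligned} |x\rangle |y\rangle |0\rangle \to |x\rangle |y\rangle |xy\rangle \to |xy\rangle |y\rangle |x\rangle \to |xy\rangle |y\rangle |0\rangle . \end{aligned}$$ (16)
Here, the first step is the original multiplication. The second step is a swap between the first and third registers. The third step is the inverse operation of division, which clears the third register because \(xy/y=x\).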
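The three steps of the self-update multiplication can be traced on classical register values. The following sketch uses integers and hypothetical function names; the inverse divider is modeled as an operation that resets the third register when it holds the quotient of the first two, and it assumes \(y\ne 0\).

```python
def multiplier(x, y, z):
    """|x>|y>|z> -> |x>|y>|z + xy>."""
    return x, y, z + x * y


def swap_first_third(x, y, z):
    """Swap the first and third registers."""
    return z, y, x


def inverse_divider(x, y, z):
    """Inverse of |x>|y>|0> -> |x>|y>|x/y>: clears z when z == x / y."""
    assert y != 0 and x % y == 0 and z == x // y
    return x, y, 0


def self_update_multiply(x, y):
    regs = multiplier(x, y, 0)        # (x, y, xy)
    regs = swap_first_third(*regs)    # (xy, y, x)
    regs = inverse_divider(*regs)     # (xy, y, 0)
    return regs


result = self_update_multiply(6, 7)
```

The first register ends up holding the product while the ancillary third register returns to zero, so it can be reused.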
PRN-type method
Calculation flow
We present the calculation flow of the PRN-type state preparation for pricing in the LV model. Our purpose is to estimate Eq. (10) by the PRN-type Monte Carlo method presented in Sect. 3.2. We show the detailed calculation flow to realize operation (15) in the case where the desired value is given by Eq. (10).
Before presenting the calculation flow, we explain our setup. We generate \(N_{\mathrm{samp}}=2^{n_{\mathrm{samp}}} \) sample paths of length \(n_{\mathrm{t}}\) by using pseudo standard normal random numbers (PSNRNs). In this algorithm, we prepare the following registers:

\(R_{\mathrm{samp}}\) is a register for an index of the sample path and consists of \(n_{\mathrm{samp}}\) qubits.

\(R_{W}\) is a register for a PSNRN used to calculate the asset price evolution.

\(R_{S}\) is a register for the value of the asset price.

\(R_{\mathrm{payoff}}\) is a register for the payoffs.
We note that \(R_{W}\) corresponds to the PRN register, and \(R_{\mathrm{payoff}}\) corresponds to the integrand register in Sect. 3.2. On the other hand, \(R_{S}\) is an ancillary register tailored to calculating the specific integrand and has no counterpart. Although some ancillary registers are needed in addition to the above registers, we omit them in the main calculation flow.
We assume that the following gates are available to generate a sequence of PSNRNs.

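\(J_{W}\) acts on \(R_{\mathrm{samp}}\otimes R_{W}\) and sets the initial value of the PSNRN subsequence: \(J_{W}|i\rangle |0\rangle =|i\rangle |x_{in_{\mathrm{t}}+1}\rangle \), where \(n_{\mathrm{t}}\) is the number of time steps.

\(P_{W}\) advances a PSNRN sequence by one step: \(P_{W}|x_{j}\rangle = |x_{j+1}\rangle \), where \(x_{j}\) is the jth element of the PSNRN sequence.
Applying these gates to a superposition of \(N_{\mathrm{samp}} \) states, we obtain \(N_{\mathrm{samp}} \) PSNRN subsequences:
$$\begin{aligned} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |0\rangle \xrightarrow{J_{W}} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+1}\rangle \xrightarrow{P_{W}} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+2}\rangle \xrightarrow{P_{W}} \cdots \end{aligned}$$ (17)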
We also use gate \(U_{j}\) acting on \(R_{W} \otimes R_{S} \otimes R_{\mathrm{payoff}}\), which calculates the jth time step of asset price evolution and the payoff as follows:
In other words, \(U_{j}\) performs the time evolution (8) by using the value on \(R_{W}\) as \(w_{j}\). After that, \(U_{j}\) calculates the payoff at time \(t_{j}\) and adds its value into \(R_{\mathrm{payoff}}\). The concrete implementation of these gates is presented in the next subsection.
The calculation flow of the PRN-type method is as follows:
1. Initialize \(R_{S}\) to \(|S_{t_{0}}\rangle \) and the other registers to \(|0\rangle \).
2. Generate \(\frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle \) on \(R_{\mathrm{samp}}\). This is done by applying a Hadamard gate to each qubit of \(R_{\mathrm{samp}}\).
3. Apply \(J_{W}\) to \(R_{\mathrm{samp}}\otimes R_{W}\). This step sets the initial value of the PSNRN subsequence.
4. Apply \(U_{j}\) to \(R_{W} \otimes R_{S} \otimes R_{\mathrm{payoff}}\) to simulate the asset price evolution.
5. Apply \(P_{W}\) to \(R_{W}\), which advances the PSNRN on \(R_{W}\) by one step, e.g., from \(x_{in_{\mathrm{t}}+1}\) to \(x_{in_{\mathrm{t}}+2}\) in the first iteration.
6. Iterate operations 4 and 5 \(n_{\mathrm{t}}\) times.
The flow of the corresponding state transformations is as follows:
where the first, second, third and fourth kets correspond to \(R_{\mathrm{samp}}\), \(R_{W}\), \(R_{S}\) and \(R_{\mathrm{payoff}}\), respectively. The quantum circuit realizing the flow (19) is schematically shown in Fig. 1.
The implementation of \(U_{j}\) is shown in Fig. 2, where the subroutine gates \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}+1}\) are used to update the asset price according to Eqs. (7) and (8), and the gate \({\mathrm{payoff}}_{j}\) calculates \(f^{\mathrm{pay}}_{j}(S^{(i)}_{t_{j}})\) and adds its value into \(R_{\mathrm{payoff}}\). The subroutine \(V^{(j)}_{k}\) realizes the following three operations:

Checks whether the asset price on \(R_{S}\) is in the kth interval \([s_{j,k-1},s_{j,k})\).

If that is the case, updates the asset price by Eq. (8) with \(\sigma (t_{j} ,S^{(i)}_{t_{j}}) = a_{j, k}S^{(i)}_{t_{j}} + b_{j,k}\).

Clears all the intermediate outputs.
This procedure requires three ancillary registers, \(R_{\mathrm{count}}\), \(R_{S^{\prime }}\) and \(R_{g}\). \(R_{\mathrm{count}}\) stores an indicator of whether the jth step of evolution has already been done. If the jth update has already been done, the asset price is not updated, which is necessary to avoid double updating in a single step. \(R_{g}\) stores the check result.
We note that there is a restriction on implementing the LV model in the PRN-type method. Through the operations \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}+1}\), the state is transformed from \(|j\rangle |S^{(i)}_{t_{j}}\rangle \) to \(|j+1\rangle |S^{(i)}_{t_{j+1}}\rangle \), where the first and second kets represent the states of \(R_{\mathrm{count}}\) and \(R_{S}\), respectively, and unchanged registers are omitted. Because of unitarity, this map must be a one-to-one correspondence, which restricts the parameters. As shown in the Appendix, unitarity is guaranteed if we set the parameters \(a_{i,j}\) and \(b_{i,j}\) so that \(\sigma (t,S)\) is continuous with respect to S and set \(\Delta t_{j}\) sufficiently small.
Implementation of subroutines
We now consider how to implement subroutines used in the PRNtype method by arithmetic operations in Sect. 4.1.
Implementation of \(V^{(j)}_{k}\)
At the start of \(V^{(j)}_{k}\), \(R_{\mathrm{count}}\) takes \(|j\rangle \) or \(|j+1\rangle \), and the other ancillary registers take \(|0\rangle \). Then, the detailed calculation flow of \(V^{(j)}_{k}\) is as follows:
1. Check whether \(R_{\mathrm{count}}\) equals j and \(R_{S}\) is in \([s_{j,k-1},s_{j,k})\). If the check is passed, flip \(R_{g}\).
2. If \(R_{g}\) is 1, update \(R_{S}\) as
$$\begin{aligned} S_{t_{j}}\rightarrow S_{t_{j+1}}=S_{t_{j}}+(a_{j,k}S_{t_{j}}+b_{j,k}) \sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}, \end{aligned}$$ (20)
where \(x_{in_{\mathrm{t}}+j}\) is the value on \(R_{W}\), and add 1 to \(R_{\mathrm{count}}\).
3. Calculate
$$\begin{aligned} \frac{S-b_{j,k}\sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}}{1+a_{j,k}\sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}} \end{aligned}$$ (21)
into \(R_{S^{\prime }}\), where S is the value on \(R_{S}\).
4. If \(R_{\mathrm{count}}\) is \(j+1\) and \(R_{S^{\prime }}\) is in \([s_{j,k-1},s_{j,k})\), flip \(R_{g}\). This uncomputes \(R_{g}\).
5. Do the inverse operation of 3.
If and only if the jth step has not been done and the asset price is in \([s_{j,k-1},s_{j,k})\), the asset price is updated with the LV function \(a_{j,k}S+b_{j,k}\). To realize this conditional update, the check result is output to \(R_{g}\), and the gate performing the update (20) is controlled by \(R_{g}\). The increment of \(R_{\mathrm{count}}\) is also controlled by \(R_{g}\) so that \(R_{\mathrm{count}}\) indicates the completion of the jth update. Steps 3–5 are necessary to clear \(R_{g}\). Expression (21) inverts the update (20), so from the result of Step 3 we can determine whether the update has been done in Step 2. In Step 4, \(R_{g}\) is flipped if and only if it is \(|1\rangle \), so it goes back to the initial state \(|0\rangle \). In summary, through the sequential operation of \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}+1}\), \(R_{S}\) is updated only once at the appropriate \(V^{(j)}_{k}\), \(R_{\mathrm{count}}\) is updated from \(|j\rangle \) to \(|j+1\rangle \), and all intermediate outputs on ancillary registers are cleared. See also Fig. 3.
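The compute, update, and uncompute logic of \(V^{(j)}_{k}\) can be followed on classical values. The sketch below emulates Steps 1 to 5 with exact rational arithmetic and checks that \(R_{g}\) and \(R_{S^{\prime }}\) return to zero after every subroutine. The function names, the use of `None` for an unbounded interval end, and the two-interval example with a continuous volatility are assumptions made for illustration.

```python
from fractions import Fraction as F


def in_interval(s, lo, hi):
    """s in [lo, hi); None means unbounded on that side."""
    return (lo is None or lo <= s) and (hi is None or s < hi)


def apply_v(count, s, x, j, interval, a, b, sqrt_dt):
    lo, hi = interval
    g = 0
    if count == j and in_interval(s, lo, hi):             # Step 1
        g ^= 1
    if g == 1:                                            # Step 2, Eq. (20)
        s = s + (a * s + b) * sqrt_dt * x
        count += 1
    s_prime = (s - b * sqrt_dt * x) / (1 + a * sqrt_dt * x)   # Step 3, Eq. (21)
    if count == j + 1 and in_interval(s_prime, lo, hi):   # Step 4
        g ^= 1
    s_prime = 0                                           # Step 5 (inverse of Step 3)
    return count, s, g, s_prime


def evolve_one_step(s, x, j, intervals, coeffs, sqrt_dt):
    count = j
    for interval, (a, b) in zip(intervals, coeffs):
        count, s, g, s_prime = apply_v(count, s, x, j, interval, a, b, sqrt_dt)
        assert g == 0 and s_prime == 0    # ancillas are cleared each time
    return count, s


# sigma = 20 below S = 100 and sigma = 0.2 * S above it (continuous at 100).
intervals = [(None, F(100)), (F(100), None)]
coeffs = [(F(0), F(20)), (F(1, 5), F(0))]
low_path = evolve_one_step(F(90), F(1, 2), 3, intervals, coeffs, F(1, 10))
high_path = evolve_one_step(F(105), F(1, 2), 3, intervals, coeffs, F(1, 10))
```

In both examples the price is updated exactly once and the counter advances from 3 to 4. If the volatility were discontinuous or the time step too large, the check in Step 4 could fail for prices near a grid point, which reflects the restriction on the parameters noted above.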
Most subparts of \(V^{(j)}_{k}\) can be constructed from arithmetic operations: addition, subtraction, multiplication, division, and comparison. For example, let us consider the gate \(z\leftarrow z \oplus (x=j \ \mathrm{and} \ y\in I)\), which is divided into the following two parts. The first part checks whether the value on \(R_{\mathrm{count}}\) equals j. This check can be done by the multi-controlled Toffoli gate, which is studied in Refs. [18, 47, 48]. The second part checks whether the asset price is in a given interval, which can be constructed from two comparisons. Combining these, the gate \(z\leftarrow z \oplus (x=j \ \mathrm{and} \ y\in I)\) is constructed as shown in Fig. 4. Note that the bitwise flips \(X^{1-j_{0}} \otimes \cdots \otimes X^{1-j_{n_{x}-1}}\) are applied before the multi-controlled Toffoli gate. Here, \(j_{a}\) is the ath digit of the binary representation of j, so the ath qubit is flipped if and only if \(j_{a}=0\). This converts \(|x\rangle \) to \(|1\rangle \dots |1\rangle \) if and only if \(x=j\).
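The flag computation can be mimicked with bit operations. In the sketch below, the conditional bit flips, the all-ones test that stands in for the multi-controlled Toffoli gate, and the two comparisons are combined into one function; the function name and argument order are hypothetical.

```python
def flag_update(z, x, j, y, lo, hi, n_x):
    """Return z XOR (x == j and lo <= y < hi) for an n_x-bit counter x."""
    mask = (1 << n_x) - 1
    flips = ~j & mask                 # bit a is set where j_a = 0: X^{1 - j_a}
    x_flipped = x ^ flips             # becomes all ones iff x == j
    equal = int(x_flipped == mask)    # control condition of the Toffoli gate
    ge_lo = int(y >= lo)              # first comparison
    lt_hi = int(y < hi)               # second comparison
    # Applying the same flips again would restore x on the register.
    return z ^ (equal & ge_lo & lt_hi)


flag = flag_update(0, 5, 5, 7, 0, 10, 3)
```

Calling the function a second time with the same inputs returns `z` to its original value, matching the way the gate is reused for uncomputation.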
The operation \(x\leftarrow x+(ax+b)y\) in Fig. 3 can be realized as follows:
where the third ket corresponds to an ancillary register. The first step just sets a constant on the ancillary register. The second step is the multiplication by a. The third step is the self-update multiplication. The fourth step is the multiplication by b, and the final step is the uncomputation of the first and second steps. Note that this is done under control by \(R_{g}\). For the operation to be controlled, it is sufficient to control only the second, fourth and final arrows, because the third arrow becomes multiplication by 1 without the second. Also note that multiplication by an n-bit constant can be done by n adders, that is, n shift-and-adds: \(ax=\sum_{i=0}^{n-1}{a_{i}2^{i}x}\), where \(a_{i}\) is the ith bit of a. This method saves qubits compared with using a multiplier, where we need to hold a on an ancillary register.
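The five steps above can be traced classically on integer values. The sketch below is an illustration under assumptions: it uses nonnegative integers instead of fixed-point numbers, the function names are hypothetical, and the control is modelled as acting on the \(ay\) and \(by\) terms while the constant 1 on the ancilla is set and cleared unconditionally.

```python
def mul_const(a: int, x: int, n: int) -> int:
    """a * x for an n-bit constant a, as n shift-and-adds:
    a * x = sum_i a_i * 2^i * x."""
    acc = 0
    for i in range(n):
        if (a >> i) & 1:           # a_i, the ith bit of the constant
            acc += x << i          # add 2^i * x
    return acc


def lv_update(x: int, y: int, a: int, b: int, control: int = 1, n: int = 8):
    """Trace x <- x + (a*x + b)*y with one ancilla; returns (x, y, ancilla)."""
    anc = 1                        # step 1: set the constant 1
    if control:
        anc += mul_const(a, y, n)  # step 2: ancilla becomes 1 + a*y
    x = x * anc                    # step 3: self-update multiplication
    if control:
        x += mul_const(b, y, n)    # step 4: add b*y
    if control:
        anc -= mul_const(a, y, n)  # step 5: uncompute step 2 ...
    anc -= 1                       # ... and step 1
    return x, y, anc
```

With the control off, the third step multiplies by 1, so the asset price register is left unchanged, in line with the remark above.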
The operation \(x\leftarrow (x-by)/(1+ay)\) in Fig. 3 is done as follows:
where the first, second, third and fourth states correspond to \(R_{S}\), \(R_{W}\), an ancillary register and \(R_{S^{\prime }}\), respectively. The first and second steps are the same as Eq. (22), the third step is the multiplication by −b, and the final step is division. Here, we do not have to uncompute \(R_{S}\) and the ancillary register because the whole of this operation is uncomputed soon after in \(V_{j,k}\).
Implementation of \(J_{W}\) and \(P_{W}\)
In Ref. [6], the implementation of PRNs on quantum circuits is based on the permuted congruential generator (PCG) [49], which is a PRN generation algorithm with small memory requirements. We use the following two gates to run PCG: (i) \(J_{\mathrm{PRN}}\) lets the PRN sequence jump to the \((in_{\mathrm{t}}+1)\)th element. (ii) \(P_{\mathrm{PRN}}\) progresses the PRN sequence by one step. Since PCG basically generates uniform PRNs, we transform them to PSNRNs by inverse transform sampling. The implementations of \(J_{W}\) and \(P_{W}\) are schematically shown in Fig. 5.
Although we refer to Ref. [6] for the details of the implementation of the PRN generator, we briefly explain it here. PCG is a combination of a linear congruential generator (LCG) and a permutation of the bit string. For the LCG, the update of the PRN sequence is done by
where a and N are positive integers, c is a nonnegative integer. From the above equation, the nth element of the sequence is computed from the initial value \(x_{0}\) by
We can implement Eqs. (24) and (25) using only controlled adders. According to Ref. [26], the modular adder can be constructed from 5 plain adders. Modular multiplication by an n-bit constant can be done as n modular shift-and-adds. Modular division by a constant \(a-1\) can be done as modular multiplication by an integer β such that \(\beta (a-1)=1 \bmod N\). Modular exponentiation \(a^{x} \bmod N\) is computed as a sequence of controlled modular multiplications [26]. We do not explain the permutation; see Ref. [6] for the details. We only note that it is implemented by a simple circuit; for example, Xorshift is implemented as a sequence of CNOT gates.
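As a classical illustration of these building blocks, the sketch below assumes the standard LCG forms \(x_{n+1}=(ax_{n}+c) \bmod N\) and \(x_{n}= (a^{n}x_{0}+c(a^{n}-1)/(a-1) ) \bmod N\). The function names and the parameter values in the usage note are illustrative and are not those of PCG; a prime modulus is chosen there only so that the inverse β of \(a-1\) exists.

```python
def mod_mul_const(a: int, x: int, N: int) -> int:
    """a * x mod N as a sequence of modular shift-and-adds over the bits of a."""
    acc = 0
    term = x % N
    for i in range(a.bit_length()):
        if (a >> i) & 1:
            acc = (acc + term) % N      # modular addition
        term = (term + term) % N        # modular doubling (the "shift")
    return acc


def lcg_step(x: int, a: int, c: int, N: int) -> int:
    """Progress the LCG sequence by one step."""
    return (mod_mul_const(a, x, N) + c) % N


def lcg_jump(x0: int, n: int, a: int, c: int, N: int) -> int:
    """Jump directly to the nth element from x0.
    Division by (a - 1) is replaced by multiplication with beta,
    where beta * (a - 1) = 1 mod N (requires gcd(a - 1, N) = 1)."""
    a_n = pow(a, n, N)                  # modular exponentiation
    beta = pow(a - 1, -1, N)            # modular inverse (Python 3.8+)
    return (a_n * x0 + c * (a_n - 1) * beta) % N
```

For example, with the illustrative values `a = 48271`, `c = 12345` and `N = 2**31 - 1`, calling `lcg_jump(x0, 1000, a, c, N)` gives the same value as applying `lcg_step` 1000 times to `x0`.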
The step by step transformation of the implementation of Eq. (24) is as follows:
Here, the first register is \(R_{\mathrm{PRN}}\), and the other registers are ancillary registers. The first step is modular multiplication. The second step is the inverse modular multiplication by an integer α such that \(a\alpha = 1 \bmod N\), which is necessary to avoid the increase of ancillae. The third step is the load of c into an ancillary register, the fourth step is modular addition, and the last step is to unload. Equation (25) progresses as follows:
where the first and third registers are \(R_{\mathrm{samp}}\) and \(R_{\mathrm{PRN}}\), and the other registers are ancillary registers. The first step is modular exponentiation, the second step is modular multiplication, the third step is loading, the fourth step is modular addition, and the last step is uncomputation of the first and third steps.
Implementation of \(\Phi _{\mathrm{SN}}^{-1}\)
We also need a gate to calculate \(\Phi _{\mathrm{SN}}^{-1}\), the inverse function of the CDF of the standard normal distribution. We adopt the method in Ref. [50], where \(\Phi _{\mathrm{SN}}^{-1}\) is approximated by a piecewise polynomial function. Let us set the number \(n_{\mathrm{ICDF}}\) of intervals to 109 and the polynomials to be cubic, that is, \(\Phi _{\mathrm{SN}}^{-1}\) is approximated as
in \(x^{\mathrm{ICDF}}_{m-1}\le x< x^{\mathrm{ICDF}}_{m}\), where \(\{x^{\mathrm{ICDF}}_{m} \}_{m=0}^{n_{\mathrm{ICDF}}} \) are the end points of the intervals. This approximation achieves an error smaller than \(10^{-6}\). Such a piecewise cubic function can be implemented as in Fig. 6. The sequence of comparators and “Load \(c_{m,i}\)’s” gates loads the appropriate values of \(c_{m,0},\dots ,c_{m,3}\) into the registers \(R_{c,0},\dots ,R_{c,3}\), respectively, as explained later. After the coefficients are loaded, the cubic function is calculated by Horner’s method, which is based on the following representation
Horner’s method is realized only by adders and multipliers as presented in the latter half of the circuit in Fig. 6.
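For reference, Horner's method evaluates a cubic \(c_{0}+c_{1}x+c_{2}x^{2}+c_{3}x^{3}\) as \(c_{0}+x(c_{1}+x(c_{2}+xc_{3}))\). The following minimal sketch shows this together with a plain classical selection of the interval; the function names, the `knots` list and the coefficient tuples are hypothetical, and the lookup by `bisect` stands in for the comparator-based loading used in the circuit.

```python
from bisect import bisect_right


def horner_cubic(c0, c1, c2, c3, x):
    """Evaluate c0 + c1*x + c2*x^2 + c3*x^3 with three multiply-add steps."""
    acc = c3
    acc = acc * x + c2
    acc = acc * x + c1
    acc = acc * x + c0
    return acc


def piecewise_cubic(x, knots, coeffs):
    """Pick the interval m with knots[m] <= x < knots[m+1] and evaluate its cubic.
    coeffs[m] = (c0, c1, c2, c3); values outside the knots use the end intervals."""
    m = bisect_right(knots, x) - 1
    m = min(max(m, 0), len(coeffs) - 1)
    return horner_cubic(*coeffs[m], x)
```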
Let us explain how to load the appropriate coefficients. Comparing the input value x of \(R_{\mathrm{PRN}}\) with the grid point \(x^{\mathrm{ICDF}}_{m}\), the mth comparator flips a qubit on \(R_{g}\) if \(x< x^{\mathrm{ICDF}}_{m}\). The register \(R_{g}\) governs the activation of the “Load \(c_{m,i}\)’s” gate, that is, the “Load \(c_{m,i}\)’s” gate is activated if \(R_{g}\) is 1 at the mth step. If \(x \ge x^{\mathrm{ICDF}}_{n_{\mathrm{ICDF}}}\), only the “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s” gate is activated, and \(c_{n_{\mathrm{ICDF}}+1,0},\dots ,c_{n_{\mathrm{ICDF}}+1,3}\) are loaded to the registers. However, if \(x^{\mathrm{ICDF}}_{n_{\mathrm{ICDF}}-1} \le x < x^{\mathrm{ICDF}}_{n_{ \mathrm{ICDF}}}\), both the “Load \(c_{n_{\mathrm{ICDF}},i}\)’s” and “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s” gates are performed. Hence, we have to set “Load \(c_{n_{\mathrm{ICDF}},i}\)’s” so that it compensates for the effect of “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s”. More generally, the activated gates are “Load \(c_{m,i}\)’s” with \(m=M, M+2, \dots , n_{\mathrm{ICDF}}, n_{\mathrm{ICDF}}+1\) if \(n_{\mathrm{ICDF}}-M\) is even and with \(m=M,M+2,\dots ,n_{\mathrm{ICDF}}-1,n_{\mathrm{ICDF}}+1\) if \(n_{\mathrm{ICDF}}-M\) is odd. This is because \(R_{g}\) is flipped by all comparators from the Mth step onward and alternates between 0 and 1. Considering those, we set the X gates in “Load \(c_{m,i}\)’s” as in Fig. 7, so that \(c_{m,0},\dots ,c_{m,3}\) for the appropriate m are loaded after the sequence of all activated gates.
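The compensation described above can be checked with a small classical simulation. This is a sketch under assumptions: each coefficient set is represented by a single integer bit pattern, a load gate is modelled as an XOR of a precomputed pattern, the last load gate is taken to act unconditionally, and all names are hypothetical. It is not a transcription of Fig. 7.

```python
def compensated_loads(coeffs):
    """coeffs[M-1] is the bit pattern c_M for interval M = 1..n+1.
    Returns d (1-indexed) such that XORing d over the activated gates
    M, M+2, ... (up to n) and n+1 leaves exactly c_M in the register."""
    n = len(coeffs) - 1
    d = [0] * (n + 2)
    d[n + 1] = coeffs[n]                    # last gate loads c_{n+1} directly
    for M in range(n, 0, -1):
        acc = coeffs[M - 1] ^ d[n + 1]
        for m in range(M + 2, n + 1, 2):    # later gates that also fire
            acc ^= d[m]
        d[M] = acc
    return d


def load_by_toggling(x, grid, d):
    """Simulate the comparator/load sequence for input x.
    grid[m-1] is the mth grid point; the flag plays the role of R_g."""
    n = len(grid)
    flag, reg = 0, 0
    for m in range(1, n + 1):
        if x < grid[m - 1]:
            flag ^= 1                       # mth comparator flips R_g
        if flag:
            reg ^= d[m]                     # mth load gate is activated
    reg ^= d[n + 1]                         # final load gate
    return reg
```

For an input in the Mth interval the flag alternates between 1 and 0 from the Mth step onward, which reproduces the activation pattern stated in the text.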
Implementation of payoff
In this paper, we do not consider gates to calculate payoffs in detail because the resources these gates require are the same in both the PRN-type method and the AE-type method. We make just a short comment here. In many cases, a payoff is expressed as
where \(a_{i}\), \(b_{i}\), \(c_{i}\), \(f_{i}\) are real constants. That is, a payoff is a linear function of the asset price with the upper bound (cap) \(c_{i}\) and the lower bound (floor) \(f_{i}\). For example, the payoff of a European call option (1) corresponds to the case of \(a_{i}=1\), \(b_{i}=K\), \(c_{i}=+\infty \), \(f_{i}=0\). The right-hand side of Eq. (30) can be calculated by a combination of comparators, adders, and multipliers.
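A capped and floored linear payoff of this kind can be sketched as follows. The specific form \(\min (\max (a_{i}(S-b_{i}),f_{i}),c_{i})\) is an assumption chosen to be consistent with the description and with the call-option example; a form such as \(a_{i}S-b_{i}\) inside the bounds would reproduce the same example. The function name is hypothetical.

```python
import math


def capped_floored_payoff(s: float, a: float, b: float,
                          cap: float = math.inf,
                          floor: float = -math.inf) -> float:
    """Linear function of the asset price s, bounded below by floor and above by cap.
    One comparison per bound, one subtraction and one multiplication."""
    return min(max(a * (s - b), floor), cap)


def european_call(s: float, strike: float) -> float:
    """Special case a = 1, b = K, cap = +infinity, floor = 0."""
    return capped_floored_payoff(s, 1.0, strike, math.inf, 0.0)
```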
The AE-type method
Calculation flow
The AE-type method is simpler than the PRN-type method, but it requires more registers. In the AE-type method, we use the following registers:

\(R_{W_{i}}\) is a register for the ith SNRN (\(i=1,\dots ,n_{ \mathrm{t}}\)).

\(R_{S_{i}}\) is a register for the asset price at time \(t_{i}\) (\(i=0, \dots ,n_{\mathrm{t}}\)).

\(R_{{\mathrm{payoff}},i}\) is a register for the sum of payoffs by \(t_{i}\) (\(i=1,\dots ,n_{\mathrm{t}}\)).
We again omit ancillary registers. In \(R_{W_{i}}\), an SNRN is encoded into a superposition state \(|\mathrm{SN}\rangle \), which is defined as
where \(p_{{\mathrm{SN}},i}:=\Phi _{\mathrm{SN}}(x_{{\mathrm{SN}},i+1}) -\Phi _{\mathrm{SN}}(x_{{\mathrm{SN}},i})\). Here, \(x_{{\mathrm{SN}},0}< x_{{\mathrm{SN}},1}<\cdots <x_{{\mathrm{SN}},N_{ \mathrm{SN}}}\) are the \(N_{\mathrm{SN}}+1\) equally spaced points for discretizing the distribution. We also assume \(N_{\mathrm{SN}}=2^{n_{\mathrm{dig}}}\) with the bit size \(n_{\mathrm{dig}}\) of floating point number for simplicity. We discuss a gate creating such a state in the next subsection.
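The probabilities \(p_{{\mathrm{SN}},i}\) and the corresponding amplitudes can be computed classically as in the sketch below. It assumes that the amplitude attached to the ith basis state is \(\sqrt{p_{{\mathrm{SN}},i}}\); the renormalization by the total mass inside the truncated range, the range \([-4,4]\) in the usage note and the function names are choices made here for illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sn_amplitudes(x_min: float, x_max: float, n_dig: int):
    """Amplitudes sqrt(p_i) on 2^n_dig equally spaced intervals of [x_min, x_max],
    with p_i = Phi(x_{i+1}) - Phi(x_i), renormalized so that sum p_i = 1."""
    n_pts = 2 ** n_dig
    step = (x_max - x_min) / n_pts
    grid = [x_min + i * step for i in range(n_pts + 1)]
    probs = [std_normal_cdf(grid[i + 1]) - std_normal_cdf(grid[i])
             for i in range(n_pts)]
    total = sum(probs)                  # slightly below 1 because tails are cut
    return [math.sqrt(p / total) for p in probs]
```

For instance, `sn_amplitudes(-4.0, 4.0, 4)` returns 16 amplitudes whose squares sum to 1 up to rounding.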
The calculation flow of the AE-type method is as follows:

1
Initialize \(R_{S_{0}}\) to \(|S_{t_{0}}\rangle \) and the others to \(|0\rangle \).

2
Generate \(|\mathrm{SN}\rangle \) on each of \(R_{W_{1}},\dots ,R_{W_{n_{\mathrm{t}}}}\).

3
Calculate \(S_{t_{1}}\) by the time evolution (8) and output the result to \(R_{S_{1}}\).

4
Calculate the payoff at time \(t_{1}\) and add its value to \(R_{{\mathrm{payoff}},1}\).

5
Iterate operations 3-4 \(n_{\mathrm{t}}\) times. Then, we obtain a superposition of states in which the value on \(R_{{\mathrm{payoff}},n_{\mathrm{t}}}\) is the sum of payoffs for each path.
The flow of the corresponding state transformation is as follows. Writing only \(R_{W_{1}},\dots , R_{W_{n_{\mathrm{t}}}}\), \(R_{S_{0}},R_{S_{1}},\dots ,R_{S_{n_{\mathrm{t}}}}\) and \(R_{{\mathrm{payoff}},1},\dots ,R_{{\mathrm{payoff}},n_{\mathrm{t}}}\),
where \(S^{(i_{1}\dots i_{j})}_{t_{j}}\) is the value of the asset price at time \(t_{j}\) evolved by \(w_{1}=x_{{\mathrm{SN}},i_{1}},\dots ,w_{j}=x_{{\mathrm{SN}},i_{j}}\).
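What the final superposition encodes can be illustrated by enumerating all index tuples classically for a small \(n_{\mathrm{t}}\) and a coarse grid. In this sketch the evolution step is taken to be \(S\leftarrow S+(aS+b)w\) with \(w=x_{{\mathrm{SN}},i}\), following the operation used in the circuit; the callables `lv_coeffs` and `payoff` and all other names are hypothetical.

```python
import itertools
import math


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ae_type_expectation(s0, n_t, grid, lv_coeffs, payoff):
    """Sum over all paths (i_1, ..., i_{n_t}) of
    (product of p_{SN,i_j}) * (sum of payoffs along the path).
    lv_coeffs(j, s) -> (a, b) for step j at price s; payoff(j, s) -> payoff at t_j."""
    probs = [std_normal_cdf(grid[i + 1]) - std_normal_cdf(grid[i])
             for i in range(len(grid) - 1)]
    total = 0.0
    for idx in itertools.product(range(len(probs)), repeat=n_t):
        weight, s, pay = 1.0, s0, 0.0
        for j, i in enumerate(idx, start=1):
            a, b = lv_coeffs(j, s)
            s = s + (a * s + b) * grid[i]   # jth step of the asset price evolution
            pay += payoff(j, s)             # accumulate the payoff at t_j
            weight *= probs[i]              # squared amplitude of this branch
        total += weight * pay
    return total
```

The number of enumerated paths grows as \(N_{\mathrm{SN}}^{n_{\mathrm{t}}}\), so this is only feasible for toy sizes; the quantum circuit holds all branches in superposition instead.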
The quantum circuit of the AE-type state preparation is shown in Fig. 8. First, \(|\mathrm{SN}\rangle \) is created on each \(R_{W_{j}}\) by the SN gate. After that, the gate \(U_{j}\) performs the jth step of the asset price evolution and the payoff calculation. For each evolution step, we additionally use ancillary registers \(R_{{\mathrm{flg}},j}\) and \(R_{{\mathrm{LV}},j}\), which have 1 and \(2n_{\mathrm{dig}}\) qubits, respectively. The implementation of \(U_{j}\) is shown in Fig. 9. In this gate, the sequence of comparators and “Load” gates sets \(a_{j,k}\), \(b_{j,k}\) in Eq. (7) into \(R_{{\mathrm{LV}},j}\) by a trick similar to that in the circuit presented in Fig. 6. Then, the operation \(x\leftarrow x+(ax+b)y\) updates the asset price according to Eq. (8). The operation \(x\leftarrow x+(ax+b)y\) can be done as follows:
where the first ket is the state of \(R_{S_{j-1}}\), the second is the state of \(R_{W_{j}}\), the third and fourth are the state of \(R_{{\mathrm{LV}},j}\), the fifth is the state of an ancillary register, and the last is the state of \(R_{S_{j}}\). So, this operation consists of copying a state and three multiplications. At the end of \(U_{j}\), the payoff is calculated by the “\({\mathrm{payoff}}_{j}\)” gate, which performs the following operation
where the first, second and third kets correspond to \(R_{S_{j}}\), \(R_{{\mathrm{payoff}},j-1}\) and \(R_{{\mathrm{payoff}},j}\). This operation is done by copying \(R_{{\mathrm{payoff}},j-1}\) to \(R_{{\mathrm{payoff}},j}\) and adding \(f^{\mathrm{pay}}_{j}(S^{(i_{1}\dots i_{j})}_{t_{j}})\) into \(R_{{\mathrm{payoff}},j}\).
Implementation of the SN gate
Let us consider the implementation of the SN gate, which creates the superposition state \(|\mathrm{SN}\rangle \). Although our implementation is mainly based on Ref. [14], we use an approximation based on the Taylor expansion.
We construct \(|\mathrm{SN}\rangle \) in an inductive way. The intermediate state at the mth step is given by
where \(p^{(m)}_{{\mathrm{SN}},i}=\int _{x^{(m)}_{{\mathrm{SN}},i}}^{x^{(m)}_{{ \mathrm{SN}},i+1}}\phi _{\mathrm{SN}}(x)\,dx\), and \(\phi _{\mathrm{SN}}(x)\) is the probability density function of the standard normal distribution. Here, \(\{x^{(m)}_{{\mathrm{SN}},i} \}_{i=0}^{2^{m}}\) is the set of \(2^{m}+1\) equally spaced points dividing the range \([x_{{\mathrm{SN}},0},x_{{\mathrm{SN}},N_{\mathrm{SN}}}]\). We assume the existence of a gate that efficiently computes \(\theta ^{(m)}_{i}:=\arccos \sqrt{f^{(m)}_{i}}\) with the input i, where \(f^{(m)}_{i}\) is
Then, the following state transformation is possible:
where we use the gate computing \(\theta ^{(m)}_{i}\) at the first step and perform the controlled rotation at the second step. Repeating this operation until \(m=n_{\mathrm{dig}}-1\), we finally obtain the desired state \(|\mathrm{SN}\rangle \).
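The inductive construction can be mimicked classically by repeatedly splitting every interval in two. The sketch below assumes that \(f^{(m)}_{i}\) is the fraction of the probability of the ith interval that lies in its left half, as in the construction of Ref. [14]; the function names are hypothetical, and the tail mass outside the range is ignored by starting from amplitude 1.

```python
import math


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_sn_amplitudes(x_min: float, x_max: float, n_dig: int):
    """Start from one interval and split n_dig times.
    At level m, interval i with amplitude A becomes
    A*cos(theta) (left half) and A*sin(theta) (right half),
    where theta = arccos(sqrt(f)) and f = P(left half) / P(interval)."""
    amps = [1.0]
    for m in range(n_dig):
        width = (x_max - x_min) / 2 ** m
        new_amps = []
        for i, amp in enumerate(amps):
            left = x_min + i * width
            mid = left + width / 2.0
            right = left + width
            f = ((std_normal_cdf(mid) - std_normal_cdf(left))
                 / (std_normal_cdf(right) - std_normal_cdf(left)))
            theta = math.acos(math.sqrt(f))
            new_amps.append(amp * math.cos(theta))   # controlled rotation, |0> branch
            new_amps.append(amp * math.sin(theta))   # controlled rotation, |1> branch
        amps = new_amps
    return amps
```

At level \(m=0\) the fraction is 1/2 for a range symmetric about zero, so the rotation has the same effect on the amplitudes as a Hadamard gate on the first qubit. The product of the ratios along a branch telescopes to the probability of the final interval.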
The remaining part is constructing the gate to compute \(f^{(m)}_{i}\). Here, we propose a way based on simple Taylor expansion. Let us consider function
By simple calculation, it is approximated as
This result means that, for small δ, \(g(x,\delta )\) is well approximated by a linear function of x. We use the above approximation to compute \(f^{(m)}_{i}\), which is represented as
If \(\Delta /2^{m}\) is sufficiently small, \(f^{(m)}_{i}\) can be approximately written as a linear function of i, which is derived from the approximation of g and \(x^{(m)}_{{\mathrm{SN}},i}=x_{{\mathrm{SN}},0}+\frac{\Delta }{2^{m}}i\). We then reach the circuit in Fig. 10 for the calculation of \(f^{(m)}_{i}\). For \(m\le 6\), the above approximation yields a large error, and thus we use another method. Here, we apply the most straightforward way, loading precomputed values. The quantum circuit of this method is shown in Fig. 10(a), and it uses a technique similar to the circuit in Fig. 6. In this method, each comparator checks whether the input value i equals its inherent value, and the check result is used for the activation of the Load gate. If the input value is I, the “Load \(f^{(m)}_{i}\)” gates are activated for all \(i\ge I\). Therefore, each “Load” gate is set to compensate for the effect of the following Load gates. For \(m\ge 7\), \(f^{(m)}_{i}\) is well approximated by a linear transformation. This transformation can be implemented as bitwise flips followed by a constant multiplier. We note that, depending on the required accuracy, we should adjust the threshold value of m at which the calculation method of \(f^{(m)}_{i}\) switches and also increase the degree of the Taylor expansion.
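The quality of the linear approximation at each level can be checked numerically. The sketch below assumes \(g(x,\delta )=\int_{x}^{x+\delta }\phi _{\mathrm{SN}}/\int_{x}^{x+2\delta }\phi _{\mathrm{SN}}\), for which a first-order Taylor expansion gives \(g\approx 1/2+\delta x/4\); this explicit form is derived here under that assumption and is not quoted from the text. The range \([-4,4]\) and the function names are likewise illustrative.

```python
import math


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exact_f(x: float, delta: float) -> float:
    """Fraction of the mass of [x, x + 2*delta] lying in the left half."""
    return ((std_normal_cdf(x + delta) - std_normal_cdf(x))
            / (std_normal_cdf(x + 2.0 * delta) - std_normal_cdf(x)))


def linear_f(x: float, delta: float) -> float:
    """First-order approximation in delta (assumed form, see lead-in)."""
    return 0.5 + 0.25 * delta * x


def max_linear_error(m: int, x_min: float = -4.0, x_max: float = 4.0) -> float:
    """Largest |exact - linear| over the 2^m intervals at level m."""
    width = (x_max - x_min) / 2 ** m
    delta = width / 2.0
    return max(abs(exact_f(x_min + i * width, delta)
                   - linear_f(x_min + i * width, delta))
               for i in range(2 ** m))
```

Evaluating `max_linear_error` for increasing m shows the error shrinking as the intervals become narrower, which is the reason for switching from loaded values to the linear formula beyond some threshold level.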
The SN gate is then constructed as shown in Fig. 11. First, we apply a Hadamard gate to the most significant bit in \(R_{W_{j}}\) to assign probability 1/2 to the positive and negative halves of \([x_{{\mathrm{SN}},0},x_{{\mathrm{SN}},N_{\mathrm{SN}}}]\). We next apply a sequence of gates \(U^{\mathrm{SN}}_{1},\dots , U^{\mathrm{SN}}_{n_{\mathrm{dig}}-1}\). \(U^{\mathrm{SN}}_{m}\) corresponds to the mth step of the above recursive calculation and is constructed as a combination of the \(f^{(m)}_{i}\) gate, gates for the square root and arc cosine, and the controlled rotation gate \(R(\theta )\).
Finally, we comment on the implementation of arccos and the square root. Reference [51] discusses the implementation of the inverse trigonometric function by the piecewise polynomial approximation. Although they consider not arccos but arcsin, we can easily apply their result by \(\arccos (x)=\frac{\pi }{2}-\arcsin (x)\). We adopt a setting with polynomial degree 3 and 2 intervals, which leads to an accuracy of \(10^{-5}\) [51]. The circuit to calculate the square root is given in Ref. [52].
Estimation of required resources
We roughly estimate the machine resources for the fault-tolerant implementation of the PRN-type method and the AE-type method. We consider two metrics, the number of logical qubits and the T-count. Our resource estimation focuses only on the leading contribution from the state preparation step; to evaluate the total resources of the derivative pricing, the implementation of the QAE must also be taken into account. We also neglect the resources for calculating payoffs because this can be implemented by a combination of a few arithmetic circuits, as discussed in Sect. 4.2.2.
Elementary gates
We first summarize the resources of the elementary gates necessary to construct the LV circuit. We consider fixed-point arithmetic here. The resources of the elementary gates in the case of n-bit operands are summarized in Table 1. Because we aim to estimate the orders of the metrics, we take only the leading term with respect to n. For example, we approximate \(an+b\) as an.
We comment on multiplication and division. For these operations, we use modified versions of the circuits proposed in Refs. [42, 46] for the following reason. The original circuits use 2n bits, but, in our setting, this causes the problem that the number of qubits doubles at every multiplication. Therefore, we have to truncate the lower bits of the product and keep the digit number. This is why the number of qubits for a divider in Table 1 is different from that in Refs. [42, 46]. We explain the details of the modified multiplier and divider in the Appendix.
The number of qubits in registers
We assume that the qubit numbers of the registers are as follows. Some of them have already been mentioned.

Registers which store numerical values, \(R_{W}\), \(R_{S}\), \(R_{\mathrm{payoff}}\), \(R_{{\mathrm{LV}},j}\) etc., and ancillary registers concerning them have \(n_{\mathrm{dig}}\) qubits. \(n_{\mathrm{dig}}\) depends on the computational representation of real numbers, which is determined according to the required accuracy and range. We set \(n_{\mathrm{dig}}=16\).

\(R_{\mathrm{PRN}}\) has \(n_{\mathrm{PRN}}\) qubits, where \(n_{\mathrm{PRN}}\) is large enough that the PRN sequence has good statistical properties, e.g. a long period. Ancillary registers for calculating a PRN sequence have \(n_{\mathrm{PRN}}\) qubits too. We set the bit size of the PRN generator to \(n_{\mathrm{PRN}}=64\) as in Ref. [49].

\(R_{\mathrm{samp}}\) has \(n_{\mathrm{samp}}\) qubits.

Other registers, e.g. \(R_{\mathrm{count}}\), have a small number of qubits, and thus we neglect their contributions to the total number.
The PRN-type method
Then, let us consider the required resources in the PRN-type method.
Qubit number
In Table 2, we summarize the qubits necessary in each step of the circuit. The registers which hold some values throughout the circuit are as follows: \(R_{\mathrm{samp}}\), \(R_{S}\), \(R_{\mathrm{payoff}}\) and \(R_{\mathrm{PRN}}\). Apart from these, the following parts of the circuit consume the largest number of qubits.

\(J_{\mathrm{PRN}}\) and \(P_{\mathrm{PRN}}\): \(2n_{\mathrm{PRN}}\) qubits

\(\Phi _{\mathrm{SN}}^{-1}\): \(7n_{\mathrm{dig}}\) qubits
Therefore, the total number of qubits required in the PRN-type method is roughly
Let us comment on some technical points for obtaining Table 2. We first make a supplementary explanation on the ancillary qubit number in \(V^{(j)}_{k}\). There are two parts requiring ancillae in \(V^{(j)}_{k}\). First, \(x\leftarrow x+(ax+b)y\) needs the following ancillae: an \(n_{\mathrm{dig}}\)-bit register to which \(1+ay\) is output, an \(n_{\mathrm{dig}}\)-bit register to which the result is temporarily output in the self-update multiplication and a \(2n_{\mathrm{dig}}\)-bit register necessary for the inverse division to clear the input x. Second, \(z\leftarrow \frac{z+x-by}{1+ay}\) needs the following: an \(n_{\mathrm{dig}}\)-bit register to which \(1+ay\) is output and a \(2n_{\mathrm{dig}}\)-bit register necessary for division. In total, \(4n_{\mathrm{dig}}\) bits are sufficient.^{Footnote 5}
We also comment on the ancilla number in \(\Phi _{\mathrm{SN}}^{-1}\). As we can see from Fig. 6, we need four registers to which coefficients are loaded and two registers for intermediate outputs. Therefore, \(6n_{\mathrm{dig}}\) ancillae are necessary.^{Footnote 6}
T-count
Because we are interested in only the leading contribution, we focus on multiplications, divisions, and repeated additions. We do not consider the T-count of \(J_{W}\) because it is used only once. For the parts in \(U_{j}\), which is used repeatedly, we specify T-counts as follows:

1
\(V^{(j)}_{k}\)
One \(V^{(j)}_{k}\) includes the following parts:

\(x\leftarrow x+(ax+b)y\)
As we can see in (22), this includes one multiplication and one division, which come from one self-update multiplication, and \(3n_{\mathrm{dig}}\) controlled additions, which come from two controlled multiplications by a constant and one inverse. In total, the T-count is \(119n_{\mathrm{dig}}^{2}\).

\(z\leftarrow \frac{z+x-by}{1+ay}\)
As we can see in (23), this includes one division and \(2n_{\mathrm{dig}}\) additions, which come from two multiplications by a constant. In total, the T-count is \(63n_{\mathrm{dig}}^{2}\).

Uncomputation of \(z\leftarrow \frac{z+x-by}{1+ay}\)
Similar to the above.
Therefore, the total T-count in one \(V^{(j)}_{k}\) is \(245n_{\mathrm{dig}}^{2}\). Since \(V^{(j)}_{k}\) is used \(n_{S}+1\) times, the total T-count in them is \(245n_{\mathrm{dig}}^{2}n_{S}\) (only the leading term).


2
\(P_{\mathrm{PRN}}\)
This includes two modular multiplications by a constant, which come from one self-update modular multiplication. These are decomposed into \(2n_{\mathrm{PRN}}\) modular additions, so the T-count is roughly \(140n_{\mathrm{PRN}}^{2}\).

3
\(\Phi _{\mathrm{SN}}^{-1}\) and its inverse
Each of them includes \(2(n_{\mathrm{ICDF}} + 1)\) additions (\(n_{ \mathrm{ICDF}} + 1\) comparisons) and five multiplications. So the T-count for each is roughly \(105n_{\mathrm{dig}}^{2} + 28n_{\mathrm{dig}}n_{\mathrm{ICDF}}\).
Summing up these and considering that \(U_{j}\) is used \(n_{\mathrm{t}}\) times, the T-count in the whole circuit is roughly
The AE-type method
Next, we consider the required resources in the AE-type method.
Qubit number
In the AE-type method, the registers shown in Table 3 are added per time step. Note that we do not uncompute ancillae. Summing up all registers, the qubit number necessary for one time step is roughly \(3n_{\mathrm{dig}}^{2}+111n_{\mathrm{dig}}\). Therefore, for the entire circuit, it is
Note that the dominant part comes from the iterative calculation in the SN gates, which prepare superpositions of the values of the SNRNs.
T-count
Again, we focus on operations with a large T-count. For each part in the circuit, we estimate the T-count as follows:

1
SN gate
The mth iteration \(U^{\mathrm{SN}}_{m}\) in the SN gate includes the following parts:

square root, arccos, controlled rotation
T-counts are \(14n_{\mathrm{dig}}\), \(3.4\times 10^{4}\) and \(3n_{\mathrm{dig}}\), respectively.

\(f^{(m)}_{i}\)
For \(2\le m\le 6\), we use \(2^{m}\) m-controlled Toffoli gates to check the value on \(R_{W_{i}}\) and load the \(f^{(m)}_{i}\) which corresponds to the value. The T-count for this is \(2^{m}(8m-9)\).^{Footnote 7} Summing this for \(m=2,\dots ,6\) leads to about 4000. Since this is much smaller than the T-count for arccos in one iteration, we neglect it. For \(m\ge 7\), we multiply an m-bit variable by an \(n_{\mathrm{dig}}\)-bit constant, which is decomposed into \(n_{\mathrm{dig}}\) additions of m bits. Then, the T-count is \(14mn_{\mathrm{dig}}\).
Summing up these counts and taking only the dominant contributions, one SN gate has a T-count of roughly \((7n_{\mathrm{dig}}^{2} + 3.4\times 10^{4})n_{\mathrm{dig}}\).


2
\(U_{j}\)
This includes \(2n_{S}\) additions (\(n_{S}\) comparisons) and three multiplications. So one \(U_{j}\) gate has a T-count of roughly \(63n_{\mathrm{dig}}^{2}+28n_{S}n_{\mathrm{dig}}\).
In total, we can estimate the T-count of the entire circuit in the AE-type method as
Comparison between two methods
Table 4 compares the resources necessary in the two methods. The number of qubits is independent of \(n_{\mathrm{t}}\) in the PRN-type method but proportional to \(n_{\mathrm{t}}\) in the AE-type method. On the other hand, the T-count is proportional to \(n_{\mathrm{t}}\) in both methods.
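The leading-order T-counts can be tallied by summing the per-component counts quoted in the preceding subsections. The sketch below is only that sum under stated assumptions: the function names are hypothetical, and the values of \(n_{S}\) and \(n_{\mathrm{t}}\) in the usage note are placeholders rather than the setting of Table 5.

```python
def t_count_prn(n_dig: int, n_prn: int, n_icdf: int, n_s: int, n_t: int) -> int:
    """PRN-type method: n_t repetitions of
    (all V_k gates) + P_PRN + (Phi_SN^{-1} and its inverse)."""
    v_gates = 245 * n_dig ** 2 * n_s                       # leading term over k
    p_prn = 140 * n_prn ** 2
    icdf_pair = 2 * (105 * n_dig ** 2 + 28 * n_dig * n_icdf)
    return n_t * (v_gates + p_prn + icdf_pair)


def t_count_ae(n_dig: int, n_s: int, n_t: int) -> int:
    """AE-type method: n_t repetitions of (one SN gate) + (one U_j)."""
    sn_gate = (7 * n_dig ** 2 + 34_000) * n_dig            # 3.4e4 per arccos
    u_j = 63 * n_dig ** 2 + 28 * n_s * n_dig
    return n_t * (sn_gate + u_j)
```

For example, with \(n_{\mathrm{dig}}=16\), \(n_{\mathrm{PRN}}=64\), \(n_{\mathrm{ICDF}}=109\) and the placeholder value \(n_{S}=5\), the two per-step counts are of the same order of magnitude, with the PRN-type count the larger of the two.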
Let us consider the following setting, which is necessary for practical use in derivative pricing:
Table 5 presents the resources in this setting. The total T-count is of the same order of magnitude in both methods but larger for the PRN-type method by a factor of about 2.
We comment here on the parts that consume the most T-count in each method. In the PRN-type method, there are two parts that dominantly contribute to the T-count. The first part is the update of the asset price in \(V^{(j)}_{k}\). Additional operations for reducing the number of qubits, such as the inverse division in the self-update multiplication and drawing back the asset price to clear \(R_{g}\), increase the T-count compared with the AE-type method. The second part is the modular multiplications in the update of the PRN sequence. The T-count of the operations for the PRN becomes large because the PRN generator requires a large bit number, say \(n_{\mathrm{PRN}}=64\), to keep good statistical properties. On the other hand, in the AE-type method, the dominant contribution to the T-count comes from the calculation of arccos in preparing SNRNs. Because an arccos not only has a large T-count but is also used in each iteration in the SN gate, its T-count piles up.
Summary
In this paper, we presented the implementation of the time evolution of the asset price in the LV model on quantum computers. Similar to other problems in finance, derivative pricing by Monte Carlo simulation requires a large number of random numbers, which is proportional to the number of time steps for asset price evolution. We considered two methods of implementation: the PRN-type method and the AE-type method. In the former, we sequentially generate PRNs on a register and use them to evolve the asset price. In the latter, SNRNs are created as superpositions on separate registers. For both methods, we presented the concrete quantum circuits in detail (see Figs. 1 and 8). We then gave estimations of the qubit number and T-count required in each method. In the PRN-type method, the qubit number is kept constant against the number of time steps. On the other hand, in the AE-type method, the qubit number is proportional to the number of time steps. The total T-counts for both methods are of the same order of magnitude, but the PRN-type method has the larger T-count by a factor of about 2.
Note that the analyses of the resources required for implementing the LV model in this paper depend on the designs of the elementary circuits for arithmetic. For example, in the AE-type method, the dominant contribution to the T-count comes from the arccos calculations in preparing SNRNs. If more efficient circuits are proposed, the required resources will change from our estimation.
Finally, we would like to note that this study is not enough for the application of a quantum algorithm for Monte Carlo simulation to pricing in the LV model. Although we assumed that the LV function is given, in practice, we have to calibrate the LV so that the model prices of European options fit the market prices. Besides, we have not considered how to evaluate terms in exotic derivatives, for example, early exercise. In future works, we will consider such things and aim to present how to apply quantum computers in the whole process of exotic derivative pricing.
Condition on the parameters in the PRN-type method
We show that, for the PRN-type method to work well, it is necessary that \(\sigma (t,S)\) is continuous in S and \(\Delta t_{j}\) is sufficiently small. These conditions lead to a one-to-one correspondence between \(S^{(i)}_{t_{j}}\) and \(S^{(i)}_{t_{j+1}}\). We define a function f by
then \(S^{(i)}_{t_{j+1}}=f(S^{(i)}_{t_{j}})\) holds. Except for the grid points \(\{s_{j,0},\dots , s_{j,n_{S}} \}\), \(f(S)\) is differentiable, and its derivative is given by
for \(s_{j,k-1}< S< s_{j,k}\) and \(k=0,\dots ,n_{S}+1\). If we take sufficiently small \(\Delta t_{j}\), \(f^{\prime }(S)\) is positive except at the grid points. Besides, if \(\sigma (t_{j},S)\) is continuous in S, \(f(S)\) is continuous too. Combining these facts, we find that \(f(S)\) is strictly increasing, that is, a one-to-one mapping, if the above two conditions hold.
Truncated multiplier and divider
We describe here the modified versions of the multiplier and divider. We consider fixed-point arithmetic with \(n_{\mathrm{int}}\) bits in the integer part and \(n_{\mathrm{frac}}\) bits in the fractional part, \(n=n_{\mathrm{int}}+n_{\mathrm{frac}}\) bits in total. We hereafter call such numbers \((n_{\mathrm{int}},n_{\mathrm{frac}})\)-bit numbers.
Let us consider truncated multiplication. In order to keep this digit setting during multiplication, we adopt the following policy.

We simply truncate the digits lower than the \(n_{\mathrm{frac}}\)th fractional digit in the product. This might cause numerical errors around the \(n_{\mathrm{frac}}\)th fractional digit, and such tiny errors might accumulate, but we simply neglect this concern.

We assume the overflow from the \(n_{\mathrm{int}}\)bit integer part never occurs.
We write a number x in binary representation as \(x_{n_{\mathrm{int}}-1}x_{n_{\mathrm{int}}-2}\dots x_{0}.x_{-1}\dots x_{-n_{ \mathrm{frac}}}\), where \(x_{i}\) is the ith integer digit of x and \(x_{-j}\) is the jth fractional digit of x. We then approximate the product of x and y as follows:
where
Under our assumption that the overflow from the n-bit integer part never occurs, we have
This calculation can be constructed by using n-bit controlled adders n times as in Ref. [42]. Thus, the qubit number and T-count of the circuit for truncated multiplication are the same as those in Ref. [42].
We define the truncated division of z by y as the inverse of the truncated multiplication: \(z/y\approx (f^{\mathrm{mul}}_{n_{\mathrm{frac}},n_{ \mathrm{int}},y} )^{-1}(z)\). Given two \((n_{\mathrm{int}},n_{\mathrm{frac}})\)-bit numbers y and z, we can find an \((n_{\mathrm{int}},n_{\mathrm{frac}})\)-bit number x satisfying \(z=f^{\mathrm{mul}}_{n_{\mathrm{frac}},n_{\mathrm{int}},y}(x)\) by the following iterative procedure:

1
Set \(i=n_{\mathrm{int}}-1\) and \(x=0\).

2
Update z with \(z-2^{i}\tilde{y}_{i}\).

3
Set \(x_{i}=0\) if \(z<0\), else set \(x_{i}=1\).

4
If \(x_{i}=0 \), update z with \(z+2^{i}\tilde{y}_{i}\) (z returns to the value before step 2).

5
Decrement i by 1.

6
Repeat steps 2-5 until \(i = -n_{\mathrm{frac}}-1\).

7
Output x.
Note that \(2^{i}\tilde{y}_{i} > \sum_{j=-n_{\mathrm{frac}}}^{i-1}2^{j} \tilde{y}_{j}\). This ensures that sequential subtractions of \(2^{i}\tilde{y}_{i}\) and checking whether the difference is positive or negative determine each digit of x. In the above procedure, we need to convert z to a \((2n_{\mathrm{int}}-1, n_{\mathrm{frac}})\)-bit number to calculate \(z\pm 2^{n_{\mathrm{int}}-1}\tilde{y}_{n_{\mathrm{int}}-1}\). So, we introduce \(n_{\mathrm{int}}\) dummy qubits corresponding to the \(2n_{\mathrm{int}}-1\)th to \(n_{\mathrm{int}}\)th integer digits of z. Then, steps 2-4 are implemented as the circuit in Fig. 12. We also note that the dividend register is reset from \(|z\rangle \) to \(|0\rangle \) through the above procedure. If we want to preserve \(|z\rangle \), we need to copy the dividend state to another ancillary register by CNOT gates, which increases the total number of qubits by n.
Despite the trick to truncate the digits, the structure of the circuit for truncated division is similar to the restoring division circuit in Ref. [46]. Thus, the T-count of our truncated division circuit is the same as that of the circuit in Ref. [46]. On the other hand, since we have introduced dummy qubits and a register to keep the dividend, the qubit number of the divider is 5n.^{Footnote 8}
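The truncated multiplication and the restoring-type division procedure can be sketched on integer bit patterns. The sketch assumes that \(2^{i}\tilde{y}_{i}\) denotes \(2^{i}y\) with the digits below the \(n_{\mathrm{frac}}\)th fractional digit dropped, that all values are nonnegative and that no overflow occurs; a number v is stored as the integer \(v\cdot 2^{n_{\mathrm{frac}}}\), and the function names are hypothetical.

```python
def shifted_term(y_fix: int, i: int) -> int:
    """Bit pattern of 2^i * y truncated below the n_frac-th fractional digit."""
    return y_fix << i if i >= 0 else y_fix >> (-i)


def trunc_mul(x_fix: int, y_fix: int, n_int: int, n_frac: int) -> int:
    """Truncated product: add the shifted, truncated y for every set digit x_i."""
    acc = 0
    for i in range(-n_frac, n_int):
        if (x_fix >> (i + n_frac)) & 1:      # digit x_i of x
            acc += shifted_term(y_fix, i)
    return acc


def trunc_div(z_fix: int, y_fix: int, n_int: int, n_frac: int):
    """Inverse of trunc_mul in x, from the most significant digit down.
    Returns (x, remaining z); the remainder is 0 when z is a truncated product."""
    x_fix = 0
    for i in range(n_int - 1, -n_frac - 1, -1):
        z_fix -= shifted_term(y_fix, i)      # step 2: trial subtraction
        if z_fix < 0:
            z_fix += shifted_term(y_fix, i)  # step 4: restore, digit x_i = 0
        else:
            x_fix |= 1 << (i + n_frac)       # step 3: digit x_i = 1
    return x_fix, z_fix
```

As a usage example with \(n_{\mathrm{int}}=n_{\mathrm{frac}}=4\), the numbers 2.25 and 1.5 are stored as 36 and 24, and `trunc_mul(36, 24, 4, 4)` gives 54, that is, 3.375; `trunc_div(54, 24, 4, 4)` recovers 36 and leaves the dividend at 0, in line with the remark that the dividend register is reset.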
Notes
In this paper, we assume arbitrage-free and complete markets, so the number of stochastic factors equals that of assets.
Hereafter, we use the word ‘qubit’ to mean a logical qubit.
Note that the drift term does not exist because we set the risk-free rate to be 0.
The state preparation method for the standard normal distribution introduced in Sect. 4.3.2 does not suffer from this problem since it does not use Monte Carlo integration to calculate the cumulative distribution over each interval in discrete approximation, whereas Ref. [25] assumed the use of it. Although Ref. [10] said that our method uses Monte Carlo integration in some parts, it actually does not use any. For the detail, see Sect. 4.3.2.
Strictly speaking, comparisons between \(R_{S}\) or \(R_{S^{\prime }}\) and \(s_{j,k}\)’s require loading \(s_{j,k}\)’s into some register. This does not require another register, since at least one of ancillary registers used in \(x\leftarrow x+(ax+b)y\) and \(z\leftarrow \frac{z+x-by}{1+ay}\) is empty at loading.
Although we also need a register to which \(x^{\mathrm{ICDF}}_{m}\)’s are loaded at comparisons between them and \(R_{\mathrm{PRN}}\), we can use \(R_{W}\) or intermediate output registers, which are empty at comparisons.
Strictly, the number of added qubits is not 2n but \(n_{\mathrm{int}}+n\); we count 2n added qubits for simplicity and to be conservative.
Abbreviations
LV: Local Volatility
RN: Random Number
PRN: Pseudo-Random Number
BS: Black-Scholes
CDF: Cumulative Distribution Function
SNRN: Standard Normal Random Number
PSNRN: Pseudo Standard Normal Random Number
PCG: Permuted Congruential Generator
LCG: Linear Congruential Generator
References
[1] Orus R et al. Quantum computing for finance: overview and prospects. Rev Phys. 2019;4:100028.
[2] Hull JC. Options, futures, and other derivatives. New York: Prentice Hall; 2012.
[3] Shreve S. Stochastic calculus for finance I: the binomial asset pricing model. Berlin: Springer; 2004.
[4] Shreve S. Stochastic calculus for finance II: continuous-time models. Berlin: Springer; 2004.
[5] Montanaro A. Quantum speedup of Monte Carlo methods. Proc R Soc Ser A. 2015;471:2181.
[6] Miyamoto K, Shiohara K. Reduction of qubits in quantum algorithm for Monte Carlo simulation by pseudorandom number generator. Phys Rev A. 2020;102:022424.
[7] Rebentrost P et al. Quantum computational finance: Monte Carlo pricing of financial derivatives. Phys Rev A. 2018;98:022321.
[8] Stamatopoulos N et al. Option pricing using quantum computers. Quantum. 2020;4:291.
[9] Ramos-Calderer S et al. Quantum unary approach to option pricing. Phys Rev A. 2021;103:032414.
[10] Chakrabarti S et al. A threshold for quantum advantage in derivative pricing. Quantum. 2021;5:463.
[11] Black F, Scholes M. The pricing of options and corporate liabilities. J Polit Econ. 1973;81:637.
[12] Merton RC. Theory of rational option pricing. Bell J Econ Manag Sci. 1973;4:141.
[13] Dupire B. Pricing with a smile. Risk. 1994;7:18–20.
[14] Grover L et al. Creating superpositions that correspond to efficiently integrable probability distributions. arXiv:quant-ph/0208112.
[15] Campbell ET et al. Roads towards fault-tolerant universal quantum computation. Nature. 2017;549:172.
[16] Egger DJ et al. Credit Risk Analysis using Quantum Computers. arXiv:1907.03044.
[17] Amy M et al. A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits. IEEE Trans Comput-Aided Des Integr Circuits Syst. 2013;32(6):818–30.
[18] Selinger P. Phys Rev A. 2013;87:042302.
[19] Maruyama G. On the transition probability functions of the Markov process. Rend Circ Mat Palermo. 1955;4:48.
[20] Brassard G et al. Quantum amplitude amplification and estimation. Contemp Math. 2002;305:53.
[21] Suzuki Y et al. Amplitude estimation without phase estimation. Quantum Inf Process. 2020;19:75.
[22] Nakaji K. Faster Amplitude Estimation. arXiv:2003.02417.
[23] Giurgica-Tiron T et al. Low depth algorithms for quantum amplitude estimation. arXiv:2012.03348.
[24] Plekhanov K et al. Variational quantum amplitude estimation. arXiv:2109.03687.
[25] Herbert S. The problem with Grover-Rudolph state preparation for quantum Monte-Carlo. Phys Rev E. 2021;103:063302.
[26] Vedral V et al. Quantum networks for elementary arithmetic operations. Phys Rev A. 1996;54:147.
[27] Beckman D et al. Efficient networks for quantum factoring. Phys Rev A. 1996;54:1034.
[28] Draper TG. Addition on a quantum computer. arXiv:quant-ph/0008033.
[29] Cuccaro SA et al. A new quantum ripple-carry addition circuit. In: The eighth workshop on quantum information processing. 2004.
[30] Takahashi Y et al. A linear-size quantum circuit for addition with no ancillary qubits. Quantum Inf Comput. 2005;5(6):440–8.
[31] Van Meter R et al. Fast quantum modular exponentiation. Phys Rev A. 2005;71(5):052320.
[32] Draper TG et al. A logarithmic-depth quantum carry-lookahead adder. Quantum Inf Comput. 2006;6(4):351.
[33] Takahashi Y et al. Quantum addition circuits and unbounded fan-out. Quantum Inf Comput. 2010;10(9–10):0872.
[34] Portugal R et al. Reversible Karatsuba's algorithm. J Univers Comput Sci. 2006;12(5):499.
[35] Alvarez-Sanchez JJ et al. A quantum architecture for multiplying signed integers. J Phys Conf Ser. 2008;128(1):012013.
[36] Takahashi Y et al. A fast quantum circuit for addition with few qubits. Quantum Inf Comput. 2008;8(6):636.
[37] Thapliyal H. Mapping of subtractor and adder-subtractor circuits on reversible quantum gates. Transactions on Computational Science XXVII. 2016;10.
[38] Thapliyal H, Ranganathan N. Design of efficient reversible logic based binary and BCD adder circuits. ACM J Emerg Technol Comput Syst. 2013;9:17.
[39] Lin CC et al. Qlib: quantum module library. ACM J Emerg Technol Comput Syst. 2014;11(1):7:1–7:20.
[40] Babu HMH. Cost-efficient design of a quantum multiplier-accumulator unit. Quantum Inf Process. 2016;16(1):30.
[41] Jayashree HV et al. Ancilla-input and garbage-output optimized design of a reversible quantum integer multiplier. J Supercomput. 2016;72(4):1477.
[42] Muñoz-Coreas E, Thapliyal H. Quantum circuit design of a T-count optimized integer multiplier. IEEE Trans Comput. 2019;68:5.
[43] Khosropour A et al. Quantum division circuit based on restoring division algorithm. In: Information technology: new generations (ITNG), 2011 eighth international conference on. Las Vegas: IEEE; 2011. p. 1037–40.
[44] Jamal L, Babu HMH. Efficient approaches to design a reversible floating point divider. In: 2013 IEEE international symposium on circuits and systems (ISCAS2013). 2013. p. 3004–7.
[45] Dibbo SV et al. An efficient design technique of a quantum divider circuit. In: 2016 IEEE international symposium on circuits and systems (ISCAS). 2016. p. 2102–5.
[46] Thapliyal H et al. Quantum circuit designs of integer division optimizing T-count and T-depth. In: IEEE transactions on emerging topics in computing. 2019.
[47] Amy M, Maslov D, Mosca M. IEEE Trans CAD. 2014;33(10):1476.
[48] Maslov D. On the advantages of using relative phase Toffolis with an application to multiple control Toffoli optimization. Phys Rev A. 2016;93:022311.
[49] O’Neill ME. PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation. Harvey Mudd College Computer Science Department Technical Report. 2014. http://www.pcg-random.org/.
[50] Hörmann W, Leydold J. Continuous random variate generation by fast numerical inversion. ACM Trans Model Comput Simul. 2003;13(4):347.
[51] Häner T et al. Optimizing Quantum Circuits for Arithmetic. arXiv:1805.12445.
[52] Muñoz-Coreas E, Thapliyal H. T-count and qubit optimized quantum circuit design of the non-restoring square root algorithm. ACM J Emerg Technol Comput Syst. 2018;14:3.
[53] Kliuchnikov V et al. Practical approximation of single-qubit unitaries by single-qubit quantum Clifford and T circuits. IEEE Trans Comput. 2016;65(1):161.
Funding
The research was funded by Mizuho-DL Financial Technology Co., Ltd.
Author information
Contributions
The original idea for this paper came from K.M. All authors contributed to the preparation of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kaneko, K., Miyamoto, K., Takeda, N. et al. Quantum pricing with a smile: implementation of local volatility model on quantum computer. EPJ Quantum Technol. 9, 7 (2022). https://doi.org/10.1140/epjqt/s40507-022-00125-2
DOI: https://doi.org/10.1140/epjqt/s40507-022-00125-2
Keywords
 Finance
 Pricing
 Quantum computing