Arbitrary unitaries in orbital angular momentum of single photons

A simple argument is presented that explicitly shows how to construct an arbitrary quantum gate acting on orbital angular momentum (OAM) of single photons. The scheme can be applied to implement subspace multiplexing, where a single high-dimensional OAM qudit represents effectively a stack of multiple independent lower-dimensional qudits. A special subclass of unitaries composed of single-photon controlled gates is studied in detail and notable examples of the general approach are discussed. The generalization of the simple argument leads to the parallelization scheme, which results in the savings of resources. The presented schemes utilize only conventional optical elements and apply not only to single photons but also to classical light.


I. INTRODUCTION
The orbital angular momentum (OAM) of a photon amounts to quantized twists of the photon's wave function [1,2] and has served in a multitude of experiments as a high-dimensional quantum carrier of information: for a given $d$ one can always consider a $d$-dimensional subspace of OAM spanned by the eigenstates $|0\rangle, \ldots, |d-1\rangle$, where the information is encoded in a superposition of these eigenstates. The OAM of photons has been experimentally utilized in quantum teleportation [3], high-dimensional quantum key distribution [4], generation of high-dimensionally entangled quantum states [5,6], as well as in more fundamental experiments that study the correspondence principle for a very high number of OAM quanta [7].
To make the OAM of photons a full-fledged degree of freedom suitable for information transmission and processing, it is necessary to be able to manipulate the information contained in the OAM states as required by a given application. Mathematically speaking, one should be able to apply an arbitrary unitary operation to the quantum state of OAM. In this paper, we present a very simple argument that demonstrates that any unitary can be implemented using only conventional optical elements. We also introduce a scheme for constructing single-photon controlled gates, where either the control or the target qudit is played by the path degree of freedom of the photon. The direct generalization of these schemes leads to the parallelization scheme, in which a series of simultaneously applied identical local unitaries is replaced with a single instance of the unitary supplemented with pre- and post-processing stages.
The schemes studied in the text amount to networks of interferometers. Under real experimental conditions, the stability of interferometers is an important issue that hinders further development of this technology, and over the years reliable holographic techniques have been developed [8] and successfully demonstrated [9–11] that overcome the stability problem. With this in mind, the aim of the present paper is not just to present an alternative point of view based on interferometers, but also to emphasize some of the algebraic properties related to the orbital angular momentum. One such property of the resulting setups is periodicity in OAM, which can in principle be used for multiplexing multiple OAM quantum states lying in different subspaces into a single large superposition.
The OAM eigenstates are represented by wavefunctions whose form in cylindrical coordinates $(r, \varphi, z)$ is

$$\langle r, \varphi, z | k \rangle \propto \exp(i k \varphi), \tag{1}$$

where $k$ is an integer. This wavefunction undergoes simple transformations when the corresponding photon is subjected to the action of various conventional optical elements. Throughout the paper, we consider only the following toolkit of optical elements: mirrors, beam splitters, Dove prisms, phase shifters, and simple holograms in the form of spiral phase plates. Most notably, we use no complicated phase profiles obtained from numerical simulations, which would require costly spatial light modulators. The summary of actions of the individual elements on OAM states can be found in Ref. [12].
The manuscript is structured as follows. First, we demonstrate in Sec. II that conventional optical elements are sufficient to construct an arbitrary unitary gate in OAM of single photons. To exemplify this approach, we present an implementation of high-dimensional Pauli gates and their integer powers. Such gates are used for example in the construction of Heisenberg-Weyl observables, which in turn find applications in quantum state tomography [13,14]. After that we turn our attention to the single-photon controlled gates in Sec. III, where the OAM plays the role of the control qudit and the path degree of freedom of the same photon represents the target qudit. The opposite case with the two roles exchanged then follows from the latter by using a swap operator. We conclude the list of interferometric networks in Sec. IV, where the parallelization scheme is studied. It allows one to replace a series of identical local unitaries by a single setup that contains a significantly reduced number of optical elements. To give a specific example, we present explicit setups for parallelized Pauli gates. The general schemes can be simplified using the polarization of photons. Also, the schemes exhibit periodicity, which is discussed in Sec. V. We summarize our results in Sec. VI.
arXiv:2106.11046v2 [quant-ph] 19 Aug 2022

II. ARBITRARY UNITARIES IN OAM
A. General case
One of the core results of quantum computation theory is that there exist universal sets of operations, out of which any other unitary operation can be constructed. For operations acting on the OAM of a single photon, such universal sets have been presented in Refs. [15,16]. Here we put forward a very simple argument that shows explicitly that such a universal set can be constructed from conventional optical elements only.
Let us denote by $U$ the abstract $d$-dimensional unitary operation and let $U_O$ and $U_P$ be its implementations for the OAM and the path degrees of freedom, respectively. The idea underlying our argument is this: to build $U_O$ one first transforms the incoming OAM eigenstates into the path encoding and then applies $U_P$, for which general implementation schemes are known [17–19] that use only beam splitters and phase shifters. At the end, the propagation modes are transformed back into the OAM eigenstates. The transition between the OAM and path encodings is performed by a $d$-dimensional OAM sorter $S$. The sorter turns an OAM eigenstate $|m\rangle_O$ with $m$ quanta of OAM that propagates along the zeroth path $|0\rangle_P$ into the fundamental mode $|0\rangle_O$ that propagates along the $m$-th path $|m\rangle_P$, such that

$$S\left(|m\rangle_O |0\rangle_P\right) = |0\rangle_O |m\rangle_P. \tag{2}$$

The OAM sorter can be implemented in multiple ways [9,10,20,21]. Here we employ the interferometric implementation that consists merely of beam splitters, Dove prisms, and holograms [22–25] and whose structure is shown in section II B. The whole scheme is then compactly represented by the formula

$$U_O = S^{-1} \cdot U_P \cdot S. \tag{3}$$

The scheme uses $O(d^2)$ beam splitters, $O(d^2)$ phase shifters, $O(d)$ Dove prisms, and $O(d)$ holograms, as can be deduced from the structure of the OAM sorter [25,26] and the Reck et al. scheme [17]. The detailed analysis of the number of optical elements can be found in Appendix A and the effect of losses is briefly studied in Appendix B.
Despite the "obviousness" of the universal scheme in Eq. (3), the author has so far not found any publication that mentions it. The scheme offers some advantages when compared to alternative approaches. It works in a deterministic and in principle lossless way, unlike the approach based on the decomposition of a general unitary into a linear combination of powers of $X$ and $Z$ gates [1]. There is also no need to assume a limit of many elements as in Ref. [15], no complicated theoretical framework is necessary to demonstrate universality as in Ref. [16], and there is also no need to perform numerical Fourier-optics optimization algorithms as in Ref. [8].
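The sandwich structure of Eq. (3) can be checked on a toy model. The following is a minimal sketch, not the optical implementation: a photon state is a dictionary from (OAM, path) index pairs to amplitudes, the sorter is modelled only on the inputs that Eq. (2) specifies, and all function names are hypothetical.

```python
def sorter(state):
    """Toy OAM sorter S: |m>_O |0>_P -> |0>_O |m>_P (inputs must be in path 0)."""
    out = {}
    for (m, p), amp in state.items():
        if p != 0:
            raise ValueError("this toy sorter expects all light in path 0")
        out[(0, m)] = amp
    return out


def sorter_inverse(state):
    """Toy inverse sorter: |0>_O |m>_P -> |m>_O |0>_P."""
    out = {}
    for (m, p), amp in state.items():
        if m != 0:
            raise ValueError("this toy inverse sorter expects the fundamental OAM mode")
        out[(p, 0)] = amp
    return out


def apply_path_unitary(u, state):
    """Apply the matrix u to the path index; the OAM index is left untouched."""
    out = {}
    for (m, p), amp in state.items():
        for q in range(len(u)):
            out[(m, q)] = out.get((m, q), 0) + u[q][p] * amp
    return out


def apply_via_path(u, oam_amps):
    """OAM amplitudes after S^-1 . U_P . S acts on a photon entering along path 0."""
    state = {(m, 0): a for m, a in enumerate(oam_amps)}
    state = sorter_inverse(apply_path_unitary(u, sorter(state)))
    return [state.get((m, 0), 0) for m in range(len(u))]
```

For a three-dimensional cyclic shift used as `u`, the call `apply_via_path(u, [1, 0, 0])` moves the amplitude from the OAM mode 0 to the OAM mode 1, which is the action of the same matrix applied directly to the OAM amplitudes.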

B. High-dimensional Pauli gates
The brute-force scheme of Eq. (3) can be simplified considerably for specific unitary operations. One class of such operations are the Pauli operators, prominent examples of local quantum gates. The $d$-dimensional Pauli $X$ gate and $Z$ gate are defined by [13]

$$X_d(|q\rangle) = |(q + 1) \bmod d\rangle, \tag{4}$$
$$Z_d(|q\rangle) = \omega^q \, |q\rangle, \tag{5}$$

where $\omega = \exp(2\pi i/d)$ and where $\{|q\rangle\}_{q=0}^{d-1}$ form the computational basis. It turns out that a single optical element suffices to implement the $Z$ gate as well as its integer powers $Z^k$: a Dove prism rotated through an angle of $k\pi/d$ performs the required transformation. The approach of Eq. (3) is thus not necessary in this case. For the $X$ gate, however, the situation is more complicated and the detailed analysis, based on Eq. (3), is presented below.
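As an illustration of the definitions in Eqs. (4) and (5), the sketch below applies the two Pauli gates and their integer powers to a list of amplitudes in the computational basis. The function names and the optional exponent argument `k` are choices made here for illustration only.

```python
import cmath


def x_gate(amps, k=1):
    """X^k: the amplitude of |q> is moved to |(q + k) mod d>."""
    d = len(amps)
    out = [0j] * d
    for q, a in enumerate(amps):
        out[(q + k) % d] = a
    return out


def z_gate(amps, k=1):
    """Z^k: the amplitude of |q> is multiplied by omega^(k q), omega = exp(2 pi i / d)."""
    d = len(amps)
    omega = cmath.exp(2j * cmath.pi / d)
    return [a * omega ** (k * q) for q, a in enumerate(amps)]
```

In dimension $d = 4$, for instance, `x_gate` maps the basis state $|1\rangle$ to $|2\rangle$, whereas `z_gate` leaves $|1\rangle$ in place and multiplies its amplitude by $\omega = i$.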
The implementation scheme of the $d$-dimensional $X$ gate acting on OAM was given in Refs. [26,27]. In what follows, we generalize the results of Ref. [26] and construct a setup for the $k$-th power of the $X$ gate, the $X^k$ gate, in a way that is more efficient than a mere concatenation of $k$ setups corresponding to the $X$ gate. It turns out that the most resource-demanding case is that of $k = d/2$, for which the number of resources scales linearly, as $O(d)$. An alternative approach in Ref. [14] for constructing $X^k$ gates scales like $O(d \log_2(d))$.
We make use of OAM exchangers, depicted in Fig. 1(a), which are passive two-input two-output optical devices [25] composed of a Leach interferometer [22] and two holograms. The sorting properties of an OAM exchanger $EX_k$ of order $k$ are determined by the value of $k$, and the same holds for the inverse operation $EX_k^{-1}$ shown in Fig. 1(b). The $X$ gate can be built out of OAM exchangers in an arbitrary dimension. Nonetheless, here we focus only on dimensions of the form $d = 2^M$, for which the $X$ gate can be constructed as a series of OAM exchangers of orders $2^k$ for $k = 0, \ldots, M-1$, followed by the reversed series of the same structure. The modification of the original scheme for dimension $d = 8$ is presented in Fig. 1(c) and the generalizations for higher dimensions follow the depicted pattern analogously. This pattern can be obtained by starting from the naive implementation in Eq. (3), where the OAM sorters are constructed as binary-tree networks of OAM exchangers [22,25]. This case is explicitly shown in Fig. 1(d), where the path-encoded implementation $U_P$ of the $X$ gate corresponds to the path permutation that connects the output ports of the sorter $S$ on the left with the input ports of the inverted sorter $S^{-1}$ on the right. Due to the structure of the path permutation, many exchangers $EX_k$ from $S$ are followed by their inverses $EX_k^{-1}$ from $S^{-1}$. All these exchangers can be removed without any effect on the final state. The resulting optical network is identical to that in Fig. 1(c).

FIG. 1. The interferometric implementation of the $X$ gate together with its integer powers. (a) The OAM exchanger $EX_k$ of order $k$ is built from two holograms and a Leach interferometer [22] with one Dove prism rotated through $\pi/(2k)$. Optical elements: holo, hologram; Dove, Dove prism; BS, 50:50 beam splitter. (b) The inverse of the OAM exchanger, $EX_k^{-1}$, has a structure almost identical to that of $EX_k$; only the Dove prism is rotated through $-\pi/(2k)$. For convenience, we use two slightly different symbols to denote the inverse of the OAM exchanger, as shown in the figure. (c) The $X$ gate for $d = 8$. (d) The $X$ gate from (c) is constructed from two OAM sorters, marked by shaded rectangles in the figure, from which redundant exchangers are removed. These exchangers can be grouped into blocks of increasing size, which are enclosed in dashed-line rectangles. The unused paths as well as redundant exchangers are drawn in faded color. The remaining exchangers can be reordered in order to get rid of the path permutations. (e) The same principles apply when constructing the integer powers $X^k$ of the $X$ gate. When the exponent $k$ is a power of two, i.e., $k = 2^m$, the path permutation has a repetitive structure and the whole setup is effectively split into $k$ identical subsetups. (f) For a general exponent $k$ the path permutation has a more complicated structure. (g) The number of exchangers that have to be retained in the final setup increases with the exponent $k$ until it reaches $k = d/2$, in which case no exchangers can be removed.
The $X$ gate is a specific example of a cyclic permutation of the basis states. We can obtain all other cyclic permutations by taking powers of the $X$ gate. Specifically,

$$X^k(|q\rangle) = |(q + k) \bmod d\rangle, \tag{6}$$

where $k \in \mathbb{N}$. Due to the cyclic property of the $X$ gate, it holds that $X^{d-k} = (X^k)^{-1}$. Consequently, it suffices to study only powers $k \le d/2$, as the implementation of $X^k$ for $k > d/2$ is obtained as the implementation of $X^{d-k}$ operated backwards. We proceed analogously to the case of the $X$ gate. We again start from the general scheme in Eq. (3) and remove all the exchangers that do not affect the final state. In Figs. 1(e), (f), and (g) the explicit form of the $X^k$ gate is shown for $d = 8$ and $k = 2, 3, 4$, respectively.
In order to understand how the scheme in Eq. (3) can be simplified for a general dimension $d = 2^M$ and a general power $1 \le k \le d/2$, one notes that the permutations of propagation paths can be expressed as a series of path crossings of increasing size [28], see the middle part of Fig. 1(d). When $k$ is a power of two, i.e., $k = 2^m$, the path permutation has a repetitive structure, cf. Fig. 1(e) and (g). In such cases, the setup effectively decomposes into $k$ subsetups of the same structure and smaller size. For example, the setup of $X^2$ in Fig. 1(e) in dimension $d = 8$ can be viewed as two smaller setups for $X$ in dimension $d = 4$. In general, the setup of $X^k$ for power $k = 2^m$ in dimension $d = 2^M$ can be seen as $k$ subsetups implementing the $X$ gate in dimension $d' = d/k = 2^{M-m}$. These setups are sandwiched between two OAM sorters with $k$ output paths. This way we obtain the simplified scheme for dimensions $d$ and powers $k$ that are both powers of two. Characterizing the general structure of the setup for $X^k$ when the power $k$ is not a power of two, such as the case of $X^3$ in Fig. 1(f), is more subtle. For details and the number of required optical elements see Appendix A 2.
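The decomposition into subsetups has a simple algebraic counterpart: the cyclic shift $q \mapsto (q + k) \bmod d$ splits the basis labels into disjoint orbits, and on each orbit $X^k$ acts as a smaller cyclic shift. The sketch below lists these orbits. It only illustrates the permutation structure and says nothing about the arrangement of exchangers; the function name is hypothetical.

```python
def x_power_orbits(d, k):
    """Group the labels 0..d-1 into the orbits of the map q -> (q + k) mod d."""
    seen = set()
    orbits = []
    for start in range(d):
        if start in seen:
            continue
        orbit = []
        q = start
        while q not in seen:
            seen.add(q)
            orbit.append(q)
            q = (q + k) % d
        orbits.append(orbit)
    return orbits
```

For $d = 8$ and $k = 2$ the function returns the two orbits `[0, 2, 4, 6]` and `[1, 3, 5, 7]`, each of which is shifted cyclically like an $X$ gate in dimension 4. For $k = 3$ there is a single orbit that contains all eight labels, in line with the remark that this case does not split into identical subsetups.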

III. SINGLE-PHOTON CONTROLLED GATES IN OAM AND PATH
The scheme of the preceding section makes use of the path degree of freedom to implement an arbitrary unitary in OAM of a single photon. In this section, we invoke path once more and consider a photon whose state is a superposition of $d$ OAM modes $|k\rangle_O$ propagating along $n$ different paths $|p\rangle_P$. For such states one can study a class of unitaries that consists of controlled gates where the two degrees of freedom play the role of control and target qudits. Single-photon controlled gates acting on spatial modes of light were studied in special cases, e.g., in Refs. [29,30].

A. OAM as a control
The general action of a controlled operation $CU$ on input states where the OAM and path play the roles of the control and the target qudit, respectively, reads

$$CU\left(|k\rangle_O |p\rangle_P\right) = |k\rangle_O \, U^k |p\rangle_P, \tag{7}$$

where $U$ is a fixed unitary acting on the path degree of freedom. The quantum circuit that corresponds to this operation is depicted in Fig. 2(a). An experimental setup that implements this controlled operation can be constructed as follows. First we note that any unitary operation $U$ can be diagonalized, such that

$$U = M^\dagger \cdot D \cdot M, \tag{8}$$

where $M$ is a unitary matrix and $D$ is a diagonal matrix composed of the eigenvalues of $U$. Due to the unitarity, the eigenvalues are of the form $\lambda_j = e^{-i\phi_j}$ for some real phases $\phi_j$. From the eigendecomposition formula it directly follows for an arbitrary integer power $k$ that

$$U^k = M^\dagger \cdot D^k \cdot M. \tag{9}$$

The key observation is that $M$ is fixed for all the powers and the diagonal matrix $D^k$ has only complex phases $e^{-ik\phi_j}$ on its diagonal. These phases can be compared with the action of a Dove prism (DP) rotated through angle $\alpha$ on an eigenstate with $k$ quanta of OAM, given in Eq. (10). The extra minus sign in the outgoing OAM eigenstate can be corrected for by an additional mirror. From the above relations one sees that the controlled unitary $CU$ can be implemented as shown in Fig. 2(b). First, an operator $M$ is applied only to the path degree of freedom of the incoming photon; then a series of Dove prisms is utilized, where a Dove prism rotated through $\phi_j/2$ is placed on the $j$-th path; and finally an inverse operator $M^\dagger$ concludes the operation. This sequence is summarized in Eq. (11). We thus obtain a passive network of optical elements, where $M$ can be implemented in various ways, such as the Reck et al. scheme [17] and its alternatives [18,19]. Standard Dove prisms have an undesirable impact on the polarization of propagating photons [31]. Modifications of the Dove prism geometry can nevertheless mitigate this impact considerably [32].
There is one additional feature of the controlled-unitary setup of Fig. 2(b). Suppose that a given unitary $U$ is to act on eigenmodes of the form $|mk\rangle$ for $0 \le k < d$, where $m$ is a fixed integer. The implementation is in this case identical to that in Fig. 2(b) except that the Dove prism in the $j$-th path is rotated through $\phi_j/(2m)$. This general property was noted in Ref. [27] for the special case of the $X$ gate.
Some savings in resources are possible when polarization is utilized as an auxiliary degree of freedom. The setup of Eq. (11) has a symmetric structure, where the $M$ operator is applied both before and after the stack of Dove prisms. One can get rid of the second operator to obtain a folded scheme [26,27], where $M^\dagger$ is implemented by the backward passage of a photon through the setup for $M$, see Fig. 2(c). The forward and backward passage is controlled by the polarization of the photon. Provided that the initial polarization is H, the photon traverses both the $M$ module and the Dove prisms as in the original scheme. Then a series of half-wave plates rotates H into V and all terms in the photon's wavefunction travel backward through $M$. At the end, the terms are reflected out of the setup by an additional series of polarizing beam splitters positioned in front of $M$. Using the Reck et al. scheme [17], the unfolded scheme can be implemented with $n(n-1)$ beam splitters, $n(n-1)$ phase shifters, and $n$ Dove prisms. The folded scheme requires $n(n+3)/2$ beam splitters (both non-polarizing and polarizing), $n(n-1)/2$ phase shifters, and $n$ Dove prisms. Note that the universal scheme of Eq. (3) in the preceding section can be turned into a folded version in an analogous way.
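The three-stage structure of the controlled gate (the operator M, then one phase per path, then the inverse of M) can be mimicked numerically. In the following sketch, which rests on stated assumptions, each Dove prism is modelled solely by the phase $e^{-ik\phi_j}$ that it contributes for an OAM value $k$; the sign flip of the OAM and its correction by a mirror are ignored, matrices are nested lists, and all names are hypothetical.

```python
import cmath


def dagger(a):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matvec(a, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def controlled_u(m, phis, k, path_amps):
    """Path amplitudes after the stages M, phases exp(-i k phi_j), M^dagger."""
    amps = matvec(m, path_amps)                 # stage 1: M acts on the path modes
    amps = [cmath.exp(-1j * k * phi) * a        # stage 2: path j picks up exp(-i k phi_j)
            for phi, a in zip(phis, amps)]
    return matvec(dagger(m), amps)              # stage 3: M^dagger closes the setup


def unitary_from(m, phis):
    """Matrix of U = M^dagger . D . M with D = diag(exp(-i phi_j))."""
    n = len(phis)
    cols = [controlled_u(m, phis, 1, [1 if r == c else 0 for r in range(n)])
            for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(n)]
```

With a symmetric beam splitter as `m` and two arbitrary phases, applying `unitary_from(m, phis)` three times to a path state gives the same amplitudes as a single call of `controlled_u` with `k = 3`, as the eigendecomposition argument predicts.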

B. Examples
Let us now discuss some special cases of the general scheme (11). The simplest case is when $U$ is itself diagonal. A notable example of such a unitary is the high-dimensional controlled-$Z$ gate, where the $n$-dimensional Pauli $Z_n$ gate is characterized by Eq. (5). As follows from Eq. (11), the $CZ$ gate can be implemented as a mere stack of $n$ properly rotated Dove prisms, where the prism in the $p$-th path is rotated through $\pi p/n$ [12,25].
Another special example is the OAM equivalent of the polarizing beam splitter. Such a beam splitter is represented by a high-dimensional controlled Pauli $X$ gate, a high-dimensional generalization of the CNOT gate. The eigendecomposition of the high-dimensional $X$ gate (4) reads $X = F^\dagger \cdot Z \cdot F$, where $Z$ is given in Eq. (5) and $F$ is the high-dimensional path-only Fourier transform. In the notation of Eq. (11) we thus have $M = F$. As mentioned above, the integer powers of the $X$ gate correspond to cyclic permutations with different strides. Thanks to this property the setup of the $CX$ gate can be viewed as a sorter of OAM eigenstates. An experimental implementation of such an OAM sorter based on the aforementioned Fourier relation of $X$ and $Z$ gates was proposed in Ref. [33]. The same idea was then rediscovered 11 years later independently by two groups [34,35].
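The Fourier relation between the $X$ and $Z$ gates can be verified numerically for small dimensions. The sketch below assumes the convention $F_{jq} = \omega^{jq}/\sqrt{d}$ for the Fourier transform, a choice made here so that the relation holds with the definitions of Eqs. (4) and (5); the text does not fix this convention, and the function names are hypothetical.

```python
import cmath
import math


def fourier(d):
    """Discrete Fourier transform with entries F[j][q] = omega^(j q) / sqrt(d)."""
    omega = cmath.exp(2j * cmath.pi / d)
    return [[omega ** (j * q) / math.sqrt(d) for q in range(d)] for j in range(d)]


def pauli_x(d):
    """Matrix of the X gate: entry (p, q) is 1 when p = (q + 1) mod d."""
    return [[1 if p == (q + 1) % d else 0 for q in range(d)] for p in range(d)]


def x_from_fourier(d):
    """Entries of F^dagger . Z . F, where Z = diag(omega^j)."""
    f = fourier(d)
    omega = cmath.exp(2j * cmath.pi / d)
    return [[sum(f[j][p].conjugate() * omega ** j * f[j][q] for j in range(d))
             for q in range(d)] for p in range(d)]


def max_deviation(d):
    """Largest entrywise difference between F^dagger . Z . F and X."""
    a, b = x_from_fourier(d), pauli_x(d)
    return max(abs(a[p][q] - b[p][q]) for p in range(d) for q in range(d))
```

Calling `max_deviation(d)` for a few small values of `d` returns values at the level of floating-point rounding, which is consistent with $M = F$ and $D = Z$ in the notation of Eq. (11).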
Yet another notable example is the Leach interferometer [22] depicted in Fig. 1(a), which is usually used as a parity sorter [23,36]. In this case, operator M corresponds to a single symmetric beam splitter and the diagonal matrix reads D = diag(1, exp(iα)), where α depends on the intended sorting properties of the Leach interferometer [25].

C. Path as a control
For completeness, let us briefly mention the situation complementary to Eq. (7), where the path is now the control qudit and OAM is the target qudit:

$$CU\left(|m\rangle_O |p\rangle_P\right) = \left(U^p |m\rangle_O\right) |p\rangle_P. \tag{12}$$

This case can be formally represented by an abstract quantum circuit akin to that in Fig. 2(a), where the roles of control and target qudits are exchanged. On a more practical level, the transformation of Eq. (12) can be understood as $n$ independent setups, one setup in each path, see Fig. 2(d). An easy way to implement such a controlled operation is to use a swap operator, discussed in detail in the following section, that exchanges the roles of the path and OAM, and to sandwich the setup of Fig. 2(b) between two such swaps. The advantage of such a scheme, when compared to the naive scheme of Fig. 2(d), is the reduction in the amount of resources. In the general case, one requires $O(d^2 n)$ elements to implement the naive scheme, whereas roughly $O(d^2 + n \log(n))$ elements are required by the scheme based on swaps and Fig. 2(b). As in the previous section, we can also construct the folded scheme in the present scenario. The polarizing beam splitters and half-wave plates reroute the terms of the photon wave function such that the initial polarization H is changed into V midway through the setup and the setup is then traversed backward. The circuit of Fig. 2(d) can also be seen from the multiple-photon point of view, where each path is occupied by exactly one photon. Even in such a case the setup can be implemented by sandwiching the scheme of Fig. 2(b) between two swaps, where individual photons share some propagation paths. This point of view leads us to study the parallelization of the scheme in the following section, where the same local unitary is applied to multiple photons.

IV. PARALLELIZATION SCHEME

A. General case
In the domain of classical computation, a truly large-scale deployment is made possible by various parallelization techniques and the same can also be expected in the quantum domain. In general, a large complex computational task is split into smaller parts, each of which is computed by a separate computational core. The simplest case is when the task consists of multiple identical subtasks. One arrives at the scenario of Fig. 3(a), where each subtask is represented by a local unitary $U_O$ acting on one photon. The resulting collection of unitaries $U_O$ can be viewed as a single parallelized operation $U_O^{(\mathrm{par})}$ acting on many photons. Such a stack of identical gates is henceforth referred to as the naive approach.
It is worth pointing out that the same setup of multiple local unitaries emerges in a qualitatively different scenario, when a single photon propagates in a superposition of multiple paths and one wants to apply operation $U$ only to its internal degree of freedom, such as OAM. In the experimental realization, $U$ is implemented as a series of identical setups with one setup $U_O$ in each path, see Fig. 3(b). This is reminiscent of the scheme in Fig. 2(d) except that the unitary in each path is the same.
In the naive approach, $U_O^{(\mathrm{par})}$ requires a number of elements that scales linearly with the number of systems. The same task of simultaneous application of $U$ on $n$ paths can nevertheless be achieved with just a single device, as shown in Fig. 3(c). The key role in this approach is played by the swap operator, whose action on input states reads

$$\mathrm{SWAP}\left(|m\rangle_O |p\rangle_P\right) = |p\rangle_O |m\rangle_P, \tag{13}$$

where $|m\rangle_O$ denotes internal mode $m$ and $|p\rangle_P$ stands for the $p$-th propagation mode. In the scheme, a swap first exchanges the roles of internal and path modes. The path-encoded implementation $U_P$ of the desired unitary $U$ is then applied to the photon(s). This operation transforms the path modes according to $U$ and leaves the internal modes unaffected. Even though this property may not hold in general, it does hold in many cases, as $U_P$ can be constructed only with beam splitters and phase shifters [17], which leave e.g. OAM, polarization, or frequency of photons unaffected [37]. In the third stage, a swap is applied again in order to give the internal and path modes their original meaning. As a result, the internal modes in each path are transformed according to $U$. This procedure is summarized by the formula

$$U_O^{(\mathrm{par})} = \mathrm{SWAP} \cdot U_P \cdot \mathrm{SWAP}, \tag{14}$$

which we henceforth refer to as the parallelized scheme. The parallelized scheme is clearly a generalization of the approach used to construct an arbitrary unitary in OAM in section II. The sorter from Eq. (2) can be understood as a special case of the swap operator in Eq. (13) for $p = 0$. Even though the relation (14) holds for any internal degree of freedom, it is not obvious how to implement the swap operator efficiently in the general case. For the case of OAM and path we can utilize the efficient design of Ref. [25], whose detailed structure is presented in section IV C. The path-encoded unitary $U_P$ can be implemented using the Reck et al. scheme [17] and the parallelized setup of Eq. (14) is thus made of only conventional optical elements. Moreover, each beam splitter in the Reck et al. scheme has to be supplemented by two extra mirrors, such that the sign of OAM eigenstates is unaffected by the reflection off the beam splitter's interface [25].
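The claim that a single path-encoded unitary placed between two swaps acts on the internal modes in every path can be checked on a toy model. The sketch below assumes that the number of paths equals the number of OAM modes, so that the swap is a plain exchange of the two indices, and it represents a state as a dictionary from (OAM, path) pairs to amplitudes; all names are hypothetical.

```python
def swap(state):
    """Toy swap: |m>_O |p>_P -> |p>_O |m>_P, assuming n = d."""
    return {(p, m): amp for (m, p), amp in state.items()}


def apply_path_unitary(u, state):
    """Apply the matrix u to the path index; the OAM index is left untouched."""
    out = {}
    for (m, p), amp in state.items():
        for q in range(len(u)):
            out[(m, q)] = out.get((m, q), 0) + u[q][p] * amp
    return out


def parallelized(u, state):
    """Swap, then the path-encoded unitary, then swap again."""
    return swap(apply_path_unitary(u, swap(state)))


def naive(u, state):
    """Reference: apply u to the OAM index separately in every path."""
    out = {}
    for (m, p), amp in state.items():
        for q in range(len(u)):
            out[(q, p)] = out.get((q, p), 0) + u[q][m] * amp
    return out
```

For any matrix `u` and any input state, `parallelized(u, state)` and `naive(u, state)` return the same amplitudes, although the first variant uses the matrix only once.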

B. Scaling of resources
The simultaneous $n$-fold application of unitary $U_O$ can be implemented in the naive approach with $n$ separate setups. To quantify the improvement brought by the use of the parallelized scheme of Eq. (14), we introduce the ratio $r$ of the number of beam splitters required by the parallelized scheme to that required by the naive scheme, see Eq. (15). The smaller this ratio, the better the parallelized scheme is when compared to the naive implementation. One can consider similar ratios also for other optical elements, with similar results to those presented below.
Some general statements about the efficiency of the improved scheme can be derived even for an unspecified unitary operation. According to the Reck et al. scheme, at most $N_P(d) = d(d-1)/2$ beam splitters are necessary to implement an arbitrary path-encoded unitary $U$. For large enough dimensions the ratio $r_{\mathrm{Reck}}$ scales as given in Eq. (16), which can be interpreted such that the parallelized scheme uses effectively only one setup for $U_P$ instead of $n$ identical setups for $U_O$. The parallelized scheme thus scales in this general case linearly better than the naive approach. For more details refer to Appendix A 3.
For specific classes of unitaries one obtains different scaling estimates. The extreme case is represented by unitaries that correspond to mere permutations of modes. For those, we get $N_P(d) = 0$ as no beam splitters are necessary to permute paths. The ratio for both the $n > d$ and $n \le d$ cases then scales as given in Eq. (17). Except for extreme cases the parallelized scheme is again more efficient than the stack of $n$ independent setups.
The aforementioned complexity estimates are based on the assumption that no simplification of the resulting setup is possible. Nevertheless, the scaling may differ when the implementation of the parallelized scheme can be simplified by removing superfluous elements. We exemplify this reduction in section IV C for integer powers of the high-dimensional $X$ gate, which form a special class of permutations. The above general estimates also do not hold when an implementation other than the brute-force one from Eq. (3) is used to construct $U_O$. For instance, the $d$-dimensional Fourier transform of the OAM eigenstates of a single photon can be implemented using only $N_O(d) \sim \sqrt{d} \log(d)$ beam splitters [12]. Even though the Fourier transform is a much more complex operation than a mere permutation, the scaling of the corresponding number of beam splitters is basically identical to (17). Additional savings in resources are possible when polarization is utilized in the scheme of Eq. (14) to arrive at the folded setup in a way completely analogous to that described in section III. The number of beam splitters is then reduced approximately by the number of beam splitters required to construct the removed swap operator.
The parallelized scheme of Eq. (14) reduces the total number of elements, but the number of elements that each photon has to traverse on average increases. Another relevant issue when assessing the performance of the setup is thus the loss accompanying the transformation $U_O^{(\mathrm{par})}$. Even though the detailed analysis of the role of losses lies beyond the scope of the present paper, a simplified discussion is presented in Appendix B. It turns out that the overall transmittance per photon of the parallelized scheme is decreased by a factor of $T^{2 \log_2(d)}$ when compared to the naive scheme, where $T$ quantifies the effective mean transmittance of each optical element in the setup. The losses for both the naive and parallelized schemes are otherwise comparable to schemes that implement purely path-encoded unitaries, such as the scheme of Ref. [18].
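To get a feeling for the size of the loss penalty, the factor $T^{2 \log_2(d)}$ quoted above can be evaluated directly. The helper below is only a numerical convenience; its name and the sample values of the transmittance are assumptions and are not taken from Appendix B.

```python
import math


def parallelized_transmittance_factor(t, d):
    """Factor T^(2 log2 d) by which the per-photon transmittance is reduced."""
    if not 0 < t <= 1:
        raise ValueError("the mean transmittance must lie in the interval (0, 1]")
    return t ** (2 * math.log2(d))
```

For a hypothetical mean transmittance of 0.99 per element and $d = 16$ the exponent equals 8, so the penalty is $0.99^8$, a reduction of roughly eight percent.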

C. Parallelized Pauli gates
In this section, we discuss the parallelization of Pauli gates and compare their scaling properties with the naive approach. As mentioned in section II B, the implementation of the $d$-dimensional Pauli $Z$ gate is especially simple: a single rotated Dove prism will do. The $Z$ gate in OAM is thus an example of a gate where the parallelization does not actually bring any advantage. This is no longer true for the $X^k$ gates.
One can parallelize the $X$ gate by starting from the setup in Eq. (14), where $U_P$ is the path-encoded $X$ gate, and then removing all the redundant optical elements. To see which elements are not necessary, let us take a close look at the internal structure of the swap operator shown in Fig. 4(a). The swap consists of two functionally different parts [25]: one E block and a series of H blocks of increasing size. The E block is a network of exchangers $EX_k$ of increasing orders of the form $k = 2^l$ and is shown explicitly for each swap in Fig. 4. The structure of H blocks is not of interest in our discussion and can be found in Ref. [25]. The removal of redundant exchangers in the case of the parallelized $X$ gate is depicted in Fig. 4(a) explicitly for the special example of $n = d = 8$. Analogously to the procedure of section II B, there are many instances where an OAM exchanger $EX_k$ is followed by its inverse $EX_k^{-1}$. These exchangers can be removed without affecting the final state. In a completely analogous way one also proceeds for the parallelized integer powers, the $X^k$ gates. One again starts from the scheme in Eq. (14). This step for $n = d = 8$ is shown in Figs. 4(b), (c), and (d) for all $X^k$ gates with $k \le d/2$.

FIG. 4. Explicit forms of the parallelized scheme for $X^k$ gates. The parallelized scheme, exemplified for $d = n = 8$ and $1 \le k \le d/2$, consists of a path permutation sandwiched between two swaps. The swap operator comprises a network of OAM exchangers, whose structure is depicted in Fig. 1, and a series of H blocks (in the present case there are two such blocks per one swap). The presented schemes can be simplified by removing the exchangers that do not affect the final state. These are drawn in faded colors and enclosed in dashed boxes. The input paths of the swap operator should be permuted to comply with formula (13). As this path permutation does not affect our discussion, we omit it in the figure for clarity. Panel (a) corresponds to $k = 2^0 = 1$.
The number of beam splitters in the parallelized scheme is shown in Fig. 5 for n = 16 propagation paths and dimensions d ≤ n. For comparison, the naive approach that involves n copies of the non-parallelized X k gate is also shown. To put these numbers into context, we note that in Ref. [38] an experiment has been reported recently where 50 polarizing beam splitters and a bulk interferometer representing 300 beam splitters were used. Even though the naive approach exceeds these numbers already for d = 8, the parallelized scheme allows for the construction of an arbitrary power of the X gate even for d = 16. At most 162 beam splitters are required in such a case. Although this number is still rather formidable from the present technology point of view, the savings provided by the parallelized schemes are clearly visible. In Appendix A 4 a more detailed discussion of the number of elements is presented.

V. PERIODICITY
Each implementation scheme discussed so far comes with one neat feature: it is periodic in OAM. When working with the OAM degree of freedom, one has to define the subspace of eigenstates in which the operations are to be carried out. When one sets the dimension to $d$, one possible choice of the OAM subspace consists of eigenstates $\{|0\rangle, \ldots, |d-1\rangle\}$. Let us denote the subspace spanned by these eigenstates by $\mathcal{H}_0$. Another choice of eigenstates can be $\{|d\rangle, \ldots, |2d-1\rangle\}$ or $\{|ad\rangle, \ldots, |(a+1)d-1\rangle\}$ for a general $a \in \mathbb{Z}$. Let us denote the subspace spanned by the latter eigenstates by $\mathcal{H}_a$. Consider a unitary operation $U$ defined on subspace $\mathcal{H}_0$ by the formula
$$U = \sum_{i,j=0}^{d-1} U_{i,j}\, |i\rangle\langle j|,$$
where $U = (U_{i,j})$ and $0 \le i, j < d$. For dimensions of the form $d = 2^M$, the naive implementation $U_O$ (3) of unitary $U$ acts identically on each subspace $\mathcal{H}_a$ for any $a \in \mathbb{Z}$, not only on the fundamental subspace $\mathcal{H}_0$, such that
$$U_O = \sum_{a \in \mathbb{Z}} \sum_{i,j=0}^{d-1} U_{i,j}\, |ad+i\rangle\langle ad+j|. \tag{19}$$
This property was noted for the case of the high-dimensional $X$ gate in Ref. [26], but any power of the $X$ gate constructed in Section II B also has this property [39]. The periodicity in OAM is a result of the modulo property of the OAM sorter and the swap operator [14,25]. For details refer to Appendix C. An interesting issue is that of the periodicity of the controlled-unitary setup of Eq. (11). When the eigenvalues of $U$ are of the form $\lambda_k = \exp(2\pi i\, a_k/b_k)$ for some $a_k, b_k \in \mathbb{Z}$, the whole scheme is periodic with a period that does not exceed the product $b_0 b_1 b_2 \cdots b_{n-1}$. Unitaries whose eigenvalues have phases that are irrational multiples of $2\pi$ do not display any periodic behaviour. In practice, though, any real number can be approximated by a rational number and the periodicity is effectively restored.
The periodicity of the presented setups may be seen as another parallelization feature: the parallelization in OAM. The OAM degree of freedom allows for the generation of very-high-dimensional states. One can thus consider a large OAM Hilbert space $\mathcal{H}$ composed of $n$ subspaces $\mathcal{H}_a$, each of dimension $d$, such that $\mathcal{H}$ is a direct sum of the form
$$\mathcal{H} = \bigoplus_{a} \mathcal{H}_a.$$
A single photon with an $(nd)$-dimensional state $|\psi\rangle \in \mathcal{H}$ can therefore be seen as representing a sum of $n$ different $d$-dimensional qudits $|\psi_a\rangle \in \mathcal{H}_a$. When this state is manipulated via the operations discussed in the preceding sections, each qudit $|\psi_a\rangle$ is, due to Eq. (19), manipulated independently of all the others. One photon can thus in effect carry $n$ different qudits in parallel. This framework can be seen as subspace multiplexing, where several OAM eigenmodes together carry a given quantum state in each subspace. Subspace multiplexing could be used in free-space communication or in quantum computation with single photons. Even though in theory there is no limit on the largest possible number of OAM quanta and, therefore, on the number of subspaces the operation $U_O$ can act on, from the physical perspective there is a limit. The spatial extent of the photon's wave function increases with the number of OAM quanta, so for large OAM values the eigenstate becomes macroscopic and unwieldy for manipulation [7,40,41].
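The block structure behind subspace multiplexing can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the OAM eigenstate $|ad + j\rangle$ is stored at array index $ad + j$, the window contains $n$ consecutive subspaces, and the $X$ gate follows the cyclic-shift convention $|j\rangle \to |(j+1) \bmod d\rangle$. The helper names `x_gate` and `multiplexed` are hypothetical and do not come from the paper.

```python
import numpy as np

def x_gate(d):
    """Cyclic shift |j> -> |(j + 1) mod d> (assumed convention for the X gate)."""
    return np.roll(np.eye(d), 1, axis=0)

def multiplexed(U, n):
    """Block-diagonal operator applying U to each of n stacked d-dimensional subspaces."""
    return np.kron(np.eye(n), U)

d, n = 4, 3
U_O = multiplexed(x_gate(d), n)

# n independent qudits, one per subspace H_0, ..., H_{n-1} (row a holds |psi_a>).
rng = np.random.default_rng(0)
qudits = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
qudits /= np.linalg.norm(qudits)      # normalize the full (n*d)-dimensional state
psi = qudits.reshape(n * d)           # amplitude of |a*d + j> sits at index a*d + j

# Acting with U_O on the full state transforms every qudit separately.
out = (U_O @ psi).reshape(n, d)
expected = qudits @ x_gate(d).T       # X applied to each row on its own
print(np.allclose(out, expected))
```

In this toy model the periodic operator is simply `kron(identity, U)`, which is why no amplitude leaks from one subspace into another.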

VI. CONCLUSION
We study the manipulation of orbital angular momentum of light with the help of interferometric networks. The networks consist of conventional optical elements and spiral phase plates, which add or subtract an integer number of OAM quanta. Importantly, no use of holograms with complex phase profiles is made, as is the case for instance in the multi-plane light conversion method [8][9][10][11]. The interferometric scheme for implementing an arbitrary unitary operation on single photons is presented, which can be viewed as the OAM counterpart of the scheme of Reck et al. To exemplify the argument, we explicitly construct the setups for $X^k$ gates for dimensions of the form $d = 2^M$ and derive precise analytical formulas for the number of employed optical elements.
Another interferometric scheme introduced is that for implementation of single-photon controlled gates, where the OAM and path play the role of either the control or target qudits. It turns out that several reported results of other authors are special cases of the scheme.
The last class of interferometric networks under consideration is the one that allows for simultaneous application of the same unitary on multiple OAM states. This parallelized scheme can find applications e.g. in multiplexed communication channels, where the internal modes of photons are used as carriers of information. The savings in resources for a general unitary offered by the parallelized scheme scale approximately linearly with the number n of involved parties, i.e., the naive approach requires roughly n times more elements than the parallelized approach. The parallelized versions of Pauli gates are constructed as concrete examples.
The networks presented in the text display periodicity in OAM, which can be used to implement subspace multiplexing where one device acts on many disjoint OAM subspaces of the same photon in the same way. When multiple independent pieces of information are encoded in the OAM of a single photon, our scheme applies the same unitary to each piece separately. The periodicity follows directly from the sorting properties of Leach interferometers in dimensions that are powers of two. This feature is unique to the interferometric implementation we use in this paper and cannot be imitated by alternative techniques that implement an OAM unitary using holograms with complex phase profiles [9,11,20].
The above results are in principle applicable not only to single photons but also to classical beams of light carrying orbital angular momentum. It would be interesting to see whether more properties of the networks similar to those studied in the text can be found.
We would like to thank Robert Kindler and Johannes Pseiner for their valuable comments on the early version of the manuscript. This work was supported by the Austrian Academy of Sciences and the University of Vienna via the QUESS project (Quantum Experiments on Space Scale).

The path-encoded unitary $U_P$ as well as the OAM sorter are implemented as networks of many interferometers. Beam splitters thus play an important role in their construction. In the following, we estimate the complexity of the OAM implementation $U_O$ by counting the beam splitters used in its construction. A similar discussion can also be done for other optical elements. Let $N_O(d)$ be the number of beam splitters required in the scheme of Eq. (3). If $N_P(d)$ is the number of beam splitters that are necessary to implement $U_P$, then
$$N_O(d) = N_P(d) + 2 N_S(d), \tag{A1}$$
where $N_S(d)$ is the number of beam splitters that implement the OAM sorter in dimension $d$ [25].

$X^k$ gates
The number of beam splitters required in a scheme for the $d$-dimensional $X$ gate, where $d = 2^M$, is equal to [26]
$$N_X(d) = 4 \log_2(d). \tag{A3}$$
Following the analysis in the main text, it is not hard to see that the number of beam splitters necessary to implement $X^k$, where $d = 2^M$ and $k = 2^m$, is given by formula (A5). It is easy to observe that for $k = d/2 = 2^{M-1}$ there are no exchangers to be removed from the naive scheme, see Fig. 1(g), since the path permutation affects all the outermost exchangers in both OAM sorters. Indeed, in such a case $N_X(d, k = d/2) = 2 N_S(d)$. On the other hand, for $k = 1 = 2^0$ the number of required beam splitters is minimal and (A5) coincides with $N_X(d)$ of (A3). The structure of the network for powers $k$ that are not powers of two is more complicated. Even though the same procedure as delineated in the main text can be followed, we refrain from describing it in detail and present only the resulting number of beam splitters retained in the simplified setup. It turns out that this number for $X^k$ gates in dimension $d = 2^M$ and for general powers $k$ is given by formula (A6), in which $m$ is an integer such that $2^m \le k < 2^{m+1}$. For $k = 2^m$ we recover formula (A5).

FIG. 6. [...] Violet dots correspond to the ratio $r_{\mathrm{perm.}}$ for permutations, which form a special subset of unitaries. The orange surface is given by formula (A10) that expresses the approximate scaling of ratio $r_{\mathrm{Reck}}$. The violet surface is likewise given by formula (17).
From the structure of the OAM exchanger, Fig. 1(a), and the fact that the resulting setup for any $X^k$ gate consists only of OAM exchangers, it is clear that the exact same formula (A6) also applies to the number of employed Dove prisms and holograms. The number of mirrors is twice as large, and there is no need for phase shifters.

Parallelized scheme
Let us denote by $N_O^{(\mathrm{par})}$ the number of beam splitters employed in the parallelized scheme of Eq. (14). Analogously to Eq. (A1) we obtain
$$N_O^{(\mathrm{par})}(n, d) = N_P(d) + 2 N_{\mathrm{SWAP}}(n, d), \tag{A7}$$
where $N_{\mathrm{SWAP}}(n, d)$ is the number of beam splitters used to implement the swap operator with $n$ input and $d$ output paths. Due to the structure of the swap operator [25] we have to discuss the case with $n \le d$ and that with $n > d$ separately. When both $n$ and $d$ are powers of two and $n \le d$, the number of beam splitters that implement the swap operator is given by [12]
$$N_{\mathrm{SWAP}}(n, d) = \frac{n}{2} \log_2(n) + d \log_2(n) - 3n + 2d + 1. \tag{A8}$$
In the opposite case with $n \ge d$ one obtains the counterpart formula (A9). When $n$ or $d$ are not powers of two, we construct the swap with $2^r$ input and $2^s$ output paths, where $r$ and $s$ are such that $2^{r-1} < n \le 2^r$ and $2^{s-1} < d \le 2^s$. Formulas (A8) and (A9) then represent upper bounds on the number of utilized beam splitters.
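The count (A8) and the rounding of $n$ and $d$ up to powers of two can be sketched as follows. This is only an illustration under stated assumptions: the leading term of (A8) is read as $(n/2)\log_2(n)$, only the case $n \le d$ is covered because formula (A9) is not reproduced here, and the function names are hypothetical.

```python
def next_pow2(x):
    """Smallest power of two 2**r with 2**(r-1) < x <= 2**r (for x >= 1)."""
    return 1 << (x - 1).bit_length()

def n_swap_a8(n, d):
    """Beam splitters in the swap for powers of two n <= d, formula (A8).

    Assumption: the first term of (A8) is (n/2) * log2(n).
    """
    if n & (n - 1) or d & (d - 1) or n > d:
        raise ValueError("expects powers of two with n <= d")
    log_n = n.bit_length() - 1            # exact log2 for a power of two
    return n * log_n // 2 + d * log_n - 3 * n + 2 * d + 1

def n_swap_upper_bound(n, d):
    """Upper bound for general n, d: evaluate (A8) on the rounded-up sizes."""
    n_up, d_up = next_pow2(n), next_pow2(d)
    if n_up > d_up:
        raise NotImplementedError("case n > d needs (A9), not reproduced here")
    return n_swap_a8(n_up, d_up)

print(n_swap_a8(4, 8))            # swap with 4 inputs and 8 outputs
print(n_swap_upper_bound(3, 5))   # rounded up to the same 4-by-8 swap
```

With this reading, a swap with a single input path ($n = 1$) needs $2d - 2$ beam splitters.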
To compare the performance of the parallelized scheme with the naive approach, let us use the ratio $r$ defined in Eq. (15) and consider a general unitary implemented with the scheme of Reck et al. [17]. For this scheme the ratio (15) scales roughly as given by formula (A10). Even though the number of beam splitters in the swaps differs for the case of $n \le d$ and that of $n > d$, the scaling (A10) holds approximately for both of them. A sample of exact values of the ratio $r_{\mathrm{Reck}}$ as well as the asymptotic behavior are depicted in Fig. 6. Permutations serve as a counterpart of this general scheme as far as the number of beam splitters is concerned, see $r_{\mathrm{perm.}}$ in Eq. (17). For comparison, a sample of exact values of $r_{\mathrm{perm.}}$ as well as the asymptotic behavior (17) are also depicted in Fig. 6. As for the other optical elements involved in the parallelized setup built using the scheme of Reck et al., the following estimates apply. The path-only unitary implementation requires $O(d^2)$ beam splitters and $O(d^2)$ phase shifters. To build the swap operators, $O(n \log_2(n))$ beam splitters, $O(n)$ phase shifters, $O(n \log_2(n))$ holograms, and $O(n)$ Dove prisms are necessary, provided that $n > d$ [12]. When $n \le d$, all the estimates depend only on $d$, not on $n$.

Parallelized $X^k$ gates
The resulting number of beam splitters required to implement the parallelized version of the $X$ gate in dimension $d = 2^M$ for $n = 2^K$ paths is equal to
$$N_X^{(\mathrm{par})}(n, d) = n \log_2(n) + 2n - 2, \tag{A11}$$
provided that $n \ge d$. This formula does not depend on the dimension $d$, only on the number of paths. The naive approach, which consists in stacking $n$ non-parallelized schemes (A3), would require $n\, N_X(d) = 4 n \log_2(d)$ beam splitters. The saving in resources is thus approximately equal to
$$r_X(n, d) \approx \frac{\log_2(n)}{4 \log_2(d)} \tag{A12}$$
for large enough dimensions $d$ and number of paths $n \ge d$. When the number of paths is approximately equal to the dimension, the ratio above approaches a constant factor of $1/4$ and the parallelization of Eq. (14) provides a moderate improvement over the naive approach. When $n \le d$, the formula (A11) is modified, but even then the improvement resulting from the parallelized scheme is rather moderate.
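To see how quickly the approximate ratio is approached, one can compare it with the exact quotient of the two counts quoted above. The sketch below assumes that the ratio is defined as the parallelized count divided by the naive count and that $n \ge d$ are powers of two; the function names are hypothetical.

```python
from math import log2

def n_par_x(n):
    """Beam splitters in the parallelized X gate for n >= d paths: n*log2(n) + 2n - 2."""
    return n * log2(n) + 2 * n - 2

def n_naive_x(n, d):
    """Beam splitters in n stacked copies of the non-parallelized X gate: 4*n*log2(d)."""
    return 4 * n * log2(d)

def ratio_exact(n, d):
    """Parallelized count divided by naive count (assumed definition of the ratio)."""
    return n_par_x(n) / n_naive_x(n, d)

def ratio_approx(n, d):
    """Leading-order estimate log2(n) / (4*log2(d))."""
    return log2(n) / (4 * log2(d))

for size in (16, 1024, 2**20):
    print(size, ratio_exact(size, size), ratio_approx(size, size))
```

For $n = d = 16$ the exact ratio is $94/256 \approx 0.37$, so the limiting value of $1/4$ is approached only slowly as $n = d$ grows.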
By calculations analogous to those for the non-parallelized $X^k$ gates, the number of beam splitters retained in the final implementation of the parallelized schemes for an arbitrary $k \le d/2$ turns out to be
$$N_X^{(\mathrm{par})}(n, d, k) = n \log_2(n) + 2n - 4k + 2 + 2d \left( \frac{k}{2^m} + m - 1 \right), \tag{A13}$$
where $m$ is an integer such that $2^m \le k < 2^{m+1}$ and where we assume $n \ge d$. For $k = 1$ this expression simplifies into the formula (A11) derived for the parallelized version of the $X$ gate. A similar discussion, with similar results, can also be done for other optical elements and for $n < d$.
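Formula (A13) is easy to evaluate directly. The following sketch assumes that the last term of (A13) reads $2d\,(k/2^m + m - 1)$ and that $n \ge d$ are powers of two; the function name is hypothetical. It reproduces the count for the plain parallelized $X$ gate at $k = 1$ and the maximum of 162 beam splitters quoted in the main text for $n = d = 16$.

```python
def n_par_xk(n, d, k):
    """Beam splitters in the parallelized X^k gate, formula (A13), for n >= d.

    Assumptions: n and d are powers of two, 1 <= k <= d/2, and the last
    term of (A13) is 2*d*(k / 2**m + m - 1) with 2**m <= k < 2**(m+1).
    """
    if n & (n - 1) or d & (d - 1) or n < d or not 1 <= k <= d // 2:
        raise ValueError("expects powers of two n >= d and 1 <= k <= d/2")
    log_n = n.bit_length() - 1      # exact log2(n) for a power of two
    m = k.bit_length() - 1          # largest m with 2**m <= k
    # 2**m divides d here, so the division below is exact.
    last_term = 2 * d * k // 2**m + 2 * d * (m - 1)
    return n * log_n + 2 * n - 4 * k + 2 + last_term

counts = {k: n_par_xk(16, 16, k) for k in range(1, 9)}
print(counts)
print(max(counts.values()))
```

In this evaluation the count grows monotonically with $k$, so the most demanding gate is the one with $k = d/2$.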
When we compare the scaling for the naive approach utilizing $n$ identical copies of the $X^k$ gate and the parallelization of Eq. (14), we obtain the scaling ratio of formula (A14), where again $n \ge d$. For high powers $k$ we therefore save more resources by making use of the parallelized version.
In this formula, we assumed $k$ to be constant. We can, however, also consider $k$ that scales with the dimension $d$. For instance, the most resource-demanding scenario is when $k = d/2$. In such a case one obtains
$$r_X(n, d, d/2) \approx \frac{3 \log_2(n)}{4d}. \tag{A15}$$
Unless the number of paths exponentially exceeds the dimension, the parallelized scheme of Eq. (14) offers in this scenario substantial savings in resources when compared to the naive approach. For $n < d$ one can perform an analogous analysis.

Since we work only with such $n$ and $d$ that are powers of two, their ratio $n/d$ is an integer. Formula (C2) is a generalization of the 'modulo property' for the swap operator. If $n \le d$, the action of the swap operator can be written as in formula (C3).