liu.seSök publikationer i DiVA
Ändra sökning
Avgränsa sökresultatet
1 - 33 av 33
RefereraExporteraLänk till träfflistan
Permanent länk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Träffar per sida
  • 5
  • 10
  • 20
  • 50
  • 100
  • 250
Sortering
  • Standard (Relevans)
  • Författare A-Ö
  • Författare Ö-A
  • Titel A-Ö
  • Titel Ö-A
  • Publikationstyp A-Ö
  • Publikationstyp Ö-A
  • Äldst först
  • Nyast först
  • Skapad (Äldst först)
  • Skapad (Nyast först)
  • Senast uppdaterad (Äldst först)
  • Senast uppdaterad (Nyast först)
  • Disputationsdatum (tidigaste först)
  • Disputationsdatum (senaste först)
  • Standard (Relevans)
  • Författare A-Ö
  • Författare Ö-A
  • Titel A-Ö
  • Titel Ö-A
  • Publikationstyp A-Ö
  • Publikationstyp Ö-A
  • Äldst först
  • Nyast först
  • Skapad (Äldst först)
  • Skapad (Nyast först)
  • Senast uppdaterad (Äldst först)
  • Senast uppdaterad (Nyast först)
  • Disputationsdatum (tidigaste först)
  • Disputationsdatum (senaste först)
Markera
Maxantalet träffar du kan exportera från sökgränssnittet är 250. Vid större uttag använd dig av utsökningar.
  • 1.
    Ahmed, Tanvir
    et al.
    Linköpings universitet, Institutionen för systemteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    A 512-point 8-parallel pipelined feedforward FFT for WPAN2011Ingår i: 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), IEEE , 2011, s. 981-984Konferensbidrag (Refereegranskat)
    Abstract [en]

    This paper presents a 512-point feedforward FFT architecture for wireless personal area network (WPAN). The architecture processes a continuous flow of 8 samples in parallel, leading to a throughput of 2.64 GSamples/s. The FFT is computed in three stages that use radix-8 butterflies. This radix reduces significantly the number of rotators with respect to previous approaches based on radix-2. Besides, the proposed architecture uses the minimum memory that is required for a 512-point 8-parallel FFT. Experimental results show that besides its high throughput, the design is efficient in area and power consumption, improving the results of previous approaches. Specifically, for a wordlength of 16 bits, the proposed design consumes 61.5 mW and its area is 1.43 mm2.

  • 2.
    Ambuluri, Sreehari
    et al.
    Linköpings universitet, Institutionen för systemteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Caffarena, Gabriel
    Boadilla del Monte, Madrid, Spain.
    Ogniewski, Jens
    Linköpings universitet, Institutionen för systemteknik, Informationskodning. Linköpings universitet, Tekniska fakulteten.
    Ragnemalm, Ingemar
    Linköpings universitet, Institutionen för systemteknik, Informationskodning. Linköpings universitet, Tekniska fakulteten.
    New Radix-2 and Radix-22 Constant Geometry Fast Fourier Transform Algorithms For GPUs2013Konferensbidrag (Refereegranskat)
    Abstract [en]

    This paper presents new radix-2 and radix-22 constant geometry fast Fourier transform (FFT) algorithms for graphics processing units (GPUs). The algorithms combine the use of constant geometry with special scheduling of operations and distribution among the cores. Performance tests on current GPUs show a significant improvements compared to the most recent version of NVIDIA’s well-known CUFFT, achieving speedups of up to 5.6x.

  • 3.
    Boopal, Padma Prasad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    A Reconfigurable FFT Architecture for Variable-Length and Multi-Streaming OFDM Standards2013Ingår i: IEEE International Symposium on Circuits and Systems (ISCAS), 2013, IEEE , 2013, s. 2066-2070Konferensbidrag (Refereegranskat)
    Abstract [en]

    This paper presents a reconfigurable FFT architecture for variable-length and multi-streaming WiMax wireless standard. The architecture processes 1 stream of 2048-point FFT, up to 2 streams of 1024-point FFT or up to 4 streams of 512-point FFT. The architecture consists of a modified radix-2 single delay feedback (SDF) FFT. The sampling frequency of the system is varied in accordance with the FFT length. The latch-free clock gating technique is used to reduce power consumption. The proposed architecture has been synthesized for the Virtex-6 XCVLX760 FPGA. Experimental results show that the architecture achieves the throughput that is required by the WiMax standard and the design has additional features compared to the previous approaches. The design uses 1% of the total available FPGA resources and maximum clock frequency of 313.67 MHz is achieved. Furthermore, this architecture can be expanded to suit other wireless standards.

  • 4.
    Chen, Sau-Gee
    et al.
    National Chiao Tung University, Taiwan.
    Huang, Shen-Jui
    Novatek Corp, Taiwan.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska högskolan.
    Jou, Shyh-Jye
    National Chiao Tung University, Taiwan.
    Continuous-flow Parallel Bit-Reversal Circuit for MDF and MDC FFT Architectures2014Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 61, nr 10, s. 2869-2877Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents a bit reversal circuit for continuous-flow parallel pipelined FFT processors. In addition to two flexible commutators, the circuit consists of two memory groups, where each group has P memory banks. For the consideration of achieving both low delay time and area complexity, a novel write/read scheduling mechanism is devised, so that FFT outputs can be stored in those memory banks in an optimized way. The proposed scheduling mechanism can write the current successively generated FFT output data samples to the locations without any delay right after they are successively released by the previous symbol. Therefore, total memory space of only N data samples is enough for continuous-flow FFT operations. Since read operation is not overlapped with write operation during the entire period, only single-port memory is required, which leads to great area reduction. The proposed bit-reversal circuit architecture can generate natural-order FFT output and support variable power-of-2 FFT lengths.

  • 5.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    A New Representation of FFT Algorithms Using Triangular Matrices2016Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 63, nr 10, s. 1737-1745Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this paper we propose a new representation for FFT algorithms called the triangular matrix representation. This representation is more general than the binary tree representation and, therefore, it introduces new FFT algorithms that were not discovered before. Furthermore, the new representation has the advantage that it is simple and easy to understand, as each FFT algorithm only consists of a triangular matrix. Besides, the new representation allows for obtaining the exact twiddle factor values in the FFT flow graph easily. This facilitates the design of FFT hardware architectures. As a result, the triangular matrix representation is an excellent alternative to represent FFT algorithms and it opens new possibilities in the exploration and understanding of the FFT.

  • 6.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    The Feedforward Short-Time Fourier Transform2016Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 63, nr 9, s. 868-872Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents the feedforward short-time Fourier transform (STFT). This new approach is based on reusing the calculations of the STFT at consecutive time instants. This leads to significant savings in hardware components with respect to fast Fourier transform based STFTs. Furthermore, the feedforward STFT does not have the accumulative error of iterative STFT approaches. As a result, the proposed feedforward STFT presents an excellent tradeoff between hardware utilization and performance.

  • 7.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Andersson, Rikard
    Linköpings universitet, Institutionen för systemteknik, Fordonssystem. Linköpings universitet, Tekniska högskolan.
    Qureshi, Fahad
    Tampere University of Technology, Finland.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Multiplierless Unity-Gain SDF FFTs2016Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 24, nr 9, s. 3003-3007Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this brief, we propose a novel approach to implement multiplierless unity-gain single-delay feedback fast Fourier transforms (FFTs). Previous methods achieve unity-gain FFTs by using either complex multipliers or nonunity-gain rotators with additional scaling compensation. Conversely, this brief proposes unity-gain FFTs without compensation circuits, even when using nonunity-gain rotators. This is achieved by a joint design of rotators, so that the entire FFT is scaled by a power of two, which is then shifted to unity. This reduces the amount of hardware resources of the FFT architecture, while having high accuracy in the calculations. The proposed approach can be applied to any FFT size, and various designs for different FFT sizes are presented.

  • 8.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Angel Sanchez, Miguel
    University of Politecn Madrid, Spain.
    Luisa Lopez-Vallejo, Maria
    University of Politecn Madrid, Spain.
    Grajal, Jesus
    University of Politecn Madrid, Spain.
    A 4096-Point Radix-4 Memory-Based FFT Using DSP Slices2017Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 25, nr 1, s. 375-379Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents a novel 4096-point radix-4 memory-based fast Fourier transform (FFT). The proposed architecture follows a conflict-free strategy that only requires a total memory of size N and a few additional multiplexers. The control is also simple, as it is generated directly from the bits of a counter. Apart from the low complexity, the FFT has been implemented on a Virtex-5 field programmable gate array (FPGA) using DSP slices. The goal has been to reduce the use of distributed logic, which is scarce in the target FPGA. With this purpose, most of the hardware has been implemented in DSP48E. As a result, the proposed FPGA is efficient in terms of hardware resources, as is shown by the experimental results.

  • 9.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Grajal, J.
    University of Politecn Madrid, Spain .
    Continuous-flow variable-length memoryless linear regression architecture2013Ingår i: Electronics Letters, ISSN 0013-5194, E-ISSN 1350-911X, Vol. 49, nr 24, s. 1567-1568Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    A pipelined circuit to calculate linear regression is presented. The proposed circuit has the advantages that it can process a continuous flow of data, it does not need memory to store the input samples and supports variable length that can be reconfigured in run time. The circuit is efficient in area, as it consists of a small number of adders, multipliers and dividers. These features make it very suitable for real-time applications, as well as for calculating the linear regression of a large number of samples.

  • 10.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Grajal, J
    University of Politecn Madrid, Spain .
    Sanchez, M A.
    University of Politecn Madrid, Spain .
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Pipelined Radix-2(k) Feedforward FFT Architectures2013Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 21, nr 1, s. 23-32Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    The appearance of radix-2(2) was a milestone in the design of pipelined FFT hardware architectures. Later, radix-2(2) was extended to radix-2(k). However, radix-2(k) was only proposed for single-path delay feedback (SDF) architectures, but not for feedforward ones, also called multi-path delay commutator (MDC). This paper presents the radix-2(k) feedforward (MDC) FFT architectures. In feedforward architectures radix-2(k) canbe used for any number of parallel samples which is a power of two. Furthermore, both decimation in frequency (DIF) and decimation in time (DIT) decompositions can be used. In addition to this, the designs can achieve very high throughputs, which makes them suitable for the most demanding applications. Indeed, the proposed radix-2(k) feedforward architectures require fewer hardware resources than parallel feedback ones, also called multi-path delay feedback (MDF), when several samples in parallel must be processed. As a result, the proposed radix-2(k) feedforward architectures not only offer an attractive solution for current applications, but also open up a new research line on feedforward structures.

  • 11.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Grajal, Jesus
    University of Politecn Madrid.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Optimum Circuits for Bit Reversal2011Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 58, nr 10, s. 657-661Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents novel circuits for calculating bit reversal on a series of data. The circuits are simple and consist of buffers and multiplexers connected in series. The circuits are optimum in two senses: they use the minimum number of registers that are necessary for calculating the bit reversal and have minimum latency. This makes them very suitable for calculating the bit reversal of the output frequencies in hardware fast Fourier transform (FFT) architectures. This brief also proposes optimum solutions for reordering the output frequencies of the FFT when different common radices are used, including radix-2, radix-2(k), radix-4, and radix-8.

  • 12.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Grajal, Jesus
    University of Politecn Madrid.
    Accurate Rotations Based on Coefficient Scaling2011Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 58, nr 10, s. 662-666Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents a novel approach for improving the accuracy of rotations implemented by complex multipliers, based on scaling the complex coefficients that define these rotations. A method for obtaining the optimum coefficients that lead to the lowest error is proposed. This approach can be used to get more accurate rotations without increasing the coefficient word length and to reduce the word length without increasing the rotation error. This brief analyzes two different situations where the optimization method can be applied: rotations that can be optimized independently and sets of rotations that require the same scaling. These cases appear in important signal processing algorithms such as the discrete cosine transform and the fast Fourier transform (FFT). Experimental results show that the use of scaling for the coefficients clearly improves the accuracy of the algorithms. For instance, improvements of about 8 dB in the Frobenius norm of the FFT are achieved with respect to using non-scaled coefficients.

  • 13.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Huang, Shen-Jui
    Novatek Corp, Taiwan.
    Chen, Sau-Gee
    Natl Chiao Tung Univ, Taiwan.
    Feedforward FFT Hardware Architectures Based on Rotator Allocation2018Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 65, nr 2, s. 581-592Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this paper, we present new feedforward FFT hardware architectures based on rotator allocation. The rotator allocation approach consists in distributing the rotations of the FFT in such a way that the number of edges in the FFT that need rotators and the complexity of the rotators are reduced. Radix-2 and radix-2(k) feedforward architectures based on rotator allocation are presented in this paper. Experimental results show that the proposed architectures reduce the hardware cost significantly with respect to previous FFT architectures.

  • 14.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Huang, Shen-Jui
    Novatek Corp, Taiwan.
    Chen, Sau-Gee
    National Chiao Tung University, Taiwan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    The Serial Commutator FFT2016Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 63, nr 10, s. 974-978Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents a new type of fast Fourier transform (FFT) hardware architectures called serial commutator (SC) FFT. The SC FFT is characterized by the use of circuits for bit-dimension permutation of serial data. The proposed architectures are based on the observation that, in the radix-2 FFT algorithm, only half of the samples at each stage must be rotated. This fact, together with a proper data management, makes it possible to allocate rotations only every other clock cycle. This allows for simplifying the rotator, halving the complexity with respect to conventional serial FFT architectures. Likewise, the proposed approach halves the number of adders in the butterflies with respect to previous architectures. As a result, the proposed architectures use the minimum number of adders, rotators, and memory that are necessary for a pipelined FFT of serial data, with 100% utilization ratio.

  • 15.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Källström, Petter
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Kumm, Martin
    University of Kassel, Germany.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    CORDIC II: A New Improved CORDIC Algorithm2016Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 63, nr 2, s. 186-190Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this brief, we present the CORDIC II algorithm. Like previous CORDIC algorithms, the CORDIC II calculates rotations by breaking down the rotation angle into a series of microrotations. However, the CORDIC II algorithm uses a novel angle set, different from the angles used in previous CORDIC algorithms. The new angle set provides a faster convergence that reduces the number of adders with respect to previous approaches.

  • 16.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Lopez-Vallejo, Maria Luisa
    Tech Univ Madrid, Spain.
    Chen, Sau-Gee
    Natl Chiao Tung Univ, Taiwan.
    Guest Editorial: Special Section on Fast Fourier Transform (FFT) Hardware Implementations2018Ingår i: Journal of Signal Processing Systems, ISSN 1939-8018, E-ISSN 1939-8115, Vol. 90, nr 11, s. 1581-1582Artikel i tidskrift (Övrigt vetenskapligt)
    Abstract [en]

    n/a

  • 17.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska högskolan.
    Qureshi, Fahad
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska högskolan.
    Low-Complexity Multiplierless Constant Rotators Based on Combined Coefficient Selection and Shift-and-Add Implementation (CCSSI)2014Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 61, nr 7, s. 2002-2012Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents a new approach to design multiplierless constant rotators. The approach is based on a combined coefficient selection and shift-and-add implementation (CCSSI) for the design of the rotators. First, complete freedom is given to the selection of the coefficients, i.e., no constraints to the coefficients are set in advance and all the alternatives are taken into account. Second, the shift-and-add implementation uses advanced single constant multiplication (SCM) and multiple constant multiplication (MCM) techniques that lead to low-complexity multiplierless implementations. Third, the design of the rotators is done by a joint optimization of the coefficient selection and shift-and-add implementation. As a result, the CCSSI provides an extended design space that offers a larger number of alternatives with respect to previous works. Furthermore, the design space is explored in a simple and efficient way. The proposed approach has wide applications in numerous hardware scenarios. This includes rotations by single or multiple angles, rotators in single or multiple branches, and different scaling of the outputs. Experimental results for various scenarios are provided. In all of them, the proposed approach achieves significant improvements with respect to state of the art.

  • 18.
    Garrido Gálvez, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Unnikrishnan, Nanda K.
    Univ Minnesota, MN 55455 USA.
    Parhi, Keshab K.
    Univ Minnesota, MN 55455 USA.
    A Serial Commutator Fast Fourier Transform Architecture for Real-Valued Signals2018Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 65, nr 11, s. 1693-1697Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents a novel pipelined architecture to compute the fast Fourier transform of real input signals in a serial manner, i.e., one sample is processed per cycle. The proposed architecture, referred to as real-valued serial commutator, achieves full hardware utilization by mapping each stage of the fast Fourier transform (FFT) to a half-butterfly operation that operates on real input signals. Prior serial architectures to compute FFT of real signals only achieved 50% hardware utilization. Novel data-exchange and data-reordering circuits are also presented. The complete serial commutator architecture requires 2 log(2) N - 2 real adders, log(2) N - 2 real multipliers, and N + 9 log(2) N - 19 real delay elements, where N represents the size of the FFT.

  • 19.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Multiplexer and Memory-Efficient Circuits for Parallel Bit Reversal2019Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 66, nr 4, s. 657-661Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This brief presents novel circuits for calculating the bit reversal on parallel data. The circuits consist of delays/memories and multiplexers, and have the advantage that they requires the minimum number of multiplexers among circuits for parallel bit reversal so far, as well as a small total memory.

  • 20.
    Garrido, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Acevedo, Miguel
    Linköpings universitet, Institutionen för systemteknik. Linköpings universitet, Tekniska fakulteten.
    Ehliar, Andreas
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Challenging the Limits of FFT Performance on FPGAs2014Konferensbidrag (Refereegranskat)
    Abstract [en]

    This paper analyzes the limits of FFT performance on FPGAs. For this purpose, a FFT generation tool has been developed. This tool is highly parameterizable and allows for generating FFTs with different FFT sizes and amount of parallelization. Experimental results for FFT sizes from 16 to 65536, and 4 to 64 parallel samples have been obtained. They show that even the largest FFT architectures fit well in today's FPGAs, achieving throughput rates from several GSamples/s to tens of GSamples/s.

  • 21.
    Garrido, Mario
    et al.
    Universidad Politecnica de Madrid, Spain.
    Grajal, J.
    Universidad Politecnica de Madrid, Spain.
    Efficient Memoryless Cordic for FFT Computation2007Ingår i: Efficient Memoryless Cordic for FFT Computation, IEEE , 2007, s. II-113-II-116Konferensbidrag (Refereegranskat)
    Abstract [en]

    A new memoryless CORDIC algorithm for the FFT computation is proposed in this paper. This approach calculates the direction of the micro-rotations from the control counter of the FFT, so the area of the rotator hardly depends on the number of rotations, which is particularly suitable for the computation of FFTs of a high number of points. Moreover, the new CORDIC presents other advantages such as the simplification of the basic CORDIC processor used to calculate the micro-rotations, or an easy way to compensate the intrinsic gain of the CORDIC algorithm. Additionally, the VLSI implementation of the algorithm is a pipeline architecture with high performance in terms of speed, throughput and latency.

  • 22.
    Garrido, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Grajal, Jesus
    Univ Politecn Madrid, Spain.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Optimum Circuits for Bit-Dimension Permutations2019Ingår i: IEEE Transactions on Very Large Scale Integration (vlsi) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 27, nr 5, s. 1148-1160Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this paper, we present a systematic approach to design hardware circuits for bit-dimension permutations. The proposed approach is based on decomposing any bit-dimension permutation into elementary bit-exchanges. Such decomposition is proven to achieve the theoretical minimum number of delays required for the permutation. This offers optimum solutions for multiple well-known problems in the literature that make use of bit-dimension permutations. This includes the design of permutation circuits for the fast Fourier transform, bit reversal, matrix transposition, stride permutations, and Viterbi decoders.

  • 23.
    Garrido, Mario
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Möller, K.
    University of Kassel, Kassel, Germany.
    Kumm, M.
    University of Kassel, Kassel, Germany.
    World’s Fastest FFT Architectures: Breaking the Barrier of 100 GS/s2019Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 66, nr 4, s. 1507-1516Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents the fastest fast Fourier transform (FFT) hardware architectures so far. The architectures are based on a fully parallel implementation of the FFT algorithm. In order to obtain the highest throughput while keeping the resource utilization low, we base our design on making use of advanced shift-and-add techniques to implement the rotators and on selecting the most suitable FFT algorithms for these architectures. Apart from high throughput and resource efficiency, we also guarantee high accuracy in the proposed architectures. For the implementation, we have developed an automatic tool that generates the architectures as a function of the FFT size, input word length and accuracy of the rotations. We provide experimental results covering various FFT sizes, FFT algorithms, and field-programmable gate array boards. These results show that it is possible to break the barrier of 100 GS/s for FFT calculation.

  • 24.
    Garrido, Mario
    et al.
    Department of Signal, Systems and Radiocommuncations, Universidad Politecnica de Madrid, Madrid, Spain.
    Parhi, Keshab. K.
    University of Minnesota, USA.
    Grajal, Jesus
    Department of Signal, Systems and Radiocommuncations, Universidad Politecnica de Madrid, Madrid, Spain.
    A Pipelined FFT Architecture for Real-Valued Signals2009Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 56, nr 12, s. 2634-2643Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents a new pipelined hardware archi-tecture for the computation of the real-valued fast Fourier trans-form (RFFT). The proposed architecture takes advantage of the re-duced number of operations of the RFFT with respect to the com-plex fast Fourier transform (CFFT), and requires less area whileachieving higher throughput and lower latency.The architecture is based on a novel algorithm for the computa-tion of the RFFT, which, contrary to previous approaches, presentsa regular geometry suitable for the implementation of hardwarestructures. Moreover, the algorithm can be used for both the deci-mation in time (DIT) and decimation in frequency (DIF) decompo-sitions of the RFFT and requires the lowest number of operationsreported for radix 2.Finally, as in previous works, when calculating the RFFT theoutput samples are obtained in a scrambled order. The problemof reordering these samples is solved in this paper and a pipelinedcircuit that performs this reordering is proposed.

  • 25.
    Kanders, Hans
    et al.
    Linköpings universitet, Institutionen för systemteknik. Linköpings universitet, Tekniska fakulteten.
    Mellqvist, Tobias
    Linköpings universitet, Institutionen för systemteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Palmkvist, Kent
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    A 1 Million-Point FFT on a Single FPGA2019Ingår i: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 66, nr 10, s. 3863-3873Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    In this paper, we present the first implementation of a 1 million-point fast Fourier transform (FFT) completely integrated on a single field-programmable gate array (FPGA), without the need for external memory or multiple interconnected FPGAs. The proposed architecture is a pipelined single-delay feedback (SDF) FFT. The architecture includes a specifically designed 1 million-point rotator with high accuracy and a thorough study of the word length at the different FFT stages in order to increase the signal-to-quantization-noise ratio (SQNR) and keep the area low. This also results in low power consumption.

  • 26.
    Kovalev, Anton
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Implementation approaches for 512-tap 60 GSa/s chromatic dispersion FIR filters2017Ingår i: Conference Record of The Fifty-First Asilomar Conference on Signals, Systems & Computers / [ed] Michael B. Matthews, Institute of Electrical and Electronics Engineers (IEEE), 2017, s. 1779-1783Konferensbidrag (Refereegranskat)
    Abstract [en]

    In optical communication the non-ideal properties of the fibers lead to pulse widening from chromatic dispersion. One way to compensate for this is through digital signal processing. In this work, two architectures for compensation are compared. Both are designed for 60 GSa/s and 512 filter taps and implemented in the frequency domain using FFTs. It is shown that the high-speed requirements introduce constraints on possible architectural choices. Furthermore, the theoretical multiplication complexity estimates are not good predictors for the energy consumption. The results show that the implementation with 10% more multiplications per sample has half the power consumption and one third of the area consumption. The best architecture for this specification results in a power consumption of 3.12 W in a 65 nm technology, corresponding to an energy per complex filter tap of 0.10 mW/GHz.

  • 27.
    Kumm, Martin
    et al.
    Univ Kassel, Germany.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Zipf, Peter
    Univ Kassel, Germany.
    Optimal Single Constant Multiplication Using Ternary Adders2018Ingår i: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 65, nr 7, s. 928-932Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    The single constant coefficient multiplication is a frequently used operation in many numeric algorithms. Extensive previous work is available on how to reduce constant multiplications to additions, subtractions, and bit shifts. However, on previous work, only common two-input adders were used. As modern field-programmable gate arrays (FPGAs) support efficient ternary adders, i.e., adders with three inputs, this brief investigates constant multiplications that are built from ternary adders in an optimal way. The results show that the multiplication with any constant up to 22 bits can be realized by only three ternary adders. Average adder reductions of more than 33% compared to optimal constant multiplication circuits using two-input adders are achieved for coefficient word sizes of more than five bits. Synthesis experiments show FPGA average slice reductions in the order of 25% and a similar or higher speed than their two-input adder counterparts.

  • 28.
    Källström, Petter
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Low-Complexity Rotators for the FFT Using Base-3 Signed Stages2012Ingår i: APCCAS 2012 : 2012 IEEE Asia Pacific Conference on Circuits and Systems, Piscataway, N.J., USA: IEEE , 2012, s. 519-522Konferensbidrag (Refereegranskat)
    Abstract [en]

    Rotations by angles that are fractions of the unit circle find applications in e.g. fast Fourier transform (FFT) architectures. In this work we propose a new rotator that consists of a series of stages. Each stage calculates a micro-rotation by an angle corresponding to a power-of-three fractional parts. Using a continuous powers-of-three range, it is possible to carry out all rotations required. In addition, the proposed rotators are compared to previous approaches, based of shift-and-add algorithms, showing improvements in accuracy and number of adders.

  • 29.
    Moeller, Konrad
    et al.
    Univ Kassel, Germany.
    Kumm, Martin
    Univ Kassel, Germany.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Zipf, Peter
    Univ Kassel, Germany.
    Optimal Shift Reassignment in Reconfigurable Constant Multiplication Circuits2018Ingår i: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, nr 3, s. 710-714Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents a new method called optimal shift reassignment (OSR), used for reconfigurable multiplication circuits. These circuits consist of adders, subtractors, shifts, and multiplexers (MUXs). They calculate the multiplication of an input number by one out of several constants which can be selected dynamically during run-time. The OSR method is based on the idea that shifts can be placed at different positions along the circuit, while the calculated output constant stays the same. This differs from previous approaches, which were limited by the fact that all constants within the constant multiplier were forced to be odd. The OSR method subsequently releases this restriction. As a result, the number of required MUXs in the circuit can be reduced. This happens when the shift reassignment aligns the shift values of different inputs of an MUX. Experimental results show MUX savings of up to 50% and average savings between 11% and 16% using the OSR method compared to previous approaches.

  • 30.
    Mohammadi Sarband, Narges
    et al.
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Obtaining Minimum Depth Sum of Products from Multiple Constant Multiplication2018Ingår i: PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS), IEEE, Institute of Electrical and Electronics Engineers (IEEE), 2018, s. 134-139Konferensbidrag (Refereegranskat)
    Abstract [sv]

    In this work, an approach for transposing solutions to the multiple constant multiplication (MCM) problem to obtain a sum of product (SOP) computation with minimum depth is proposed. The reason for doing this is that solving the SOP problem directly is highly computationally intensive when adder graph algorithms are used. Compared to using subexpression sharing algorithms, which has a lower computational complexity, directly for the SOP problem, it is shown that the proposed approach, as expected, results in lower complexity for the SOP. It is also shown that there is no obvious way to construct the MCM solution in such a way that the SOP solution has the minimum theoretical depth. However, the proposed approach guarantees minimum depth subject to the MCM solution given as input.

  • 31.
    Qureshi, Fahad
    et al.
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Garrido, Mario
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Gustafsson, Oscar
    Linköpings universitet, Institutionen för systemteknik, Elektroniksystem. Linköpings universitet, Tekniska högskolan.
    Unified architecture for 2, 3, 4, 5, and 7-point DFTs based on Winograd Fourier transform algorithm2013Ingår i: Electronics Letters, ISSN 0013-5194, E-ISSN 1350-911X, Vol. 49, nr 5, s. 348-U60Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    A unified hardware architecture that can be reconfigured to calculate 2, 3, 4, 5, or 7-point DFTs is presented. The architecture is based on the Winograd Fourier transform algorithm and the complexity is equal to a 7-point DFT in terms of adders/subtractors and multipliers plus only seven multiplexers introduced to enable reconfigurability. The processing element finds potential use in memory-based FFTs, where non-power-of-two sizes are required such as in DMB-T.

  • 32.
    Sanchez, Miguel A.
    et al.
    Universidad Politecnica de Madrid, Madrid, Spain.
    Garrido, Mario
    Universidad Politecnica de Madrid, Madrid, Spain.
    Lopez-Vallejo, Marisa
    Universidad Politecnica de Madrid, Madrid, Spain.
    Grajal, Jesus
    Universidad Politecnica de Madrid, Madrid, Spain.
    Implementing FFT-based Digital Channelized Receivers on FPGA Platforms2008Ingår i: IEEE Transactions on Aerospace and Electronic Systems, ISSN 0018-9251, E-ISSN 1557-9603, Vol. 44, nr 4, s. 1567-1585Artikel i tidskrift (Refereegranskat)
    Abstract [en]

    This paper presents an in-depth study of the implementationand characterization of fast Fourier transform (FFT) pipelinedarchitectures suitable for broadband digital channelized receivers.When implementing the FFT algorithm on field-programmablegate array (FPGA) platforms, the primary goal is to maximizethroughput and minimize area. Feedback and feedforwardarchitectures have been analyzed regarding key designparameters: radix, bitwidth, number of points and stage scaling.Moreover, a simplification of the FFT algorithm, the monobitFFT, has been implemented in order to achieve faster real timeperformance in broadband digital receivers. The influence ofthe hardware implementation on the performance of digitalchannelized receivers has been analyzed in depth, revealinginteresting implementation trade-offs which should be taken intoaccount when designing this kind of signal processing systems onFPGA platforms.

  • 33.
    Unnikrishnan, Nanda K.
    et al.
    Univ Minnesota, MN 55455 USA.
    Garrido Gálvez, Mario
    Linköpings universitet, Institutionen för systemteknik, Datorteknik. Linköpings universitet, Tekniska fakulteten.
    Parhi, Keshab K.
    Univ Minnesota, MN 55455 USA.
    Effect of Finite Word-Length on SQNR, Area and Power for Real-Valued Serial FFT2019Ingår i: 2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), IEEE , 2019Konferensbidrag (Refereegranskat)
    Abstract [en]

    Modern applications for DSP systems are increasingly constrained by tight area and power requirements. Therefore, it is imperative to analyze effective strategies that work within these requirements. This paper studies the impact of finite word-length arithmetic on the signal to quantization noise ratio (SQNR), power and area for a real-valued serial FFT implementation. An experiment is set up using a hardware description language (HDL) to empirically determine the tradeoffs associated with the following parameters: (i) the input word-length, (ii) the word-length of the rotation coefficients, and (iii) length of the FFT on performance (SQNR), power and area. The results of this paper can be used to make design decisions by careful selection of word-length to achieve a reduction in area and power for an acceptable loss in SQNR.

1 - 33 av 33
RefereraExporteraLänk till träfflistan
Permanent länk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf