Wu, Di
Publications (10 of 21)
Asghar, R., Wu, D., Saeed, A., Huang, Y. & Liu, D. (2012). Implementation of a Radix-4, Parallel Turbo Decoder and Enabling the Multi-Standard Support. Journal of Signal Processing Systems, 66(1), 25-41
2012 (English). In: Journal of Signal Processing Systems, ISSN 1939-8018, E-ISSN 1939-8115, Vol. 66, no 1, p. 25-41. Article in journal (Refereed), Published
Abstract [en]

This paper presents a unified, radix-4 implementation of a turbo decoder covering multiple standards such as DVB, WiMAX, 3GPP-LTE and HSPA Evolution. The radix-4 parallel interleaver is the bottleneck when using the same turbo-decoding architecture for multiple standards. This paper covers the issues associated with the design of a radix-4 parallel interleaver to reach a flexible turbo-decoder architecture. Radix-4 parallel interleaver algorithms and their mapping onto a hardware architecture are presented for multi-mode operation. The overheads associated with hardware multiplexing are found to be insignificant. Besides flexibility of the turbo decoder implementation, low silicon cost and low power are also addressed by optimizing the storage scheme for branch metrics and extrinsic information. The proposed unified architecture for radix-4 turbo decoding consumes 0.65 mm² of area in total in a 65 nm CMOS process. With 4 SISO blocks used in parallel and 6 iterations, it can achieve a throughput of up to 173.3 Mbps while consuming 570 mW of power in total. It provides a good trade-off between silicon cost, power consumption and throughput, with a silicon efficiency of 0.005 mm²/Mbps and an energy efficiency of 0.55 nJ/b/iter.
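
The energy-efficiency figure quoted in the abstract above can be checked from the quoted power, throughput and iteration count. The following is a minimal sketch, assuming that energy per bit per iteration is defined as total power divided by the product of throughput and iteration count; the function name is illustrative and not taken from the paper.

```python
def energy_nj_per_bit_per_iter(power_w, throughput_mbps, iterations):
    """Energy per decoded bit per iteration, in nanojoules.

    Assumed definition (not stated in the abstract):
    power / (throughput * iterations).
    """
    bits_per_second = throughput_mbps * 1e6
    joules_per_bit = power_w / bits_per_second
    return joules_per_bit / iterations * 1e9


# Figures quoted in the abstract: 570 mW, 173.3 Mbps, 6 iterations.
print(round(energy_nj_per_bit_per_iter(0.570, 173.3, 6), 2))
```

Under that assumed definition the quoted values give roughly 0.55 nJ/b/iter, in line with the abstract.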

Place, publisher, year, edition, pages
Springer Verlag (Germany), 2012
Keywords
Turbo codes, VLSI implementation, Radix-4, Parallel interleaver, Multimode
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-75110 (URN); 10.1007/s11265-010-0521-6 (DOI); 000299530700004 ()
Note
Funding Agencies: European Commission; Ericsson AB; Infineon Austria AG; IMEC; Lund University; KU-Leuven
Available from: 2012-02-21 Created: 2012-02-17 Last updated: 2017-12-07
Wu, D., Eilert, J. & Liu, D. (2011). Implementation of a High-Speed MIMO Soft-Output Symbol Detector for Software Defined Radio. Journal of Signal Processing Systems, 63(1), 27-37
2011 (English). In: Journal of Signal Processing Systems, ISSN 1939-8115, Vol. 63, no 1, p. 27-37. Article in journal (Refereed), Published
Abstract [en]

This paper presents a programmable MMSE soft-output MIMO symbol detector that supports the 600 Mbps data rate defined in 802.11n. The detector is implemented using a multi-core floating-point processor and a configurable soft-bit demapper. Owing to the dynamic range supplied by the floating-point SIMD datapath, special algorithms can be adopted to reduce the computational latency of channel processing with sufficient numerical stability for large channel matrices. Compared to several existing fixed-function solutions, the detector proposed in this paper is smaller and faster. More importantly, it is programmable and configurable, so it can support various MIMO transmission schemes defined by different standards.
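
For readers unfamiliar with MMSE detection, the sketch below shows the textbook linear MMSE estimate for a 2x2 channel, x = (H^H H + noise_var * I)^-1 H^H y, in plain Python. It is only an illustration of the general technique: the function name and the 2x2 restriction are assumptions made here, and it does not reproduce the latency-reducing algorithms or the soft-bit demapper of the paper.

```python
def mmse_detect_2x2(h, y, noise_var):
    """Textbook linear MMSE symbol estimate for a 2x2 MIMO channel.

    h is a 2x2 nested list of complex channel gains, y the two received
    samples, noise_var the noise variance (unit symbol energy assumed).
    """
    (a, b), (c, d) = h
    # Regularized Gram matrix G = H^H H + noise_var * I.
    g00 = abs(a) ** 2 + abs(c) ** 2 + noise_var
    g01 = a.conjugate() * b + c.conjugate() * d
    g10 = g01.conjugate()
    g11 = abs(b) ** 2 + abs(d) ** 2 + noise_var
    # Matched-filter output z = H^H y.
    z0 = a.conjugate() * y[0] + c.conjugate() * y[1]
    z1 = b.conjugate() * y[0] + d.conjugate() * y[1]
    # Closed-form inverse of the 2x2 matrix G applied to z.
    det = g00 * g11 - g01 * g10
    return [(g11 * z0 - g01 * z1) / det, (g00 * z1 - g10 * z0) / det]


# Noise-free check: the estimate should recover the transmitted symbols.
h = [[1 + 1j, 0.5 + 0j], [0.2j, 1 - 0.5j]]
x = [1 + 1j, -1 - 1j]
y = [h[0][0] * x[0] + h[0][1] * x[1], h[1][0] * x[0] + h[1][1] * x[1]]
print(mmse_detect_2x2(h, y, 0.0))
```

With `noise_var` set to zero the estimate reduces to zero-forcing, which is why the noise-free check returns the transmitted symbols up to rounding.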

Place, publisher, year, edition, pages
New York: Springer, 2011
Keywords
SDR, MIMO, OFDM, MMSE, Soft-output, Detection, VLSI
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-50666 (URN); 10.1007/s11265-009-0369-9 (DOI); 000289166100003 ()
Available from: 2009-10-13 Created: 2009-10-13 Last updated: 2011-04-26. Bibliographically approved
Asghar, R., Wu, D., Eilert, J. & Liu, D. (2010). Memory Conflict Analysis and Implementation of a Re-configurable Interleaver Architecture Supporting Unified Parallel Turbo Decoding. Journal of Signal Processing Systems for Signal, Image, and Video Technology, 60(1), 15-29
2010 (English). In: Journal of Signal Processing Systems for Signal, Image, and Video Technology, ISSN 1939-8018, Vol. 60, no 1, p. 15-29. Article in journal (Refereed), Published
Abstract [en]

This paper presents a novel hardware interleaver architecture for unified parallel turbo decoding. The architecture is fully re-configurable among multiple standards such as HSPA Evolution, DVB-SH, 3GPP-LTE and WiMAX. Turbo codes, widely used for error correction in today’s consumer electronics, are prone to introduce higher latency due to bigger block sizes and multiple iterations. Many parallel turbo decoding architectures have recently been proposed to enhance the channel throughput, but the interleaving algorithms used in different standards do not freely allow their use because of a high percentage of memory conflicts. The architecture presented in this paper provides a re-configurable platform for implementing the parallel interleavers for different standards by managing the conflicts involved in each. The memory conflicts are managed by applying different approaches such as stream misalignment, memory division and the use of a small FIFO buffer. The proposed flexible architecture is low cost and consumes 0.085 mm² of area in a 65 nm CMOS process. It can implement up to 8 parallel interleavers and can operate at a frequency of 200 MHz, thus providing significant support to higher-throughput systems based on parallel SISO processors.
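
The notion of a memory conflict in the abstract above can be illustrated with a small counting model. This is a sketch under stated assumptions, not the analysis from the paper: a block of `k` values is split into `p` equal windows, one per SISO unit; all units issue one interleaved address per cycle; and memory bank `n` holds the addresses of window `n`. The quadratic permutation uses parameters believed to match the smallest LTE block size (K = 40, f1 = 3, f2 = 10), and the row-column permutation is a hypothetical stand-in for an interleaver that does collide.

```python
def qpp(i, k, f1, f2):
    """Quadratic permutation polynomial interleaver address."""
    return (f1 * i + f2 * i * i) % k


def row_column(i, rows, cols):
    """Toy block interleaver: write row-wise, read column-wise."""
    return (i % rows) * cols + i // rows


def count_bank_conflicts(perm, k, p):
    """Count extra same-cycle accesses to a bank already in use.

    perm maps a linear index to an interleaved address; k is the block
    size and p the number of parallel units (p must divide k).
    """
    window = k // p
    conflicts = 0
    for t in range(window):
        # Bank hit by each parallel unit in cycle t.
        banks = [perm(t + unit * window) // window for unit in range(p)]
        conflicts += p - len(set(banks))
    return conflicts


print(count_bank_conflicts(lambda i: qpp(i, 40, 3, 10), 40, 4))
print(count_bank_conflicts(lambda i: row_column(i, 4, 10), 40, 4))
```

In this model the quadratic permutation produces no conflicts, while the toy row-column permutation sends pairs of units to the same bank in every cycle. Collisions of the second kind are what techniques such as stream misalignment, memory division and FIFO buffering are meant to absorb.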

National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-25599 (URN); 10.1007/s11265-009-0394-8 (DOI); 000276722700002 ()
Note
The original publication is available at www.springerlink.com: Rizwan Asghar, Di Wu, Johan Eilert and Dake Liu, Memory Conflict Analysis and Implementation of a Re-configurable Interleaver Architecture Supporting Unified Parallel Turbo Decoding, 2010, Journal of Signal Processing Systems for Signal, Image, and Video Technology, (60), 1, 15-29. http://dx.doi.org/10.1007/s11265-009-0394-8 Copyright: Springer Science Business Media http://www.springerlink.com/
Available from: 2009-10-13 Created: 2009-10-08 Last updated: 2010-05-10
Wu, D., Eilert, J., Asghar, R., Liu, D., Nilsson, A., Tell, E. & Alfredsson, E. (2010). System architecture for 3GPP-LTE modem using a programmable baseband processor. International Journal of Embedded and Real-Time Communication Systems, 1(3), 44-64
2010 (English). In: International Journal of Embedded and Real-Time Communication Systems, ISSN 1947-3176, E-ISSN 1947-3184, Vol. 1, no 3, p. 44-64. Article in journal (Refereed), Published
Abstract [en]

The evolution of third generation mobile communications toward high-speed packet access and long-term evolution is ongoing and will substantially increase the throughput with higher spectral efficiency. This paper presents the system architecture of an LTE modem based on a programmable baseband processor. The architecture includes a baseband processor that handles processing time and frequency synchronization, IFFT/FFT (up to 2048-p), channel estimation and subcarrier de-mapping. The throughput and latency requirements of a Category four User Equipment (CAT4 UE) are met by adding a MIMO symbol detector and a parallel Turbo decoder supporting H-ARQ, which brings both low silicon cost and enough flexibility to support other wireless standards. The complexity demonstrated by the modem shows the practicality and advantage of using programmable baseband processors for a single-chip LTE solution. Copyright © 2010, IGI Global.

Place, publisher, year, edition, pages
IGI Global, 2010
Keywords
3GPP; Long-Term Evolution; Programmable; Radio Baseband; Software Defined Radio
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-100694 (URN); 10.4018/jertcs.2010070103 (DOI)
Available from: 2013-11-11 Created: 2013-11-11 Last updated: 2017-12-06
Wu, D., Eilert, J., Asghar, R. & Liu, D. (2010). VLSI Implementation of a Fixed-Complexity Soft-Output MIMO Detector for High-Speed Wireless. EURASIP Journal on Wireless Communications and Networking, 2010(893184)
2010 (English). In: EURASIP Journal on Wireless Communications and Networking, ISSN 1687-1472, E-ISSN 1687-1499, Vol. 2010, no 893184. Article in journal (Refereed), Published
Abstract [en]

This paper presents a low-complexity MIMO symbol detector with close-Maximum a posteriori performance for the emerging multiantenna enhanced high-speed wireless communications. The VLSI implementation is based on a novel MIMO detection algorithm called Modified Fixed-Complexity Soft-Output (MFCSO) detection, which achieves a good trade-off between performance and implementation cost compared to the referenced prior art. By including a microcode-controlled channel preprocessing unit and a pipelined detection unit, it is flexible enough to cover several different standards and transmission schemes. The flexibility allows adaptive detection to minimize power consumption without degradation in throughput. The VLSI implementation of the detector is presented to show that real-time MIMO symbol detection of 20 MHz bandwidth 3GPP LTE and 10 MHz WiMAX downlink physical channel is achievable at reasonable silicon cost.

Place, publisher, year, edition, pages
Hindawi, 2010
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-67289 (URN); 10.1155/2010/893184 (DOI)
Available from: 2011-04-07 Created: 2011-04-07 Last updated: 2017-12-11
Wu, D., Eilert, J., Asghar, R., Liu, D. & Ge, Q. (2010). VLSI Implementation of A Multi-Standard MIMO Symbol Detector for 3GPP LTE and WiMAX. In: Wireless Telecommunications Symposium (WTS), 2010. Paper presented at 9th IEEE Wireless Telecommunication Symposium, WTS'10 (pp. 1-4). IEEE
2010 (English). In: Wireless Telecommunications Symposium (WTS), 2010, IEEE, 2010, p. 1-4. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, a low-complexity symbol detector is presented targeting the emerging 3GPP LTE and WiMAX standards. The detector is the VLSI implementation of a recently proposed novel MIMO detection algorithm. Compared to the design in the reference, the detector performs better while consuming less silicon area. Including a microcode-controlled channel preprocessing unit and a pipelined detection unit, it is flexible enough to cover different standards and transmission schemes while maintaining power and area efficiency. Implemented in a 65 nm CMOS process, the detector can support real-time detection of a 20 MHz bandwidth 3GPP LTE or 10 MHz WiMAX downlink physical channel.

Place, publisher, year, edition, pages
IEEE, 2010
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-56218 (URN); 10.1109/WTS.2010.5479665 (DOI); 978-1-4244-6558-3 (ISBN)
Conference
9th IEEE Wireless Telecommunication Symposium, WTS'10
Available from: 2010-05-01 Created: 2010-05-01 Last updated: 2014-09-01
Wu, D., Eilert, J. & Liu, D. (2009). Evaluation of MIMO Symbol Detectors for 3GPP LTE Terminals. In: 17th European Signal Processing Conference (EUSIPCO).
2009 (English). In: 17th European Signal Processing Conference (EUSIPCO), 2009. Conference paper, Published paper (Refereed)
Abstract [en]

This paper investigates various MIMO detection methods for 3GPP LTE open-loop downlink multi-antenna transmission. Targeting VLSI implementation, these detection methods are evaluated with respect to complexity and detection performance. A realistic 3GPP LTE simulation chain is developed for the evaluation. The result shows that with the aid of Hybrid Automatic Repeat reQuest (H-ARQ), a recently proposed reduced complexity close-ML detector called MFCSO achieves a good tradeoff between achievable throughput and complexity. An adaptive transmission and detection scheme is also proposed based on user scenarios.

National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-25598 (URN)
Available from: 2009-10-13 Created: 2009-10-08 Last updated: 2009-10-16
Wu, D., Asghar, R., Huang, Y. & Liu, D. (2009). Implementation of a high-speed parallel Turbo decoder for 3GPP LTE terminals. Paper presented at IEEE 8th International Conference on ASIC, ASICON'09. IEEE
2009 (English). Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a parameterized parallel Turbo decoder for 3GPP LTE terminals. To support the high peak data rate defined in the forthcoming 3GPP LTE standard, a turbo decoder with a throughput beyond 150 Mbit/s is needed as a key component of the radio baseband chip. By exploiting the tradeoff between precision, speed and area consumption, a turbo decoder with eight parallel SISO units is implemented to meet the throughput requirement. The turbo decoder is synthesized, placed and routed using 65 nm CMOS technology. It achieves a throughput of 152 Mbit/s and occupies an area of 0.7 mm², with an estimated power consumption of 650 mW.

Place, publisher, year, edition, pages
IEEE, 2009
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-56217 (URN); 10.1109/ASICON.2009.5351623 (DOI); 000275924100117 (); 978-1-4244-3868-6 (ISBN)
Conference
IEEE 8th International Conference on ASIC, ASICON'09.
Available from: 2010-05-01 Created: 2010-05-01 Last updated: 2010-08-12
Asghar, R., Wu, D., Eilert, J. & Liu, D. (2009). Memory Conflict Analysis and Interleaver Design for Parallel Turbo Decoding Supporting HSPA Evolution. In: 12th EUROMICRO Conference on Digital System Design. Paper presented at 12th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, DSD 2009; Patras; Greece (pp. 699-706).
2009 (English). In: 12th EUROMICRO Conference on Digital System Design, 2009, p. 699-706. Conference paper, Published paper (Refereed)
Abstract [en]

HSPA evolution has raised the throughput requirements for WCDMA-based systems, where turbo codes have been adopted to perform the error correction. Many parallel turbo decoding architectures have recently been proposed to enhance the channel throughput, but the interleaving algorithm used in WCDMA-based systems does not freely allow their use because of a high percentage of memory conflicts. This paper provides a comprehensive analysis of how to reduce interleaver memory conflicts while generating more than one address in a single clock cycle. It also provides a trade-off analysis in terms of area and power efficiency for multiple architectures for the different functions involved in the interleaver design. The final architecture supports processing of two parallel SISO blocks and manages the conflicts by applying different approaches such as stream misalignment, memory division and a small FIFO buffer. The proposed architecture is low cost and consumes 4.3K gates at a frequency of 150 MHz. This work also focuses on reducing pre-processing overheads by introducing segment-based modulo computation, thus further relaxing the SISO decoding process.

National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-25596 (URN); 10.1109/DSD.2009.178 (DOI); 000275715100094 (); 978-0-7695-3782-5 (ISBN)
Conference
12th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, DSD 2009; Patras; Greece
Available from: 2009-10-08 Created: 2009-10-08 Last updated: 2014-08-28
Wu, D. (2009). Scalable Multi-Standard Radio Baseband for Modern Wireless Communications. (Doctoral dissertation). Linköping: Linköping University Electronic Press
2009 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Today, owing to the rapid advancement of technologies, people can cross the geographic gap and communicate without waiting a week to receive a letter. Meanwhile, more and more wireless communications standards are emerging, all claiming to make our life easier. This brings us to a dilemma: we need new technologies not because we are fond of technical complication but, on the contrary, because we are constantly pursuing convenience and simplicity. Being tangled up in so many standards for connectivity is not fun for anyone (even for the people who invented these technologies). The demand is rather simple: why not put everything into one unit which can automatically attach itself to the most suitable radio access available in the circumstances? The whole purpose of this thesis is to find an economical way of meeting such a demand.

From the semiconductor industry’s point of view, the traditional ASIC design flow is facing the challenges brought by ever more rapidly changing specifications and immense tape-out cost at the nanoscale, not to mention the ever-increasing system complexity, which requires painstaking and costly integration and verification.

This thesis investigates multi-tasking radio which is a concept to allow multiple radio access technologies to be supported by the same hardware platform and switched under different scenarios. By simultaneously looking at different layers of abstraction such as system modeling and simulation, architecture design, and silicon implementation, the design tradeoff for multi-tasking radio baseband is discussed.

In this dissertation, taking the emerging mobile broadband standard 3GPP LTE as the focus and other standards (e.g. IEEE 802.11n and DVB) as complements, the system architecture of a multi-tasking radio platform is studied. A general multi-tasking radio baseband chain is partitioned into several functional blocks according to the processing flow, and each block is investigated separately. These blocks include synchronization, channel estimation, demodulation and channel coding. Different algorithms are evaluated for each functional block. A new multiple-input multiple-output symbol detection algorithm, “modified fixed-complexity soft-output” (MFCSO for short), is proposed and implemented in silicon. A unified synchronization unit is presented to support several standards. The architecture of the channel estimator is also addressed. Finally, a high-speed radix-2 Turbo decoder implementation is presented, leading towards the radix-4 scenario. It is worth mentioning that in this dissertation the performance evaluation takes the complete system into consideration rather than independently analyzing an individual block. Based on this, algorithm/hardware co-optimization is carried out. Using the “Single Instruction Multiple Tasks” architecture presented earlier, by exploring the commonality of signal processing functions and choosing the proper level of hardware multiplexing, it is concluded in this dissertation that system thinking allows a harmony to be achieved for multi-tasking radio baseband design.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2009. p. 168
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1279
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-51579 (URN); 978-91-7393-507-4 (ISBN)
Public defence
2009-11-27, Visionen, Hus B, Campus Valla, Linköpings universitet, Linköping, 10:15 (English)
Available from: 2009-11-09 Created: 2009-11-09 Last updated: 2009-11-09. Bibliographically approved