liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Massive Machine-Type Communication Pilot-Hopping Sequence Detection Architectures Based on Non-Negative Least Squares for Grant-Free Random Access
Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering. (Computer Engineering)ORCID iD: 0000-0001-7074-3676
Linköping University, Department of Electrical Engineering, Communication Systems. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-6841-3405
Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0003-4965-1077
Linköping University, Department of Electrical Engineering, Communication Systems. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-7599-4367
Show others and affiliations
2021 (English)In: IEEE Open Journal of Circuits and Systems, ISSN 2644-1225, Vol. 2, p. 253-264Article in journal (Refereed) Published
Abstract [en]

User activity detection in grant-free random access massive machine type communication (mMTC) using pilot-hopping sequences can be formulated as solving a non-negative least squares (NNLS) problem. In this work, two architectures using different algorithms to solve the NNLS problem is proposed. The algorithms are implemented using a fully parallel approach and fixed-point arithmetic, leading to high detection rates and low power consumption. The first algorithm, fast projected gradients, converges faster to the optimal value. The second algorithm, multiplicative updates, is partially implemented in the logarithmic domain, and provides a smaller chip area and lower power consumption. For a detection rate of about one million detections per second, the chip area for the fast algorithm is about 0.7 mm 2 compared to about 0.5 mm 2 for the multiplicative algorithm when implemented in a 28 nm FD-SOI standard cell process at 1 V power supply voltage. The energy consumption is about 300 nJ/detection for the fast projected gradient algorithm using 256 iterations, leading to a convergence close to the theoretical. With 128 iterations, about 250 nJ/detection is required, with a detection performance on par with 192 iterations of the multiplicative algorithm for which about 100 nJ/detection is required.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021. Vol. 2, p. 253-264
Keywords [en]
5G mobile communication, base stations, Internet of Things, machine-to-machine communications, MIMO
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:liu:diva-179789DOI: 10.1109/ojcas.2020.3043643OAI: oai:DiVA.org:liu-179789DiVA, id: diva2:1599759
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile CommunicationsAvailable from: 2021-10-01 Created: 2021-10-01 Last updated: 2024-12-13Bibliographically approved
In thesis
1. Energy-Efficient Implementation of Communication Algorithms through Complexity Reduction
Open this publication in new window or tab >>Energy-Efficient Implementation of Communication Algorithms through Complexity Reduction
2025 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Energy-efficient implementations are essential for the future and modern society, especially in digital signal processing (DSP) and communication systems, where the rapid growth of devices, such as those battery-driven internet of things (IoT) sensors, necessitates low-complexity and low-power solutions. This thesis concentrates on two areas: constant multiplication and active user detection in wire-less networks.

Constant multiplication can be implemented using a shift-and-add (SHA) network. Typically, the number of adders/subtracters is minimized, but the number of cascaded adders/subtracters (depth) also impacts the power consumption. The two classes of algorithms used to solve the problem are adder graph algorithms or sub-expression sharing algorithms. Adder graph algorithms typically yield better results for a single or a low number of inputs, since they do not depend on the representation of the numbers involved. However, they can lead to very high run times and worse results when the number of inputs is high. At the same time, it is known that it is possible to transpose the problem. For example, a sum of products (many inputs) can be transposed to a single input with multiple coefficients, meaning that it is possible to transpose the problem to the more advantageous form, solve it and then transpose the solution back. However, there has been no systematic algorithm available to obtain the transposed result that takes depth into account. In this thesis, a systematic algorithm that obtains the minimum depth of the transposed SHA network subject to the input is introduced.

The practical application of the constant multiplication problem is demonstrated through the implementation of a reconfigurable lowpass equalizer, widely used in communication systems and DSP. Various formulations of the constant multiplication problem, combined with pipelining, are explored to identify the most efficient implementation in a 28 nm FD-SOI standard cell, significantly reducing power consumption and highlighting the real-world impact of our research.

The second research focuses on the challenge of detecting active users in massive machine-type communication (mMTC) scenarios involving large numbers of devices. The problem is addressed using a pilot-hopping sequence method and is formulated as a non-negative least-squares (NNLS) problem. This work implements two NNLS algorithms, fast projected gradient (Fast) and multiplicative updates (Mult), to solve the active user detection problem. These implementations are implemented in a 28 nm FD-SOI process and are optimized for energy efficiency, chip area, and detection speed. The results demonstrate the ability to perform over a million detections per second with significantly lower energy consumption compared to existing methods. However, the implementations lack reconfigurability, and it can be argued whether the high detection rates are relevant for current practical applications.

To enhance practicality and reconfigurability, the Fast algorithm is implemented using a reconfigurable time-multiplexed architecture, reducing resources by reusing them within one iteration. This architecture employs a novel user re-ordering method to enable parallel memory access and continuous operation for successive iterations, thereby increasing the execution speed. The architecture is implemented on numerous FPGA families, demonstrating resource efficiency and reconfigurability by storing the pilot-hopping sequences in memory, while obtaining a more practically usable detection rate of about one to a few thousand detections per second depending on the FPGA family.

Abstract [sv]

Energieffektiva implementationer är avgörande för framtiden och det moderna samhället, särskilt inom digital signalbehandling (DSP) och kommunikationssystem, där den snabba tillväxten av enheter, såsom de inom Internet of Things (IoT), kräver lösningar med låg komplexitet och låg effektförbrukning. Denna avhandling fokuserar på två områden: konstantmultiplikation och detektering av aktiva användare i trådlösa nätverk.

Konstantmultiplikation kan implementeras med ett shift-and-add (SHA) nätverk. Vanligtvis så minimeras antalet adderare/subtraherare, men antalet kaskadkopplade adderare/subtraherare (djupet) påverkar också effektförbrukningen. De två klasserna av algoritmer för att lösa konstaktmultiplikation baseras på adderargrafer eller gemensamma deluttryck. Addergrafer är oftast bättre vid en eller få ingångar, eftersom de inte beror på vilken representation som talen uttrycks med. För många ingångar så kan beräkningstiden öka väldigt snabbt samt att resultaten ofta är sämre eftersom sökrymden ökar fort. Samtidigt så är det välkänt att det får att transponera problemet. Till exempel så kan en summa produkter (många ingångar) transponeras till en ingång som multipliceras med många olika konstanter. Detta leder till att det är möjligt att transponera problemet till den mest fördelaktiga formen, lösa det och sedan transponera lösningen. Det har tidigare inte funnits någon systematisk algoritm för att transponera ett nätverk som tar hänsyn till djupet, men i denna avhandling så presenteras en som garanterar minimalt djup givet en viss adderargraf som argument.

Den praktiska tillämpningen av konstantmultiplikation demonstreras genom implementering av en konfigurerbar lågpassutjämnare, som är allmänt använd i DSP- och kommunikationssystem. Olika formuleringar av konstantmultiplikationsproblemet, kombinerat med pipelining, utforskas för att identifiera den mest effektiva implementationen i en 28 nm FD-SOI-standardcell, vilket avsevärt minskar strömförbrukningen och visar den verkliga påverkan av vår forskning.

Den andra forskningen fokuserar på utmaningen att detektera aktiva användare i massive machinetype communication (mMTC) med ett stort antal enheter. Problemet hanteras med hjälp av en pilothoppingssekvensmetod och formuleras som ett problem med non-negative least-squares (NNLS). Detta arbete implementerar två NNLS-algoritmer, fast projected gradient (Fast) och multiplikativa uppdateringar (Mult), för att lösa problemet med detektering av aktiva användare. Dessa implementationer implementerade i en 28 nm FD-SOI-process och är optimerade för energieffektivitet, chiparea och detekteringshastighet. Resultaten visar att mer än en miljon detektioner per sekund kan utföras med betydligt lägre energiförbrukning än befintliga metoder, till stor del för att en iteration beräknas helt parallelt. Dock saknas möjlighet att ändra användarsekvenser och det kan diskuteras om den höga detektionstakten är praktiskt användbar.

För att öka praktiskheten implementeras Fast-algoritmen med en konfigurerbar tidsmultiplexad arkitektur, vilken sparar resurser genom att återanvända dem i en iteration. Denna arkitektur använder en ny metod att ordna om användarna för att möjliggöra parallell minnesåtkomst och kontinuerliga beräkning-ar för efterföljande iterationer, vilket ökar exekveringshastigheten. Arkitekturen implementeras på olika FPGA-familjer och visar resurseffektivitet och konfigurerbarhet genom att pilothoppningssekvenserna sparas i minne, samtidigt som en mer praktiskt användbar detektionshastighet från ungefär ett tusen till några tusen detektioner per sekund beroende på FPGA-familj.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025. p. 79
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2414
National Category
Computer Engineering
Identifiers
urn:nbn:se:liu:diva-210418 (URN)10.3384/9789180758529 (DOI)9789180758512 (ISBN)9789180758529 (ISBN)
Public defence
2025-01-21, Ada Lovelace, B-building, Campus Valla, Linköping, 13:15 (English)
Opponent
Supervisors
Available from: 2024-12-13 Created: 2024-12-13 Last updated: 2024-12-16Bibliographically approved

Open Access in DiVA

fulltext(2212 kB)344 downloads
File information
File name FULLTEXT01.pdfFile size 2212 kBChecksum SHA-512
9ec14f7c1c6bb8c446adac677e83bd35d41e8f046a39e07dc1bb169efa87fc06dd617fd9ce3e3d1ac06ea2173adfed629390337adcd9226c98331a6a379fdb48
Type fulltextMimetype application/pdf

Other links

Publisher's full text

Authority records

Mohammadi Sarband, NargesBecirovic, EmaKrysander, MattiasLarsson, Erik G.Gustafsson, Oscar

Search in DiVA

By author/editor
Mohammadi Sarband, NargesBecirovic, EmaKrysander, MattiasLarsson, Erik G.Gustafsson, Oscar
By organisation
Computer EngineeringFaculty of Science & EngineeringCommunication Systems
Signal Processing

Search outside of DiVA

GoogleGoogle Scholar
Total: 344 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 824 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf