  • 151.
    Jiang, Guoyou
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Design and Implementation of a DMA Controller for Digital Signal Processor. 2010. Independent thesis Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits. Student thesis
    Abstract [en]

    The thesis work was conducted in the Division of Computer Engineering at the Department of Electrical Engineering, Linköping University. During the thesis work, a configurable Direct Memory Access (DMA) controller was designed and implemented. The DMA controller runs at 200 MHz in a 65 nm digital CMOS technology. The estimated gate count is 26595.

    The DMA controller has two address generators and can provide two clock sources. It can thus handle data reads and writes simultaneously. The DMA controller has 16 built-in channels, and the data width can be 16, 32 or 64 bits. The DMA controller supports 2D data access by configuring its intelligent linking table. The DMA controller is designed for advanced DSP applications and is not dedicated to a cache, which has a fixed priority.

  • 152.
    Jiang, Yang
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Implementation and Evaluation of Architectures for Multi-Stream FIR Filtering. 2017. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Digital filters play a key role in many DSP applications, and FIR filters are usually selected because of their simplicity and stability compared with IIR filters. In this thesis, eight architectures for multi-stream FIR filtering are studied. Primarily, three kinds of architectures are implemented and evaluated: one-to-one mapping, time-multiplexed and pipeline interleaving. During implementation, practical considerations such as implementation approach and number representation are taken into account. Of interest is the performance comparison of the different architectures, including area and power. The trade-off between area and power is a central topic of this work. Furthermore, the impact of the filter order and of pipeline interleaving is studied. The results show that the performance of the architectures differs considerably even with the same sample rate for each stream. They also show that the filter order affects the performance of the architectures differently. Pipeline interleaving improves area utilization at the cost of a rapid increase in power. Moreover, it has a negative impact on the maximum working frequency. All the FIR filter architectures are synthesized in a 65 nm technology.

  • 153.
    Jiao, Haiyan
    et al.
    Nilsson, Anders
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Engineering.
    Tell, Eric
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Engineering.
    Liu, Dake
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Engineering.
    MIPS Cost Estimation for OFDM-VBLAST systems. 2006. In: WCNC, IEEE Wireless Communications and Networking, 2006, 2006. Conference paper (Refereed)
  • 154.
    Johansson, Håkan
    et al.
    Linköping University, Department of Electrical Engineering, Communication Systems. Linköping University, Faculty of Science & Engineering.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Filter-Bank Based All-Digital Channelizers and Aggregators for Multi-Standard Video Distribution. 2015. In: IEEE International Conference on Digital Signal Processing (DSP), 2015, IEEE, 2015, p. 1117-1120. Conference paper (Refereed)
    Abstract [en]

    This paper introduces all-digital flexible channelizers and aggregators for multi-standard video distribution. The overall problem is to aggregate a number of narrow-band subsignals with different bandwidths (6, 7, or 8 MHz) into one composite wide-band signal. In the proposed scheme, this is carried out through a set of analysis filter banks (FBs) that channelize the subsignals into 1/2-MHz subbands, which subsequently are aggregated through one synthesis FB. In this way, full flexibility with a low computational complexity and maintained quality is enabled. The proposed solution offers orders-of-magnitude complexity reductions as compared with a straightforward alternative. Design examples are included that demonstrate the functionality, flexibility, and efficiency.

  • 155.
    Johansson, Håkan
    et al.
    Linköping University, Department of Electrical Engineering, Communication Systems. Linköping University, Faculty of Science & Engineering.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    On frequency-domain implementation of digital FIR filters. 2015. In: IEEE International Conference on Digital Signal Processing (DSP), 2015, IEEE, 2015, p. 315-318. Conference paper (Refereed)
    Abstract [en]

    This paper considers frequency-domain implementation of finite-length impulse response filters. In practical fixed-point arithmetic implementations, the overall system corresponds to a time-varying system which can be represented with either a multirate filter bank, and the corresponding distortion and aliasing functions, or a periodic time-varying impulse-response representation or, equivalently, a set of impulse responses and the corresponding frequency responses. The paper provides systematic derivations and analyses of these representations along with design examples. These representations are useful when analyzing the effect of coefficient quantizations as well as the use of shorter DFT lengths than theoretically required.

  • 156.
    Johansson, Malin
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Synchronization of Acoustic Sensors in a Wireless Network. 2019. Independent thesis Basic level (university diploma), 10,5 credits / 16 HE credits. Student thesis
    Abstract [en]

    Geographically distributed networks of acoustic sensors can be used to identify and localize the origin of acoustic phenomena. One area of use is localization of snipers by detecting the bullet's shock wave and the muzzle blast. At FOI Linköping, this system is planned to be adapted from a wired sensor network into a wireless sensor network (WSN). When changing from wired to wireless communication, synchronization becomes an issue. Synchronization can be achieved in multiple ways, with different benefits depending on the method chosen. This thesis studies synchronization using the highly accurate clock in Global Navigation Satellite System (GNSS) modules. This synchronization method is developed into an independent time stamping device that can be connected to each sensor in the WSN. This ensures that all sensors are synchronized to Coordinated Universal Time (UTC). The thesis starts with a pre-study in which different solutions are investigated and evaluated. After the pre-study, a development stage follows in which the best solution is developed into a model that can easily be implemented in the future. The result is a model consisting of a microcontroller, a timing module and an ADC with built-in filter and amplification.

  • 157.
    Jonsson, Simon
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Designing a Scheduler for Cloud-Based FPGAs. 2018. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The primary focus of this thesis has been to design a network packet scheduler for the 5G (fifth generation) network at Ericsson in Linköping, Sweden. A network packet scheduler manages the order in which the packets in a network are transmitted and places them in a queue accordingly. Depending on the requirements of the system, different packet schedulers work in different ways. The scheduler designed in this thesis has a timing wheel as its core. Packets are placed in the timing wheel according to their final transmission time and are output accordingly. The algorithm is implemented on an FPGA (field-programmable gate array). The FPGA itself is located in a cloud environment. The platform on which the FPGA is located is called "Amazon EC2 F1"; it can be rented with a Linux instance that comes with everything necessary to develop a synthesized file for the FPGA. Part of the thesis discusses the design of the algorithm and how it was customized for a hardware implementation, and part describes using the instance environment for development.
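
    The timing-wheel idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not the thesis's hardware design: the class name `TimingWheel`, the single-level wheel, the slot count and the integer tick times are all assumptions made for illustration.

```python
from collections import deque


class TimingWheel:
    """Single-level timing wheel (illustrative; names are hypothetical).

    Slot i holds the packets whose transmission tick t satisfies
    t % size == i.
    """

    def __init__(self, size):
        self.size = size
        self.slots = [deque() for _ in range(size)]
        self.now = 0  # current tick

    def insert(self, packet, send_time):
        # Only times within one revolution of the wheel can be stored.
        if not self.now <= send_time < self.now + self.size:
            raise ValueError("send_time outside the wheel horizon")
        self.slots[send_time % self.size].append(packet)

    def tick(self):
        """Return every packet due at the current tick, then advance time."""
        slot = self.slots[self.now % self.size]
        due = list(slot)
        slot.clear()
        self.now += 1
        return due


wheel = TimingWheel(size=8)
wheel.insert("a", 2)
wheel.insert("b", 0)
wheel.insert("c", 2)
sent = [wheel.tick() for _ in range(3)]
```

    Insertion and extraction both cost constant time per packet, which is one reason such a structure may map well to hardware; packets that share a tick leave in arrival order.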

  • 158.
    Jonsson, Sofia
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Minimising Memory Access Conflicts for FFT on a DSP. 2019. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The FFT support in a proprietary Ericsson DSP is to be improved in order to achieve high performance without disrupting the current DSP architecture too much. The FFTs and inverse FFTs in question should support FFT sizes from 12 to 2048, where the size is a product of the prime factors 2, 3 and 5. Memory access conflicts in particular could cause low performance in terms of speed compared with the existing hardware accelerator. The problem addressed in this thesis is how to minimise these memory access conflicts. The studied FFT is a mixed-radix DIT FFT where the butterfly results are written back to addresses in a certain order. Furthermore, different buffer structures and sizes are studied, as well as different orders in which to perform the operations within each FFT butterfly stage and different orders in which to shuffle the samples in the initial stage.

    The study shows that for both studied buffer structures there are buffer sizes giving good performance for the majority of the FFT sizes, without largely changing the current architecture. By using certain orders for performing the operations and shuffling within the FFT stages for the remaining FFT sizes, it is possible to reach good performance for these cases as well.
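
    As a small illustration of the size constraint, the sketch below splits an FFT size into radix-2, radix-3 and radix-5 stages. The function name `radix_split` is hypothetical, and the assumption that every size in the range built only from the factors 2, 3 and 5 is supported is not confirmed by the abstract.

```python
def radix_split(n, radices=(5, 3, 2)):
    """Return the radix-2/3/5 factors of n, or None if another prime divides n."""
    factors = []
    for r in radices:
        while n % r == 0:
            factors.append(r)
            n //= r
    return factors if n == 1 else None


# Candidate FFT sizes in the range quoted in the abstract (assumption:
# every size with only the prime factors 2, 3 and 5 qualifies).
sizes = [n for n in range(12, 2049) if radix_split(n) is not None]
```

    Each returned factor corresponds to one butterfly stage of a mixed-radix FFT; the stage order chosen here is arbitrary.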

  • 159.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Dong, Yi
    Institute for Software Integrated Systems, Vanderbilt University, Nashville, USA.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Biswas, Gautam
    Institute for Software Integrated Systems, Vanderbilt University, Nashville, USA.
    Sensor selection for fault diagnosis in uncertain systems. 2018. In: International Journal of Control, ISSN 0020-7179, E-ISSN 1366-5820, p. 1-11. Article in journal (Refereed)
    Abstract [en]

    Finding the cheapest, or smallest, set of sensors such that a specified level of diagnosis performance is maintained is important to decrease cost while controlling performance. Algorithms have been developed to find sets of sensors that make faults detectable and isolable under ideal circumstances. However, due to model uncertainties and measurement noise, different sets of sensors result in different achievable diagnosability performance in practice. In this paper, the sensor selection problem is formulated to ensure that the set of sensors fulfils required performance specifications when model uncertainties and measurement noise are taken into consideration. However, the algorithms for finding the guaranteed global optimal solution are intractable without exhaustive search. To overcome this problem, a greedy stochastic search algorithm is proposed to solve the sensor selection problem. A case study demonstrates the effectiveness of the greedy stochastic search in finding sets close to the global optimum in short computational time.

  • 160.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Eriksson, Lars
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Development of misfire detection algorithm using quantitative FDI performance analysis. 2015. In: Control Engineering Practice, ISSN 0967-0661, E-ISSN 1873-6939, Vol. 34, p. 49-60. Article in journal (Refereed)
    Abstract [en]

    A model-based misfire detection algorithm is proposed. The algorithm is able to detect misfires and identify the failing cylinder during different conditions, such as cylinder-to-cylinder variations, cold starts, and different engine behavior in different operating points. Also, a method is proposed for automatic tuning of the algorithm based on training data. The misfire detection algorithm is evaluated using data from several vehicles on the road and the results show that a low misclassification rate is achieved even during difficult conditions. (C) 2014 Elsevier Ltd. All rights reserved.

  • 161.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    A flywheel error compensation algorithm for engine misfire detection. 2016. In: Control Engineering Practice, ISSN 0967-0661, E-ISSN 1873-6939, Vol. 47, p. 37-47. Article in journal (Refereed)
    Abstract [en]

    A commonly used signal for engine misfire detection is the crankshaft angular velocity measured at the flywheel. However, flywheel manufacturing errors result in vehicle-to-vehicle variations in the measurements and have a negative impact on the misfire detection performance, where the negative impact is quantified for a number of vehicles. A misfire detection algorithm is proposed with flywheel error adaptation in order to increase robustness and reduce the number of mis-classifications. Since the available computational power is limited in a vehicle, a filter with low computational load, a Constant Gain Extended Kalman Filter, is proposed to estimate the flywheel errors. Evaluations using measurements from vehicles on the road show that the number of mis-classifications is significantly reduced when taking the estimated flywheel errors into consideration.

  • 162.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Quantitative isolability analysis of different fault modes. 2015. In: 9th IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes SAFEPROCESS 2015 – Paris, 2–4 September 2015: Proceedings / [ed] Didier Maquin, Elsevier, 2015, Vol. 48(21), p. 1275-1282. Conference paper (Refereed)
    Abstract [en]

    Being able to evaluate quantitative fault diagnosability performance in model-based diagnosis is useful during the design of a diagnosis system. Different fault realizations are more or less likely to occur, and the fault diagnosis problem is complicated by model uncertainties and noise. Thus, it is not obvious how to evaluate performance when all of this information is taken into consideration. Four candidates for quantifying fault diagnosability performance between fault modes are discussed. The proposed measure is called expected distinguishability and is based on the previous distinguishability measure. Two methods to compute expected distinguishability are presented.

  • 163.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Residual change detection using low-complexity sequential quantile estimation. 2017. In: 20th IFAC World Congress / [ed] Denis Dochain, Didier Henrion, Dimitri Peaucelle, 2017, Vol. 50, p. 14064-14069, article id 1. Conference paper (Refereed)
    Abstract [en]

    Detecting changes in residuals is important for fault detection and is commonly performed by thresholding the residual using, for example, a CUSUM test. However, detecting variations in the residual distribution, not causing a change of bias or increased variance, is difficult using these methods. A plug-and-play residual change detection approach is proposed based on sequential quantile estimation to detect changes in the residual cumulative density function. An advantage of the proposed algorithm is that it is non-parametric and has low computational cost and memory usage which makes it suitable for on-line implementations where computational power is limited.
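
    To give a feel for why sequential quantile estimation is cheap, the sketch below tracks one quantile of a data stream with a textbook stochastic-approximation update that stores a single number. It is not the algorithm from the paper; the function name `update_quantile`, the constant step size and the uniform test data are assumptions for illustration.

```python
import random


def update_quantile(q, x, p, step):
    """Move the estimate q one step toward the p-quantile given sample x.

    The estimate rises by step * p when x lies above q and falls by
    step * (1 - p) otherwise, so it settles where P(x <= q) is close to p.
    """
    return q + step * (p - (1.0 if x <= q else 0.0))


rng = random.Random(0)  # fixed seed so the run is repeatable
q = 0.0
for _ in range(20000):
    # Uniform(0, 1) samples stand in for a residual signal; the true
    # 0.9-quantile of this distribution is 0.9.
    q = update_quantile(q, rng.random(), p=0.9, step=0.005)
```

    A constant step keeps the estimate able to follow a distribution that changes over time, at the price of some residual jitter around the true quantile. Tracking several quantiles this way gives a coarse picture of the cumulative distribution that can be compared with a nominal one.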

  • 164.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Ng, Kok Yew
    School of Engineering, Ulster University, Newtownabbey, UK; Electrical and Computer Systems Engineering, School of Engineering, Monash University Malaysia, Malaysia.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Combining model-based diagnosis and data-driven anomaly classifiers for fault isolation. 2018. In: Control Engineering Practice, ISSN 0967-0661, E-ISSN 1873-6939, Vol. 80, p. 146-156. Article in journal (Refereed)
    Abstract [en]

    Machine learning can be used to automatically process sensor data and create data-driven models for prediction and classification. However, in applications such as fault diagnosis, faults are rare events and learning models for fault classification is complicated because of lack of relevant training data. This paper proposes a hybrid diagnosis system design which combines model-based residuals with incremental anomaly classifiers. The proposed method is able to identify unknown faults and also classify multiple-faults using only single-fault training data. The proposed method is verified using a physical model and data collected from an internal combustion engine.

    The full text will be freely available from 2020-09-09 11:37
  • 165.
    Jung, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Yew Ng, Kok
    Monash University, Malaysia.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    A combined diagnosis system design using model-based and data-driven methods. 2016. In: 2016 3RD CONFERENCE ON CONTROL AND FAULT-TOLERANT SYSTEMS (SYSTOL), IEEE, 2016, p. 177-182. Conference paper (Refereed)
    Abstract [en]

    A hybrid diagnosis system design is proposed that combines model-based and data-driven diagnosis methods for fault isolation. A set of residuals are used to detect if there is a fault in the system and a consistency-based fault isolation algorithm is used to compute all diagnosis candidates that can explain the triggered residuals. To improve fault isolation, diagnosis candidates are ranked by evaluating the residuals using a set of one-class support vector machines trained using data from different faults. The proposed diagnosis system design is evaluated using simulations of a model describing the air-flow in an internal combustion engine.

  • 166.
    Kaltea, Eddie
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Department of Electrical Engineering.
    Lundgren, Daniel
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Department of Electrical Engineering.
    Utveckling och konstruktion av analysatorverktyg för styrsignaler i HDMI-gränssnittet [Development and construction of an analyzer tool for control signals in the HDMI interface]. 2009. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [sv; English translation]

    Developing products that support the HDMI standard involves many obstacles that must be overcome. One of the problems is certification against the standard. It is difficult for a development company to test that all requirements of the standard are met, since the test equipment is costly and therefore not available. A simple tool has therefore been developed to make it easier to test compliance with the standard.

    This report begins with a problem statement and basic theory on related topics. A pre-study follows, in which different ways of solving the problem are presented. Then comes an overview of how the tool works and how it was built. At the end of the report there is a post-study and results that describe how the development of the tool went and how the outcome of the pre-study influenced the development.

    The result of the thesis work is a prototype that can be used to facilitate testing of compliance with the HDMI standard in certain respects.

  • 167.
    Kamula, Juha
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Hansson, Rikard
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Rugged Portable Communication System. 2013. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Today's modern warfare puts high demands on military equipment. Where soldiers are concerned, types of communication equipment such as radios, displays and headsets play a central role. A modern soldier is often required to maintain communication links with other military units. These units can, for example, consist of platoon commanders, headquarters and other soldiers. If the soldier needs to make a report to several units, the message needs to be sent to several radio networks that are connected to these separate units. This multiplicity in turn requires several items of radio equipment connected to the radio network frequencies. Considering all the communication equipment used by a modern soldier, the parallel data flows and the weight a soldier needs to carry can become quite extensive.

    At Saab AB it has been proven that a combination of powerful embedded hardware platforms and cross-platform software fulfills the communication needs. However, the weight issue still remains, as these embedded platforms are quite bulky and hard to carry. In order to increase portability, a tailored Android application for a smaller low-power embedded hardware platform has been developed at Saab AB. Saab AB has also developed a portable analogue interconnection unit for connecting three radios and a headset, the SKE (Sammankopplingsenhet).

    Saab AB intends to develop a new product for soldiers, the RPCS (Rugged Portable Communication System), capable of running the Android application and incorporating the audio processing functionality of the SKE. This thesis focuses on developing a hardware platform prototype for the RPCS using Beagleboard. The SKE audio processing functionality is developed as a software application running on the Beagleboard.

  • 168.
    Kanders, Hans
    et al.
    Linköping University, Department of Electrical Engineering. Linköping University, Faculty of Science & Engineering.
    Mellqvist, Tobias
    Linköping University, Department of Electrical Engineering. Linköping University, Faculty of Science & Engineering.
    Garrido Gálvez, Mario
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Palmkvist, Kent
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    A 1 Million-Point FFT on a Single FPGA. 2019. In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 66, no 10, p. 3863-3873. Article in journal (Refereed)
    Abstract [en]

    In this paper, we present the first implementation of a 1 million-point fast Fourier transform (FFT) completely integrated on a single field-programmable gate array (FPGA), without the need for external memory or multiple interconnected FPGAs. The proposed architecture is a pipelined single-delay feedback (SDF) FFT. The architecture includes a specifically designed 1 million-point rotator with high accuracy and a thorough study of the word length at the different FFT stages in order to increase the signal-to-quantization-noise ratio (SQNR) and keep the area low. This also results in low power consumption.

  • 169.
    Karlsson, Andreas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Cost-efficient Mapping of 3- and 5-point DFTs to General Baseband Processors. 2015. In: International Conference on Digital Signal Processing (DSP), Singapore, 21-24 July, 2015, Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 780-784. Conference paper (Refereed)
    Abstract [en]

    Discrete Fourier transforms of 3 and 5 points are essential building blocks in FFT implementations for standards such as 3GPP-LTE. In addition to being more complex than 2- and 4-point DFTs, these DFTs also cause problems with data access in SDR-DSPs, since the data access width, in general, is a power of 2. This work derives mappings of these DFTs to a 4-way SIMD datapath that has been designed with 2- and 4-point DFTs in mind. Our instruction set proposals, based on a modified Winograd DFT, achieve single-cycle execution of 3-point DFTs and 2.25-cycle average execution of 5-point DFTs in a cost-effective manner by reusing the already available arithmetic units. This represents an approximate speed-up of 3 times compared to an SDR-DSP with only MAC support. In contrast to our more general design, we also demonstrate that a typical single-purpose FFT-specialized 5-way architecture only delivers 9% to 25% extra performance on average, while requiring 85% more arithmetic units and a more expensive memory subsystem.
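
    For orientation, a 3-point DFT can be written with one shared sum and one shared difference, which is the textbook Winograd-style factoring. The sketch below shows only that arithmetic; it is not the paper's modified algorithm or its SIMD instruction mapping, and the function names `dft3` and `dft_direct` are hypothetical.

```python
import cmath
import math


def dft3(x0, x1, x2):
    """3-point DFT built from one shared sum and one shared difference."""
    s = x1 + x2
    d = x1 - x2
    t = x0 - 0.5 * s                  # x0 + cos(2*pi/3) * s
    m = -1j * (math.sqrt(3) / 2) * d  # -j * sin(2*pi/3) * d
    return x0 + s, t + m, t - m


def dft_direct(x):
    """Reference O(N^2) DFT, used only to check dft3."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]


samples = (1 + 2j, -0.5j, 3 + 0j)
fast = dft3(*samples)
reference = dft_direct(samples)
```

    Only two nontrivial constant multiplications remain (by 0.5 and by sqrt(3)/2), compared with the four complex twiddle multiplications of the direct form.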

  • 170.
    Karlsson, Andreas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Energy-efficient sorting with the distributed memory architecture ePUMA. 2015. In: IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 116-123. Conference paper (Refereed)
    Abstract [en]

    This paper presents the novel heterogeneous DSP architecture ePUMA and demonstrates its features through an implementation of sorting of larger data sets. We derive a sorting algorithm with fixed-size merging tasks suitable for distributed memory architectures, which allows very simple scheduling and predictable data-independent sorting time. The implementation on ePUMA utilizes the architecture's specialized compute cores and control cores, and local memory parallelism, to separate and overlap sorting with data access and control for close to stall-free sorting. Penalty-free unaligned and out-of-order local memory access is used in combination with proposed application-specific sorting instructions to derive highly efficient local sorting and merging kernels used by the system-level algorithm. Our evaluation shows that the proposed implementation can rival the sorting performance of high-performance commercial CPUs and GPUs, with two orders of magnitude higher energy efficiency, which would allow high-performance sorting on low-power devices.

  • 171.
    Karlsson, Andreas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    ePUMA: A Processor Architecture for Future DSP. 2015. In: International Conference on Digital Signal Processing (DSP), Singapore, 21-24 July, 2015, 2015, p. 253-257. Conference paper (Refereed)
    Abstract [en]

    Since the breakdown of Dennard scaling, the primary design goal for processor designs has shifted from increasing performance to increasing performance per Watt. The ePUMA platform is a flexible and configurable DSP platform that tries to address many of the problems with traditional DSP designs, to increase performance, but use less power. We trade the flexibility of traditional VLIW DSP designs for a simpler single instruction issue scheme and instead make sure that each instruction can perform more work. Multi-cycle instructions can operate directly on vectors and matrices in memory and the datapaths implement common DSP subgraphs directly in hardware, for high compute throughput. Memory bottlenecks, which are common in other architectures, are handled with flexible LUT-based multi-bank memory addressing and memory parallelism. A major contributor to energy consumption, data movement, is reduced by using heterogeneous interconnect and clustering compute resources around local memories for simple data sharing. To evaluate ePUMA we have implemented the majority of the kernel library from a commercial VLIW DSP manufacturer for comparison. Our results not only show good performance, but also an order of magnitude increase in energy and area efficiency. In addition, the kernel code size is reduced by 91% on average compared to the VLIW DSP. These benefits make ePUMA an attractive solution for future DSP.

  • 172.
    Karlsson, Andreas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Software-based QPP Interleaving for Baseband DSPs with LUT-accelerated Addressing. 2015. In: International Conference on Digital Signal Processing (DSP), Singapore, 21-24 July, 2015, Institute of Electrical and Electronics Engineers (IEEE), 2015. Conference paper (Refereed)
    Abstract [en]

    This paper demonstrates how QPP interleaving and de-interleaving for Turbo decoding in 3GPP-LTE can be implemented efficiently on baseband processors with lookup-table (LUT) based addressing support of multi-bank memory. We introduce a LUT-compression technique that reduces LUT size to 1% of what would otherwise be needed to store the full data access patterns for all LTE block sizes. By reusing the already existing program memory of a baseband processor to store LUTs and using our proposed general address generator, our 8-way data access path can reach the same throughput as a dedicated 8-way interleaving ASIC implementation. This avoids the addition of a dedicated interleaving address generator to a processor which, according to ASIC synthesis, would be 75% larger than our proposed address generator. Since our software implementation only involves the address generator, the processor's datapaths are free to perform the other operations of Turbo decoding in parallel with interleaving. Our software implementation ensures programmability and flexibility and is the fastest software-based implementation of QPP interleaving known to us.
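
    For readers unfamiliar with QPP (quadratic permutation polynomial) interleaving, a minimal sketch of the permutation itself follows. It computes the index table π(i) = (f1·i + f2·i²) mod K directly and is not the LUT-compression scheme or address generator described in the paper. The parameters K = 40, f1 = 3, f2 = 10 are an assumption used for illustration (believed to match the smallest LTE block size), and all function names are hypothetical.

```python
def qpp_indices(K, f1, f2):
    """Return the QPP index table pi(i) = (f1*i + f2*i^2) mod K for i in 0..K-1."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def interleave(data, pi):
    """Output position i takes the input element at index pi[i]."""
    return [data[p] for p in pi]


def deinterleave(data, pi):
    """Inverse of interleave: element i goes back to position pi[i]."""
    out = [None] * len(data)
    for i, p in enumerate(pi):
        out[p] = data[i]
    return out


# Assumed example parameters (illustrative, see lead-in).
pi = qpp_indices(40, 3, 10)
block = list(range(100, 140))
restored = deinterleave(interleave(block, pi), pi)
```

    Storing such a table for every block size is what makes the uncompressed access patterns large, which is the cost the abstract's LUT compression addresses.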

  • 173.
    Karlsson, Andréas
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Algorithm Adaptation and Optimization of a Novel DSP Vector Co-processor. 2010. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The Division of Computer Engineering at Linköping University is currently researching the possibility of creating a highly parallel DSP platform that can keep up with the computational needs of upcoming standards for various applications, at low cost and low power consumption. The architecture is called ePUMA and it combines a general RISC DSP master processor with eight SIMD co-processors on a single chip. The master processor will act as the main processor for general tasks and execution control, while the co-processors will accelerate computing-intensive and parallel DSP kernels. This thesis investigates the performance potential of the co-processors by implementing matrix algebra kernels for QR decomposition, LU decomposition, matrix determinant and matrix inverse, which run on a single co-processor. The kernels are then evaluated to find possible problems with the co-processors' microarchitecture and to suggest solutions to the problems that might exist. The evaluation shows that the performance potential is very good, but a few problems have been identified that cause significant overhead in the kernels. Pipeline mismatches, which occur due to different pipeline lengths for different instructions, cause pipeline hazards, and the current solution to this does not allow effective use of the pipeline. In some cases, the single-port memories will cause bottlenecks, but the thesis suggests that the situation could be greatly improved by using buffered memory write-back. Also, the lack of register forwarding makes kernels with many data dependencies run unnecessarily slowly.

  • 174.
    Karlsson, Andréas
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Design of Energy-Efficient High-Performance ASIP-DSP Platforms. 2016. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In the last ten years, limited clock frequency scaling and increasing power density have shifted IC design focus towards parallelism, heterogeneity and energy efficiency. Improving energy efficiency is by no means simple and it calls for a reevaluation of old design choices in processor architecture, and perhaps more importantly, development of new programming methodologies that exploit the features of modern architectures.

    This thesis discusses the design of energy-efficient digital signal processors with application-specific instruction sets, so-called ASIP-DSPs, and their programming tools. Target applications for such processors include, but are not limited to, communications, multimedia, image processing, intelligent vision and radar. These applications are often implemented by a limited set of kernel algorithms, whose performance and efficiency are critical to the application's success. At the same time, the extreme non-recurring engineering cost of system-on-chip designs means that product life-time must be kept as long as possible. Neither general-purpose processors nor non-programmable ASICs can meet both the flexibility and efficiency requirements, and ASIPs may instead be the best trade-off between all the conflicting goals.

    Traditional superscalar- and VLIW processor design focus has been to improve the throughput of fine-grained instructions, which results in high flexibility, but also high energy consumption. SIMD architectures, on the other hand, are often restricted by inefficient data access. The result is architectures which spend more energy and/or time on supporting operations rather than actual computing.

    This thesis defines the performance limit of an architecture with an N-way parallel datapath as consuming 2N elements of compute data per clock cycle. To approach this performance, this work proposes coarse-grained higher-order functional (HOF) instructions, which encode the most frequently executed compute, data access and control sequences into single many-cycle instructions, to reduce the overheads of instruction delivery, while at the same time maintaining orthogonality. The work further investigates opportunities for operation fusion to improve computing performance, and proposes a flexible memory subsystem for conflict-free parallel memory access with permutation and lookup-table-based addressing, to ensure that high computing throughput can be sustained even in the presence of irregular data access patterns. These concepts are extensively studied by implementing a large kernel algorithm library with typical DSP kernels, to prove their effectiveness and adequacy. Compared to contemporary VLIW DSP solutions, our solution can practically eliminate instruction fetching energy in many scenarios, significantly reduce control path switching, simplify the implementation of kernels and reduce code size, sometimes by as much as 30 times.

    The techniques proposed in this thesis have been implemented in the DSP platform ePUMA (embedded Parallel DSP processor with Unique Memory Access), a configurable control-compute heterogeneous platform with distributed memory, optimized for low-power predictable DSP computing. Hardware evaluation has been done with FPGA prototypes. In addition, several VLSI layouts have been created for energy and area estimations. This includes smaller designs, as well as a large design with 73 cores, capable of 1280 integer GOPS or 256 GFLOPS at 500 MHz and which measures 45 mm² in 28 nm FD-SOI technology.

    In addition to the hardware design, this thesis also discusses parallel programming flow for distributed memory architectures and ePUMA application implementation. A DSP kernel programming language and its compiler are presented. This effectively demonstrates how kernels written in a high-level language can be translated into HOF instructions for very high processing efficiency.

  • 175.
    Karlsson, Andréas
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Wang, Jian
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    ePUMA: A unique memory access based parallel DSP processor for SDR and CR. 2013. In: Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE, IEEE, 2013, p. 1234-1237. Conference paper (Refereed)
    Abstract [en]

    This paper presents ePUMA, a master-slave heterogeneous DSP processor for communications and multimedia. We introduce the ePUMA VPE, a vector processing slave-core designed for heavy DSP workloads, and demonstrate how its features can be used to implement DSP kernels that efficiently overlap computing, data access and control to achieve maximum datapath utilization. The efficiency is evaluated by implementing a basic set of kernels commonly used in SDR. The experiments show that all kernels asymptotically reach above 90% effective datapath utilization, while many approach 100%; thus the design effectively overlaps computing, data access and control. Compared to popular VLIW solutions, the need for a large register file with many ports is eliminated, thus saving power and chip area. When compared to a commercial VLIW solution, our solution also achieves code size reductions of up to 30 times and a significantly simplified kernel implementation.

  • 176.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Ehliar, Andreas
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    High performance, low-latency field-programmable gate array-based floating-point adder and multiplier units in a Virtex 4. 2008. In: IET Computers and digital techniques, ISSN 1751-8601, Vol. 2, p. 305-313. Article in journal (Refereed)
    Abstract [en]

    There is increasing interest about floating-point arithmetics in field programmable gate arrays (FPGAs) because of the increase in their size and performance. FPGAs are generally good at bit manipulations and fixed-point arithmetics, but they have a harder time coping with floating-point arithmetics. An architecture used to construct high-performance floating-point components in a Virtex-4 FPGA is described in detail. Floating-point adder/subtracter and multiplier units have been constructed. The adder/subtracter can operate at a frequency of 377 MHz in a Virtex-4SX35 (speed grade -12).

  • 177.
    Karlström, Per Axel
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    NoGAP: Novel Generator of Accelerators and Processors. 2010. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    ASIPs are needed to handle the future demand for flexible yet high-performance embedded computing. The flexibility of ASIPs makes them preferable over fixed-function ASICs. Also, a well-designed ASIP has a power consumption comparable to that of ASICs. However, the cost associated with ASIP design is a limiting factor for more widespread adoption. A number of different tools have been proposed, promising to ease this design process. However, all of the current state-of-the-art tools limit the designer due to a template-based design process. It blocks design freedoms and limits the I/O bandwidth of the template. We have therefore proposed the Novel Generator of Accelerator and Processors (NoGAP). NoGAP is a design automation tool for ASIP and accelerator design that puts very few limits on what can be designed, yet NoGAP gives support by automating much of the tedious and error-prone tasks associated with ASIP design.

    This thesis presents NoGAP and many of its key concepts, such as NoGAP-CL, a language used to implement processors in NoGAP. This thesis exposes NoGAP's key technologies, which include automatic bus and wire sizing, instruction decoder and pipeline management, how PC-FSMs can be generated, how an assembler can be generated, and how cycle-accurate simulators can be generated.

    We have so far proven NoGAP's strengths in three extensive case studies: in one, a floating-point pipelined data path was designed; in another, a simple RISC processor was designed; and finally, one advanced RISC-style DSP was designed using NoGAP. All these case studies point to the same conclusion: that NoGAP speeds up development time, clarifies complex pipeline architectures, retains design flexibility, and most importantly does not incur much performance penalty compared to hand-optimized RTL code.

    We believe that the work presented in this thesis shows that NoGAP, using our proposed novel approach to micro architecture design, can have a significant impact on both academic and industrial hardware design. To the best of our knowledge, NoGAP is the first system that has demonstrated that a template-free processor construction framework can be developed and generate high-performance hardware solutions.

  • 178.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Ehliar, Andreas
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    High Performance, Low Latency FPGA based Floating Point Adder and Multiplier Units in a Virtex 4. 2006. In: NORCHIP 2006: The Nordic Microelectronics Event. 2006, 2006, p. 31-34. Conference paper (Refereed)
    Abstract [en]

    Since the invention of FPGAs, the increase in their size and performance has allowed designers to use FPGAs for more complex designs. FPGAs are generally good at bit manipulations and fixed point arithmetics but have a harder time coping with floating point arithmetics. In this paper we describe methods used to construct high performance floating point components in a Virtex-4. We have constructed a floating point adder/subtracter and multiplier which we then used to construct a complex radix-2 butterfly. Our adder/subtracter can operate at a frequency of 361 MHz in a Virtex-4SX35 (speed grade -12).

  • 179.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    NoGAP: A Micro Architecture Construction Framework. 2009. In: Embedded Computer Systems: Architectures, Modeling, and Simulation: 9th International Workshop, SAMOS 2009, Samos, Greece, July 20-23, 2009. Proceedings / [ed] Koen Bertels, Nikitas Dimopoulos, Cristina Silvano, Stephan Wong, Berlin: Springer Berlin/Heidelberg, 2009, 1, p. 171-180. Conference paper (Refereed)
    Abstract [en]

    Flexible Application Specific Instruction set Processors (ASIP) are starting to replace monolithic ASICs in a wide variety of fields. However, the design of an ASIP is today a substantial design effort. This paper discusses NoGAP (Novel Generator for ASIP), a tool for ASIP designs utilizing hardware multiplexed data paths. One of the main advantages of NoGAP compared to other ADL tools is that it does not impose limits on the architecture and thus design freedom. To reach this flexibility NoGAP makes heavy use of the compositional design principle and is therefore divided into three parts: Mage, Mase, and Castle. This paper presents the central concepts of NoGAP to show that it is possible to reach this advertised flexibility and still be able to generate HDL code and tools such as simulators and assemblers.

  • 180.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Zhou, Wenbiao
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Automatic Assembler Generator for NoGAP. 2010. In: Ph.D. Research in Microelectronics and Electronics, 2010. Conference paper (Refereed)

  • 181.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Zhou, Wenbiao
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Automatic Port and Bus Sizing in NoGAP. 2010. In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, 2010, p. 258-264. Conference paper (Refereed)
    Abstract [en]

    ASIP processors and programmable accelerators are replacing monolithic ASICs in more and more areas. However, the design and implementation of a new ASIP processor or programmable accelerator requires a substantial design effort. There are a number of existing tools that promise to ease this design effort, but using these tools usually means that the designer gets locked into the tools' a priori assumptions, and it is therefore hard to develop truly novel ASIPs or accelerators. NoGAP is a tool that delivers design support while not locking the designer into any predefined template architecture. An important aspect of NoGAP's design process is the ability to design the data path of each instruction individually. Therefore the size of input/output ports can sometimes not be known while designing the individual functional units. For this reason we have introduced the concept of dynamic port sizes, which is an extension of the parameter/generic concept in Verilog/VHDL. A problem arises if the data path graph contains loops, either due to intra- or inter-instruction dependencies. This paper presents the algorithm used to solve this looping problem.

  • 182.
    Karlström, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Zhou, Wenbiao
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Operation Classification for Control Path Synthetization with NoGAP. 2010. Conference paper (Refereed)
    Abstract [en]

    Flexible Application Specific Instruction set Processors (ASIP) are starting to replace monolithic ASICs in a wide variety of fields. However, the construction of an ASIP is today associated with a substantial design effort. NoGAP (Novel Generator of Micro Architecture and Processor) is a tool for ASIP designs utilizing hardware multiplexed data paths. One of the main advantages of NoGAP compared to other ADL tools is that it does not impose limits on the architecture and thus design freedom. NoGAP does not assume a fixed processor template and is not another data flow synthesizer. To reach this flexibility NoGAP makes heavy use of the compositional design principle and is therefore divided into three parts: Mage, Mase, and Castle. This paper discusses the techniques used in NoGAP for control path synthetization. A RISC processor has been constructed with NoGAP in less than a working day and synthesized to an FPGA. With no FPGA-specific optimizations this processor met timing closure at 178 MHz in a Virtex-4 LX80 speedgrade 12.

  • 183.
    Keller, Markus
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Implementation of LTE Baseband Algorithms for a Highly Parallel DSP Platform. 2016. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The division of computer engineering at Linköping University is currently developing an innovative parallel DSP processor architecture called ePUMA. One possible future purpose of the ePUMA that has been thought of is to implement it in base stations for mobile communication. In order to investigate the performance and potential of the ePUMA as a processing unit in base stations, a model of the LTE physical layer uplink receiving chain has been simulated in Matlab and then partially mapped onto the ePUMA processor.

    The project work included research and understanding of the LTE standard and simulating the uplink processing chain in Matlab for a transmission bandwidth of 5 MHz. Major tasks of the DSP implementation included the development of a 300-point FFT algorithm and a channel equalization algorithm for the SIMD units of the ePUMA platform. This thesis provides the reader with an introduction to the LTE standard as well as an introduction to the ePUMA processor. Furthermore, it can serve as guidance for developing mixed point radix FFTs in general or the 300-point FFT in particular, and can help with a basic understanding of channel equalization. The work of the thesis included the whole development chain, from understanding the algorithms, simplifying and mapping them onto a DSP platform, to testing and verification of the results.

  • 184.
    Khorasgani, Hamed
    et al.
    Vanderbilt University, TN 37235 USA.
    Jung, Daniel
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Biswas, Gautam
    Vanderbilt University, TN 37235 USA.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Krysander, Mattias
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Robust Residual Selection for Fault Detection. 2014. In: 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), IEEE, 2014, p. 5764-5769. Conference paper (Refereed)
    Abstract [en]

    A number of residual generation methods have been developed for robust model-based fault detection and isolation (FDI). There have also been a number of offline (i.e., design-time) methods that focus on optimizing FDI performance (e.g., trading off detection performance versus cost). However, design-time algorithms are not tuned to optimize performance for different operating regions of system behavior. To do this, one would need to define online measures of sensitivity and robustness, and use them to select the best residual set online as system behavior transitions between operating regions. In this paper we develop a quantitative measure of residual performance, called the detectability ratio, that applies to additive and multiplicative uncertainties when determining the best residual set in different operating regions. We discuss this methodology and demonstrate its effectiveness using a case study.

  • 185.
    Kleback, Oskar
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Study on Low Voltage Power Electronics Used for Actuator Control. 2017. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The aim of this thesis is to understand the current implementation and how different hardware and output frequencies affect the hydraulic actuators in the current platform, and then to present an improved controller. The new controller needs to be faster than the current one and should not use more CPU resources than necessary. With the understanding of the current controller, three new regulators were implemented and tested. One uses a PI regulator and the other two use an adaptive algorithm to generate the control signal. All were faster than the current one, and the PI implementation uses the lowest amount of CPU resources; on the other hand, it needs to be calibrated for the different hardware and output frequencies. The two adaptive controllers require a higher amount of CPU resources, but in return require less calibration to work.

  • 186.
    Kolumban, Gaspar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Low Cost Floating-Point Extensions to a Fixed-Point SIMD Datapath. 2013. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The ePUMA architecture is a novel master-multi-SIMD DSP platform aimed at low-power computing, for example in embedded or hand-held devices. It is both a configurable and scalable platform, designed for multimedia and communications.

    Numbers with both integer and fractional parts are often used in computers because many important algorithms make use of them, for example in signal and image processing. A good way of representing these types of numbers is with a floating-point representation. The ePUMA platform currently supports a fixed-point representation, so the goal of this thesis is to implement twelve basic floating-point arithmetic operations and two conversion operations on an already existing datapath, conforming as much as possible to the IEEE 754-2008 standard for floating-point representation. The implementation should be done at a low hardware and power consumption cost. The target frequency is 500 MHz. The implementation is compared with dedicated DesignWare components and with floating-point done in software in ePUMA.

    This thesis presents a solution that on average increases the VPE datapath hardware cost by 15% and the power consumption by 15%. The highest clock frequency with the solution is 473 MHz. The target clock frequency of 500 MHz is thus not achieved, but considering the lack of register retiming in the synthesis step, 500 MHz can most likely be reached with this design.

  • 187.
    Kovalev, Anton
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Garrido, Mario
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Implementation approaches for 512-tap 60 GSa/s chromatic dispersion FIR filters. 2017. In: Conference Record of The Fifty-First Asilomar Conference on Signals, Systems & Computers / [ed] Michael B. Matthews, Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 1779-1783. Conference paper (Refereed)
    Abstract [en]

    In optical communication the non-ideal properties of the fibers lead to pulse widening from chromatic dispersion. One way to compensate for this is through digital signal processing. In this work, two architectures for compensation are compared. Both are designed for 60 GSa/s and 512 filter taps and implemented in the frequency domain using FFTs. It is shown that the high-speed requirements introduce constraints on possible architectural choices. Furthermore, the theoretical multiplication complexity estimates are not good predictors for the energy consumption. The results show that the implementation with 10% more multiplications per sample has half the power consumption and one third of the area consumption. The best architecture for this specification results in a power consumption of 3.12 W in a 65 nm technology, corresponding to an energy per complex filter tap of 0.10 mW/GHz.

  • 188.
    Krysander, Mattias
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, Faculty of Science & Engineering.
    Lind, Ingela
    Saab Aeronaut, Linkoping, Sweden.
    Nilsson, Ylva
    Saab Aeronaut, Linkoping, Sweden.
    Diagnosis Analysis of Modelica Models. 2018. In: IFAC PAPERSONLINE, ELSEVIER SCIENCE BV, 2018, Vol. 51, no 24, p. 153-159. Conference paper (Refereed)
    Abstract [en]

    To leverage model-based engineering for fault diagnosis, it is useful to be able to do direct analysis of general purpose modelling languages for engineering systems. In this work, it is demonstrated how non-trivial Modelica models, for example utilizing the Modelica standard library, can be automatically transformed into a format where existing fault diagnosis analysis techniques are applicable. The procedure is demonstrated on a model of an air cooling system in the Gripen fighter aircraft developed by Saab, Sweden. It is discussed why the Modelica language is well suited for diagnosability analysis, and a number of non-trivial diagnosability analyses show the efficacy of the approach. The methods extract the model structure, which gives additional insight into the system, e.g., highlighting model connections and possible model decompositions. (C) 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

  • 189.
    Krysander, Mattias
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Heintz, Fredrik
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer systems. Linköping University, The Institute of Technology.
    Roll, Jacob
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Frisk, Erik
    Linköping University, Department of Electrical Engineering, Vehicular Systems. Linköping University, The Institute of Technology.
    Dynamic Test Selection for Reconfigurable Diagnosis. 2008. In: Proceedings of the 47th IEEE Conference on Decision and Control, IEEE, 2008, p. 1066-1072. Conference paper (Refereed)
    Abstract [en]

    Detecting and isolating multiple faults is a computationally intense task which typically consists of computing a set of tests, and then computing the diagnoses based on the test results. This paper proposes a method to reduce the computational burden by only running the tests that are currently needed, and dynamically starting new tests when the need changes. A main contribution is a method to select tests such that the computational burden is reduced while maintaining the isolation performance of the diagnostic system. Key components in the approach are the test selection algorithm, the test initialization procedures, and a knowledge processing framework that supports the functionality needed. The approach is exemplified on a relatively small dynamical system, which still illustrates the complexity and possible computational gain with the proposed approach.

  • 190.
    Kumm, Martin
    et al.
    University of Kassel, Digital Technology Group, Germany.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    de Dinechin, Florent
    Univ Lyon, INSA Lyon, Inria, CITI, France.
    Kappauf, Johannes
    University of Kassel, Digital Technology Group, Germany.
    Zipf, Peter
    University of Kassel, Digital Technology Group, Germany.
    Karatsuba with Rectangular Multipliers for FPGAs. 2018. In: 2018 IEEE 25TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH), IEEE, 2018, p. 13-20. Conference paper (Refereed)
    Abstract [en]

    This work presents an extension of Karatsuba's method to efficiently use rectangular multipliers as a base for larger multipliers. The rectangular multipliers that motivate this work are the embedded 18x25-bit signed multipliers found in the DSP blocks of recent Xilinx FPGAs: the traditional Karatsuba approach must under-use them as square 18x18 ones. This work shows that rectangular multipliers can be efficiently exploited in a modified Karatsuba method if their input word sizes have a large greatest common divisor. In the Xilinx FPGA case, this can be obtained by using the embedded multipliers as 16x24 unsigned and as 17x25 signed ones. The obtained architectures are implemented with due attention to architectural features such as the pre-adders and post-adders available in Xilinx DSP blocks. They are synthesized and compared with traditional Karatsuba, but also with (non-Karatsuba) state-of-the-art tiling techniques that make use of the full rectangular multipliers. The proposed technique improves resource consumption and performance for multipliers of numbers larger than 64 bits.
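
    For orientation, the traditional (square) Karatsuba recursion that the abstract takes as its starting point can be sketched as follows. This is not the rectangular extension proposed in the paper; the function name `karatsuba` and the parameter `base_bits` are illustrative assumptions, with `base_bits=18` merely standing in for a square 18x18 embedded multiplier.

```python
def karatsuba(a, b, base_bits=18):
    """Multiply two non-negative integers with classical Karatsuba.

    Operands narrower than base_bits are multiplied directly, which stands in
    for one embedded hardware multiplier (an assumption for illustration).
    """
    if a < (1 << base_bits) and b < (1 << base_bits):
        return a * b
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    low = karatsuba(a_lo, b_lo, base_bits)
    high = karatsuba(a_hi, b_hi, base_bits)
    # Three sub-products instead of four: the middle term reuses low and high.
    mid = karatsuba(a_lo + a_hi, b_lo + b_hi, base_bits) - low - high
    return (high << (2 * half)) + (mid << half) + low
```

    The saving comes from computing the middle term with one extra multiplication and two subtractions rather than two separate cross products.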

  • 191.
    Kumm, Martin
    et al.
    Univ Kassel, Germany.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Garrido Gálvez, Mario
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Zipf, Peter
    Univ Kassel, Germany.
    Optimal Single Constant Multiplication Using Ternary Adders. 2018. In: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 65, no 7, p. 928-932. Article in journal (Refereed)
    Abstract [en]

    Single constant coefficient multiplication is a frequently used operation in many numeric algorithms. Extensive previous work is available on how to reduce constant multiplications to additions, subtractions, and bit shifts. However, in previous work, only common two-input adders were used. As modern field-programmable gate arrays (FPGAs) support efficient ternary adders, i.e., adders with three inputs, this brief investigates constant multiplications that are built from ternary adders in an optimal way. The results show that multiplication by any constant of up to 22 bits can be realized with only three ternary adders. Average adder reductions of more than 33% compared to optimal constant multiplication circuits using two-input adders are achieved for coefficient word sizes of more than five bits. Synthesis experiments show average FPGA slice reductions on the order of 25% and a similar or higher speed than their two-input adder counterparts.
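
    As a small illustration of the idea, multiplication by the constant 11 = 8 + 2 + 1 needs two cascaded two-input adders but only one three-input adder. The constant and the function names below are chosen here for illustration and are not taken from the paper; shifts are assumed to be free wiring, as is usual for this kind of circuit.

```python
def ternary_add(a, b, c):
    """Model of a single three-input adder."""
    return a + b + c


def times_11_ternary(x):
    # 11x = 8x + 2x + x, realized with one ternary adder.
    return ternary_add(x << 3, x << 1, x)


def times_11_two_input(x):
    # Same constant with two-input adders: two additions in sequence.
    three_x = (x << 1) + x
    return (x << 3) + three_x
```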

  • 192.
    Källming, Daniel
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Hultenius, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Improving and Extending a High Performance Processor Optimized for FPGAs. 2010. Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This thesis describes a number of improvements and additions made to a soft CPU optimized for field programmable gate arrays (FPGAs). The goal has been to implement the changes without substantially lowering the CPU's ability to operate at high clock frequencies. The result of the thesis is a number of high clock frequency modules which, when added, complete the CPU hardware functionality in certain areas. The maximum frequency of the CPU is, however, somewhat lowered after the modules have been added.

  • 193.
    Källström, Petter
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Gustafsson, Oscar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Faculty of Science & Engineering.
    Fast and Area Efficient Adder for Wide Data in Recent Xilinx FPGAs. 2016. In: 26th International Conference on Field-Programmable Logic and Applications, Lausanne: IEEE, 2016, p. 338-341. Conference paper (Refereed)
    Abstract [en]

    Most modern FPGAs have highly optimised carry logic for efficient implementation of ripple carry adders (RCA). Some FPGAs also have a six-input look-up table (LUT) per cell, of which two inputs are used during normal addition. In this paper we present an architecture that compresses the carry chain length to N/2 in recent Xilinx FPGAs by utilising the LUTs better. This carry compression is implemented by letting some cells calculate the carry chain in two bits per cell, while others calculate the sum output bits. In total, the proposed design uses no more hardware than the normal adder. The results show that the proposed adder is faster than a normal adder for word lengths larger than 64 bits in Virtex-6 FPGAs.

  • 194.
    Lepenica, Nermin
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Assertion Based Verification on Senior DSP. 2011. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Digital designs are often very large and complex, which makes locating and fixing a bug hard and time consuming. Often more than half of the development time is spent on verification. Assertion based verification is a method that uses assertions to help reduce the verification time. Simulating with assertions provides more information that can be used to locate and correct a bug. In this master thesis, assertions are discussed and implemented in the Senior DSP processor.

  • 195.
    Lind, Tobias
    Linköping University, Department of Electrical Engineering, Computer Engineering.
    Evaluation of Instruction Prefetch Methods for Coresonic DSP Processor. 2016. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    With increasing demands on mobile communication transfer rates, the circuits in mobile phones must be designed for higher performance while maintaining low power consumption for increased battery life. One possible way to improve an existing architecture is to implement instruction prefetching. By predicting ahead of time which instructions will be executed, the instructions can be prefetched from memory to increase performance, and some instructions which will be executed again shortly can be stored temporarily to avoid fetching them from the memory multiple times.

    By creating a trace-driven simulator, the existing hardware can be simulated while running a realistic scenario. Different methods of instruction prefetch can be implemented in this simulator to measure how they perform. It is shown that the execution time can be reduced by up to five percent and the number of memory accesses can be reduced by up to 25 percent with a simple loop buffer and return stack. The execution time can be reduced even further with more complex methods such as branch target prediction and branch condition prediction.
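
    The general idea of a trace-driven measurement can be sketched as below. This is a deliberately simplified model and not the simulator from the thesis: the function name `count_memory_fetches`, the least-recently-used replacement policy, and the toy trace are all assumptions made for illustration.

```python
from collections import OrderedDict


def count_memory_fetches(trace, buffer_size):
    """Count instruction fetches that reach memory for a given address trace.

    A small buffer of recently fetched instruction addresses (LRU replacement,
    an assumption of this sketch) serves repeated fetches without memory access.
    """
    buffer = OrderedDict()
    fetches = 0
    for addr in trace:
        if addr in buffer:
            buffer.move_to_end(addr)  # buffer hit: no memory access
            continue
        fetches += 1
        buffer[addr] = True
        if len(buffer) > buffer_size:
            buffer.popitem(last=False)  # evict the least recently used entry
    return fetches


# Toy trace: straight-line code, a four-instruction loop run ten times, then an exit.
trace = [0, 1, 2, 3] + [4, 5, 6, 7] * 10 + [8, 9]
print(count_memory_fetches(trace, 0), count_memory_fetches(trace, 4))
```

    Comparing the count with and without the buffer on the same trace is what allows different prefetch or buffering schemes to be evaluated against each other.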

  • 196.
    Lindgren, Jonas
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
    Analysis of requirements for an automated testing and grading assistance system. 2014. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This thesis analyzes the configuration and security requirements of an automated assignment testing system. The requirements for a flexible yet powerful configuration format are discussed in depth, and an appropriate configuration format is chosen. Additionally, the overall security requirements of this system are discussed, analyzing the different alternatives available to fulfill the requirements.

  • 197.
    Liu, Dake
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Application specific instruction set DSP processors. 2010. In: Handbook of signal processing systems / [ed] Shuvra S. Bhattacharyya, Ed F. Deprettere, Rainer Leupers, Jarmo Takala, New York: Springer, 2010, p. 415-447. Chapter in book (Refereed)
  • 198.
    Liu, Dake
    Linköping University, The Institute of Technology. Linköping University, Department of Electrical Engineering, Computer Engineering.
    Embedded DSP Processor Design: Application Specific Instruction Set Processors. 2008. Book (Other (popular science, discussion, etc.))

  • 199.
    Liu, Dake
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Karlsson, Andréas
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Sohl, Joar
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Wang, Jian
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Petersson, Magnus
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Zhou, Wenbiao
    Beijing Institute of Technologies, China.
    ePUMA embedded parallel DSP processor with Unique Memory Access. 2011. In: Information, Communications and Signal Processing (ICICS), 2011, IEEE, 2011, p. 1-5. Conference paper (Refereed)
    Abstract [en]

    Computing up to 100 GOPS without cooling is essential for high-end embedded systems and is much demanded by the market. A novel master-slave multi-SIMD architecture and its kernel (template) based parallel programming flow are thus introduced as a parallel signal processing platform, ePUMA (embedded Parallel DSP processor with Unique Memory Access). It is an on-chip multi-DSP-processor (CMP) targeting predictable signal processing for communications and multimedia. The essential techniques are to separate the processing of the control stream from parallel computing, and to separate parallel data access from parallel arithmetic computing kernels. Through these separations, computation and data access become orthogonal both in hardware and in programs. Orthogonal operations can therefore be executed in parallel, and the run-time cost of data access can be minimized. Benchmarks show that the computing performance therefore reaches about 80% of the hardware limit, whereas normal processors reach less than 40% of the hardware limit. The unique SIMD memory subsystem architecture offers programmable conflict-free parallel data accesses. A programming flow and tools are also developed to support coding on the unique hardware architecture. A prototype on FPGA shows especially high performance over silicon cost.

  • 200.
    Liu, Dake
    et al.
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Nilsson, Anders
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Eilert, Johan
    Linköping University, Department of Electrical Engineering, Computer Engineering. Linköping University, The Institute of Technology.
    Bridging Dream and Reality: Programmable Baseband Processors for Software-Defined Radio. 2009. In: IEEE COMMUNICATIONS MAGAZINE, ISSN 0163-6804, Vol. 47, no 9, p. 134-140. Article in journal (Refereed)
    Abstract [en]

    A programmable radio baseband signal processor is one of the essential enablers of software-defined radio. As wireless standards evolve, the processing power needed for baseband processing increases dramatically, and the underlying hardware needs to cope with various standards or even simultaneously maintain several radio links. Meanwhile, the maximum power consumption allowed by mobile terminals is still strictly limited. These challenges require both system and architecture level innovations. This article introduces a design methodology for radio baseband processors, discussing the challenges and solutions of radio baseband signal processing. The LeoCore architecture is presented here as an example of a baseband processor design aimed at reducing power and silicon cost while maintaining sufficient flexibility.
