FPGAs: High-Performance Semiconductor Devices with Millions of Configurable Logic Blocks, Guides, Projects, Research in Control Systems

VLSI Technology, FPGA Design, Digital Logic Design, Semiconductor Devices

An introduction to Field-Programmable Gate Arrays (FPGAs), discussing their increasing use in high-performance servers and embedded computers due to their large-scale integration and versatility. It covers the architecture of FPGAs, their advantages over other devices, and a comparison with CPLDs, ASICs, GPUs, and DSP-based solutions.

What you will learn

  • What industries benefit most from the use of FPGAs?
  • What are the advantages of using FPGAs over other devices like ASICs, CPLDs, GPUs, and DSP-based solutions?
  • How does the architecture of an FPGA differ from that of a CPLD or ASIC?

Type: Guides, Projects, Research

2019/2020

Uploaded on 24/01/2020

FranciscoT89


Partial text preview

FPGA – Overview, Architecture and its Industry Applications

Francisco Tahhan (318064)
M.Sc. Mechanical Engineering
Technische Hochschule Lübeck
Lübeck, 23562, Germany
francisco.tahhan@th-luebeck.de

Prof. Dr.-Ing. Martin Hahn
M.Sc. Mechanical Engineering
Technische Hochschule Lübeck
Lübeck, 23562, Germany
martin.hahn@th-luebeck.de

Abstract

A programmable logic device (PLD) is an integrated circuit (IC) built on digital electronics whose functions can be configured by the user. New processes demand working platforms that are flexible enough to integrate into existing infrastructure, support huge workloads, solve constantly evolving and increasingly complex mathematical algorithms, and offer the low latency and high determinism required by real-time systems (e.g. neural computing). This demand has produced highly advanced devices for the computational architectures used in different fields of application, including microprocessors, ASICs, PROMs, GPUs, CPLDs, DSPs and FPGAs. The purpose of this document is to provide an overview of FPGAs as one of the preferred digital implementation platforms for a wide variety of industrial applications today. Their high speed, flexibility, parallelism and short time-to-market make FPGAs highly functional embedded circuits. Thanks to the reuse and integration of functional modules known as hard intellectual property (IP), they are now even considered systems on a chip (SoC). The first part of this document gives a short introduction to what an FPGA is. Section II describes its architecture, followed by a brief comparison with CPLDs, ASICs and GPUs in Section III.
Section IV presents its application areas, together with examples of these applications and the platforms offered by different vendors.

Keywords - integrated circuit, programmable logic device, field-programmable gate array, system on a chip, embedded systems, high-performance computing.

I. INTRODUCTION

Thanks to several technological advances, in the mid-1980s Xilinx Inc. produced the first family of general-purpose user-programmable logic devices based on a matrix architecture. This technological contribution was known as the Logic Cell Array (LCA). It led almost immediately to the first commercial field-programmable gate array, illustrated in Fig. 1, and consequently to the first family of Xilinx FPGAs. A field-programmable gate array, as its name suggests, is an integrated circuit whose hardware functionality can be programmed after manufacture.

Fig. 1 Xilinx XC2064 Configurable Logic Block array and interconnect architecture (Xilinx, 1985)

Today, both FPGAs and GPGPUs are considered computing platforms able to keep pace with the growing demand for high-performance computing (HPC)¹. Through very-large-scale integration (VLSI) technology, as described by Andrew Moore [1], there are currently FPGA devices on the market with up to millions of configurable logic blocks (CLBs) on a single device, which in turn can include hard intellectual property (IP), so that they are even considered a system on a chip (SoC).

The use of FPGAs as high-performance platforms has increased exponentially in recent years, as measured by the number of publications per year shown in [2]: from two publications in 1989 to as many as 2906 in 2010. The number of publications from 1989 to 2015 is shown in Fig. 2.

Fig. 2 IEEE-listed FPGA-related publications per year (adapted from [2])

The three primary contributions offered by FPGAs are real-time calculations, parallelism and a large number of user I/O pins.
These and many other features, such as the ability to connect to almost any application-specific circuit through protocol-independent high-speed serial links, have had a major impact on several fields of scientific computing [2].

Several aspects are considered in industry when choosing the right programmable logic device (PLD), such as cost, energy efficiency and processing speed. Some of these aspects make an FPGA the best choice compared to other devices such as ASICs, CPLDs, GPUs and DSP-based solutions, especially for heterogeneous² systems and parallel information processing.

¹ High-performance servers and embedded computers.
² Systems that use more than one kind of processor or core.

Another reason why FPGAs have caught the attention of system developers is that their integrated circuits can today contain millions or even billions of transistors and handle complex workloads such as high-speed data networks, 3D graphics or streaming. In addition, technological advances have led to a 25% improvement in the performance of computational logic per FPGA generation³, while the price has fallen by 90% [3]. Instead of the simple, low-functionality and expensive wired systems of the mid-1980s, it is now possible to buy an FPGA platform starting at US$30 with features that allow the end user to perform an immense number of operations for control systems and massive data analysis.

II. FPGA ARCHITECTURE

An FPGA is a semiconductor device whose functions can be defined after manufacture. It works on the basis of Boolean algebra: by combining hundreds of thousands of logic gates, registers and wires, it can implement powerful mathematical functions for specific tasks.
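To illustrate how plain Boolean gates combine into an arithmetic function, the following minimal Python sketch builds a one-bit full adder from XOR, AND and OR operations and chains several of them into a ripple-carry adder. The function names and the 8-bit default width are assumptions chosen for illustration; an actual FPGA design would express this in a hardware description language and let the vendor tools map it onto the device's logic resources.

```python
from typing import Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """One-bit full adder built only from XOR, AND and OR gates."""
    partial = a ^ b                       # XOR gate
    sum_bit = partial ^ carry_in          # XOR gate
    carry_out = (a & b) | (partial & carry_in)  # two AND gates, one OR gate
    return sum_bit, carry_out


def ripple_carry_add(x: int, y: int, width: int = 8) -> int:
    """Add two unsigned integers by chaining `width` full adders."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    # The final carry is dropped, so the sum wraps modulo 2**width.
    return result
```

Calling `ripple_carry_add(23, 42)` adds the two operands bit by bit; in hardware, all the gates of such a chain exist physically side by side rather than being evaluated in a software loop.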
Basically, an FPGA is composed of a large array of CLBs, digital signal processing (DSP) blocks, RAM blocks and I/O blocks (IOBs) within a flexible matrix of configurable interconnections [3], as depicted in Fig. 3.

Fig. 3 High-Level Block Diagram of FPGAs (adapted from [3])

Modern FPGAs combine dedicated hardware for various functions, such as configurable memories (SRAM or Flash), high-speed I/O pins, high-speed serial transceivers, external memory controllers, PLLs and MMCMs. Some even include a hard-silicon processor core (such as the ARM Cortex A9 in the case of […]

³ About 92x in the last decade [3]

[…]

A GPU is highly segmented, meaning that it has a large number of functional units. These units fall mainly into two groups: those that process vertices and those that process pixels. The vertex and the pixel are therefore the main units handled by the GPU. In addition, memory plays an important role in storing the intermediate results of the operations and the textures used.

A large number of GPU threads is necessary to prevent the ALUs from becoming idle. The threads are scheduled so that groups of ALUs execute exactly the same instruction in parallel. This single instruction, multiple threads (SIMT) approach allows GPU vendors to achieve more area- and energy-efficient implementations than a CPU, because many of the resources of one core can be shared with the other cores in the same group.

Current GPU models typically have hundreds of unified shader processors that can act as both vertex shaders and pixel shaders. Thanks to this parallel architecture, a clock frequency of about 1-1.5 GHz⁴ translates into much more computing power than a higher-frequency CPU (3.8-4 GHz).

Fig. 8 GPU Block Diagram (adapted from [12])

FPGAs, in turn, have up to millions of CLBs embedded in a sea of programmable interconnections.
These CLBs are mainly composed of look-up tables (LUTs), multiplexers and flip-flops, and can implement complex logic functions [6]. Because these CLBs are far more complex than the macrocells of CPLDs, a CPLD cannot compete with an FPGA in terms of flexibility and variety. Another great advantage over a CPLD is the possibility of modifying the circuit even while it is in operation. This is known as partial reconfiguration and is very useful when the device needs to keep running. Similarly, the FPGA can be updated with various designs as required, a feature widely used in accelerated computing.

⁴ Today's standard on the most powerful GPUs

On the other hand, an FPGA uses an external ROM to load the configuration information and configure the fabric before it can start working, which introduces a delay between power-on and the start of operation. This delay is undoubtedly a disadvantage compared not only with CPLDs but also with the other devices discussed here. Similarly, an FPGA can be much more capable than a CPLD, but this means that it can also be much more expensive. Finally, FPGAs have a relatively high idle power consumption compared to a CPLD, as noted at the beginning of this section.

Compared with an ASIC, an FPGA has two main advantages. The first is time to market. The design of an ASIC starts from a completely blank frame, i.e. components must be purchased, come from manufacturing or be individually developed for use within the ASIC design. The design of an FPGA starts with a large arrangement of blocks and other elements, which makes the developers' task much easier. As a result, the design cycle of an FPGA can take from an afternoon to a couple of weeks, while an ASIC can take several weeks or even months before it can be integrated into a printed circuit board and validation tests can begin.
The second advantage is economic, as long as production volumes are not high, as detailed in Fig. 9, according to Rajeev Jayaraman, former Director, FPGA Implementation Tools at Xilinx Inc. FPGA acquisition costs start at about $30, while the development of an ASIC can easily start in the range of millions of dollars.

Fig. 9 FPGA vs ASIC cost analysis

In turn, FPGAs can be reconfigured with a different design, or have just a part of the chip reconfigured even while it is running, which makes them very suitable for applications where the current design may need to be updated to use a better algorithm or design. In such a case, programmability rather than cost is the deciding factor. Also, the CLBs can be programmed with only the operations needed for the application, sometimes resulting in increased calculation efficiency. Many FPGAs also contain dedicated hard silicon blocks for various functions. These dedicated hardware blocks are critical for competing with ASICs.

On the other hand, because ASICs have permanent circuits, they are more energy efficient than FPGAs. The power consumption of ASICs can be controlled and optimized very carefully. In addition, FPGAs have a limited operating frequency compared to an ASIC on a similar process node, because routing and configurable logic consume timing margin. ASIC technology offers higher speed and lower power consumption than an FPGA can provide. The speed difference between the two design methods can easily be 10x or more. In addition, an FPGA design can be reverse-engineered from its bitstream, while reverse-engineering an ASIC is much more difficult.

As for the relationship between FPGAs and GPGPUs, both are at a similar level in terms of performance and energy consumption. Although a comparison between the two devices depends mostly on the specific use of the hardware, some general differences can be determined [13].
The following analysis considers high-performance embedded computing (HPEC) applications, such as military systems, the aerospace market and deep neural networks (DNNs).

Most workloads that contain large amounts of parallel computing tasks also contain some sequential or moderately parallel elements, meaning that a GPU-CPU hybrid system is required to meet system performance requirements [14]. Thus, although the massively parallel construction of GPUs allows them to run a software algorithm much faster than a conventional processor, the need for a high-end CPU affects the efficiency and performance of the platform. The necessary communication between the CPU and the GPU adds a potential bottleneck to the system [12]. As a result, future workloads may not map efficiently onto the GPU, leaving the hardware idle or inefficient. FPGAs, by contrast, fit well with OpenCL by virtue of their inherent parallelism. They offer pipeline parallelism, in which tasks can be created in a push-pull setup, each task using different data from the previous task, with or without host interaction.

In addition, the off-chip memory of a GPU provides substantially less bandwidth than local, on-chip memory. If the workload depends on off-chip memory, GPU computing performance and efficiency drop noticeably [12]. To address this problem, new direct memory access techniques allow high-volume data to be transmitted to the GPU without consuming GPU clock cycles. The high bandwidth provided by the GPU's new memory interfaces allows it to outperform the FPGA in absolute performance when implementing complex models. Notwithstanding the above, Xilinx Inc. shows that, when it comes to on-chip memory, FPGAs achieve significantly higher computing capacity. To meet this demand, GPU vendors must modify existing architectures to stay current, and GPU users must therefore put their projects on hold until the new architecture is available [12], [15].
Moreover, the FPGA outperforms the GPU by a factor of three in absolute performance, and even by an order of magnitude when power consumption is taken into account [13]. In the presence of data dependencies, a custom FPGA data path outperforms the GPU [13]. The data flow in GPUs is defined by the software and driven by the GPU's complex memory hierarchy. However, when the GPU receives a large number of threads, only parts of a workload can be mapped efficiently onto its architecture, and if not enough parallelism is found within the threads, performance efficiency drops [12]. FPGAs can offer more flexible architectures, which are a mix of programmable hardware resources, DSP blocks and BRAM. This flexibility allows the user to easily reconfigure the data path, even at runtime, using partial reconfiguration.

Last but not least, GPUs offer an advantage over FPGAs when it comes to ease of use. Compute Unified Device Architecture (CUDA) is very easy to use for software developers, who do not need in-depth knowledge of the underlying hardware. To use an FPGA, however, the developer must have knowledge of both FPGAs and machine learning algorithms. Xilinx has developed tools such as the SDx IDE⁵. These tools have made the FPGA design flow much easier for software engineers, as they can easily convert their C/C++ code to HDL.

IV. APPLICATIONS

Here is an overview of typical fields of application where FPGAs are used [4], [2], [7].

For high-performance servers, we find the following industries and some of their applications:

  • Government labs: nuclear waste simulation, warfare modeling, spacecraft modeling, etc.
  • Defense: data mining and analysis for threat monitoring, video, pattern matching, etc.
  • Financial services: options valuation and risk analysis of assets.
  • Geosciences and engineering: seismic modeling and analysis, and reservoir simulation.
  • Life sciences: gene encoding and matching, and drug modeling.
For high-performance embedded computers, the list is as follows:

  • Defense: beam forming in radar.
  • Airborne electronics: image compression and analysis in payload.
  • Control engineering: fuzzy logic, motor control, power electronics, circuit testing and prototyping, etc.
  • Medical imaging: image rendering in CT and MRI scanners.
  • Robotics: deep neural networks, object tracking, depth and movement extraction, etc.
  • Financial services: low-latency and high-throughput data processing in trading solutions.
  • Communications: encryption in network routers.
  • Mathematics: modelling.

⁵ Integrated Development Environment

FPGA platform technology can be integrated with applications in the industries mentioned above and offers higher performance. As stated repeatedly in this document, the features from which these systems benefit are massive parallelism, real-time capabilities, local memory, reconfigurability, run-time (dynamic) reconfigurable architecture, flexibility, dedicated DSP blocks, hard intellectual property, overall acceleration of the system and the vast number of I/O pins.

To give the reader a general idea of how an FPGA is used, a simple but illustrative application example has been chosen. T. Sutikno et al. present a single switching state machine scheme, implemented on Altera's APEX20KE, that performs the electronic commutation providing the rotating field of a trapezoidal permanent magnet AC motor in forward and reverse rotation and in regenerative forward and reverse braking [16]. For this, a pair of switches must be turned on sequentially, in the correct order, to energize a pair of windings. The FPGA-based scheme controls the BLDC motor correctly, switching automatically from one state to the next. For further information on FPGA applications in industry, the reader can refer to the detailed bibliography at the end of the document.
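The idea of a switching state machine that turns on one pair of switches per state can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the phase labels A, B and C, the switch names, and the six-step table are the commonly taught commutation sequence, not the table used in [16], and regenerative braking is left out.

```python
# Hypothetical six-step table: (phase with high-side switch on,
# phase with low-side switch on) for each state.
# Assumption: common textbook sequence, not taken from [16].
COMMUTATION_STEPS = [
    ("A", "B"),
    ("A", "C"),
    ("B", "C"),
    ("B", "A"),
    ("C", "A"),
    ("C", "B"),
]


def next_state(state: int, forward: bool = True) -> int:
    """Advance to the neighbouring state; direction selects rotation sense."""
    step = 1 if forward else -1
    return (state + step) % len(COMMUTATION_STEPS)


def active_switches(state: int) -> dict:
    """Return the on/off status of all six inverter switches for a state."""
    high, low = COMMUTATION_STEPS[state]
    switches = {}
    for phase in "ABC":
        switches[phase + "_high"] = (phase == high)
        switches[phase + "_low"] = (phase == low)
    return switches
```

In each state exactly two switches are on, one high-side and one low-side on different phases, so exactly one pair of windings is energized at a time; stepping through the table in the opposite order reverses the rotation.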
Moreover, according to MarketsandMarkets Research Private Ltd., “The FPGA market to grow from USD 5.8 billion in 2017 to USD 9.5 billion by 2023, at a compound annual growth rate (CAGR) of 8.5% during the forecast period” [17]. Although there are many companies in the FPGA industry, such as dSPACE (Germany), S2C Inc. (USA), Microchip (USA), Taiwan Semiconductor Manufacturing Company (Taiwan), EmuPro (India) and National Instruments (USA), the two main companies, which hold most of the market share, are Intel and Xilinx Inc. These two held 37% and 50% of revenue, respectively, in 2015, after Intel bought Altera.
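The market forecast quoted above can be sanity-checked with the standard compound growth formula. The sketch below assumes that the forecast period spans the six years from 2017 to 2023; the function names are chosen here for illustration only.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start and an end value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` years."""
    return start_value * (1.0 + rate) ** years


years = 2023 - 2017                        # assumed forecast period
implied_rate = cagr(5.8, 9.5, years)       # rate implied by the two endpoints
projected = project(5.8, 0.085, years)     # endpoint implied by the quoted 8.5%

print(f"implied CAGR: {implied_rate:.2%}")
print(f"projected 2023 market: USD {projected:.2f} billion")
```

Under this assumption the implied rate comes out close to the quoted 8.5%, and compounding 8.5% from USD 5.8 billion lands just under USD 9.5 billion, so the figures in the quotation are consistent with each other up to rounding.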