fpga
in journals of the Technical and Engineering group -
The Inertial Navigation System (INS) is an instrument developed to estimate a vehicle's position, velocity, and attitude. However, INS accuracy degrades over time because of accumulating noise from the gyroscope and accelerometer. Hence, an auxiliary denoising algorithm is essential for improving INS accuracy. The lifting scheme (second-generation wavelets) is one such denoising method, and its low computational load makes it suitable for hardware implementation. This paper implements the lifting scheme algorithm on an FPGA for low-cost commercial INS sensors in order to improve their performance. FPGAs offer high processing speed, reduced hardware resources and circuit size, and low power consumption. Owing to the parallelization applied in the submodule designs, the maximum propagation delay is 35 clock cycles, which, given the sensors' update rate and the FPGA's high computation speed, makes the lifting scheme practical and justifiable as a denoising algorithm. Six data sequences, three from the accelerometer and three from the gyroscope (one per axis), are applied serially. The outputs are stored and assessed against the inputs using Allan variance to show how effectively the noise is reduced and to compare the hardware with the simulation results. For two decomposition levels, the results show an improvement of about 17% for the accelerometer and 16% for the gyroscope.
Keywords: FPGA, Accelerometer, Gyroscope, Discrete Wavelet Transform, Lifting Scheme, Allan Variance
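As a rough illustration of the lifting scheme named in the abstract above, the sketch below runs one decomposition level of a Haar lifting step (split, predict, update) and soft-thresholds the detail coefficients before reconstructing. The choice of the Haar wavelet, the threshold value, and every function name are assumptions made for illustration; the listing does not state which wavelet or thresholding rule the paper uses.

```python
def haar_lifting_forward(signal):
    """One level of Haar lifting on an even-length list: split, predict, update."""
    even = signal[0::2]
    odd = signal[1::2]
    # Predict: each odd sample is predicted by its even neighbour.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: correct the even samples so they hold the pairwise mean.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo the update and predict steps, then interleave the samples."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def soft_threshold(values, threshold):
    """Shrink each coefficient toward zero by `threshold` (assumed rule)."""
    return [(1 if v >= 0 else -1) * max(abs(v) - threshold, 0.0) for v in values]


def denoise(signal, threshold):
    """Single-level denoising: threshold only the detail coefficients."""
    approx, detail = haar_lifting_forward(signal)
    return haar_lifting_inverse(approx, soft_threshold(detail, threshold))
```

Each lifting step uses only additions, subtractions, and a division by two (a shift in fixed-point hardware), which is consistent with the low computational load the abstract cites. A second decomposition level would apply the same forward step to `approx` again.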
-
The voting process is one of the most significant areas to benefit from technological growth, and the development of digital technology has changed many other industries as well. Electronic voting machines (EVMs) have become an essential part of voting systems because of their effectiveness, accuracy, and ease of use. This work explores the design, architecture, and advantages of an FPGA (Field-Programmable Gate Array) implementation of an electronic voting machine over conventional voting systems. FPGA-based EVMs offer benefits in reliability, flexibility, security, and speed. The article also covers the elements of the FPGA implementation, the design process, and the difficulties of maintaining data integrity and thwarting tampering. The goal is to demonstrate how FPGA technology can be applied to build a voting system that is safe, reliable, and effective, thereby enhancing the election process. FPGA-based EVMs support more complex functionality and enhance the performance of electronic voting applications.
Keywords: EVM, Voting, FPGA, Counting, Efficient Design -
FPGA-Based Hardware Implementation of an Improved Acquisition Block for GPS Receivers in Weak-Signal Environments
With the growth of satellite-based positioning applications and their importance in daily life, high-sensitivity receivers are needed to acquire weak signals in constrained environments such as tunnels, streets between tall buildings, and similar settings. In a software receiver, the first and most important step is the acquisition of GPS signals. This step determines the visible satellites and finds approximate values of the carrier frequency and code delay of the signals they transmit. Under weak-signal conditions, GPS receivers often face the problems of Doppler code and navigation data bit sign transitions, both of which reduce the output SNR of the acquisition stage. One recent method that can overcome these two problems is the Improved Semi-bit Compensation (ISBC) acquisition method, which this paper implements on an FPGA platform. In addition, given the many signal processing steps in GPS receivers, the paper simplifies the design steps and implements ISBC modularly at the RTL level, improving on previous implementations, and successfully acquires at least four satellites at a minimum SNR of -43.3 dB.
Keywords: GPS, FPGA, Acquisition, Weak Signal, Doppler Frequency, Doppler Code -
In this paper, a 16-bit multi-mode second-order delta-sigma modulator digital-to-analog converter (DSM-DAC) with a time-interleaved (TI) structure, operating at a center frequency of 4 GHz with a bandwidth of 20 MHz, is implemented in VHDL on an FPGA platform. The proposed architecture uses a single clock frequency to generate radio-frequency (RF) signals. The second-order DSM is reconfigurable and offers three modes for signal synthesis: low-pass (LP), band-pass (BP) at Fs/4, and high-pass (HP). To raise the sampling frequency (Fs), a 4-channel TI structure is proposed in which each channel operates at Fs/4. Because the coefficients are simple in all modes, multiplication can be performed with a shifter block. An important challenge in designing such structures is duty-cycle error (DCE). To investigate its effect, various error values are applied to the modulator and compensated. A novel solution overcomes the DCE by adjusting the filter and making the signal passband one-sided, without adding extra hardware or circuit complexity. By removing the signal image, this approach significantly increases the SNDR and SFDR of the DSM output, even in BP mode. Another challenge is mismatch error in the DAC cells, which is simulated and compensated using two methods: data weighted averaging (DWA) and sorting dynamic element matching (SDEM). Simulation results in ISE show SNDR values of 106.10, 105.65, and 104.95 dB for the LP, BP, and HP modes, respectively.
Keywords: Delta-Sigma Modulator, Duty-Cycle-Error, Error-Feedback, FPGA, Mismatch, Time-Interleaved -
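To make the "simple coefficients, shifter-only multiplication" remark in the abstract above more concrete, here is a minimal behavioural sketch of a second-order error-feedback delta-sigma modulator with three selectable modes. The noise transfer functions used, (1 - z^-1)^2 for LP, 1 + z^-2 for BP at Fs/4, and (1 + z^-1)^2 for HP, are common textbook choices assumed here, not taken from the paper; the five-level quantizer and all names are likewise hypothetical.

```python
# Error-feedback coefficients (c1, c2) so that NTF(z) = 1 + c1*z^-1 + c2*z^-2.
# Every coefficient is 0, 1, or +/-2, so hardware needs only shifts and adds.
MODES = {
    "LP": (-2, 1),  # zeros at DC
    "BP": (0, 1),   # zeros at Fs/4
    "HP": (2, 1),   # zeros at Fs/2
}


def dsm_error_feedback(samples, mode="LP"):
    """Quantize `samples` (|x| <= 0.5 assumed) to integers in [-2, 2]."""
    c1, c2 = MODES[mode]
    e1 = e2 = 0.0  # previous two quantization errors
    out = []
    for x in samples:
        # Add the filtered past errors to the input before quantizing.
        u = x + c1 * e1 + c2 * e2
        # Assumed 5-level mid-tread quantizer; clipping is a safety net only.
        y = max(-2, min(2, round(u)))
        e = y - u
        out.append(y)
        e2, e1 = e1, e
    return out
```

In LP mode the quantization error is pushed away from DC, so the running average of the coarse output tracks a slowly varying input; the BP and HP modes move that quiet region to Fs/4 and Fs/2 respectively.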
Journal of Future Generation of Communication and Internet of Things, Volume:3 Issue: 2, Apr 2024, PP 1 -7
Modeling the human body electrically with elements such as resistors and capacitors has simplified its analysis for researchers and doctors. Various body models exist. One of the best known is the Cole model, which represents the inside and outside of body cells. The model combines several resistors and capacitors and has been built and modeled in different ways: in some studies with physical resistors and capacitors, and in others by simulation in circuit analysis software. In this research, the body model is implemented on an FPGA for analyzing body tissues, including bioimpedance measurement, and its inputs and outputs are recorded. Finally, the design is implemented on Zynq hardware and its inputs and outputs are displayed on a digital oscilloscope. Compared with previous works, the FPGA implementation improves hardware volume and accuracy.
Keywords: Cole Body Model, FPGA, Bioimpedance -
An artificial-intelligence-based optimization algorithm was used to compute the switching angle values. To run the inverter with the lowest possible Total Harmonic Distortion (THD), this study suggests using an algorithm such as Particle Swarm Optimization (PSO). The multilevel inverter and the optimization algorithm were created and simulated in MATLAB. A frequency spectrum analysis was also conducted and found to be consistent with the theoretical analysis of the system. To provide practical results, the FPGA generates PWM signals appropriate for the inverter switches. The suggested control schemes were developed and implemented on the Spartan-3E Starter Kit. The FPGA design was created with the Xilinx ISE 12.1i design software and the VHDL hardware description language. The suggested approaches have a number of benefits over conventional digital PWM techniques, including straightforward hardware implementation, minimal scaling of digital circuits, easy digital design, reconfigurability, and flexibility. The experimental and simulation results agreed reasonably well.
Keywords: FPGA, Selective Harmonic Elimination (SHE), Harmonics, Particle Swarm Optimization (PSO) -
The major challenge in marine environment imaging lies in addressing the haziness induced by natural phenomena, such as absorption and scattering in underwater scenes. This haze significantly impacts the visual quality of underwater images, necessitating improvement. This paper presents a novel approach aimed at enhancing the efficiency of Gaussian filters for reducing Gaussian noise in underwater images. The method introduces a pipeline structure in the Gaussian filter implementation and evaluates the influence of employing approximate adders on overall performance. Simulation results reveal a notable speed enhancement exceeding 150%, coupled with a substantial reduction in power consumption exceeding 34%. However, these advantages are tempered by an increase in spatial requirements. The study recognizes the inherent tradeoff between output quality and power, highlighting the applicability of the proposed design in error-resilient applications, particularly in image and video processing domains. In essence, the presented approach offers a compelling solution where the benefits of accelerated speed and reduced power consumption outweigh spatial constraints, contributing to the advancement of underwater image enhancement techniques.
Keywords: Underwater image, Gaussian filter, Low power, High speed, FPGA -
Journal of Iranian Association of Electrical and Electronics Engineers, Volume:21 Issue: 1, 2024, PP 105 -120
The field of embedded systems for cryptographic applications is constantly growing, and new methods and applications are emerging. In this paper, high-speed hardware architectures for point multiplication based on the Montgomery ladder algorithm are presented for binary Edwards and generalized Hessian curves in Gaussian normal basis. Point addition and point doubling are computed concurrently by pipelined digit-serial finite field multipliers. The general and special cases of binary Edwards curves use four and three field multipliers, respectively, and the generalized Hessian curve uses three. The multipliers are scheduled and shared in parallel to achieve a lower number of clock cycles than other works. The proposed digit-serial Gaussian normal basis multiplier is built from regular, low-cost modules for exponentiation by powers of two and multiplication by normal elements. The structures are therefore area efficient and have a low critical path delay. Implementation results on a Virtex-5 XC5VLX110 FPGA show that the execution times of point multiplication for binary Edwards and generalized Hessian curves over GF(2^163) and GF(2^233) are 8.62 µs and 11.03 µs, respectively. The results show improvements in execution time and efficiency compared with related works. For example, for binary Edwards curves over GF(2^163) on a Virtex-4 XC4VLX110 FPGA, the proposed design improves hardware resource utilization, execution time, and efficiency by up to 17%, 30%, and 42%, respectively, compared with the best previous architecture.
Keywords: Elliptic Curve Cryptosystems, Point multiplication, Finite Fields, Gaussian normal basis, Binary Edwards curves, generalized Hessian curves, FPGA -
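The Montgomery ladder referred to in the abstract above can be sketched generically as follows. This is only an illustration of the ladder's control flow: the group operations are passed in as functions, and the demonstration uses integers under addition modulo a small prime as a stand-in group, which is an assumption made for brevity and is not the binary Edwards or generalized Hessian arithmetic of the paper.

```python
def montgomery_ladder(k, point, add, double, identity):
    """Compute k * point with one add and one double per scalar bit."""
    r0, r1 = identity, point  # invariant: r1 - r0 == point
    for bit in bin(k)[2:]:    # most significant bit first
        if bit == "1":
            r0, r1 = add(r0, r1), double(r1)
        else:
            r0, r1 = double(r0), add(r0, r1)
    return r0


# Stand-in group for demonstration only: integers modulo a small prime.
MODULUS = 1009


def toy_add(a, b):
    return (a + b) % MODULUS


def toy_double(a):
    return (2 * a) % MODULUS
```

For example, `montgomery_ladder(163, 5, toy_add, toy_double, 0)` evaluates 163 * 5 modulo 1009. In every iteration the addition and the doubling take the same inputs and do not depend on each other, which is the property that lets a hardware design run them concurrently on separate field multipliers, as the abstract describes.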
A dual-stage system architecture for face detection, based on skin tone detection and the Viola-Jones face detection structure, is presented in this paper. The proposed architecture is able to locate human faces in an image with high accuracy within the time constraint. A non-linear transformation technique is introduced in the first stage to reduce false alarms in the second stage. Moreover, a pipelining technique is used in the second stage to improve the overall throughput of the system. The proposed system design is based on Xilinx's Virtex FPGA chip and a Texas Instruments DSP processor. The dual-port BRAM in the FPGA chip and the EMIF (External Memory Interface) of the DSP processor serve as the interface between the FPGA and the DSP processor. The proposed system exploits the advantages of both computational elements (FPGA and DSP) and system-level pipelining to achieve real-time performance. The present implementation focuses on highly accurate, high-speed face detection and is evaluated using the standard BAO image database, which includes images with different poses, orientations, occlusions, and illumination. The proposed system attains a frame rate of 16.53 FPS for an input image resolution of 640x480, which is 23.4 times faster than the MATLAB implementation, 12.14 times faster than the DSP implementation, and 2.1 times faster than the FPGA implementation.
Keywords: Face detection, Heterogeneous System, FPGA, DSP -
Scientia Iranica, Volume:29 Issue: 5, Sep-Oct 2022, PP 2437 -2449
The emission of radio waves from Extensive Air Showers (EAS), initiated by ultrahigh-energy cosmic rays, has been attributed to geomagnetic emission and charge excess processes. At frequencies from 10 to 100 MHz this process leads to coherent radiation. Nowadays, the radio detection technique is used in many experiments that study EAS. One of them is the Auger Engineering Radio Array (AERA), located at the Pierre Auger Observatory. The frequency band observed by the AERA radio stations is 30-80 MHz. This frequency range is often highly contaminated by human-made, narrow-band radio frequency interference (RFI). Suppressing this contamination is crucial to lowering the rate of spurious triggers. An adaptive filter based on the Least Mean Squares (LMS) algorithm can be an alternative to the currently used non-adaptive IIR notch filter. The paper presents 32/64-stage filters based on a non-canonical FIR filter implemented in cost-effective Altera Cyclone IV and Cyclone V FPGAs, with a sufficient safety margin of the registered performance for a global clock above 200 MHz to satisfy the Nyquist criterion.
Keywords: Cosmic rays, Pierre Auger Observatory, Auger Engineering Radio Array, FPGA, filter, LMS, RFI
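As a software-level illustration of the LMS idea mentioned in the abstract above, the sketch below uses an adaptive FIR filter as a one-step linear predictor: a narrow-band interferer is predictable from past samples and is subtracted, while unpredictable broadband content stays in the residual. The predictor arrangement, the step size `mu`, the default of 32 taps, and the function name are assumptions chosen for illustration and do not describe the AERA firmware.

```python
import math


def lms_predictor(samples, taps=32, mu=0.01):
    """Return the residual after subtracting an LMS prediction of each sample."""
    weights = [0.0] * taps
    history = [0.0] * taps  # history[0] holds the previous sample x[n-1]
    residual = []
    for x in samples:
        # FIR prediction of the current sample from the past `taps` samples.
        prediction = sum(w * h for w, h in zip(weights, history))
        error = x - prediction
        # LMS weight update: step along the instantaneous gradient estimate.
        weights = [w + mu * error * h for w, h in zip(weights, history)]
        history = [x] + history[:-1]
        residual.append(error)
    return residual


# A pure narrow-band tone stands in for RFI; the residual should decay.
tone = [math.sin(2 * math.pi * 0.11 * n) for n in range(3000)]
residual = lms_predictor(tone)
```

The step size must stay small relative to the input power times the number of taps for the update to remain stable; with a unit-amplitude tone and 32 taps, `mu = 0.01` is well inside that range.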
-
Masking techniques are used to protect hardware implementations of cryptographic algorithms against side-channel attacks. Reconfigurable hardware, such as an FPGA, is an ideal target for the secure implementation of cryptographic algorithms. Because the resources available in reconfigurable hardware are restricted, an efficient secure implementation is crucial on an FPGA. In this paper, a two-share threshold technique for the implementation of AES is proposed. Continuing the work presented by Shahmirzadi et al. at CHES 2021, we employ built-in Block RAMs (BRAMs) to store component functions. Storing several component functions in a single BRAM may jeopardize the security of the implementation. We describe a method for storing two separate component functions in a single BRAM that reduces area complexity while retaining security. Our design is well suited for FPGAs and supports both encryption and decryption. Our synthesis results demonstrate that the number of BRAMs used is reduced by 50% without affecting the time or area complexities.
Keywords: Side-channel attacks, FPGA, Threshold Implementation, AES
-
Journal of Applied Research in Electrical Engineering, Volume:1 Issue: 2, Summer and Autumn 2022, PP 203 -210
The mapping of DNA subsequences to a known reference genome, referred to as "short-read mapping", is essential for next-generation sequencing. Hundreds of millions of short reads need to be aligned to a tremendously long reference sequence, making short-read mapping very time consuming. Steady progress in Next-Generation Sequencing (NGS) is enabling the generation of DNA sequence data at ever faster rates and at low cost, which means a dramatic increase in the amount of data being sequenced; nowadays, sequencing nearly 20 billion reads (short DNA fragments) costs about 1000 dollars per human genome, and sequencers can generate 6 terabases of data in less than two days. This article considers accelerating the seed extension kernel of the Burrows-Wheeler Alignment (BWA) genomic mapping algorithm with FPGA devices and proposes an FPGA-based accelerated implementation of that kernel. The Smith-Waterman algorithm is used during seed extension to find the optimum alignment between two sequences. State-of-the-art architectures use 1D systolic arrays to fill a similarity matrix, from which the best score over all combinations of matches, mismatches, and gaps is computed. In these architectures, the cells on the same anti-diagonal are calculated in parallel. We propose a novel 2-dimensional architecture. Our modified algorithm consists of two phases, calculating and editing. In each calculation step, all the cells on the same row and the same column are computed in parallel, which significantly speeds up the process but may introduce some errors. These errors are removed before the next calculation step begins.
Our simulation results show that the proposed architecture can operate at up to 312 MHz in Synopsys Design Compiler for 180-nm CMOS technology and is up to 570x and 1.4x faster than the software execution and the 1D systolic arrays, respectively.
Keywords: Bioinformatics, FPGA, Smith-Waterman
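The anti-diagonal parallelism that the abstract above attributes to 1D systolic arrays can be seen in a small software sketch of Smith-Waterman scoring. The cells are visited one anti-diagonal at a time, and every cell on a given anti-diagonal depends only on the two previous anti-diagonals, so hardware could evaluate them simultaneously. The linear gap penalty and the scoring values (match +2, mismatch -1, gap -1) are assumptions for illustration; the sketch does not reproduce the paper's 2-dimensional row-and-column scheme.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    # d indexes anti-diagonals: all cells with i + j == d are independent.
    for d in range(2, rows + cols - 1):
        for i in range(max(1, d - cols + 1), min(rows, d)):
            j = d - i
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                                # start a new local alignment
                h[i - 1][j - 1] + substitution,   # match or mismatch
                h[i - 1][j] + gap,                # gap in b
                h[i][j - 1] + gap,                # gap in a
            )
            best = max(best, h[i][j])
    return best
```

The inner loop over `i` is the part a systolic array unrolls into parallel processing elements; in software it simply runs sequentially.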
-
Convolutional neural networks have recently grown dramatically across scientific fields. Various software and hardware techniques have advanced this growth and provide a platform for further research into improving the efficiency and optimization of the method. One important technique in the field of neural networks is the Generative Adversarial Network (GAN) and its implementation on FPGA accelerators. In this paper, we give an overview of the development of convolutional neural networks using the GAN technique and their implementation on FPGA accelerators over two years. We follow some of the earliest projects, starting around 2017 and ending in 2018, a period of significant progress. We review five papers, the first presented in 2017 and the other four in 2018. The papers are compared from four perspectives: utilization of accelerator resources, application of specific techniques, analysis of generated data, and execution speed on FPGA-based systems. Finally, we discuss the advantages and disadvantages of these designs with a view to optimizing and improving their performance.
Keywords: Generative Adversarial Network (GAN), Convolution Neural Networks, FPGA, Accelerator Resources -
Journal of Optimization in Industrial Engineering, Volume:15 Issue: 32, Winter and Spring 2022, PP 207 -216
RIPEMD-160 hash functions are widely used in many cryptographic applications such as digital signatures, Hash Message Authentication Code (HMAC), and other data security applications. Three RIPEMD-160 designs are proposed: an iterative design, an unfolded design with factor two, and an unfolded design with factor four. These techniques were applied to examine the inner structure of RIPEMD-160 in terms of area, maximum frequency, and throughput. The objective of this project is to enhance the throughput of RIPEMD-160. The unfolding transformation with factor four provides a high-throughput implementation, increasing the throughput significantly to about 1753.50 Mbps. The performance-to-area ratio of the factor-four unfolded design increases by 1.51% compared with the RIPEMD-160 design. The results show that the proposed designs achieve the highest performance compared with other designs. The simulation results were obtained from ModelSim Altera-Quartus II to verify the correctness of the RIPEMD-160 designs in terms of functional and timing simulations.
Keywords: FPGA, Hash Function, RIPEMD-160, throughput, Unfolding
-
Fourier series and the Fourier transform are among the most important methods of signal analysis. The Fourier series is used to analyze alternating and periodic signals, and the Fourier transform is used to process non-periodic signals. In many applications, the analog signal is sampled by a converter and the resulting numerical data are processed. The discrete Fourier transform is used to analyze discrete signals and extract their frequency harmonics. This algorithm is mostly implemented in software such as MATLAB, but a hardware implementation has undeniable benefits, such as a much higher speed that makes it suitable for real-time processing. FPGA chips are well-suited platforms for implementing signal processing algorithms such as the Fast Fourier Transform because of their higher performance, flexibility, and parallel processing compared with other hardware such as microcontrollers or DSPs. In this paper, we implement an optimized Fast Fourier Transform algorithm in Verilog on an FPGA chip.
Keywords: Verilog hardware, Optimization, Fast Fourier Transform, FPGA, Twiddle Factor, FFT, Radix-2, Good-Thomas, Cooley-Tukey, Rader -
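For readers unfamiliar with the radix-2 Cooley-Tukey decomposition and the twiddle factors listed in the keywords above, the following is a minimal software reference of the recursion. It is a plain textbook sketch with an assumed function name, intended only to show the butterfly structure; it is not the optimized Verilog design the abstract describes.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor W_n^k = exp(-2*pi*i*k/n) applied to the odd half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        # Butterfly: one multiplication feeds two outputs.
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Each butterfly reuses one twiddle multiplication for two outputs, which is why the FFT needs on the order of n log n operations instead of the n squared of a direct DFT. In hardware the twiddle factors are typically precomputed and stored in a table.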
Side-channel analysis methods can reveal the secret information of digital electronic systems by analyzing the dependency between the power consumption of implemented cryptographic algorithms and the secret data. Recent studies show that it is possible to gather information about power consumption from FPGAs without any physical access. The high flexibility of modern FPGAs has led to their use as cloud accelerators in Platform as a Service (PaaS) systems; however, serious new vulnerabilities have emerged for these platforms. Although there are some reports on how switching activity in one region of an FPGA affects other regions, the details of this technique have not been analyzed. In this paper, we analyze the strength of this kind of attack and examine the impact of the geometrical and electrical parameters of the victim and attacker modules on its efficiency. We use a Zynq-based Xilinx platform as the device under attack. Experimental results and analyses show that the distance between the victim module and the sensor modules is not the only parameter that affects the quality of the attack; the relative location of the victim and attacker modules can have a more considerable influence.
Keywords: CPA, FPGA, Side-Channel, Power Sensor, TDL, TDC -
An FPGA's block memory may be programmed as a single-port or dual-port RAM/ROM module, which leads to an area-efficient implementation of memory-based systems. In this context, various works have carried out optimized implementations of simple to complex DSP systems on embedded building blocks. The multiplier is a core element of DSP systems, and in memory-based multipliers one of the operands is usually kept constant, which restricts the design to constant-coefficient multiplication. This paper presents a dual-port ROM-based implementation, on a Virtex-7 FPGA, of an 8x8 variable-coefficient multiplier that may be used in several simple to complex DSP applications. The novelty of the proposed design is to configure the block ROM in dual-port mode, thereby obtaining four partial products in two clock cycles, and to introduce two unconventional adder approaches for partial product addition. This approach leads to full resource utilization and provides a variable-coefficient multiplier. The work also compares the proposed architecture with existing memory-based implementations and concludes that it is a novel step towards an efficient memory-based implementation of the multiplier core.
Keywords: Block Memory, Digital Signal Processing, FPGA, Multiplier -
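One plausible reading of "four partial products in two clock cycles" in the abstract above is sketched below: each 8-bit operand is split into two 4-bit nibbles, a 256-entry ROM holds every 4x4 product, and a dual-port ROM serves two lookups per cycle. The nibble split, the ROM size and addressing, and all names are assumptions for illustration; the abstract does not specify them, and the paper's two adder approaches are not modelled.

```python
# Assumed ROM contents: all 4-bit x 4-bit products, address = (a << 4) | b.
ROM = [a * b for a in range(16) for b in range(16)]


def rom_lookup(a_nibble, b_nibble):
    """Model of one ROM port read."""
    return ROM[(a_nibble << 4) | b_nibble]


def multiply_8x8(a, b):
    """Variable-coefficient 8x8 multiply built from four ROM lookups."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    # Cycle 1: both ROM ports are read at once.
    p_ll = rom_lookup(a_lo, b_lo)
    p_lh = rom_lookup(a_lo, b_hi)
    # Cycle 2: the remaining two partial products.
    p_hl = rom_lookup(a_hi, b_lo)
    p_hh = rom_lookup(a_hi, b_hi)
    # Partial product addition with the appropriate 4-bit and 8-bit shifts.
    return p_ll + ((p_lh + p_hl) << 4) + (p_hh << 8)
```

Because both operands only form ROM addresses, neither has to be constant, which is what distinguishes this arrangement from a constant-coefficient lookup multiplier.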
Journal of Electrical and Computer Engineering Innovations, Volume:9 Issue: 1, Winter-Spring 2021, PP 93 -102
Background and Objectives
Programmable logic devices, such as Field Programmable Gate Arrays, are well suited to implementing biologically inspired visual processing algorithms, among them the HMAX model. This model mimics the feedforward path of object recognition in the visual cortex.
Methods
HMAX includes several layers, and its most computation-intensive stage is arguably the S1 layer, which applies 64 2D Gabor filters with various scales and orientations to the input image. A Gabor filter is the product of a Gaussian window and a sinusoid function. By using the separability property of the Gabor filter in the 0° and 90° directions and assuming an isotropic filter in the 45° and 135° directions, a 2D Gabor filter is converted into two more efficient 1D filters.
Results
The current paper presents a novel hardware architecture for the S1 layer of the HMAX model, in which a 1D Gabor filter is applied twice to create a 2D filter. Using the even or odd symmetry of the Gabor filter coefficients reduces the required number of multipliers by about 50%. The normalization value at every input image location is calculated simultaneously. The implementation of this architecture on the Xilinx Virtex-6 family shows a delay of 2.83 ms for a 128×128 pixel input image, a 1.86x speedup relative to the best previous implementation.
Conclusion
In this study, a hardware architecture is proposed to realize the S1 layer of the HMAX model. Using the separability and symmetry of the filter coefficients saves significant resources, especially DSP48 blocks.
The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, as long as the original authors and source are cited. No permission is required from the authors or the publishers.
Keywords: Gabor Filter, FPGA, Separable Filter, Convolution, HMAX Model -
Hardware Trojans have emerged as a major concern for integrated circuits in recent years. As a result, detecting Trojans has become an important issue in critical applications, such as finance and health. Trojan detection methods are mainly categorized into functional and side-channel-based ones. To increase the capability of both detection methods, one can increase the transition activity of the circuit. This paper proposes a trusted platform for detecting Trojans in FPGA bitstreams. The proposed methodology takes advantage of increased Trojan activation caused by transition-aware partitioning of the circuit. It also uses the partial reconfiguration feature of FPGAs to reduce area overhead. Experimental studies on the mapped version of the s38417 ISCAS89 benchmark show that for transition probability thresholds of 10^{-4} and 2*10^{-5}, our method increases the ratio of the number of transitions (TCTCR) in the Trojan circuit by about 290.93% and 131.48%, respectively, compared with the unpartitioned circuit. Similar experiments on s15850 for the same thresholds show increases of 290.26% and 203.11% in TCTCR, respectively. Furthermore, this method improves functional Trojan detection capability because of a significant increase in the ratio of wrong results observed at the primary outputs.
Keywords: Hardware Trojan, Trusted Design Platform, Partial Reconfiguration, FPGA
-
Resampling is a critical step in the Particle Filter (PF) because of the particle degeneracy and impoverishment problems. The Independent Metropolis Hasting (IMH) resampling algorithm is a robust, high-speed method that can be used as the resampling step in a PF. In this paper, a new algorithm based on IMH resampling is first proposed. The proposed algorithm classifies the particles before they enter the resampling module, so that only the essential particles are routed to the IMH resampler. We then propose a distributed architecture to reduce the execution time and enable high-speed resampling. Simulation results for tracking a signal indicate that the PF with the proposed resampling architecture has acceptable tracking performance in comparison with other resampling methods. The PF architecture with the novel Improved IMH (IIMH) resampling algorithm is 33% faster than the best-reported method in PF. In addition, the proposed distributed PF architecture is 79% faster than the best-reported method in PF. FPGA-based implementation results indicate that using the proposed IIMH resampling algorithm in the PF, together with the distributed architecture, reduces hardware resource and area usage.
Keywords: Particle Filter, Independent Metropolis Hasting Resampling, FPGA, Signal Tracking