EEVblog #58 - Warm and Fuzzy FPGA Troubleshooting

Look, up in the sky! Is it a hardware fault? Is it a software fault? No, it's a bloody FPGA!
Comments

"My favorite programming language is solder!" Love that quote!

power-max

"Being a hardware guy, I immediately blamed the software! Because software is a pain in the ass!"

Here is your like, EEV :)

rajvinjamuri

Boy, does this hit the mark... I have been using a Xilinx Virtex-II Pro on a circuit at work and last year had the worst problems. We outsourced work to another company, but about 90% of the returned work didn't work. I didn't have any boundary scan software and had very little to go on except checking that the components looked OK visually and, to some extent, measured OK. When this didn't bear any fruit, I had to manually check the BGA connections. I had access to about 90% of the pins from the underside, but the only thing I could check for was the protection diode of an input. If I put the positive lead on ground and probed the vias, I could compare the reading with a good board. OK, but there are hundreds of pins to check. This took a while but to a large extent was successful. Some of the comments you made were really quite reassuring and helpful. Thanks for the video, Dave.

michaelhawthorne

If I had to guess, I would say that you were trying to communicate via two or more independent SPI lines to two or more peripherals. That would cause memory buffer problems with the SPI driver in the FPGA, which would be receiving/having to send bytes from multiple peripherals.

Perhaps the expectation of the driver was a single SPI line connected to a number of peripherals where you talk to only 1 peripheral at a time.

Great blog by the way, I'm subscribing.

orangedac

HAHAHA, I love it!!

"My favorite programming language is solder."

TurboHawkV

Sir, your vids are very helpful for both beginners and experienced people. More power to EEVblog.

adrianara

The different symbols in these two lines are 0852 and 0245. These can be the 52nd week of 2008 and the 45th week of 2002. And these are the only characters that differ. So the chips' markings may be identical with just different date codes.

alextrofimov
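
A minimal sketch of the date-code reading suggested above, assuming the four differing characters follow a YYWW layout (two-digit year, then week number) and that both parts date from the 2000s. The function name `parse_date_code` and the `century` default are hypothetical, and the manufacturer's actual marking scheme has not been checked here.

```python
def parse_date_code(code, century=2000):
    """Split a 4-digit YYWW marking into (year, week).

    Assumes a YYWW layout and a fixed century; both are guesses,
    not something confirmed from a datasheet.
    """
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected 4 digits, got {code!r}")
    year = century + int(code[:2])
    week = int(code[2:])
    # ISO week numbers run from 1 to 53.
    if not 1 <= week <= 53:
        raise ValueError(f"week {week} is out of range in {code!r}")
    return year, week


# The two markings quoted in the comment above.
for marking in ("0852", "0245"):
    year, week = parse_date_code(marking)
    print(f"{marking}: week {week} of {year}")
```

Under that assumption the two chips would be roughly six years apart, which leaves plenty of room for a silicon revision in between.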

@KingKongSamurai Point of view thing. If you're a C programmer, HDL is hardware; if you're a board designer, HDL is software. I'd easily call it software, because if you think back to the original terms, hardware is the part of the computing equipment you can't change once it's out, and software is something that can be changed. I don't see why it matters whether it's written in a command-oriented language or one describing hardware-like structures.

SianaGearz

Great story and very funny re software v hardware. Agatha Christie would be impressed. Nice one. And programmers have an attitude of superiority, sitting on the comfy chairs with their laptops while engineers are getting dirty, asking for cups of coffee as their buggy routines ensure a late night for everyone. Always getting the good parking spot, never having a crease on their shirt but a perfect one on their trousers. Avoiding blame when it all caves in. Always asking for a cable to be made up because they can't be bothered to check their little black bag, etc. etc... All software guys should be treated as lackeys, continuously, 24 hours a day, 365 days a year for the next 40 years to even things up... A bit.

martinda

I feel your pain. I've just finished a month of debugging a system implemented on an old Spartan-IIE, and that was one of the single most infuriating things I've ever encountered. Even the removal of commented code, comments and CR-LFs prior to synthesis caused a soft serial interface to fail completely, or made the Xilinx place and route stage flip from 25% resource use on chip to 151% and fail implementation.

BlackWolf

Sounds like the design is missing a timing constraint on one of the SPI lines. Subtle hardware revisions of the FPGA will exceed the Altera *formal* specifications for setup time by different margins. I would wager the mis-constrained lines have propagation delays falling within the difference. Adding a constraint would lower the global clock rate to meet the setup time or force P&R to pick a shorter path. That removing a multiplexer fixed it reinforces this belief.

shuckc
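
The setup-time argument above can be put as rough arithmetic: a register-to-register path meets timing when the clock-to-output delay, the path delay and the setup time all fit inside one clock period. Below is a hedged sketch with made-up numbers (none of them come from the video or from an Altera datasheet); `setup_slack_ns` is a hypothetical helper, not part of any vendor tool.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, path_delay_ns, setup_ns):
    """Return the setup slack in nanoseconds.

    Positive slack means the data arrives early enough;
    negative slack means a setup violation.
    """
    arrival = clk_to_q_ns + path_delay_ns  # when data reaches the capturing register
    required = clock_period_ns - setup_ns  # latest arrival the register tolerates
    return required - arrival


# Illustrative values only: the same unconstrained path on two
# silicon revisions whose routing delay differs by 1 ns.
PERIOD_NS = 10.0    # assumed 100 MHz clock
CLK_TO_Q_NS = 1.0   # assumed
SETUP_NS = 1.0      # assumed

for revision, path_delay_ns in (("rev A", 7.5), ("rev B", 8.5)):
    slack = setup_slack_ns(PERIOD_NS, CLK_TO_Q_NS, path_delay_ns, SETUP_NS)
    verdict = "meets timing" if slack >= 0 else "setup violation"
    print(f"{revision}: slack {slack:+.1f} ns, {verdict}")
```

With these invented figures one revision passes by 0.5 ns and the other fails by 0.5 ns, which is the kind of margin an unconstrained path can fall into. Removing a multiplexer shortens `path_delay_ns`, which would push the slack back above zero.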

As a student more or less thrown in a classroom with a Spartan-3 and a page that says "build something" I can distantly relate.
You have a simple if..else.. statement blow up in your face (In a process!)
It usually takes me about 30 minutes to reach some 'magical' effects.
We are implementing a simple 8bit MCU now. FML :D

ZealothPL

I think the numbers are the year and week the batch was made:
the one on the left is week 52 of 2008 and the one on the right is week 45 of 2002.

williefleete

great insight and detail... do you have a photo/link of this device once completed and packaged? Always wondered what the end product of the devices you work on look like.

averagemale

My fav EEVblog vid! I can relate to this 😂😂

armstrongsubero

Really cool. This reminds me of a hardware/software issue where, in the end, we found that changing the DDR memory (low-power DDR2) manufacturer made the whole thing work. We also found that our processor (Freescale i.MX51) had some errata on the memory controller part (and for some reason the other memory could handle this difference).

But it took almost one year to find out :(

nerdinvestdor

Wow, that's really strange. I agree, the best programs are electronic circuits because they are the most bug-free. But WOW, turns out that tight tolerances don't always mean exact results every single time, especially with firmware.

maclover

I wonder if all the I/O were metastable double-FF'd and no combinatorial inputs?

True, SW must be robust, especially where transients are involved. Low power design with power regions makes this especially interesting.

kdrhp

@FrancekPirosrancek it's my home lab.

EEVblog

I had the same issue with the STM32 IIC bus. I was debugging the HW IIC for like two months before I discovered it was a silicon bug.

Mr_ASIC