
FPGAs are an amazing product that almost shouldn't exist if you think about the business and marketing concerns. They are too expensive at scale: if an application takes off, it is eventually cheaper and more performant to switch to ASICs, which is obvious when you see the 4-digit prices of the most sophisticated FPGAs.

Given how ruinously expensive silicon products are to bring to market, it's amazing that there are multiple companies competing (albeit in distinct segments).

FPGAs also seem like a largely untapped domain in general purpose computing, a bit like GPUs used to be. The ability to reprogram an FPGA to implement a new digital circuit in milliseconds would be a game changer for many workloads, except that current CPUs and GPUs are already very capable.


inamberclad
The problem is that the tools are still weak. The languages are difficult to use, and nobody has made anything more widely adopted than Verilog or VHDL. In addition, the IDEs are proprietary, and the tools are fragile and not reproducible. Synthesis results can vary from run to run on the exact same code with the same parameters, with real-world impacts on performance. This all conspires to make FPGA development only suitable for bespoke products with narrow use cases.

I would love to see the open source world come to the rescue here. There are some very nice open source tools for Lattice FPGAs and Lattice's lawyers have essentially agreed to let the open source tools continue unimpeded (they're undoubtedly driving sales), but the chips themselves can't compete with the likes of Xilinx.

JoachimS
SystemVerilog (SV) is the dominant language for both ASIC and FPGA development. SV is evolving, and the tools are updated quite fast. SV allows you to build abstractions through interfaces, enums, types etc. The verification part of the language contains a lot of modern-ish constructs and support for formal verification. The important thing is really to understand that what is being described is hardware. Your design is supposed to be possible to implement on a die, with physical wires, gates, registers, I/Os etc. There will be clocks and wire delays. That is actually one of the problems one encounters when more SWE-minded people try to implement FPGAs and ASICs. The language and tools may help you, but you also need to understand that it is not programming but design you are doing.

https://en.wikipedia.org/wiki/SystemVerilog
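
As a minimal sketch of those abstractions (hypothetical names, just to illustrate interfaces, enums and types):

    // An enum instead of magic-number state encodings
    typedef enum logic [1:0] {IDLE, BUSY, DONE} state_t;

    // An interface bundles related wires and their directions, so a
    // bus becomes one port instead of a handful of loose signals
    interface bus_if #(parameter WIDTH = 32);
      logic             valid;
      logic             ready;
      logic [WIDTH-1:0] data;
      modport master (output valid, data, input  ready);
      modport slave  (input  valid, data, output ready);
    endinterface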

hardolaf
SV requires a linter for literally every single line change that you make because the language is rotten to the core by being based on Verilog. Heck, it has an entire chapter of its LRM dedicated to the non-deterministic behavior inherent to its description of the hardware. VHDL has no such section because it is deterministic.
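
A classic illustration of that scheduling non-determinism (a sketch, not taken from the LRM itself): two clocked blocks with blocking assignments race, while the nonblocking form is well defined:

    // With blocking assignments the simulator may run these two
    // blocks in either order, so b gets the old OR the new a:
    always @(posedge clk) a = in;
    always @(posedge clk) b = a;    // race

    // Nonblocking assignments sample all right-hand sides before any
    // update, so b always gets the old a (a clean 2-stage pipeline):
    always @(posedge clk) a <= in;
    always @(posedge clk) b <= a;   // deterministic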

Both languages suck for different reasons but no one has figured out how to make a better language and output a netlist from it (yes, there is an open interchange standard that almost every proprietary tool supports).

sweetjuly
I don't disagree that the tools are rough, but to the parent's point: would perfect tools and languages actually solve the underlying problem?

As much as I love FPGAs, GPUs really ate their lunch in the acceleration sphere (trying to leverage the FPGA's parallelism to overcome a >20x clock speed disadvantage is REALLY hard, especially if power is a concern) and so it seems the only niche left for them is circuit emulation. Of course, circuit emulation is a sizable market (low volume designs which don't make sense as ASICs, verification, research, etc.) and so it's not exactly a death sentence.

hardolaf
The FPGA market has been growing in size despite GPGPU taking off. And clock speed difference is closer to 4-5x not 20x. Despite that and the lower area efficiency of FPGAs, there have been price and power competitive FPGA accelerators cards released over the last 5 years. Sure, you're not going to get an A100's performance, but you can get deterministic latency below 5us for something that the A100 would take a minimum of 50us to process. GPGPU isn't ideal for its current use case either so FPGA based designs have a lot of room to work in to get better, application specific accelerators.
JoachimS
The non-deterministic part of the toolchain is not a universal truth. Most, if not all, tools allow you to set and control the seeds, and you can get deterministic results. Tillitis uses this fact to let you verify that the FPGA bitstream used is the exact one you get from the source. Just clone the design repo, install the tkey-builder docker image for the release, and run 'make run-make'. And of course all tools in tkey-builder are open source with known versions, so that you can verify the integrity of the tools.

And all this is due to the actually very good open source toolchain, including synthesis (Yosys), P&R (NextPNR, Trellis etc.), Verilator, Icarus, Surfer and many more. Lattice, being more friendly than other vendors, has seen an uptick in sales because of this. They make money on the devices, not their tools.

And even if you move to ASICs, open source tools are being used more and more, especially for simulation and front-end design. As an ASIC and FPGA designer for 25-odd years, I spend most of my time in open source tools.

https://github.com/tillitis/tillitis-key1 https://github.com/tillitis/tillitis-key1/pkgs/container/tke...

bjourne
I never understood why FPGA vendors think the tools should do this and not the designer. Most do a terrible job of it, too. E.g., Quartus doing place and route in a single thread and then bailing out after X hours/days with a cryptic error message... As a designer I would be much happier telling it exactly where to put my adder, where to place the SRAM, and where to run the wires connecting the ports. You'd build your design by making larger and larger components.
variadix
As I understand it, the physical FPGA layout and timing information used for placement and routing is proprietary, and the vendors don’t want to share it. They’ll let you specify constraints for connections, but it has to go through their opaque solver. And to be fair, they do have to try to solve an NP-complete problem, so the slowness isn’t unjustified compared to all the other slow buggy software people have to deal with nowadays.
JoachimS
The competitiveness between Lattice and Xilinx is also not a universal truth. It totally depends on the application. For small to medium designs, Lattice has very competitive offerings. Hard ARM cores, not so much. Very large designs, not at all. But if you need internal config memory (on some devices), a small footprint etc., Lattice is really a good choice. And then there's support in open source tools to boot.
zxexz
I’m sure the Lattice open source situation is driving sales in a more than substantial way. I’ve definitely spent >$2k in the past five years on Lattice boards and chips (not counting used or otherwise aftermarket). Most people I’ve met with any disposable income and an interest in hardware or DSP have done similar. I know $2k is nothing compared to even a single high-end chip from another vendor, but it sticks. I’ve known more amazing devs than I can count on two hands who ended up actually using Lattice FPGAs in their job (several times changing industries to do so!). I honestly feel that if Lattice embraced open source further, they could be a leading player in the space. Hell, I’m sure they could do that and still find a way to make money on software. Something, something, enterprise.
fake-name
> Synthesis results can vary from run to run on the exact same code with the same parameters, with real world impacts on performance.

This is because some of the challenges in the synthesis/routing process are effectively NP-hard. The compiler therefore uses heuristics and a random process to try to find a valid solution that meets the timing constraints, rather than the best possible solution.

I believe you can control the synthesis seed to make things repeatable, but the stochastic nature of the process means that any change to the input can substantially change the output.

JoachimS
Yes, you can control the seeds and get deterministic bitstreams. Depending on the device and tools, you can also assist the tools by providing floorplanning constraints. And one can of course try out seeds to get designs that meet the results you need. Tillitis uses this to find seeds that generate implementations that meet the timing requirements. It's in their custom tool flow.
kev009
For a while in the 2000s Cisco was one of the biggest users of FPGAs. If you consider how complicated digital designs have been for many decades, and the costs of associated failures, FPGAs can certainly be cost neutral at scale in production lines, especially once you account for risk and reputational damage.

Also, there is, and pretty much always has been, a large gamut of programmable logic: some useful parts cost not much more than a mid-range microcontroller. The top end is for DoD, system emulation, and novel frontier/capture regimes (like "AI" and autonomous vehicles); few people ever work on those compared to the cheaper parts.

duskwuff
FPGAs are still quite common in niche hardware like oscilloscopes or cell towers, where the manufacturer needs some sophisticated hardware capabilities but isn't manufacturing enough units to make the NRE for an ASIC worthwhile.
stephen_g
Also time to market - I have a friend who worked for Alcatel-Lucent, and they would use FPGAs while Nokia would use ASICs. They saw it as a big advantage: if there was a problem in part of the ASIC, or if you needed new features that were outside the original scope, the time and cost to respin was massive compared to fixing problems or implementing new standards in the FPGA bitstream!

Eventually Nokia ended up buying Alcatel-Lucent, and he left not too long after; not sure what their current strategy is.

cubefox
Why not use CPUs instead of FPGAs?
stephen_g
They're not just moving packets around; they're producing streams of data that pass through dozens of stages and end up being spat out into a DAC to produce radio signals (and vice versa: signals come in through an ADC and then pass through many demodulation stages).

All the modulation, demodulation, framing, scrambling, forward error correction encoding/decoding, etc. has to happen continuously, and all at the same time.
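
To give a flavour of one such stage, here is a hedged sketch (module name and polynomial are illustrative): an additive LFSR scrambler that must accept one byte on every clock, forever, with no stalls:

    // Additive scrambler: XOR each data bit with an LFSR sequence.
    // x^15 + x^14 + 1 is a common polynomial; the LFSR is advanced
    // 8 steps per clock so one whole byte is scrambled per cycle.
    module scrambler (
      input  logic       clk,
      input  logic       rst,
      input  logic [7:0] in_byte,
      input  logic       in_valid,
      output logic [7:0] out_byte,
      output logic       out_valid
    );
      logic [14:0] lfsr;
      logic [14:0] s;  // temp, blocking-assigned inside the clocked block

      always_ff @(posedge clk) begin
        if (rst) begin
          lfsr      <= 15'h7FFF;
          out_valid <= 1'b0;
        end else begin
          out_valid <= in_valid;
          if (in_valid) begin
            s = lfsr;
            for (int i = 0; i < 8; i++) begin
              out_byte[i] <= in_byte[i] ^ s[14];
              s = {s[13:0], s[14] ^ s[13]};
            end
            lfsr <= s;
          end
        end
      end
    endmodule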

There are some open source software defined radios that can do that for one or two stations on a CPU at low data rates, but it's basically impossible with current CPUs for anything like the number of stations (phones) that are handled in one FPGA with decent data rates, latency etc.

You'd probably need a server rack's worth of servers and hundreds of times the power consumption to do what's happening in the one chip.

vardump
FPGAs are way superior at running a lot of parallel pipelined logic. They can have SERDES that communicate at gigabits per second. You can build logic that reacts in nanoseconds to external I/O with zero jitter.
razakel
Because whilst a CPU can emulate what you want, it can't do it efficiently.
tverbeure
Why do people use GPUs instead of CPUs?

For some workloads, an FPGA is orders of magnitude faster than a CPU.

petra
> If an application takes off, it is eventually cheaper and more performant to switch to ASICs

That's part of the FPGA business model - they have an automated way to take an FPGA design and turn it into a validated semi-custom ASIC, at low NRE, at silicon nodes (10nm?) you wouldn't have access to otherwise.

And all of that at much lower risk. This is a strong rational but also emotional appeal, and people are highly influenced by that.

duskwuff
Is this still an active thing? My understanding is that both Xilinx and Altera/Intel have effectively discontinued their ASIC programs (Xilinx EasyPath, Altera HardCopy); they aren't available for modern part families.

For what it's worth, Xilinx EasyPath was never actually ASIC. The parts delivered were still FPGAs; they were just FPGAs with a reduced testing program focused on the functionality used by the customer's design.

CamperBob2
I'd be amazed if that were still possible, in fact. Real-world FPGA designs lean heavily on the vendor's proprietary IP, which won't port straight across to ASICs any more than the LUT-based FPGA fabric will.

Anyone who claims to turn a modern FPGA design into an ASIC "automatically" is selling snake oil.

duskwuff
Oh, these programs were always in-house. The offering was essentially "if you pay an up-front fee and give us your FPGA design, we'll sell you some chips that run that design for cheaper than the FPGAs". If there was ever any custom silicon involved - which there may have been for Altera, but probably not for Xilinx - the design files for it were never made available to the customer.
15155
> Real-world FPGA designs lean heavily on the vendor's proprietary IP

No, not always - I use no vendor IP whatsoever for extremely large designs.

For ASICs it's basically required to use fab IP (for physical production/electrical/verification reasons), but that's absolutely not the case for FPGAs.

esseph
It is very possible and many vendors are still doing this. One of them was fairly recently acquired by Cisco.
CamperBob2
Automatic, without substantial NRE as well as active cooperation from brand X or brand A? BS.
15155
> The ability to reprogram an FPGA to implement a new digital circuit in milliseconds would be a game changer for many workloads

Someone has to design each of those reconfigurable digital circuits and take them through an implementation flow.

Only certain problems map well to easy FPGA implementation: anything involving memory access is quite tedious.

JoachimS
The ability to optimize memory access and memory configuration is sometimes a game changer. And modern FPGA tools have functionality to make memory access quite easy. Not as easy as on an MCU/CPU, but basically the same as for an ASIC.

I would also question the premise that memory access is less tedious or easier on MCUs/CPUs, especially if you need deterministic performance and response times. Most CPUs have memory hierarchies.
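
For what it's worth, the inference pattern is the same RTL in both worlds; a minimal sketch below: a synchronous-read memory that FPGA tools map onto block RAM, and that ASIC flows replace with an SRAM macro (in ASICs that step is usually explicit instantiation rather than inference):

    // Single-port RAM with registered read. The synchronous read is
    // what lets synthesis map this onto a block RAM primitive.
    module spram #(parameter DW = 32, AW = 10) (
      input  logic          clk,
      input  logic          we,
      input  logic [AW-1:0] addr,
      input  logic [DW-1:0] wdata,
      output logic [DW-1:0] rdata
    );
      logic [DW-1:0] mem [0:(1<<AW)-1];

      always_ff @(posedge clk) begin
        if (we) mem[addr] <= wdata;
        rdata <= mem[addr];   // returns old data on a same-address write
      end
    endmodule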

The more practical attempts at dynamic, partial reconfiguration involve swapping out accelerators for specific functions: encoders and decoders for different wireless standards, or different curves in crypto, for example. And yes, somebody has to implement those.

15155
> modern FPGA tools have functionality

HLS is not good, so I don't know what you are referring to as "modern." I am primarily experienced with large UltraScale+ and Versal chips - nothing has changed in 15 years here.

> basically the same as for an ASIC

What does this even mean, specifically? Use RTL examples. ASIC memory access isn't "easy," either (though it is basically the "same.")

> partial reconfiguration involves swapping out accelerators for specific functions

Tell me you've never used PR without telling me. Current vendor implementations of this are terrible (with Xilinx leading the pack.)

bigfatkitten
> which is obvious when you see the 4-digit prices of the most sophisticated FPGAs.

6-digit at the high end.

https://www.digikey.com/en/products/detail/amd/XCVU29P-3FSGA...

15155
Nobody has ever paid this price.

These chips are <$3000 new.

stephen_g
I mean, it's not like people producing products with those parts actually pay that in production, though, except for some really tiny volume ones (such as some defence projects).

Companies make products based around FPGAs and can sell the whole thing for less than you could buy just the single FPGA part for on a place like Digi-key. It's just part of the FPGA companies' business models. In volume the price will be far smaller.

bigfatkitten
At that end of the market they cost an astronomical amount of money, no matter what.

The $140,000 device doesn’t become a $400 device in any volume; it might become a $90,000 device.

YakBizzarro
These prices are like airplane prices: no one with volume pays list price; it's something else. Moreover, this FPGA is very peculiar: it's used to simulate ASICs during validation, so it's not really the typical FPGA that gets used in a project.
stephen_g
No, I expect you could get it under $20K with not that much volume, and potentially into the single-digit thousands in high volume. The FPGA vendors' business models are weird; the price breaks are unlike what we see with most other parts.
15155
> The $140,000 device doesn’t become a $400 device in any volume; it might become a $90,000 device.

VU13Ps are quoted $300/ea at tray quantities from Xilinx, yet are $89k on DigiKey with no price breaks.

bluGill
There are a large number of products that will never sell enough to be worth going to an ASIC.
Aromasin
Part of why Lattice Semi has been so successful in recent years is that they've broken the paradigm slightly: their FPGAs are much more cost-effective, while still coming with all of the things we expect of FPGAs. Lots of high-speed IO, half-decent software, and a pretty broad IP portfolio. Something like the Certus-NX comes in at ~$5 at the 17k LUT count, and Avant only ~$550 at the 600k LUT mark. That's a quarter or more off what the equivalent Xilinx or Altera device goes for. There's very little licensing cost too, which makes them really appealing. I see them going into so many designs now because they can scale. You'd have to be making 100Ks+ of boards to justify the ASIC expense when there's a commodity product like this.
tverbeure
At volume, $550 for 600k LUTs is outrageously high and much more than what you’d pay Xilinx or Altera.
Aromasin
Absolutely, but I quoted shelf price. Volume pricing can be anywhere between 30-80% less than that. The Agilex 5 equivalent is ~$900, and much the same for the UltraScale+ equivalent, so Lattice is the most competitive on cost.
15155
And then you have to use Lattice's toolchain, which you couldn't pay me to use.
kvemkon
> The ability to reprogram an FPGA to implement a new digital circuit in milliseconds would be a game changer for many workloads,..

Only 47 milliseconds from power-on to operational.

Lattice Avant™-G FPGA: Boot Up Time Demo (12.12.2023)

https://www.youtube.com/watch?v=s4NUVYyLUxc

Aromasin
The CertusPro-NX has I/O configuration in about 4 ms and full-fabric configuration within 30 ms (for a ~100k logic cell device). The Certus does full-device configuration within ~8 ms.

Lattice make some really cool devices. Not the fastest fmax speeds, but hell if the time to config and tiny power draw don't half make up for it.

pjc50
> Only 47 milliseconds from power-on to operational.

Absolute eternity by modern computer standards. A GPU will be a trillion operations ahead of you before you even start. Or, for another view, that's a whole seven frames at 144Hz.

People say FPGAs will be great for many workloads, but then don't give examples. In my experience the only real ones are those requiring low-latency hardware comms. ADC->FPGA->DAC is a powerful combo. Everything else gets run over by either CPU doing integer work or GPU doing FP.

KeplerBoy
That's completely beside the point. How long does an embedded Linux box need to get its GPU up and ready for number crunching? But yes, FPGAs are best suited for deterministic low-latency stuff.

With the Jetsons (AGX Orin) I have on my desk, it would take a bit of tinkering to even get it under a minute.

JoachimS
You also need to factor in time to market, product lifetime, the need for upgrades, fixes and flexibility, risks, and R&D cost including skillset and NRE when comparing FPGAs and ASICs. Most, basically all, ASICs start out as FPGAs, either in labs or in real products.

Another area where FPGAs are an interesting alternative is security. Open up a fairly competent HSM and you will find FPGAs. FPGAs, especially ones that can be locked to a bitstream - for example anti-fuse or flash-based FPGAs from Microchip - are used in high security systems. The machines can be built in a less secure setting, and the injection, provisioning of a machine can be done in a high security setting.

Dynamically reconfigurable systems were a very interesting idea. Support for partial reconfiguration, which allowed you to change accelerator cores connected to a CPU platform, seemed to bring a lot of promise. Xilinx was an early provider with the XC6200 family, IIRC through a company they bought. AMD also provided devices with support for partial reconfiguration. There were also some research devices and startups around this in the early 2000s. I planned to do a PhD around this topic. But tool and language support and the added cost in the devices seem to have killed it. At least for now.

Today, in for example mobile phone systems, FPGAs provide the compute power CPUs can't, with the added ability to add new features as the standards evolve and regional market requirements affect the HW. But this is more like FW upgrades.

esseph
I know many network vendors that have been selling FPGA driven products for decades, and I have contributed to some of the product development.

ASICs require a certain scale and a very high up-front cost.

SlowTao
There has been an idea for a very long time of building an OS and software that targets FPGA systems so it can dynamically change its function for each task. The idea being that it would potentially be faster than a general purpose processor.

Still practically theory, as I have never seen anything come of it. It is going up against ASIC design, which is a great middle ground for those things, even if it means you are not free to do it yourself.

artiscode
The military loves FPGAs. They can do what ASICs can, but without involving extra people.
15155
Except analog (save for very recently with devices such as Xilinx RFSoC).
mrheosuper
> They are a product that is too expensive at scale. If an application takes off, it is eventually cheaper and more performant to switch to ASICs

Isn't that the same thing? Too expensive to scale, so you switch to an ASIC?
maxdamantus
> Too expensive to scale, so you switch to ASIC ?

I think it's not so much that it's too expensive, but that once you've got the resources it will always be better to switch to an ASIC.

Not a hardware engineer, but it seems obvious to me that any circuitry implemented using an FPGA will be physically bigger, with more "wiring" (more resistance, more energy, more heat), than the equivalent ASIC, and accordingly the tolerances will need to be larger, so clock speeds will be lower.

Basically, at scale an ASIC will always win out over an FPGA, unless your application is basically "give the user an FPGA" (but this is begging the question—unless your users are hardware engineers this can't be a goal).

esseph
ASICs require scale that doesn't always make sense. Many things using FPGAs aren't necessarily mass-market / consumer devices.
geerlingguy
One area I see almost exclusively FPGA designs is for high power broadcast equipment like transmitters and exciters. The polar opposite of mass-market, and the price is high enough the FPGA is just one of many expensive components.
esseph
You'll also find them in tons and tons of ISP equipment. Radios, optical gear, QoE equipment, etc.
mrheosuper
Better in what? Performance? Maybe. Profit? That heavily depends on your market.
maxdamantus
Yes, performance (per watt, or per mass of silicon).

Profit is dependent on scale. FPGAs are useful if the scale is so small that an ASIC production line is more expensive than buying a couple of FPGAs.

If the scale is large enough that ASIC production is cheaper, you reap the performance improvements.

Think of it this way: FPGAs are programmed using ASIC circuitry. If you programmed an FPGA using an FPGA (using ASIC circuitry), do you think you'll achieve the same performance as the underlying FPGA? Of course not (assuming you're not cheating with some "identity" compilation). Same thing applies with any other ASIC.

Each layer of FPGA abstraction incurs a cost: more silicon/circuitry/resistance/heat/energy and lower clock speeds.

JoachimS
Yes, profit depends on scale. But far from everything sells in millions of units, and scale is not everything. Mobile base stations sell in thousands and sometimes benefit from ASICs. But the ability to adapt the base station to regional requirements and to support several generations of systems with one design makes FPGAs very attractive. So in this case, the scale makes FPGAs a better fit.
maxdamantus
With a 90% to 95% reduction in performance [0], I'd be interested to know when these "generational" upgrades are worth the hit, since it seems like you're already going back a few generations.

I'll admit I'm not familiar with the processing requirements of base stations, but the prospect of mass-produced FPGA baseband hardware still seems dubious to me, and I can't find conclusive evidence of it being used, only suggestions that it might be useful (going back at least 20 years). Feel free to share more info.

[0] ASIC vs FPGA comparison of a RISC-V processor, showing an 18x slowdown (a ~94.4% reduction), apparently consistent with the "general design performance gap": https://iugrc.journals.ekb.eg/article_302717_7bac60ca6ef9fb9...

fecal_henge
You see 6-digits on some prices.

https://www.digikey.co.uk/short/5pz3nnfj

imtringued
FPGA vs ASIC is a boring and tired comparison. Yeah, obviously: for a fixed configuration, an ASIC is basically just an FPGA without any of the parts that make an FPGA programmable.

If you don't need programmability, then all that flexibility represents pure waste. But then again, we can make the same argument with ASIC vs CPUs and GPUs. The ASIC always wins, because CPUs and GPUs come with unnecessary flexibility.

The real problem with FPGAs isn't even that they get beaten by ASICs, because you can always come up with a low volume market for them, especially as modern process nodes get more and more expensive. Bleeding edge FPGAs are becoming more and more viable: you can now have FPGAs on 7nm with better performance than ASICs on the older but more affordable process nodes that fit your budget.

The real problem is that the vast majority of FPGA manufacturers don't even play the same game as GPUs and CPUs. You can have fast single and double precision floats on a CPU and really, really fast single precision floats on a GPU, but on FPGAs? Those are reserved for the elite Versal series (or Intel's equivalent). Every other FPGA manufacturer? Fixed point arithmetic, plus bfloat16 if you are lucky.

Now let me tell you: for AI this doesn't really matter. The FPGAs that do AI focus primarily on supporting a truckload of simultaneous camera inputs. There is no real competition here. No CPU or GPU will let you connect as many cameras as an FPGA, unless it's an SoC specifically built for VR headsets.

Meanwhile, for everything else, not having single precision floats is a curse. Porting an algorithm from floating point to fixed point arithmetic is non-trivial and requires extensive engineering effort. You not only need to know how to work with hardware, but also need to understand the algorithm in its entirety and all the numerical consequences that entails. You go from dropping someone's algorithm into your code and having it work from the get-go, to needing to understand every single line and having it break anyway.
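
A tiny sketch of what that porting entails (the Q-format and names are illustrative): even a single multiply forces explicit scaling, rounding and overflow decisions that floating point makes for you:

    typedef logic signed [15:0] q15_t;  // Q1.15: 1 sign bit, 15 fraction bits

    // Q1.15 x Q1.15 -> Q1.15 with round-to-nearest. Hidden decisions:
    // the rounding mode, and the fact that (-1.0) * (-1.0) overflows
    // and would need saturation logic on top of this.
    function automatic q15_t q15_mul(input q15_t a, input q15_t b);
      logic signed [31:0] p;
      p = a * b;                               // Q2.30 full product
      return q15_t'((p + 32'sd16384) >>> 15);  // rescale with rounding
    endfunction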

These problems aren't impossible to fix, but they are guaranteed to go away the very instant you get your hands on floating point arithmetic. This leads to a paradox. FPGAs are extremely flexible, but simultaneously extremely constricting. The appeal is lost.

15155
Floating point arithmetic isn't a "basic element of logic" and likely never will be in the FPGA world: floating point multipliers take up a lot of area and require specific binary implementation details.
checker659
I think FPGAs (or CGRAs really) will make a comeback once LLMs can directly generate FPGA bitstreams.
throwawayabcdef
No need. I gave ChatGPT this prompt: "Write a data mover in Xilinx HLS with Vitis flow that takes in a stream of bytes, swaps pairs of bytes, then streams the bytes out"

And it did a good job. The code it made probably works fine and will run on most Xilinx FPGAs.
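
For scale, a hand-written RTL equivalent is also small (a sketch, assuming the stream is viewed as 16-bit beats of one byte pair each, and ignoring backpressure and tlast, which real AXI-Stream code must handle):

    module pair_swap (
      input  logic        clk,
      input  logic [15:0] in_data,   // {first byte, second byte}
      input  logic        in_valid,
      output logic [15:0] out_data,  // {second byte, first byte}
      output logic        out_valid
    );
      // The swap itself is just a wiring permutation plus a register
      always_ff @(posedge clk) begin
        out_valid <= in_valid;
        out_data  <= {in_data[7:0], in_data[15:8]};
      end
    endmodule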

pjc50
> The code it made probably works fine

Solve your silicon verification workflow with this one weird trick: "looks good to me"!

throwawayabcdef
It's how I saved cost and schedule on this project.
ben_w
I don't even work in hardware, and yet even I have still heard of the Pentium FDIV bug, which happened despite people looking a lot more closely than "probably works fine".
15155
What does "directly generate FPGA bitstreams" mean?

Placement and routing is an NP-Complete problem.

duskwuff
And I certainly can't imagine how a language model would be of any use here, in a problem which doesn't involve language.
15155
They are "okay" at generating RTL, but are likely never going to be able to generate actual bitstreams without some classical implementation flow in there.
buildbot
I think in theory, given terabytes of bitstreams, you might be able to get an LLM to output valid designs. Excepting hardened IP blocks, a bitstream is literally a sequence of SRAM configuration bits that set the routing tables and LUTs. Given the right type of positional encoding, I think you could maybe get simple designs working at a small scale.
checker659
AI could use EDA tools
imtringued
AMD's FPGAs already come with AI engines.
