Is there any significant speed-up in cracking an RSA key (either by brute force or by factoring with the general number field sieve) using a GPU or FPGA compared to a CPU? If there is a speed advantage, are there any figures on how much it is?
-
The question is odd, as in: why ask it? I think that's why people are confused. A single 4U server with an FPGA backplane can replace an entire datacenter of CPUs, somewhere around 10,000 to 1. Start looking at OpenCL and the password haze project. That combined with clustered resources is just plain win. – Christopher Apr 12 '16 at 11:34
-
So... you're asking for figures about the speed gain, but aren't willing to provide any info on either hardware or software? In general: tasks that can be executed in parallel are usually faster on a GPU than on a CPU, due to the higher number of cores. So yes, there's most likely a speed advantage. For the rest of the question: good luck getting any answer with the info you provided. – Apr 12 '16 at 13:13
3 Answers
FPGAs are not like CPUs or GPUs, and cannot be compared like that. Your question lacks sufficient detail to provide a meaningful answer. FPGAs come in wildly different sizes and offer parallelism limited only by the logic resources of the FPGA. A giant €100,000 FPGA will have far more logic resources than a €1 FPGA. There are no €100,000 GPUs or CPUs.
You must include other metrics to make a meaningful performance comparison, such as investment cost, power usage, implementation effort, and so on.
If you want a hint of high-end FPGA capability: some of the largest FPGAs available can do up to 5-10 trillion multiplications per second.
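To put that rate in perspective, here is a back-of-envelope sketch (my own illustration; the throughput figure is the one quoted above, everything else is an assumption for scale) of why raw multiplication throughput alone still doesn't make naive brute-force factoring feasible:

```python
# Back-of-envelope sketch: even the multiplication rate quoted above is
# hopeless against naive factoring. All numbers are illustrative assumptions.
ops_per_second = 10e12                   # optimistic high-end FPGA throughput
key_bits = 1024                          # RSA modulus size
trial_divisions = 2 ** (key_bits // 2)   # naive trial division up to sqrt(N)

seconds = trial_divisions / ops_per_second
years = seconds / (3600 * 24 * 365)
print(f"~{years:.3e} years")             # roughly 4e133 years
```

That's why the interesting comparisons are always about sieving algorithms, never exhaustive search.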
I believe there are too many variables here to give a precise answer. As far as I know, it depends on:
- the cracking software and/or the algorithm used
- the number of cores in the GPU
- the architecture of the GPU
- the architecture of the CPU
- the number of bits in the RSA key
-
Points 1 and 5 are irrelevant. And I'm not asking for a precise answer, just a rough figure so that there's something to start with, which means points 2 to 4 are also irrelevant. I wish I could downvote this answer. – user78228 Jun 09 '15 at 10:35
-
I'm sure the question is clear about what I'm looking for. And regardless of whether the figure is a benchmark or something else, your answer is still irrelevant because any comparison is _supposed_ to use the same algorithm with the same number of bits; otherwise it's like comparing apples to oranges and the whole thing becomes pointless. And if you aren't accounting for points 2 to 4 when interpreting the figures, that's also pointless. They are prerequisites for any figures, so when I said that, I already meant that they are accounted for. – user78228 Jun 09 '15 at 11:24
I don't know why people were so harsh, when obviously "a" would still mean comparable market positions (whether you meant "the best one money can buy", the best-selling consumer card, or just whatever $1,000 could bring home).
With GPUs or even FPGAs, the problem is still the same at the end of the day (at least until some new progress is made in integer factorization theory): you want to use sieves, not brutish algorithms that are as simple as they are linear. And for a key big enough to really make you feel constrained by a normal processor, the downside is that sieves take an increasingly large amount of memory, which in turn comes in ample, scalable sizes only for CPUs.
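To make the memory point concrete, here is a toy sketch (my own illustration, with deliberately tiny, made-up parameters) of sieve-style relation collection: every smooth relation and its exponent vector has to be kept around for the later linear-algebra step, and that storage is what grows with key size:

```python
# Toy sketch of sieve-style relation collection: why sieving is memory-hungry.
# N here is tiny; real targets are 1024+ bits and nothing below would scale.
from math import isqrt

N = 8051                      # toy composite (83 * 97)
B = 30                        # smoothness bound for the factor base
primes = [p for p in range(2, B) if all(p % q for q in range(2, p))]

relations = []                # each relation keeps a full exponent vector
base = isqrt(N) + 1
for x in range(base, base + 1000):    # the sieve interval also lives in memory
    val = x * x - N
    vec, rem = [], val
    for p in primes:
        e = 0
        while rem % p == 0:
            rem //= p
            e += 1
        vec.append(e)
    if rem == 1:              # val is B-smooth: store (x, exponent vector)
        relations.append((x, vec))

print(len(relations), "smooth relations collected")
# For real key sizes the factor base has millions of primes, and the whole
# relation matrix is needed at once, which is why abundant, scalable memory
# (a CPU-host thing) becomes the wall long before raw arithmetic does.
```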
When factoring 1024-bit keys enters the realm of sub-million-dollar hardware, perhaps you may even see dedicated circuitry (i.e. the premium you pay to design a whole new chip becomes worth the gains, and you won't throw away the device after just a few runs because you'd only be able to crack something like 3 keys every two hardware refresh cycles). But without big juicy fruits in the near future, you can see why that's not really a thing.
Anyhow, for the moment, I think the best speedup is coupling GPUs with CPUs.
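As a concrete illustration of what the GPU part actually accelerates (my own toy sketch, with made-up sizes): the NFS linear-algebra step is dominated by sparse matrix-vector products over GF(2), which parallelize well but must keep the whole matrix in device memory:

```python
# Toy GF(2) sparse matrix-vector product, the kernel that dominates the
# NFS linear-algebra step. Sizes are made up; real NFS matrices have
# hundreds of millions of rows, which is where GPU memory becomes the wall.
import random

n_rows, n_cols = 1000, 1000
# Sparse matrix stored as per-row column indices (all entries are 1 in GF(2)).
rows = [random.sample(range(n_cols), 20) for _ in range(n_rows)]

# Pack the vector into one big integer so each bit is a GF(2) coordinate.
v = random.getrandbits(n_cols)

def spmv_gf2(rows, v):
    out = 0
    for i, cols in enumerate(rows):
        bit = 0
        for c in cols:
            bit ^= (v >> c) & 1          # GF(2) addition is XOR
        out |= bit << i
    return out

result = spmv_gf2(rows, v)
# Block Wiedemann / block Lanczos iterate this product millions of times;
# a GPU only helps while the whole matrix fits in its (smaller) memory.
```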
-
Because there is not really a single answer? My first link mentions a single GPU being between 4 and 8 times faster "for a number of tested NFS matrices compared to an optimized multi-core implementation". And on the first page of my last one, they seem relatively happy with professional cards sporting 32 or 48 GB of VRAM. But open the last comments and you see that using a meagre GTX 1660 actually *slows* you down. As I said: there's only so much you can throw at them before getting memory-limited. The only thing I wish I could have added is TWIRL, but I don't feel educated enough. – mirh Jan 25 '23 at 23:27