<?xml version='1.0' encoding='UTF-8'?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" version="2.0">
  <channel>
    <title>The overblown frequency vs cost efficiency trade-off - comments</title>
    <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comments</link>
    <description>Comments on "The overblown frequency vs cost efficiency trade-off" by Yossi Kreinin</description>
    <docs>http://www.rssboard.org/rss-specification</docs>
    <generator>Yossi Kreinin's ugly publishing software</generator>
    <image>
      <url>https://yosefk.com/blog/self.jpg</url>
      <title>The overblown frequency vs cost efficiency trade-off - comments</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comments</link>
      <width>144</width>
      <height>144</height>
    </image>
    <language>en</language>
    <lastBuildDate>Thu, 25 Jun 2026 08:00:08 +0000</lastBuildDate>
    <item>
      <title>Alex Orange</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-3152</link>
      <description><![CDATA[<html><head>
</head><body><p>P.S. By IOPs I meant integer operations per second. I just realized that
IOPS normally stands for I/O operations per second, not integer ops/second.</p>
]]></description>
      <pubDate>Fri, 27 Oct 2017 19:50:00 +0000</pubDate>
      <dc:creator>Alex Orange</dc:creator>
    </item>
    <item>
      <title>Alex Orange</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-3151</link>
      <description><![CDATA[<html><head>
</head><body><p>You seem to be confusing the best rate to run a given circuit at with
the most efficient circuit. If you want to get from point A to point B
with a car and your choices are a Honda Civic or a McLaren F1, the Civic
is certainly going to get you there with less gas, but it can't go as
fast as the F1. The Civic will have an optimal speed and, as in your
argument about circuits, up to a certain point higher speed will
give you higher efficiency. The F1's maximum-efficiency speed will
likely be higher than the Civic's, but its
efficiency will almost certainly be lower because its engine is much larger
than the Civic's.</p>
<p>Similarly with circuits, a simple ripple carry adder is going to be
excruciatingly slow, but also likely the lowest energy per add. A
Kogge-Stone adder is going to be several times faster but will take up
something like 5-6x the area and 5-6x the energy per operation. This is
all about the architecture of the circuit (where to use an
AND/NOR/NOT/XOR/etc. gate). If you change the circuit type to something
like dynamic gates you can speed things up some more, but again at the cost of
more energy. Almost universally, anything that you do in a given process
to speed up an operation will burn more energy, unless the original
circuit was horribly designed (which production circuits aren't).</p>
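<p>To make the adder comparison concrete, here is a toy bit-level sketch of both
architectures. It assumes a carry-in of zero and counts only carry/prefix cells
and stages as a crude stand-in for area and delay; the function names are made
up for illustration, and it says nothing about real gate sizing, wiring or
energy.</p>

```python
def to_bits(x, n):
    """Little-endian list of n booleans for the non-negative integer x."""
    return [(x // 2 ** i) % 2 == 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer from little-endian booleans."""
    return sum(2 ** i for i, bit in enumerate(bits) if bit)


def ripple_carry_add(a, b, n):
    """Return (sum mod 2**n, carry stages on the critical path, carry cells)."""
    carry, out = False, []
    for xi, yi in zip(to_bits(a, n), to_bits(b, n)):
        p = xi != yi                         # propagate (XOR)
        out.append(p != carry)               # sum bit uses the incoming carry
        carry = (xi and yi) or (p and carry)
    return from_bits(out), n, n              # one carry cell per bit, in series


def kogge_stone_add(a, b, n):
    """Return (sum mod 2**n, prefix stages, prefix cells)."""
    x, y = to_bits(a, n), to_bits(b, n)
    p0 = [xi != yi for xi, yi in zip(x, y)]  # per-bit propagate
    g = [xi and yi for xi, yi in zip(x, y)]  # per-bit generate
    p = p0[:]
    stages = (n - 1).bit_length()            # ceil(log2(n))
    cells = 0
    for s in range(stages):
        d = 2 ** s
        new_g, new_p = g[:], p[:]
        for i in range(d, n):                # combine with the span d bits below
            new_g[i] = g[i] or (p[i] and g[i - d])
            new_p[i] = p[i] and p[i - d]
            cells += 1
        g, p = new_g, new_p
    carries_in = [False] + g[:-1]            # carry into bit i is the carry out of bit i-1
    out = [pi != ci for pi, ci in zip(p0, carries_in)]
    return from_bits(out), stages, cells


print(ripple_carry_add(0xFFFF, 1, 32))
print(kogge_stone_add(0xFFFF, 1, 32))
```

<p>For 32 bits this model gives 32 serial carry stages and 32 carry cells for
the ripple-carry adder, against 5 stages and 129 prefix cells for Kogge-Stone:
far fewer stages, paid for with more logic.</p>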
<p>By horribly designed I mean absolute mistakes like not using minimum
length gates or building very area-inefficient gates. The differences
between what's going on inside a CPU and a GPU other than process are
going to be architecture and circuit type, not layout. Likely both are
going to use custom layouts. The reason GPUs are "slower" is that
their computations are MUCH more parallel than a CPU's. Therefore they
measure their performance in total GFLOPs, whereas a CPU measures its
performance in serial GFLOPs or, more often, serial IOPs. CPU arithmetic circuits
are therefore larger even taking speed into account, whereas GPUs are
tuned to fit as many operations/second as possible into a given area.</p>
<p>So, in conclusion, your statement of "I've often read arguments that
computing circuitry running at a high frequency is inefficient,
power-wise or silicon area-wise or both." would be better phrased as
"...computing circuitry <em>capable of</em> running at a high frequency...",
in which case the statement that such circuits are power- and area-inefficient
is absolutely true.</p>
]]></description>
      <pubDate>Fri, 27 Oct 2017 19:47:00 +0000</pubDate>
      <dc:creator>Alex Orange</dc:creator>
    </item>
    <item>
      <title>Yossi Kreinin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2950</link>
      <description><![CDATA[<html><head>
</head><body><p>Both of these are true to some extent, though the SoCs of the last
decade have far less communication overhead than, say, the CPU/GPU
desktop setup that is always mentioned in these cases, and powering
up/down probably doesn't take much more than ~1ms (though of course
it might destroy some state that then needs reinitialization, and
there might be other costs).</p>
]]></description>
      <pubDate>Sat, 06 Feb 2016 09:25:00 +0000</pubDate>
      <dc:creator>Yossi Kreinin</dc:creator>
    </item>
    <item>
      <title>Johan Ouwerkerk</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2949</link>
      <description><![CDATA[<html><head>
</head><body><p>There's also the fact that a lot of this hardware tends to start out
as a simple 'slave' device to a master CPU. So the bottleneck is going
to be I/O between the two "domains", and a 'naive' version of your faster
accelerator mostly burns the extra cycles waiting for I/O to
complete.</p>
<p>Also, there's the fact that powering things down to lower clock
speed/sleep mode and back up is not a free lunch either. So your higher
clock speeds must be so much higher that this overhead in current draw
and time is compensated for by the correspondingly greater time spent in
low(er) power mode(s).</p>
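<p>A rough break-even sketch of that second point, under a deliberately simple
model: a fixed always-on (static) power while the block is powered, the same
switching energy per cycle at either clock (so the same voltage), and the cost
of powering down and up folded into a single transition energy and time. All
names and numbers are hypothetical, chosen only to show the shape of the
trade-off.</p>

```python
def energy_stay_on(static_w, cycle_j, cycles, period_s):
    """Run just fast enough to finish within period_s, never powering down."""
    return static_w * period_s + cycle_j * cycles


def energy_race_to_sleep(static_w, cycle_j, cycles, period_s, freq_hz,
                         sleep_w, transition_j, transition_s):
    """Run at freq_hz, then power down for whatever is left of period_s."""
    busy_s = cycles / freq_hz
    sleep_s = max(0.0, period_s - busy_s - transition_s)
    return (static_w * busy_s + cycle_j * cycles
            + transition_j + sleep_w * sleep_s)


# Hypothetical workload: 1e6 cycles every 10 ms, 50 mW static power while on,
# 0.1 nJ of switching energy per cycle, 1 mW in sleep mode.
stay_on = energy_stay_on(0.05, 1e-10, 1e6, 0.01)
cheap_wakeup = energy_race_to_sleep(0.05, 1e-10, 1e6, 0.01, 200e6,
                                    1e-3, 5e-5, 1e-3)
costly_wakeup = energy_race_to_sleep(0.05, 1e-10, 1e6, 0.01, 200e6,
                                     1e-3, 3e-4, 1e-3)

print(f"stay on:          {stay_on * 1e6:.0f} uJ")
print(f"sleep, cheap:     {cheap_wakeup * 1e6:.0f} uJ")
print(f"sleep, expensive: {costly_wakeup * 1e6:.0f} uJ")
```

<p>With the cheap transition, finishing early and sleeping uses less energy
than staying on; with the expensive one it uses more, even though the clock
and the workload are identical.</p>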
]]></description>
      <pubDate>Sat, 06 Feb 2016 05:09:00 +0000</pubDate>
      <dc:creator>Johan Ouwerkerk</dc:creator>
    </item>
    <item>
      <title>Yossi Kreinin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2944</link>
      <description><![CDATA[<html><head>
</head><body><p>Interesting! I updated the article. (I hope I got it right; I find
it's really easy to be stupid about the simple things – forget a 2x here
or a 10x there...)</p>
]]></description>
      <pubDate>Mon, 01 Feb 2016 09:35:00 +0000</pubDate>
      <dc:creator>Yossi Kreinin</dc:creator>
    </item>
    <item>
      <title>Norman Yarvin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2943</link>
      <description><![CDATA[<html><head>
</head><body><p>Yes, "super-linear" was what I meant — or, well, I took it for
granted that the question was switching losses per amount of work done,
in which case it's a simple increase. As for hard numbers, I didn't have
any in my head, but a search finds this report on some explorations that
Intel did where they were able to make a Pentium that could run at as
little as 2 milliwatts (though only at 3 MHz; the optimum was at more
like 17 milliwatts and 100 MHz):</p>
<p><a href="http://www.realworldtech.com/near-threshold-voltage/" rel="nofollow">http://www.realworldtech.com/near-threshold-voltage/</a></p>
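<p>For what it's worth, here is the arithmetic behind calling the second
operating point the optimum, using only the two approximate figures quoted
above and assuming they are total power at each clock and that a cycle does
the same amount of work at both:</p>

```python
def energy_per_cycle(power_watts, freq_hz):
    """Average energy spent per clock cycle, in joules."""
    return power_watts / freq_hz


# Operating points quoted in the comment above (both approximate).
slowest = energy_per_cycle(2e-3, 3e6)      # 2 mW at 3 MHz
optimum = energy_per_cycle(17e-3, 100e6)   # 17 mW at 100 MHz

print(f"3 MHz:   {slowest * 1e9:.2f} nJ/cycle")
print(f"100 MHz: {optimum * 1e9:.2f} nJ/cycle")
print(f"ratio:   {slowest / optimum:.1f}x")
```

<p>That works out to roughly 0.67 nJ per cycle at 3 MHz against 0.17 nJ per
cycle at 100 MHz, so the lowest-power point costs about 3.9x more energy per
cycle than the optimum.</p>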
]]></description>
      <pubDate>Sun, 31 Jan 2016 20:47:00 +0000</pubDate>
      <dc:creator>Norman Yarvin</dc:creator>
    </item>
    <item>
      <title>Yossi Kreinin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2942</link>
      <description><![CDATA[<html><head>
</head><body><p>Yeah – maybe I should have said plainly that accelerators
<em>accelerate</em>, even if it's 50x instead of 100x; that's kinda what
I meant by my vague "other architectural improvements." That's why it
makes sense to leave that last, hard 2x for the next time.</p>
]]></description>
      <pubDate>Sun, 31 Jan 2016 11:40:00 +0000</pubDate>
      <dc:creator>Yossi Kreinin</dc:creator>
    </item>
    <item>
      <title>Dan Luu</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2941</link>
      <description><![CDATA[<html><head>
</head><body><p>&gt; So AFAIK this is why so many embedded accelerators had crummy
frequencies when they started out (and they also had apologists
explaining why it was a good thing). And that's why some of the
accelerators caught up – basically it was never a technical limitation
but an economic problem of where to spend effort, and changing
circumstances caused effort to be invested into improving frequency.</p>
<p>This also matches my experience with non-embedded accelerators. If
you're looking at (just for example) a 100x speedup, it's not so bad to
target a less aggressive clock rate and take a 50x speedup with v1,
which sharply reduces risk and eases schedule pressure. If that works
out, then pull out all the stops for v2 or even v3.</p>
]]></description>
      <pubDate>Sun, 31 Jan 2016 09:03:00 +0000</pubDate>
      <dc:creator>Dan Luu</dc:creator>
    </item>
    <item>
      <title>Yossi Kreinin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2940</link>
      <description><![CDATA[<html><head>
</head><body><p>One more thing is, if you're feeding off a battery and/or have
trouble dissipating heat, it's beneficial to lower your frequency as
much as you can lower it without the throughput falling below the
threshold of acceptability – even if you <em>can't</em> also lower the
voltage. That way, you get linear gains in switching power instead of
super-linear, but in absolute terms, battery life is up and heat is
down. This wouldn't be so if processors were powered down every time
they finish the current bulk of work, but they aren't – in practice,
waiting for the user involves a lot of non-productive switching activity
and you save energy by doing this stuff slower.</p>
<p>The upshot is that we should see some processors in the field
lowering their frequency to a much lower level than they would if all
they pursued was a lower voltage.</p>
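<p>A back-of-the-envelope sketch of the linear gain, assuming a core that is
never powered down or clock-gated, so that every cycle costs the same
switching energy whether or not it does useful work. The session length,
workload and energy per cycle are made-up numbers.</p>

```python
def session_energy(cycle_j, freq_hz, session_s):
    """Switching energy over a session when the clock never stops."""
    return cycle_j * freq_hz * session_s


def lowest_acceptable_freq(required_cycles, session_s):
    """Lowest clock that still completes the required work in the session."""
    return required_cycles / session_s


CYCLE_J = 1e-10     # hypothetical switching energy per cycle at a fixed voltage
SESSION_S = 60.0    # hypothetical one-minute interactive session
REQUIRED = 6e9      # hypothetical cycles of useful work in that session

f_min = lowest_acceptable_freq(REQUIRED, SESSION_S)
e_fast = session_energy(CYCLE_J, 1e9, SESSION_S)
e_slow = session_energy(CYCLE_J, f_min, SESSION_S)

print(f"lowest acceptable clock: {f_min / 1e6:.0f} MHz")
print(f"energy at 1 GHz:         {e_fast:.1f} J")
print(f"energy at lowest clock:  {e_slow:.1f} J")
```

<p>In this model a 10x lower clock gives 10x less switching energy for the
same useful work, with no voltage change at all.</p>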
]]></description>
      <pubDate>Sun, 31 Jan 2016 08:44:00 +0000</pubDate>
      <dc:creator>Yossi Kreinin</dc:creator>
    </item>
    <item>
      <title>Yossi Kreinin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2939</link>
      <description><![CDATA[<html><head>
</head><body><p>I guess you mean that's where some of the <em>super-linear</em> (and so
cost-inefficient) increase in switching losses comes from. If you
increase the frequency and keep the voltage, switching costs per unit
of time <em>also</em> increase, but in proportion to the
amount of work done per unit of time, so it's neutral
efficiency-wise.</p>
<p>And still – (1) at what frequency does it typically become necessary
to increase the voltage, and (2) how much less cost-efficient is the
circuit because of being able to reach a higher frequency at a higher
voltage? AFAIK the answer to (1) is "pretty high" and the answer to (2)
is "not much." Even when the answer to (1) is "pretty low", it means
that you could beneficially make your circuit work at a higher frequency
for those times where it's needed without losing much cost efficiency,
and you chose not to do it because there wasn't much to gain by speeding
up those rare/non-existent bursts of extraordinarily intensive, urgent
work. So my main point would remain, namely, if your peak supported
frequency is pretty low, it's not because supporting a higher peak
frequency would result in a worse design, but because it was
uneconomical given your schedule, development budget and use case. If
design effort were free and everything else were kept constant, you'd
probably do it.</p>
<p>But it is interesting how with all that said, essentially to the
extent that you can lower power dissipation by lowering the frequency
and voltage, you're trading silicon area for power and these are two
pretty different variables (they're costs paid at different times and
circumstances.) So I wonder how pronounced this effect is if you plot it
– how low can you go frequency-wise and still gain something (I never
experimented very much with it for various reasons – I probably would if
I were in the cellphone processor market, for instance.)</p>
]]></description>
      <pubDate>Sun, 31 Jan 2016 08:15:00 +0000</pubDate>
      <dc:creator>Yossi Kreinin</dc:creator>
    </item>
    <item>
      <title>Norman Yarvin</title>
      <link>https://yosefk.com/cgi-bin/comments.cgi?post=blog/the-overblown-frequency-vs-cost-efficiency-trade-off#comment-2938</link>
      <description><![CDATA[<p>To get the circuit to work at a higher frequency, you often have to
increase the voltage. That's where the increased switching losses come
from; for those, power goes as voltage squared. Increasing the voltage
also increases leakage losses, but I'm not sure how those scale.</p>
<p>Many CPUs these days do actually change their voltage as they change
their frequency, and for exactly this reason. Transmeta, I believe,
pioneered this; although they're defunct, others have picked it up.</p>
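<p>A small sketch of the voltage-squared scaling mentioned above, assuming the
usual textbook model for switching power, activity * C * V^2 * f. The
capacitance and the operating points are illustrative values, not
measurements, and leakage is left out entirely.</p>

```python
def switching_power(c_farads, volts, hertz, activity=1.0):
    """Dynamic (switching) power in watts: activity * C * V^2 * f."""
    return activity * c_farads * volts ** 2 * hertz


def switching_energy_per_cycle(c_farads, volts, activity=1.0):
    """Switching energy per clock cycle in joules: activity * C * V^2."""
    return activity * c_farads * volts ** 2


C = 1e-9  # hypothetical switched capacitance, in farads

base = switching_power(C, 1.0, 1e9)              # 1.0 V at 1 GHz
faster_same_v = switching_power(C, 1.0, 2e9)     # 2 GHz, voltage unchanged
faster_higher_v = switching_power(C, 1.2, 2e9)   # 2 GHz, needing 1.2 V

print(f"2x clock, same voltage: {faster_same_v / base:.2f}x power")
print(f"2x clock, 1.2x voltage: {faster_higher_v / base:.2f}x power")
print(f"energy per cycle at 1.2 V vs 1.0 V: "
      f"{switching_energy_per_cycle(C, 1.2) / switching_energy_per_cycle(C, 1.0):.2f}x")
```

<p>Doubling the clock at the same voltage doubles the power but leaves the
energy per cycle unchanged; if the doubling needs 20% more voltage, power goes
up about 2.88x and every cycle costs about 1.44x more.</p>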
]]></description>
      <pubDate>Sun, 31 Jan 2016 04:19:00 +0000</pubDate>
      <dc:creator>Norman Yarvin</dc:creator>
    </item>
  </channel>
</rss>
