<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Proper Fixation</title>
	<atom:link href="http://www.yosefk.com/blog/feed" rel="self" type="application/rss+xml" />
	<link>http://www.yosefk.com/blog</link>
	<description>a substitute for anaesthesia</description>
	<pubDate>Fri, 11 May 2012 16:53:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Hardware macroarchitecture vs microarchitecture</title>
		<link>http://www.yosefk.com/blog/hardware-macroarchitecture-vs-mircoarchitecture.html</link>
		<comments>http://www.yosefk.com/blog/hardware-macroarchitecture-vs-mircoarchitecture.html#comments</comments>
		<pubDate>Fri, 11 May 2012 16:47:23 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[hardware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=175</guid>
		<description><![CDATA[I'll use examples from diverse types of hardware - CPUs, DSPs, GPUs, FPGAs, CAPPs, and even DRAM controllers. We'll discuss some example problems and how they can be solved at the macro or micro level.]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://semipublic.comp-arch.net/wiki/Computer_Architecture">comp-arch.net wiki</a> defines &#8220;computer architecture&#8221; as the union of two things:</p>
<ul>
<li><strong>Macroarchitecture </strong>- the &#8220;visible&#8221; parts, the contract between hardware and software. For example, branch instructions.</li>
<li><strong><strong>Microarchitecture</strong> </strong>- the &#8220;invisible&#8221; parts, the implementation strategy which affects performance but not semantics. For example, branch prediction.</li>
</ul>
<p>I think this distinction is very interesting for three reasons:</p>
<ol>
<li>It pops up everywhere. That is, many hardware problems can be addressed at the macro level or the micro level, explicitly or implicitly.</li>
<li>The choice of macro vs micro is rarely trivial - for most problems, there are real-world examples of both kinds of solutions.</li>
<li>The choice has common consequences across problems. The benefits and drawbacks of macro and micro are frequently similar.</li>
</ol>
<p>I&#8217;ll use examples from diverse types of hardware - CPUs, DSPs, GPUs, FPGAs, CAPPs, and even DRAM controllers. We&#8217;ll discuss some example problems and how they can be solved at the macro or micro level. I&#8217;ll leave the discussion of the resulting trade-offs to separate write-ups. Here, we&#8217;ll go through examples to see what practical macro and micro solutions to different problems look like.</p>
<p>Our examples are:</p>
<ul>
<li><strong>Data parallelism</strong>: SIMD vs SIMT</li>
<li><strong><strong>Multiple issue</strong></strong>: VLIW vs superscalar</li>
<li><strong><strong>Running ahead</strong></strong>: exposed latencies vs OOO</li>
<li><strong><strong>Local storage</strong></strong>: bare RAM vs cache</li>
<li><strong><strong>Streaming access</strong></strong>: DMA vs hardware prefetchers</li>
<li><strong><strong>Data processing</strong></strong>: logic synthesis vs instruction sets</li>
<li><strong><strong>Local communication</strong></strong>: routing vs register addresses</li>
<li><strong>Avoiding starvation:</strong> pressure signals vs request aging</li>
</ul>
<p><strong><strong>Data parallelism: SIMD vs SIMT</strong></strong></p>
<p>Suppose you want to have a data-parallel machine: software issues one instruction that processes multiple data items.</p>
<p>The common macro approach is wide registers and SIMD opcodes. To use the feature, software must explicitly break up its data into 16-byte chunks, and use special opcodes like &#8220;add_16_bytes&#8221; to process the chunks.</p>
<p>One micro approach is what NVIDIA marketing calls SIMT. The instruction set remains scalar. However, hw runs multiple scalar threads at once, with simultaneously running threads all executing the same instruction. That way, 16 pairs of values are added in a single cycle using scalar instructions.</p>
<p>(If you&#8217;re interested in SIMT, a detailed comparison with SIMD as well as SMT - the more general simultaneous multithreading model - is <a href="http://www.yosefk.com/blog/simd-simt-smt-parallelism-in-nvidia-gpus.html">here</a>.)</p>
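<p>To make the contrast concrete, here is a minimal sketch of how the two models look to software. The names add_16, simd_add, scalar_kernel and simt_run are made up for illustration, and the &#8220;hardware&#8221; is just a Python loop - it models the programming model, not the performance.</p>

```python
# Toy contrast: explicit SIMD chunks vs scalar code run by lockstep "threads".
LANES = 16  # assumed vector width, matching the 16-element example in the text

def add_16(a_chunk, b_chunk):
    """Hypothetical SIMD opcode: adds exactly LANES pairs of values at once."""
    assert len(a_chunk) == LANES and len(b_chunk) == LANES
    return [x + y for x, y in zip(a_chunk, b_chunk)]

def simd_add(a, b):
    """Macro: software itself splits the data into LANES-sized chunks."""
    assert len(a) == len(b) and len(a) % LANES == 0
    out = []
    for i in range(0, len(a), LANES):
        out.extend(add_16(a[i:i + LANES], b[i:i + LANES]))
    return out

def scalar_kernel(tid, a, b, out):
    """Micro: the program is scalar and handles one element per thread."""
    out[tid] = a[tid] + b[tid]

def simt_run(kernel, n, *args):
    """Stand-in for the hardware: runs threads in groups of LANES.
    Real hardware would execute each group in lockstep; here it is a loop."""
    for base in range(0, n, LANES):
        for tid in range(base, min(base + LANES, n)):
            kernel(tid, *args)

a = list(range(32))
b = [10] * 32
out = [0] * 32
simt_run(scalar_kernel, 32, a, b, out)
```

<p>Both versions compute the same sums. The difference is who does the chunking: in simd_add it is the program, in simt_run it is whatever runs the scalar kernel.</p>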
<p><strong><strong>Multiple issue: VLIW vs superscalar</strong></strong></p>
<p>Suppose you want to have a multiple issue machine. You want to simultaneously issue multiple instructions from a single thread.</p>
<p>The macro approach is VLIW, which stands for &#8220;very long instruction word&#8221;. The idea is, those multiple instructions you issue become &#8220;one (very long) instruction&#8221;, because software explicitly asks to run them together: &#8220;ADD <span style="text-decoration: underline;">R0</span>, R1, R2 <strong><strong>and </strong></strong>MUL R3, <span style="text-decoration: underline;">R0</span>, R5&#8221;. Note that ADD and MUL &#8220;see&#8221; the same value of R0: MUL gets R0&#8217;s value before it&#8217;s modified by ADD.</p>
<p>VLIW also lets software choose to say, &#8220;ADD <span style="text-decoration: underline;">R0</span>, R1, R2; <strong><strong>afterwards</strong></strong>, MUL R3, <span style="text-decoration: underline;">R0</span>, R5&#8221; - that&#8217;s two separate instructions yielding vanilla serial execution. This is not only slower (2 cycles instead of 1), but has a different meaning. This way, MUL <em>does</em> see ADD&#8217;s change to R0. Either way, you get what you explicitly asked for.</p>
<p>(If you&#8217;re interested in VLIW, an explanation of how programs map to this rather strangely-looking architecture is <a href="http://www.yosefk.com/blog/humans-and-compilers-need-each-other-the-vliw-simd-case.html">here</a>.)</p>
<p>The micro approach, called superscalar execution, is having the hardware analyze the instructions and run them in parallel - when that doesn&#8217;t change the hw/sw contract (the serial semantics). For example, ADD R0, R1, R2 can run in parallel with MUL R3, R1, R2 - but not with MUL R3, R0, R5 where MUL&#8217;s input, R0, depends on ADD. Software remains unaware of instruction-level parallelism - or at least it <em><em>can</em> </em>remain unaware and still run correctly.</p>
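<p>The two semantics above can be sketched in a few lines. This is a toy model under stated assumptions: instructions are (opcode, destination, source, source) tuples, run_bundle and can_dual_issue are hypothetical names, and the dependency check only looks at read-after-write and write-after-write hazards between two instructions.</p>

```python
def run_bundle(regs, bundle):
    """VLIW semantics: every instruction in the bundle reads its inputs first,
    then all results are written, so all of them see the same old values."""
    ops = {"ADD": lambda x, y: x + y, "MUL": lambda x, y: x * y}
    results = [(dst, ops[op](regs[s1], regs[s2])) for op, dst, s1, s2 in bundle]
    for dst, value in results:
        regs[dst] = value

def can_dual_issue(first, second):
    """Superscalar check: run two instructions together only if that
    cannot change the result of running them one after the other."""
    _, dst1, _, _ = first
    _, dst2, src_a, src_b = second
    reads_result = dst1 in (src_a, src_b)   # second needs first's output
    same_target = dst1 == dst2              # both write the same register
    return not (reads_result or same_target)

ADD = ("ADD", "R0", "R1", "R2")
MUL_DEP = ("MUL", "R3", "R0", "R5")   # reads R0, which ADD writes
MUL_IND = ("MUL", "R3", "R1", "R2")   # independent of ADD

init = {"R0": 1, "R1": 2, "R2": 3, "R3": 0, "R5": 4}

together = dict(init)
run_bundle(together, [ADD, MUL_DEP])   # one bundle: MUL sees the old R0

serial = dict(init)
run_bundle(serial, [ADD])              # two bundles: MUL sees the new R0
run_bundle(serial, [MUL_DEP])
```

<p>With these inputs, R3 ends up as 4 in the single-bundle case and 20 in the serial case, which is the difference in meaning that VLIW software signs up for. can_dual_issue is the sort of decision a superscalar machine makes on its own, without telling anyone.</p>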
<p><span style="font-weight: bold;">Running ahead: exposed latencies vs OOO</span></p>
<p>We&#8217;ve just discussed issuing multiple instructions simultaneously. A related topic is issuing instructions before a previous instruction completes. Here, the macro approach is to, well, simply go ahead and issue instructions. It&#8217;s the duty of software to make sure those instructions don&#8217;t depend on results that are not yet available.</p>
<p>For example, a LOAD instruction can have a 4-cycle latency. Then if you load to R0 from R1 and at the next cycle, add R0 and R2, you will have used the <em>old </em>value of R0. If you want the new value, you must explicitly wait for 4 cycles, hopefully issuing some other useful instructions in the meanwhile.</p>
<p>The micro approach to handling latency is called OOO (out-of-order execution). Suppose you load to R0, then add R0 and R2, and then multiply R3 and R4. An OOO processor will notice that the addition&#8217;s input is not yet available, proceed to the multiplication because its inputs are ready, and execute the addition once R0 is loaded (in our example, after 4 cycles). The hw/sw contract is unaffected by the fact that hardware issues instructions before a previous instruction completes.</p>
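<p>Here is a rough sketch of both behaviors. Everything in it is an assumption made for illustration: the tuple-based instruction format, the function names, one instruction issued per cycle, and no register renaming (so the out-of-order part only works for programs without write-after-read or write-after-write hazards).</p>

```python
LOAD_LATENCY = 4  # cycles, as in the example in the text

def run_exposed(program, regs, mem):
    """Macro: one instruction issues per cycle and nothing ever stalls.
    A LOAD result lands LOAD_LATENCY cycles after issue; until then,
    readers of the destination register see its old value."""
    pending = []  # (cycle when the write lands, register, value)
    cycle = 0
    for instr in program:
        for item in [p for p in pending if p[0] <= cycle]:
            regs[item[1]] = item[2]        # this load has completed
            pending.remove(item)
        if instr[0] == "LOAD":
            _, dst, addr_reg = instr
            pending.append((cycle + LOAD_LATENCY, dst, mem[regs[addr_reg]]))
        elif instr[0] == "ADD":
            _, dst, s1, s2 = instr
            regs[dst] = regs[s1] + regs[s2]
        cycle += 1                         # anything else is treated as a NOP
    for _, reg, value in pending:          # let outstanding loads finish
        regs[reg] = value
    return regs

def ooo_issue_order(program, latency):
    """Micro: each cycle, issue the oldest waiting instruction whose inputs
    are ready. Entries are (name, opcode, destination, sources)."""
    ready_at = {}              # register -> cycle when its new value is ready
    waiting = list(program)
    issued = []                # (cycle, name)
    cycle = 0
    while waiting:
        for i, (name, op, dst, srcs) in enumerate(waiting):
            older_dsts = {w[2] for w in waiting[:i]}
            if all(s not in older_dsts and ready_at.get(s, 0) <= cycle
                   for s in srcs):
                ready_at[dst] = cycle + latency[op]
                issued.append((cycle, name))
                del waiting[i]
                break
        cycle += 1
    return issued

mem = {0: 7}
start = {"R0": 100, "R1": 0, "R2": 1, "R3": 0}
too_early = [("LOAD", "R0", "R1"), ("ADD", "R3", "R0", "R2")]
waited = [("LOAD", "R0", "R1"), ("NOP",), ("NOP",), ("NOP",),
          ("ADD", "R3", "R0", "R2")]
early_regs = run_exposed(too_early, dict(start), mem)
waited_regs = run_exposed(waited, dict(start), mem)

order = ooo_issue_order(
    [("load", "LOAD", "R0", ["R1"]),
     ("add", "ADD", "R5", ["R0", "R2"]),
     ("mul", "MUL", "R6", ["R3", "R4"])],
    {"LOAD": 4, "ADD": 1, "MUL": 1},
)
```

<p>On the exposed-latency machine, the same ADD produces 101 (old R0) or 8 (new R0) depending on where software put it. On the out-of-order machine the program text stays serial; the multiplication simply gets issued at cycle 1 and the addition at cycle 4.</p>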
<p><span style="font-weight: bold; ">Local storage: local memories vs caches</span></p>
<p>Suppose you want to have some RAM local to your processor, so that most memory operations work with this fast RAM and not the external RAM, which is increasingly becoming a bottleneck.</p>
<p>The macro approach is, you just add local RAM. There can be special load/store opcodes to access this RAM, or a special address range mapped to it. Either way, when software wants to use local RAM, it must explicitly ask for it - as in, char* p = (char*)0x54000, which says, &#8220;I&#8217;ll use a pointer pointing to this magic address, 0x54000, which is the base address of my local RAM&#8221;.</p>
<p>This is done on many embedded DSPs and even CPUs - for example, ARM calls this &#8220;tightly-coupled memory&#8221; and MIPS calls this &#8220;scratch pad memory&#8221;.</p>
<p>The micro approach is caches. Software doesn&#8217;t ask to use a cache - it loads from an external address as if the cache didn&#8217;t exist. It&#8217;s up to hardware to:</p>
<ul>
<li>Check if the data is already in the cache</li>
<li>If it isn&#8217;t, load it to the cache</li>
<li>Decide which cached data will be overwritten with the new data</li>
<li>If the overwritten cached data was modified, write it back to external memory before &#8220;forgetting&#8221; it</li>
<li>&#8230;</li>
</ul>
<p>The hardware changes greatly, the hw/sw contract does not.</p>
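<p>The steps in the list above can be sketched as a very small model. The assumptions are loud ones: the cache is direct-mapped, each line holds a single word, and the class and method names are invented for this example. Real caches differ in every one of these respects.</p>

```python
class DirectMappedCache:
    """Write-back, direct-mapped cache with one-word lines (a tiny model)."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory              # backing "external RAM": a list of words
        self.num_lines = num_lines
        self.lines = [None] * num_lines   # each entry: [address, value, dirty]
        self.hits = 0
        self.misses = 0

    def _lookup(self, addr):
        index = addr % self.num_lines     # the mapping decides what gets overwritten
        line = self.lines[index]
        if line is not None and line[0] == addr:
            self.hits += 1                # the data is already in the cache
            return line
        self.misses += 1
        if line is not None and line[2]:  # the victim was modified: write it back
            self.memory[line[0]] = line[1]
        line = [addr, self.memory[addr], False]
        self.lines[index] = line          # load the new data into the cache
        return line

    def load(self, addr):
        return self._lookup(addr)[1]

    def store(self, addr, value):
        line = self._lookup(addr)
        line[1] = value
        line[2] = True                    # remember to write it back later

ram = list(range(100, 116))   # 16 words of external memory
cache = DirectMappedCache(ram)
cache.store(3, 42)      # miss: address 3 is fetched, then modified in the cache only
stale = ram[3]          # external RAM still holds the old value here
first = cache.load(3)   # hit
cache.load(7)           # 7 maps to the same line as 3, so 3 is written back
```

<p>Note that the calling code only ever says load and store with ordinary addresses. The hit counting, eviction and write-back all happen behind its back, which is what makes this the micro option.</p>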
<p><span style="font-weight: bold; ">Data streaming: DMA vs hardware prefetchers</span></p>
<p>Suppose you want to support efficient &#8220;streaming transfers&#8221;. DRAM is actually a fairly poor <em>random </em>access memory - there&#8217;s a big latency you pay per address. However, it has excellent throughput if you load a large contiguous chunk of data. To utilize this, a processor must issue loads without waiting for results of previous loads. Load, wait, load, wait&#8230; is slow; load, load, load, load&#8230; is fast.</p>
<p>The macro approach is, sw tells hw that it wants to load an array. For example, a DMA - direct memory access - engine can have control registers telling it the base address and the size of an array to load. Software explicitly programs these registers and says, &#8220;load&#8221;.</p>
<p>DMA starts loading and eventually says, &#8220;done&#8221; - for example, by setting a bit. In the meanwhile, sw does some unrelated stuff until it needs the loaded data. At this point, sw waits until the &#8220;done&#8221; bit is set, and then uses the data.</p>
<p>The micro approach is, software simply loads the array &#8220;as usual&#8221;. Naturally, it loads from the base address, p, then from p+1, then p+2, p+3, etc. At some point, a hardware prefetcher quietly inspecting all the loads realizes that a contiguous array is being loaded. It then speculatively fetches ahead - loads large chunks beyond p+3 (hopefully not <em>too </em>large - we don&#8217;t want to load too much unneeded data past the end of our array).</p>
<p>When software is about to ask for, say, p+7, its request is suddenly satisfied very quickly because the data is already in the cache. This keeps working nicely with p+8 and so on.</p>
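<p>A toy model of the prefetcher side of this story follows. The policy is an assumption picked for simplicity - after a few consecutive ascending loads, fetch a fixed number of addresses ahead - and the names StreamPrefetcher, trigger and depth are hypothetical. Actual prefetchers use more elaborate heuristics.</p>

```python
class StreamPrefetcher:
    """Toy next-address prefetcher: after `trigger` consecutive ascending
    loads, it fetches `depth` addresses ahead into the cache."""

    def __init__(self, trigger=3, depth=4):
        self.trigger = trigger
        self.depth = depth
        self.last = None
        self.run = 0            # length of the current run of consecutive loads
        self.cache = set()      # addresses currently cached
        self.slow_loads = 0     # loads that had to wait for DRAM

    def load(self, addr):
        if addr not in self.cache:
            self.slow_loads += 1
            self.cache.add(addr)
        if self.last is not None and addr == self.last + 1:
            self.run += 1
        else:
            self.run = 1
        self.last = addr
        if self.run >= self.trigger:
            # Speculatively fetch ahead of the software
            self.cache.update(range(addr + 1, addr + 1 + self.depth))

p = 1000
hw = StreamPrefetcher()
for i in range(16):       # software just loads p, p+1, p+2, ... "as usual"
    hw.load(p + i)
```

<p>In this run only the first three loads are slow; the rest find their data already cached. The price is visible too: the model ends up holding p+16 through p+19, which nobody asked for.</p>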
<p><strong>Data processing: logic synthesis vs instruction sets</strong></p>
<p>Let&#8217;s get back to basics. Suppose we want to add a bunch of numbers. How does software tell hardware to add numbers?</p>
<p>The micro approach is so much more common that it&#8217;s the only one that springs to mind. Why, of course hardware has an ADD command, and it&#8217;s implemented in hardware by some sort of circuit. There are degrees here (should there be a DIV opcode or should sw do division?) But the upshot is, there are opcodes.</p>
<p>However, there are architectures where software explicitly constructs data processing operations out of bit-level boolean primitives. This is famously done on FPGAs and is called &#8220;logic synthesis&#8221; - effectively software gets to build its own circuits. (This programming model is so uncommon that it isn&#8217;t even called &#8220;software&#8221;, but it certainly is.)</p>
<p>Less famously, this is also what effectively happens on associative memory processors (CAPPs/APAs) - addition is implemented as a series of bit-level masked compare &amp; write operations. (The CAPP way results in awfully long latencies, which you&#8217;re supposed to make up with throughput by processing thousands of elements in parallel. If you&#8217;re interested in CAPPs, an overview is available <a href="http://www.yosefk.com/blog/an-unusual-hardware-architecture-apa-associative-processing-array.html">here</a>.)</p>
<p>Of course, you can simulate multiplication using bitwise operations on conventional opcode-based processors. But that would leave much of the hardware unused. On FPGAs and CAPPs, on the other hand, &#8220;building things out of bits&#8221; is how you&#8217;re supposed to utilize hardware resources. You get a big heap of exposed computational primitives, and you map your design to them.</p>
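<p>For a feel of what &#8220;building things out of bits&#8221; means, here is an adder assembled from nothing but XOR, AND and OR on single bits. It is a plain ripple-carry adder written as a sketch; it is not meant to mirror how any particular FPGA tool or CAPP actually maps addition, and the function names are made up.</p>

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built only from XOR, AND and OR."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out

def add_bits(x, y, width=8):
    """Ripple-carry addition of two unsigned integers, one bit at a time.
    The result wraps around modulo 2**width, like a fixed-width register."""
    result = 0
    carry = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result
```

<p>For example, add_bits(100, 55) gives 155, and add_bits(200, 100) wraps around to 44 in 8 bits. On an opcode-based machine this loop would be a silly way to add; on hardware that exposes bit-level primitives, some mapping of this kind is the whole job.</p>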
<p><strong><strong>Local communication</strong>: routing vs register addresses</strong></p>
<p>Another problem as basic as data processing operations is local communication: how does an operation pass its results to the next? We multiply and then add - how does the addition get the output of multiplication?</p>
<p>Again, the micro approach is by far the better known one. The idea is, you have registers, which are numbered somehow. We ask MUL to output to the register R5 (encoded as, say, 5). Then we ask ADD to use R5 as an input.</p>
<p>This actually doesn&#8217;t sound as &#8220;micro&#8221; - what&#8217;s implicit about it? We asked for R5 very explicitly. However, there are two sorts of &#8220;implicit&#8221; things going on here:</p>
<ul>
<li>The numbers don&#8217;t necessarily refer to physical registers - they don&#8217;t on machines with register renaming.</li>
<li>More fundamentally, even when numbers do refer to physical registers, the <em>routing</em> is implicit.</li>
</ul>
<p>How does the output of MUL travel to R5 and then to the input port of the adder? There are wires connecting these things, and multiplexers selecting between the various options. On most machines, there are also data forwarding mechanisms sending the output of MUL directly to the adder, in parallel to writing it into R5, so that ADD doesn&#8217;t have to wait until R5 is written and then read back. But even on machines with explicit forwarding (and there aren&#8217;t many), software doesn&#8217;t see the wires and muxes - these are opaque hardware resources.</p>
<p>The macro approach to routing is what FPGAs do. The various computational units are connected to nearby configurable switches. By properly configuring those switches, you can send the output of one unit to another using a path going through several switches.</p>
<p>Of course this uses up the wires connecting the switches, and longer paths result in larger latencies. So it&#8217;s not easy to efficiently connect all the units that you want using the switches and the wires that you have. In FPGAs, mapping operations to computational units and connecting between them is called &#8220;placement and routing&#8221;. The &#8220;place &amp; route&#8221; tools can run for a couple of hours given a large design.</p>
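<p>A drastically simplified sketch of the routing half of the problem: switches sit on a small grid, each wire between neighboring switches can carry one signal, and a breadth-first search looks for the shortest free path. The grid model and the route function are assumptions for illustration only; real place &amp; route tools solve a much harder version of this.</p>

```python
from collections import deque

def route(width, height, src, dst, used_wires):
    """Find a shortest path of free wires between two switches on a grid.
    Marks the wires of the chosen path as used. Returns the path as a list
    of (x, y) switches, or None if no free route exists."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            in_grid = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            wire = frozenset((node, nxt))
            if in_grid and nxt not in parents and wire not in used_wires:
                parents[nxt] = node
                queue.append(nxt)
    if dst not in parents:
        return None
    path = [dst]
    while parents[path[-1]] is not None:   # walk back from dst to src
        path.append(parents[path[-1]])
    path.reverse()
    used_wires.update(frozenset(pair) for pair in zip(path, path[1:]))
    return path

used = set()
first = route(3, 2, (0, 0), (2, 0), used)    # gets the direct path
second = route(3, 2, (1, 0), (2, 0), used)   # the direct wire is now taken
```

<p>The second connection is between adjacent switches, but since the first one already took the wire between them, it has to go around through three wires instead of one. That is the &#8220;longer paths, larger latencies&#8221; effect in miniature.</p>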
<p>This example as well as the previous illustrate micro vs macro at the extreme - a hardware resource that looks &#8220;all-important&#8221; in one architecture is invisible in another to the point where we forget it exists. The point is that they&#8217;re equally important on both - the only question is who manages the resource, hardware or software.</p>
<p><strong>Avoiding starvation: </strong><strong>pressure signals vs request aging</strong></p>
<p>One thing DRAM controllers do is accept requests from several different processors, put them in a queue, and reorder them. Reordering helps to better utilize DRAM, which, as previously mentioned, isn&#8217;t that good at random access and prefers streaming access to consecutive locations.</p>
<p>So if two processors, A and B, run in parallel, and each accesses a different region, it&#8217;s frequently better to group requests together - A, A, A, A, B, B, B, B - than to process them in the order in which they arrive - say, A, B, A, A, B, A, B, B.</p>
<p>In fact, as long as A keeps issuing requests, it&#8217;s frequently better to keep processing them until they&#8217;re over, and keep B waiting. Better, that is, for throughput, as well as for A&#8217;s latency - but worse for B&#8217;s latency. If we don&#8217;t know when to stop, serving A and starving B could make the system unusable.</p>
<p>When to stop? One macro solution is, the DRAM controller has incoming pressure signals, and both A and B can complain when starved by raising the pressure. Actually, this is &#8220;macro&#8221; only as far as the DRAM controller is concerned - it gives outside components explicit control over its behavior. The extent of software control over the generation of the pressure signal depends on the processors A and B.</p>
<p>One micro solution is to use request aging. Older requests are automatically considered more urgent. This method is implemented in many DRAM controllers - for instance, Denali&#8217;s. The macro approach is implemented in the Arteris DRAM scheduler.</p>
<p>The micro approach is safer - the controller itself is careful to prevent starvation, whereas in the macro option, a non-cooperative processor can starve others. It also uses a simpler bus protocol, making compatibility easier for processors. However, it can result in lower throughput - for instance, if B is a peripheral device with a large FIFO for incoming data, and can afford to wait for long periods of time until the FIFO overflows, aging makes its old requests look urgent anyway.</p>
<p>Whatever the benefits and drawbacks - and here, we aren&#8217;t going to discuss benefits and drawbacks in any depth - this last example is supposed to illustrate that macro vs micro isn&#8217;t only relevant to &#8220;core&#8221;/&#8220;processor&#8221; design, but extends to &#8220;non-computational&#8221; hardware as well.</p>
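<p>Request aging is easy to sketch. The scheduler below is a made-up toy, not the algorithm of any controller named above: it keeps serving the source it served last (standing in for &#8220;stay in the same DRAM region&#8221;), unless the oldest pending request has already been passed over max_age times.</p>

```python
def schedule(requests, max_age):
    """Toy DRAM scheduler with request aging.
    `requests` is a list of (arrival_order, source) pairs.
    Returns the sources in the order they get served, as a string."""
    pending = sorted(requests)
    skipped = {r: 0 for r in pending}   # times each request was passed over
    served = []
    last_source = None
    while pending:
        oldest = pending[0]
        same = [r for r in pending if r[1] == last_source]
        if skipped[oldest] >= max_age or not same:
            choice = oldest             # too old to ignore, or nothing to group with
        else:
            choice = same[0]            # keep streaming from the same source
        pending.remove(choice)
        for r in pending:
            if r < choice:              # older requests that were passed over
                skipped[r] += 1
        served.append(choice[1])
        last_source = choice[1]
    return "".join(served)

arrivals = list(enumerate("ABAABABB"))   # the arrival order used in the text
grouped = schedule(arrivals, max_age=100)
aged = schedule(arrivals, max_age=2)
fifo = schedule(arrivals, max_age=0)
```

<p>With a very large max_age the result is the fully grouped AAAABBBB; with max_age=0 it degenerates to arrival order. In between, B&#8217;s first request gets bumped ahead once it has waited long enough, at the cost of breaking up A&#8217;s stream.</p>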
<p><span style="font-weight: bold; ">Blurred boundaries</span></p>
<p>Micro vs macro is more of a continuum than a strictly binary distinction. That is, we can&#8217;t always label a hardware feature as &#8220;visible&#8221; or &#8220;invisible&#8221; to programmers - rather, we can talk about the extent of its visibility.</p>
<p>There are basically two cases of &#8220;boundary blurring&#8221;:</p>
<ul>
<li>Hardware features &#8220;quite visible&#8221; even though they don&#8217;t affect program semantics. These are &#8220;technically micro&#8221; but &#8220;macro in spirit&#8221;.</li>
<li>Hardware features &#8220;quite invisible&#8221; even though they <em>do</em> affect program semantics. These are &#8220;technically macro&#8221; but &#8220;micro in spirit&#8221;.</li>
</ul>
<p>Let&#8217;s briefly look at examples of both kinds of &#8220;blurring&#8221;.</p>
<p><strong><em>Technically micro but macro in spirit</em></strong></p>
<p>A good example is memory banking. The point of banking is increasing the number of addresses that can be accessed per cycle. A single 32K memory bank lets you access a single address per cycle. 2 16K banks let you access 2 addresses, 4 8K banks let you access 4 addresses, and so on.</p>
<p>So basically &#8220;more is better&#8221;. What limits the number of banks is the overhead you pay per bank, the overhead of logic figuring out the bank an address belongs to, and the fact that there&#8217;s no point in accessing more data than you can process.</p>
<p>Now if we look at banking as implemented in NVIDIA GPU memory, TI DSP caches and superscalar CPU caches, then at first glance, they&#8217;re all &#8220;micro solutions&#8221;. These machines seem to mostly differ in their mapping of address to bank - for instance, NVIDIA GPUs switch banks every 4 bytes, while TI DSPs switch banks every few kilobytes.</p>
<p>But on all these machines, software can remain unaware of banking and run correctly. If two addresses accessed in the same cycle map to the same bank, then the access will take two cycles instead of one - but no fault is reported and results aren&#8217;t affected. Semantically, banking is invisible.</p>
<p>However, I&#8217;d call GPUs&#8217; and DSPs&#8217; banking &#8220;macroish&#8221;, and superscalar CPUs&#8217; banking &#8220;microish&#8221;. Why?</p>
<p>GPUs and DSPs &#8220;advertise&#8221; banking, and commit to a consistent address mapping scheme and consistent performance implications across different devices. Vendors encourage you to know about banking so that you allocate data in ways minimizing contentions.</p>
<p>CPUs don&#8217;t advertise banking very much, and different CPUs running the same instruction set have different banking schemes which result in different performance. Moreover, those CPU variants differ in their ability to access multiple addresses in parallel in the first place: a low-end CPU might access at most one address but a high-end CPU might access two.</p>
<p>GPUs and DSPs, on the other hand, have explicit multiple load-store units (a macro feature). So software knows <em>when</em> it attempts to access many addresses in parallel - one reason to &#8220;advertise&#8221; <em>which</em> addresses can actually be accessed in parallel.</p>
<p>This shows why hardware features that don&#8217;t affect program semantics aren&#8217;t &#8220;completely invisible to programmers&#8221; - rather, there are &#8220;degrees of visibility&#8221;. A feature only affecting performance is &#8220;quite visible&#8221; if vendors and users consider it an important part of the hw/sw contract.</p>
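<p>The performance effect of bank conflicts described above fits in one small function. The model is an assumption: each bank serves one address per cycle, the bank is picked by dividing the address by an interleaving step, and access_cycles is a hypothetical name. The two interleaving steps used in the example echo the &#8220;every 4 bytes&#8221; and &#8220;every few kilobytes&#8221; schemes mentioned earlier.</p>

```python
from collections import Counter

def access_cycles(addresses, num_banks, interleave_bytes):
    """Cycles needed to serve addresses requested in the same cycle.
    Each bank serves one address per cycle, so the busiest bank
    determines how long the whole group takes."""
    banks = Counter((addr // interleave_bytes) % num_banks for addr in addresses)
    return max(banks.values())

# 4 banks, switching banks every 4 bytes
fine_neighbors = access_cycles([0, 4, 8, 12], 4, 4)
fine_strided = access_cycles([0, 16, 32, 48], 4, 4)

# 4 banks of 8K each, switching banks every 8192 bytes
coarse_neighbors = access_cycles([0, 4, 8, 12], 4, 8192)
coarse_spread = access_cycles([0, 8192, 16384, 24576], 4, 8192)
```

<p>The same four neighboring addresses cost 1 cycle under one mapping and 4 under the other, and the results of the program are identical either way. Whether you would call that &#8220;invisible&#8221; depends on how much you care about the factor of 4.</p>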
<p><em><strong>Technically macro but micro in spirit</strong></em></p>
<p>SIMD and VLIW are both visible in assembly programs/binary instruction streams. However, SIMD is &#8220;much more macro in spirit&#8221; than VLIW. That&#8217;s because for many programmers, the hw/sw contract isn&#8217;t the semantics of assembly, which they never touch, but the semantics of their source language.</p>
<p>At the source code level, the effect of SIMD tends to be very visible. Automatic vectorization rarely works, so you end up using intrinsic functions and short vector data types. The effect of VLIW on source code can be close to zero. Compilers are great at automatic scheduling, and better than humans, so there&#8217;s no reason to litter the code with any sort of annotations to help them. Hence, SIMD is &#8220;more macro&#8221; - more visible.</p>
<p>Moreover, there&#8217;s &#8220;macroish VLIW&#8221; and &#8220;microish VLIW&#8221; - just like there&#8217;s &#8220;macroish banking&#8221; and &#8220;microish banking&#8221; - and, again, the difference isn&#8217;t in the hardware feature itself, but in the way it&#8217;s treated by vendors and users.</p>
<p>An extreme example of &#8220;microish VLIW&#8221; is Transmeta - the native binary instruction encoding was VLIW, but the only software that was supposed to be aware of that were the vendor-supplied binary translators from x86 or other bytecode formats. VLIW was visible at the hardware level but still completely hidden from programmers by software tools.</p>
<p>An opposite, &#8220;macro-centric&#8221; example is TI&#8217;s C6000 family. There&#8217;s not one, but two &#8220;human-writable assembly languages&#8221;. There&#8217;s parallel assembly, where you get to manually schedule instructions. There&#8217;s also linear assembly, which schedules instructions for you, but you still get to explicitly say which execution unit each instruction will use (well, almost; let&#8217;s ignore the A-side/B-side issues here.)</p>
<p>Why provide such a &#8220;linear assembly&#8221; language? Josh Fisher, the inventor of VLIW, didn&#8217;t approve of the concept in his book &#8220;Embedded Computing: a VLIW Approach&#8221;.</p>
<p>That&#8217;s because originally, one of the supposed benefits of VLIW was precisely being &#8220;micro in spirit&#8221; - the ability to hide VLIW behind an optimizing compiler meant that you could speed up existing code just by recompiling it. Not as easy as simply running old binaries on a stronger new out-of-order processor, but easy enough in many cases - and much easier to support at the hardware end.</p>
<p>Linear assembly basically throws these benefits out the window. You spell things in terms of C6000&#8217;s execution units and opcodes, so the code can&#8217;t be cross-platform. Worse, TI can&#8217;t decide to add or remove execution units or registers from some of the C6000 variants and let the compiler reschedule instructions to fit the new variant. Linear assembly refers to units and registers explicitly enough to not support this flexibility - for instance, there&#8217;s no silent spill code generation. Remove some of the resources, and much of the code will stop compiling.</p>
<p>Then why is linear assembly shipped by TI, and often recommended as the best source language for optimized code? The reason is that the code is more &#8220;readable&#8221; - if one of the things the reader is after is performance implications. The same silent spill code generation that makes C more portable makes it &#8220;less readable&#8221;, performance-wise - you never can tell whether your data fits into registers or not, and similarly, it&#8217;s hard to know how many operations of every execution unit are used.</p>
<p>The beauty of linear assembly is that it hides the stuff humans particularly hate to do and compilers excel at - such as register allocation and instruction scheduling - but it <em>doesn&#8217;t</em> hide things making it easy to estimate performance - such as instruction selection and the distinction between stack and register variables. (In my opinion, the only problem with linear assembly is that it still hides a bit too much - and that people can choose to <em>not</em> use it. They often do - and preserve stunning unawareness of how the C6000 works for years and years.)</p>
<p>Personally, I believe that, contrary to original expectations, VLIW works better in &#8220;macro-centric&#8221; platforms than &#8220;micro-centric&#8221; - a viewpoint consistent with the relative success of Transmeta&#8217;s chips and VLIW DSPs. Whether this view is right or wrong, the point here is that hardware features &#8220;visible&#8221; to software can be more or less visible to programmers - depending on how visible the software stack in its entirety makes them.</p>
<p><strong>Implications</strong></p>
<p>We&#8217;ve seen that &#8220;macro vs micro&#8221; is a trade-off appearing in a great many contexts in hardware design, and that typically, both types of solutions can be found in practical architectures - so it&#8217;s not clear which is &#8220;better&#8221;.</p>
<p>If there&#8217;s no clear winner, what are the benefits and drawbacks of these two options? I believe that these benefits and drawbacks are similar across the many contexts where the trade-off occurs. Some of the implications were briefly mentioned in the discussion on VLIW&#8217;s &#8220;extent of visibility&#8221; - roughly:</p>
<ul>
<li>Micro is more compatible</li>
<li>Macro is more efficient</li>
<li>Macro is easier to implement</li>
</ul>
<p>There are other common implications - for example, macro is harder to context-switch (I like this one because, while it&#8217;s not very surprising once you think about it, it doesn&#8217;t immediately spring to mind).</p>
<p>I plan to discuss the implications in detail sometime soon. I intend to focus, not as much on how things could be in theory, but on how they actually tend to come out and why.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/hardware-macroarchitecture-vs-mircoarchitecture.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Email is evil</title>
		<link>http://www.yosefk.com/blog/email-is-evil.html</link>
		<comments>http://www.yosefk.com/blog/email-is-evil.html#comments</comments>
		<pubDate>Tue, 24 Apr 2012 05:15:30 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[wetware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=173</guid>
		<description><![CDATA[Personally, I love email:

It&#8217;s still the best way to talk online, overall - the most open format, the best client programs.
Online beats offline since everything is archived and searchable.
Written beats spoken since you have time to think stuff through, and you can attach images, spreadsheets, code, etc.

However, I noticed that email discussions bring the worst [...]]]></description>
			<content:encoded><![CDATA[<p>Personally, I love email:</p>
<ul>
<li>It&#8217;s still the best way to talk online, overall - the most open format, the best client programs.</li>
<li>Online beats offline since everything is archived and searchable.</li>
<li>Written beats spoken since you have time to think stuff through, and you can attach images, spreadsheets, code, etc.</li>
</ul>
<p>However, I noticed that email discussions bring the worst out of people, whereas walking over to them and talking brings the best out of them. I guess it&#8217;s because emails feel impersonal, leading to &#8220;email rage&#8221; much like feeling isolated inside a car leads to &#8220;<a href="http://en.wikipedia.org/wiki/Road_rage">road rage</a>&#8220;.</p>
<p>On top of that, for many people email is their todo list, there still really being no better alternative for keeping a todo list. What this means though is that sending an email with a suggestion implying work on their part without prior face-to-face discussion looks like a written order to do something. I believe this impression can&#8217;t be avoided even with the most polite, &#8220;pretty please&#8221;-infested wording. It still feels like &#8220;you didn&#8217;t even bother to talk to me and you expect me to do things!&#8221;</p>
<p>So I decided, roughly, to never open any discussion over email. It&#8217;s fine for followups and bug reports, and it&#8217;s fine if it&#8217;s known to work for the people involved. But my default assumption is that email is an evil thing capable of creating tensions and conflicts out of nowhere. Much better to call the person, check that they&#8217;re available to talk and go talk to them. Then, maybe, send them the summary over email to get all that archiving and searching goodness without the evil price.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/email-is-evil.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Which of those would you like me to write?</title>
		<link>http://www.yosefk.com/blog/which-of-those-would-you-like-me-to-write.html</link>
		<comments>http://www.yosefk.com/blog/which-of-those-would-you-like-me-to-write.html#comments</comments>
		<pubDate>Fri, 23 Mar 2012 17:34:13 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[OT]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=167</guid>
		<description><![CDATA[I'll give you this list of drafts and you'll tell me which of these you want me to write.]]></description>
			<content:encoded><![CDATA[<p>&#8220;You have 59 posts and 103 drafts&#8221;, says WordPress, and I guess it&#8217;s right. &#8220;SELECT COUNT(*) FROM table&#8221; doesn&#8217;t lie, and then I&#8217;ve always had about 2 unpublished drafts per published post.</p>
<p>Let&#8217;s try this: I&#8217;ll give you this list of drafts and you&#8217;ll tell me which of these you want me to write. I have a bunch, they&#8217;re all fresh enough for me to still want to write them - I still remember what I meant to say - but help me prioritize.</p>
<p>So, here goes:</p>
<ul>
<li><strong>FPGAs vs cellular automata</strong>: cellular automata are one example of trying to model computation using local rules, &#8220;because the laws of nature are local&#8221; or what-not. But real life is not like a game of life, and long wires are everywhere - from long axons to the trans-Atlantic cables we humans laid out right after learning how to do so. I&#8217;d write about how FPGAs, which are popular with people who like &#8220;locality&#8221; and &#8220;big flexible computing grids&#8221; and other such ideas, support complicated routing and not just simple connections to neighbors. (Which is why FPGAs are a real platform and not a pipe dream.)</li>
<li><strong>Microarchitecture vs macroarchitecture.</strong> I love this one, because it sounds like a somewhat original or at least not a very widely appreciated idea. Basically in hardware, there are two different ways to attack problems. Examples of micro vs macro are: out-of-order versus VLIW; SIMT vs SIMD; hardware prefetchers vs DMA engines; caches vs explicitly managed local memories; and software-driven processors vs FPGAs. Basically it&#8217;s an implicit vs explicit thing - you can make hardware that cleverly solves the problem in a way which is opaque to software, or you can expose &#8220;bare&#8221; resources that software can manage in order to solve the problem itself. There are a whole lot of implications which I think aren&#8217;t obvious, at least as a whole and perhaps separately - for example, one approach is friendlier to context switching than the other. This could be a good read for someone on a small team designing hardware - I could convince him to do macro and accept the drawbacks; micro is for Intel.</li>
<li><strong>Efficiency: shortcuts and highways.</strong> To get from A to B, you travel less - that&#8217;s a shortcut - or travel <em>more simply</em> - that&#8217;s a highway. (Highways are necessarily &#8220;simpler&#8221; than other paths - wider, straighter, fewer junctions - and they&#8217;re expensive to build, so not every path can be a highway.) It appears to be a good metaphor for optimization and accelerator hardware, in many ways. For example, <em>getting to</em> the highway can be a bottleneck - similarly, transferring and organizing data for accelerators/optimized functions. There are other similarities. This should, in particular, be a nice read for someone good at algorithmic optimization (shortcuts) who is curious about &#8220;brute force&#8221; optimization (highways).</li>
<li><strong>&#8220;Don&#8217;t just blame the author of this manual&#8221;</strong> - this is a quote from an actual manual that I like; the context is that the manual bluntly tells why a feature may likely do something different from what you think it does, and what you should do about it. Basically this spec is outrageous if you think of it as a <em>contract - </em>not much is promised to you - but very good as <em>communication</em> - you understand exactly what the designers were doing and what to expect as a result. The distinction between &#8220;specs as contracts&#8221; and &#8220;specs as communication&#8221; is an interesting one in general.</li>
<li><strong>Leibniz management:</strong> I mention &#8220;the best of all possible worlds&#8221; together with the efficient market hypothesis, which is what some ways of putting the &#8220;efficiency&#8221; argument on its head remind me of. For instance, the market is efficient and therefore, if we spend $1M on software or equipment, they must be worth the money (the option of us being suckers and the vendor being sustained by suckers is ignored). Similarly, if you propose an improvement, you&#8217;re told that it&#8217;s not going to work since if it did, others would have done it already because of said &#8220;efficiency&#8221;. And other similar notions, all popular with management.</li>
<li><strong>&#8220;No rules&#8221; is a rule, and not a simple one</strong>: I guess I don&#8217;t like rules, so I tend to think we should have less of them. It recently occurred to me that what we&#8217;d really want is not <em>less </em>rules but <em>simpler</em> rules and that it&#8217;s <em>not</em> the same thing. For example, if you have no rules about noise, then one has a right to make noise (good) but no right for silence (bad), which is a trade-off that some like and others don&#8217;t - but on top of that, it creates ugly <em>complications </em>so isn&#8217;t a <em>simplification </em>compared to having a rule against noise (for example, I can make noise near your property to devalue it). Similarly, giving someone &#8220;a right to configure&#8221; takes someone else&#8217;s &#8220;right to certainty&#8221; - be it config files or different device screen sizes or whatever - also a trade-off. Someone&#8217;s right to check in code without tests takes someone else&#8217;s right to count on checked-out code to work. Someone&#8217;s right to pass parameters of any type takes someone else&#8217;s right to count on things being of a certain type. So, not only choosing the rules, but choosing the <em>amount </em>of rules is a hairy trade-off. Which goes against my intuition that &#8220;less rules is good&#8221;, all else being equal.</li>
<li><strong>Does correctness result from design or testing?</strong> - two claims. 1: correctness always results from testing, never from design; if it wasn&#8217;t tested, it doesn&#8217;t work, and moreover, contains some sort of &#8220;conceptual&#8221; flaw and not just a typo. 2: however, the very property of <em>testability</em> always results from sound design. If it&#8217;s sufficiently badly designed, no amount of testing can salvage it, because good testing is whitebox testing or at least &#8220;gray box&#8221; testing and bad design is impenetrable to the mind, so it&#8217;s a black box.</li>
<li><strong>Testing is parallelizable, analysis isn&#8217;t</strong> - a complementary idea (perhaps I&#8217;d merge them). Suppose you have $10M to throw at &#8220;quality&#8221; - finding bugs. You could massively test your program, or you could thoroughly analyze it - formal proofs or human proofreading or something. Testing you can parallelize: you can buy hardware, hire QA people and so on. The insight is that you can&#8217;t really parallelize analysis: to even tell that two parts can be analyzed separately, a lot of analysis is required, and then frequently things truly can&#8217;t be analyzed separately, because if you understand the fact that listing a huge directory is horribly slow on your filesystem, this understanding is worthless <em>in isolation</em> from the fact that some piece of your software creates huge directories. So analysis can&#8217;t happen fast even if you&#8217;re willing to pay money - you have to pay in <em>time</em>. Two things follow: when something is shipped fast and works, we can conclude that it&#8217;s made to work through testing and not analysis; and, pushing for testing is characteristic of the private/commercial sector where money is cheaper, whereas pushing for analysis is characteristic of the public/academic sector where time is cheaper.</li>
<li><strong>Buying code takes more knowledge than building it </strong>- because when you build it yourself, you can afford to acquire much of the knowledge as you go, but when you&#8217;re making a purchasing decision and you lack some of the required knowledge, then you&#8217;ll buy the wrong thing and will <em>then</em> acquire the knowledge through having things not work - much too late. I&#8217;d give A LOT of real-life examples from my own real life; it&#8217;s quite the problem, I believe. (There&#8217;s frequently no way around buying code, especially if you&#8217;re making hardware, but I think it&#8217;s still an interesting angle on &#8220;buy vs build&#8221;.)</li>
<li><strong>Make your code serial and your data scalar:</strong> I claim that things  that don&#8217;t work that way are not debuggable and most people have a hard  time with them. For example, type systems (C++, Haskell, Scala, even  CLOS), vector languages (APL, Matlab before they had a fast for loop),  Prolog (even though I think solvers are a great idea), Makefiles (even  though I think DSLs are a great idea), lazy evaluation (Haskell).</li>
</ul>
<p>There&#8217;s more but let&#8217;s see what about these.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/which-of-those-would-you-like-me-to-write.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Passing shell script arguments to a subprocess</title>
		<link>http://www.yosefk.com/blog/passing-shell-script-arguments-to-a-subprocess.html</link>
		<comments>http://www.yosefk.com/blog/passing-shell-script-arguments-to-a-subprocess.html#comments</comments>
		<pubDate>Thu, 08 Mar 2012 16:02:33 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=162</guid>
		<description><![CDATA[Why am I writing shell scripts in the first place, you ask? There are reasons for this too, and reasons for the reasons; there always are.]]></description>
			<content:encoded><![CDATA[<p>So I want to create a shell script that ultimately exec&#8217;s a command, after doing something like, say, setting an environment variable first:</p>
<pre>#!/bin/sh
export MY_VAR=MY_VAL
exec my_command $*</pre>
<p>(The point of using `exec my_command` rather than plain `my_command` is to not leave a /bin/sh process waiting for my_command that shows up in pstree and makes my_command not see its &#8220;real&#8221; parent process and so on.)</p>
<p>Simple enough, except it doesn&#8217;t work. If you run the script with `script &#8220;a b&#8221; c`, my_command&#8217;s arguments will be a b c (that&#8217;s three arguments instead of the original two).</p>
<p>(<strong>Update</strong> - as pointed out in the comments, &#8220;$@&#8221; instead of $* works fine, and is perfectly sufficient for my example where all the script does is setting an env var. &#8220;$@&#8221; isn&#8217;t enough if you need to fiddle with the arguments - if you need to do that, read on, otherwise just use &#8220;$@&#8221;.)</p>
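<p>To see the difference between $* and &#8220;$@&#8221; without involving a real command, here is a minimal sketch. The names count_args, forward_star and forward_at are made up for this illustration; count_args stands in for my_command and simply reports how many arguments it received.</p>

```shell
#!/bin/sh
# Hypothetical stand-in for my_command: prints the number of arguments it got.
count_args() {
  echo $#
}

# Forwards with unquoted $*: the arguments are re-split on whitespace.
forward_star() {
  count_args $*
}

# Forwards with "$@": each original argument stays a single word.
forward_at() {
  count_args "$@"
}

forward_star "a b" c   # prints 3
forward_at "a b" c     # prints 2
```

<p>So as long as the arguments only need to be passed along untouched, &#8220;$@&#8221; seems to be all you need.</p>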
<p>A common workaround <a href="http://stackoverflow.com/questions/36109/quoting-command-line-arguments-in-shell-scripts">seems to be</a>, you iterate over the arguments and you quote them and then eval:</p>
<pre>#!/bin/sh
export MY_VAR=MY_VAL
args=
for arg in "$@";
do
  args="$args '$arg'"
done
eval exec my_command $args</pre>
<p>Not so simple, but works better: &#8220;a b&#8221; c will indeed be passed to my_command as &#8220;a b&#8221; c.</p>
<p>However, it doesn&#8217;t work if the arguments contain single quotes. If you pass &#8220;<strong>&#8216;a&#8217;</strong>&#8221; (that&#8217;s double quote, single quote, the character a, single quote, double quote), my_command will get plain a. If you pass &#8220;<strong>&#8216;a b&#8217;</strong>&#8221; (double, single, a, space, b, single, double), my_command will get two arguments, a b, instead of one, &#8216;a b&#8217;.</p>
<p>What to do? One potential workaround is escaping quotes: replacing &#8216; with \&#8217;, etc. Perhaps someone with sufficiently thorough understanding of the Unix shell could pull it off; personally, I wouldn&#8217;t trust myself to take care of all the special cases, or even to fully enumerate them.</p>
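<p>For what it&#8217;s worth, one commonly cited way to do the escaping is to wrap each argument in single quotes and replace every embedded single quote with '\''. Below is a sketch of that idea, assuming a POSIX sh and sed. The names quote_arg, show_args and forward_quoted are hypothetical, and show_args stands in for my_command. I wouldn&#8217;t claim it covers every special case.</p>

```shell
#!/bin/sh
# Hypothetical stand-in for my_command: prints each argument in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

# Wrap one argument in single quotes, turning each embedded ' into '\''.
quote_arg() {
  printf '%s\n' "$1" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/"
}

# Build a string of quoted values and eval it, as in the workaround above.
forward_quoted() {
  args=
  for arg in "$@"; do
    args="$args $(quote_arg "$arg")"
  done
  eval "show_args $args"
}

forward_quoted "'a b'" c
# prints:
# ['a b']
# [c]
```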
<p>So instead, what works for me (or so I hope) is, instead of creating a string of <em>values</em> by concatenating arguments, I make a string of <em>references </em>to the arguments using the variables $1, $2, etc. How many of those are needed - is the last one $3 or $7? Ah, we can use $# to figure that out:</p>
<pre>#!/bin/sh
export MY_VAR=MY_VAL
nargs=$#
args=
while [ $nargs -gt 0 ]
do
  args="\"\$$nargs\" $args"
  nargs=`expr $nargs - 1`
done
eval exec my_command $args</pre>
<p>This handsome code generates, given three arguments, the string &#8220;$1&#8221; &#8220;$2&#8221; &#8220;$3&#8221;, and then evals it to get the three argument values, which apparently can&#8217;t cause quoting complications that are dependent on the actual argument values. With five arguments, the string &#8220;$1&#8221; &#8220;$2&#8221; &#8220;$3&#8221; &#8220;$4&#8221; &#8220;$5&#8221; is generated, and so on. (You&#8217;d use more code if you wanted to mangle some of the arguments, which is the sole reason to do this sort of thing as opposed to using &#8220;$@&#8221;.)</p>
<p>If you&#8217;re good at Unix and you know a less &#8220;\&#8221;\$$ugly\&#8221; $way&#8221; to do this, and/or a more correct one, do tell.</p>
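<p>One candidate that avoids eval altogether, offered as a sketch and not as the way the script above works: rotate through the positional parameters with shift and set --, re-appending only the arguments you want to keep. The names show_args and filter_args and the --skip-me flag are invented for this example; show_args stands in for my_command.</p>

```shell
#!/bin/sh
# Hypothetical stand-in for my_command: prints each argument in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

# Drop every --skip-me argument and forward the rest unchanged.
filter_args() {
  n=$#
  while [ "$n" -gt 0 ]; do
    arg=$1
    shift                          # remove the first argument...
    case $arg in
      --skip-me) ;;                # ...and either drop it
      *) set -- "$@" "$arg" ;;     # ...or re-append it at the end
    esac
    n=$((n - 1))
  done
  # After one full rotation, "$@" holds the kept arguments in original order.
  show_args "$@"
}

filter_args "a b" --skip-me "'c'"
# prints:
# [a b]
# ['c']
```

<p>Since the argument values are never pasted into a string that gets re-parsed, spaces and quotes in them shouldn&#8217;t matter here.</p>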
<p>(Why am I writing shell scripts in the first place, you ask? <a href="http://www.en.utexas.edu/amlit/amlitprivate/scans/chandlerart.html">There are reasons for this too, and reasons for the reasons; there always are</a>.)</p>
<p><strong>Update 2:</strong> according to a comment, $1-$9 work in sh, but $10 and up do not; they do work in bash, which is what you actually get when you ask for /bin/sh on some systems but not others.</p>
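<p>POSIX requires positional parameters with more than one digit to be written with braces, as in ${10}; a bare $10 is read as ${1} followed by a literal 0. A sketch of the reference-building loop adjusted accordingly follows. The names show_args and forward_by_reference are hypothetical, with show_args standing in for my_command.</p>

```shell
#!/bin/sh
# Hypothetical stand-in for my_command: prints each argument in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

# Build the string "${1}" "${2}" ... "${N}" and eval it.
forward_by_reference() {
  nargs=$#
  args=
  while [ "$nargs" -gt 0 ]; do
    # Braces make ${10} mean the tenth argument, not ${1} followed by 0.
    args="\"\${$nargs}\" $args"
    nargs=$((nargs - 1))
  done
  eval "show_args $args"
}

# Prints eleven lines; the last two are [ten 10] and ['eleven'].
forward_by_reference 1 2 3 4 5 6 7 8 9 "ten 10" "'eleven'"
```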
<p>I really ought to try harder to stay away from shell scripting. I mean, I know I shouldn&#8217;t, but I keep coming back. I&#8217;m like those tribesmen around the world who can&#8217;t resist the urge to drink alcohol <em>and</em> have genes making alcohol particularly addictive and bad for their health. I clearly don&#8217;t have the genetic makeup that would make *sh reasonably harmless for me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/passing-shell-script-arguments-to-a-subprocess.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Why programming isn&#8217;t for everyone</title>
		<link>http://www.yosefk.com/blog/why-programming-isnt-for-everyone.html</link>
		<comments>http://www.yosefk.com/blog/why-programming-isnt-for-everyone.html#comments</comments>
		<pubDate>Mon, 23 Jan 2012 08:10:17 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[software]]></category>

		<category><![CDATA[wetware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=156</guid>
		<description><![CDATA[Today I learned about HyperCard, a system where you could implement a basic calculator in a few easy steps, one of them involving the following impressively English-like snippet:
on mouseUp
  get name of me
  put the value of the last word of it after card field "lcd"
end mouseUp
The article depicts HyperCard as a system [...]]]></description>
			<content:encoded><![CDATA[<p>Today I learned about <a href="http://www.loper-os.org/?p=568">HyperCard</a>, a system where you could implement a basic calculator in a few easy steps, one of them involving the following impressively English-like snippet:</p>
<pre>on mouseUp
  get name of me
  put the value of the last word of it after card field "lcd"
end mouseUp</pre>
<p>The article depicts HyperCard as a system making programming accessible to people who aren&#8217;t professional developers. It is claimed that Apple likely killed off the product because it&#8217;s inconsistent with its business model (roughly, devices bought to consume rather than to create).</p>
<p>I sympathize with the sentiment - I very much like stuff you can tinker with, and dislike business models discouraging tinkering. However, I don&#8217;t think businesses have the power to prevent anything that works well for many people from happening. A conspiracy of typewriter manufacturers could never stop the PC.</p>
<p>This seems especially true with software, where huge systems can be built by volunteers in their spare time. If an idea works, if a software system wants to be built around it, it will be built.</p>
<p>Of course it may be the case that the time hasn&#8217;t come for a programming system for non-developers. It&#8217;s just my opinion that it never will come, not really. Why?</p>
<p>Not because you need much education to program. Very useful stuff can be built without knowing why optimal sorting is O(n*log(n)), or even what big O means.</p>
<p>Not because programming languages must have, or typically have, arcane syntax. As a kid, I found Pascal&#8217;s somewhat English-like &#8220;begin&#8221; and &#8220;end&#8221; off-putting, and was greatly relieved to discover Algolish braces. How close to natural language syntax can get, and whether it is at all beneficial to go there, is IMO an irrelevant question. The fact is that programming languages can be very readable to people.</p>
<p>The main reason is that development leads to maintenance, and maintenance leads to suffering.</p>
<p>For example, if your program stores persistent data, and you want to change it, your changes to the program must be made so as to preserve the meaning of existing data. This part of development causes major pain everywhere, from video recording to financial databases to compiler construction. No amount of knowledge and no amount of support from the tools make this fun.</p>
<p>There are many other things. Everything in your program&#8217;s environment is unstable and you must constantly update the program to keep up. Your program gets cluttered with options and you forget what does what. There are cases you didn&#8217;t test - spaces in the names, empty data fields, reverse order of operations.</p>
<p>As a result, maintenance means dealing with misbehaving programs that eat data, send trash around, or simply make you wait for an hour and then watch them produce garbage.</p>
<p>This never ends and quickly stops being fun. When something useful cannot be done quickly and isn&#8217;t the average person&#8217;s idea of fun, it becomes the business of professionals - or hardcore hobbyists indistinguishable from professionals. As a counter-example, many people like cooking in their spare time without necessarily getting close to the level of a chef or spending that much time cooking. <a href="http://en.wikipedia.org/wiki/Con_Kolivas">Con Kolivas</a>, on the other hand, could technically be called a &#8220;hobbyist&#8221;, but he could be called a &#8220;professional&#8221; as well.</p>
<p>Maybe I&#8217;m wrong, maybe there are plenty of places where a sprinkle of logic - in textual form or graphical form or whatever form - can be figured out quickly, left alone and be useful ever after. It&#8217;s just that it&#8217;s usually the opposite with me. Every time I have a nice little idea it takes me 10x the time it &#8220;should&#8221; take to implement, and most things keep biting me once in a while for a long time.</p>
<p>Programming isn&#8217;t for everyone because it is not fun to maintain what was fun to program.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/why-programming-isnt-for-everyone.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Compensation, rationality and the project/person fit</title>
		<link>http://www.yosefk.com/blog/compensation-rationality-and-the-projectperson-fit.html</link>
		<comments>http://www.yosefk.com/blog/compensation-rationality-and-the-projectperson-fit.html#comments</comments>
		<pubDate>Tue, 03 Jan 2012 08:47:00 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[wetware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=153</guid>
		<description><![CDATA[To negotiate a compensation, you need to compare to something. There are two principally different things people compare compensation to:

Available alternatives. Employee: &#8220;I could get twice as much at Microsoft.&#8221; Employer: &#8220;We can hire Bob for a half of your salary.&#8221;
Peers&#8217; compensation. Employee: &#8220;Jeff gets twice as much and he&#8217;s not better than me.&#8221; Employer: [...]]]></description>
			<content:encoded><![CDATA[<p>To negotiate a compensation, you need to compare to something. There are two principally different things people compare compensation to:</p>
<ol>
<li><strong style="font-weight: bold;">Available alternatives.</strong> Employee: &#8220;I could get twice as much at Microsoft.&#8221; Employer: &#8220;We can hire Bob for a half of your salary.&#8221;</li>
<li><strong style="font-weight: bold;">Peers&#8217; compensation.</strong> Employee: &#8220;Jeff gets twice as much and he&#8217;s not better than me.&#8221; Employer: &#8220;John gets half your salary and you&#8217;re not better than him.&#8221;</li>
</ol>
<p>I believe the second approach - comparing &#8220;ability&#8221; and having a common level of compensation for people &#8220;at the same level of ability&#8221; - is the worse approach. Its main drawbacks are:</p>
<ul>
<li>People look into each other&#8217;s pockets too much that way</li>
<li>It is, in a basic economic sense, an &#8220;irrational&#8221; approach</li>
<li>It ignores the project/person fit</li>
</ul>
<p>I&#8217;ll discuss all of these drawbacks, mostly focusing on ignoring the project/person fit - in my opinion, the worst part.</p>
<p><strong style="font-weight: bold;">Looking into each other&#8217;s pockets</strong></p>
<p>Even if management doesn&#8217;t disclose the way people are labeled and what compensation corresponds to each label, people have an incentive to find out all about this. This means that everyone will know how much everyone else gets, and how one must be labeled to earn a given amount.</p>
<p>People looking into each other&#8217;s pockets is bad for everyone:</p>
<ul>
<li>Invariably people will find others&#8217; compensation unjust, which doesn&#8217;t improve team spirit.</li>
<li>You get <a href="http://en.wikipedia.org/wiki/Peter_Principle">Peter Principle</a> situations where, say, a strong developer&#8217;s only way to get a raise is to become a manager, at which he might very well suck, etc.</li>
<li>Sometimes the employer does want to set exceptional conditions for someone - pay someone significantly more or less than someone else with the same title. However, if everyone tends to find out about everyone else&#8217;s compensation, it becomes hard to make these exceptions as it is guaranteed to upset people.</li>
</ul>
<p>Others&#8217; compensation is one of those things that are better left unknown. It&#8217;s a pity if you tempt people to find it out.</p>
<p><strong style="font-weight: bold;">Comparing to imaginary alternatives is &#8220;irrational&#8221;</strong></p>
<p>If I&#8217;m working on X, Jeff works on Y and John works on Z, it makes no sense to compare my compensation to theirs. Whoever is unhappy with the current arrangements and threatens to terminate them - that is, whether I quit or get fired - neither Jeff nor John will replace me, nor will I replace them.</p>
<p>Jeff and John usually have to keep working on Y and Z, so they can&#8217;t work on X if I quit. Nor will I work on Y and Z - even if I quit, not the company, but just my team, and join their team in the same company. They&#8217;re already there working on Y and Z - so I won&#8217;t work on Y or Z, but on W.</p>
<p>Therefore, the employer should compare my compensation to what he&#8217;d pay someone else to do X, including the cost of training him. I should compare to what I&#8217;d be paid to do W, including the cost of having to learn to do it.</p>
<p>Why should we compare to these things and not others? Because these are our actual alternatives. Jeff&#8217;s and John&#8217;s compensation has nothing to do with our actual alternatives.</p>
<p>To which someone can legitimately still reply: <em>why?</em> Someone can say, I still want to compare to Jeff&#8217;s and John&#8217;s compensation. So what if you&#8217;re saying that it&#8217;s &#8220;economically irrational&#8221; to consider things unrelated to the real alternatives in a price negotiation? It&#8217;s <em>my </em>price negotiation, I can compare to whatever I want!</p>
<p>That someone would be right, in a way. It&#8217;s not like there&#8217;s a monopoly on the definition of &#8220;economic rationality&#8221; - one could certainly find an economist claiming that looking at your peers <em style="font-style: italic;">is</em> the rational thing to do, or at least the natural thing to do.</p>
<p>Say, Robert Frank - &#8220;<a href="http://www.amazon.com/Choosing-Right-Pond-Behavior-Status/dp/0195049454">Choosing the Right Pond</a>&#8221;. You know, evolutionary considerations - you&#8217;re trying to impress a potential mate with your salary, the mate compares within the &#8220;pond&#8221;, an unusually high salary is an externality, etc.</p>
<p>Basically it&#8217;s partners that you compete for, and it&#8217;s your peers who you compete against, so it&#8217;s their compensation that you should care about. (Does this sound just like your workplace? I hope not&#8230;)</p>
<p>As an aside, I don&#8217;t understand evolutionary definitions of &#8220;rationality&#8221;, not really. I mean, if the ultimate goal is to pass your genes, shouldn&#8217;t you become a serial rapist targeting nuns or someone else who isn&#8217;t likely to use abortion? If you aren&#8217;t doing this, and you advocate the evolutionary view of rationality, aren&#8217;t you proving your own irrationality by your own actions? And if you are irrational, then why should irrational people like you be trusted to define rationality in the first place?</p>
<p>But the fact that I don&#8217;t like the &#8220;evolutionary&#8221; view of rationality and prefer, in this context, the &#8220;classical economics&#8221; definition is just my opinion. An employer can have his own - just like a friend who kept trying to sell his car, for a long time, until he found someone willing to pay the high price.</p>
<p>Another friend said, when they discussed markets, &#8220;what you did is irrational - markets don&#8217;t behave that way - in a market, you lower the price if you don&#8217;t have a buyer&#8221;. To which the seller responded - &#8220;first, I did sell high eventually; second - you can&#8217;t tell me how markets behave - I <em style="font-style: italic;">am</em> the market!&#8221;</p>
<p>So yeah, if you&#8217;re an employer or an employee and you want to compare compensations regardless of what alternatives are actually available - you can of course do this. You <em style="font-style: italic;">are </em>the market - economists, bloggers or anyone else can try to describe your behavior and predict its outcomes, but they aren&#8217;t entitled to label it &#8220;rational&#8221; or &#8220;irrational&#8221;, not really.</p>
<p>All that <em>can </em>be said is that considering imaginary alternatives instead of the real ones can very well <em>make</em> you face the real ones.</p>
<p>That is, suppose you say to an employee, &#8220;John gets half your salary and you&#8217;re not better than him.&#8221; Suppose the employee replies, &#8220;I could get twice as much at Microsoft.&#8221; His alternative is real - he quits. Your alternative is <em style="font-style: italic;">not</em> real - John is not available to replace the guy who quit. Now you&#8217;re facing your real alternatives - which can be much worse than raising the guy&#8217;s salary would have been.</p>
<p>Isn&#8217;t it a better idea to consider your real alternatives during the negotiations?</p>
<p>To which one could reply - how bad can those alternatives be, really? I mean, we hired John, right? And he&#8217;s just as good. So we can always hire this sort of person for this sort of price, right? Yeah, there are the training costs, but that&#8217;s all there is to it, no?</p>
<p>I believe that there&#8217;s more to it than training costs. The big thing is the project/person fit.</p>
<p><strong style="font-weight: bold;">The project/person fit</strong></p>
<p>It&#8217;s magical. If a person <em style="font-style: italic;">wants </em>to do something, I&#8217;m <em>so </em>much in favor of letting them, whatever other things they&#8217;d have to stop doing. I mean, there are things which nobody will ever do except the one person - or maybe one of two or three people - to whom it&#8217;s <em style="font-style: italic;">important</em>.</p>
<p>Or someone could do it, but not nearly as well. And not because he&#8217;s &#8220;worse&#8221; - he may be &#8220;better&#8221; on all the common benchmarks (IQ, grades, reputation, whatever). He&#8217;s not &#8220;worse&#8221; in any quantifiable way, but it just doesn&#8217;t click - the project is not a good fit for him.</p>
<p>It&#8217;s a depressing thought for a manager - a part of a manager&#8217;s helplessness. A manager can&#8217;t do <em style="font-style: italic;">anything</em> himself - the most helpless creature around. He&#8217;s always responsible for what other people do. He can pick the people, talk to people, negotiate with people, reshuffle people. But that is all he can do - and not a single bit of real work that must be done to make his project succeed.</p>
<p>This means an extreme dependence on other people, which is stressful. The project/person fit makes this much worse. You&#8217;re basically constrained to not move people away from projects when there&#8217;s this magical click. They&#8217;re irreplaceable, so you depend on them tremendously - not very comforting. So it&#8217;s natural to argue that this magic business doesn&#8217;t really exist - everyone is replaceable.</p>
<p>Now, I&#8217;m not saying that people actually &#8220;can&#8217;t be replaced&#8221; - far from it. That thought would make me lose sleep as a team leader - and it would offend me as a programmer.</p>
<p>I mean, if our processors are &#8220;universal computing machines&#8221;, then surely programmers ought to be universal as well, right? I much prefer to think of myself as a &#8220;replaceable cog&#8221; - but a <em style="font-style: italic;">universal</em> cog - than an irreplaceable part of the peculiar machinery of my current workplace, obviously useless outside it because of my extreme specialization.</p>
<p>So actually I&#8217;m at <em style="font-style: italic;">the other</em> extreme on this one, most likely - I don&#8217;t think very much of &#8220;relevant experience&#8221;, and I&#8217;ll be the first to say that a person new to something will cope with it very well, don&#8217;t worry. Everyone is replaceable, because everyone can deal with everything.</p>
<p>For instance, in our recent round of work on hardware verification, we had a tough deadline, so there was a single hardware module that 5 programmers worked on. Of them, 3 had no experience in hardware verification at all, so they had to learn about hardware simulators and waveform viewers and stuff.</p>
<p>Normally, just one person would do that work, but it&#8217;d take longer and we couldn&#8217;t afford the latency. We also had to swap people in and out to do other things, and they had to continue where the previous person left off. And it worked, basically. So I think I&#8217;m very much at the other extreme - programmers are universal, and they&#8217;ll deal.</p>
<p>What do I mean by this &#8220;project/person fit&#8221; then?</p>
<p>What I mean is that there&#8217;s still a 10x productivity difference between a person struggling with this important stuff that you dumped on them but they kinda don&#8217;t understand or care about very much, and a person who <em style="font-style: italic;">wants</em> the thing done.</p>
<p>Actually it&#8217;s more than 10x - you can&#8217;t quantify it, it&#8217;s qualitatively different. A bird doesn&#8217;t just move faster than a snail. You can&#8217;t express the difference between crawling and flying in a single number, even if your HR policy mandates this sort of quantification.</p>
<p><em><strong>People have their own priorities</strong></em></p>
<p>A manager classifies things as important and unimportant, and he might be tempted to think that somebody gives a damn about his view of these matters.</p>
<p>But they don&#8217;t give a damn. <em style="font-style: italic;">They </em>classify things as &#8220;stuff the manager wants&#8221; and &#8220;stuff that they want&#8221;. Stuff that&#8217;s only important to them because you said so crawls. Stuff that they feel is important and interesting flies.</p>
<p>Managers might think that work gets done because they want it done. It&#8217;s true - but <em>the best </em>work gets done because <em>people who do it</em> want it done.</p>
<p>And people are amazing in the diversity of their tastes. Taste depends on many things - personal talents and interests, personal history that makes some problems closer to your heart than others, and so on - but no matter what the reasons are, the result is that tastes are wildly different.</p>
<p>Consider the following things, all among the stuff our team works on:</p>
<ul>
<li>A distributed build &amp; run server.</li>
<li>A debugger agent - porting gdb to custom hardware and OS.</li>
<li>A graphical editor for tagging objects in video clips.</li>
<li>A static memory manager built around C language extensions and a constraint solver.</li>
</ul>
<p>I think all of them are important, and all of them are interesting. As a programmer, I&#8217;d work on any of them. I mean, does any of this sound like boring grunt work? Certainly they&#8217;re all nicer than verifying a hardware module that you didn&#8217;t specify under time pressure, at least if you ask me.</p>
<p>However, in my team, there&#8217;s just one, two, sometimes ~1.5 people who actually <em style="font-style: italic;">want </em>to work on each of these things. Moreover, most of them have an aversion to most other things on the list.</p>
<p>Now, if it was strictly necessary, any of them would work on any of these projects. And they&#8217;d do a good job even if they got the one they hated the most. But it&#8217;d be uninspired, and nobody could blame them.</p>
<p>How easy is it to find someone who&#8217;d love to do a project? I&#8217;ll tell you - real damn hard. I mean, I&#8217;m a language geek; in my opinion, everyone wants to work on programming language extensions. And you know what? They don&#8217;t. Not really. Most don&#8217;t want to at all. Then many like the idea, in principle. But not that kind of language, or not that kind of extension. There&#8217;s no spark in their eyes - until the right person shows up.</p>
<p>Similarly with the other things. You&#8217;d think that a person who likes the debugger agent would also like the distributed build server, no? I&#8217;d expect that, definitely - but she doesn&#8217;t. And you can&#8217;t make someone like something. Usually you can&#8217;t even pay them to like it. They just won&#8217;t.</p>
<p>Some projects are optional. With these, I will wait for <em style="font-style: italic;">years</em> until the right person shows up. I feel guilty - people are asking for it, it&#8217;d be great if we had it, it would become an enabler for things now unthinkable. But who am I kidding? Nobody wants to do this now, not really. Better wait until he shows up.</p>
<p>When he shows up, what do I say? I say, <em style="font-style: italic;">keep him</em>. Really. Don&#8217;t let the thing turn into a wasteland just because programmers (actually) are universal, replaceable cogs!</p>
<p>Some projects are not optional - you must do them no matter what. When there&#8217;s no right person to do such a thing - watch years of work, tears and sweat produce a mountain of code dripping with hate and depression. I&#8217;m serious - sometimes I can actually look at code and see how nobody ever loved it.</p>
<p>I&#8217;ve seen brilliant people produce disgusting code nobody wants to touch. Certainly I couldn&#8217;t help it myself - my sense of duty did not help. I did it on time, it worked, and it was a toxic waste.</p>
<p>It doesn&#8217;t help that the manager thinks it&#8217;s important. It doesn&#8217;t help that I agree with him. If I don&#8217;t like it, I won&#8217;t do it very well.</p>
<p>Sometimes - many times - the right person arrives years after the wrong people - the wrong people <em style="font-style: italic;">for this project</em> - have been spitting blood trying to make it work. It takes a few months and the scenery is transformed. Mountains of hate are <em style="font-style: italic;">gone</em>. You have a working system. People who lost hope for this particular area to ever become habitable, to stop smelling of fail, suddenly smile.</p>
<p>Would you let that person go, just because John is &#8220;just as good&#8221; and you pay him less? There is <em style="font-style: italic;">no way</em> that John is going to take over this thing. Even if he&#8217;s available. He isn&#8217;t interested. He couldn&#8217;t care less. He could take over just like anyone else, but it&#8217;d be toxic waste all over again. <em style="font-style: italic;">Come on!</em></p>
<p>Sometimes a programmer will be moved away from a project - or not be allowed to do it - because of his already high compensation. &#8220;We can find someone cheaper to do this&#8221;. Yes - but not someone who <em style="font-style: italic;">wants</em> to do this! This just brings tears to my eyes.</p>
<p><em><strong>But if he loves the project, he won&#8217;t quit, right?</strong></em></p>
<p>Good thinking. People who can be replaced with someone like John should therefore be compared to John. People who <em>can&#8217;t </em>be replaced with someone like John can still be compared to him - they&#8217;re the ones who love their work, so they likely won&#8217;t quit, and then we can sensibly compare everyone to everyone in a reasonable manner.</p>
<p>They&#8217;ll quit.</p>
<p>They&#8217;ll quit even if it&#8217;s &#8220;irrational&#8221; for them. People can quit a project they love over compensation, and then spend years until they find something nice to work on. Often they feel it wasn&#8217;t worth it, or at least are unhappy with their working situation.</p>
<p>But it doesn&#8217;t help that you were right and that they should have stayed, settling for the fair compensation level of John and working on their favorite stuff. It doesn&#8217;t help because the loss is yours as well.</p>
<p>Why do people behave in this &#8220;irrational&#8221; way, apart from having too high expectations about their alternatives? The economist David Friedman gives <a href="http://www.daviddfriedman.com/Academic/econ_and_evol_psych/economics_and_evol_psych.html">an evolutionary explanation</a>, if you like that sort of thing:</p>
<blockquote><p>&#8230;human beings regard the usual terms of exchange as right and any deviation from those terms that makes them worse off as a presumptively wicked act by the other party. This feature resulted in human beings that possessed it getting better terms in bilateral monopoly bargains in the environment in which we evolved&#8230;</p></blockquote>
<p>&#8220;Bilateral monopoly&#8221; is basically the situation you and your employee find yourselves in once a project &#8220;clicks&#8221; with him. It&#8217;s hard for you to replace him - and it&#8217;s hard for him to replace you. This may tempt you to lower the price you&#8217;re willing to pay. The response Mother Nature has equipped us with for these cases is that the employee thinks you&#8217;re wicked, and he quits.</p>
<p>This reaction is &#8220;irrational&#8221; - in the sense that he&#8217;s now worse off. But it&#8217;s very much &#8220;rational&#8221;, in the sense that <em>the threat </em>of &#8220;irrational&#8221; quitting should improve his terms - <em>if you know that the threat is real,</em> despite the fact that actually quitting would make him worse off.</p>
<p>Well, in my experience, the threat is very real alright. Worth taking into account.</p>
<p><strong>Why management likes to set standard compensation levels</strong></p>
<p>I suspect the benefit is that it makes decision-making easier on the scale of a large company. It works reasonably well and is very easy to implement. It&#8217;s a bit like using a simple heuristic in code because it&#8217;s just 5 lines of code and it sort of works.</p>
<p>&#8220;<a href="http://en.wikipedia.org/wiki/Bounded_rationality">Bounded rationality</a>&#8221;, if you like (&#8230;isn&#8217;t &#8220;bounded rationality&#8221; what used to be called &#8220;stupidity&#8221;? Aren&#8217;t &#8220;the cognitive limitations of the mind&#8221; mentioned in the article also called &#8220;stupidity&#8221;? I&#8217;m not mocking stupidity - I&#8217;m certainly equipped with a high degree of stupidity myself, and you can trace its influence on my decision-making. I&#8217;m just wondering why invent new terms when we already have perfectly good ones.)</p>
<p>Anyway, if you know why standard compensation levels are a good idea - a rational argument for them in an <em>unbounded </em>way - let me know in the comments. Puzzles me plenty.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/compensation-rationality-and-the-projectperson-fit.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Cycles, memory, fuel and parking</title>
		<link>http://www.yosefk.com/blog/cycles-memory-fuel-and-parking.html</link>
		<comments>http://www.yosefk.com/blog/cycles-memory-fuel-and-parking.html#comments</comments>
		<pubDate>Sat, 31 Dec 2011 20:52:23 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[software]]></category>

		<category><![CDATA[wetware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=152</guid>
		<description><![CDATA[In high-performance, resource-constrained projects, you&#8217;re not likely to suddenly run out of cycles - but you&#8217;re quite likely to suddenly run out of memory. I think it&#8217;s a bit similar to how it&#8217;s easy to buy fuel for your car - but sometimes you just can&#8217;t find a parking spot.
I think the difference comes from [...]]]></description>
			<content:encoded><![CDATA[<p>In high-performance, resource-constrained projects, you&#8217;re not likely to suddenly run out of cycles - but you&#8217;re quite likely to suddenly run out of memory. I think it&#8217;s a bit similar to how it&#8217;s easy to buy fuel for your car - but sometimes you just can&#8217;t find a parking spot.</p>
<p>I think the difference comes from pricing. Processor cycles are priced similarly to fuel, whereas memory bytes are priced similarly to parking spots. I think I know the problem but not the solution - and will be glad to hear suggestions for a solution.</p>
<p><strong>Cycles: gradual price adjustment</strong></p>
<p>If you work on a processing-intensive project - rendering, simulation, machine learning - then, roughly, every time someone adds a feature, the program becomes a bit slower. Every slowdown makes the program a bit &#8220;worse&#8221; - marginally less usable and more annoying.</p>
<p>What this means is that every slowdown is frowned upon, and the slower the program becomes, the more a new slowdown is frowned upon. From &#8220;it got slower&#8221; to &#8220;it&#8217;s annoyingly slow&#8221; to &#8220;we don&#8217;t want to ship it this way&#8221; to &#8220;listen, we just can&#8217;t ship it this way&#8221; - effectively, a developer slowing things down pays an increasingly high price. Not money, but a real price nonetheless - organizational pressure to optimize is proportionate to the amount of cycles spent.</p>
<p>Therefore, you can&#8217;t &#8220;suddenly&#8221; run out of cycles - long before you <em>really </em>can&#8217;t ship the program, there will be a growing pressure to optimize.</p>
<p>This is a bit similar to fuel prices - we can&#8217;t &#8220;suddenly&#8221; run out of fuel. Rather, fuel prices will rise long before there&#8217;ll actually be no fossil fuels left to dig out of the ground. (I&#8217;m not saying prices will rise &#8220;early enough to readjust&#8221;, whatever &#8220;enough&#8221; means and whatever the options to &#8220;readjust&#8221; are -  just that prices will rise much earlier in absolute terms, at least 5-10 years earlier).</p>
<p>This also means that there can be no fuel shortages. When prices rise, less is purchased, but there&#8217;s always (expensive) fuel waiting for those willing to pay the price. Similarly, when cycles become scarce, everyone spends more effort optimizing (pays a higher price), and some features become too costly to add (less is purchased) - but when you really need cycles, you can get them.</p>
<p><strong>Memory: price jumps from zero to infinity</strong></p>
<p>When there&#8217;s enough memory, the cost of an allocated byte is zero. Nobody notices the memory footprint - roughly, RAM truly is RAM, the cost of memory access is the same no matter where objects are located and how much memory they occupy together. So who cares?</p>
<p>However, there comes a moment where the process won&#8217;t fit into RAM anymore. If there&#8217;s no swap space (embedded devices), the cost of an allocated byte immediately jumps to infinity - the program won&#8217;t run. Even if swapping is supported, once your working set doesn&#8217;t fit into memory, things get very slow. So going past that limit is very costly - whereas getting near it costs nothing.</p>
<p>Since nobody cares about memory before you hit some arbitrary limit, this moment can be very sudden: without warning, suddenly you can&#8217;t allocate anything.</p>
<p>This is a bit similar to a parking lot, where the first vehicle is as cheap to park as the next and the last - and then you can&#8217;t park at all. Actually, it&#8217;s even worse - memory is more similar to an unmarked parking lot, where people park any way they like, leaving much unused space. Then when a new car arrives, it can&#8217;t be parked unless every other car is moved - but the drivers are not there.</p>
<p>(Actually, an unmarked parking lot is analogous to fragmented memory, and it&#8217;s solved by heap compaction at the cost of some runtime latency. But the biggest problem with real memory is that people allocate many big chunks where few small ones could be used, and probably would be used if memory cost was something above zero. Can you think of a real-world analogy for that?..)</p>
<p><strong>Why not price memory the way we price cycles?</strong></p>
<p>I&#8217;d very much like to find a way to price memory - both instructions and data - the way we naturally price cycles. It&#8217;d be nice to have organizational pressure mount proportionately to the amount of memory spent.</p>
<p>But I just don&#8217;t quite see how to do it, except in environments where it happens naturally. For instance, on a server farm, a larger memory footprint can mean that you need more servers - pressure naturally mounts to reduce the footprint. Not so on a dedicated PC or an embedded device.</p>
<p>Why isn&#8217;t parking like fuel, for that matter? Why are there so many places where you&#8217;d expect to find huge underground parking lots - everybody wants to park there - but instead find parking shortages? Why doesn&#8217;t the price of parking spots rise as spots become taken, at least where I live?</p>
<p>Well, basically, fuel is not parking - you can transport fuel but not parking spots, for example, so it&#8217;s a different kind of competition - and then we treat them differently for whatever social reason. I&#8217;m not going to dwell on fuel vs parking - it&#8217;s my analogy, not my subject. But, just as an example, it&#8217;s <a href="http://en.wikipedia.org/wiki/1979_energy_crisis">perfectly possible</a> to establish fuel price controls and get fuel shortages, and then fuel becomes like parking, in a bad way. Likewise, you could implement dynamic pricing of parking spots - more easily with today&#8217;s technology than, say, 50 years ago.</p>
<p>Back to cycles vs memory - you could, in theory, &#8220;start worrying&#8221; long before you&#8217;re out of memory, seeing that memory consumption increases. It&#8217;s just not how worrying works, though. If you have 1G of memory, everybody knows that you can ship the program when it consumes 950M as easily as when it consumes 250M. Developers just shrug and move along. With speed, you genuinely start worrying when it starts dropping, because both you and the users notice the drop - even if the program is &#8220;still usable&#8221;.</p>
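<p>The two pricing schemes can be sketched as a pair of functions. This is only an illustration: the function names are made up, and the linear shape of the &#8220;cycle price&#8221; is an assumption, not a measurement of how organizational pressure really grows.</p>

```cpp
#include <cassert>
#include <limits>

// Hypothetical "price" a developer pays for using more of a resource,
// given how much of it is already used. Illustration only.

// Cycles: pressure grows gradually with usage (assumed linear here).
double cycle_price(double used, double budget) {
    assert(budget > 0);
    return used / budget;
}

// Memory: free until the limit is hit, then the program simply won't run.
double memory_price(double used, double limit) {
    assert(limit > 0);
    return used < limit ? 0.0 : std::numeric_limits<double>::infinity();
}
```

<p>With a 1G limit, memory_price is zero at 250M and still zero at 950M, while cycle_price already tells the two apart - which is roughly why nobody worries about memory until it&#8217;s too late.</p>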
<p>It&#8217;s pretty hard to create &#8220;artificial worries&#8221;. Maybe it&#8217;s a cultural thing - maybe some organizations more easily believe in goals set by management than others. If a manager says, &#8220;reduce memory consumption&#8221;, do you say &#8220;Yes, sir!&#8221; - or do you say, &#8220;Are you serious? We have 100M available - but features X, Y and Z are not implemented and users want them!&#8221;</p>
<p>Do you seriously fight to achieve nominal goals, or do you only care about the ultimate goals of the project? Does management reward you for achieving nominal goals, or does it ultimately only care about real goals?</p>
<p>If the organization believes in nominal goals, then it can actually start optimizing memory consumption long before it runs out of memory - but sincerely believing in nominal goals is dangerous. There&#8217;s something very healthy in a culture skeptical about anything that sounds good but clearly isn&#8217;t the most important and urgent thing to do. Without that skepticism, it&#8217;s easy to get off track.</p>
<p>How would you go about creating a &#8220;memory-consumption-aware culture&#8221;? I can think of nothing except paying per byte saved - but, while it sounds like a good direction with parking spots, with developers it could become quite a perverse incentive&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/cycles-memory-fuel-and-parking.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Could SOPA give us back a decentralized Internet?</title>
		<link>http://www.yosefk.com/blog/could-sopa-give-us-back-a-decentralized-internet.html</link>
		<comments>http://www.yosefk.com/blog/could-sopa-give-us-back-a-decentralized-internet.html#comments</comments>
		<pubDate>Mon, 19 Dec 2011 08:04:03 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[OT]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=149</guid>
		<description><![CDATA[I don&#8217;t think SOPA will fly, ultimately. It benefits content companies at the expense of technology companies which by now seem to have deeper pockets. Technology companies will find a way to undo SOPA if it passes.
But suppose it passes and is consistently enforced. This threatens sites &#8220;enabling or facilitating copyright infringement&#8221; - what are [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t think SOPA will fly, ultimately. It benefits content companies at the expense of technology companies which by now seem to have deeper pockets. Technology companies will find a way to undo SOPA if it passes.</p>
<p>But suppose it passes and is consistently enforced. This threatens sites &#8220;enabling or facilitating copyright infringement&#8221; - what are those?</p>
<p>Standalone personal sites probably aren&#8217;t threatened. You know what you publish, and if you publish copyrighted content, you can easily remove it. Gmail probably isn&#8217;t threatened because data isn&#8217;t publicly available. SOPA does <a href="http://blog.wikimedia.org/2011/12/13/how-sopa-will-hurt-the-free-web-and-wikipedia/">threaten Wikipedia</a>, because you&#8217;re supposed to not link to &#8220;infringing sites&#8221; (which could be anything) - but it probably doesn&#8217;t threaten them through the content actually on the site, since they&#8217;re very careful not to use copyrighted content.</p>
<p>Which sites are threatened the most? Facebook, YouTube, blogging and social networking sites. Plenty of copyrighted content gets uploaded to these. If SOPA is trimmed to exclude links to &#8220;infringing sites&#8221;, then it is mostly &#8220;social&#8221; sites which are targeted.</p>
<p>Are these sites a good development in the Internet world? It&#8217;s definitely not what the Internet was supposed to look like. Instead of many individual sites, we now have a few huge sites keeping most of the published data, together with much personal information, with very few obligations to users. &#8220;<a href="http://gawker.com/5636765/facebook-ceo-admits-to-calling-users-dumb-fucks">They trust me - dumb fucks</a>&#8221;, as the Facebook CEO put it.</p>
<p>Wouldn&#8217;t it be great if instead of big social sites, we had big hosting companies and many independent individual sites? Wouldn&#8217;t it be great if the many independent sites were all using public protocols to exchange data - using the Internet network and not the Facebook network? Wouldn&#8217;t it be great if no &#8220;social engineer&#8221; could oversee our communication?</p>
<p>Couldn&#8217;t SOPA do just that - make it unaffordable to manage a proprietary network like Facebook on top of the Internet, giving us back a decentralized Internet? Facebook convinced hundreds of millions of users that it&#8217;s fun to be on the Internet, read stuff, write stuff. Couldn&#8217;t SOPA then force people out of Facebook and bring them to the actual Internet?</p>
<p>Hosting companies that make publishing easy - on your site, under your domain, with data under your full control and responsibility - could use the opportunity. It&#8217;s well past time that running an actual site is feasible on this fabulous Internet network. With all these proprietary networks on top, what normal person runs a site today, or even knows what it means? Wouldn&#8217;t it be great if they finally started?</p>
<p>And yeah, I realize it&#8217;s not going to be like that. Facebook will manage to shoot this legislation down. If it doesn&#8217;t, then it&#8217;ll manage to work around its enforcement. And if it doesn&#8217;t, any site with a link to any other site is probably threatened - definitely Wikipedia, Reddit, HN&#8230;</p>
<p>So yeah, it&#8217;s going to be much worse. But I can dream, can&#8217;t I?</p>
<p>(And couldn&#8217;t you think of a way to distribute the hosting of user-generated content - like news links or Wikipedia articles - and give a unified view at the client side? Then one couldn&#8217;t target &#8220;the Wikipedia site&#8221; - there wouldn&#8217;t be any - but only a specific portion. Wouldn&#8217;t it be better for users, in some ways?)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/could-sopa-give-us-back-a-decentralized-internet.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Coding standards: is consistency prettier than freedom?</title>
		<link>http://www.yosefk.com/blog/coding-standards-is-consistency-prettier-than-freedom.html</link>
		<comments>http://www.yosefk.com/blog/coding-standards-is-consistency-prettier-than-freedom.html#comments</comments>
		<pubDate>Mon, 12 Dec 2011 23:09:29 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=147</guid>
		<description><![CDATA[Different projects have different coding standards, and some have none at all. How does it affect the quality of code and developers&#8217; well-being? What results can we reasonably expect from a style guide?
Let&#8217;s have a look at the effect of style guides in the real world. Here&#8217;s what Jerusalem looks like:

These similarly looking buildings are near [...]]]></description>
			<content:encoded><![CDATA[<p>Different projects have different coding standards, and some have none at all. How does it affect the quality of code and developers&#8217; well-being? What results can we reasonably expect from a style guide?</p>
<p>Let&#8217;s have a look at the effect of style guides in the real world. Here&#8217;s what Jerusalem looks like:</p>
<p><img src="http://yosefk.com/img/n/jerusalem-center.png" alt="" /></p>
<p>These similar-looking buildings are near the city center. Here&#8217;s a shot of the suburbs:</p>
<p><img src="http://yosefk.com/img/n/jerusalem-suburbs.png" alt="" /></p>
<p>Same stuff, pretty much. White buildings, red roof tiles - or plain flat roofs.</p>
<p>And now for something completely different:</p>
<p><img src="http://yosefk.com/img/n/florentin.png" alt="" /></p>
<p>This is Tel Aviv. Buildings don&#8217;t look similar to each other here. Nor do different parts of the city:</p>
<p><img src="http://yosefk.com/img/n/telaviv-view.png" alt="" /></p>
<p>As you can see, in Tel Aviv, there&#8217;s no style guide - everyone builds stuff to suit their own taste.</p>
<p>In Jerusalem, on the other hand, buildings have to be covered with Jerusalem stone, giving them their trademark off-white color. Jerusalem owes its visual consistency to <a href="http://en.wikipedia.org/wiki/Jerusalem_stone#Use_in_building">a century-old style guide</a> enforced by municipal laws.</p>
<p>Here are a few observations - relevant to most style guides, I think:</p>
<ul>
<li><strong>Consistent style is either enforced or lacking.</strong> Whatever virtues freedom may have, consistency of style is not one of them.</li>
<li><strong>Consistent style is functionally inconsequential.</strong> Buildings in Jerusalem are about as safe and comfortable as buildings in Tel Aviv.</li>
<li><strong>Psychologically, style does matter.</strong> Many people hate visiting Tel Aviv because it&#8217;s ugly.</li>
<li><strong>Whether consistent style is more beautiful is debatable. </strong>Many other people hate visiting <em>Jerusalem</em> because it&#8217;s ugly.</li>
<li><strong>People will defeat stylistic consistency despite the style guide.</strong> Here&#8217;s an example from one of Jerusalem&#8217;s suburbs, Ramot Polin - &#8220;<a href="http://en.wikipedia.org/wiki/Ramot_Polin#Architecture">a housing project for honeybees</a>&#8221;:</li>
</ul>
<p><img src="http://yosefk.com/img/n/ramot-polin.png" alt="" /></p>
<p>This is consistent with <em>the style guide</em>, but not very consistent with <em>the actual look of other buildings</em> - nor does it look very comfortable. Leading to my last observation:</p>
<ul>
<li><strong>A common style can be codified and enforced, but common sense can&#8217;t be.</strong> Municipal law mentioned &#8220;off-white&#8221;, but who would have thought to mention &#8220;rectangular&#8221;?</li>
</ul>
<p>A sensible style guide is your one and only way to achieve <strong>consistent style</strong> - and not much else.</p>
<p>What if a style guide is <em>not</em> sensible? Here&#8217;s a building from Tirana, Albania:</p>
<p><img src="http://yosefk.com/img/n/tirana-blue.png" alt="" /></p>
<p>Here&#8217;s another one:</p>
<p><img src="http://yosefk.com/img/n/tirana-colors.png" alt="" /></p>
<p>Yep, that&#8217;s the style guide over there - bright colors over ugly buildings. And there&#8217;s nowhere to hide from the consistent style.</p>
<p>Maybe you actually love this style, and hate Jerusalem&#8217;s uniform off-white. My point is that either way, the consistent style of one of those cities leaves no place for you to like.</p>
<p>Tel Aviv, on the other hand, has a place to like for both Tirana lovers and Jerusalem lovers. Off-white houses with red rooftops? Neve Tzedek has what you want:</p>
<p><img src="http://yosefk.com/img/n/neve-tzedek.png" alt="" /></p>
<p>Buildings painted in primary colors? Here&#8217;s a hotel for you:</p>
<p><img src="http://yosefk.com/img/n/dan-hotel.png" alt="" /></p>
<p>Personally, I still prefer Jerusalem though. Consistent style is better - if you like that particular style.</p>
<p><strong>Requiring limestone vs banning asbestos</strong></p>
<p>Can coding standards be described as style guides, or are they more than that?</p>
<p>The <a href="http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml">Google C++ Style Guide</a> suggests that it is in fact more than a style guide:</p>
<blockquote><p>The term Style is a bit of a misnomer, since these conventions cover far more than just source file formatting.</p></blockquote>
<p>The document goes on to say that, apart from &#8220;enforcing consistency&#8221;, it also &#8220;constrains, or even bans, use of certain features&#8221; - &#8220;to avoid the various common errors and problems that these features can cause&#8221;.</p>
<p>&#8220;Enforcing consistency&#8221; does sound similar to requiring limestone - there&#8217;s no direct functional impact. But &#8220;banning features to avoid problems&#8221; sounds more like banning asbestos - very much because of its functional impact, which <a href="http://en.wikipedia.org/wiki/Asbestos#Health_problems">can include cancer</a>.</p>
<p>However, language features are different from building materials. Asbestos was discovered, not designed, and they couldn&#8217;t know it&#8217;d cause cancer. C++ RTTI was designed and approved as a standard by strong programmers, who had in mind some cases where they thought it&#8217;d be useful.</p>
<p>RTTI is banned by the Google Style Guide, not the way asbestos is banned by regulations, but the way some sculptors prohibit their students from using fingers when they shape the fine details of clay. Learn to use proper sculpting tools - then <em>do</em> use fingers if necessary:</p>
<blockquote><p><span>A query of type during run-time typically means a design problem. </span>&#8230;you may use RTTI. But think twice about it. :-) Then think twice again.</p></blockquote>
<p>Think four times and you&#8217;ll be allowed to use RTTI. Think 1024 times and you&#8217;re still not allowed to use asbestos in a housing project. That&#8217;s because construction standards include functional considerations, but coding standards ultimately discuss style and style alone.</p>
<p>That&#8217;s why the strictest coding standards allow exceptions. And that&#8217;s why every banned feature is sometimes better than the proposed alternative.</p>
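<p>For readers who haven&#8217;t met the feature in question, here is a minimal sketch of a &#8220;query of type during run-time&#8221; next to a virtual function, the kind of alternative such guides typically suggest. The Shape classes and function names are hypothetical, invented for this example.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical class hierarchy, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
struct Square : Shape {
    std::string name() const override { return "square"; }
};

// The discouraged style: ask the object what it is, using RTTI.
std::string describe_with_rtti(const Shape* s) {
    assert(s != nullptr);
    if (dynamic_cast<const Circle*>(s)) return "circle";
    if (dynamic_cast<const Square*>(s)) return "square";
    return "unknown";
}

// The usually preferred style: let a virtual function decide.
std::string describe_with_virtual(const Shape* s) {
    assert(s != nullptr);
    return s->name();
}
```

<p>Both functions give the same answer for the classes above. The difference is where the knowledge about subclasses lives, and that is a matter of design taste rather than safety.</p>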
<p><strong>Readability through inconsistency</strong></p>
<p>Style guides enforce consistency. In the real world, we&#8217;ve seen that consistent style matters psychologically. In programming, people also advocate consistency for psychological reasons:</p>
<blockquote><p><span>It is very important that any programmer be able to look at another&#8217;s code and quickly understand it. </span><span>Creating common, required idioms and patterns makes code much easier to understand.</span></p></blockquote>
<p>Psychological reasons are important - but there are symmetric psychological arguments <em>for inconsistency</em>.</p>
<p>For example, required idioms can in fact make code easier to understand - or harder. Let&#8217;s look at idioms actually required in some style guides:</p>
<ul>
<li>The use of C++ &#8220;algorithms&#8221; such as std::for_each and std::transform instead of decades-old &#8220;patterns&#8221; called loops. I expect the idea to become widespread again, together with C++11 lambdas. Here&#8217;s <a href="http://www.theregister.co.uk/2006/08/08/cplusplus_loops/print.html">TheRegister</a>&#8217;s take on the impact on readability.</li>
<li>Yoda conditions: if(5 == num). <a href="http://united-coders.com/christian-harms/what-are-yoda-conditions">This page</a> - first Google hit for &#8220;Yoda conditions&#8221; at the time of writing - lists only benefits and no drawbacks, and proposes to add this to your style guide. Will code become more readable though? They&#8217;re <em>Yoda conditions!</em> &#8220;If num is five&#8221; is how you always say it in English (and Hebrew, and Russian). If five is num, read as natural your code will not.</li>
</ul>
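<p>To put the two idioms side by side, here is a small sketch that counts fives in a vector twice: once with a plain loop and a conventional condition, once with std::for_each, a C++11 lambda and a Yoda condition. The function names are made up for the example.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain loop, conventional condition: "if the element is five".
int count_fives_loop(const std::vector<int>& v) {
    int count = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] == 5) ++count;
    }
    return count;
}

// std::for_each with a lambda and a Yoda condition: "if five is num".
int count_fives_algorithm(const std::vector<int>& v) {
    int count = 0;
    std::for_each(v.begin(), v.end(), [&count](int num) {
        if (5 == num) ++count;
    });
    return count;
}

// Both spellings compute the same thing; only the reading experience differs.
void check_idioms_agree(const std::vector<int>& v) {
    assert(count_fives_loop(v) == count_fives_algorithm(v));
}
```

<p>Which of the two reads more fluently is exactly the kind of question a style guide answers by decree.</p>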
<p>Of course my opinion on the readability of these patterns is debatable - which is precisely my point. Once a style guide is chosen, some people will experience the joy of fluent reading every time they hit if(5==num). Others will experience the pain of a mental roadblock - also every time.</p>
<p>A style guide will have something to dislike for everyone. When tastes are sufficiently different, <strong>the average number of cringes per person stays the same under consistent style</strong> - and <strong>the variance rises</strong> (someone will hate a particularly popular mandatory pattern).</p>
<p>It&#8217;s like keeping wealth constant and increasing inequality - something not even a political party would advocate. How is this psychologically a win?</p>
<p>But let&#8217;s go ahead and assume that the &#8220;required idioms&#8221; suit everyone&#8217;s taste, and, by themselves, actually make code easier to understand. Still:</p>
<ul>
<li><strong>External libraries will not follow your style guide</strong>. They follow style guides of their own. And this inconsistency can <em>improve</em> readability. Code using the library stands out, and the library&#8217;s style can match the accepted domain-specific conventions better than your local style. In computer vision, X is the real world coordinate, x is the pixel coordinate - contrary to many software style guides.</li>
<li><strong>You can&#8217;t count on stylistic conventions</strong>, because there are exceptions. Google&#8217;s code orders parameters such that <a href="http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Function_Parameter_Ordering">inputs come first</a>, but memcpy &amp; snprintf don&#8217;t. You either have to look out for exceptions or risk misunderstandings by blindly assuming consistency.</li>
<li><strong>Different people think differently, even if their code looks the same.</strong> I find it easier to understand programmers&#8217; intents through their unique style. When they&#8217;re all forced to write superficially similarly, I can&#8217;t tell who wrote what, and what the subtext of the code is.</li>
</ul>
<p>I&#8217;ll illustrate the last point with a couple of examples. I knew O.M. before I ever saw him and before I even knew his name. To me, he was the programmer with the two spaces before the trailing const:</p>
<pre><strong>inline</strong> <strong>int </strong>x()  <strong>const</strong>;</pre>
<p>I knew him through his code: mathematically elegant, obsessive about fine details of type-based binding and modeling. I could guess what he left out with an intent to maybe add it later. I understood him.</p>
<p>Likewise, I can always spot G.D.&#8217;s code by the right-leaning asterisk:</p>
<pre><strong>int </strong>*p,*q=arr+i;</pre>
<p>G.D. certainly couldn&#8217;t care less about types - similarly to most people with this asterisk alignment. I know his code: terse, efficient, to the point. I know what to expect.</p>
<p>Who wrote this code?</p>
<pre>camelCaseName = longerCamelCaseName-camelCaseName;</pre>
<p>I dunno, the collective unconscious wrote it. Anyone could write it - or several people patching after each other. I fail to identify with the author and guess his intent - it could be anyone. The code has no smell or taste to me.</p>
<p>This can sound a bit crazy - &#8220;does he actually advocate that everyone develops <em>as uncommon a style as possible</em> as a way to mark his trail?&#8221; Of course I don&#8217;t mean that.</p>
<p>I think what I&#8217;m saying can make more sense if you think of style and taste in code as analogous to the taste of food. Of course it&#8217;s ridiculous to expect every restaurant to make food with a unique taste. Many people like pizza, many people know how to make pizza - so expect much similarly-tasting pizza around.</p>
<p>But we wouldn&#8217;t like to always eat food cooked to the same spec by restaurants in a single franchise. If someone knows how to make food with a unique taste, we welcome it.</p>
<p>And the unique taste of your food doesn&#8217;t indicate that it&#8217;s bad for your health. Moreover, food with a familiar taste can be very unhealthy - much of it is. Code in a familiar style has a comforting look - sometimes misleadingly.</p>
<p>I don&#8217;t think requiring the same taste everywhere is how you improve the health of a code base.</p>
<p><strong>Popular demand</strong></p>
<p>I&#8217;m naturally inclined to argue against coding standards, because it feels like bureaucracy unthinkingly imposed on the work process from above. However, what if it&#8217;s not imposed from above - what if the programmers themselves want it?</p>
<p>Many certainly do. Many good ones do.</p>
<p>Incidentally, while I was writing this, I stumbled upon <a href="http://www.codelord.net/2011/12/10/your-brain-cares-about-code-style/">an article</a> arguing for consistency - for the very reasons I use to argue against it:</p>
<blockquote><p><span>If code isn’t written in a consistent style in your team, whenever you come across code with the spacing a bit wrong, the first thing your head’s going to process is &#8220;</span><strong>I didn’t write this.</strong><span>&#8221;</span></p></blockquote>
<p><span> &#8230;Exactly why I like personal style! <em>I</em> <em>really didn&#8217;t write it</em> - it&#8217;s <em>his</em> code, I want to understand <em>him, </em>and his style helps greatly.</span></p>
<blockquote><p><span>This is a natural feeling, and as we all know coders have a hard [time] to restrain impulse to rewrite any piece of code they didn’t write.</span></p></blockquote>
<p>I understand that impulse very well. Personally, I hate new food more than anyone I know - I eat the same thing every day, for months, for years. I do prefer Jerusalem to Tel Aviv. And I like consistent style - especially my own.</p>
<p>But why should others be force-fed my favorite cabbage salad? Is consistency that much prettier than freedom?</p>
<p><strong>Team spirit</strong></p>
<p>I don&#8217;t argue with people who favor a consistent style - and I can&#8217;t. The article above is nostalgic about a team that followed a style guide &#8220;<span>religiously&#8221;. How can nostalgia be refuted? Clearly, consistent style can create a unique team spirit that to some is valuable and memorable.</span></p>
<p>All I can and do argue is that <em>a lack </em>of a style guide can <em>also </em>create a good atmosphere that is preferred by some other people.</p>
<p>In fact I believe it is exactly team spirit that a style guide - or lack thereof - actually affects. All other effects come indirectly through the impact on atmosphere. Here&#8217;s my attempt at taking the social effect into account:</p>
<ul>
<li><strong>I won&#8217;t break established conventions</strong>. Following conventions is a great way to say, &#8220;I respect the local tradition and the wisdom it embodies.&#8221; And I&#8217;ll thoroughly enjoy the limestone covering our code - I <em>like</em> consistency. Yay, a beautiful code base! And if the convention is <em>really</em> bad - I hopefully won&#8217;t need to join the team in the first place.</li>
<li><strong>However, I won&#8217;t establish and enforce conventions</strong> when I&#8217;m the one starting with a clean slate. Having no conventions is a great way to say, &#8220;join my project to express yourself without artificial constraints&#8221;.</li>
</ul>
<p><strong>From team spirit to grassroots bureaucracy</strong></p>
<p>Did you know Ken Thompson is not allowed to check in code at Google? He said so in his <a href="http://www.codersatwork.com/">Coders at Work</a> interview:</p>
<blockquote><p><strong>Seibel:</strong> I know Google has a policy where every new employee has to get checked out on languages before they&#8217;re allowed to check code in. Which means you had to get checked out on C.</p>
<p><strong>Thompson:</strong> Yeah, I haven&#8217;t been.</p>
<p><strong>Seibel:</strong> You haven&#8217;t been! You&#8217;re not allowed to check in code?</p>
<p><strong>Thompson:</strong> I&#8217;m not allowed to check in code, no.</p></blockquote>
<p>The programmer who co-created Unix is not allowed to check in code. If this isn&#8217;t a bureaucracy, what is? But it&#8217;s inevitable - with rules, you always paint yourself into a corner that way. Allow some to break the conventions, and you will have offended everyone else. Use the same rules for everyone - and some won&#8217;t contribute.</p>
<p>Of course, Thompson does contribute to Google. Well, Google has plenty of ways to motivate him to do that. And, apparently, they have good programmers willing to write production code based on his uncommitted prototype code.</p>
<p>But we&#8217;re not all like Google that way. We can&#8217;t all afford the inevitable laughable outcomes of bureaucracies. If you come across an original programmer with an off-beat style, do you want him to join your project or to move on?</p>
<p>Grassroots bureaucracy is still bureaucracy. I wouldn&#8217;t object to an established bureaucracy that people claim to value. But I wouldn&#8217;t establish a new one, either.</p>
<p><strong>Conclusion</strong></p>
<p>A style guide can make code look prettier to some - and uglier to others - but not tangibly better, except if programmers <em>enjoy</em> it so much as to produce better code.</p>
<p>This is equally true for the lack of a style guide. The results of freedom look prettier to some and uglier to others.</p>
<p>Personally, I believe that one rule too few is better than one rule too many, so I don&#8217;t bother to enforce a common style.</p>
<p><strong>P.S.: when I become a neat-freak</strong></p>
<p>Generally, I tend to enforce <a href="http://www.yosefk.com/blog/the-iron-fist-coding-standard.html">little to no conventions</a>. Here are the exceptional situations where I actually enter the ridiculous position of telling people how to write their code.</p>
<p><strong><em>Interfaces </em><em>should</em><em> be consistent</em></strong></p>
<p>I don&#8217;t care if an interface looksLikeThis or like_this. But if it uses <em>both</em>, then I&#8217;ll ask the author to change it to one of the styles - the one which came first. For users to feel confident, an interface should look well thought-out - which implies an illusion of a single author, which implies a consistent style.</p>
<p>By &#8220;interface&#8221;, I only mean the outermost stuff called by module users. Internal functions, classes, etc. can look like Tel Aviv as far as I&#8217;m concerned. For instance, in a simple server, the &#8220;interface&#8221; to me is just the protocol, and nothing in the code itself.</p>
<p><strong><em>Warnings should be errors</em></strong></p>
<p>I hate the concept of compiler warnings - I link it to the concept of guilt. &#8220;Fine, be that way, do this evil implicit conversion thing - but if something happens, I will have told you so.&#8221; Why should we put up with such manipulative behavior? Pick a position - refuse to compile, or compile silently.</p>
<p>However, in practice, compatibility issues turn what should be errors into warnings. If it compiled 30 years ago, it has to keep compiling, even if nobody wants it to compile in new code. Even if compilers could always prove the code wrong, but initially didn&#8217;t bother, and it happens to work in old programs.</p>
<p>So, to the dismay of freedom-lovers, I turn warnings into errors where I can (as in -Werror) - even if I can&#8217;t cherry-pick the &#8220;right&#8221; warnings. Mainly since when a file generates 10 (false) warnings, the eleventh (truly useful) warning goes unnoticed.</p>
<p>Another reason is that warnings cause guilt - they are evil, and must be destroyed. If I didn&#8217;t destroy them by turning them into errors, I&#8217;d have to destroy them by disabling them. Then I&#8217;d never get that eleventh useful warning.</p>
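<p>To make the mechanics concrete, here&#8217;s a small sketch of the kind of implicit conversion in question. The function name is hypothetical, the flags are the GCC/Clang ones, and which warnings a given compiler version enables by default may differ.</p>
<pre>// Hypothetical example: an implicit narrowing conversion.
int truncate_to_int(double value) {
    // Implicit double -&gt; int conversion: legal C++, the fractional part
    // is silently discarded. GCC and Clang can flag it under -Wconversion.
    int result = value;
    return result;
}</pre>
<p>Built with something like <code>g++ -Wconversion -c example.cpp</code>, this should compile with a warning; adding <code>-Werror</code> should make the same build fail, which is the &#8220;pick a position&#8221; behavior I&#8217;m after.</p>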
<p>But except for these two pet peeves - interfaces and warnings - I do think grown-up programmers should be left alone.</p>
<p><strong>P.P.S.: Greetings from the <a href="http://www.smbc-comics.com/index.php?db=comics&amp;id=1846">Overextended Metaphor Parrot</a></strong></p>
<p>Originally, I had a few references to Bauhaus architecture in the text - how Tel Aviv has buildings in the Bauhaus style and Jerusalem doesn&#8217;t because of its limestone requirement, and how Ken Thompson&#8217;s inability to commit code at Google is analogous to that. However, as a commenter pointed out, there <em>are</em> buildings in the Bauhaus style in Jerusalem - in my own neighborhood, actually, so I obviously walked past them plenty of times.</p>
<p>I guess this goes to show that good architects have no trouble complying with a style guide - and that I shouldn&#8217;t overextend metaphors in areas where I&#8217;m not minimally competent.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/coding-standards-is-consistency-prettier-than-freedom.html/feed</wfw:commentRss>
		</item>
		<item>
		<title>Graham &#038; Coase: when big companies are a good idea</title>
		<link>http://www.yosefk.com/blog/graham-coase-when-big-companies-are-a-good-idea.html</link>
		<comments>http://www.yosefk.com/blog/graham-coase-when-big-companies-are-a-good-idea.html#comments</comments>
		<pubDate>Sat, 26 Nov 2011 19:11:26 +0000</pubDate>
		<dc:creator>Yossi Kreinin</dc:creator>
		
		<category><![CDATA[wetware]]></category>

		<guid isPermaLink="false">http://www.yosefk.com/blog/?p=142</guid>
		<description><![CDATA[Many people can only tightly cooperate under rules implying trust.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.paulgraham.com/">Paul Graham</a> was once asked the following <a href="http://www.paulgraham.com/raq.html">RAQ</a> (rarely asked question):</p>
<blockquote><p><strong>How can I avoid turning into a pointy-haired boss?</strong></p></blockquote>
<p>His answer:</p>
<blockquote><p><span>The pointy-haired boss is a manager who doesn&#8217;t program. So the surest way to avoid becoming him is to stay a programmer. What tempts programmers to become managers are companies </span><span>where the only way to advance is to go into management. </span><span>So avoid such companies and work for (or start) startups.</span></p>
<p><span>Why be a manager when you could be a founder or early employee at a startup?</span></p></blockquote>
<p><em>Why?! </em>Oh wow. I could fill a book explaining why. But many of my reasons are my own, and aren&#8217;t relevant to you unless you&#8217;re much like me. So I&#8217;ll focus on the general answer to the question implied by Paul Graham: <strong>Why do large firms exist?</strong></p>
<p>The question was addressed by the economist <a href="http://en.wikipedia.org/wiki/Ronald_Coase">Ronald Coase</a> in his article &#8220;<a href="http://en.wikipedia.org/wiki/The_Nature_of_the_Firm">The Nature of the Firm</a>&#8221;. This article, together with his work on externalities (the <a href="http://en.wikipedia.org/wiki/Coase_Theorem">Coase Theorem</a>), <a href="http://en.wikipedia.org/wiki/Coase_Theorem">earned</a> him a Nobel Prize in economics. This is one piece of evidence that the question is interesting and far from trivial.</p>
<p>Suppose there are no good answers to Paul Graham&#8217;s rhetorical question. That is, it&#8217;s always <em>objectively</em> better to start or join a small firm than to be a manager in a large one. You&#8217;ll always get more work done, or will be more satisfied, or both. Well, if so, competition should eventually drive large firms out of business. So why are they still around?</p>
<p>For starters, clearly there are problems best solved by small groups of people armed with off-the-shelf tools. For instance, two iconic <a href="http://ycombinator.com/">YC</a> startups funded by Paul Graham, <a href="http://www.reddit.com/">Reddit</a> and <a href="https://www.dropbox.com/">Dropbox</a>, each solve a problem with the help of a few programmers and a bunch of commodity servers running a commodity software stack. A larger company could hardly improve on what they do.</p>
<p><strong>Note that off-the-shelf products are key to being small</strong> (or at least starting small). Reddit or Dropbox could never <em>build </em>those servers from scratch. A small group of people cannot erect a $5G chip fabrication facility. Building and operating a fab - or a search engine - requires lots of custom development, so you need a lot of people.</p>
<p>Or do you?</p>
<p>Of course the <em>total</em> number of people involved has to be very large. But it doesn&#8217;t follow that they should be organized as big companies. Instead, the work could be done by many small organizations, each contracting out most of the work to others.</p>
<p>You&#8217;re big because you <em>hire</em>. Why hire if you can <em>buy</em>, contract out - and stay small?</p>
<p>Indeed, this seems to make perfect sense. To quote <a href="http://en.wikipedia.org/wiki/The_Nature_of_the_Firm">Wikipedia&#8217;s summary</a> of The Nature of the Firm (1937):</p>
<blockquote><p><span>The traditional economic theory of the time suggested that, because the market is &#8220;efficient&#8221; (that is, those who are best at providing each good or service most cheaply are already doing so), it should always be cheaper to contract out than to hire.</span></p></blockquote>
<p>Then why do most people prefer employment to self-employment, as evidenced by their actions (and an economist never trusts anything but actions as a tool to <a title="Search for revealed preference" href="http://www.daviddfriedman.com/Academic/Price_Theory/PThy_Chapter_2/PThy_CHAP_2.html">reveal someone&#8217;s preferences</a>)? Why do I <em>hate</em> the idea of running a small firm?</p>
<p>Either the &#8220;traditional economic theory&#8221; is right - one should run a small firm, and I&#8217;m a freak of nature destined to extinction due to economic evolutionary pressure, together with much of the population - or the theory is lacking, and there should be a concept formalizing my aversion to self-employment.</p>
<p>And in fact, at this point, Coase introduces the term - <strong>transaction costs</strong>:</p>
<blockquote><p><span>Coase noted, however, that there are a number of </span><a href="http://en.wikipedia.org/wiki/Transaction_cost">transaction costs</a><span> to using the market; the cost of obtaining a good or service via the market is actually more than just the price of the good.</span></p></blockquote>
<p>Oh, yeah - MUCH more if you ask me.</p>
<blockquote><p><span>Other costs, including search and information costs, bargaining costs, keeping </span>trade secrets<span>, and policing and enforcement costs, can all potentially add to the cost of procuring something via the market.</span></p></blockquote>
<p>YES! Here&#8217;s a Nobel Prize-winning economist from the notoriously &#8220;pro-free market&#8221; <a href="http://en.wikipedia.org/wiki/Chicago_school_of_economics">Chicago school</a> that UNDERSTANDS ME. He knows why I hate markets. (&#8220;Pro-market&#8221; doesn&#8217;t mean you love markets, just that you think governments are even worse.)</p>
<blockquote><p><span>This suggests that firms will arise when they can arrange to produce what they need internally and somehow avoid these costs.</span></p></blockquote>
<p>Avoiding these costs can enable work that just can&#8217;t happen outside the context of a big company.</p>
<p>For instance, I work on chips for embedded computer vision, at a company that&#8217;s now fairly large. This is an example where <strong>a lot of people need to cooperate in a custom development effort</strong> (as opposed to fewer people using off-the-shelf products).</p>
<p>In theory, I could start a computer vision hardware startup instead of it being an internal project. In practice, it wouldn&#8217;t work, because:</p>
<ul>
<li><strong>I wouldn&#8217;t know what to build</strong>. Hardware accelerates algorithms - what algorithms? I only know because I&#8217;m in the same company with developers of very effective unpublished algorithms. Without that knowledge, what could I build - an <a href="http://en.wikipedia.org/wiki/OpenCV">OpenCV</a> accelerator? Good luck selling that.</li>
<li><strong>I couldn&#8217;t build it nearly as efficiently.</strong> A great source of efficiency is fitting hardware to the specific workload. But if we were not a part of the company but a vendor, the company would make sure there are competing vendors to keep prices low. This means that <em>we</em>, no longer having a guaranteed customer, would have to support <em>as many different workloads as possible</em>, to increase the pool of potential customers. As a rule, more generic hardware is less efficient.</li>
<li><strong>I couldn&#8217;t explain how to program it.</strong> Once you give away your programming model to the customer - as you have to if you want them to, well, program your processors - only very strong patents can prevent them from cloning your hardware (possibly with the help of your competitor). A big company that, among other things, designs its own hardware doesn&#8217;t have to explain it to the outside world. And even if its hardware ends up cloned - it&#8217;s just one part of the secret knowledge behind the product. But if you&#8217;re a small company only making hardware and it&#8217;s cloned, you&#8217;re busted. You shouldn&#8217;t even start before making sure your ideas are &#8220;sufficiently patentable&#8221; - which you <em>don&#8217;t know</em> before you&#8217;ve developed those ideas.</li>
</ul>
<p>Of course, the number one real reason <em>I</em> couldn&#8217;t run a hardware startup is that I&#8217;m no businessman. But the problems above are also very real, and frequently insurmountable for people who <em>can </em>do business. Not all custom development is impossible to successfully outsource, but much is. The problems result from economic fundamentals.</p>
<p>In econ-speak, such problems are collectively known as &#8220;search and information costs, bargaining costs, keeping trade secrets, and policing and enforcement costs&#8221;. Indeed, all these problems were featured in my example. In plain English, a simple way to sum up all those problems is <strong>trust</strong> - or more precisely, the lack thereof:</p>
<ul>
<li>A company can&#8217;t <strong>trust </strong>a vendor, so a vendor can&#8217;t know its algorithms.</li>
<li>A company can&#8217;t <strong>trust </strong>a vendor to keep quality high and prices low if it guarantees to remain its customer&#8230;</li>
<li>&#8230;So a vendor can&#8217;t <strong>trust </strong>a company to remain its customer, so it can&#8217;t invest too much in a solution tailored just to that company&#8217;s specific needs.</li>
<li>A vendor can&#8217;t <strong>trust</strong> a company to keep buying from it if enough knowledge is given away so that the product can be cloned instead - so some products are not worth building.</li>
</ul>
<p>When you work for a big company, you deal with <em>coworkers</em>, and you&#8217;re all playing for the same team. The smaller the company, the more you deal with customers and vendors, which means playing <em>against</em> them. There&#8217;s no such word as &#8220;co-customer&#8221; or &#8220;co-vendor&#8221; for a good reason.</p>
<p>At least that&#8217;s how things are framed by the rules. The rules say that all employees are agents acting towards a common goal, &#8220;to promote the company&#8217;s interests&#8221; - whereas different companies have different bottom lines and different interests.</p>
<p>Of course, reality is never like the rules - in reality, everyone in the company plays by their own rules, attempting to promote the interest of any of the following - or a combination:</p>
<ul>
<li>Shareholders</li>
<li>Customers</li>
<li>Employees</li>
<li>His team</li>
<li>His manager</li>
<li>His friends</li>
<li>Himself</li>
</ul>
<p>So in reality, of course there&#8217;s a lot of chaos in a big company. And it doesn&#8217;t help that the bigger it is, the harder it is to make sense of what&#8217;s going on:</p>
<blockquote><p><span>&#8230;There is a natural limit to what can be produced internally, however. Coase notices &#8220;decreasing returns to the entrepreneur function&#8221;, including increasing overhead costs and increasing propensity for an overwhelmed manager to make mistakes in resource allocation. This is a countervailing cost to the use of the firm.</span></p></blockquote>
<p>&#8230;Which explains why we aren&#8217;t all employed by a single all-encompassing huge company.</p>
<p>But at least the rules of a large company <em>frame</em> things right - as <em>cooperation </em>more than <em>competition</em>. (Competition generally isn&#8217;t an end - it&#8217;s a means to ultimately force people to cooperate, and, as Coase points out, it only gets you so far.)</p>
<p>Of course, corporate rules also create competition - employees compete for raises, etc. But in practice, overall most would agree that it&#8217;s much safer to trust co-workers than customers or vendors.</p>
<p>Why be a manager when you could be a founder or early employee at a startup? Here&#8217;s the part of my answer that is based on economic fundamentals.</p>
<p>I specialize in areas requiring custom development by many people. Many people can only tightly cooperate under rules implying trust. Therefore they must not be customers and vendors, but coworkers, which leads to large firms. Such is The Nature of the Firm.</p>
<p>Of course there are problems that can be solved by a small group of people with mutual trust, without tightly-coupled, joint development with others - for example, the problems solved by Reddit and Dropbox. One reason I personally never looked that way is my aversion to business. Such is my own nature.</p>
<p>It just so happens that the nature of the firm suits my nature nicely - because there are situations where big companies are a good idea. When you can&#8217;t buy and have to build, trust is fundamental to getting the job done.</p>
<p><strong>UPDATE </strong>(December 9, 2011): just found <a href="http://www.johndcook.com/blog/2010/06/30/where-the-unix-philosophy-breaks-down/">an interesting analogy</a> between company size and program size. Doing many things in one big program can be easier than using many small programs because of &#8220;transaction costs&#8221; - the cost of exchanging data between the programs.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.yosefk.com/blog/graham-coase-when-big-companies-are-a-good-idea.html/feed</wfw:commentRss>
		</item>
	</channel>
</rss>

