Archive for September, 2008

Homogeneous? Heterogeneous? Generic? Specialized? What the heck ?!

Tuesday, September 30th, 2008

One of the key debates around multi/many-core processors is about homogeneity versus heterogeneity. Judging by a number of panel debates I have listened to, the balance seems to be tilting towards heterogeneous multi-cores. Too bad, because I believe we are missing an important point here: economics.

But first, let’s get some concepts straight. I regard a chip as homogeneous if all its programmable cores have the same ISA. Those cores may or may not run at the same frequency; in my terminology such a chip still counts as homogeneous. If there are two or more different ISAs present inside the chip, I would call that chip heterogeneous. I readily admit this is an arbitrary definition, but it stems from previous experience in the area and so far it has served me well.
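The definition above can be written down as a tiny check. This is only a sketch: the `Core` record, the `is_homogeneous` helper and the ISA names are made up for illustration and do not describe any real chip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    """One programmable core (hypothetical record, for illustration only)."""
    isa: str            # instruction set architecture the core implements
    frequency_mhz: int  # clock frequency; deliberately ignored by the check


def is_homogeneous(cores):
    """True if every programmable core on the chip has the same ISA."""
    return len({core.isa for core in cores}) == 1


# Same ISA at different frequencies: still homogeneous under this definition.
same_isa = [Core("isa-a", 2000), Core("isa-a", 1200)]
# Two different ISAs on one chip: heterogeneous.
mixed_isa = [Core("isa-a", 2000), Core("isa-b", 600)]
```

Note that frequency plays no part in the check, which is exactly the point of the definition.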

A related issue is that of genericity versus specialization. A chip may be homogeneous while still being specialized (e.g., all cores are DSP cores). Generic, though, is hard to define in any terms other than what it’s not; perhaps the only sensible ‘definition’ is that a generic chip is capable of running programs written in HW-independent languages with ‘acceptable’ performance (please, please, don’t ask me to define ‘acceptable’…).

So back to the basic thesis of this post: economics. Assuming that designing a chip costs roughly the same irrespective of its structure (at least the differences are not orders of magnitude), what defines the viability of a HW design is its potential target market. The wider the market it can address while providing good enough performance, the more viable it is. Hence, I would claim that the future will be dominated by homogeneous, partly specialized or generic processors.

Why?

Here are a few additional arguments.

Portability of the code. If you design for a homogeneous, generic architecture, you can more easily port your code to new HW architectures.

Scalability of the code. Parallel applications scale best if they are designed for homogeneous HW architectures. Heterogeneity implies specialization and division along functional lines, which will result in scalability barriers sooner or later.

The shift to ‘SW-defined HW’. There are strong signals that more and more functions will in the future be implemented in SW instead of HW (I will return to this subject in a future post).

Finally, a disclaimer: I’m not claiming the death of specialized processors – there will always be specialized segments where cost will not be an issue; the point is that more and more of the computing-based products will shift to more general purpose, more SW-oriented approaches, where scalability and portability will be the key design considerations.

“One step forward, then it’s back to go…”

Tuesday, September 30th, 2008

… sings Dire Straits (in the song ‘The Bug’), and Üni puts it into practice. What happened is that the young lady is trying to get up on all fours, but for now this ends with her inching backwards. She can get incredibly angry about it and signals her frustration loudly. Yesterday the ladies got a bouquet of flowers from me, and she set off eagerly towards the rustling wonder, but literally after every 2-3 crawling moves she inched back a little 😉

What can I say, I have always maintained that there is deep life experience in the songs of Dire Straits… 🙂

The mökki as an energy drink

Monday, September 29th, 2008

Indeed, we went to the mökki last weekend (mökki translates loosely as weekend house). I don’t know whether it was the fresh air or the company, but the fact is that Üni got as wound up as if she had drunk Red Bull; even yesterday, back home, there was no handling her. She giggled, laughed, crawled around, shrieked, and in her high spirits would not even eat. The result duly followed: before nine she keeled over like a roadside tree and slept like a log for 11 hours, consigning all previous records to history. Today she was perhaps a bit calmer, but we still felt the after-effects during the usual romp after breakfast. Admittedly she had some catching up to do; she slept almost two hours today as well.

When I think that today there will also be a little post-mökki session (finishing off the paprikás-pityóka, the paprika potato stew that never got cooked there) … well, a sip of coffee will surely be a good investment for us, so that we can keep up with the pace…

I will report on the result 😉

Errr… lang

Thursday, September 25th, 2008

Erlang is probably the best known product of Ericsson in the realm of software development. It was developed from the second half of the eighties for internal use and released as open source in 1998. Its name is a word-play: it stands for ERicsson LANGuage, while also paying tribute to the Danish mathematician Agner Krarup Erlang, who contributed to the theory of telephone traffic and to queueing theory (which plays an important role in the language).

I think there was, and still is, a lot of confusion around Erlang. What Erlang really is can be summarized in two bullets:

  • It’s a functional language
  • It provides language level support for massive, thread-based concurrency and fault management

The real problem is that these two things somehow get mixed up, and people believe that these features can only exist together. Well, that’s not true. As I told my distinguished colleague Joe Armstrong, the father of Erlang, they got one thing wrong when they designed the language: mixing up the functional nature with concurrency.

I will not debate here the merits or drawbacks of functional programming – there have been a lot of pro and contra arguments that only prove one thing: this is a religious matter. So let’s leave that aside and instead focus on the concurrency aspect.

The support for concurrent programming in Erlang builds on the following constructs:

  • Lightweight, easy (i.e., fast) to create, isolated threads: no shared memory, no implicit dependency between threads
  • Message-based, fire-and-forget communication between threads
  • Fault management hierarchy: processes (the lightweight threads above) can monitor each other and detect when a certain process has died (through a system-generated message)

These simple yet powerful constructs are what make Erlang so compelling for parallel computing, and in no way its functional nature. Exactly the same constructs can be implemented in C++ (or even better, Java), granted that process mechanisms are embedded into the OS kernel in order to secure independence and memory isolation (something that was also implemented within Ericsson). Everything else (functional programming, code hot-swap, etc.) is of lesser or no importance when it comes to parallel programming.
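To make the point that these three constructs do not depend on a functional language, here is a minimal sketch in Python (chosen for brevity instead of the C++ or Java mentioned above). The `Process` class, the `("DOWN", ...)` message shape and the helper names are all hypothetical, invented for this illustration; this is not how Erlang or the Ericsson implementation works internally. Python threads share one address space, so isolation here holds only by convention: nothing but messages is ever passed between the threads.

```python
import queue
import threading


class Process:
    """Toy 'process': a thread with a private mailbox (hypothetical helper)."""

    def __init__(self, body, monitor=None):
        self.mailbox = queue.Queue()
        self._body = body
        self._monitor = monitor  # process that is told when this one dies
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def send(self, message):
        # Fire and forget: enqueue the message and return immediately.
        self.mailbox.put(message)

    def receive(self, timeout=None):
        # Block until a message arrives in this process's own mailbox.
        return self.mailbox.get(timeout=timeout)

    def _run(self):
        reason = "normal"
        try:
            self._body(self)
        except Exception as exc:
            reason = repr(exc)
        finally:
            # System-generated death notice delivered to the monitor.
            if self._monitor is not None:
                self._monitor.send(("DOWN", self, reason))


def flaky_worker(self):
    # Handles messages one at a time and dies when told to crash.
    while True:
        if self.receive() == "crash":
            raise RuntimeError("boom")


def run_demo():
    # The supervisor is never started; only its mailbox is used here.
    supervisor = Process(lambda self: None)
    worker = Process(flaky_worker, monitor=supervisor).start()
    worker.send("hello")
    worker.send("crash")
    tag, who, reason = supervisor.receive(timeout=5)
    return tag, who is worker, reason
```

Calling `run_demo()` sends two messages to the worker and then waits for the death notice in the supervisor's mailbox. What the sketch cannot give you is real memory isolation or cheap creation of hundreds of thousands of processes; that is where runtime or kernel support comes in.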

So, all in all – let’s focus on the real value of Erlang when it comes to parallel programming – and do not mix apples with pears.

Szia, szia! (Hi, hi!)

Wednesday, September 24th, 2008

For a while now we have been trying to get Ünige to wave ‘szia’ (hi), and I am happy to announce that it worked 🙂 Two days ago she waved back for the first time; back then we still thought it was a coincidence. This morning, however, she was already waving cheerfully on waking up, and you could see she was mighty proud of her achievement.

The experts claim that this kind of imitation is one of the things that help a child’s self-image develop (we took the first step with sticking out tongues; she showed it off, whether called for or not, to anyone, anywhere 🙂). That may be so, but the fact is that in the past few days it has also been proven (scientifically, by experiment 😉, how else with such nerdy parents …) that Üni already recognizes the real world in a mirror image.
What did the experiment consist of?
We sat her down in front of the mirror and, behind her back, quietly, off to one side, I waved Panda Anna about (“who” really is a (toy) panda and really is called Anna). After watching for a little while, Üni, instead of heading for the mirror, neatly turned around, in the right direction, and grabbed Her Panda-ness by the neck.

Hm, it took eight months to get this far. To be continued …

On shared memory and alcohol

Monday, September 22nd, 2008

Last week I had the honour to chair the panel debate on the computing environment on many-core platforms (many-core being ‘more than 64 cores’), held as part of the Swedish Multi-core Days, sponsored by my employer. The list of panelists was impressive: Anant Agarwal, Kunle Olukotun, David Padua, Erik Hagersten, Per Stenstrom.

At one point Anant made a nice analogy between alcohol and shared memory: we know both are damaging, but still, both are tempting and people use them. Kunle immediately reacted: ‘well, the problem is the hangover, that is, when you realise the mess you got into’. In the end I suggested a solution to the hangover: let’s stay drunk and continue using shared memory.

Joking aside (though the joke was well appreciated by the audience of 200+), I think we touched upon the Achilles heel of parallel programming on chip multi-processors (at least one of them): shared state. If you really want your software to scale with the number of cores (or HW threads) it may use, you simply cannot afford any central point in it. Put simply, anything shared among threads will define your next bottleneck: that’s why memory bandwidth is a bottleneck, and that’s why shared state will be a bottleneck. With all due respect, transactional memory is a neat short-term fix to get from 1-2 cores to 4-8, perhaps 16; beyond that, there’s no way forward.
So what do we have left? Well, message passing, at least at the lowest level. It’s the only mechanism that is fully distributed and scales. What you need to get it working is a fast interconnect on the chip. As I said, this is what shall be there at the lowest level – it has been shown that other constructs can be built efficiently on top of message passing semantics, including shared memory (if there’s a real need…)
By the way, by message passing I did not mean Erlang 🙂, but that’s the subject of another post.
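The difference between a central shared point and message passing can be shown with a toy example. This is a sketch only: the function names are made up, and it illustrates the structure of the two approaches, not their speed (Python threads will not show the scaling effect on real cores).

```python
import queue
import threading


def sum_with_shared_state(chunks):
    """Every worker updates one shared counter guarded by one lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for value in chunk:
            with lock:  # the central point every thread has to pass through
                total += value

    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_with_messages(chunks):
    """Each worker keeps private state and sends a single result message."""
    results = queue.Queue()

    def work(chunk):
        local = 0  # private to this worker, no locking needed
        for value in chunk:
            local += value
        results.put(local)  # one message per worker, not one per update

    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results.get() for _ in chunks)


chunks = [list(range(start, start + 100)) for start in range(0, 400, 100)]
```

Both functions return the same total for `chunks`. In the first one every single addition goes through the same lock; in the second the workers touch something common only once, when they hand over their result. The queue is of course still a shared object in this sketch, but it is used once per worker rather than once per update.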

Hello world ;-)

Monday, September 15th, 2008

So, here we go – the first post on my blog.

I’ll try to keep you posted on what’s going on with us as well as on the general environment of programming.