|
Vista is no more |
| January 10th, 2009 under Computers, Digital Rights, OSS, rengolin, Software, Unix/Linux. [ Comments: 2 ]
|
|
It still hasn’t gone to meet its maker, but it was also not as bad as it could’ve been.
Windows Vista was launched with more PR and DRM than any other release, and Microsoft hoped it would continue its domination of the market. Maybe afraid of the steep rise of Linux on desktops (Ubuntu played a great role in that) and of other market pressures, they rushed Vista out with so many bugs and security flaws, so slow and with such a big memory and CPU footprint, that not many companies really wanted to change their whole infrastructure only to see it drown a little later.
The Chinese government ditched it for XP because it was not stable enough to run the Olympics, only to find out that the alternative didn’t help at all.
All that crap helped Linux (especially Ubuntu) a lot in jumping into the desktop world: big companies are shipping Linux on lots of desktops and laptops, all netbooks have Linux as the primary option, and lay people are now using Linux as they would use any other desktop OS. So, is it just because Vista is so bad? No. Not at all. Linux got really user-friendly over the last five to ten years and it’s now as easy as any other.
Vista is so bad that Microsoft had to keep supporting Windows XP. They’re rushing again with Windows 7 and probably (hopefully) they’ll make the same mistakes again. It got so bad that the Free Software Foundation’s BadVista campaign is officially closing down for good. For good as in: Victory!
Yes, victory, because in one year they managed to show the world how bad Vista really is and how good the alternatives are. Of course, they were talking about Linux and all the free software around it, including the new gNewSense platform they’re building, but the victory is greater than that. The biggest message is that Windows is not the only solution for desktops and, most of the time, it’s the worst.
In conjunction with the DefectiveByDesign guys, they also showed how Vista (together with Sony, Apple, Warner et al.) can completely destroy your freedom, privacy and entertainment. They were so successful in their quest that they’re closing the doors to spend time (and donors’ money) on more important (and pressing) issues.
Now, they’re closing down, but that doesn’t mean the problem is over. The idea is to stabilise the market. Converting all Windows and Mac users to Linux wouldn’t be right; after all, each person is different. The big challenge is to let users who need (or want) a Mac use a Mac. Those who need Windows, and can afford to pay for all the extra software to protect their computer (but not their privacy), can use it. For developers the real environment is Unix, and they should be able to get a good desktop and good development tools as well. It’s, at least, fair.
But for the majority of users, what they really want is a computer to browse the web, print some documents, send emails and for that, any of the three is good enough. All three are easy to install (or come pre-installed), all three have all the software you need and most operations and configurations are easy or automatic. It’s becoming more a choice of style and design than anything else.
Now that Apple has got rid of all the DRM crap, and Spore was such a fiasco that EA is selling games without DRM, the word is getting out. It’s a matter of time before DRM becomes a minor problem, too. Will DefectiveByDesign retire as well? I truly hope so.
As an exercise for the reader, go to the Google home page and search for “windows vista”. You’ll see the BadVista website on the first page. If you search for “DRM” you’ll see the DefectiveByDesign web page as well. This is big: it means that lots and lots of websites are pointing to those sites when they talk about those subjects!
If you care enough, have a Google account and are using the personalised Google search, you could search for those terms and press the up arrow symbol on those sites to push them even higher in the ranking. Can we make both of them the first result? I’ve already done my part.
|
|
Happy birthday to GNU |
| September 3rd, 2008 under OSS, rengolin, Unix/Linux. [ Comments: 2 ]
|
|
25 years and growing strong, happy birthday!
|
|
Unix plumbing |
| August 29th, 2008 under Devel, rengolin, Unix/Linux. [ Comments: 1 ]
|
|
Unix has some fantastic plumbing tools. It’s not easy to appreciate the power of pipes if you don’t use them every day, and Windows users normally think they’re no big deal at all. Let me give you some examples and see what you think…
Tools
With a small set of tools we can do very complex plumbing on Unix. The basic tools are:
- Pipes (represented by the pipe symbol ‘|’) are interprocess communication devices. They’re similar to connectors in real life. They attach the output of a process to the input of another.
- FIFOs are fake files that do pretty much the same thing but have a representation on the file system. In real life they would be the pipes (as they’re somewhat more visible).
- Background execution (represented by the ampersand symbol ‘&’) enables you to run several programs at the same time from the same command line. This is important when you need to run all the programs at each corner of the piping system.
Simple example
Now you can understand what the grep below is doing:
cat file.txt | grep "foobar" > foobar.txt
It keeps only the lines that contain “foobar” and saves them in a file called foobar.txt.
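The same filter can be tried on a throwaway file. This is only a minimal sketch: the scratch directory and the sample contents of file.txt are made up for the demonstration. It also shows that grep can read the file directly, which gives the same result with one process less.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Sample input (invented for this demonstration).
printf '%s\n' 'first foobar line' 'nothing here' 'second foobar line' > file.txt

# The pipeline from the post: cat feeds grep through a pipe.
cat file.txt | grep "foobar" > foobar.txt

# Same result without the extra process: grep reads the file itself.
grep "foobar" file.txt > foobar_direct.txt

# cmp -s exits with 0 when both files are identical.
cmp -s foobar.txt foobar_direct.txt && echo "same result"
```

The cat version is still handy when the first stage is something more elaborate than a plain file.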

Multiple pipelining
With tee you can feed two or more pipes at the same time. Imagine you want to create three files: one containing all the lines with foo, another with all the lines with bar and a third with only the lines that contain both foo and bar. You can do this:
mkfifo pipe1; mkfifo pipe2
cat file | tee pipe1 | grep "foo" | tee pipe2 > foo.txt &
cat pipe1 | grep "bar" > bar.txt &
cat pipe2 | grep "bar" > foobar.txt
The tees copy the intermediate states to the FIFOs, which hold them until another process reads them. The three pipelines run at the same time because the first two are started in the background. Check the plumbing example here.
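For anyone who wants to try it, here is a self-contained variant of the same plumbing. It is a sketch under a few assumptions: the scratch directory and the four sample lines are invented, all three pipelines are put in the background, and a wait is added so the script only continues once every file is complete.

```shell
# Scratch directory, removed automatically when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Sample input (invented for this demonstration).
printf '%s\n' 'foo only' 'bar only' 'foo and bar' 'neither' > file

mkfifo pipe1 pipe2

# Writer: every line is copied to pipe1; lines with foo go on to pipe2 and foo.txt.
cat file | tee pipe1 | grep "foo" | tee pipe2 > foo.txt &
# Reader 1: everything arriving on pipe1, filtered for bar.
grep "bar" < pipe1 > bar.txt &
# Reader 2: the foo lines arriving on pipe2, filtered for bar.
grep "bar" < pipe2 > foobar.txt &

# Block until all three background pipelines have finished.
wait

rm -f pipe1 pipe2
```

With this input, foo.txt and bar.txt end up with two lines each and foobar.txt with the single line that has both words.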

Full system mirror
Today you have many tools to replicate entire machines and rapidly build a cluster with a configuration identical to that of a certain machine at a certain point in time, but none of them is as simple as:
tar cfpsP - /usr /etc /lib /var (...) | ssh dest -C tar xvf -
With the dash as the archive name, the first tar writes the archive to its standard output. The pipe hands it to the second command in line, ssh, which connects to the destination machine and runs another tar there to unpack the archive from its standard input.
The pipe is very simple and at the same time very powerful. The information is carried from one machine to the other, encrypted by ssh, and you didn’t have to set up anything special. It works on most Unix systems and even between different flavours of Unix.
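The tar-to-tar part of the trick can be tried without a second machine. In this sketch two scratch directories stand in for the source and destination hosts (an assumption for the demo, as is the sample file), and the ssh hop is left out, so it only illustrates how the archive travels through the pipe.

```shell
# Scratch area with a fake "source machine" and an empty "destination".
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/src/etc" "$workdir/dest"
echo "hostname=example" > "$workdir/src/etc/demo.conf"

# "f -" makes the first tar write the archive to stdout and the second read it from stdin.
# -C changes into the given directory before packing or unpacking.
tar cf - -C "$workdir/src" . | tar xf - -C "$workdir/dest"

# cmp -s exits with 0 when the copy is identical to the original.
cmp -s "$workdir/src/etc/demo.conf" "$workdir/dest/etc/demo.conf" && echo "mirrored"
```

Putting ssh between the two tar commands, as in the post, is what turns this local copy into a machine-to-machine mirror.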
There is a wiki page explaining the hack better here.
Simplicity, performance and compliance
Pipes, FIFOs and tees are universal tools, available on all Unices and supported by all shells. Because everything is handled in memory, they’re much faster than temporary files, and even if a program is not prepared to read from the standard input (and so can’t use pipes) you can create a FIFO and get the same effect, cheating the program. It’s also much simpler to use pipes and FIFOs than to create temporary files with non-colliding names and remove them later.
It can be compared with automatic vs. dynamic allocation in programming languages like C or C++. With automatic (stack) allocation you can create new variables, use them locally and they’ll be thrown away automatically when you don’t need them any more, but it can be quite tricky to deal with huge or changing data. Dynamic allocation, on the other hand, handles that quite easily, but the variables must be created, manipulated correctly and cleaned up after use, otherwise you have a memory leak.
Using files on Unix requires the same amount of care not to fill up the quota or have too many files in a single directory but you can easily copy them around and they can be modified and re-modified, over and over. It really depends on what you need, but for most uses a simple pipe/FIFO/Tee would be much more than enough. People just don’t use them…
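The FIFO trick for programs that insist on a file name can be sketched like this. Here wc merely plays the role of such a program, and the FIFO path and the three input lines are made up for the example.

```shell
# Scratch directory holding the FIFO, removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
fifo="$workdir/input.fifo"
mkfifo "$fifo"

# Producer: writes into the FIFO in the background; it blocks until a reader opens it.
printf '%s\n' alpha beta gamma > "$fifo" &

# Consumer: wc is handed a file name, just as a program without stdin support would be.
lines=$(wc -l "$fifo" | awk '{print $1}')

# Make sure the producer has finished before moving on.
wait

echo "lines read through the FIFO: $lines"
```

No data ever lands on disk: the FIFO only gives the in-memory pipe a name that the consumer can open.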
|
|
Object Orientation in C: Structure polymorphism |
| August 28th, 2008 under Algorithms, Devel, rengolin, Unix/Linux. [ Comments: 1 ]
|
|
Listening to a podcast about the internals of GCC, I learnt that, in order to support object-oriented languages in a common AST (abstract syntax tree), GCC does polymorphism in a rather peculiar way.
There is a page that describes how to do function polymorphism in C, but not structure polymorphism as GCC does it, by means of a union, so I decided that would be a good post to write…
Unions
Like structs, unions let you group a list of things together but, unlike structs, the things share the same space. So, if you create a struct with an int and a double, the size of the structure is at least the sum of both sizes (the compiler may add padding). In a union, the size is that of the biggest element, and all elements live in the same area of memory, accessed from the first byte of the union.
Its usage is somewhat limited and can be quite dangerous, so you won’t find many C programs and rarely find any C++ programs using it. One of the uses is to (unsafely) convert numbers (double, long, int) to their byte representation by accessing as an array of chars. But the use we’ll see now is how to entangle several structures together to achieve real polymorphism.
Polymorphism
In object-oriented polymorphism, you can have a list of different objects sharing a common interface and being accessed through that interface’s structure. But in C you don’t have classes and you can’t build structure inheritance, so to achieve the same effect you need to put them all in the same box, while at the same time defining a generic interface to access their members.
So, if you define your structures like:
struct Interface {
    int foo;
    void (*bar)();
};
struct OneClass {
    int foo;
    void (*bar)();
};
struct TwoClass {
    int foo;
    void (*bar)();
};
and implement the methods (here represented by function pointers) like:
void one_class_bar () {
    printf("OneClass.Bar()\n");
}
void two_class_bar () {
    printf("TwoClass.Bar()\n");
}
and associate the functions created to the objects (you could use a Factory for that), you have three different classes, still not connected. The next step is to connect them via the union:
typedef union {
    struct Interface i;
    struct OneClass o;
    struct TwoClass t;
} Object;
and you have just created the generic object that can hold either OneClass or TwoClass and be accessed via the Interface. Later on, when reading from a list of Objects, you can access them through the interface (as long as you declare the members of your classes in the same order) and it’ll call the correct method (or use the correct variable):
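/* Objects with their methods attached (the foo values are only illustrative) */
struct OneClass one = { 1, one_class_bar };
struct TwoClass two = { 2, two_class_bar };
Object list[2];
/* Setting: store each object through its own union member */
list[0].o = one;
list[1].t = two;
/* Using: call through the common Interface member */
list[0].i.bar();
list[1].i.bar();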
Note that when iterating over the list, the code accesses the Object via the Interface (list[0].i) and not via OneClass or TwoClass. Although the result would be the same (as they share the same portion of memory, and thus would execute the same method), it’s conceptually correct and compatible with object-oriented polymorphism.
The code above produces the following output:
$ ./a.out
OneClass.Bar()
TwoClass.Bar()
You can get the whole example here. I haven’t checked the GCC code, and I believe they’ve done it in a much better and more stable way, of course, but the idea is probably the same.
Disclaimer: This is just a proof of concept. It’s not nice, they (the GCC programmers) were not proud of using it (at least in the podcast) and I wouldn’t recommend that anyone use it in production.
|
|
gzip madness |
| April 9th, 2008 under Computers, Devel, rengolin, Unix/Linux. [ Comments: 3 ]
|
|
It was just another normal day here at EBI when I changed a variable called GZIP from local to global (via export in Bash) and got a very nice surprise: all my gzipped files had gzip itself as a header!!!
Let me explain… I have a makefile that, among other things, gzips some files. So, I’ve created a variable called GZIP that is the same as “gzip --best --stdout” and in my rules I do:
%.foo : %.bar
	$(GZIP) < $< > $@
So far so good, it always worked. But I had a few makefiles redefining the same command, so I thought: why not make an external include file with all the shared variables? I could use the include directive for makefiles, but I also needed some of those variables in shell scripts, so I decided to use “export VARIABLE” for all make variables (otherwise they aren’t picked up) and called it a day. That’s when everything started failing…
gzip environment
After a while digging into the problem (I was blaming poor LSF for it) I found that when I didn’t have the GZIP variable defined all went well, but the moment I defined GZIP="/bin/gzip --best --stdout" even a plain call to gzip was corrupted (i.e. had the gzip binary as a header).
A quick look at gzip’s manual gave me the answer… GZIP is the environment variable in which gzip looks for its default options. So, if you say that GZIP="--best --stdout", every time you call gzip it’ll use those parameters by default.
So, by putting “gzip” in that variable, I was always running the following command:
$ /bin/gzip /bin/gzip --best --stdout < a.foo > a.bar
and putting a compressed copy of the gzip binary together with a.foo into a.bar.
What a mess a simple environment variable can make…
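One way around the clash, sketched here under assumptions, is simply to export the command under a different name. GZIP_CMD is a made-up name (anything other than GZIP would do), and the scratch files are only there to check the round trip.

```shell
# Scratch directory, removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Make sure no inherited GZIP variable changes gzip's behaviour in this demo.
unset GZIP

# GZIP_CMD is a hypothetical name chosen so that gzip itself never looks at it.
GZIP_CMD="gzip --best --stdout"
export GZIP_CMD

echo "some payload" > "$workdir/a.bar"

# Left unquoted on purpose so the shell splits it into the command and its options.
$GZIP_CMD < "$workdir/a.bar" > "$workdir/a.foo"

# Round trip: decompressing must give back exactly the original content.
restored=$(gzip -dc < "$workdir/a.foo")
echo "$restored"
```

The same variable can then be shared between makefiles and shell scripts without gzip ever reading it as its own option list.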
|
|
Who’s the amateur now? |
| January 15th, 2008 under Computers, rengolin, Software, Unix/Linux. [ Comments: 3 ]
|
|
A long time ago, when I started using Linux, lots of people laughed at me: “How absurd! You have to compile your own kernel, what do they want with that? They’ll get nowhere!”. Well, things have changed a bit in the last decade and Linux has grown into a very mature, modern and user-friendly operating system, as we (not them) all expected.
OS companies didn’t believe it at first, but with time Linux became a nuisance, then a problem and now it’s real competition. Not only Linux (or rather GNU/Linux) but all free software and all the free licenses like GPL, BSD, CC, etc. Linux is real business: it’s more stable, faster, better designed and changes so much faster than any other OS in existence, both for security patches and new features. Lots of companies today contribute to free software without charge or restrictions, just because we gave them so much without charge or restrictions (and it turns into profit too!).
But last year something I wasn’t expecting happened… The biggest OS company of the last 15 years made a move so stupid that I couldn’t believe it. Windows Vista was not an operating system, it was a joke, a *very bad joke* indeed. It reminded me of the first upgrades of the first Linux distros back in ’94: it was a nightmare.
Well, it seems the free software community has learnt a lot about deployment, user interfaces, quality assurance and software development strategies. On the other hand, Microsoft seems a bit amateurish when trying to fix its previous mistakes. Every round it gets worse; I wonder where the good programmers they used to have are now…
Well, better for us: Ubuntu seems to be the new OS of choice for many former Windows users, and with recent Microsoft moves that may become more and more common… With luck they’ll force everyone out of XP (the last minimally decent thing they did) as they did with Win2000 (the only reasonably decent thing they did) and people will migrate to Ubuntu instead of Vista… Let’s see the outcome by next year…
|
|
LSF, Make and NFS 2 |
| November 27th, 2007 under Computers, Distributed, rengolin, Unix/Linux. [ Comments: none ]
|
|
Recently I posted this entry about how the NFS cache was playing tricks on me and how sleep 1 kind of solved the issue.
The problem got worse, of course. I raised it to 5 seconds and in some cases it was still not enough; then I learnt from the admins that the NFS cache timeout was 60 seconds!! I couldn’t sleep 60 on all of them, so I had to come up with a script:
timeout=60
slept=0
while [ ! -s "$file" ] && (( slept < timeout )); do sleep 5; slept=$((slept + 5)); done
In a way it's not as ugly as it may seem... First, the alternative is to change the configuration (either disable the cache or reduce the timeout) for the whole filesystem, and that would affect others. Second, I now wait for (almost) the right amount of time, and only when I need to (the first -s test finds the file straight away if there is no problem).
At least, sleep 60 on everything would be much worse!
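The loop is easier to reuse across rules when wrapped in a function. The following is only a sketch: the name wait_for_file, its optional arguments and the one-second "late" file are all made up for illustration, and it sticks to POSIX test syntax so it also runs in a plain sh.

```shell
# wait_for_file FILE [TIMEOUT] [STEP]  (hypothetical helper)
# Polls until FILE exists and is non-empty, or until TIMEOUT seconds have passed.
# Returns 0 if the file showed up, 1 on timeout.
wait_for_file() {
    file=$1
    timeout=${2:-60}
    step=${3:-5}
    slept=0
    while [ ! -s "$file" ] && [ "$slept" -lt "$timeout" ]; do
        sleep "$step"
        slept=$((slept + step))
    done
    [ -s "$file" ]
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Simulate a file that becomes visible one second late, as with a slow NFS cache.
( sleep 1; echo "data" > "$workdir/late" ) &

wait_for_file "$workdir/late" 10 1 && echo "late file arrived"
wait_for_file "$workdir/never" 2 1 || echo "gave up on missing file"
wait
```

Returning a status lets the calling rule fail cleanly when the file never appears, instead of carrying on after the timeout.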
|
|
Multics back from the dead |
| November 16th, 2007 under Devel, OSS, rengolin, Unix/Linux. [ Comments: none ]
|
|
Multics has risen from the dead, in source code form! MIT has just released its source and now you can see with your own eyes how it was back in ’64!
It’s not easy to retrieve the whole code (no tarballs), but reading its parts is a good exercise, if you can understand the structure, of course. If you can’t, don’t worry, start here.
|
|
LSF, Make and NFS |
| October 17th, 2007 under Algorithms, Distributed, rengolin, Unix/Linux. [ Comments: 2 ]
|
|
I use LSF at work, a very good job scheduler. To parallelize my jobs I use Makefiles (with the -j option) and inside every rule I run the command through the job scheduler. Some commands call other Makefiles, cascading the spawning of jobs even further. Sometimes I reach 200+ jobs in parallel.
Our shared disk, a BlueArc, is also very good, with access times quite often faster than my local disk, and yet for almost two years I’ve seen some odd behaviour when putting all of them together.
I’ve reported random failures in processes that had worked until then and that, without any modification, worked ever after. But not long ago I figured out what the problem was… NFS refresh speed vs. LSF spawn speed using Makefiles.
When your Makefile looks like this:
bar.gz:
	$(my_program) foo > bar
	gzip bar
There isn’t any problem because as soon as bar is created gzip can run and create the gz file. Plain Makefile behaviour, nothing to worry about. But then, when I changed to:
bar.gz:
	$(lsf_submit) $(my_program) foo > bar
	$(lsf_submit) gzip bar
Things started to go crazy. Once every few months, one of my hundreds of Makefiles just finished saying:
bar: No such file or directory
make: *** [bar.gz] Error 1
And what’s even weirder, the file WAS there!
During the period when these magical problems were happening (luckily I was streamlining the Makefiles every day, so I could just restart the whole thing and it went as planned), I had another problem, quite common when using NFS: NFS stale handle.
I have my CVS tree under the NFS filesystem and, when testing some Perl scripts between AMD Linux and Alpha OSF machines, I used to get this error (the NFS cache was being updated) and in most cases had to wait a bit or just try again.
It was then that I figured out what the big random problem was: NFS stale handle! Because the Makefile’s jobs were running on different computers, the NFS cache took a few milliseconds to update and the LSF spawner, berserk for performance, started the new job well before NFS could reorganize itself. This is why the file was there after all: it was on its way, and the Makefile crashed before it arrived.
The solution? Quite stupid:
bar.gz:
	$(lsf_submit) "$(my_program) foo > bar" && sleep 1
	$(lsf_submit) gzip bar
I’ve put it on all rules that have more than one command being spawned by LSF and never had this problem again.
The smart reader will probably tell me that it’s not just ugly, it doesn’t cover all cases at all, and you’re right, it doesn’t. An NFS stale handle can take more than one second to clear, single-command rules can break on the next hop, etc., but because there is some processing between them (rule calculations are quite costly, run make -d and you’ll know what I’m talking about) the probability is too low for our computers today… maybe in ten years I’ll have to put sleep 1 on all rules…
|
|
Yet another supercomputer |
| October 2nd, 2007 under Algorithms, Computers, Distributed, rengolin, Unix/Linux. [ Comments: none ]
|
|
SiCortex is to launch their cluster-in-a-(lunch)-box, with promo video and everything. It seems pretty nice, but some things worry me a bit…
Of course a highly interconnected backplane and some smart shortest-path routing algorithms (probably not as good as Feynman’s) are much faster (and more reliable?) than gigabit ethernet (Myrinet also?). Of course, all-in-one chip technology is much faster, safer and more economical than any HP or IBM 1U node money can buy.
There is also some eye candy, like a pretty nice external case, dynamic resource partitioning (like VMS), a native parallel filesystem, MPI-optimized interconnection and so on… but do you remember the Cray-1? It had wonderful vector machines but in the end it was so complex and monolithic that everyone got stuck with it and never used it any more.
Is assembling a 1024-node Linux cluster with PC nodes, Gigabit, PVFS, MPI etc. hard? Of course it is, but the day Intel stops selling PCs you can use AMD (and vice versa), and you won’t have to stop using the old machines until you have a whole bunch of new ones up and running, transparently integrated with your old cluster. If you do it right you can have a single Beowulf cluster running Alphas, Intel, AMD, Suns etc.; just take care of the paths and the rest is done.
I’m not saying it’s easier, nor cheaper (the costs of air conditioning, cabling and power can be huge), but being locked to a vendor is not my favourite state of mind… Maybe it would be better if they had smaller machines (say 128 nodes) that could be assembled into a cluster and still allowed external hardware to be connected, with intelligent algorithms to work out the cost of migrating processes to external nodes (based on network bandwidth and latency). Maybe it could even ease their entry into existing clusters…
|