SIGSTP: 2012

Friday, 23 November 2012

Friday Time Waster: Watch the world go by in ASCII with Reuters' API

I've recently released WebService::ReutersConnect. It's a Perl module that interfaces with the ReutersConnect API in OO style. To demonstrate it and hopefully entertain you on this Friday, here's how to use it to watch the world go by in glorious ASCII, from the comfort of your command line. In short: the perfect Friday Time Waster.

The ingredients

To cook this recipe, you will need:

- WebService::ReutersConnect

- Text::AAlib

While you can install all of this with the CPAN command, I suggest you install most of their dependencies through your OS packaging system first.

If you're Debian-based, here are the debs to install: libmoose-perl libdatetime-perl libdatetime-format-iso8601-perl libtest-fatal-perl libxml-libxml-perl libwww-perl liblog-log4perl-perl liburi-perl libaa1-dev

Then just install WebService::ReutersConnect and Text::AAlib via your favourite channel.

The Recipe

The recipe is pretty straightforward. It goes like this:

Preamble. We just load the required packages and also make sure that Log4perl is not going to complain about lack of initialisation:

#! /usr/bin/perl -w

use strict;
use warnings;

use WebService::ReutersConnect qw/:demo/;
use Log::Log4perl qw/:easy/;
use Text::AAlib;
use Imager;

Log::Log4perl->easy_init($WARN);

Building the $reuters object and querying the freshest image. Here we use the demo credentials and search for the latest picture:

my $reuters = WebService::ReutersConnect
  ->new({ username => REUTERS_DEMOUSER,
          password => REUTERS_DEMOPASSWORD
        });

my ( $item ) = $reuters
  ->fetch_search({ limit => 1,
                   media_types => [ 'P' ],
                   sort => 'date' });

Building the ASCII of the picture. We just download the preview URL and render it in ASCII using Text::AAlib:

## Load preview image
my $res = $reuters->user_agent->get($item->preview_url());
unless( $res->is_success() ){
  die $res->status_line();
}

## Build and scale image
my $bin_image = $res->content();
my $img = Imager->new( data => $bin_image , type => 'jpeg' ) || die Imager->errstr();
$img = $img->convert( preset => 'grey' );
$img = $img->scaleX(scalefactor => 2); ## Tweak to your taste
$img = $img->scale(scalefactor => 0.5); ## Tweak to your taste

## Build ASCII Version
my ($width, $height) = ($img->getwidth, $img->getheight);
my $aa = Text::AAlib->new( width => $width, height => $height );

## Copy the picture onto the AAlib canvas; without this, render() has nothing to draw
$aa->put_image( image => $img );

Rendering the whole thing on the console. The rendering parameters are OK for me, but you might have to play with them depending on your taste/colour scheme:

## Print ASCII Image
print $aa->render( dither => 0 ,
                   gamma => 1.83,
                   bright => 50,
                   contrast => 60,
                   color => 0
                 );

## And some info
print "\n ".$item->date_created().' : '.$item->headline()." \n";
print "\n ".$item->preview_url()."\n";

And that's it! Now you can put all of that together, or if you're lazy, you can just copy/paste the whole thing from this pastebin.

Feel free to experiment with the options and the code, and tell us about your improvements in the comments!

Happy coding!

Jerome.

Thursday, 8 November 2012

PDFs coming out upside down

Recently at work we received a defect report ticket addressed to the tech team, saying:

<Support 1>:

When documents are added to the <Client> system, pdf attachments are coming out upside down.

Referred the matter to <Account Manager>, who confirmed this is a technical issue.
To replicate

Log into <Client> system
Go into document 1234567
Click on pdf

Screen shots attached

For obvious reasons, I cannot show you the screen shot, but it effectively shows an upside down document.

One of the most enlightened members of the tech team followed up:

We would need to see the original document that the client uploaded to confirm, however, as this uploaded PDF is the result of manual scanning in of a paper document, it looks very much to us as though the paper document has been placed in the scanner the wrong way round, and hence has been scanned upside down.

Then Support comes back.

<Support 2>

Client has advised:

Hello, scanning the document the other way around has solved the problem - thanks for resolving this.

D'oh. Lesson learned: Don't trust <Account Manager>'s technical issue opinions.

Tuesday, 25 September 2012

Understanding Unicode and UTF8 in Perl

I know there are a lot of things out there on the subject, but when it comes to Unicode, a quick review of core concepts and bug avoidance guidelines is never a waste of time.

I'm not a Unicode Guru, but working with third parties, I often find that a lot of people consistently fail to get the basics right about Unicode and encoding. There must be something esoteric about it. So here's yet another set of slides about Unicode/UTF8 in Perl.

It's not meant to be a comprehensive presentation of all things Unicode in Perl. It's meant to insist on a couple of guidelines and give some pointers to get a good start writing a Unicode-compliant application and avoiding common issues.

Comments are open for questions. Happy coding!

Friday, 21 September 2012

Yes, One, I don't know - three tales about understanding people

As application developers, it's crucial that we understand what other people are saying, and we're often in a position where misunderstanding is an easy trap to fall into, especially when we speak with people with no computer science background. Yes, that happens. Scary, huh? The key point when this happens is to forget about everything you know about basic understanding, even in your native language. So here are three tales about how even the simplest language elements can lead to misunderstanding, and what to do about it.

Yes

I started working for (very little) money as a bio-informatics consultant. My main duty was to write and use software to produce - in a few hours of number crunching - enough hypothetical results to keep molecular biologists busy for decades, and to cost investors millions of hard-earned euros, validating them in the wet lab. At the time, my brain was fresh from university and I was formatted to think about the world in terms of 1's and 0's. There were only 10 categories under which my mind ultimately classified things: true and false. So the word "Yes" had only one meaning for me. It meant something was true in the past, now and in the future, independently of circumstances. As you can guess, there were not many questions you could answer with a plain "Yes" to me. For instance, if you answered "Yes" to the question "Do you like hummus?", I would have served you hummus for the rest of your life. Well, only until you finally decided to tell me that "actually yes, I like hummus most of the time, but not always", which is closer to reality. If it's just about culinary tastes, it's OK, but when it's about work, the misunderstanding is annoying at best. And I've made this mistake quite a few times: taking a biologist's "yes" for a computer science "yes" and mis-designing something as a consequence. Now I try to remember not to take "yes" for an answer. When I hear "yes", I try to question it further, think twice and make sure I deal with the case where this "yes" is not true. Because in the real world, it will happen.

One

I bet that like me you've noticed the following fact about software development:

"Given enough time and business analysis, any unique relationship will eventually have to turn multiple."

It works for almost everything. Your users can set their preferences? One day you'll want to manage multiple sets of preferences. You display the hottest deal of the day on your homepage? Quite soon you'll want multiple. A document has got a language property? What about mixed language documents? Multiple. A user has got a postal address? No, they've got multiple addresses - work/home/mum/dad. OK, so now they have multiple addresses, but they choose a favourite one. Hmm, what if they want a different favourite depending on the context? Think "Deliver to work, but send the bill to home". Multiple again. Programming language has got single inheritance? The next one will have multiple. And you can go on and on about it. There's no end to multiplying. I wonder sometimes why unique relationships are included in UML. I think it's to trick developers into believing people saying "One". But don't be fooled. There's no such thing as "One". If you don't believe me, read this multiple times.
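To make the address example concrete, here's a minimal sketch in JavaScript of what the model tends to look like once "one" has turned into "multiple" twice. All the names (createUser, addAddress, preferAddress, addressFor) and the sample data are made up for illustration; it's one possible shape under those assumptions, not a recommended design.

```javascript
// Hypothetical user record: many addresses, and one favourite per context.
function createUser(name) {
  return { name: name, addresses: [], preferredByContext: {} };
}

// Add an address and return its index in the user's address list.
function addAddress(user, label, street) {
  user.addresses.push({ label: label, street: street });
  return user.addresses.length - 1;
}

// Mark one address as the favourite for a given context ('delivery', 'billing', ...).
function preferAddress(user, context, index) {
  if (index < 0 || index >= user.addresses.length) {
    throw new RangeError('No address at index ' + index);
  }
  user.preferredByContext[context] = index;
}

// Favourite address for a context; falls back to the first address
// (undefined if the user has none) when no favourite was chosen.
function addressFor(user, context) {
  const index = user.preferredByContext[context];
  if (index !== undefined) {
    return user.addresses[index];
  }
  return user.addresses[0];
}

// "Deliver to work, but send the bill to home" (sample data).
const user = createUser('Alice');
const home = addAddress(user, 'home', '1 Home Street');
const work = addAddress(user, 'work', '2 Office Road');
preferAddress(user, 'delivery', work);
preferAddress(user, 'billing', home);

console.log(addressFor(user, 'delivery').label);
console.log(addressFor(user, 'billing').label);
```

Starting with a list plus a per-context lookup costs little on day one, and it saves a schema migration the day the second address shows up.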

I don't know

This one is not strictly speaking about computer science, but about science in general. I think that people with a scientific background of any kind are used to dealing instantly with equivalence and equations. For instance, given the speed and the mass of a moving object, we instantly know its kinetic energy because we know that E = (1/2)mv^2. It also works transitively. I don't want to go into overly complex examples, but let's say you have a, b, c, d=f(b,c), e=f(d,a), and let's say I give you a, b, c and ask you if you know 'e'. Someone with a science background will tell you "yep, because you know a and d, and you know d because you know b and c". I did the test with my wife - who's got a biology background - yesterday, so my statistics are pretty solid. To me this is a perfectly natural way of thinking. Until I took a class in finance.

In an attempt to avoid spending my elderly years begging in the streets of London, I took this class in finance on Coursera to understand how the wheels turn. The lecturer looks to me like a brilliant man, full of common sense and with very good pedagogy. I highly recommend this class. Anyway. One thing that struck me is that sometimes I was kind of lost in the teacher's thinking. He was presenting an example where a, b and c were known, and was asking the question: "Do you know e?". And the right answer was not "Yes of course, because e=f(f(b,c),a)". The correct answer to him was "I don't know". And I understood why I sometimes didn't understand him. Because for him, "I don't know" doesn't mean the same as my "I don't know". To me "I don't know" means "It cannot be reached within a reasonable computing time". To a mathematician, a physicist or a biologist, it will probably mean more or less the same. But to him it means something different. It means "I don't know it yet because I haven't calculated it yet". Once I understood that, the rest of the class was suddenly crystal clear, because I had understood that this simple sentence, "I don't know", meant something different to him and to me.

The fact that we cannot assume that everybody shares the meaning of such simple concepts as "Yes", "One", "I don't know" and probably a lot of others is quite fascinating. It tells us that effective communication between human beings of different backgrounds is probably one of the most difficult things to achieve at work and in life in general.

I don't know if there's a solution to this, but until people more clever than me find one, I'll try to stick to this rule: ~~Don't speak to non IT people~~ Try to understand what meaning others attach to basic phrases.

Saturday, 15 September 2012

JS/HTML/CSS: A braindump from a backend guy

It's a wonderful sunny Saturday outside, yet I'm here blogging about web development. I should see a doctor, but in the meantime please suffer my brain dump about things browser related.

I'm not really a front end developer, in the sense that I have very poor graphic design skills and that my knowledge of Javascript and CSS is limited. However, over time I've had my fair share of front end development, more recently here, not counting the things I do for fun. So yeah, I don't have expert level knowledge about all of this, but I feel I have enough experience to think broadly about some general guidelines on what should be fed to browsers. The goal of this post is not to dictate what I think is right, but rather to put some ideas on the table and to call for criticism, alternatives and suggestions for improvement. So if you're a JS Ninja, or a CSS old-master, please be indulgent, and share your art in the comments.

Ok let's get started with the general page structure. I like a page source to be clean and organised (not like my bedroom), but also the goal here is to allow the browser to render the page as soon as possible. In other words, to reduce the time to first pixel:

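<html>
  <head>
    <title>My Wonderful Page</title>
    <!-- The one CSS file, on the same domain (placeholder path) -->
    <link rel="stylesheet" href="/css/site.css">
    <!-- The one minimal loader script (placeholder path) -->
    <script src="/js/head.min.js"></script>
  </head>
  <body>
    <div>
      All sorts of HTML stuff.
    </div>
    <script>
      head.js('https://ajax.googleapis.com/.../jquery.min.js',
              'https://ajax.googleapis.com/../jquery-ui.min.js',
              'http://mysite.com/js/site.js'
      );
      head.ready(function(){
        installFancyJS($(document));
      });
    </script>
  </body>
</html>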

Let's go through this step by step.

Use a JS Loader. The idea here is to do as little as possible in the header, and defer the heavy lifting to the end of the body. In your header, you should have one and only one minimal script, and this script should allow you to load things later at the bottom of your page. Here I use head.js. It's got other fancy base features, but load.js is also a good alternative if you'd like to make it even smaller. Let your page be pear shaped.

One CSS File. Not two, not three. One, and preferably on the same domain as your main site (so relative URLs are welcome here). Why is that? Here we reduce the number of extra requests the browser has to make to a minimum of erm ... one, and by having the CSS hosted on the same hostname, you avoid doing extra DNS queries, saving you precious milliseconds. Some people even argue that you should have your CSS in the page itself. I'm not sure about this one. It's true that it saves you the extra request entirely, but at the same time, your page is now heavier and therefore takes longer to download. There's probably a reasonable size limit under which this actually saves time, but I'm not sure about its value. Of course, in the development environment, you probably want to have your CSS neatly organized, so you don't have a giant ball of spaghetti monster to deal with. I've tried different techniques to achieve that, but haven't come up with something I really like yet. Suggestions?

All sorts of HTML Stuff. Warning. Controversial bit. What? HTML stuff, controversial? Well yes, as it seems to me that with the advent of the uber-ubiquitous Ajax coolness, a large number of people chose to ignore a fundamental fact of life (at least of life as a developer): Browsers were born to render HTML and dynamic web servers can generate HTML. For instance, I know a site where every time they need to display a table of something (let's say historical stock prices), they first display an empty table, and then, when all the page JavaScript is loaded, they make a second query to get the data to populate the table. It drives me mad. Not only do I not see the point*, but more importantly, you can imagine how it kills the 'time to useful pixel' of their pages. It's not that I'm against Ajax. Not at all. I love it, I was doing it wayyyy before jQuery became popular. In fact you'll find here some posts I wrote about how to easily 'ajax' anything. But I kind of believe that if you want to do JS object models and get JSON from the server instead of HTML, you'll be much better off developing your backend as an API and generating your GUI using something like Flex or GWT (please add yours to the list :)). I certainly recognise the incredible value and power of OO JS/JSON for complex and specific information manipulation and display (down with flash and applets!!), but for general page development, I prefer to stick with good old HTML, especially if the application is 'browsing' oriented.
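To illustrate the "servers can generate HTML" point, here's a small sketch of the stock price table being rendered in one go, data included, so the browser gets useful pixels with the first response. It's written in JavaScript only to keep one language across this post's examples; the function names (escapeHtml, renderPriceTable) and the prices are made up, and the same idea applies to whatever templating your backend uses.

```javascript
// Escape text so it can be safely placed inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render the whole table, rows included, as a single HTML string.
function renderPriceTable(prices) {
  const rows = prices
    .map(function (p) {
      return '<tr><td>' + escapeHtml(p.date) + '</td><td>' +
             escapeHtml(p.close) + '</td></tr>';
    })
    .join('\n');
  return '<table>\n<tr><th>Date</th><th>Close</th></tr>\n' + rows + '\n</table>';
}

// Sample data, invented for the example.
const html = renderPriceTable([
  { date: '2012-09-13', close: 10.5 },
  { date: '2012-09-14', close: 10.75 }
]);
console.log(html);
```

No second round trip, no empty table waiting for a script to load.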

One JS Function that does it all. I call it (observe my creative naming skills) 'installFancyJS', and it takes one argument: the jQuery** context it's going to be used in. So what does this function do? Well, it installs callbacks and all sorts of responsive code in the page just after it's loaded. More precisely, in the given context. To give you a very minimal example:

function installFancyJS(context){
  installTopClick(context);
  // .... (more installers go here)
}

function installTopClick(context){
  // Clicking the top bar scrolls the page back to the top.
  $('div#topbar' , context).on('click', function(ev){
    $('html,body').animate({scrollTop: 0 },
                           { duration: 'slow', easing: 'swing'}
    );
  });
}

The reason why I propagate the context everywhere is that when I get a new bit of HTML using Ajax, I want to be able to install the full set of Javascript features on the newly created slice of document. Here's a minimal example of an Ajax callback that uses the function:

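function(data){
  var newbit = $(data);
  // 'container' comes from the enclosing scope.
  container.replaceWith(newbit);
  // Install fancy JS on the new content. After replaceWith(), 'container'
  // is detached from the page, so the context has to be 'newbit'.
  installFancyJS(newbit);
}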
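One way to keep installFancyJS from growing into a long list of calls is to let each feature register its own installer. Below is a minimal sketch of that idea, deliberately free of jQuery so it runs anywhere: registerInstaller is a made-up name, and the plain objects stand in for what would be jQuery contexts (the document, or a freshly loaded fragment) in a real page.

```javascript
// Registry of installer functions; each one wires up a single feature
// inside whatever context it is given.
const installers = [];

function registerInstaller(installer) {
  installers.push(installer);
}

// The one entry point: run every registered installer against the context.
function installFancyJS(context) {
  installers.forEach(function (installer) {
    installer(context);
  });
}

// Two stand-in features. Real ones would bind events inside the context.
registerInstaller(function (context) { context.features.push('topClick'); });
registerInstaller(function (context) { context.features.push('tooltips'); });

// Once for the whole page...
const page = { name: 'document', features: [] };
installFancyJS(page);

// ...and once more for a fragment that arrived later via Ajax.
const ajaxFragment = { name: 'new fragment', features: [] };
installFancyJS(ajaxFragment);

console.log(page.features);
console.log(ajaxFragment.features);
```

Both contexts end up with the same set of features, which is the whole point of passing the context around.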

That's it for now. I hope you enjoyed reading this and found some interesting/ridiculous/horrible things to talk about.

Jerome.

* But I see the reason: they got tied to the horrible datatables.net.

** Yes, I use jQuery. Please rant and suggest in the comments if you know a better one.

Tuesday, 21 August 2012

The goodness of testing

Wile E. Coyote testing a new rocket design

There’s a common view among programmers that testing (that is, writing tests) is a painful thing you have to do to please your manager, or something you can avoid doing if your manager is not bothered. Let’s try to debunk that and show that testing is actually fun.

It guarantees code correctness. That’s obvious. Less obviously, it forces you to think about what correctness actually means for your code. By writing tests, you will more easily think about those pesky edge cases that other users of your code will inevitably notice. Further techniques like property testing or mutation testing can also help automate thorough correctness checks.
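As a taste of what property testing looks like, here's a hand-rolled sketch in plain JavaScript. Instead of checking a few hand-picked inputs, it generates a couple of hundred random arrays and checks that a property ("reversing twice gives back the original") holds for all of them. Everything here is illustrative: the reverse function under test, the helper names and the seed are made up, and a real project would more likely reach for a dedicated property testing library.

```javascript
// Function under test (illustrative): reverse an array without mutating it.
function reverse(items) {
  return items.slice().reverse();
}

// Small deterministic pseudo-random generator (multiplicative congruential),
// so that a failing run can be reproduced from its seed.
function makeRandom(seed) {
  let state = seed;
  return function next() {
    state = (state * 48271) % 2147483647;
    return state / 2147483647;
  };
}

// Random array of 0 to 9 integers, each between 0 and 99.
function randomArray(random) {
  const length = Math.floor(random() * 10);
  const items = [];
  for (let i = 0; i < length; i++) {
    items.push(Math.floor(random() * 100));
  }
  return items;
}

// Property: reversing twice gives back the original array.
function checkReverseTwice(runs, seed) {
  const random = makeRandom(seed);
  for (let i = 0; i < runs; i++) {
    const input = randomArray(random);
    const output = reverse(reverse(input));
    if (JSON.stringify(output) !== JSON.stringify(input)) {
      return { ok: false, counterexample: input };
    }
  }
  return { ok: true };
}

const result = checkReverseTwice(200, 42);
console.log(result.ok ? 'property holds for 200 random inputs' : result);
```

When the property fails, the counterexample is exactly the kind of pesky edge case you'd otherwise hear about from your users.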

It gives you a safety net when you refactor. Imagine you’ve finally decided to repay some technical debt and rewrite your poorly written methods (don’t tell me you haven’t got any). If you’ve got good test coverage, no worries. You can refactor and rewrite your module, increase performance and clean up poorly written code without worrying about breaking anything. At least anything that's tested.

It forces you to write better APIs. By putting you in the role of a user of your code/module/API, it forces you to think about the nicest way to use your module. This is creative, fun and interesting: How, as a client of your library, would you like to be able to use it? How would you like to interact with your piece of software? What do you expect from it? What should it expect from you? You decide, and because you're lazy and imaginative, you'll come up with something neat. If anybody else uses your code in the future, you will attract a lot of developer love. Because people will go: "hum, that’s neat, and it works first time." That's a lot of doughnuts for you.

You can see your tests as a kind of contract about what is supposed to work now and forever more (biblical voice needed). It’s a contract between you, your future you, anyone who touches your code in the future, and also the outside world. Why the outside world? I sometimes think that anything that’s not tested doesn’t exist. It’s probably a philosophical view, but as a programmer, I like to think in binary terms. Something is there or something is not there. At least until quantum computers are on our desktops. So what about a feature that is not tested? It can break without anyone noticing, especially if it's a minor thing, or it can break and only the users notice (very bad). So is it there or not?

Well, if it can go away silently, I think it’s safer to assume it’s not there. Ask your sales team if they feel comfortable about selling stuff that can vanish at any time. Anything outside testing shouldn’t be assumed to exist. That leads to what I deeply think about testing.

Testing does not help to make your product more reliable. Testing is your product.

Comments welcome!

Tuesday, 3 April 2012

Do you speak Franglais?

If you want to sound more clever than you really are in a French company, it's a good idea to forget about speaking French correctly and start speaking 'Franglais' instead. Franglais, spoken with the appropriate French accent, is a very effective way to impress your colleagues and your manager (in Franglais: "ennepluswone"), giving them the feeling that you hold an MBA from Harvard, although the last time you were involved with Anglo-Saxon business culture, it was at 'Wall Street institute, 26 Rue de Lyon Paris'.

So here's a quick list to get you started in Franglais:

Cible -> Target
Concours -> Contest
Premières étapes -> Early Stage
Mis en favoris -> Bookmarké (Bookmarked in English, but the French would pronounce 'Bookmark-aide')
Délocalisé -> Offshore
Processus -> Process (you save two letters!)
Commuté -> Switché (Switched in English)
Appel groupé -> Confcall
Tarifs -> Pricing
Défi -> Challenge
Concentré -> Focusé
Plein Ecran -> Full Screen
Atelier -> Workshop
Externalisé -> Outsourcé
Mondial -> Worldwide
Sauvegardé -> Backupé
Feuille de route -> Roadmap
Fonctionnalité / Attribut -> Feature
Retransmis -> Forwardé
Date butoir -> Deadline
Incité -> Incentivé
Recommandations -> Guidelines
Discussion -> Chat
Collègue -> Co-worker
Personnalisé -> Customisé (Customis-aide)
Invité -> Guest
Avertissement -> Warning
Contenu -> Content
Ordonnanceur -> Scheduler

PS: Thanks to Michel Poulain for his 'Bingo du Franglais'