How to write a new authoring tool in Agile

April 22, 2009

Or how to avoid getting WordPad for the price of Word, or getting Word when all you want is a tool to edit math over the web :)

If I had to do such a thing (wonder how such an idea came to me? :-) ), I would proceed in the following order:

  1. Look for a standard dealing with the subject, and I’ll find MathML. (I don’t want to reinvent the wheel to store my data.)
  2. Understand that an authoring storage format differs from a display format (DOC/PDF, …); indeed, every authoring tool needs to store metadata (author, revision, pagination, …)
  3. Try to find out what my users need (print the page, copy/paste to/from Word/Office/Firefox/IE/…)
  4. Try to clearly define what I want :)

So my first backlog (list of tasks) will look like:

  1. Set up my continuous integration environment
  2. Provide a simple GUI with a nice display of ‘/’ (division)
  3. Provide a print function
  4. Provide a copy/paste function from Word
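
To make the ‘nice display of division’ item a bit more concrete, here is a minimal sketch that turns a string like `x/2` into a MathML fraction. The function names (`fraction_to_mathml`, `_token`) are hypothetical and the parsing is deliberately naive (a single ‘/’, no nesting); only the element names `math`, `mfrac`, `mi` and `mn` come from the MathML standard.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def _token(text):
    """Wrap a token in <mn> if it looks like a number, otherwise in <mi>."""
    text = text.strip()
    tag = "mn" if text.replace(".", "", 1).isdigit() else "mi"
    element = ET.Element(tag)
    element.text = text
    return element


def fraction_to_mathml(expression):
    """Convert 'numerator/denominator' into a MathML <mfrac> string.

    Hypothetical helper: raises ValueError if there is no '/' in the input.
    """
    numerator, denominator = expression.split("/", 1)
    math = ET.Element("math", {"xmlns": MATHML_NS})
    frac = ET.SubElement(math, "mfrac")
    frac.append(_token(numerator))
    frac.append(_token(denominator))
    return ET.tostring(math, encoding="unicode")


print(fraction_to_mathml("x/2"))
```

A real tool would need a proper expression parser, but even this small step shows that the storage side can rely on the standard rather than on a home-made format.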

I think that quite quickly I’ll have to add matrix/vector/square/root function display,

but what about a spell checker? Pagination? Indeed, what’s the most important for my users? Currently I have no clue, and I’d say: don’t think too much about the question before the first prototype. Humans are so strange that the need can come from nowhere …

Maybe importing text from a forum or the Internet, or pasting to a forum, will be the key requirement.

Then, from a technical point of view, as I know I’ll need to include most of the functionality of a standard authoring tool: will I reinvent the wheel, or will I extend a standard tool (Word/Office)? Extend a wiki-like tool? Extend a forum tool? Use the remote publication mode of Word/Office/wiki? Will I force my users to be online to edit content, or will I allow offline authoring? …

Lots of questions, to which I’ll have answers after the first prototype …

So, as a first conclusion, building a quick prototype seems to be a key success factor in this area.

Agility will help me focus on the features most valuable to my end users, and deliver working prototypes on a regular basis.


Code quality

April 10, 2009

I haven’t posted for some time; I was quite busy setting up some CI / metrics tools on a new project.

This raised some questions and comments.

1) Complexity of the tools

When you read about it (even in my posts), it sounds simple … I had somewhat forgotten the ‘open community’ and its bad aspects.

Let me illustrate. I am working in a team using mainly Java / PHP.

So, we use standard tools:

  • Source revision control: SVN
  • IDE: Eclipse
  • Build: Maven
  • CI: Hudson

So far, everything looks fine. I am used to running some static code analysis tools in order to assess code quality:

JDepend, PMD, FindBugs, JavaNCSS, Cobertura, Sonar, Coverity, …

So, nothing really exotic/fancy and most of them are ‘classical’ static code analysis tools.

So, doing my ‘Yoda’ show, I encouraged the team to use them as early as possible in their development process, that is, in the IDE.

I plugged them into their pom file (the Maven configuration file), updated my Hudson installation to get the trends, …

AND … you know what … 6 tools, 16 versions needed, because of incompatibilities between Maven, Hudson, PMD, …

1 tool, 4 plugins needed … most of the time just to parse an XML file, or to draw a simple curve from linear data …

What a mess! Where is the simplicity? The ease of use? Why couple a CI tool with build-specific plugins??? Why mix build tools with source control tools???

Why not keep the principle of loose coupling between build, CI and static code analysis tools????

Writing code is important, but application design should not be forgotten.

2) The second point is a comment I often hear: it’s expensive to fix issues …

How can someone justify that fixing an issue at development time (5 to 10 minutes) is more expensive than discovering the issue in production (something like $1 a minute?), which requires development again, some QA, a deployment …

Of course, the bug may never be seen in production … but for how long? Don’t forget Murphy’s law: if you can have a problem, you’ll have it!


YDN

December 5, 2008

Yesterday was the first YDN event in Grenoble (a little map? :-D ).

For those who don’t know, Yahoo opens its APIs (Flickr, Search, …) to developers for free (yes, really), giving everyone the chance to use them to build a small app or to include something fun on their web site.

3 evangelists were there to present the new APIs and the new gadgets.

The evening was mainly focused on mobile apps, but we were also treated to a quick presentation of the full range of possibilities!

It’s impressive!!!!

And I think the guests enjoyed it …


Mysql over time

December 3, 2008

After my last post on MySQL quality, I was wondering (and a lot of the comments I got confirm that it was a good question .. :-) ) whether the errors were due to release 5.1, or whether they were historical bugs.
So I ran the same tools, with the same compilation options, and compared versions 4.1, 5.0 and 5.1 of MySQL. Each time I took the latest source version available.

The raw results are :

MySQL 4.0: 4.8k errors
among them:
FORWARD_NULL: 59
RESOURCE_LEAK: 32

MySQL 5.0: 2.4k errors
among them:
FORWARD_NULL: 70
RESOURCE_LEAK: 39

MySQL 5.1: 2.5k errors
among them:
FORWARD_NULL: 77
RESOURCE_LEAK: 45

After a quick look, we can see that the types of errors change over time. I think it’s due to the increasing maturity of the developers (they make fewer ‘simple’ mistakes).

There was a big cleanup between the 4.x and 5.0.x versions. But since the beginning of version 5, a lot of code has been added, introducing new bugs (nearly 5% more), without any stabilization work.
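
For readers who want to redo the arithmetic, here is a small sketch computing the relative change between the 5.0 and 5.1 figures quoted above. The function name `growth_percent` is mine, and the totals are the rounded ‘k’ values from the post, so the result is only indicative.

```python
def growth_percent(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the post for MySQL 5.0 and 5.1 (totals are rounded).
counts_50 = {"total": 2400, "FORWARD_NULL": 70, "RESOURCE_LEAK": 39}
counts_51 = {"total": 2500, "FORWARD_NULL": 77, "RESOURCE_LEAK": 45}

for name in counts_50:
    change = growth_percent(counts_50[name], counts_51[name])
    print(f"{name}: {change:+.1f}%")
```

With the rounded totals the overall increase comes out a little above 4%, while the two categories listed grow faster than the total.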


How can Static Code Analysis tools help you ?

December 1, 2008

I love Static Code Analysis tools (SCA) because they are easy to use, easy to run, and most of the time very valuable.

Of course, there is a learning step to get to know each family of tools (syntax checkers, tools ensuring rules compliance, tools finding bugs, …) and to know which one to use and when. But as soon as you have this knowledge, you are very efficient, and you can use them, for example, during code review (even on a large project) to get a good idea of the issues (which are not easy to discover manually) or of the origin of a problem. For example, you may have scalability issues on a project, but the root cause may be completely different from one project to another; the tools will help you spot the origin.

So, I’m currently playing with several ‘bug finder’ tools, some commercial and some open source, on languages like Java/C/C++/C#, … The results, ‘after some analysis’, give a good overview of a project’s quality.

As I am currently trying to explain to a friend of mine how to do Continuous Integration on top of MySQL (patches, plugins, specific hooks, …), I’m trying to plug some of these tools into his MySQL CI line.

And … surprise … more than 2000 potential bugs in the MySQL source code.

A lot of the errors are due to memory handling and synchronization locks; ‘some’ functions return null and the result of the call is used without any test; there are some errors with static/non-static fields, …

That’s impressive … it has been a long time since I’ve seen so many errors per line of code. Also, the errors are very unevenly distributed in the code … (side effect of open source?)

In the following table I give some metrics:

Module Name          Number of errors

client code          134
cmd-line-utils       80
core                 1034
example              6
libmysql             211
mysys                61
server-tools         34
storage/archive      38
storage/blackhole    1
storage/csv          9
storage/federated    5
storage/heap         8
storage/innobase     262
storage/myisam       147
storage/ndb          785
system               0

From my experience, I can say that 85% of the time there is a real bug when the tool raises a warning.
Sometimes it took time to find, but the tool was right :-)
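
As a quick way to read the table, the sketch below sums the per-module counts, ranks the modules, and applies the 85% rule of thumb mentioned above. That rate is only my personal experience, not a measured value for MySQL, so the estimate is indicative at best; the variable names are mine.

```python
# Per-module warning counts, copied from the table above.
errors_by_module = {
    "client code": 134, "cmd-line-utils": 80, "core": 1034, "example": 6,
    "libmysql": 211, "mysys": 61, "server-tools": 34, "storage/archive": 38,
    "storage/blackhole": 1, "storage/csv": 9, "storage/federated": 5,
    "storage/heap": 8, "storage/innobase": 262, "storage/myisam": 147,
    "storage/ndb": 785, "system": 0,
}

# Assumption: rule of thumb from experience, not a measured rate.
REAL_BUG_RATE = 0.85

total = sum(errors_by_module.values())
ranked = sorted(errors_by_module.items(), key=lambda item: item[1], reverse=True)

print(f"total warnings: {total}")
for module, count in ranked[:3]:
    print(f"{module}: {count} ({count / total:.0%} of all warnings)")
print(f"estimated real bugs: {round(total * REAL_BUG_RATE)}")
```

Ranking the modules this way makes it obvious where a first cleanup effort would pay off the most.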

Some examples extracted from the MyISAM storage engine:

* storage/myisam/mi_check.c

Return code not checked: everywhere else the return code is checked and an error is raised … so my checker assumes the return code is critical. So why is there no check at this line???

=> line 1185 : i_pack_get_block_info(info, &info->bit_buff, &block_info, &info->rec_buff, file, filepos)

* storage/myisam/mi_key.c

NPE:
--> line 252 : char_length= (!is_ft && cs && cs->mbmaxlen > 1) ? length/cs->mbmaxlen : length;
// the test on cs implies that cs may be null
--> line 268
FIX_LENGTH(cs, pos, length, char_length); // which dereferences cs without any check …

* storage/myisam/mi_rkey.c

Lock error:
--> line 78 : rw_rdlock(&share->key_root_lock[inx]); // take a lock
if (!(nextflag & (SEARCH_FIND | SEARCH_NO_FIND | SEARCH_LAST))) use_key_length=USE_WHOLE_KEY;
……
if (rtree_find_first(info,inx,key_buff,use_key_length,nextflag) line 146
DBUG_ENTER("table2myisam");
if (!(my_multi_malloc(MYF(MY_WME, …..) // the allocation is not stored … and never freed
DBUG_RETURN(HA_ERR_OUT_OF_MEM);


Why is Continuous Integration sometimes complex ?

November 14, 2008

One question I often get is: is CI applicable to my small (or big) project?

The answer is yes! The complexity of the setup and the benefit of CI do vary, of course, but not with the size of your project. They depend on the nature of what you integrate (library / application) and on the deployment constraints you have on what you integrate: one OS, several OSes, several versions, …

In this post, I’ll try to give some thoughts on the topic.

First of all, let’s start simple: the goal of CI is to build high-quality, reliable packages from a set of source files, possibly edited by several people. This is shown in Figure 1.

Figure 1 : CI Goal

gif_2

The full process of CI can be summarized as (illustrated in Figure 2):

  • Editing of the source files (in an IDE or not, but … I’ll show in a future post that using an IDE can enhance/speed up the development process)
  • Local validation of the quality of the package: compilation / test / analysis with Static Code Analysis tools / packaging …
  • Sharing of the files through a revision control tool (CVS, SVN, P4, …)
  • Remote validation of the package (nightly build, black box build): remote compilation / test / analysis / packaging …
  • History of changes and notification to the team if there are issues (CI … can be integrated with the previous point)
  • Distribution of the package to be used by the team / external teams / …

Figure 2 : CI Process

gif_1
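
The validation steps of the process above can be sketched as a tiny pipeline that runs stages in order, stops at the first failure and notifies the team. This is only an illustration: `run_pipeline` and the stage names are hypothetical, and the lambdas stand in for real calls to a compiler, a test runner or an analysis tool.

```python
def run_pipeline(stages, notify):
    """Run (name, action) stages in order; stop and notify at the first failure."""
    for name, action in stages:
        if not action():
            notify(f"CI failed at stage: {name}")
            return False
    notify("CI succeeded")
    return True


# Placeholder actions: a real setup would call the compiler, test runner, etc.
stages = [
    ("compile", lambda: True),
    ("test", lambda: True),
    ("static analysis", lambda: False),  # simulate a failing analysis
    ("package", lambda: True),
]

messages = []
result = run_pipeline(stages, messages.append)
print(result, messages)
```

The same function can serve for the local validation and for the remote one; only the stage list and the notification channel change.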

Now, imagine we are working on two blocks or libraries, A and B. Two strategies can be put in place. The first one considers the two libraries independent (even if B uses A) from a life cycle point of view (B uses a stable version of A, for example); the second one introduces a life cycle coupling between the two libraries (this is illustrated in Figure 3). The choice between solution 1 and 2 depends mainly on how both libs are used, and on their release cycles.

Figure 3 : Integration of several blocks or libraries

gif_5

Let’s describe the situation where we consider A and B independent. In this case we simply duplicate the previous framework without any difficulty.

Figure 4 : A and B independent

gif_3

The second case is more powerful: it means we can do more fancy things with CI. Of course, we need to validate A and B independently (B can use the previous build of A), AND (the power of CI comes from this AND :) ) we can validate the integration of A and B. You can imagine complex and fancy mechanisms to validate this integration, but a simple approach is to rebuild B with the new A, and to run B’s tests again. So, in a first step you validate your new version of B and your new version of A, and afterwards you validate their integration. But be warned: by doing that you create a release constraint between A and B. Indeed, if you deliver B with an old version of A, your integration validation is useless. You need to validate your library with the libraries you deliver it with.

Figure 5 : Dependency among A and B

gif_4
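
The ‘rebuild B with the new A’ idea generalizes to: when a block changes, revalidate it and everything that depends on it. Here is a minimal sketch of that lookup; the function name `blocks_to_revalidate` and the `depends_on` mapping are assumptions for illustration, not the API of any CI server.

```python
from collections import deque


def blocks_to_revalidate(changed, depends_on):
    """Return the changed block plus every block depending on it, directly or not."""
    # Invert the 'B depends on A' mapping into 'A is used by B'.
    used_by = {}
    for block, dependencies in depends_on.items():
        for dependency in dependencies:
            used_by.setdefault(dependency, []).append(block)

    found = [changed]
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                found.append(dependent)
                queue.append(dependent)
    return found


# B uses A, as in the text.
depends_on = {"A": [], "B": ["A"]}
print(blocks_to_revalidate("A", depends_on))
print(blocks_to_revalidate("B", depends_on))
```

A change in A triggers A and B, while a change in B only triggers B. Note that for more complex graphs the returned list says which blocks to revalidate, not necessarily in which order to build them.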

The big question now is: does it scale to a large project or not? And a similar question: do we apply the same strategy to the whole project?

There is no magic solution; it depends, mainly on how your blocks are organized. Having several blocks does not cost too much time, as you compile each block against the previous version of the blocks it depends on (and you can do it in parallel); you then need to build the whole thing again (or just redo the link step, for C/C++ libs for example). You also need to validate this big block. Unit tests are not enough: you should also run integration tests, or smoke tests (a subset of the integration tests that can be run quickly to guarantee rapid CI feedback) (note: you can run the whole integration test suite during the night, or on Saturday/Sunday). This is shown in Figure 6.

But in order to reduce release cycle constraints, I encourage you to apply this scenario to each application, and to avoid linking applications together.

For example, if you have X libraries and 5 applications (Front End, WebService1, WebService2, Database, and BackOffice), most of the time you’ll have:

  • Core components for your project (Perl module, RPM, …)
  • Front End
  • WebService Core: by isolating this block, you let your core block live without any constraint from WS1 and WS2, and you make the release of WS1 independent from that of WS2.
  • WebService1
  • WebService2
  • Database
  • BackOffice

Figure 6 : Large Project

gif_6

So, if CI scales without any problems (you just need enough servers to compile/test/validate packages in parallel to be fast enough :-D ) (for example, Hudson provides a remote compilation module), where do the difficulties come from?

They come mainly from two origins :

  • your deployment constraints
  • your application

Let me explain: if you need to deploy your application on 4 operating systems (in this case all Linux based, in both 32 / 64 bits), you need to compile (or at least emulate/simulate this compilation) for the 4 OSes, and you need to test your application on the 4 OSes to avoid OS-specific issues.

(You’ll have the same issues if you need to deploy on several versions of Apache, MySQL, Tomcat, …)

Figure 7 :

gif_7
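
The deployment constraints above amount to a build matrix: one build-and-test job per combination of targets. A minimal sketch follows, assuming two placeholder distributions each in 32 and 64 bits (the names `distro-a`, `distro-b` and the function `build_matrix` are hypothetical).

```python
from itertools import product


def build_matrix(axes):
    """Return one dict per combination of the given axes (name -> list of values)."""
    names = list(axes)
    return [
        dict(zip(names, values))
        for values in product(*(axes[name] for name in names))
    ]


# Hypothetical targets: two Linux distributions, each in 32 and 64 bits.
axes = {
    "os": ["distro-a", "distro-b"],
    "arch": ["32bit", "64bit"],
}
targets = build_matrix(axes)
print(len(targets))
for target in targets:
    print(target)
```

The number of jobs is the product of the axis sizes, so adding one more axis (two server versions, say) doubles the work. That is exactly where the complexity of CI comes from.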

The second source of complexity comes from your application. For example, if you provide an Internet application, you’ll need to validate (test) your application with several client configurations (browser / OS / version …). That’s feasible: several frameworks allow you to do that (Selenium, for example), they are scalable (Selenium Grid), and techniques like virtualisation allow you to do it efficiently/easily.

Figure 8 : Several application Client

gif_8

So, don’t be afraid, think before implementing your CI, and GO !

In a future post I’ll give you team practices to follow to do even better !

Good luck !


Agility (Scrum) with Students

October 17, 2008

Last time I had 3 students, we organized the work with Scrum, or rather an adapted Scrum.

Why Scrum :

  1. It was good training for me as Scrum Master
  2. It was good training for them as students :)
  3. It was a good exercise in planning tasks and estimating task durations, and the immediate feedback (every day, as all tasks are small) and the weekly discussion were very valuable for them.
  4. It’s a good way to ensure they don’t go too far on a topic, and stay focused on the priorities.
  5. It’s a good way to ensure communication among team members

Why adapted :

  1. I didn’t let them choose their tasks. The three students had different objectives, so most of the time a task was designed for one student … (yes … I was product owner at the same time … poor students)
  2. Some generic tasks were shared (or could be taken by anyone) … but there were not many of them … They also had a lot to learn, and the learning depends on the initial skills … so it was difficult to define tasks that could be taken by all of them.
  3. For the first two sprints I did the duration estimates myself, and tried to make them understand what I call ‘task finished’ (i.e. code written, tests written, documentation written, integration validated …).
  4. From the third sprint I let them estimate (define the task durations) … lots of underestimation, but within two sprints that was corrected.
  5. Also, I assisted them in splitting features into tasks during 6/7 sprints, and they started to do it fully on their own afterwards. I did that in order to avoid too many new things at the beginning: they also had to understand the company and the tools, and to learn to work in a team.

Feedback: a great experience according to them, completely different from what they learn at school, but very challenging.

Other feedback: with Scrum you have to work all the time! :-D Did they expect to sleep at work?????


Yahoo Tech Pulse

October 3, 2008

I was one of the lucky guys who could go to the Yahoo Tech Pulse. Hundreds of people showing their great technology. This event was for Yahoos only, so I cannot say too much about the technologies I saw, just that it was great. I think you’ll soon see some of this on YDN (Yahoo Developer Network), and you should be able to use it soon.
For pictures, just have a look on Flickr and search for Tech Pulse!

Enjoy !


How to search the Internet well

September 18, 2008

A classic search:

You feel like treating yourself to the latest Archos. How do you go about it?

archos

Several solutions are available to you: Google, Yahoo, Orange, Kelkoo, … But which one to choose, and in which case? We will see later that searching in order to buy and searching in order to get information are not necessarily done the same way.

First, we will show the result of a first naive search:

  • your friend Google, a general-purpose search engine

You can note that:

  1. The first results are sponsored
  2. The results were chosen among several million possibilities
  3. There are ‘commercial’ results, but also informative ones, for example Wikipedia. Wikipedia is a community knowledge site where Internet users share their knowledge. However, there is no scientific validation of the content.

  • your friend Yahoo, a general-purpose search engine

You can note that:

  1. The first results are sponsored
  2. The results were chosen among several million possibilities
  3. There are ‘commercial’ results, but also informative ones

  • your friend Kelkoo, a search engine specialized in shopping

You can note:

  • a list of models is presented to you
  • a price range
  • a list of models and merchants
  • the ability to sort the results by price/popularity/…

  • your friend Orange, a general-purpose search engine

You can notice:

  1. ‘in-house’ sponsored links
  2. access to ‘in-house’ services
  3. commercial results, but also informative ones
  4. sorted results, but among fewer possibilities than Google/Yahoo.

First conclusion: if you want to buy a well-known product, every engine will quickly give you access to the information you are looking for. On the other hand, if you don’t know exactly what you are looking for (exact product, price range, retailer, …), a specialized engine or a guide will offer you more possibilities.

You may also notice that certain vocabulary words keep coming up during your searches: player/lecteur/encodeur/…

To eliminate the ‘noise’ (results that are not relevant to your search), you can refine your search and look for, for example, ‘lecteur mpeg4’ or ‘lecteur mpeg 4’ (‘MPEG-4 player’).

But how does it work?


Agility by example

September 15, 2008

I found this article today thanks to Alex (Agilex): Team practices

(in French … sorry), but it’s definitely one of the best examples of agility in practice!

A MUST READ!