benad is currently certified at Apprentice level.

Name: Benoit Nadeau
Member since: 2002-04-03 17:23:09
Last Login: 2014-04-22 17:48:00


Homepage: http://benad.me/

Notes:

I am Benoit Nadeau, jr. eng. in Software Engineering,
living and working in Montreal, Canada.

Projects

Recent blog entries by benad

Syndication: RSS 2.0

Modern MVC in the Browser

A few months ago, I started learning the AngularJS framework for web development. It is quite a clever framework that lets you "augment" your HTML with markup that maps placeholders to elements in your object model. It reminded me of JSP and the markup-based Swing frameworks I used a decade ago. Since then, though, I realized that this template-based approach has quite a few limitations that bother me. First, since it is based on templates, it doesn't handle well highly dynamic elements that generate markup based on context. Second, complex, many-to-many mappings between the view and the model are difficult to express. For example, it is difficult to express a value placeholder that is the sum of multiple elements in the model without resorting to calling a custom function, or to have a change in the view trigger the corresponding changes in multiple locations in the model without, again, resorting to a custom function. Third, it is a framework that needs to keep track of the model/view mapping, so it is all-encompassing and heavy on configuration.
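To illustrate that second point, here is a minimal AngularJS 1.x sketch (the "cart" controller and its items are made up for illustration): the derived total has to live behind a $scope function that the template calls, rather than being expressed declaratively in the markup.

    // Minimal AngularJS 1.x sketch (hypothetical "cart" app) showing that a
    // derived value such as a sum ends up as a $scope function rather than
    // pure declarative markup.
    angular.module('cart', []).controller('CartCtrl', function($scope) {
      $scope.items = [
        { name: 'apple', price: 2 },
        { name: 'bread', price: 3 }
      ];
      // The template has to call this, e.g. <span>{{ total() }}</span>
      $scope.total = function() {
        return $scope.items.reduce(function(sum, item) {
          return sum + item.price;
        }, 0);
      };
    });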

So, let's break down the problem and look individually at the view, model and controller as separate modules, and see if there could be a better, more "modern" approach.

The biggest problem with the "model" in JavaScript is that it lacks a safe way to hide properties behind functions, as is done natively with C# properties, for example, or manually through "getter" and "setter" methods in JavaBeans. Because of that, it becomes difficult to take any normal object and add a layer on top of it that would implement an observer pattern automatically. Instead, you can use something like the model implementation of Backbone.js, which fully hides the model behind a get/set interface that can automatically trigger change events for listener objects. It may not be elegant, but it works well.
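A minimal sketch of that get/set/observe pattern with Backbone.js (the "Account" model and its "balance" attribute are made up for illustration):

    // The model hides its attributes behind get()/set(), so set() can
    // notify any registered listeners of the change.
    var Account = Backbone.Model.extend({
      defaults: { balance: 0 }
    });

    var account = new Account();

    // Listeners are notified whenever the attribute changes through set().
    account.on('change:balance', function(model, value) {
      console.log('balance is now ' + value);
    });

    account.set('balance', account.get('balance') + 100);  // triggers the event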

For the "view", HTML development doesn't support well the concept of a "custom control" or "custom widget" in other GUIs. The view is mostly static, and JavaScript can change the DOM to some extent, and at high cost. In contrast, other GUI systems are based around rendering controls ("GUI controls", not to be confused with the controller in MVC) on a canvas, and the compositing engine takes care of rendering on screen only the visible and changed elements. A major advantage of that approach is that each control can render in any way it pleases without having to be aware of the lower-level rendering engine, be it pixels, vectors or HTML. You could simulate a full refresh of the DOM each time something (model or controller) changes the GUI in HTML, but the performance would be abysmal. Hence, the React library from Facebook and Instagram, which supports render-based custom controls but using a "virtual DOM", akin to "bitmasks" in traditional GUIs, so that only effective DOM changes are applied.

Finally, for the "controller", the biggest issue is how to have both the model and the view update each other automatically, on top of the usual business logic in the controller, without creating cyclic loops. React's approach, named Flux, is a design pattern where you always let events flow from the model to the view, and never (directly) in the other direction. My gripe with that is that it is merely a design pattern that cannot be enforced. This reminds me of the dangers of using the WPF threading model as a means to avoid concurrency issues: Forget to use the event dispatcher a single time, and you will create highly difficult to debug crashes. A new approach, called functional reactive programming is kind of like what dependency injection did for module integration, but for events. Essentially it is a functional way of manipulating channels of events between producers and consumers outside of the code of each producer and consumer. This may sound quite an overhead compared to the inline and prescriptive approach of Flux, but as soon as you build a web page heavy on asynchronous callbacks and events coming from outside the page, having all of that "glue" in a single location is a great benefit. An implementation of FRP for JavaScript, Bacon.js, has a great example of how FRP greatly reduces those countless nested callbacks that are commonplace in web and Node development.

Combined, Backbone.js, React and Bacon.js offer a compelling alternative to control- and template-heavy browser MVC frameworks, and at minimum prevent you from being "locked in" to a complex and difficult-to-replace framework.

Syndicated 2014-11-26 01:44:49 from Benad's Blog

A Tale of Two Shells

Last year, I completed "properly" learning the bash ("Bash"?) shell, using a combination of the book "Learning the bash Shell" and reading from start to finish the gigantic "man" page. And that was enough to convince me that, regardless of its ubiquity, I don't like it much, be it as a scripting language or as a command-line shell.

Having already learned tcsh, which used to be the default shell on Mac OS X and is still popular, I was ready to try out more modern shells rather than ones stuck in the '80s.

First, on Windows, the DOS-style command prompt is quite archaic and annoying. Simple things like sleeping for a second require unintuitive workarounds. Batch scripting is painfully difficult, so I was eager to find something better. And since Windows 7, this strange "PowerShell" is installed by default, heralded as a revolutionary step in command-line shells. Is it?

After reading a few tutorials, including this "free ebook" on powershell.com, things became clear. Windows PowerShell is essentially a shell built atop .NET that manipulates streams of objects rather than plain text across pipes, though thankfully it formats the objects as plain text on the console screen by default. It provides many "cmdlets" that can manipulate that stream in a more convenient way than grep and awk. For example, listing all files in a directory greater than a megabyte is trivial in PowerShell, while in a UNIX shell it requires an awkward (pun intended) combination of extracting character positions and arithmetic. The "PowerShell ISE", also provided with PowerShell, can even perform tab completion on the fields of the objects coming out of the current pipe, making it easy to extract their attributes. Because so much of the Windows internals are accessible using COM and .NET, it is easy to perform system administration with it, for example installing system services and querying their status. The only major issue I've found with PowerShell is that executing PowerShell scripts is completely disabled by default until unlocked by an administrator. Also, as a minor gripe, the version of PowerShell that comes out of the box with Windows 7 is quite outdated. Overall, if there ever was a "grand vision" of ".NET" in Windows, it is best represented by PowerShell.

Back on UNIX-like systems, it seems like users are quite happy with old shells, or at least with incremental evolutions of the old "Bourne shell" and Berkeley's "C Shell". Looking around, I found "fish", the "Friendly Interactive SHell". I liked its ironic tagline of "Finally, a command line shell for the 90s", since it was initially released in 2005. It is, indeed, friendly: it has a deliberately limited set of features, and its out-of-the-box defaults make interactive use enjoyable. It was built around a comprehensive design document that explicitly favours usability over compatibility with older or popular shells. The results are spectacular: everything has colour; TAB and arrow-key completion shows a type-ahead preview in light grey; inline argument completion for most commands (including "man") interactively presents all the options and their meanings, automatically extracted from the "man" pages; the configuration can be edited through a built-in web service; configuration changes apply to all running shells instantly; and I could go on. Its scripting is quite limited, but that may be a good thing considering the Shellshock bug (not that "fish" has no security holes, but at least they're not "as designed").

Personally, I am ready to move to both PowerShell and "fish" for day-to-day use. While neither has much in common with older shells, they are far more usable. I highly recommend them to all command-line users.

Syndicated 2014-10-28 01:37:41 from Benad's Blog

Moniker, the Security Weak Point

By the time I heard about the "Shellshock" security hole on the morning of September 25, the small Debian Linux server that I set up at Ramnode to host my web site had already patched that security hole by itself. At any rate, my web site hosts only static pages, so it was never impacted.

While I am in control of the security of my web site, from the Linux kernel to the web server up to the web pages it serves, I'm still dependent on its hosting service (Ramnode) being secure. Another potential danger would be for someone to hijack the "benad.me" domain name and make it point to a version of the site filled with malware and viruses. Sadly, this almost happened.

Back in 2008 I registered my domain through the registrar Moniker, which used to be recommended partly based on its security. They offered additional paid features to "lock" the domain and prevent unauthorized transfers by someone who had stolen your user name and password. Since then, though, the company was bought by another company, and what is now called Moniker is Moniker in name only, both in terms of staffing and software.

I did notice a difference in tone in email communications from the new Moniker. They seemed to be highly focused on domain name auctions, and would automatically auction off expired domains. This felt like a conflict of interest, as Moniker would derive higher profit from auctioning off your domain than from helping you renew it. Of course, they would never do that to valuable customers who do "domain speculation" and own a large number of (unused) domains, but it still raises the suspicion that the company was sold based on the number of domains it held and how much money could be extracted from large speculators, rather than on providing valuable customer service.

The "new" Moniker had a security hole in 2013, and to fix that Moniker forced users to change their password the next time they logged in. Note though that this happened with the old version of that web site. This summer, the parent company that bought Moniker (and its name) scrapped the old site's code and replaced it with a new broken, buggy interface. The new interface also brought with it worse security, and made the domain locking feature completely ineffective.

By early October, Moniker had sent an email to all its users saying that, for unspecified security reasons, all account passwords would be reset. The shock was that the email contained both the user names and passwords of all of the user's accounts. I was shocked that my old Moniker account, identified by a standard-looking user name, was placed under a parent, numerically-identified user name I had never seen, along with another numerical sub-account created without my knowledge. It should be noted that I could never access the numerical sub-account, even when using the password provided in the email. Also, the email said that new passwords must meet security requirements, including the use of at least one "special character", even though the passwords provided in the email didn't contain any special character, and when attempting to change passwords, the site refused most special characters.

OK, I'm not a security expert, but sending user names and passwords in an email, refusing special characters (which would indicate that they don't use bcrypt), and resetting the passwords of all users may indicate that they were hacked. Badly. Moniker cited the Shellshock bug, but as reports of stolen domains started to appear, a user came forth saying that the security hole predated Shellshock by a month.

So, I am convinced that Moniker has a pattern of not taking security seriously, and will keep it up until they experience a mass exodus of their customers. I started the process of domain name transfer the day after they announced the password reset, and I would recommend everybody else do the same. I transferred to Namecheap. Despite its name, in my case the price was the same, though as a test I created a new empty account before the transfer, and already I can attest that they take security seriously, including emails for account activity (using secondary email addresses in rotation) and 2-factor authentication (using SMS for now). I completed the transfer yesterday, which would explain the little bit of downtime when resolving my domain name.

Syndicated 2014-10-15 01:21:21 from Benad's Blog

Eventual Consistency, Squared

Looking back at my article "The Syncing Problem", implementing a generic DVCS seems like a relatively straightforward solution. Actually, if the "data to sync" were simplified to plain text, existing DVCSes like git or Mercurial might be sufficient. But there is a fundamental problem I glossed over that has huge ramifications on the design of the DVCS, and that makes existing DVCS implementations dangerous to use.

In modern "Internet-connected" appliances, there are two storage solutions: On-device, and "in the cloud". It is the "cloud" storage that is going to be used for the devices to communicate to each other indirectly when performing data synchronization. There is though a huge behavioural difference between on-device and cloud storage: The cloud storage is "eventually consistent". Beneath its API, the cloud storage itself may also be distributed across machines, and modified data can take a little while to propagate to other machines. Essentially, if you upload a file from one device, it may take a little while for another device to see the change.

Sadly, whatever conflict resolution a cloud storage provider offers is unreliable, because its behaviour is either undocumented or inconsistent. Locking files on such storage may not be possible either. Worse, internal synchronization issues at the cloud provider may make its propagation speed so inconsistent as to make it unreliable as a means to quickly communicate information between devices. Almost all VCSes (distributed or not) assume a reliable storage area for the version repository; hosted VCS servers guarantee ACID transactions. No DVCS was made to push revision information to unreliable storage and use that as the primary means of exchanging information.

The easiest solution for this is to design a DVCS that supports "write-only" (append-only) repositories. If the storage key (file name) contains the checksum of the data it holds, then multiple clients can safely write changes to the same shared storage area. Even if listing the available "data blocks" on the shared storage is inconsistent, all a listing can do is augment the "knowledge" of what exists in the repository, compared against the repository held on local storage. The atomicity of the storage blocks should be as close as possible to the atomicity of a transactional version control delta, especially since the cloud storage may make information appear on other devices out of the order in which it was written. That could make those "patch files" larger than in a well-optimized VCS, but on cloud storage we may not have any other option.
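A minimal sketch of the "checksum as storage key" idea in JavaScript, assuming a hypothetical cloud API exposing putBlock(key, data) and listBlocks(), both returning promises:

    var crypto = require('crypto');

    function blockKey(patchData) {
      // The key is derived from the content, so two devices writing the same
      // patch produce the same key, and nothing is ever overwritten.
      return crypto.createHash('sha256').update(patchData).digest('hex');
    }

    function publishPatch(cloud, patchData) {
      var key = blockKey(patchData);
      // Write-only: we only ever add new blocks under new keys.
      return cloud.putBlock(key, patchData).then(function() { return key; });
    }

    function discoverPatches(cloud, localKeys) {
      // An eventually consistent listing can only ever add to our knowledge
      // of what exists in the shared repository.
      return cloud.listBlocks().then(function(keys) {
        return keys.filter(function(k) { return localKeys.indexOf(k) < 0; });
      });
    }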

Sure, a write-only repository may be a big issue if the files under version control are too large or if storage is limited, but then most VCSes tend to avoid deleting historical data, and when they do support "cleaning up a repository", the solutions are clumsy and error-prone. In our case, if a device prematurely deletes older historical data in the shared storage, unaware that other devices were synced at older versions and may branch from there, then this would be tantamount to using the shared storage to host only the latest version and nothing else. All this to say, deleting historical data in a shared, eventually-consistent storage is a difficult problem that may involve a lot of tuning based on how long devices can stay unsynchronized before being considered "lost", compared to how quickly the cloud storage is expected to become consistent.

Syndicated 2014-10-04 15:05:19 from Benad's Blog

Client-Side JavaScript Modules

I dislike the JavaScript programming language. I despise Node.js for server-side programming, and people with far more experience than me with both also agree, for example Ted Dziuba in 2011 and more recently Eric Jiang. And while I can easily avoid using Node as a server-side solution, the same cannot be said about avoiding JavaScript altogether.

I recently discovered Atom, a text editor based on a custom build of WebKit, essentially running on HTML, CSS and JavaScript. Though it is far from the fastest text editor out there, it feels like a spiritual successor to my favourite editor, jEdit, but based on modern web technologies rather than Java. The net effect is that Atom seems to be the fastest-growing text editor, and its deep integration with Git (it was made by GitHub) makes it a breeze to change code.

I noticed a few interesting things that were used to make JavaScript more tolerable in Atom. First, it supports CoffeeScript. Second, it uses Node-like modules.

CoffeeScript was a huge discovery for me. It is essentially a programming language that compiles into JavaScript, and it makes JavaScript development more bearable. The difference in syntax reminds me a bit of the difference between Java and Groovy. There's also the very interesting JavaScript-to-CoffeeScript converter called js2coffee. I used js2coffee on one of my JavaScript modules, and the result was far more readable and manageable.

The problem with CoffeeScript is that you need to integrate its compilation to JavaScript somewhere. It just so happens that its compiler is a command-line JavaScript tool made for Node. A JavaScript equivalent to Makefiles (actually, more like Maven) is called Grunt, and from it you can call the CoffeeScript compiler directly, on top of UglifyJS to make the generated output smaller. All of these tools exist under node_modules/.bin when installed locally using npm, the Node Package Manager.
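A minimal Gruntfile sketch for that setup, assuming the grunt-contrib-coffee and grunt-contrib-uglify plugins are installed locally via npm; the file names are made up.

    // Gruntfile.js: compile CoffeeScript, then minify the output.
    module.exports = function(grunt) {
      grunt.initConfig({
        coffee: {
          compile: {
            files: { 'build/app.js': 'src/app.coffee' }
          }
        },
        uglify: {
          dist: {
            files: { 'build/app.min.js': ['build/app.js'] }
          }
        }
      });

      grunt.loadNpmTasks('grunt-contrib-coffee');
      grunt.loadNpmTasks('grunt-contrib-uglify');
      grunt.registerTask('default', ['coffee', 'uglify']);
    };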

Also, by writing my module as a Node module (actually, CommonJS), I could use some dependency management and still deploy it to a web browser's environment using Browserify. I could even go further and integrate it with Jasmine for unit tests, and run them in a GUI-less full-stack browser like PhantomJS, but that's going too far for now, and you're better off reading the Browserify article by Bastian Krol for more information.
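A tiny sketch of what that looks like; the module and file names are made up, and the bundle is produced with the browserify command-line tool.

    // greeter.js: an ordinary CommonJS module.
    var greet = function(name) {
      return 'Hello, ' + name + '!';
    };
    module.exports = { greet: greet };

    // main.js: requires the module exactly as in Node...
    var greeter = require('./greeter');
    document.title = greeter.greet('browser');

    // ...then "browserify main.js -o bundle.js" produces a single file that
    // can be loaded in the page with a plain <script> tag.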

It remains that Browserify is kind of a hack that isn't ideal for running JavaScript modules in a browser, as it has to bundle browser equivalents of functionality that is unique to Node, and it isn't optimized for high-latency asynchronous loading. A better solution for browser-side JavaScript modules is RequireJS, using the AMD module format. While not all Node modules have an AMD equivalent, the major ones are easily accessible with bower. Interestingly, you can create a module that works as AMD, as a Node module and natively in a browser using the templates called UMD (as in "Universal Module Definition"). Also, RequireJS can load Node modules (those that don't use Node-specific functionality) and any other JavaScript library made for browsers, so that you gain asynchronous loading.
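For comparison, the same made-up greeter module as an AMD definition that RequireJS can load asynchronously (the file name "greeter.js" is assumed):

    // greeter.js: an AMD module with no dependencies.
    define([], function() {
      function greet(name) {
        return 'Hello, ' + name + '!';
      }
      return { greet: greet };
    });

    // Elsewhere, require() fetches the module (and any dependencies) on demand:
    require(['greeter'], function(greeter) {
      document.title = greeter.greet('AMD');
    });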

It should be noted that bower, grunt and many command-line JavaScript tools are made for Node and installed locally using npm. So, even if "Node as a JavaScript web server" fails (and it should), using Node as an environment for local JavaScript command-line tools works quite well and could have a great future.

After all is said and done, I now have something that is kind of like Maven, but for JavaScript, using Grunt, RequireJS, bower and Jasmine to download, compile (CoffeeScript), inject and optimize JavaScript modules for deployment. Or you can use something like CodeKit if you prefer a nice GUI. Either way, JavaScript development, be it for client-side software like Atom, for command-line scripts or for the browser, is finally starting to feel reasonable.

Syndicated 2014-08-15 03:14:41 from Benad's Blog

107 older entries...

 

benad certified others as follows:

  • benad certified benad as Journeyer
  • benad certified llasram as Journeyer
  • benad certified shlomif as Journeyer

Others have certified benad as follows:

  • benad certified benad as Journeyer
  • llasram certified benad as Journeyer
  • pasky certified benad as Apprentice


