Recent blog entries

26 Aug 2016 mjg59   » (Master)

Priorities in security

I read this tweet a couple of weeks ago:

to me, an inclusive security community would focus as much (or at all) on surveillance of women by abusive partners as it does the state

— kelsey ᕕ( ᐛ )ᕗ (@_K_E_L_S_E_Y) August 2, 2016

and it got me thinking. Security research is often derided as unnecessary stunt hacking, proving insecurity in things that are sufficiently niche or in ways that involve sufficient effort that the realistic probability of any individual being targeted is near zero. Fixing these issues is basically defending you against nation states (who (a) probably don't care, and (b) will probably just find some other way) and, uh, security researchers (who (a) probably don't care, and (b) see (a)).

Unfortunately, this may be insufficient. As basically anyone who's spent any time anywhere near the security industry will testify, many security researchers are not the nicest people. Some of them will end up as abusive partners, and they'll have both the ability and desire to keep track of their partners and ex-partners. As designers and implementers, we owe it to these people to make software as secure as we can rather than assuming that a certain level of adversary is unstoppable. "Can a state-level actor break this" may be something we can legitimately write off. "Can a security expert continue reading their ex-partner's email" shouldn't be.


Syndicated 2016-08-26 00:02:10 from Matthew Garrett

24 Aug 2016 olea   » (Master)

Changes to my website

After putting it off for a long time, and having gained some experience building the HackLab Almería and Club de Cacharreo sites with Jekyll, I finally worked up the nerve to ditch the old Lameblog and get down to work. If you notice any inconsistency on my website, rest assured it is my fault. And I don't mean the design, which is something I've already given up on.

One fantastic thing about Jekyll is that you can migrate a blog, at the very least, from the old one's RSS file. And so far it seems to work beautifully.

One inconsistency I sadly give up on fixing is the original URIs/permalinks, which are also what Disqus uses to retrieve comments. Since Lameblog got itself into a bit of a mess when generating URIs with the ES locale, everything ended up rather untidy, and fixing it now looks like more trouble than it's worth. On the other hand, I'll try not to remove content at the old URIs, for the sake of the (few) external links there may be. I will lose the comments on the new versions of the entries, yes, but there really aren't that many and they should remain accessible at the old ones.

PS: I've finally gotten around to migrating the blog comments hosted on Disqus. Fortunately the process seems quite well implemented (Migration console). In my case, since I didn't dare to specify rules describing the transformation from the old URIs to the modern ones, I followed the manual procedure, in which you download a CSV file with the entries (URIs) Disqus has registered in column A, and add the new URI in column B. In my case there were more than 500, but beware: 500 URIs doesn't mean 500 comments, not by a long shot. In fact, one of the things I've confirmed is how little interest my blog attracts, judging by the few comments or recommendations received :_( Oh well.

The thing is, I wanted to check each of the URIs to see whether it deserved to be migrated, and the only thing I could come up with was a manual procedure, which I basically handled with this line of bash:

   for a in `cat olea-2016-08-21T20-27-31.368505-links.csv ` ; do chrome "$a" && read -p  "pulsa Enter" ;done

Where olea-2016-08-21T20-27-31.368505-links.csv is the downloaded CSV file. The read is there to keep the browser from opening 500 tabs at once.

In practice, very few URIs ended up marked for migration (20 or 30 at most), because there were no further comments or recommendations. Then, following Disqus's instructions, I uploaded the revised CSV file and waited for it to be processed. Done.

Actually it took me a little longer, because the integration code Disqus provides has changed, but in the end I found this lovely Jekyll code by Joshua Cox, which finally sorted out this whole part.

There you go; enjoy it in good health.

PS: We might even run another little workshop at HackLab Almería to share the experience with friends.

Syndicated 2016-08-19 22:00:00 from Ismael Olea

23 Aug 2016 joolean   » (Journeyer)

gzochi

gzochi 0.10 is out, and it's a big release, befitting its double-digit minor version, I hope. In particular, it includes an initial version of the meta server, which, as its name suggests, provides services to a cluster of gzochi application servers. In this first iteration, the meta server manages R/W locks on portions of a game's object graph, and feeds temporary copies of the relevant object data to the application server nodes to fuel task execution. In the future, it'll also be responsible for client load balancing and coordinating channel message delivery across the cluster.

I've been working on this release since the beginning of the year, although much of that time was spent brooding over the way these systems work in Project Darkstar and in factoring out some core components of the gzochid container to make them available to the meta server (including a kind of minimal dependency injection framework built on top of GObject - surprised no one's done that before). In the design I ultimately settled on, the native, memory-backed storage engine plays a key role, acting as a cache for object data received from the meta server, which maintains the canonical stores of game data. And so I also spent a lot of cycles examining the performance and concurrency profile of that system, tweaking lock orderings and tree traversal algorithms.

One of the more surprising things I discovered was that locks in the memory storage engine as of 0.9 were too granular, and that this made the store unusually prone to deadlock. As I mentioned in an earlier post, the architecture of the memory storage engine was based on my reading of the Berkeley DB source code - especially the B+tree implementation. There's a lot of stuff in BDB, though, and it's got plenty of features that I was pretty sure I could ignore while building my own transactional data store. One thing I decided to leave out of my implementation was pages. After all, if I didn't need to transfer this data to or from durable storage, then why bother partitioning it into disk bandwidth-friendly chunks? It turns out, though, that pages serve another valuable purpose: They limit the granularity of locks and force concurrent transactions to serialize their writes when accessing keys within some threshold distance, especially with respect to insertions of new key-value pairs. Take the following example:
  1. Transaction 1 attempts to update key a1 (write lock a1)
  2. Transaction 2 attempts to read key a2 (read lock a2)
  3. Transaction 2 attempts to insert key a3 (write lock a3)
  4. Transaction 1 attempts to update key a2 (write lock a2)

In principle, there's nothing about this lock ordering that should lead to deadlock. The two transactions are accessing key a2 in incompatible ways, but that just means that the second transaction might need to wait for the first to complete before it can get a write lock on a2. However, in a B+tree, the interstitial structural nodes - not just the "leaf" key-values - are subject to read/write locking when structural modifications such as inserts are made to the tree. So the sequence of locks above actually means:
  1. Transaction 1 attempts to update key a1 (write lock a1, read lock parent a)
  2. Transaction 2 attempts to read key a2 (read lock a2, read lock parent a)
  3. Transaction 2 attempts to insert key a3 (write lock a3, write lock parent a)
  4. Transaction 1 attempts to update key a2 (write lock a2, read lock parent a)

With this additional bit of context, it becomes clear how the contention between these transactions can lead to deadlock: The first transaction can't make progress until it can get a write lock on key a2, which the second transaction won't release until it can get a write lock on interstitial parent node a to add the key a3 to it. This pattern of clustered reads and insertions is quite common for gzochi application tasks in which some data from the game object graph is read or updated and then another task is durably scheduled for execution. But this is also where pages can help! By grouping regions of an ordered keyspace into "chunks" of keys that have to be locked in bulk, the granularity of interleaved access between concurrent transactions is effectively capped, and transactions attempting to update or insert proximal keys are forced to serialize. Bad for concurrency, but ultimately good for performance in a latency-sensitive environment where retrying a transaction from scratch hurts. Page locking makes the following change to the lock sequence above:
  1. Transaction 1 attempts to update key a1 (write lock page a)
  2. Transaction 2 attempts to read key a2 (read lock page a)
  3. Transaction 2 attempts to insert key a3 (write lock page a)
  4. Transaction 1 attempts to update key a2 (write lock page a)

In this scenario, because the first transaction's update attempt requires a write lock on the entire page, the second transaction can't acquire that toxic read lock. The practical result is that the two transactions execute in serial, like so:
  1. Transaction 1 attempts to update key a1 (write lock page a)
  2. Transaction 1 attempts to update key a2 (write lock page a)

[Transaction 1 commit: unlock page a]
  1. Transaction 2 attempts to read key a2 (read lock page a)
  2. Transaction 2 attempts to insert key a3 (write lock page a)

After making the switch to pages in the memory storage engine, the rate of deadlock for these kinds of transaction workflows dropped sharply, and is now roughly the same as that of the BDB-based storage engine. ...Which stands to reason.
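
To illustrate the idea, here is a minimal sketch (not gzochi's actual code; the class name and page size are invented, and plain re-entrant locks stand in for the real engine's read/write locks). Keys map onto fixed-size pages, and transactions lock pages rather than individual keys, so operations on proximal keys serialize instead of interleaving:

import threading
from collections import defaultdict

PAGE_SIZE = 64  # keys per page; an invented tuning constant

class PageLocks(object):
    # Maps an ordered integer keyspace onto pages, locking per page.
    def __init__(self):
        self._table_lock = threading.Lock()  # guards the page-lock table
        self._pages = defaultdict(threading.RLock)  # page number -> lock

    def lock_for(self, key):
        # Proximal keys (like a1, a2, a3 above) share one page lock, so
        # an insert near a concurrent update cannot interleave with it.
        with self._table_lock:
            return self._pages[key // PAGE_SIZE]

locks = PageLocks()
with locks.lock_for(101):      # "transaction 1" updates key 101
    with locks.lock_for(102):  # same page: the re-entrant lock re-enters
        pass                   # ... perform both updates, then commit ...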

Check out the release! I'm proud of it.

23 Aug 2016 mones   » (Journeyer)

10 years of clawsker!

Today on #claws irc channel some conversation derailed into talking about hidden preferences (of Claws Mail) and clawsker's name, the Perl applet which can help you to edit them.

Its first name was not clawsker; it was something more like Sylpheed Claws Hidden Preferences Editor, which I of course abbreviated to the unspeakable schpe — unless you know German, I guess… ;-)

Looking for the initial script, it turned out it's still on my hard disc:

$ tree -Ds claws/dev/oldscripts/schpe/
claws/dev/oldscripts/schpe/
├── [ 4096 Sep 3 2006] mock
│   ├── [ 24456 Sep 3 2006] schpe.glade
│   ├── [ 24456 Sep 3 2006] schpe.glade.bak
│   ├── [ 315 Sep 3 2006] schpe.gladep
│   └── [ 315 Sep 3 2006] schpe.gladep.bak
└── [ 1160 Aug 22 2006] schpe

1 directory, 5 files


Notice it's dated just 10 years and 1 day ago, yay!

That version wasn't even functional; it was just a skeleton and an attempt to make a GTK+ GUI with Glade, which, at that time, wasn't as good as it probably is today (although I've never used it again).

Fortunately that idea was abandoned in the following months and the first release in 2007 enjoyed a hand-made GUI, probably more laborious but better suited, IMHO.

Syndicated 2016-08-23 16:49:33 from Ricardo Mones

22 Aug 2016 philiph   » (Journeyer)

Living Clojure Study Group

New page

Syndicated 2016-08-22 17:40:21 from HollenbackDotNet

22 Aug 2016 mones   » (Journeyer)

I'd love to…

undo all the bugs I'd been sneakily making over the years…

But some people call them features ;-)

Syndicated 2016-08-22 12:10:11 from Ricardo Mones

20 Aug 2016 hacker   » (Master)

HOWTO: Fix missing mouse clicks in VMware with Linux guests

This was a bugger to track down, and required installing and reinstalling Linux a dozen times in different ways to narrow down the actual cause. If you run VMware Workstation on your Linux host, and are also trying to run Linux guests, you may run into a situation where your mouse cursor in the […]

Related posts:
  1. SOLVED: VMware Tools __devexit_p Error on Linux Kernel 3.8 and Earlier If you run a current version of VMware Workstation, VMware...
  2. SOLVED: VMware Tools create_proc_entry Error with vmballoon_procfs_init on Linux Kernel 3.11.0 Another quick VMware Tools patch and fix if you’re using...
  3. Tuesday Tip: rsync Command to Include Only Specific Files I find myself using rsync a lot, both for moving...

Syndicated 2016-08-20 00:10:17 from random neuron misfires

19 Aug 2016 sye   » (Journeyer)

https://en.wikipedia.org/wiki/Muckraker:

How Philadelphia’s political class can keep the FBI busy all day, and still sleep soundly at night.

By Patrick Kerkstr

Read more at http://www.phillymag.com/citified/2016/08/16/philadelphia-deviant-political-culture/#eltlShk6V0H3e3I0.99



19 Aug 2016 marnanel   » (Journeyer)

Two Ronnies 1984 Christmas special: courtroom sketch

I've seen this sketch many times, but I only just realised that the judge is Patrick Troughton (the Second Doctor from Dr Who).

Some of the gameshows are largely forgotten:
1) What's My Line? (guess someone's job)
2) Mastermind (rapid-fire questions on a subject)
3) Call My Bluff (guess the definition of an obscure word)
4) Blankety-Blank (guess what word someone else used to complete a sentence)
5) Give Us A Clue (charades)
6) It's A Knockout ("play your joker" in a round to double your points)
7) The Price Is Right ("come on down!")

Gaffes:
1) the judge is wearing a barrister's wig
2) no judge in England uses a gavel
3) the defendant is standing in the witness box
4) lawyers don't walk around the courtroom
5) er, court trials don't include gameshow references.

This entry was originally posted at http://marnanel.dreamwidth.org/375907.html. Please comment there using OpenID.

Syndicated 2016-08-19 12:35:33 from Monument

17 Aug 2016 glyph   » (Master)

Probably best to get this out of the way before this weekend:

If I meet you at a technical conference, you’ll probably see me extend my elbow in your direction, rather than my hand. This is because I won’t shake your hand at a conference.

People sometimes joke about “con crud”, but the amount of lost productivity and human misery generated by conference-transmitted sickness is not funny. Personally, by the time the year is out, I will most likely have attended 5 conferences. This means that if I get sick at each one, I will spend more than a month out of the year out of commission being sick.

When I tell people this, they think I’m a germophobe. But, in all likelihood, I won’t be the one getting sick. I already have 10 years of building up herd immunity to the set of minor ailments that afflict the international Python-conference-attending community. It’s true that I don’t particularly want to get sick myself, but I happily shake people’s hands in more moderately-sized social gatherings. I’ve had a cold before and I’ll have one again; I have no illusion that ritually dousing myself in Purell every day will make me immune to all disease.

I’m not shaking your hand because I don’t want you to get sick. Please don’t be weird about it!

Syndicated 2016-08-17 18:42:00 from Deciphering Glyph

17 Aug 2016 benad   » (Apprentice)

Mac-Only Dev Tools

Even though I use Macs, Linux and Windows machines daily and could switch to any of these exclusively, I prefer running my Mac alongside either Linux or Windows. A reason I do so is that there are some development tools that run exclusively on macOS that I prefer over their other platforms’ equivalents. Here are a few I use regularly.

To be fair, I’ll also list for each of those tools what I typically use to replace these on Windows or Linux.

BBEdit

While BBEdit isn’t as flexible or extensible as jEdit, Atom, Emacs, or even Vim to some extent, BBEdit feels and acts the most like a proper native Mac text editor. It is packed with features, is well supported, and is incredibly fast. It works quite well with SFTP, so I often use it to edit remote files. It is also the editor I’ve used the longest, having used it since the late 90s.

Alternatives : Too many to mention, but I currently prefer Visual Studio Code on the desktop and vim on the command-line.

CodeKit

CodeKit, which I mentioned before, is my “go to” tool to bootstrap my web development. It sits in the background of your text editor (any one you want) and web browsers, and automatically validates and optimizes your JavaScript code and CSS files to your liking. It also supports many languages that compile to JavaScript or CSS, like CoffeeScript and SASS.

Alternative : Once I move closer to production, I do end up using Grunt. You can set it up to auto-rebuild your site like CodeKit using grunt-contrib-watch, but Grunt isn’t nearly as user-friendly as CodeKit.

Paw

Paw quickly became my preferred tool to explore and understand HTTP APIs. It is used to build up HTTP requests with various placeholder variables and then explore the results using multiple built-in viewers for JSON. All your requests and their results are saved, so it’s safe to experiment and retrace your way back to your previously working version. You can also create sequences of requests, and use complex authentication like OAuth. When you’re ready, it can generate template code in multiple languages, or cURL commands.

Alternative : I like using httpie for the HTTP requests and jq to extract values from the JSON results.

Dash

When I was learning Python 3, I constantly made use of Dash to search its built-in modules. It can do incremental search in many documentation packages and cheat sheets, and does so very quickly since it all works offline. It also makes reading “man” pages much more convenient.

Alternatives : There’s Google, of course, but I prefer using the custom search engine of each language’s documentation site using the DuckDuckGo “bang syntax”.

Syndicated 2016-08-16 23:44:10 from Benad's Blog

16 Aug 2016 iddekingej   » (Observer)

What is this? New opensource os from google:

Fuchsia

15 Aug 2016 cbbrowne   » (Master)

Spamalicious times

Hmmph. Google sent me a “nastygram” indicating that one of my blog entries had something suggestive of content injection.

I poked around, and it was by no means evident that it was really so. The one suspicious posting was http://linuxdatabases.info/blog/?p=99 which legitimately has some stuff that looks like labels, as it contains a bunch of sample SQL code. I’m suspicious that they’re accounting that as being evil…

But it pointed me at a couple of mostly-irritating things…

  1. I haven’t generated a blog entry since 2013. Well, I’m not actually hugely worried about that.
  2. I reviewed the proposed response posts that had accumulated since, probably, about 2013. Wow, oh wow, was that ever spam-filled. Literally several thousand attempts to get me to publish various and sundry advertising links. It’s seriously a pain to get rid of them all, as I could only trim out about 150 at a time. And hopefully there weren’t many “real” proposed postings; it’s almost certain I’ll have thrown those away. (Of course, proposed postings about things I said in 2013… How relevant could it still be???)

Syndicated 2016-08-15 20:44:52 from linuxdatabases.info

14 Aug 2016 glyph   » (Master)

A Container Is A Function Call

It seems to me that the prevailing mental model among users of container technology [1] right now is that a container is a tiny little virtual machine. It’s like a machine in the sense that it is provisioned and deprovisioned by explicit decisions, and we talk about “booting” containers. We configure it sort of like we configure a machine; dropping a bunch of files into a volume, setting some environment variables.

In my mind though, a container is something fundamentally different than a VM. Rather than coming from the perspective of “let’s take a VM and make it smaller so we can do cool stuff” - get rid of the kernel, get rid of fixed memory allocations, get rid of emulated memory access and instructions, so we can provision more of them at higher density... I’m coming at it from the opposite direction.

For me, containers are “let’s take a program and make it bigger so we can do cool stuff”. Let’s add in the whole user-space filesystem so it’s got all the same bits every time, so we don’t need to worry about library management, so we can ship it around from computer to computer as a self-contained unit. Awesome!

Of course, there are other ecosystems that figured this out a really long time ago, but having it as a commodity within the most popular server deployment environment has changed things.

Of course, an individual container isn’t a whole program. That’s why we need tools like compose to put containers together into a functioning whole. This makes a container not just a program, but rather, a part of a program. And of course, we all know what the smaller parts of a program are called:

Functions. [2]

A container of course is not the function itself; the image is the function. A container itself is a function call.

Perceived through this lens, it becomes apparent that Docker is missing some pretty important information. As a tiny VM, it has all the parts you need: it has an operating system (in the docker build), the ability to boot and reboot (docker run), instrumentation (docker inspect), debugging (docker exec), etc. As a really big function, it’s strangely anemic.

Specifically: in every programming language worth its salt, we have a type system; some mechanism to identify what parameters a function will take, and what return value it will have.

You might find this weird coming from a Python person, a language where

def foo(a, b, c):
    return a.x(c.d(b))

is considered an acceptable level of type documentation by some [3]; there’s no requirement to say what a, b, and c are. However, just because the type system is implicit, that doesn’t mean it’s not there, even in the text of the program. Let’s consider, from reading this tiny example, what we can discover:

  • foo takes 3 arguments, their names are “a”, “b”, and “c”, and it returns a value.
  • Somewhere else in the codebase there’s an object with an x method, which takes a single argument and also returns a value.
  • The type of <unknown>.x’s argument is the same as the return type of another method somewhere in the codebase, <unknown-2>.d

And so on, and so on. At runtime each of these types takes on a specific, concrete value, with a type, and if you set a breakpoint and single-step into it with a debugger, you can see each of those types very easily. Also at runtime you will get TypeError exceptions telling you exactly what was wrong with what you tried to do at a number of points, if you make a mistake.

The analogy to containers isn’t exact; inputs and outputs aren’t obviously in the shape of “arguments” and “return values”, especially since containers tend to be long-running; but nevertheless, a container does have inputs and outputs in the form of env vars, network services, and volumes.

Let’s consider the “foo” of docker, which would be the middle tier of a 3-tier web application (cribbed from a real live example):

FROM pypy:2
RUN apt-get update -ym
RUN apt-get upgrade -ym
RUN apt-get install -ym libssl-dev libffi-dev
RUN pip install virtualenv
RUN mkdir -p /code/env
RUN virtualenv /code/env
RUN pwd

COPY requirements.txt /code/requirements.txt
RUN /code/env/bin/pip install -r /code/requirements.txt
COPY main /code/main
RUN chmod a+x /code/main

VOLUME /clf
VOLUME /site
VOLUME /etc/ssl/private

ENTRYPOINT ["/code/main"]

In this file, we can only see three inputs, which are filesystem locations: /clf, /site, and /etc/ssl/private. How is this different than our Python example, a language with supposedly “no type information”?

  • The image has no metadata explaining what might go in those locations, or what roles they serve. We have no way to annotate them within the Dockerfile.
  • What services does this container need to connect to in order to get its job done? What hostnames will it connect to, what ports, and what will it expect to find there? We have no way of knowing. It doesn’t say. Any errors about the failed connections will come in a custom format, possibly in logs, from the application itself, and not from docker.
  • What services does this container export? It could have used an EXPOSE line to give us a hint, but it doesn’t need to; and even if it did, all we’d have is a port number.
  • What environment variables does its code require? What format do they need to be in?
  • We do know that we could look in requirements.txt to figure out what libraries are going to be used, but in order to figure out what the service dependencies are, we’re going to need to read all of the code to all of them.

Of course, the one way that this example is unrealistic is that I deleted all the comments explaining all of those things. Indeed, best practice these days would be to include comments in your Dockerfiles, and include example compose files in your repository, to give users some hint as to how these things all wire together.

This sort of state isn’t entirely uncommon in programming languages. In fact, in this popular GitHub project you can see that large programs written in assembler in the 1960s included exactly this sort of documentation convention: huge front-matter comments in English prose.

That is the current state of the container ecosystem. We are at the “late ’60s assembly language” stage of orchestration development. It would be a huge technological leap forward to be able to communicate our intent structurally.


When you’re building an image, you’re building it for a particular purpose. You already pretty much know what you’re trying to do and what you’re going to need to do it.

  1. When instantiated, the image is going to consume network services. This is not just a matter of hostnames and TCP ports; those services need to be providing a specific service, over a specific protocol. A generic reverse proxy might be able to handle an arbitrary HTTP endpoint, but an API client needs that specific API. A database admin tool might be OK with just “it’s a database” but an application needs a particular schema.
  2. It’s going to consume environment variables. But not just any variables; the variables have to be in a particular format.
  3. It’s going to consume volumes. The volumes need to contain data in a particular format, readable and writable by a particular UID.
  4. It’s also going to produce all of these things; it may listen on a network service port, provision a database schema, or emit some text that needs to be fed back into an environment variable elsewhere.

Here’s a brief sketch of what I want to see in a Dockerfile to allow me to express this sort of thing:

FROM ...
RUN ...

LISTENS ON: TCP:80 FOR: org.ietf.http/com.example.my-application-api
CONNECTS TO: pgwritemaster.internal ON: TCP:5432 FOR: org.postgresql.db/com.example.my-app-schema
CONNECTS TO: {{ETCD_HOST}} ON: TCP:{{ETCD_PORT}} FOR: com.coreos.etcd/client-communication
ENVIRONMENT NEEDS: ETCD_HOST FORMAT: HOST(com.coreos.etcd/client-communication)
ENVIRONMENT NEEDS: ETCD_PORT FORMAT: PORT(com.coreos.etcd/client-communication)
VOLUME AT: /logs FORMAT: org.w3.clf REQUIRES: WRITE UID: 4321

An image thusly built would refuse to run unless:

  • Somewhere else on its network, there was an etcd host/port known to it, its host and port supplied via environment variables.
  • Somewhere else on its network, there was a postgres host, listening on port 5432, with a name-resolution entry of “pgwritemaster.internal”.
  • An environment variable for the etcd configuration was supplied
  • A writable volume for /logs was supplied, owned by user-ID 4321 where it could write common log format logs.

There are probably a lot of flaws in the specific syntax here, but I hope you can see past that, to the broader point that the software inside a container has precise expectations of its environment, and that we presently have no way of communicating those expectations beyond writing a Melvilleian essay in each Dockerfile’s comments, beseeching those who would run the image to give it what it needs.
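
To make that concrete, here is a minimal sketch of an entrypoint shim (entirely hypothetical; nothing like this is built into docker today) that performs the refuse-to-run checks for the example image above, using only the Python standard library:

#!/usr/bin/env python
# Hypothetical entrypoint shim: verify the container's declared
# expectations up front, so failures are immediate and specific rather
# than custom-format errors buried in the application's logs.
import os
import socket
import sys

def require_env(name):
    # ENVIRONMENT NEEDS: fail loudly if a required variable is absent.
    if name not in os.environ:
        sys.exit("missing required environment variable: " + name)
    return os.environ[name]

def require_tcp(host, port):
    # CONNECTS TO: check that the service is actually reachable.
    try:
        socket.create_connection((host, int(port)), timeout=5).close()
    except socket.error as e:
        sys.exit("cannot reach {0}:{1} ({2})".format(host, port, e))

def require_writable(path):
    # VOLUME AT ... REQUIRES: WRITE: check the mount before starting.
    if not os.access(path, os.W_OK):
        sys.exit("volume {0} not writable by uid {1}".format(path, os.getuid()))

require_tcp("pgwritemaster.internal", 5432)
require_tcp(require_env("ETCD_HOST"), require_env("ETCD_PORT"))
require_writable("/logs")
os.execv("/code/main", ["/code/main"])  # all checks passed; hand off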


Why bother with this sort of work, if all the image can do with it is “refuse to run”?

First and foremost, today, the image effectively won’t run. Oh, it’ll start up, and it’ll consume some resources, but it will break when you try to do anything with it. What this metadata will allow the container runtime to do is to tell you why the image didn’t run, and give you specific, actionable, fast feedback about what you need to do in order to fix the problem. You won’t have to go groveling through logs, which is always especially hard if the back-end service you forgot to properly connect to was the log aggregation service. So this will be an order of magnitude speed improvement on initial deployments and development-environment setups for utility containers. Whole applications typically already come with a compose file, of course, but ideally applications would be built out of functioning self-contained pieces and not assembled one custom container at a time.

Secondly, if there were a strong tooling standard for providing this metadata within the image itself, it might become possible for infrastructure service providers (like, ahem, my employer) to automatically detect and satisfy service dependencies. Right now, if you have a database as a service that lives outside the container system in production, but within the container system in development and test, there’s no way for the orchestration layer to say “good news, everyone! you can find the database you need here: ...”.

My main interest is in allowing open source software developers to give service operators exactly what they need, so the upstream developers can get useful bug reports. There’s a constant tension where volunteer software developers find themselves fielding bug reports where someone deployed their code in a weird way, hacked it up to support some strange environment, built a derived container that had all kinds of extra junk in it to support service discovery or logging or somesuch, and so they don’t want to deal with the support load that that generates. Both people in that exchange are behaving reasonably. The developers gave the ops folks a container that runs their software to the best of their abilities. The service vendors made the minimal modifications they needed to have the container become a part of their service fabric. Yet we arrive at a scenario where nobody feels responsible for the resulting artifact.

If we could just say what it is that the container needs in order to really work, in a way which was precise and machine-readable, then it would be clear where the responsibility lies. Service providers could just run the container unmodified, and they’d know very clearly whether or not they’d satisfied its runtime requirements. Open source developers - or even commercial service vendors! - could say very clearly what they expected to be passed in, and when they got bug reports, they’d know exactly how their service should have behaved.


  1. which mostly but not entirely just means “docker”; it’s weird, of course, because there are pieces that docker depends on and tools that build upon docker which are part of this, but docker remains the nexus. 

  2. Yes yes, I know that they’re not really functions, Tristan; they’re subroutines, but that’s the word people use for “subroutines” nowadays.

  3. Just to be clear: no it isn’t. Write a damn docstring, or at least some type annotations

Syndicated 2016-08-14 22:22:00 from Deciphering Glyph

14 Aug 2016 glyph   » (Master)

Python Packaging Is Good Now

Okay folks. Time’s up. It’s too late to say that Python’s packaging ecosystem is terrible any more. I’m calling it.

Python packaging is not bad any more. If you’re a developer, and you’re trying to create or consume Python libraries, it can be a tractable, even pleasant experience.

I need to say this, because for a long time, Python’s packaging toolchain was … problematic. It isn’t any more, but a lot of people still seem to think that it is, so it’s time to set the record straight.

If you’re not familiar with the history it went something like this:

The Dawn

Python first shipped in an era when adding a dependency meant a veritable Odyssey into cyberspace. First, you’d wait until nobody in your whole family was using the phone line. Then you’d dial your ISP. Once you’d finished fighting your SLIP or PPP client, you’d ask a netnews group if anyone knew of a good gopher site to find a library that could solve your problem. Once you were done with that task, you’d sign off the Internet for the night, and wait about 48 hours to see if anyone responded. If you were lucky enough to get a reply, you’d set up a download at the end of your night’s web-surfing.

pip search it wasn’t.

For the time, Python’s approach to dependency-handling was incredibly forward-looking. The import statement, and the pluggable module import system, made it easy to get dependencies from wherever made sense.

In Python 2.0 [1], Distutils was introduced. This let Python developers describe their collections of modules abstractly, and added tool support for producing redistributable collections of modules and packages. Again, this was tremendously forward-looking, if somewhat primitive; there was very little to compare it to at the time.

Fast forwarding to 2004; setuptools was created to address some of the increasingly-common tasks that open source software maintainers were facing with distributing their modules over the internet. In 2005, it added easy_install, in order to provide a tool to automate resolving dependencies and downloading them into the right locations.

The Dark Age

Unfortunately, in addition to providing basic utilities for expressing dependencies, setuptools also dragged in a tremendous amount of complexity. Its author felt that import should do something slightly different than what it does, so installing setuptools changed it. The main difference between normal import and setuptools import was that it facilitated having multiple different versions of the same library in the same program at the same time. It turns out that that’s a dumb idea, but in fairness, it wasn’t entirely clear at the time, and it is certainly useful (and necessary!) to be able to have multiple versions of a library installed onto a computer at the same time.

In addition to these idiosyncratic departures from standard Python semantics, setuptools suffered from being unmaintained. It became a critical part of the Python ecosystem at the same time as the author was moving on to other projects entirely outside of programming. No-one could agree on who the new maintainers should be for a long period of time. The project was forked, and many operating systems’ packaging toolchains calcified around a buggy, ancient version.

From 2008 to 2012 or so, Python packaging was a total mess. It was painful to use. It was not clear which libraries or tools to use, which ones were worth investing in or learning. Doing things the simple way was too tedious, and doing things the automated way involved lots of poorly-documented workarounds and inscrutable failure modes.

This is to say nothing of the fact that there were critical security flaws in various parts of this toolchain. There was no practical way to package and upload Python packages in such a way that users didn’t need a full compiler toolchain for their platform.

To make matters worse for the popular perception of Python’s packaging prowess [2], at this same time, newer languages and environments were getting a lot of buzz, ones that had packaging built in at the very beginning and had a much better binary distribution story. These environments learned lessons from the screw-ups of Python and Perl, and really got a lot of things right from the start.

Finally, the Python Package Index, the site which hosts all the open source packages uploaded by the Python community, was basically a proof-of-concept that went live way too early, had almost no operational resources, and was offline all the dang time.

Things were looking pretty bad for Python.


Intermission

Here is where we get to the point of this post - this is where popular opinion about Python packaging is stuck. Outdated information from this period abounds. Blog posts complaining about problems score high in web searches. Those who used Python during this time, but have now moved on to some other language, frequently scoff and dismiss Python as impossible to package, its packaging ecosystem as broken, PyPI as down all the time, and so on. Worst of all, bad advice for workarounds which are no longer necessary is still easy to find, which causes users to pre-emptively break their environments where they really don’t need to.


From The Ashes

In the midst of all this brokenness, there were some who were heroically, quietly, slowly fixing the mess, one gnarly bug-report at a time. pip was started, and its various maintainers fixed much of easy_install’s overcomplexity and many of its flaws. Donald Stufft stepped in both on Pip and PyPI and improved the availability of the systems it depended upon, as well as some pretty serious vulnerabilities in the tool itself. Daniel Holth wrote a PEP for the wheel format, which allows for binary redistribution of libraries. In other words, it lets authors of packages which need a C compiler to build give their users a way to not have one.

In 2013, setuptools and distribute un-forked, providing a path forward for operating system vendors to start updating their installations and allowing users to use something modern.

Python Core started distributing the ensurepip module along with both Python 2.7 and 3.3, allowing any user with a recent Python installed to quickly bootstrap into a sensible Python development environment with a one-liner.

A New Renaissance

I won’t give you a full run-down of the state of the packaging art. There’s already a website for that. I will, however, give you a précis of how much easier it is to get started nowadays. Today, if you want to get a sensible, up-to-date python development environment, without administrative privileges, all you have to do is:

$ python -m ensurepip --user
$ python -m pip install --user --upgrade pip
$ python -m pip install --user --upgrade virtualenv

Then, for each project you want to do, make a new virtualenv:

$ python -m virtualenv lets-go
$ . ./lets-go/bin/activate
(lets-go) $ _

From here on out, now the world is your oyster; you can pip install to your heart’s content, and you probably won’t even need to compile any C for most packages. These instructions don’t depend on Python version, either: as long as it’s up-to-date, the same steps work on Python 2, Python 3, PyPy and even Jython. In fact, often the ensurepip step isn’t even necessary since pip comes preinstalled. Running it if it’s unnecessary is harmless, even!

Other, more advanced packaging operations are much simpler than they used to be, too.

  • Need a C compiler? OS vendors have been working with the open source community to make this easier across the board:
    $ apt install build-essential python-dev # ubuntu
    $ xcode-select --install # macOS
    $ dnf install @development-tools python-devel # fedora
    C:\> REM windows
    C:\> start https://www.microsoft.com/en-us/download/details.aspx?id=44266
    

Okay that last one’s not as obvious as it ought to be but they did at least make it freely available!

  • Want to upload some stuff to PyPI? This should do it for almost any project (a minimal setup.py sketch follows this list):

    $ pip install twine
    $ python setup.py sdist bdist_wheel
    $ twine upload dist/*
    
  • Want to build wheels for the wild and wooly world of Linux? There’s an app [4] for that.
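
As promised above, here is a minimal setup.py sketch of the kind the sdist/bdist_wheel invocation assumes exists; the project name and metadata are invented placeholders:

from setuptools import setup, find_packages

setup(
    name="example-project",      # invented name
    version="0.1.0",
    description="An example package",
    packages=find_packages(),
    install_requires=["attrs"],  # runtime dependencies, if any
)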

Importantly, PyPI will almost certainly be online. Not only that, but a new, revamped site will be “launching” any day now [3].

Again, this isn’t a comprehensive resource; I just want to give you an idea of what’s possible. But, as a deeply experienced Python expert I used to swear at these tools six times a day for years; the most serious Python packaging issue I’ve had this year to date was fixed by cleaning up my git repo to delete a cache file.

Work Still To Do

While the current situation is good, it’s still not great.

Here are just a few of my desiderata:

  • We still need better and more universally agreed-upon tooling for end-user deployments.
  • Pip should have a GUI frontend so that users can write Python stuff without learning as much command-line arcana.
  • There should be tools that help you write and update a setup.py. Or a setup.python.json or something, so you don’t actually need to write code just to ship some metadata.
  • The error messages that you get when you try to build something that needs a C compiler and it doesn’t work should be clearer and more actionable for users who don’t already know what they mean.
  • PyPI should automatically build wheels for all platforms by default when you upload sdists; this is a huge project, of course, but it would be super awesome default behavior.

I could go on. There are lots of ways that Python packaging could be better.

The Bottom Line

The real takeaway here though, is that although it’s still not perfect, other languages are no longer doing appreciably better. Go is still working through a number of different options regarding dependency management and vendoring, and, like Python extensions that require C dependencies, CGo is sometimes necessary and always a problem. Node has had its own well-publicized problems with their dependency management culture and package manager. Hackage is cool and all but everything takes a literal geological epoch to compile.

As always, I’m sure none of this applies to Rust and Cargo is basically perfect, but that doesn’t matter, because nobody reading this is actually using Rust.

My point is not that packaging in any of these languages is particularly bad. They’re all actually doing pretty well, especially compared to the state of the general programming ecosystem a few years ago; many of them are making regular progress towards user-facing improvements.

My point is that any commentary suggesting they’re meaningfully better than Python at this point is probably just out of date. Working with Python packaging is more or less fine right now. It could be better, but lots of people are working on improving it, and the structural problems that prevented those improvements from being adopted by the community in a timely manner have almost all been addressed.

Go! Make some virtualenvs! Hack some setup.pys! If it’s been a while and your last experience was really miserable, I promise, it’s better now.


Am I wrong? Did I screw up a detail of your favorite language? Did I forget to mention the one language environment that has a completely perfect, flawless packaging story? Do you feel the need to just yell at a stranger on the Internet about picayune details? Feel free to get in touch!


  1. released in October, 2000 

  2. say that five times fast. 

  3. although I’m not sure what it means to “launch” when the site is online, and running against the production data-store, and you can use it for pretty much everything... 

  4. “app” meaning of course “docker container” 

Syndicated 2016-08-14 09:17:00 from Deciphering Glyph

14 Aug 2016 glyph   » (Master)

What’s In A Name

Amber’s excellent lightning talk on identity yesterday made me feel many feels, and reminded me of this excellent post by Patrick McKenzie about false assumptions regarding names.

While that list is helpful, it’s very light on positively-framed advice, i.e. “you should” rather than “you shouldn’t”. So I feel like I want to give a little bit of specific, prescriptive advice to programmers who might need to deal with names.

First and foremost: stop asking for unnecessary information. If I’m just authenticating to your system to download a comic book, you do not need to know my name. Your payment provider might need a billing address, but you absolutely do not need to store my name.

Okay, okay. I understand that may make your system seem a little impersonal, and you want to be able to greet me, or maybe have a name to show to other users beyond my login ID or email address that has to be unique on the site. Fine. Here’s what a good “name” field looks like: a single, freeform text box.

You don’t need to break my name down into parts. If you just need a way to refer to me, then let me tell you whatever the heck I want. Honorific? Maybe I have more than one; maybe I don’t want you to use any.

And this brings me to “first name / last name”.

In most cases, you should not use these terms. They are oversimplifications of how names work, appropriate only for children in English-speaking countries who might not understand the subtleties involved and only need to know that one name comes before the other.

The terms you’re looking for are given name and surname, or perhaps family name. (“Middle name” might still be an appropriate term because that fills a more specific role.) But by using these more semantically useful terms, you include orders of magnitude more names in your normalization scheme. More importantly, by acknowledging the roles of the different parts of a name, you’ll come to realize that there are many other types of name, such as:

If your application does have a legitimate need to normalize names, for example, to interoperate with third-party databases, or to fulfill some regulatory requirement:

  • When you refer to a user of the system, always allow them to customize how their name is presented. Give them the benefit of the doubt. If you’re concerned about users abusing this display-name system to insult other users, it's understandable that you may need to moderate that a little. But there's no reason to ever moderate or regulate how a user's name is displayed to themselves. You can start to address offensive names by allowing other users to set nicknames for them. Only as a last resort, allow other users to report their name as not-actually-their-name, abusive or rude; if you do that, you have to investigate those reports. Let users affirm other users’ names, too, and verify reports: if someone attracts a million fake troll accounts, but all their friends affirm that their name is correct, you should be able to detect that. Don’t check government IDs in order to do this; they’re not relevant.
  • Allow the user to enter their normalized name as a series of names with classifiers attached to each one (see the sketch after this list).
  • Keep in mind that spaces are valid in any of these names. Many people have multi-word first names, middle names, or last names, and it can matter how you classify them. For one example that should resonate with readers of this blog, it’s “Guido” “van Rossum”, not “Guido” “Van” “Rossum”. It is definitely not “Guido” “Vanrossum”.
  • So is other punctuation. Even dashes. Even apostrophes. Especially apostrophes, you insensitive clod. Literally ten billion people whose surnames start with “O’” live in Ireland and they do not care about your broken database input security practices.
  • Allow for the user to have multiple names with classifiers attached to each one: “legal name in China”, “stage name”, “name on passport”, “maiden name”, etc. Keep in mind that more than one name for a given person may be simultaneously accurate for a certain audience and legally valid. They can even be legally valid in the same context: many people have social security cards, birth certificates, driver’s licenses and passports with different names on them; sometimes due to a clerical error, sometimes due to the way different systems work. If your goal is to match up with those systems, especially more than one of them, you need to account for that possibility.
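
Here is a minimal sketch of the data model those two bullets describe (hypothetical class and field names, written with the attrs library discussed elsewhere on this blog):

import attr

@attr.s
class NamePart(object):
    classifier = attr.ib()  # e.g. "given", "surname", "middle"
    text = attr.ib()        # free text: spaces and punctuation are valid

@attr.s
class Name(object):
    context = attr.ib()     # e.g. "legal name in China", "stage name"
    parts = attr.ib()       # ordered list of NamePart

@attr.s
class User(object):
    display_name = attr.ib()  # opaque text, fully controlled by the user
    names = attr.ib(default=attr.Factory(list))  # zero or more Names

guido = User(
    display_name=u"Guido van Rossum",
    names=[Name(context=u"name on passport",
                parts=[NamePart(u"given", u"Guido"),
                       NamePart(u"surname", u"van Rossum")])])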

If you’re a programmer and you’re balking at this complexity: good. Remember that for most systems, sticking with the first option - treating users’ names as totally opaque text - is probably your best bet. You probably don’t need to know the structure of the user’s name for most purposes.

Syndicated 2016-08-14 00:48:00 from Deciphering Glyph

13 Aug 2016 badvogato   » (Master)

dear all, good news from daily vanity fair. I am connected with Evan McMullin, the independent presidential candidate apart from Donald and Hillary in the 2016 run. This is the message I sent to him:

" following:Sir, Just wish to offer my sincere gratitude for your commitment to call for resonsible citizenship to resist letting Donald Trump to occupy any public office where he really has no temperament nor sincerity to serve... -"

Yours

citizen of U.S.A.

12 Aug 2016 iddekingej   » (Observer)

Today I started to develop a GUI (Qt) for managing lxc containers: Here


I stumbled over a problem where the program hangs after trying to start a container. After some research I discovered that Qt is the culprit.

The problem is that lxc forks the process twice, and that messes up Qt.

I solved this by forking the process before QApplication starts and listening on a pipe for the start command. Inside the forked process the lxc container is started. Works fine so far.

It took me some hours to find this out
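
For illustration, here is a minimal Python sketch of the same workaround (the tool itself is Qt-based; this only shows the fork-then-listen structure, and the command strings and lxc-start invocation are simplified):

import os

def spawn_helper():
    # Fork before any GUI state exists; the helper process, not the
    # Qt process, is the one that launches containers.
    read_fd, write_fd = os.pipe()
    if os.fork() == 0:                  # child: the container launcher
        os.close(write_fd)
        commands = os.fdopen(read_fd)
        for line in iter(commands.readline, ""):
            action, _, name = line.strip().partition(" ")
            if action == "start":
                # Illustrative only; real code would use the lxc API.
                os.system("lxc-start -d -n " + name)
        os._exit(0)
    os.close(read_fd)
    return os.fdopen(write_fd, "w")     # parent writes commands here

helper = spawn_helper()
# ... only now construct the QApplication, in the parent process ...
helper.write("start mycontainer\n")
helper.flush()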




12 Aug 2016 glyph   » (Master)

The One Python Library Everyone Needs

Do you write programs in Python? You should be using attrs.

Why, you ask? Don’t ask. Just use it.

Okay, fine. Let me back up.

I love Python; it’s been my primary programming language for 10+ years and despite a number of interesting developments in the interim I have no plans to switch to anything else.

But Python is not without its problems. In some cases it encourages you to do the wrong thing. Particularly, there is a deeply unfortunate proliferation of class inheritance and the God-object anti-pattern in many libraries.

One cause for this might be that Python is a highly accessible language, so less experienced programmers make mistakes that they then have to live with forever.

But I think that perhaps a more significant reason is the fact that Python sometimes punishes you for trying to do the right thing.

The “right thing” in the context of object design is to make lots of small, self-contained classes that do one thing and do it well. For example, if you notice your object is starting to accrue a lot of private methods, perhaps you should be making those “public” [1] methods of a private attribute. But if it’s tedious to do that, you probably won’t bother.

Another place you probably should be defining an object is when you have a bag of related data that needs its relationships, invariants, and behavior explained. Python makes it soooo easy to just define a tuple or a list. The first couple of times you type host, port = ... instead of address = ... it doesn’t seem like a big deal, but then soon enough you’re typing [(family, socktype, proto, canonname, sockaddr)] = ... everywhere and your life is filled with regret. That is, if you’re lucky. If you’re not lucky, you’re just maintaining code that does something like values[0][7][4][HOSTNAME][“canonical”] and your life is filled with garden-variety pain rather than the more complex and nuanced emotion of regret.


This raises the question: is it tedious to make a class in Python? Let’s look at a simple data structure: a 3-dimensional cartesian coordinate. It starts off simply enough:

class Point3D(object):

So far so good. We’ve got a 3 dimensional point. What next?

class Point3D(object):
    def __init__(self, x, y, z):

Well, that’s a bit unfortunate. I just want a holder for a little bit of data, and I’ve already had to override a special method from the Python runtime with an internal naming convention? Not too bad, I suppose; all programming is weird symbols after a fashion.

At least I see my attribute names in there, that makes sense.

class Point3D(object):
    def __init__(self, x, y, z):
        self.x

I already said I wanted an x, but now I have to assign it as an attribute...

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x

... to x? Uh, obviously ...

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

... and now I have to do that once for every attribute, so this actually scales poorly? I have to type every attribute name 3 times?!?

Oh well. At least I’m done now.

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):

Wait what do you mean I’m not done.

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))

Oh come on. So I have to type every attribute name 5 times, if I want to be able to see what the heck this thing is when I’m debugging, which a tuple would have given me for free?!?!?

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)

7 times?!?!?!?

class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)
    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) < (other.x, other.y, other.z)

9 times?!?!?!?!?

from functools import total_ordering
@total_ordering
class Point3D(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return (self.__class__.__name__ +
                ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)
    def __lt__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return (self.x, self.y, self.z) < (other.x, other.y, other.z)

Okay, whew - 2 more lines of code isn’t great, but now at least we don’t have to define all the other comparison methods. But now we’re done, right?

from unittest import TestCase
class Point3DTests(TestCase):

You know what? I’m done. 20 lines of code so far and we don’t even have a class that does anything; the hard part of this problem was supposed to be the quaternion solver, not “make a data structure which can be printed and compared”. Piles of undocumented garbage tuples, lists, and dictionaries it is; defining proper data structures well is way too hard in Python. [2]


namedtuple to the (not really) rescue

The standard library’s answer to this conundrum is namedtuple. While a valiant first draft (it bears many similarities to my own somewhat embarrassing and antiquated entry in this genre) namedtuple is unfortunately unsalvageable. It exports a huge amount of undesirable public functionality which would be a huge compatibility nightmare to maintain, and it doesn’t address half the problems that one runs into. A full enumeration of its shortcomings would be tedious, but a few of the highlights:

  • Its fields are accessible as numbered indexes whether you want them to be or not. Among other things, this means you can’t have private attributes, because they’re exposed via the apparently public __getitem__ interface.
  • It compares equal to a raw tuple of the same values, so it’s easy to get into bizarre type confusion, especially if you’re trying to use it to migrate away from using tuples and lists.
  • It’s a tuple, so it’s always immutable. Sort of.
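
To make the first two of those points concrete, here’s a quick sketch (standard library only):

>>> from collections import namedtuple
>>> Point3D = namedtuple('Point3D', ['x', 'y', 'z'])
>>> p = Point3D(1, 2, 3)
>>> p[0]            # fields leak out as positional indexes
1
>>> p == (1, 2, 3)  # and any bare tuple of the same values matches
True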

As to that last point, either you can use it like this:

Point3D = namedtuple('Point3D', ['x', 'y', 'z'])

in which case it doesn’t look like a type in your code; simple syntax-analysis tools without special cases won’t recognize it as one. You can’t give it any other behaviors this way, since there’s nowhere to put a method. Not to mention the fact that you had to type the class’s name twice.

Alternately you can use inheritance and do this:

class Point3D(namedtuple('_Point3DBase', 'x y z'.split())):
    pass

This gives you a place you can put methods, and a docstring, and generally have it look like a class, which it is... but in return you now have a weird internal name (which, by the way, is what shows up in the repr, not the class’s actual name). However, you’ve also silently made the attributes not listed here mutable, a strange side-effect of adding the class declaration; that is, unless you add __slots__ = 'x y z'.split() to the class body, and then we’re just back to typing every attribute name twice.
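
You can watch that side-effect happen; a quick sketch (the subclass quietly grows a __dict__, so brand-new attributes can be bolted on):

>>> class Point3D(namedtuple('_Point3DBase', 'x y z'.split())):
...     pass
>>> p = Point3D(1, 2, 3)
>>> p.w = 4   # no error: w was never declared, but it sticks anyway
>>> p.w
4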

And this doesn’t even mention the fact that science has proven that you shouldn’t use inheritance.

So, namedtuple can be an improvement if it’s all you’ve got, but only in some cases, and it has its own weird baggage.


Enter The attr

So here’s where my favorite mandatory Python library comes in.

Let’s re-examine the problem above. How do I make Point3D with attrs?

import attr
@attr.s

Since this isn’t built into the language, we do have to have 2 lines of boilerplate to get us started: the import and the decorator saying we’re about to use it.

import attr
@attr.s
class Point3D(object):

Look, no inheritance! By using a class decorator, Point3D remains a Plain Old Python Class (albeit with some helpful double-underscore methods tacked on, as we’ll see momentarily).

import attr
@attr.s
class Point3D(object):
    x = attr.ib()

It has an attribute called x.

import attr
@attr.s
class Point3D(object):
    x = attr.ib()
    y = attr.ib()
    z = attr.ib()

And one called y and one called z and we’re done.

We’re done? Wait. What about a nice string representation?

>>> Point3D(1, 2, 3)
Point3D(x=1, y=2, z=3)

Comparison?

>>> Point3D(1, 2, 3) == Point3D(1, 2, 3)
True
>>> Point3D(3, 2, 1) == Point3D(1, 2, 3)
False
>>> Point3D(3, 2, 3) > Point3D(1, 2, 3)
True

Okay sure but what if I want to extract the data defined in explicit attributes in a format appropriate for JSON serialization?

>>> attr.asdict(Point3D(1, 2, 3))
{'y': 2, 'x': 1, 'z': 3}

Maybe that last one was a little on the nose. But nevertheless, it’s one of many things that becomes easier because attrs lets you declare the fields on your class, along with lots of potentially interesting metadata about them, and then get that metadata back out.

>>> import pprint
>>> pprint.pprint(attr.fields(Point3D))
(Attribute(name='x', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None),
 Attribute(name='y', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None),
 Attribute(name='z', default=NOTHING, validator=None, repr=True, cmp=True, hash=True, init=True, convert=None))

I am not going to dive into every interesting feature of attrs here; you can read the documentation for that. Plus, it’s well-maintained, so new goodies show up every so often and I might miss something important. But attrs does a few key things that, once you have them, you realize that Python was sorely missing before:

  1. It lets you define types concisely, as opposed to the normally quite verbose manual def __init__.... Types without typing.
  2. It lets you say what you mean directly with a declaration rather than expressing it in a roundabout imperative recipe. Instead of “I have a type, it’s called MyType, it has a constructor, in the constructor I assign the property ‘A’ to the parameter ‘A’ (and so on)”, you say “I have a type, it’s called MyType, it has an attribute called a”, and behavior is derived from that fact, rather than having to later guess about the fact by reverse engineering it from behavior (for example, running dir on an instance, or looking at self.__class__.__dict__).
  3. It provides useful default behavior, as opposed to Python’s sometimes-useful but often-backwards defaults.
  4. It adds a place for you to put a more rigorous implementation later, while starting out simple.

Let’s explore that last point.

Progressive Enhancement

While I’m not going to talk about every feature, I’d be remiss if I didn’t mention a few of them. As you can see from those mile-long repr()s for Attribute above, there are a number of interesting ones.

For example: you can validate attributes when they are passed into an @attr.s-ified class. Our Point3D, for example, should probably contain numbers. For simplicity’s sake, we could say that that means instances of float, like so:

import attr
from attr.validators import instance_of
@attr.s
class Point3D(object):
    x = attr.ib(validator=instance_of(float))
    y = attr.ib(validator=instance_of(float))
    z = attr.ib(validator=instance_of(float))
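
Now construction enforces the declaration. Roughly, treating the exact message as attrs’ own (a sketch, not verbatim output):

>>> Point3D(1.0, 2.0, 3.0)
Point3D(x=1.0, y=2.0, z=3.0)
>>> Point3D(1.0, 2.0, "3")
Traceback (most recent call last):
  ...
TypeError: 'z' must be <type 'float'> (got '3' that is a <type 'str'>)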

The fact that we were using attrs means we have a place to put this extra validation: we can just add type information to each attribute as we need it. Some of these facilities let us avoid other common mistakes. For example, this is a popular “spot the bug” Python interview question:

class Bag:
    def __init__(self, contents=[]):
        self._contents = contents
    def add(self, something):
        self._contents.append(something)
    def get(self):
        return self._contents[:]
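
If the bug isn’t leaping out at you, here’s a quick sketch of it biting (hypothetical session):

>>> a = Bag()
>>> a.add('sandwich')
>>> b = Bag()
>>> b.get()
['sandwich']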

The problem is that the default list is created just once, when the function is defined, so contents inadvertently becomes shared state: all Bag objects not provided with a different list share the same list. Fixing it, of course, becomes this:

class Bag:
    def __init__(self, contents=None):
        if contents is None:
            contents = []
        self._contents = contents

adding two extra lines of code. With attrs this instead becomes:

@attr.s
class Bag:
    _contents = attr.ib(default=attr.Factory(list))
    def add(self, something):
        self._contents.append(something)
    def get(self):
        return self._contents[:]

attrs provides several other opportunities to make your classes both more convenient and more correct. Another great example? If you want to be strict about extraneous attributes on your objects (or more memory-efficient on CPython), you can just pass slots=True at the class level - e.g. @attr.s(slots=True) - to automatically turn your existing attrs declarations into a matching __slots__ attribute. All of these handy features allow you to make better and more powerful use of your attr.ib() declarations.
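
A minimal sketch of what slots=True buys you (the traceback text is CPython’s, shown approximately):

>>> @attr.s(slots=True)
... class Point3D(object):
...     x = attr.ib()
...     y = attr.ib()
...     z = attr.ib()
...
>>> p = Point3D(1, 2, 3)
>>> p.w = 4
Traceback (most recent call last):
  ...
AttributeError: 'Point3D' object has no attribute 'w'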


The Python Of The Future

Some people are excited about eventually being able to program in Python 3 everywhere. What I’m looking forward to is being able to program in Python-with-attrs everywhere. It exerts a subtle, but positive, design influence in all the codebases I’ve seen it used in.

Give it a try: you may find yourself surprised at places where you’ll now use a tidily explained class, where previously you might have used a sparsely-documented tuple, list, or a dict, and endured the occasional confusion from co-maintainers. Now that it’s so easy to have structured types that clearly point in the direction of their purpose (in their __repr__, in their __doc__, or even just in the names of their attributes), you might find you’ll use a lot more of them. Your code will be better for it; I know mine has been.


  1. Scare quotes here because the attributes aren’t meaningfully exposed to the caller, they’re just named publicly. This pattern, getting rid of private methods entirely and having only private attributes, probably deserves its own post... 

  2. And we hadn’t even gotten to the really exciting stuff yet: type validation on construction, default mutable values... 

Syndicated 2016-08-12 09:47:00 from Deciphering Glyph

12 Aug 2016 glyph   » (Master)

Remember that thing I said in my pycon talk about native packaging being the main thing to worry about, and single-file binaries being at best a stepping stone to that and at worst a bit of a red herring? You don’t have to take it from me. From the authors of a widely-distributed command-line application that was rewritten from Python into Go specifically for easier distribution, and then rewritten back into Python:

... [the] majority of people prefer native packages so distributing precompiled binaries wasn’t a big win for this type of project1 ...

I don’t want to say “I told you so”, but... no. Wait a second. That is exactly what I want to do. That is what I am doing.

I told you so.


  1. Marcin Kulik, ‘1.3 aka “And Now for Something Completely Different”’, asciinema blog 

Syndicated 2016-08-12 03:37:00 from Deciphering Glyph

11 Aug 2016 mjg59   » (Master)

Microsoft's compromised Secure Boot implementation

There's been a bunch of coverage of this attack on Microsoft's Secure Boot implementation, a lot of which has been somewhat confused or misleading. Here's my understanding of the situation.

Windows RT devices were shipped without the ability to disable Secure Boot. Secure Boot is the root of trust for Microsoft's User Mode Code Integrity (UMCI) feature, which is what restricts Windows RT devices to running applications signed by Microsoft. This restriction is somewhat inconvenient for developers, so Microsoft added support in the bootloader to disable UMCI. If you were a member of the appropriate developer program, you could give your device's unique ID to Microsoft and receive a signed blob that disabled image validation. The bootloader would execute a (Microsoft-signed) utility that verified that the blob was appropriately signed and matched the device in question, and would then insert it into an EFI Boot Services variable[1]. On reboot, the boot loader reads the blob from that variable and integrates that policy, telling later stages to disable code integrity validation.

The problem here is that the signed blob includes the entire policy, and so any policy change requires an entirely new signed blob. The Windows 10 Anniversary Update added a new feature to the boot loader, allowing it to load supplementary policies. These must also be signed, but aren't tied to a device id - the idea is that they'll be ignored unless a device-specific policy has also been loaded. This way you can get a single device-specific signed blob that allows you to set an arbitrary policy later by using a combination of supplementary policies.

This is all fine in the Anniversary Edition. Unfortunately older versions of the boot loader will happily load a supplementary policy as if it were a full policy, ignoring the fact that it doesn't include a device ID. The loaded policy replaces the built-in policy, so in the absence of a base policy a supplementary policy as simple as "Enable this feature" will effectively remove all other restrictions.

Unfortunately for Microsoft, such a supplementary policy leaked. Installing it as a base policy on pre-Anniversary Edition boot loaders will then allow you to disable all integrity verification, including in the boot loader. Which means you can ask the boot loader to chain to any other executable, in turn allowing you to boot a compromised copy of any operating system you want (not just Windows).

This does require you to be able to install the policy, though. The PoC released includes a signed copy of SecureBootDebug.efi for ARM, which is sufficient to install the policy on ARM systems. There doesn't (yet) appear to be a public equivalent for x86, which means it's not (yet) practical for arbitrary attackers to subvert the Secure Boot process on x86. I've been doing my testing on a setup where I've manually installed the policy, which isn't practical in an automated way.

How can this be prevented? Installing the policy requires the ability to run code in the firmware environment, and by default the boot loader will only load signed images. The number of signed applications that will copy the policy to the Boot Services variable is presumably limited, so if the Windows boot loader supported blacklisting second-stage bootloaders Microsoft could simply blacklist all policy installers that permit installation of a supplementary policy as a primary policy. If that's not possible, they'll have to blacklist the vulnerable boot loaders themselves. That would mean all pre-Anniversary Edition install media would stop working, including recovery and deployment images. That's, well, a problem. Things are much easier if the first case is true.

Thankfully, if you're not running Windows this doesn't have to be an issue. There are two commonly used Microsoft Secure Boot keys. The first is the one used to sign all third party code, including drivers in option ROMs and non-Windows operating systems. The second is used purely to sign Windows. If you delete the second from your system, Windows boot loaders (including all the vulnerable ones) will be rejected by your firmware, but non-Windows operating systems will still work fine.

From what we know so far, this isn't an absolute disaster. The ARM policy installer requires user intervention, so if the x86 one is similar it'd be difficult to use this as an automated attack vector[2]. If Microsoft are able to blacklist the policy installers without blacklisting the boot loader, it's also going to be minimally annoying. But if it's possible to install a policy without triggering any boot loader blacklists, this could end up being embarrassing.

Even outside the immediate harm, this is an interesting vulnerability. Presumably when the older boot loaders were written, Microsoft policy was that they would never sign policy files that didn't include a device ID. That policy changed when support for supplementary policies was added. Without this policy change, the older boot loaders could still be considered secure. Adding new features can break old assumptions, and your design needs to take that into account.

[1] EFI variables come in two main forms - those accessible at runtime (Runtime Services variables) and those only accessible in the early boot environment (Boot Services variables). Boot Services variables can only be accessed before ExitBootServices() is called, and in Secure Boot environments all code executing before this point is (theoretically) signed. This means that Boot Services variables are nominally tamper-resistant.

[2] Shim has explicit support for allowing a physically present machine owner to disable signature validation - this is basically equivalent


Syndicated 2016-08-11 21:58:04 from Matthew Garrett

11 Aug 2016 glyph   » (Master)

Hello lazyweb,

I want to run some “legacy” software (Trac, specifically) on a Swarm cluster. The files that it needs to store are mostly effectively write-once (it’s the attachments database) but may need to be deleted (spammers and purveyors of malware occasionally try to upload things for spamming or C&C) so while mutability is necessary, there’s a very low risk of any write contention.

I can’t use a networked filesystem, or any volume drivers, so no easy-mode solutions. Basically I want to be able to deploy this on any swarm cluster, so no cheating and fiddling with the host.

Is there any software that I can easily run as a background daemon, synchronizing the directory of a data volume between N containers, where N is the number of hosts in my cluster?

I found this but it strikes me as ... somehow wrong ... to use that as a critical part of my data-center infrastructure. Maybe it would actually be a good start? But in addition to not being really designed for this task, it’s also not open source, which makes me a little nervous. This list, or even this smaller one, is huge and bewildering. So I was hoping you could drop me a line if you’ve got an idea what I could use for this.

Syndicated 2016-08-11 05:28:00 from Deciphering Glyph

10 Aug 2016 zeenix   » (Journeyer)

Life is change

Quite a few major life events happened (or are happening) this summer, so I thought I'd blog about them and some of the experiences I had.

New job & new city/country

Yes, I found it hard to believe too that I'd ever be leaving Red Hat and the best manager I ever had (no offence to others, but competing with Matthias is just impossible), but I'll be moving to Gothenburg to join the Pelagicore folks as a Software Architect in just 2 weeks. I have always found Swedish to be a very cute language, so I'm looking forward to my attempt at learning it. If only I had learnt Swedish rather than Finnish when I was in Finland.

BTW, I'm selling all my furniture so if you're in London and need some furniture, get in touch!

Fresh helicopter pilot

So after two years of hard work and sinking myself into bank loans, I finally did it! Last week, I passed the skills test for the Private Pilot License (Helicopters) and I'm currently waiting anxiously for my license to come through (it usually takes at least two weeks). Once I have that, I can rent helicopters and take passengers with me. I'll be able to share the costs with passengers but I'm not allowed to make money out of it. The test was very tough and I came very close to failing at one particular point. The good news is that despite my being very tense and the very windy conditions on test day, the biggest negative point from my examiner was that I was being over-cautious and hence very slow. So I think it wasn't so bad.



There are a few differences from a driving test. A minor one is that in a driving test you are not expected to explain your steps but simply execute them, whereas in the skills test for flying you're expected to think everything out loud. But the biggest difference is that in a driving test you are not expected to drive on your own until you pass the test, whereas for the flying test you are required to have flown solo for at least 10 hours, which needs to include a solo cross-country flight of at least 100 nautical miles (185 km) involving 3 major aerodromes. Mine involved Elstree, Cranfield and Duxford. I've been GPS logging while flying, so I can show you the log of my qualifying solo cross-country flight (click here to see details and notes):



I've still got a long way to go towards a Commercial License, but at least now I can share the price with friends, so building hours towards the commercial license won't be so expensive (I hope). I've found a nice company in Gothenburg that trains in and rents helicopters, so I'm very much looking forward to flying over the coast there. Wanna join? Let me know. :)

Syndicated 2016-08-10 13:24:00 (Updated 2016-08-10 13:25:42) from zeenix

9 Aug 2016 sye   » (Journeyer)

how a Chinese poem is translated

Winding Up
by Derek Walcott
I live on the water,
alone. Without wife and children,
I have circled every possibility
to come to this:
a low house by grey water,
with windows always open
to the stale sea. We do not choose such things,
but we are what we have made.
We suffer, the years pass,
we shed freight but not our need
for encumbrances. Love is a stone
that settled on the sea-bed
under grey water. Now, I require nothing
from poetry but true feeling,
no pity, no fame, no healing. Silent wife,
we can sit watching grey water,
and in a life awash
with mediocrity and trash
live rock-like.
I shall unlearn feeling,
unlearn my gift. That is greater
and harder than what passes there for life.

syndicated from nuniabiz.blogspot.com

Syndicated 2016-08-09 16:05:00 (Updated 2016-08-09 16:05:50) from badvogato

9 Aug 2016 glyph   » (Master)

I like keeping a comprehensive and accurate address book that includes all past email addresses for my contacts, including those which are no longer valid. I do this because I want to be able to see conversations stretching back over the years as originating from that person.

Unfortunately this causes problems when sending mail sometimes. On macOS, at least as of El Capitan, neither the Mail application nor the Contacts application has any mechanism that I’ve been able to find for indicating the preference-order of email addresses. Compounding this annoyance, when completing a recipient’s address based on their name, Mail displays all email addresses for a contact without showing their labels, which means that even if I label one “preferred” or “USE THIS ONE NOW”, or “zzz don’t use this hasn’t worked since 2005”, I can’t tell when I’m sending a message.

But it seems as though it defaults to sending messages to the most recent outgoing address for that contact that it can see in an email. For people I send email to regularly, this is easy enough. For people who I’m aware have changed their email address, but where I don’t actually want to send them a message, I think I figured out a little hack that makes it work: make a new folder called “Preferred Addresses Hack” (or something suitable), compose a new message addressed to the correct address, then drag the message out of drafts into the folder; since it has a recent date and is addressed to the right person, Mail.app will index it and auto-complete the correct address in the future.

However, since the previous behavior appeared somewhat non-deterministic, I might be tricking myself into believing that this hack worked. If you can confirm it, I’d appreciate it if you would let me know.

Syndicated 2016-08-08 23:24:00 from Deciphering Glyph

7 Aug 2016 sye   » (Journeyer)

mod_virgule running on Amazon Web Services

okay. today try syndication from another brother and see how they merge...

just a note to test syndication; the blog feed to my running instance of mod_virgule isn't quite working yet...


syndicated from nuniabiz.blogspot.com

Syndicated 2016-08-07 16:24:00 (Updated 2016-08-07 16:35:20) from badvogato

5 Aug 2016 badvogato   » (Master)

Happy belated birthday congratulations to our president Obama. And our VP's twitter was as cute as my 5-year-old's classic craft ... hope his talk on the Syria peace deal is effective. Have a nice day, everyone!

4 Aug 2016 hands   » (Master)

EOMA68: > $60k pledged on crowdsupply.com

crowdsupply.com has a campaign to fund production of EOMA68 computer cards (and associated peripherals) which recently passed the $60,000 mark.

If you were at DebConf13 in Switzerland, you may have seen me with some early prototypes that I had been lent to show people.

The inside of the A20 EOMA68 computer board

The concept: build computers on a PCMCIA physical form-factor, thus confining most of the hardware and software complexity in a single replaceable item, decoupling the design of the outer device from the chips that drive it.

EOMA68 pack-shot

There is a lot more information about this at crowdsupply, and at http://rhombus-tech.net/ -- I hope people find it interesting enough to sign up.

BTW, while I host Rhombus Tech's website as a favour to Luke Leighton, I have no financial links with them.

Syndicated 2016-08-04 22:04:02 from chezfil

4 Aug 2016 iddekingej   » (Observer)

When Slashdot was a hub for news and FOSS discussion.


Interview with Timothy Lord (last of the early Slashdot editors)

3 Aug 2016 iddekingej   » (Observer)

I'm doing now

  • Writing a program for displaying information about block devices, RAID, etc. for Linux: bdgui
  • And a GUI program that displays open files for Linux: ofgui

3 Aug 2016 yosch   » (Master)

URW++ re-releases open fonts in MuPDF bundle

The URW++ foundry has re-released under the Open Font License (OFL) the core set of fonts for PDF rendering (via PostScript/GhostScript): the special subset of Nimbus bundled with the MuPDF reader by Artifex.

2 Aug 2016 marnanel   » (Journeyer)

"I saw the wicked in such prosperity"

Psalm 73 was in this morning's readings. The poetry remains bitterly relevant to today's society.

For I was envious of the proud;
I saw the wicked in such prosperity;
for they suffer no pains
and their bodies are sleek and sound.
They come to no misfortune like other folk;
nor are they plagued as others are.

Therefore pride is their necklace
and violence wraps them like a cloak.
And so the people turn to them
and find in them no fault.
Behold, these are the wicked;
ever at ease, they increase their wealth.

This entry was originally posted at http://marnanel.dreamwidth.org/375009.html. Please comment there using OpenID.

Syndicated 2016-08-02 16:28:41 from Monument
