Recent blog entries for cbbrowne

Nexus 7 on CyanogenMod

At last…

I had been lazy, leaving all alone.

In February, I figured I was heading off for a chunk of the month on a cruise, and wanted the tablet for multimedia, without network access, so it seemed wise not to spend time fiddling with configuration at the risk of mussing things up.

Alas, the OTA upgrade to JellyBean did a certain chunk of mussing…  It busted SuperUser access, thereby breaking Titanium Backup.  No backups went properly since :-( .

So, today seemed right timing.  I wanted backup, and needed root, the latter looking like a fight.  Ah well, go for gusto, see what we get without it…

I had to upgrade adb to support latest Android…   Got Clockwork Recovery in place, and zip files for CM10.1 and Google Apps…

The last backup was Feb 16, but happily the files still remained after fresh CM10.1 installation, so I could do a good chunk of recovery of apps, and in plenty of cases, this was basically network configuration, so apps would update their own data upon startup.  Sweet!

Superuser is nicely integrated into CM10, also sweet, no extra installation process.

I’ll need to reconfigure the launcher, due to the shift from ADW (I had a license) to built-in Trebuchet on CM10, but that seems like the “worst” irritation, and one I can well live with.

I’m not sure I can readily identify big differences between stock Android and CM10, but there are nice small creature comforts my CM10 phone has gotten me used to, like a quick “turn on/off WiFi” directly on notification screens.  Small but I like it.

Syndicated 2013-03-09 00:18:21 from linuxdatabases.info

Mailman subscriber lists

As part of “due diligence” for some mailing lists I am involved with (for Slony, see slony-backups), I discovered the need to dump out Mailman mailing list subscribers.

There is a script to do this, written in Python, mentioned on the Mailman wiki, accessible as mailman-subscribers.py.

I’d kind of rather have something a bit more version-tracked, so I poked around at GitHub, and found larsks / mailman-subscribers.

That was a little out of date; the last code was from a couple of years ago, so I forked, updated to the latest, and suggested that “larsks” pull it, which he did, quite quickly.

The “kudos” bit is that I noticed a bit of a blemish, in that the mailing list password was required to be on the command line, thereby making it visible to anyone with access to /usr/bin/ps on one’s system. I submitted a feature request, and Lars was so kind as to have this feature added so quickly that by the time I had the prototype of my Slony “subscriber backup” script working, I immediately needed to change it to make use of the lovely new password-in-file feature. Nice!
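The password-in-file idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the actual mailman-subscribers code: the option name --password-file and the helper names read_password and parse_args are hypothetical.

```python
import argparse
from pathlib import Path


def read_password(path):
    """Return the password stored in `path`, minus any trailing newline."""
    return Path(path).read_text(encoding="utf-8").rstrip("\r\n")


def parse_args(argv=None):
    """Parse a list name plus the path of a file holding the password."""
    # --password-file is a hypothetical option name used for illustration;
    # it is not taken from the real script.
    parser = argparse.ArgumentParser(description="Dump list subscribers")
    parser.add_argument("listname")
    parser.add_argument("--password-file", required=True,
                        help="file containing the list admin password")
    return parser.parse_args(argv)
```

With this shape, only the path of the file appears in the process's argument list, so ps shows the path rather than the password. The file itself still needs restrictive permissions.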

Syndicated 2013-02-27 18:32:42 from linuxdatabases.info

Installing git-annex from Debian unstable


I happen to be a supporter of Joey Hess’ Git Annex Kickstarter project; no big bucks, but it seemed a good thing to help out.

I got in the stickers, that were my “project reward,” and figured I should start playing with the new results. I’m particularly keen on the planned Android client, but I should make some use of it before that comes available.

There’s good news, and bad news:

Good news
He has added in an assistant to provide interactive help in setting up repositories. It’s included in debian unstable, in a version released September 24th.
Bad news
I generally prefer using packages from debian testing, and it has a version released July 24th, well before any of this, and without any of Joey’s recent enhancements.

Fortunately, drawing in the September (unstable) version isn’t too terribly difficult. My /etc/apt/preferences.d/simple configuration has Pin-Priority values that prefer stable over testing, testing over unstable, and unstable over experimental (where enormous potential for breakage lies!).

As a consequence, installing the unstable version is pretty easy, albeit involving an option I had to go looking for:

root@cbbrowne:~# apt-get -t unstable install git-annex
... leads to loading ...
Get:1 http://ftp.us.debian.org/debian/ unstable/main git-annex amd64 3.20120924 [7,411 kB]

And, with a run of % git annex webapp, it’s up and running!

Syndicated 2012-10-12 15:06:31 from linuxdatabases.info

Netboot via PXE

Netboot via PXE 2012-03-13 Tue

Some notes

To get this to work, you need…

BIOS ROM that supports PXE
True for most modern motherboards and/or NICs
DHCP server
To manage passing out configuration such as IP addresses and the next-server attribute.
TFTP server
With images
???
It looks for images based on most-to-least specific configuration
  • MAC address
  • IP subnet
  • Default
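The most-to-least specific lookup can be sketched as below. This assumes the file-naming scheme used by PXELINUX (a MAC-based name, then the IPv4 address in hex with progressively shorter prefixes, then a default file); the notes above do not say which boot loader was discussed, and the function name pxe_config_candidates is made up for illustration.

```python
import ipaddress


def pxe_config_candidates(mac, ip):
    """List config file names to try, from most to least specific."""
    # Most specific: the client's MAC address, prefixed with ARP type 01 (Ethernet).
    names = ["01-" + mac.lower().replace(":", "-")]
    # Next: the IPv4 address as 8 uppercase hex digits, then ever-shorter
    # prefixes of it, so that one file can cover a whole range of addresses.
    hex_ip = "{:08X}".format(int(ipaddress.IPv4Address(ip)))
    names.extend(hex_ip[:n] for n in range(8, 0, -1))
    # Least specific: the catch-all file.
    names.append("default")
    return names
```

For a client with MAC 88:99:AA:BB:CC:DD at 192.168.2.91, the list starts with 01-88-99-aa-bb-cc-dd, continues with C0A8025B and its shorter prefixes, and ends with default.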

Some things PXE doesn’t support

It was created as a standard in 1999, and hasn’t been updated much since, so there are things that postdate it, and that are thus not supported.

WIFI
Likely to be troublesome anyways, as you surely want some authentication to get onto a WIFI network
IPv6
It wasn’t clear that it yet mattered in 1999…
DNS
It works with IP addresses only

DHCP discussion

  • Go look for next-server attribute
  • Some discussion of handling sharing subnets across a redundant set of DHCP servers

More worth looking at

Inquisitor
OSS hardware testing tool that’s better than memtest
gPXE
OSS bootloader
  • Supports DNS, so can forward requests broadly potentially anywhere
  • Can transfer data across additional protocols, such as HTTP, HTTPS, SAN (iSCSI, AoE)
  • Can support WIFI
  • Possibly IPv6

Syndicated 2012-03-14 19:47:00 from linuxdatabases.info

Subversion “deprecation”

I was a bit tickled by the characterization I saw today in the new Subversion release, describing the deprecation of version 1.5:

The Subversion 1.5.x line is no longer supported. This doesn't mean
that your 1.5 installation is doomed; if it works well and is all you
need, that's fine. "No longer supported" just means we've stopped
accepting bug reports against 1.5.x versions, and will not make any
more 1.5.x bugfix releases.

They aren’t telling us the world will end for anyone using version 1.5, just that they don’t intend to provide support anymore.

Which seems like a fine thing. Version 1.5 is 3 years old, and, when they seem to be releasing about a version per year (1.0 in 2004, 1.7 in 2011), 3 years of backwards support doesn’t seem dramatically insufficient. Particularly if, when support goes away, you’re not inherently doomed!

Syndicated 2011-10-11 19:55:00 from linuxdatabases.info

PostgreSQL 9.1 now available

Making for some reasonably good news on 9/11, the next version of PostgreSQL, version 9.1, has been released.

Major enhancements include:

Synchronous replication
continuing the enhancements to built-in WAL-based replication
Per-column collations
to support linguistically-correct sorting down to the column level
Unlogged tables
improving performance for the handling of ephemeral data (e.g. – caches)
K-Nearest-Neighbor Indexing
indexing on distances for geographical and text-search queries
Serializable Snapshot Isolation
implementing “true serializability”
Writable Common Table Expressions
recursive and similar queries can now update data
Security Enhanced Postgres
Similar to SE-Linux, providing Mandatory Access Controls for higher grade security
Foreign Data Wrappers
attach to other databases and data sources
Extensions
managing deployment of additional database features

Many of these continue the trend of enhancing features added in earlier versions (e.g. – synchronous replication, KNN, Writable CTEs).

Some introduce new kinds of functionality (e.g. – SE-Postgres, FDW, Extensions), where new seeds are sown, that we may expect to flower into further new features in future versions.

Syndicated 2011-09-12 17:00:00 from linuxdatabases.info

Music Playing

My latest “musical experiment” is with Clementine, which was recently added to Debian.

I should note things that I have used in the past, and some areas of past pain:

XMMS
Which has often been nice enough, but which has grown long in the tooth.
XMMS2
Which takes the desirable step of being a client/server system which admits the availability of a bunch of backends. I have, when using it, tended to prefer the shell backend.
Amarok
An “all singing, all dancing” option…
  • It uses KDE, which I’m historically not terribly keen on
  • It has libraries that are evidently clever enough to pull music off my iPod Touch as long as it’s plugged into a USB dock
  • It has the “KDE integration” that seems to want to have widgets integrating into some “KDE-compliant” window manager. I’m running StumpWM, which is decidedly not a KDE thing, so controlling Amarok always seems like a bit of a crapshoot…
  • I have played a bit with the “playlist” functionality; it hasn’t yet agreed with me…

At any rate, I saw Clementine listed as “new in Debian,” so thought I’d take a peek. I’m liking what I see thus far:

  • Onscreen widgets for all the sorts of things that need to be controlled, including
    • Managing music library, so as to add things
  • Like Amarok, it can see my iPod whenever it’s plugged in, and can play that music through the computer
  • It easily grabbed album covers (I’m not sure what service it’s using) for most of my music
  • Onscreen controls seem pretty reasonable, though I kind of wish the volume control was larger, as that’s something one wants most frequently to fiddle with.
  • There’s a cool visualization widget (think “equalizer”)

Seems pretty likable thus far…

Syndicated 2011-06-15 21:46:00 from linuxdatabases.info

What’s Up Lately With Slony?

What’s up Lately? 2011-04-12 Tue

Git Changeover

In July 2010, we switched over to use Git, which has been working out quite fine so far. The official repository is at git.postgresql.org; note that some developers are publishing their repositories publicly at GitHub.

You can find details at those “private” repositories of branches that the developers have opened to work on various bug fixes and features.

The next big version

We have been working on what seems most likely to be called the “2.1 release.”

  • There are quite a lot of fixes and enhancements already in place. We have been quite faithful about integrating release notes in as changes are made, so Master RELEASE notes should be quite accurate in representing what has changed. Some highlights include:
    • Changes to queries against sl_log_* tables improve performance when undergoing large backlog
    • Slonik now supports commands to bulk-add tables and sequences
    • Integration of clustertest framework that does rather more sophisticated tests, obsolescing previous “ducttape” and shell script tests.
    • Cleanup of a bunch of things
      • Use named parameters in all functions.
      • Dropped SNMP support that doesn’t seem to run anymore, and which was never part of any regression tests.
  • It is unlikely that it will get dubbed “version 3,” as there aren’t the sorts of deep changes that would warrant such.
    • The database schema has not materially changed in any way that would warrant re-initializing clusters, as was the case between version 1.2 and 2.0.
    • The changes generally aren’t really huge, with the exception of a couple of features that aren’t quite ready yet (which deserve their own separate discussion)

Still Outstanding

There are two features being worked on, which we hoped would be ready around the time of PGCon 2011:

Implicit WAIT FOR EVENT
This feature causes most Slonik commands to wait for whatever event responses should be received before they may be considered properly finished. For instance SUBSCRIBE SET would wait until the subscription has been completed before proceeding.
Multinode FAIL OVER
For clusters where there are multiple origins for different sets, this allows reshaping the entire cluster properly, which has historically been rather more troublesome than people usually were able to recognize.

Unfortunately, neither of these is quite ready yet. It is conceivable that the automatic waiting may be mostly ready, but complications and interruptions have gotten in the way of completion of multinode failover.

When will 2.1 be ready?

Three possibilities seem to present themselves:

  1. Release what we’ve got as 2.1, let the outstanding items arrive in a future version. Unfortunately, this would seem to dictate that we support a “version 2.1” for an extended period of time, complete with the trouble and effort of backpatching. It’s not very attractive.
  2. Draw in Implicit WAIT FOR EVENT, which would make for a substantially more featureful 2.1, and let multinode FAIL OVER come along later. We had been hoping that there would be common functionality between these two features, so had imagined it a bad idea to do one without the other. But perhaps that’s wrong, and Implicit WAIT FOR EVENT doesn’t need multinode failover to be meaningful. That does seem like it may be true.

    There is still the same issue as with 1. above, that this would mean having an extra version of Slony to support, which isn’t something anyone is too keen on.

  3. Wait until it’s all ready. This gets rid of the version proliferation problem, but means that it’s going to be a while (several months, perhaps quite a few) before users may benefit from any of these enhancements.

    Development of the failover facility seems like it will be bottlenecked for a while on Jan, so this suggests that it may be timely to solicit features that Steve and I might work on concurrently in the interim.

So, what might still go into 2.1?

  • We periodically get bug reports from people about this and that, and minor things will certainly get drawn in, particularly if they represent incorrect behaviour.
  • ABORT script: I plan to send a note out soon describing my thoughts thus far.
  • Cluster Analysis Tooling: I think it would be pretty neat to connect to a Slony cluster, pull out some data, and generate some web pages and GraphViz diagrams to characterize the status and health of the cluster.
  • There was evidently discussion at PGEast about trying to get the altperl scripts improved/cleaned up. My personal opinion (cbbrowne) is that they’re not quite general enough, and that making them so would be more trouble than it’s worth, so my “vote” would be to deprecate them.

    But that is certainly not the only opinion out there – there are apparently others that regularly use them.

    While I’m not keen on putting effort into them, if there is some consensus on what to do, I’d go along with it. That might include:

    • Adding scripts to address slonik features that have not thus far been included in altperl.
    • Integrating tests into the set of tests run using the clustertest framework, so that we have some verification that this stuff works properly.
  • Insert Your Pet Feature Here? Maybe there’s some low hanging fruit that we’re not aware of that’s worth poking at.

Syndicated 2011-04-12 19:26:00 from linuxdatabases.info

Fast COUNT(*) in PostgreSQL

One of the frequently-asked questions about PostgreSQL is “why is SELECT COUNT(*) FROM some_table doing a slow sequential scan?”

This has been asked repeatedly on mailing lists everywhere, and the common answer in the FAQ provides a fine explanation which I shall not repeat. There is some elaboration on slow counting.

Regrettably, the proposed alternative solutions aren’t always quite so fine. The one that is most typically pointed out is this one, Tracking the row count.

How Tracking the row count works

The idea is fine, at least at first blush:

  • Set up a table that captures row counts
CREATE TABLE rowcounts (
  table_name text not null primary key,
  total_rows bigint);
  • Initialize row counts for the desired tables
DELETE FROM rowcounts WHERE table_name = 'my_table';
INSERT INTO ROWCOUNTS (table_name, total_rows) SELECT 'my_table', count(*) from my_table;
  • Establish trigger function on my_table which has the following logic
if tg_op = 'INSERT' then
   update rowcounts set total_rows = total_rows + 1
     where table_name = 'my_table';
elsif tg_op = 'DELETE' then
   update rowcounts set total_rows = total_rows - 1
     where table_name = 'my_table';
end if;
  • If you want to know the size of my_table, then query
SELECT total_rows FROM rowcounts WHERE table_name = 'my_table';

The problem with this approach

On the face of it, it looks fine, but regrettably, it doesn’t work out happily under conditions of concurrency. If there are multiple connections trying to INSERT or DELETE on my_table, concurrently, then all require an exclusive lock on the tuple in rowcounts for my_table, and there is a risk (heading towards unity) of:

  1. Deadlock, if different connections access data in incompatible orderings
  2. Lock contention, leading to delays
  3. If some of the connections are running in SERIALIZABLE mode, rollbacks due to inability to serialize this update

So, there is risk of delay, or, rather worse, that this counting process causes otherwise perfectly legitimate transactions to fail. Eek!

A non-locking solution

I suggest a different approach, which eliminates the locking problem, in that:

  • The triggers are set up to only ever INSERT into the rowcounts
  • An asynchronous process does summarization, to shorten rowcounts
  • I’d be inclined to use a stored function to query rowcounts

Table definition

CREATE TABLE rowcounts (
    table_name text not null,
    total_rows bigint,
    id serial primary key);
create index rc_by_table on rowcounts(table_name);

I add the id column for the sake of nit-picking normalization, so that anyone that demands a primary key gets what they demand. I’d not be hugely uncomfortable with leaving it off.

Trigger strategy

The triggers have the following form:

if tg_op = 'INSERT' then
   insert into rowcounts(table_name,total_rows) values ('my_table',1);
elsif tg_op = 'DELETE' then
   insert into rowcounts(table_name,total_rows) values ('my_table',-1);
end if;

Note that since the triggers only ever INSERT into rowcounts, they no longer interact with one another in a way that would lead to locks or deadlocks.

Function to return row count

create or replace function row_count(i_table text) returns integer as $$
begin
   -- PL/pgSQL evaluates this as "SELECT sum(total_rows) FROM rowcounts WHERE ..."
   return sum(total_rows) from rowcounts where table_name = i_table;
end
$$ language plpgsql;

It would be tempting to have this function itself do a “shortening” of the table, but, that would reintroduce into the application the locking that we were wanting to avoid. So DELETE/UPDATE are still deferred.

Function to clean up row counts table

This function needs to be run once in a while to summarize the table contents.

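create or replace function rowcount_cleanse() returns integer as $$
declare
   prec record;
begin
   -- Collapse each table's delta rows into a single summary row.
   -- Caution: under READ COMMITTED, a delta row committed between the select
   -- and the delete would be deleted without being counted; running this in
   -- a REPEATABLE READ transaction should avoid that.
   for prec in select table_name, sum(total_rows) as sum, count(*) as count from rowcounts group by table_name loop
       if prec.count > 1 then
          delete from rowcounts where table_name = prec.table_name;
          insert into rowcounts (table_name, total_rows) values (prec.table_name, prec.sum);
       end if;
   end loop;
   return 0;
end
$$ language plpgsql;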
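To watch the bookkeeping in action without a PostgreSQL server, here is a small Python sketch that uses an in-memory SQLite database as a stand-in. It runs on a single connection, so it shows only the arithmetic of insert-only delta rows and their periodic summarization, and says nothing about locking behaviour. The SQLite trigger syntax and the Python helpers are adaptations for illustration; the table and function names mirror the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE rowcounts (
    id INTEGER PRIMARY KEY,
    table_name TEXT NOT NULL,
    total_rows INTEGER);
CREATE INDEX rc_by_table ON rowcounts(table_name);

-- The triggers only ever INSERT a +1 or -1 delta row.
CREATE TRIGGER my_table_ins AFTER INSERT ON my_table BEGIN
    INSERT INTO rowcounts(table_name, total_rows) VALUES ('my_table', 1);
END;
CREATE TRIGGER my_table_del AFTER DELETE ON my_table BEGIN
    INSERT INTO rowcounts(table_name, total_rows) VALUES ('my_table', -1);
END;
""")


def row_count(table):
    """Sum the delta rows recorded for `table`."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(total_rows), 0) FROM rowcounts WHERE table_name = ?",
        (table,))
    return cur.fetchone()[0]


def rowcount_cleanse():
    """Collapse each table's delta rows into a single summary row."""
    with conn:  # one transaction: commit on success, roll back on error
        totals = conn.execute(
            "SELECT table_name, SUM(total_rows), COUNT(*) "
            "FROM rowcounts GROUP BY table_name").fetchall()
        for table, total, n in totals:
            if n > 1:
                conn.execute("DELETE FROM rowcounts WHERE table_name = ?", (table,))
                conn.execute(
                    "INSERT INTO rowcounts(table_name, total_rows) VALUES (?, ?)",
                    (table, total))


conn.executemany("INSERT INTO my_table(payload) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM my_table WHERE payload = 'b'")
print(row_count("my_table"))  # total after 3 inserts and 1 delete
rowcount_cleanse()
print(row_count("my_table"))  # same total, now held in a single row
```

Before the cleanse, rowcounts holds one row per insert or delete on my_table. Afterwards it holds one row per tracked table, and row_count() reports the same total either way.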

Initializing rowcounts for a table that is already populated

Nothing has yet been mentioned that would cause an initial entry to go into rowcounts for an already-populated table.

create or replace function rowcount_new_table(i_table text) returns integer as $$
declare
   query text;
begin
   delete from rowcounts where table_name = i_table;
   -- quote_literal/quote_ident splice the name into the query safely;
   -- quote_ident assumes an unqualified table name
   query := 'insert into rowcounts(table_name, total_rows) select ' || quote_literal(i_table) || ', count(*) from ' || quote_ident(i_table) || ';';
   execute query;
   return total_rows from rowcounts where table_name = i_table;
end
$$ language plpgsql;

If a table has already got data in it, then it’s necessary to populate rowcounts with an initial count, which is what the function above does. It should be run once for each such table when tracking is set up.

Further enhancements possible

It is possible to shift some of the maintenance back into the row_count() function, if we do some exception handling.

create or replace function row_count(i_table text) returns integer as $$
declare
   prec record;
begin
   begin
      -- NOWAIT raises lock_not_available instead of blocking
      lock table rowcounts nowait;
      select sum(total_rows) as sum, count(*) as count into prec
        from rowcounts where table_name = i_table;
      if prec.count > 1 then
          delete from rowcounts where table_name = i_table;
          insert into rowcounts (table_name, total_rows) values (i_table, prec.sum);
      end if;
      return prec.sum;
   exception
      when lock_not_available then
         -- someone else holds the lock: report the sum and skip the cleanup
         return sum(total_rows) from rowcounts where table_name = i_table;
   end;
end
$$ language plpgsql;

This is more than a little risky: if this function wins the lock, it will block other processes that wish to access row counts until it’s done. This likely isn’t a worthwhile exercise.

Syndicated 2011-03-04 16:34:00 from linuxdatabases.info

Please Send A Patch

Recent Debian blog entries with this title (by Lucas Nussbaum, Matt Palmer) point out assortedly that:

  • Existing developers frequently know the code base so much better than newcomers that they’re likely way more effective at improving things than some callow newcomer.
  • Taking those developers’ time to do your pet thing instead of something they find useful mayn’t be more effective.

Both points are quite valid, and recent PostgreSQL CommitFest activity suggests a way to at least try to evaluate things.

The PostgreSQL project has a number of committers that are unusually productive developers (-1 from me, Tom? :-) ), and there have certainly been times when the “best” outcome has been for someone to come in suggesting ideas, and for one of the notably productive folk to implement it.

But there has been some debate surrounding the 2011-01 CommitFest, which consists of some 98 proposed patches, all of which require review. These are all, in fact, patches that came as some sort of response to Please send a patch :-) . The trouble with this particular CommitFest is that the patches have been overwhelming the reviewers in terms of sheer volume. Developers that should be considering working on their own “pet features” have been drawn into the review process to look at others’ features instead. None of these results are inherently a bad thing, except for the aggregate that falls out, which is that there’s so much stuff outstanding that it’s tough to get them all properly reviewed.

If a project is busy and vital, it’s pretty necessary for people to do a fair bit of “scratching their own itches” (in keeping with Matt Palmer’s comment) in order to grow the community of people capable of giving real assistance to managing the code base.

“Growing community” requires that some people struggle with the code base a bit so that they become familiar enough to become effective in the future.

Syndicated 2011-02-15 16:18:00 from linuxdatabases.info
