ta0kira is currently certified at Apprentice level.

Name: Kevin Barry
Member since: 2009-02-15 23:21:18
Last Login: 2010-05-09 23:57:27

Notes:

  • student of cognitive science and mathematics
  • cognitive neuroscience research assistant
  • self-taught programmer since 1991
  • idealist masquerading as a realist in a sort of rebellious ultra-compliance
  • selective perfectionist (perfectionism in isolation, chaos elsewhere)
  • (many other things irrelevant here)

Projects

Recent blog entries by ta0kira

28 Mar 2009 (updated 4 Apr 2009 at 17:05 UTC)
Running Linux from an $D card on Toshiba P0rtégé M75O


1: My Initial Assessment


Normally I wouldn't delete an entire blog entry; however, I don't want this to come up in web searches when people are looking for help. I'd planned on setting up what the subject says (replace the $ with S and it should make more sense), but it turns out I won't be owning such a machine. Sorry for the misinformation.
Kevin Barry
redi,

You're entirely correct about everything you said, yet you're practically making my point for me. Maybe the confusion is that I referenced gcc specifically; gcc is just an excellent example of the behavior that irritates me, in a very minor way. In fact, writing this much about it makes it sound like a huge problem, either for me or for C++.

There are four reasons why #include <my_callback.h> in C++ might be taken as C++:

  1. C++ started as a pre-processor, and the standard headers ended with .h before the ISO and ANSI standards.

  2. The failure to subsequently differentiate between C and C++ headers led to widespread use of .h for C++ headers.

  3. Retaining compatibility with C can be used to justify letting .h be either C or C++.

  4. The standard C++ headers have no extension; therefore, everything must be assumed to be C++.

If allowing .h to be C++ is meant to preserve compatibility with C, then the point is lost because of the details I described in previous posts. If it's meant to accommodate older C++ headers ending in .h, that's something that could have been standardized along with everything else. The more likely reason is that a large number of programmers still use .h for C++ headers. Compatibility with C is already lost to some extent, because one has to explicitly tell the compiler that C is being used. Beyond that, some parts of standard C will never compile when #included into a C++ file, no matter how you try to do it.
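
As a concrete example of that last point (the header below is hypothetical, not from this discussion): the following is valid C but is rejected when compiled as C++, even inside an extern "C" block, because extern "C" only changes linkage, not the language rules.

  /* c_only.h -- valid C, invalid C++ */
  #include <stdlib.h>

  static int *make_counter(void)
  {
      int *p = malloc(sizeof *p); /* implicit void* conversion: legal C, an error in C++ */
      if (p)
          *p = 0;
      return p;
  }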

It's not the behavior that irritates me; I know there are "official reasons" for it. It's the fact that one file can be C in one context and C++ in another under the same compiler. You can even #include "hello, this is kevin", and something isn't right about that.
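
To illustrate (the file name and variable are invented): the preprocessor treats the quoted name as an opaque string, so gcc compiles the following without complaint.

  /* contents of a file saved literally as: hello, this is kevin */
  int greeting_value = 42;

  /* main.c */
  #include <stdio.h>
  #include "hello, this is kevin" /* no extension, spaces and all */

  int main(void)
  {
      printf("%d\n", greeting_value); /* prints 42 */
      return 0;
  }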

Lastly, I can't believe you picked this to dispute out of my entire original post, but you have indeed made your point.

Kevin Barry

redi, you misunderstand my point. Take the example below:

  1. I want to create a shared library with the function my_callback. Because I want to support C programs, dlsym, and C++, I want my_callback to be unmangled; therefore, I declare it in my_callback.h.

  2. I compile my 100%-C library libmc.so.

  3. program-a needs my_callback without hard-linking. Because my_callback isn't mangled, dlsym is an option.

  4. program-b needs my_callback with hard-linking. Because my_callback isn't mangled, program-b must either wrap its #include of my_callback.h in extern "C" or my_callback.h must conditionally apply extern "C" when C++ compilation is detected (see the sketch below). This is because gcc infers that my_callback.h is a C++ header rather than at least implicitly giving it C linkage. The problem isn't apparent until link time, however: gcc mangles the name, and an "undefined reference" error occurs.

I use gcc above to point out that it isn't just g++ that will do this.
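
For concreteness, here's a minimal sketch of both sides of that scenario. The callback signature and the loader function are invented for illustration; only my_callback and libmc.so come from the steps above.

  /* my_callback.h -- with the conditional guard from step 4 */
  #ifndef MY_CALLBACK_H
  #define MY_CALLBACK_H

  #ifdef __cplusplus
  extern "C" {
  #endif

  void my_callback(int event); /* hypothetical signature */

  #ifdef __cplusplus
  }
  #endif

  #endif /* MY_CALLBACK_H */

  /* program-a (step 3): resolving the unmangled symbol at run time */
  #include <dlfcn.h>

  int call_it(void)
  {
      void *lib = dlopen("./libmc.so", RTLD_NOW);
      if (!lib)
          return 1;
      void (*cb)(int) = (void (*)(int))dlsym(lib, "my_callback");
      if (cb)
          cb(0);
      dlclose(lib);
      return 0;
  }

Without the guard, a C++ translation unit that includes my_callback.h declares something like _Z11my_callbacki instead of my_callback, and program-b's link against libmc.so fails exactly as step 4 describes.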

Kevin Barry

Belated Response to appenwar's 'tokenizers' Blog

Most of the points have already been made well by others with more experience than I have, so I'll stick to my own. This has less to do with what you actually said and more to do with the principle.

One thing that always irritates me is how gcc will ignore C/C++ file extensions and take a guess, or default to C++. For example, a .h will only be taken as C if it's included strictly by a chain of C files, and only if you don't use g++. One must therefore include the awkward #ifdef __cplusplus / extern "C" { dance because some people don't know how to use the correct file extensions; otherwise you might have linking problems if your header is actually backed by a C source. If you use a C feature not carried over to C++ (e.g. the .sym = designated initializer) in a C file, you can't #include that file in a C++ file even with extern "C". You can also get away with not qualifying structure variables with struct in a C header whenever a C++ file includes it. All of this leads to less concise code, all because of an accepted ambiguity. I do concede that early C++ used .h extensions for the standard headers, so it's partly a lack of foresight.
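
A sketch of that designated-initializer case (the types are invented): this header compiles as C99 but not as C++, and wrapping it in extern "C" doesn't help, since extern "C" only affects linkage.

  /* events.h -- hypothetical header using a C99 designated initializer */
  struct event_handler {
      const char *name;
      void (*handle)(int);
  };

  /* valid C99; rejected when compiled as C++, extern "C" or not */
  static const struct event_handler default_handler = {
      .name   = "default",
      .handle = 0,
  };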

Today I finally got around to using libxml2, which strikes me as extensively (yet somehow poorly) documented and extremely ambiguous. On the other hand, it will save me from writing my own compliant parser for the ~1.4M lines of XML I need to convert and load into a database. This has little to do with libxml2 not accepting partial errors, because the data I received was probably exported from SQL using the same library. I'd actually copy the trees created by libxml2 into a more usable structure if they weren't going straight into a database, but XML is meant to be a format, not a run-time representation.
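
For reference, a minimal sketch of the kind of libxml2 traversal involved ("data.xml" is a stand-in, and a real version would build database inserts instead of printing):

  #include <stdio.h>
  #include <libxml/parser.h>
  #include <libxml/tree.h>

  int main(void)
  {
      xmlDocPtr doc = xmlReadFile("data.xml", NULL, 0);
      if (!doc) {
          fprintf(stderr, "parse failed\n"); /* malformed input is rejected outright */
          return 1;
      }
      xmlNodePtr root = xmlDocGetRootElement(doc);
      for (xmlNodePtr node = root ? root->children : NULL; node; node = node->next) {
          if (node->type != XML_ELEMENT_NODE)
              continue;
          xmlChar *text = xmlNodeGetContent(node);
          printf("%s: %s\n", (const char *)node->name, (const char *)text);
          xmlFree(text);
      }
      xmlFreeDoc(doc);
      xmlCleanupParser();
      return 0;
  }

This builds with gcc using the flags from xml2-config --cflags --libs.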

If someone is actually hand-writing XML proper, chances are they're missing the point (or they're dealing with a software interface that misses the point). Additionally, if someone is using software other than libxml2 to generate XML, they're either missing the point or they lack the appropriate language bindings. That being said, I use my own library to assemble and parse "XML-like" structures (closer to HTML, I guess) for IPC. It wouldn't make sense for me to use formal XML for that application, and especially not libxml2. Though the formats are very similar, the run-time organization used by libxml2 isn't anywhere near suitable for what I use the data for. Then again, I don't need any sort of standardization because the data never leaves the application. It's a symmetrical system because data import and export are designed concurrently to complement each other, which I can only assume is the case with libxml2.

Something many formal projects (software and otherwise) lack is an explicit correlation between the core purposes of the project and the details of its implementation (yes, I'm guilty, too). If I were to author something comparable to XML, I'd explicitly state that it isn't meant to be hand-written and that it's primarily intended to allow data transfer between applications with different maintainers. When deciding whether or not to accept simple errors, I'd defer to those principles and conclude that errors should not be accepted. If I were to author something like HTML, on the other hand, I would account for hand-written code and acknowledge that rendering with errors is better than rejecting a file. All too often, projects are approached with founding principles, yet they fail to rationally extrapolate those principles to the level of implementation (guilty, again).

Rather than getting into everything already brought up, I'll leave it at that.

Kevin Barry

21 Feb 2009 (updated 22 Feb 2009 at 17:15 UTC)
RE: Advogato posters: leech or seed? by cdfrey

I was actually very tempted to write a post about syndicated blogs today. cdfrey essentially said what I would have, though I probably would have been more elaborate and possibly less considerate. This is actually a great article topic, although I'm on my way to bed and too lazy to compose one right now.

I find myself pre-scanning the recent blogs for those that aren't syndicated. That's about 10%, which certainly saves me a lot of reading. Syndication just tells me "what I have to say is so important that many people on many sites will read it, but I don't have time to go to all of those sites and read what other people post." That might not be the truth, and indeed some people do generate more valuable entries that are of interest to a wide community. It might be better to have a "most-recently syndicated" list separate from the "I actually signed in, making it possible for me to read others' writing" list.

Many of the syndicated blogs provide useful information, but I don't think they belong in the same section as those originating from this site. I can't think of any other site where an RSS feed gets interleaved with original content as if it were the same.

Kevin

ta0kira certified others as follows:

  • ta0kira certified poireau as Journeyer
  • ta0kira certified rcaden as Journeyer
  • ta0kira certified lkcl as Master
  • ta0kira certified fzort as Journeyer
  • ta0kira certified jstepien as Apprentice
  • ta0kira certified quasi as Apprentice

Others have certified ta0kira as follows:

  • mjg59 certified ta0kira as Apprentice
  • fzort certified ta0kira as Apprentice
  • MAK certified ta0kira as Apprentice
  • murajov certified ta0kira as Apprentice
