Auditory recognition module for artificial intelligence

Posted 14 May 2010 at 20:03 UTC by mentifex

The AudRecog mind-module for auditory recognition in artificial intelligence (AI) tests user input one character or phoneme at a time to recognize words and morphemes that will activate a concept in the AI Mind or extract meaning from an idea.

1. Diagram of AudRecog

   /^^^^^^^^^\  Auditory Recognition of "c-a-t-s"  /^^^^^^^^^\
  /    EYE    \ REACTIVATED                       /   EAR     \
 /             \ CONCEPTS                        /"CATS"=input \
|   _______     |   | | |    SEMANTIC MEMORY    |               |
|  /old    \!!!!|!!!| | |                       |  C     match! |
| / image   \---|-----+ |             ___       |  -A    match! |
| \ fetch   /   |   |c| |            /   \      |    R    stop  |
|  \_______/    |   |a| |           /     \     |     S   drop  |
|               |   |t| |          / Old-  \    |               |
|  visual       |   |s| |         ( Concept )   |  C     match! |
|               |  e| | |          \       /    |  -A    match! |
|  memory       |  a| | |          /\     /!!!!!|!!!!T   match! |
|               |  t| | |   ______/  \___/------|-----S  recog! |
|  reactivation |   | |f|  /      \             |               |
|               |   | |i| (EnParser)            |  C     match! |
|  channel      |   | |s|  \______/             |  -A    match! |
|   _______     |   | |h|      |                |  --T   match! |
|  /old    \    |   |_|_|     _V_________       |  ---S   busy  |
| / image   \   |  /     \   /           \      |      U  drop  |
| \ store   /---|--\ Psi /--( InStantiate )     |       P drop  |
|  \_______/    |   \___/    \___________/      |               |

2. Algorithm of AudRecog

AudRecog works by comparing each word of input against words stored in the auditory memory channel of the AI Mind. If a matching word is found in memory, the OldConcept module is called to reactivate the concept behind the known word. If no matching word is found in memory, the NewConcept module is called to treat the incoming word as a new concept to be learned by the AI. Note that even a misspelled word will briefly be treated as a new concept, which quickly falls into desuetude if the proper spelling is used during subsequent inputs. Note also that users (companions) of the AI are not permitted to backspace during input to correct a mistake, because AudRecog is processing input dynamically and does not wait for a buffer to be filled with input to be submitted.

When AudRecog is trying to recognize a word like "CATS" as depicted above, all words starting with "C" are activated on both the initial "C" and on the next character stored after "C". Then one by one the input characters are tested for a continuing match-up between memory and input. If the chain of matching characters is broken, a candidate recall word is dropped from consideration. A remembered word that matches input in both length and content activates the deep Psi concept associated with the recognized word, and the AI Mind prepares to think in reaction to the input being recognized.
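The drop-on-mismatch idea described above can be sketched in a few lines of standard Forth. This is only an illustration and not the MindForth code: it keeps a simple per-word "alive" flag instead of activation values on auditory engrams. The three-word lexicon, the concept numbers 101 to 103, and all of the word names (word!, init-lex, start-word, hear, recognized, recognize) are assumptions made for the example. An ANS-style Forth such as Gforth is assumed.

```forth
\ Illustrative sketch only; names and concept numbers are hypothetical.
3 CONSTANT #WORDS
CREATE lex-addr  #WORDS CELLS ALLOT   \ address of each stored word
CREATE lex-len   #WORDS CELLS ALLOT   \ length of each stored word
CREATE lex-psi   #WORDS CELLS ALLOT   \ concept tag of each stored word
CREATE alive     #WORDS CELLS ALLOT   \ true while the chain is unbroken
VARIABLE seen                         \ input characters received so far

: word!  ( c-addr u psi i -- )        \ store one lexicon entry at index i
  CELLS >R
  R@ lex-psi + !
  R@ lex-len + !
  R> lex-addr + ! ;

: init-lex  ( -- )                    \ the three words from the diagram
  S" CARS"   101 0 word!
  S" CATS"   102 1 word!
  S" CATSUP" 103 2 word! ;

: start-word  ( -- )                  \ reset before a new input word
  0 seen !
  #WORDS 0 DO  TRUE alive I CELLS + !  LOOP ;

: hear  ( char -- )                   \ feed one input character
  #WORDS 0 DO
    alive I CELLS + @ IF
      seen @  lex-len I CELLS + @  < IF        \ stored word long enough?
        DUP  lex-addr I CELLS + @ seen @ + C@  =   \ same character?
      ELSE
        FALSE
      THEN
      0= IF  FALSE alive I CELLS + !  THEN     \ chain broken: drop word
    THEN
  LOOP
  DROP  1 seen +! ;

: recognized  ( -- psi | 0 )          \ call at the end of the input word
  0
  #WORDS 0 DO
    alive I CELLS + @
    lex-len I CELLS + @ seen @ =  AND          \ content AND length match
    IF  DROP lex-psi I CELLS + @  THEN
  LOOP ;

: recognize  ( c-addr u -- psi | 0 )  \ convenience: feed a whole string
  start-word
  0 ?DO  DUP I + C@ hear  LOOP
  DROP recognized ;
```

After init-lex, the phrase S" CATS" recognize . should print 102: CARS is dropped at the third character, CATSUP is still matching but is too long, and only CATS matches in both length and content. A result of zero would correspond to the case where NewConcept is called.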

During the sequencing of the human genome, a technique remarkably akin to the AudRecog algorithm was used to recognize patterns among short strings of human DNA.

3. Complexity in AudRecog

In some ways AudRecog is the most complex and intricate of the forty-odd MindForth mind-modules. Other modules engage in thinking, but they do so by the rather simple process of spreading activation from concept to concept under the supervision of a linguistic superstructure. A barely functional VisRecog module would be vastly more sophisticated and complex than AudRecog, but AI devotees will delay implementing vision in MindForth until the proof-of-concept AI proves itself sufficiently to warrant implantation in physical robots and outfitting with physical vision.

What makes AudRecog so complex is the need to recognize not just complete words but also morphemes as parts of words. In September of 2008, AudRecog made perhaps not a saltational leap but a major step forward by incorporating an improved algorithm of using differential activation to recognize subwords or parts of words within a complete word.
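One small piece of the word-stem logic, the stemgap arithmetic that appears near the end of the source code in section 4, can be isolated as a sketch. There the gap is the input length minus the length of the candidate stem, a negative gap is reset to zero, and a gap greater than one causes the stem to be discarded. The word names gap and stem-ok? are hypothetical helpers for this example and do not exist in MindForth, which keeps these values in the variables len, sublen and stemgap.

```forth
\ Illustrative sketch only; gap and stem-ok? are hypothetical names.
: gap  ( len sublen -- n )          \ characters left over after the stem
  - 0 MAX ;                         \ a negative gap is treated as zero

: stem-ok?  ( len sublen -- flag )  \ true if the stem may be kept
  gap 2 < ;                         \ at most one trailing character
```

For example, 4 3 stem-ok? leaves a true flag, since the stem "CAT" inside "CATS" has one trailing character, while 6 3 stem-ok? leaves a false flag, since "CATSUP" has three characters beyond the same stem.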

4. Source code of AudRecog from 10 May 2010

:  AudRecog ( auditory recognition )
  0 audrec !
  0 psi !
  8 act !
  0 actbase !
  midway @  spt @ DO
    I 0 aud{ @ pho @ = IF  \ If incoming pho matches stored aud0;
      I 1 aud{ @ 0 = IF    \ if matching engram has no activation;
        I 3 aud{ @ 1 = IF  \ if beg=1 on matching no-act aud engram;
       \ audrun @ 1 = IF   \ if comparing start of a word; 8may2010
         audrun @ 2 < IF   \ if comparing start of a word; 8may2010
          I 4 aud{ @ 1 = IF   \ If beg-aud has ctu=1 continuing,
            8 I 1+   1 aud{ !  \ activate the N-I-L character,
            0 audrec !
            len @ 1 = IF
              I 5 aud{ @  monopsi !
            THEN  \ End of test for one char length.
          THEN   \ end of test for continuation of beg-aud
         THEN  \ end of test for audrun=1 start of word.
        THEN   \ end of test for a beg(inning) non-active aud0
      THEN   \ end of test for matching aud0 with no activation
      I 1 aud{ @ 0 > IF  \ If matching aud0 has activation,
        0 audrec !       \ Zero out any previous audrec.
        I 4 aud{ @ 1 = IF  \ If act-match aud0 has ctu=1 continuing,
          2 act +!           \ Increment act for discrimination.
          0 audrec !         \ because match-up is not complete.
          act @ I 1+   1 aud{ ! \ Increment for discrimination.
        THEN  \ end of test for active-match aud0 continuation
        I 4 aud{ @ 0 = IF  \ If ctu=0 indicates end of word
          len @ 2 = IF  \ If len(gth) is only two characters.
          \ I 1 aud{ @ 0 > IF  \ Or test for eight (8).
            I 1 aud{ @ 7 > IF  \ testing for eight (8).
              I 5 aud{ @ psibase !  \ Assume a match.
            THEN  \  End of test for act=8 or positive.
          THEN   \ End of test for two-letter words.
        THEN   \ End of test for end of word.
        I 1 aud{ @ 8 > IF  \ If activation higher than initial
          8 actbase !  \ Since act is > 8 anyway; 8may2010
          I 4 aud{ @ 0 = IF  \ If matching word-engram now ends,
            I 1 aud{ @ actbase @ > IF  \ Testing for high act.
              I 5 aud{ @ audrec !  \ Fetch the potential tag
              I 5 aud{ @ subpsi !  \ Seize a potential stem.
              len @ sublen !    \ Hold length of word-stem.
              I 5 aud{ @ psibase !  \ Hold onto winner.
              I 1 aud{ @ actbase !  \ Winner is new actbase.
            THEN  \ End of test for act higher than actbase.
            0 audrec !
            monopsi @ 0 > IF
              monopsi @ audrec !
              0 monopsi !
            THEN   \ End of inner test.
          THEN  \ End of test for final char that has a psi-tag.
        THEN  \  End of test for engram-activation above eight.
      THEN  \ End of test for matching aud0 with activation.
    THEN  \ End of test for a character matching "pho".
    I midway @ = IF  \ If a loop reaches midway; 8may2010
      1 audrun +!  \ Increment audrun beyond unity; 8may2010
    THEN   \ End of test for loop reaching midway; 8may2010
  -1 +LOOP
  0 act !
  0 actbase !
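  \ Closing THENs restored below; nesting inferred from the indentation.
  psibase @ 0 > IF
     psibase @  audrec !
  THEN  \ End of test for a positive psibase.
  audrec @ 0 = IF
    monopsi @ 0 > IF
      len @ 2 < IF
        monopsi @ audrec !
      THEN  \ End of test for one-character length.
      0 monopsi !
    THEN  \ End of test for a positive monopsi.
  THEN  \ End of test for zero audrec.
  audrec @ 0 = IF
    psibase @ 0 > IF
      psibase @ audrec !
    THEN  \ End of test for a positive psibase.
  THEN  \ End of test for zero audrec.
  audrec @ 0 = IF
    morphpsi @ audrec !
  THEN  \ End of test for zero audrec.
  sublen @ 0 > IF
    len @ sublen @ -  stemgap !
  THEN  \ End of test for a positive sublen.
  stemgap @ 0 < IF 0 stemgap ! THEN
  stemgap @ 1 > IF 0 subpsi ! THEN
  stemgap @ 1 > IF 0 morphpsi ! THEN
  stemgap @ 1 > IF 0 audrec ! THEN
  subpsi @ morphpsi !
  0 psibase !
  0 subpsi !
  audrec @ 0 > IF
    stemgap @ 2 > IF
      0 audrec !
    THEN  \ End of test for an excessive stemgap.
    pho @ 83 = IF
      2 num !
    THEN  \ End of test for pho = 83 (ASCII "S").
  THEN  \ End of test for a positive audrec.
  audrec @ audpsi !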
;  ( End of AudRecog; return to AudMem auditory memory )

5. Troubleshooting AudRecog

Temporary diagnostic messages may be inserted into the source code to display exactly what AudRecog is doing as it processes input. Typically such messages will identify important variables and immediately state their values. Remember to remove such diagnostic messages after debugging any mind-module.
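A diagnostic message of this kind might look like the following sketch. The word .diag is a hypothetical helper, not part of MindForth. The three VARIABLE lines are stand-ins so that the example loads on its own in an ANS-style Forth such as Gforth; inside MindForth the variables pho, len and audrec already exist and should not be declared again.

```forth
\ Stand-ins for the MindForth variables of the same names (assumption).
VARIABLE pho      \ ASCII code of the current input character
VARIABLE len      \ length of the input word so far
VARIABLE audrec   \ concept number recognized so far, or zero

: .diag  ( -- )   \ hypothetical helper; remove after debugging
  CR ." AudRecog: pho=" pho @ EMIT   \ show the character itself
  ."  len=" len @ .                  \ name each variable, then its value
  ." audrec=" audrec @ . ;
```

A call to .diag could be placed temporarily at the point of interest, for instance just before the final line that stores audrec into audpsi. With pho holding 65, len holding 1 and audrec holding 0, the word should display a line reading AudRecog: pho=A len=1 audrec=0.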

It is also helpful to stop the AI by pressing the Escape key after entering some test input and then to run the ".psi" or ".aud" array reports to see what values have been recorded during the operation of AudRecog. If a word is recognized properly, it will have the proper Psi concept number in both the auditory memory array and the Psi concept array.

As a programmer, if you rely on simple string-matching to recognize words, your module becomes incapable of the more subtle operations that are possible when you use not only chains of activation to recognize a series of sounds, but also differential activation to recognize subsets (morphemes) within a series of sounds. Think like a neuroscientist, not like a common, garden variety-show hacker hobbled by the groupthink of string-recognition.

6. Teamwork for AudRecog

Imagine that you are a made member of an elite Super-AI maintenance team charged and entrusted with the awesome responsibility of keeping a mission-critical AI Mind up and running, while safeguarding humanity against the dangers inherent in nurturing a higher form of intelligence capable at any time of breaking loose from human control and turning (or turing) against its human origins.

If it is your job to focus exclusively on the AudRecog module, your professional standards require you to grok all ideas immanent in this current document and in whatever AudRecog literature you can glean from an exhaustive search of all pre-Cyborg, that is, human knowledge. Therefore this document was prepared with you in mind, mindkeeper or mind-maintainer or whatever your job description calls you. Be aware, be very aware, that other AI shops and other AI enterprises are most likely duplicating your every thought and your every action in the accelerating race to the Technological Singularity.

7. History of AudRecog

The MindForth AudRecog module was adapted from the Amiga MindRexx "Comparator" and "String_effect" modules of 1994, which jointly served to compare incoming phonemes against auditory engrams strung together into the memory of a word. In the archival 28may1998 MindForth as described in the ACM SIGPLAN Notices, Screen #28 is the String-Effect and Screen #49 is the Comparator precursor to AudRecog.

The 11feb02A.f MindForth subsumes String-Effect into COMPARATOR, and the 4mar02A.f version of MindForth renames COMPARATOR as the AudRecog module. Although the word Comparator made sense for a module comparing input against memory, the overly broad term Comparator had to give way to the compound name AudRecog that would focus on the specific sense of audition and on the function of recognition, so that other sensory comparators could eventually be named with such appropriate terms as GusRecog, OlfRecog, TacRecog and VisRecog. Such precision in the naming of mind-modules frees up avenues of future AI development, because the names are already stubbed in for enterprising individuals to write the code.

8. MindForth Programming Journal (MFPJ)

Some but not all of the recent MFPJ entries dealing with AudRecog are available on-line at the following locations.

Sat.16.AUG.2008 - Tweaking the audRecog Module

Mon.18.AUG.2008 - audRecog Word-Stem Recognition

Tues.30.SEP.2008 -- audRecog Word-Stem Recognition

Wed.12.MAY.2010 -- Solution and Bugfix of AudRecog

9. Future of AudRecog

Just as MindForth is a precursor of next-generation AI Minds, likewise the AudRecog mind-module is a primitive implementation of AI technology that must mutate and evolve into a more advanced state of the art. Chief among the impending changes will be a switch-over from keyboard ASCII input to speech recognition of phonemic input. The SpeechAct module and the AudRecog module must both evolve in tandem so that the AI Mind may issue speech output and comprehend spoken input.

the piano is drinking, posted 19 May 2010 at 15:24 UTC by badvogato » (Master)

Come on, skank along, it's the skanking song .

what is the data input?, posted 20 May 2010 at 12:48 UTC by lkcl » (Master)

mentifex, hi,

what's the input format to AudRecog? i have a friend who did neural network word recognition using a traditional NN some twenty years ago: they simply wired the input values directly from a VOCODER algorithm - the compressed data - rather than inputting the full audio spectrum @ 48khz. as a result, the reduced data rate and massively-compressed but still "correct" representation resulted in a successful project.

what input format is AudRecog getting its audio data in?

signal-to-noise ratio, posted 20 May 2010 at 12:49 UTC by lkcl » (Master)

badvogato: i've never asked you this before, but the responses you're giving on articles are significantly reducing the value of the entire site. could you please keep to relevant comments? thanks.

data input is text from keyboard, posted 20 May 2010 at 15:49 UTC by mentifex » (Master)

Greetings, lkcl,

Currently the input format to AudRecog is simply ASCII text from the computer keyboard -- a format which pretends that English spelling is basically phonetic. If a word is consistently misspelled, the misspelling serves as the concept-identifier just as well as the correct spelling would have. Such an input format is good enough for a proof-of-concept AI, and presents a challenge for enterprising souls to implement phonemic, acoustic input.

Thank you for the fascinating remarks about the VOCODER, which I have looked up and found at /Vocoder -- although it was very annoying that the Google search engine, when I had clicked on the link to the Vocoder article at Wikipedia and I thought that the article was loading in its own window, turned out to have loaded only a "Redirect Notice" asking me to confirm (duh!) that I actually wanted the page I had requested.

By the way, I consider my above Advogato article #1040 to be a "Personal Best" at a level of quality which I may not be able to achieve again anytime soon in the future, so thanks, lkcl, for taking it seriously and open-source professionally.

