First glance: NuPIC Oddities

Marcus Lewis | 2015-11-12

While running NuPIC, I've been seeing a few unexpected behaviors. I figured it'd be useful to capture them in a notebook where you can inspect them from a code environment.

Oddity 1: Predicted column activates. Predicted cell doesn't.

In [1]:
from IPython.display import Image
Image(url="https://www.dropbox.com/s/ims48coyn4sr9b4/Screenshot%202015-11-12%2012.13.08.png?dl=1") 
Out[1]:

See the blue "cell 4"? That cell was predicted, but not activated.

See the red "cell 0"? That cell was not predicted, but was activated.

Let's rerun up to step 545 and verify this by API.

In [2]:
import datetime
from itertools import groupby

import nupic
from nupic.frameworks.opf.modelfactory import ModelFactory
In [3]:
cellsPerColumn = 32

MODEL_PARAMS = {
    'aggregationInfo': {
        'days': 0,
        'fields': [],
        'hours': 0,
        'microseconds': 0,
        'milliseconds': 0,
        'minutes': 0,
        'months': 0,
        'seconds': 0,
        'weeks': 0,
        'years': 0
    },
    'model': 'CLA',
    'modelParams': {
        'anomalyParams': {
            u'anomalyCacheRecords': None,
            u'autoDetectThreshold': None,
            u'autoDetectWaitRecords': None
        },
        'clParams': {
            'alpha': 0.01472949181138251,
            'clVerbosity': 0,
            'regionName': 'CLAClassifierRegion',
            'steps': '1'},
        'inferenceType': 'TemporalMultiStep',
        'sensorParams': {
            'encoders': {
                '_classifierInput': {
                    'classifierOnly': True,
                    'clipInput': True,
                    'fieldname': 'kw_energy_consumption',
                    'maxval': 53.0,
                    'minval': 0.0,
                    'n': 143,
                    'name': '_classifierInput',
                    'type': 'ScalarEncoder',
                    'w': 21
                },
                u'kw_energy_consumption': {
                    'clipInput': True,
                    'fieldname': 'kw_energy_consumption',
                    'maxval': 53.0,
                    'minval': 0.0,
                    'n': 97,
                    'name': 'kw_energy_consumption',
                    'type': 'ScalarEncoder',
                    'w': 21
                },
                u'timestamp_dayOfWeek': None,
                u'timestamp_timeOfDay': {
                    'fieldname': 'timestamp',
                    'name': 'timestamp',
                    'timeOfDay': ( 21,
                                   4.92648688354549),
                    'type': 'DateEncoder'
                },
                u'timestamp_weekend': None},
            'sensorAutoReset': None,
            'verbosity': 0},
        'spEnable': True,
        'spParams': {
            'columnCount': 2048,
            'globalInhibition': 1,
            'inputWidth': 0,
            'maxBoost': 2.0,
            'numActiveColumnsPerInhArea': 40,
            'potentialPct': 0.8,
            'seed': 1956,
            'spVerbosity': 0,
            'spatialImp': 'cpp',
            'synPermActiveInc': 0.05,
            'synPermConnected': 0.1,
            'synPermInactiveDec': 0.07703611403439363
        },
        'tpEnable': True,
        'tpParams': {
            'activationThreshold': 13,
            'cellsPerColumn': cellsPerColumn,
            'columnCount': 2048,
            'globalDecay': 0.0,
            'initialPerm': 0.21,
            'inputWidth': 2048,
            'maxAge': 0,
            'maxSegmentsPerCell': 128,
            'maxSynapsesPerSegment': 32,
            'minThreshold': 9,
            'newSynapseCount': 20,
            'outputType': 'normal',
            'pamLength': 2,
            'permanenceDec': 0.1,
            'permanenceInc': 0.1,
            'seed': 1960,
            'temporalImp': 'cpp',
            'verbosity': 0
        },
        'trainSPNetOnlyIfRequested': False
    },
    'predictAheadTime': None,
    'version': 1
}
In [4]:
import csv
from urllib import urlopen

inputFile = urlopen("http://mrcslws.com/stuff/rec-center-hourly.csv")
csvReader = csv.reader(inputFile)

# Skip the three header rows so the next read returns the first data row.
csvReader.next()
csvReader.next()
csvReader.next()
Out[4]:
['T', '']
In [5]:
model = ModelFactory.create(MODEL_PARAMS)
model.enableInference({"predictedField": "kw_energy_consumption"})
In [6]:
def step():
    """Feed the next CSV row to the model and return the inference result."""
    global model, csvReader
    timestampStr, consumptionStr = csvReader.next()
    timestamp = datetime.datetime.strptime(timestampStr, "%m/%d/%y %H:%M")
    consumption = float(consumptionStr)
    return model.run({
        "timestamp": timestamp,
        "kw_energy_consumption": consumption,
    })
    
def getPredictedCells(model):
    tp = model._getTPRegion().getSelf()._tfdr
    return tp.getPredictedState().reshape(-1).nonzero()[0].tolist()

def getActiveCells(model):
    tp = model._getTPRegion().getSelf()._tfdr
    return tp.getActiveState().nonzero()[0].tolist()

def getActiveColumns(model):
    return model._getSPRegion().getSelf()._spatialPoolerOutput.nonzero()[0].tolist()

First, run to step 544. Observe this timestep's predicted cells.

Then run to step 545. Observe this timestep's active cells.

In [7]:
for i in range(544):
    step()

previouslyPredictedCells = getPredictedCells(model)
In [8]:
print "Timestep 544 predicted columns / cells:"
print ""
for column, cells in groupby(previouslyPredictedCells,
                             lambda x: x / cellsPerColumn):
    print "Column %d cells: %s" % (column, ", ".join(str(cell) for cell in cells))
Timestep 544 predicted columns / cells:

Column 15 cells: 507
Column 28 cells: 916
Column 226 cells: 7236
Column 798 cells: 25544
Column 821 cells: 26303
Column 854 cells: 27337
Column 858 cells: 27465
Column 897 cells: 28707
Column 1045 cells: 33445
Column 1160 cells: 37127
Column 1277 cells: 40885
Column 1354 cells: 43339
Column 1502 cells: 48080
Column 1788 cells: 57240
Column 1839 cells: 58873
Column 1884 cells: 60304
Column 1916 cells: 61333
Column 1991 cells: 63716
In [9]:
step()
activeCells = getActiveCells(model)
In [10]:
print "Timestep 545 active columns / cells:"
print ""
for column, cells in groupby(activeCells,
                             lambda x: x / cellsPerColumn):
    print "Column %d cells: %s" % (column, ", ".join(str(cell) for cell in cells))
Timestep 545 active columns / cells:

Column 15 cells: 480
Column 28 cells: 896
Column 91 cells: 2912
Column 132 cells: 4224
Column 147 cells: 4704
Column 184 cells: 5888
Column 226 cells: 7232
Column 276 cells: 8832
Column 549 cells: 17568
Column 662 cells: 21184
Column 743 cells: 23776
Column 746 cells: 23872
Column 798 cells: 25536
Column 821 cells: 26272
Column 854 cells: 27328
Column 858 cells: 27456
Column 886 cells: 28352
Column 897 cells: 28704
Column 927 cells: 29664
Column 1020 cells: 32640
Column 1045 cells: 33440
Column 1100 cells: 35200
Column 1160 cells: 37120
Column 1260 cells: 40320
Column 1277 cells: 40864
Column 1289 cells: 41248
Column 1327 cells: 42464
Column 1354 cells: 43328
Column 1492 cells: 47744
Column 1502 cells: 48064
Column 1582 cells: 50624
Column 1746 cells: 55872
Column 1788 cells: 57216
Column 1839 cells: 58848
Column 1869 cells: 59808
Column 1884 cells: 60288
Column 1916 cells: 61312
Column 1991 cells: 63712
Column 1995 cells: 63840
Column 2002 cells: 64064

All 18 predicted columns became active, but in none of them is the predicted cell active. Instead, each of those columns activated only its first cell (flat index column * 32).
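
To make the mismatch concrete, here is a minimal sketch that works directly on the numbers printed above, so it runs without NuPIC. The helper name `splitCellIndex` is hypothetical. It assumes, as the `groupby` calls above do, that a flat cell index equals `column * cellsPerColumn + offset`.

```python
cellsPerColumn = 32

# Predicted cells from the first listing above (copied from the output).
predictedCells = [507, 916, 7236, 25544, 26303, 27337, 27465, 28707, 33445,
                  37127, 40885, 43339, 48080, 57240, 58873, 60304, 61333, 63716]

# Active cells from the second listing, restricted to the predicted columns.
activeInPredictedColumns = [480, 896, 7232, 25536, 26272, 27328, 27456, 28704,
                            33440, 37120, 40864, 43328, 48064, 57216, 58848,
                            60288, 61312, 63712]


def splitCellIndex(cell, cellsPerColumn):
    """Split a flat cell index into (column, offset within the column)."""
    return divmod(cell, cellsPerColumn)


# Each column holds exactly one cell in these lists, so a dict is enough.
predictedByColumn = dict(splitCellIndex(c, cellsPerColumn)
                         for c in predictedCells)
activeByColumn = dict(splitCellIndex(c, cellsPerColumn)
                      for c in activeInPredictedColumns)

for column in sorted(predictedByColumn):
    print("Column %d: predicted cell %d, active cell %d" % (
        column, predictedByColumn[column], activeByColumn[column]))
```

Every line reports active cell 0, while the predicted offsets vary from column to column (for example, cell 27 in column 15 and cell 4 in column 226).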

Oddity 2: Bursting on only one cell?

In [11]:
from IPython.display import Image
Image(url="https://www.dropbox.com/s/46xwmfgy0fl2265/Screenshot%202015-11-12%2012.34.41.png?dl=1") 
Out[11]:

In the same timestep as Oddity 1, all non-predicted columns have only activated their first cell.

In [12]:
predictedColumns = list(cell / cellsPerColumn for cell in previouslyPredictedCells)
activeColumns = getActiveColumns(model)
burstingColumns = filter(lambda x: x not in predictedColumns,
                         activeColumns)
In [13]:
for column, cells in groupby(activeCells,
                             lambda x: x / cellsPerColumn):
    if column in burstingColumns:
        print "Bursting column %d active cells: %s" % \
            (column, ", ".join(str(cell) for cell in cells))
    
Bursting column 91 active cells: 2912
Bursting column 132 active cells: 4224
Bursting column 147 active cells: 4704
Bursting column 184 active cells: 5888
Bursting column 276 active cells: 8832
Bursting column 549 active cells: 17568
Bursting column 662 active cells: 21184
Bursting column 743 active cells: 23776
Bursting column 746 active cells: 23872
Bursting column 886 active cells: 28352
Bursting column 927 active cells: 29664
Bursting column 1020 active cells: 32640
Bursting column 1100 active cells: 35200
Bursting column 1260 active cells: 40320
Bursting column 1289 active cells: 41248
Bursting column 1327 active cells: 42464
Bursting column 1492 active cells: 47744
Bursting column 1582 active cells: 50624
Bursting column 1746 active cells: 55872
Bursting column 1869 active cells: 59808
Bursting column 1995 active cells: 63840
Bursting column 2002 active cells: 64064

These columns were active but not predicted, so I'd expect them to burst, with all 32 cells active in each.
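
As a sketch of that expectation, assuming that bursting means every cell in the column becomes active, the helper below lists the flat cell indices a bursting column would show. `expectedBurstCells` is a hypothetical name, not part of NuPIC.

```python
cellsPerColumn = 32


def expectedBurstCells(column, cellsPerColumn):
    """Flat indices of every cell in `column`, i.e. a full burst."""
    start = column * cellsPerColumn
    return list(range(start, start + cellsPerColumn))


# Column 91 reported a single active cell (2912) in the output above.
expected = expectedBurstCells(91, cellsPerColumn)
observed = [2912]
missing = [cell for cell in expected if cell not in observed]

print("Column 91: expected %d active cells, observed %d, missing %d" % (
    len(expected), len(observed), len(missing)))
```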

The same issue?

So this might all be one issue: Sometimes an HTM just activates the first cell in each activated column, never bursting and never paying attention to the predicted cells.

Maybe sometimes this is intentional? But it seems wrong for predicted columns.
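
One way to watch for this pattern on other timesteps is a small predicate over the list returned by `getActiveCells`. This is only a sketch with a hypothetical name (`onlyFirstCellsActive`). It assumes flat cell indices as above, and it is exercised here on a few values copied from the listings rather than on a live model.

```python
def onlyFirstCellsActive(activeCells, cellsPerColumn):
    """True if every active cell is the first cell (offset 0) of its column."""
    return len(activeCells) > 0 and all(
        cell % cellsPerColumn == 0 for cell in activeCells)


# A few active cells copied from the active-cell listing above.
sampleActiveCells = [480, 896, 2912, 4224, 7232, 64064]
activePatternFound = onlyFirstCellsActive(sampleActiveCells, 32)

# A few predicted cells from the earlier listing, which do not fit the pattern.
samplePredictedCells = [507, 916, 7236]
predictedPatternFound = onlyFirstCellsActive(samplePredictedCells, 32)

print(activePatternFound, predictedPatternFound)
```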