TNPM/5620SAM

From neil.tappsville.com

TNPM/5620SAM/Object_model


TNPM/5620SAM/Pack_sizing


TNPM/5620SAM/OAM_results


Versions

  • 2.6.0.0 4.3Y
  • 2.7.0.0 4.3Z
  • 2.8.0.0 1.3A Support for SAM 8 only.
  • 2.9.0.0 1.3B Support for SAM 6.1, 7 and 8
  • unknown 1.3D Support for SAM 9
  • 2.10.0.0 1.3E
  • 2.11.0.0 1.3G Support for SAM 8, 9, 10
  • 2.12.0.0 1.6H Oct 2012 - Support for SAM 10 and SAM 11 - 4.4.3.2 and 1.3.1
  • 2.13.0.0 1.3K TNPM 1.3.1 only
  • 2.14.0.0 1.4A (findtofile) - 1.3.2+ (supported in TNPM 1.4 as well) - only tested with SAM 10 and SAM 11; the documentation states it will work with 8 and 9


New to the mix is the SAM Log ToFile pack, which uses IBM DataStage and Mqueue, bolted onto the end of the existing UBAs.

  • 2.1.0.0 1.3J
  • 2.2.0.0 1.3L - 1.3.2+ (supported in TNPM 1.4 as well)
  • 2.4.0.0 1.4B (logtofile) support for 5620 SAM v10 and v11 - TNPM 1.4


Certification Matrix (from Alcatel-Lucent IPD - Connected Partner Programme)

  • 5620 SAM v11.0 R1 -- Tivoli Network Performance Manager (TNPM) 1.3.2 and Alcatel-Lucent 5620 SAM Log ToFile Technology Pack 2.2.0.0
  • 5620 SAM v10.0 R1 -- Tivoli Network Performance Manager (TNPM) v.1.3.1 with Alcatel-Lucent 5620 SAM Technology Pack 2.11.0.0
  • 5620 SAM v9.0 R1 -- Proviso v4.4.3.3 with Alcatel-Lucent 5620 SAM Technology Pack 2.10.0.0
  • 5620 SAM v8.0 -- Proviso v4.4.1.4 with Alcatel-Lucent 5620 SAM Technology Pack 2.7.0.0 (IBM state 2.6.0.0 as well)

Certification of SAM 9 - TNPM 1.3D 4.4.3.3 announced 8 Sept 2011


Cognos

  • Cognos 10 - SAM 2.4.0.0 --Log ToFile--


SAM.QUERY_START

The Data Channel BLB application queries the SAM primary and redundant SAM servers on a specified polling interval. The SAM.QUERY_START parameter specifies the first time period for which the BLB requested metrics from the SAM servers. Three related parameters are also discussed in this section:

  • SAM.EXPORTGRACEPERIOD: The number of seconds the BLB waits after a poll interval ends before it issues the next query to the SAM primary and redundant servers. This is the one to increase if you want to back collections off from real time.

  • SAM.EXPORTSCHEDULE: The minutes past the hour that delineate the start of a metric collection. Always set this to 0,15,30,45.

  • SAM.FAILUREDELAY: The number of seconds the BLB waits after a failure of the primary, redundant, or both SAM servers before it issues the next query to the SAM servers. The alcatel5620samsampledc.cfg template file provides parameter lines for these settings.
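
As a rough illustration of how the grace period pushes collection back from real time, the sketch below computes when the query for the current 15-minute slot would be issued. It assumes SAM.EXPORTSCHEDULE is 0,15,30,45 and that the grace period is simply added to the end of the slot. The function name next_query_epoch is made up for this example and is not part of TNPM.

```shell
#!/bin/sh
# Sketch only, not TNPM code: estimate when the BLB would query for the
# 15-minute slot that contains a given time.
# Assumption: slot end + SAM.EXPORTGRACEPERIOD = query time.
next_query_epoch() {
    now=$1      # current time, seconds since the epoch
    grace=$2    # SAM.EXPORTGRACEPERIOD, in seconds
    slot=900    # 15 minutes, matching a schedule of 0,15,30,45
    slot_end=$(( (now / slot + 1) * slot ))   # next quarter-hour boundary
    echo $(( slot_end + grace ))
}
```

For example, `next_query_epoch "$(date +%s)" 600` prints the epoch time of the next expected query with a 10 minute grace period.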


Other Bits

Check SAMIF for failover messages when logging using FEWI123 (if log level 4 and app level 6 is enabled the JMS messages tell a better story)

tail -f proviso.log | egrep -e 'JMS_PROCESS|SAM_CONNECT|JMS communication'

and watch for fulldumps...

egrep -e 'JMS_PROCESS|SAM_CONNECT|JMS communication|\*\*\*' *proviso.log | grep -v "DISC.SF1288_SF2395_VIP"

What the SAMIF does when a new element is inserted - here it fails as another element exists with the same name

2011.04.11-10.46.46 UTC SAMIF.2.253-28916       I       BULK_DB_INSERT_FAILED   Bulk db insert failed because of:   ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated  retrying in by-statement mode
2011.04.11-10.46.46 UTC SAMIF.2.253-28916       W       DC10110       SQLERR  A SQL error has occurred for the SQL statement: (insert into elmt_desc (str_type,str_state,int_collector,str_profile,str_name,str_comment,int_date,str_origin,str_user,idx_ind,ncl_idx_ind) values(:type,:state,:collectorNumber,:profile,:name,:commentx,:datex,:origin,:username,:id,:nclId)) -   ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated
2011.04.11-10.46.46 UTC SAMIF.2.253-28916       W       DC10592       DB_INSERT_UPDATE_FAILED Statement 'insert into elmt_desc (str_type,str_state,int_collector,str_profile,str_name,str_comment,int_date,str_origin,str_user,idx_ind,ncl_idx_ind) values(:type,:state,:collectorNumber,:profile,:name,:commentx,:datex,:origin,:username,:id,:nclId)' failed due to:   ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated , Data: (type=#import,state=#on,collectorNumber=253,profile=#bulk_253,name='10.0.16.18',commentx=nil,datex=1302518521,origin=#SAM,username=#pvuser,id=200019426,nclId=nil)

Error Messages that illustrate the SAM is providing garbage XML measurement files (SAM 8 is good at this)

egrep -e "DL39212|DL42002" proviso.log
.... UBA .... DL39212       FILE_SIZE_TOO_SMALL .....  size: 0 limit: '0'
.... BLB .... DL42002       SAM_EXPORT_FAILED       SAM metric export request failed for timeslot: 2011.05.10-02.15.00 error: PROVISO.SAMMethodInvocationError

Find the latest files fetched by the BLB

ls -lrt /appl/proviso/data/datachannel/BLB.2.251/ALCATEL_5620_SAM/done/sam | tail -28

Show how much space each 'set' of files takes up, i.e. the output x4 = hourly disk usage

du -hs `ls -rt /appl/proviso/data/datachannel/BLB.2.251/ALCATEL_5620_SAM/done/sam/* | tail -28`


Debugging UBAs

Show the number of input files (in this case it processes 10 files every 15 mins, 2 of which have data)

grep UBA.5.229 proviso.log | grep PERF_INPUT_PROCESSING | cut -d " " -f 1,3 | cut -d "." -f 1-5,7 | uniq -c

: 1 2011.09.26-19.31.01 210
: 2 2011.09.26-19.31.01 0
: 2 2011.09.26-19.31.01 210
: 2 2011.09.26-19.31.01 0
: 1 2011.09.26-19.31.01 210   <--- normal amount
: 1 2011.09.26-19.31.01 0
: 5 2011.09.26-19.31.59 0
: 2 2011.09.26-19.45.17 0
: 11 2011.09.26-19.46.30 0    <---- lots of zeros is error!
: 5 2011.09.26-19.48.20 0
: 13 2011.09.26-20.00.20 0
: 5 2011.09.26-20.02.15 0
: 13 2011.09.26-20.15.20 0
: 5 2011.09.26-20.17.15 0
: 13 2011.09.26-20.30.26 0
: 5 2011.09.26-20.32.22 0
: 1 2011.09.26-20.46.05 1046     <---- error!
: 2 2011.09.26-20.46.05 0
: 2 2011.09.26-20.46.06 1046
: 2 2011.09.26-20.46.07 0
: 1 2011.09.26-20.46.07 1046
: 2 2011.09.26-20.46.07 0
: 3 2011.09.26-20.46.08 0
: 5 2011.09.26-20.47.50 0
: 1 2011.09.26-20.59.51 0
: 1 2011.09.26-21.00.52 209
: 2 2011.09.26-21.00.52 0
: 2 2011.09.26-21.00.52 209
: 2 2011.09.26-21.00.52 0
: 1 2011.09.26-21.00.52 209
: 4 2011.09.26-21.00.52 0
: 5 2011.09.26-21.02.48 0
: 1 2011.09.26-21.14.51 0
: 1 2011.09.26-21.15.52 210
: 2 2011.09.26-21.15.52 0
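
To avoid eyeballing that listing, the sketch below filters the raw `uniq -c` output (count, timestamp, value) and prints only the bursts of zero-valued entries. The function name flag_zero_bursts and the default threshold of 10 are assumptions for illustration, picked from the annotated sample above; tune the threshold to what is normal on your system.

```shell
#!/bin/sh
# Sketch only: read "count timestamp value" lines (as produced by uniq -c)
# on stdin and print timestamp and count for every burst of zero-valued
# entries whose count reaches the threshold.
# The threshold default (10) is an assumption, not a TNPM setting.
flag_zero_bursts() {
    threshold=${1:-10}
    awk -v t="$threshold" '$3 == 0 && $1 >= t { print $2, $1 }'
}
```

Usage would be to append `| flag_zero_bursts 10` to the grep/cut/uniq pipeline shown above.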

Something else to look for is the METRICSTREAMINFO messages. Unless the UBA crashes while it is writing to the BOF file, these show how many metrics were computed.


If the datafeed for a UBA has a large gap (several hours of missing data), the UBA needs to pretend to process every hour it missed data for. Expect messages like the following for every data hour missed.

2014.02.06-05.15.45 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing elementManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing subelementManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing formulaManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing propertyManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships elementManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships subelementManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships formulaManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships propertyManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships relationManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1       METADATA_LOAD   Done resyncing relationships
2014.02.06-05.16.19 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing elementManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.19 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing subelementManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing formulaManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing propertyManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships elementManager...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships subelementManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships formulaManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships propertyManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships relationManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1       METADATA_LOAD   Done resyncing relationships
2014.02.06-05.16.41 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing elementManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.41 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing subelementManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing formulaManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing propertyManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships elementManager...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships subelementManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships formulaManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships propertyManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships relationManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1       METADATA_LOAD   Done resyncing relationships
2014.02.06-05.17.02 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing elementManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing subelementManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing formulaManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing propertyManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships elementManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships subelementManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships formulaManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships propertyManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Resyncing relationships relationManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1       METADATA_LOAD   Done resyncing relationships
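
To see how far a UBA has got through such a gap, the resynced hours can be pulled out of the log. This is a minimal sketch that relies only on the message text shown above; the function name resync_hours is hypothetical.

```shell
#!/bin/sh
# Sketch only: list the start of each data hour the UBA has resynced,
# based on the "Resyncing elementManager (between X and Y)" messages.
# Reads the named log files, or stdin when no file is given.
resync_hours() {
    grep 'Resyncing elementManager (between' "$@" |
        sed 's/.*(between \([^ ]*\) and .*/\1/' |
        sort -u
}
```

For example, `grep UBA.6.225 proviso.log | resync_hours | wc -l` gives the number of hours replayed so far.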


Check what the datachannel is up to

Find when the latest sub-directory was created given a path and search pattern (aka datachannel done files..)

date; bash -c 'for DIRECTORY in `find /appl/proviso/data/datachannel/ -name done -type d -user pvuser`; do echo "==";echo "$DIRECTORY"; ls -rt $DIRECTORY | tail -1 ;  done'

Count the files waiting to be processed

date; find /appl/proviso/data/datachannel/BLB*/output  | cut -d "/" -f 6 | sort | uniq -c


Inventory

Inventory is kicked off by issuing 'samifdump chan.subchan'. You will then see the following in the log:


**** Full Dump Started ****.
**** Full Dump Complete ****.

Inventory isn't written back to the db until you see

2014.08.07-04.03.31 UTC SAMIF.6.107-13338       I       FLUSHINVENTORY  Flushing buffered inventory changes to db
2014.08.07-04.03.32 UTC SAMIF.6.107-13338       I       PERF_INVFLUSH   Inserted/updated 128 inventory objects in 0.241 seconds
2014.08.07-04.03.32 UTC SAMIF.6.107-13338       1       PERF_INVFLUSH_PvSubelement      DB flush stats: PvSubelement(totalObjects=6, totalTime=22, numberOfNewObjects=6, numberOfInserts=1, insertTime=22, avgInsertTimePerObject=3.6666666666667d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
2014.08.07-04.03.32 UTC SAMIF.6.107-13338       1       PERF_INVFLUSH_PvProperty        DB flush stats: PvProperty(totalObjects=118, totalTime=197, numberOfNewObjects=118, numberOfInserts=1, insertTime=197, avgInsertTimePerObject=1.6694915254237d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
2014.08.07-04.03.32 UTC SAMIF.6.107-13338       1       PERF_INVFLUSH_PvElement DB flush stats: PvElement(totalObjects=4, totalTime=22, numberOfNewObjects=4, numberOfInserts=1, insertTime=22, avgInsertTimePerObject=5.5d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
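
The flush stats lines are long, so a short filter helps when all you want is how many objects of each type were written. This is a sketch that depends only on the "DB flush stats:" text shown above; the function name flush_totals is made up for the example.

```shell
#!/bin/sh
# Sketch only: print "<object type> <totalObjects>" for every
# "DB flush stats:" line in the named log files (or stdin).
flush_totals() {
    grep 'DB flush stats:' "$@" |
        sed 's/.*DB flush stats: \(.*\)(totalObjects=\([0-9]*\),.*/\1 \2/'
}
```

For example, `grep SAMIF.6.107 proviso.log | flush_totals` prints one line per object type for each flush.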


To be able to process the JMS updates (realtime inventory), the SAMIF needs IDMAP files. If you're running with the default 'inventory on startup = true', they will be generated and kept up to date. If the SAMIF state directory is removed, the following needs to be run (a samifdump is not enough to regenerate them)

dccmd debug SAMIF.6.107 "self server firstAdaptor modelInterface generateFullIdMap"

The SAMIF does not need to regenerate the IDMAP files if it is only stopped and then started (only if the state is removed).


Alarms

Get the current 5620 SAM Alarm List

<?xml version="1.0" encoding="UTF-8"?>
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SOAP:Header>
<header xmlns="xmlapi_1.0">
<security>
<user>fm_user</user>
<password>md5hashh</password>
</security>
<requestID>client1:0</requestID>
</header>
</SOAP:Header>
<SOAP:Body>
<fm.FaultManager.findFaults xmlns="xmlapi_1.0">
<faultFilter>
<and class="fm.AlarmObject">
<equal name="olcState" value="inService"/>
</and>
</faultFilter>
</fm.FaultManager.findFaults>
</SOAP:Body>
</SOAP:Envelope>
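
The md5hashh placeholder in the request suggests the password is sent as an MD5 hash rather than in clear text. A minimal sketch of producing that hash on Linux follows. The function name sam_password_hash, the file name findFaults.xml, the host name sam-server and the endpoint URL in the commented curl line are all assumptions, so check them against your own SAM installation before relying on them.

```shell
#!/bin/sh
# Sketch only: produce the hex MD5 digest of a password for the
# <password> element. Assumes GNU coreutils md5sum is available.
sam_password_hash() {
    printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# Hypothetical way to post the saved request (host, port and path are
# assumptions, not taken from this page):
# curl -s -H 'Content-Type: text/xml' --data-binary @findFaults.xml \
#     "http://sam-server:8080/xmlapi/invoke"
```

Using printf rather than echo avoids hashing a trailing newline along with the password.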