TNPM/5620SAM
Versions
- 2.6.0.0 4.3Y
- 2.7.0.0 4.3Z
- 2.8.0.0 1.3A Support for SAM 8 only.
- 2.9.0.0 1.3B Support for SAM 6.1, 7 and 8
- unknown 1.3D Support for SAM 9
- 2.10.0.0 1.3E
- 2.11.0.0 1.3G Support for SAM 8, 9, 10
- 2.12.0.0 1.6H Oct 2012 - Support for 10 and SAM 11 - 4.4.3.2 and 1.3.1
- 2.13.0.0 1.3K TNPM 1.3.1 only
- 2.14.0.0 1.4A (findtofile) - 1.3.2+ (supported in TNPM 1.4 as well) --only tested with SAM 10, SAM 11, documentation states will work with 8 and 9--
New to the mix is the SAM Log to file - which uses IBM datastage and Mqueue - bolted onto the end of the existing UBAs.
- 2.1.0.0 1.3J
- 2.2.0.0 1.3L - 1.3.2+ (supported in TNPM 1.4 as well)
- 2.4.0.0 1.4B (logtofile) support for 5620 SAM v10 and v11 - TNPM 1.4
Certification Matrix (from Alcatel-Lucent IPD - Connected Partner Programme)
- 5620 SAM v11.0 R1 -- Tivoli Network Performance Manager (TNPM) 1.3.2 and Alcatel-Lucent 5620 SAM LogToFile Technology Pack 2.2.0.0
- 5620 SAM v10.0 R1 -- Tivoli Network Performance Manager (TNPM) v.1.3.1 with Alcatel-Lucent 5620 SAM Technology Pack 2.11.0.0
- 5620 SAM v9.0 R1 -- Proviso v4.4.3.3 with Alcatel-Lucent 5620 SAM Technology Pack 2.10.0.0
- 5620 SAM v8.0 -- Proviso v4.4.1.4 with Alcatel-Lucent 5620 SAM Technology Pack 2.7.0.0 (IBM state 2.6.0.0 as well)
Certification of SAM 9 - TNPM 1.3D 4.4.3.3 announced 8 Sept 2011
Cognos
- Cognos 10 - SAM 2.4.0.0 --LogToFile--
SAM.QUERY_START
The Data Channel BLB application queries the primary and redundant SAM servers on a specified polling interval. The SAM.QUERY_START parameter specifies the first time period for which the BLB requests metrics from the SAM servers. Three related parameters are also discussed in this section:
- SAM.EXPORTGRACEPERIOD: the number of seconds the BLB waits after a poll interval ends before it issues the next query to the primary and redundant SAM servers. This is the one to increase to back the collections off from real time.
- SAM.EXPORTSCHEDULE: the BLB application exports the query schedule on a specified schedule interval. This parameter specifies the minutes past the hour that delineate the start of a metric collection. Always set to 0,15,30,45 *
- SAM.FAILUREDELAY: the primary, redundant, or both SAM servers could fail, disrupting the BLB query operations. This parameter specifies the number of seconds the BLB waits after a failure before it issues the next query to the SAM servers.
The alcatel5620samsampledc.cfg template file provides sample lines for these parameters.
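The effect of the grace period can be sketched with a little arithmetic. This is only an illustration: it assumes timeslots fall on the 0,15,30,45 schedule in UTC and that a timeslot is requested only once the grace period has passed since its end. `latest_ready_timeslot` is a hypothetical helper, not part of TNPM.

```shell
#!/bin/sh
# Hypothetical helper (not a TNPM command).
# $1 = current time in epoch seconds, $2 = grace period in seconds.
# Prints the epoch time of the most recent quarter-hour boundary that is
# at least <grace> seconds old, i.e. the end of the newest timeslot that
# would be eligible for a query under the assumptions stated above.
latest_ready_timeslot() {
  now=$1
  grace=$2
  ready=$(( now - grace ))
  # 900 seconds = 15 minutes; epoch time is aligned to quarter hours in UTC
  echo $(( ready - ready % 900 ))
}

# Usage (commented out): latest_ready_timeslot "$(date +%s)" 300
```

With a larger grace period the same clock time maps to an older timeslot, which is what backing off from real time means here.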
Other Bits
Check the SAMIF for failover messages when logging with FEWI123 (if log level 4 and app level 6 are enabled, the JMS messages tell a better story)
tail -f proviso.log | egrep -e 'JMS_PROCESS|SAM_CONNECT|JMS communication'
and watch for full dumps...
egrep -e 'JMS_PROCESS|SAM_CONNECT|JMS communication|\*\*\*' proviso.log | grep -v "DISC.SF1288_SF2395_VIP"
What the SAMIF does when a new element is inserted - here it fails as another element exists with the same name
2011.04.11-10.46.46 UTC SAMIF.2.253-28916 I BULK_DB_INSERT_FAILED Bulk db insert failed because of: ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated retrying in by-statement mode
2011.04.11-10.46.46 UTC SAMIF.2.253-28916 W DC10110 SQLERR A SQL error has occurred for the SQL statement: (insert into elmt_desc (str_type,str_state,int_collector,str_profile,str_name,str_comment,int_date,str_origin,str_user,idx_ind,ncl_idx_ind) values(:type,:state,:collectorNumber,:profile,:name,:commentx,:datex,:origin,:username,:id,:nclId)) - ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated
2011.04.11-10.46.46 UTC SAMIF.2.253-28916 W DC10592 DB_INSERT_UPDATE_FAILED Statement 'insert into elmt_desc (str_type,str_state,int_collector,str_profile,str_name,str_comment,int_date,str_origin,str_user,idx_ind,ncl_idx_ind) values(:type,:state,:collectorNumber,:profile,:name,:commentx,:datex,:origin,:username,:id,:nclId)' failed due to: ORA-00001: unique constraint (PV_ADMIN.UN_ELDE_2) violated , Data: (type=#import,state=#on,collectorNumber=253,profile=#bulk_253,name='10.0.16.18',commentx=nil,datex=1302518521,origin=#SAM,username=#pvuser,id=200019426,nclId=nil)
Error messages showing that the SAM is providing garbage XML measurement files (SAM 8 is good at this)
egrep -e "DL39212|DL42002" proviso.log
.... UBA .... DL39212 FILE_SIZE_TOO_SMALL ..... size: 0 limit: '0'
.... BLB .... DL42002 SAM_EXPORT_FAILED SAM metric export request failed for timeslot: 2011.05.10-02.15.00 error: PROVISO.SAMMethodInvocationError
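To see which timeslots were affected, the failed exports can be reduced to a list of timeslots. A minimal sketch, assuming the message name appears in the log as SAM_EXPORT_FAILED and the timeslot follows the literal text `timeslot: ` as in the example above; `failed_timeslots` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints each distinct
# timeslot for which a SAM_EXPORT_FAILED message was logged.
failed_timeslots() {
  grep 'SAM_EXPORT_FAILED' \
    | sed -n 's/.*timeslot: \([^ ]*\).*/\1/p' \
    | sort -u
}

# Usage (commented out): failed_timeslots < proviso.log
```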
Find the latest files fetched by the BLB
ls -lrt /appl/proviso/data/datachannel/BLB.2.251/ALCATEL_5620_SAM/done/sam | tail -28
Show how much space each 'set' of files takes up, i.e. the final total line x4 = hourly disk usage
du -hsc `ls -rt /appl/proviso/data/datachannel/BLB.2.251/ALCATEL_5620_SAM/done/sam/* | tail -28`
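The x4 arithmetic can also be scripted. A rough sketch, assuming one set is the newest 28 files in the directory, four sets arrive per hour, and file names contain no whitespace; `hourly_bytes` is a hypothetical helper and it sums file sizes in bytes rather than disk blocks.

```shell
#!/bin/sh
# Hypothetical helper: $1 = directory, $2 = files per set (default 28).
# Sums the byte sizes of the newest <set> files and multiplies by 4
# (four 15-minute sets per hour) to estimate hourly usage in bytes.
hourly_bytes() {
  dir=$1
  set_size=${2:-28}
  total=0
  for f in $(ls -rt "$dir" | tail -n "$set_size"); do
    size=$(wc -c < "$dir/$f")
    total=$(( total + size ))
  done
  echo $(( total * 4 ))
}

# Usage (commented out):
# hourly_bytes /appl/proviso/data/datachannel/BLB.2.251/ALCATEL_5620_SAM/done/sam
```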
Debugging UBAs
Show the number of input files (in this case it processes 10 files every 15 mins, 2 of which have data)
grep UBA.5.229 proviso.log | grep PERF_INPUT_PROCESSING | cut -d " " -f 1,3 | cut -d "." -f 1-5,7 | uniq -c
1 2011.09.26-19.31.01 210
2 2011.09.26-19.31.01 0
2 2011.09.26-19.31.01 210
2 2011.09.26-19.31.01 0
1 2011.09.26-19.31.01 210 <--- normal amount
1 2011.09.26-19.31.01 0
5 2011.09.26-19.31.59 0
2 2011.09.26-19.45.17 0
11 2011.09.26-19.46.30 0 <---- lots of zeros is error!
5 2011.09.26-19.48.20 0
13 2011.09.26-20.00.20 0
5 2011.09.26-20.02.15 0
13 2011.09.26-20.15.20 0
5 2011.09.26-20.17.15 0
13 2011.09.26-20.30.26 0
5 2011.09.26-20.32.22 0
1 2011.09.26-20.46.05 1046 <---- error!
2 2011.09.26-20.46.05 0
2 2011.09.26-20.46.06 1046
2 2011.09.26-20.46.07 0
1 2011.09.26-20.46.07 1046
2 2011.09.26-20.46.07 0
3 2011.09.26-20.46.08 0
5 2011.09.26-20.47.50 0
1 2011.09.26-20.59.51 0
1 2011.09.26-21.00.52 209
2 2011.09.26-21.00.52 0
2 2011.09.26-21.00.52 209
2 2011.09.26-21.00.52 0
1 2011.09.26-21.00.52 209
4 2011.09.26-21.00.52 0
5 2011.09.26-21.02.48 0
1 2011.09.26-21.14.51 0
1 2011.09.26-21.15.52 210
2 2011.09.26-21.15.52 0
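Long runs of zeros are easier to spot with a filter over the count, timestamp and value columns produced by the pipeline above. A minimal sketch: `flag_zero_runs` is a hypothetical helper, and the default threshold of 10 consecutive zero-value entries is an assumption to tune per UBA.

```shell
#!/bin/sh
# Hypothetical helper: reads "count timestamp value" lines (uniq -c output)
# on stdin and prints the timestamps where a zero value repeats at least
# <threshold> times in a row. $1 = threshold (default 10, an assumed value).
flag_zero_runs() {
  threshold=${1:-10}
  awk -v t="$threshold" '$3 == 0 && $1 >= t { print $2 }'
}

# Usage (commented out):
# grep UBA.5.229 proviso.log | grep PERF_INPUT_PROCESSING \
#   | cut -d " " -f 1,3 | cut -d "." -f 1-5,7 | uniq -c | flag_zero_runs 10
```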
Something else to look for is the METRICSTREAMINFO messages. Unless the UBA crashes while it is writing to the BOF file, these show how many metrics were computed.
If the datafeed for a UBA has a large gap (several hours of missing data), the UBA needs to pretend to process every hour it has missed. Expect messages like the following for every data hour missed.
2014.02.06-05.15.45 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing elementManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing subelementManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing formulaManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.15.59 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing propertyManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationManager (between 2014.02.05-22.00.00 and 2014.02.05-22.59.59)...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships elementManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships subelementManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships formulaManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships propertyManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships relationManager...
2014.02.06-05.16.00 UTC UBA.6.225-20638 1 METADATA_LOAD Done resyncing relationships
2014.02.06-05.16.19 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing elementManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.19 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing subelementManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing formulaManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing propertyManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationManager (between 2014.02.05-23.00.00 and 2014.02.05-23.59.59)...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships elementManager...
2014.02.06-05.16.20 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships subelementManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships formulaManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships propertyManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships relationManager...
2014.02.06-05.16.21 UTC UBA.6.225-20638 1 METADATA_LOAD Done resyncing relationships
2014.02.06-05.16.41 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing elementManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.41 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing subelementManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing formulaManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing propertyManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationManager (between 2014.02.06-00.00.00 and 2014.02.06-00.59.59)...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships elementManager...
2014.02.06-05.16.42 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships subelementManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships formulaManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships propertyManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships relationManager...
2014.02.06-05.16.43 UTC UBA.6.225-20638 1 METADATA_LOAD Done resyncing relationships
2014.02.06-05.17.02 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing elementManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing subelementManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing formulaManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.16 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing propertyManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationManager (between 2014.02.06-01.00.00 and 2014.02.06-01.59.59)...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships elementManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships subelementManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships formulaManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships propertyManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Resyncing relationships relationManager...
2014.02.06-05.17.17 UTC UBA.6.225-20638 1 METADATA_LOAD Done resyncing relationships
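To list which data hours a UBA has caught up on, the elementManager resync messages can be reduced to their time windows. A minimal sketch, assuming the message text matches the examples above; `resynced_hours` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints the start and
# end of each hour window for which an elementManager resync was logged.
# Only "Resyncing elementManager (between A and B)" lines match, so each
# missed hour is reported once.
resynced_hours() {
  sed -n 's/.*Resyncing elementManager (between \([^ ]*\) and \([^)]*\)).*/\1 \2/p'
}

# Usage (commented out): grep UBA.6.225 proviso.log | resynced_hours
```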
Check what the datachannel is up to
Find when the latest sub-directory was created given a path and search pattern (aka datachannel done files..)
date; bash -c 'for DIRECTORY in `find /appl/proviso/data/datachannel/ -name done -type d -user pvuser`; do echo "==";echo "$DIRECTORY"; ls -rt $DIRECTORY | tail -1 ; done'
Count the files waiting to be processed
date; find /appl/proviso/data/datachannel/BLB*/output | cut -d "/" -f 6 | sort | uniq -c
Inventory
Inventory is kicked off by issuing 'samifdump chan.subchan'. You will then see the following in the log
*** Full Dump Started ***
*** Full Dump Complete ***
Inventory isn't written back to the db until you see
2014.08.07-04.03.31 UTC SAMIF.6.107-13338 I FLUSHINVENTORY Flushing buffered inventory changes to db
2014.08.07-04.03.32 UTC SAMIF.6.107-13338 I PERF_INVFLUSH Inserted/updated 128 inventory objects in 0.241 seconds
2014.08.07-04.03.32 UTC SAMIF.6.107-13338 1 PERF_INVFLUSH_PvSubelement DB flush stats: PvSubelement(totalObjects=6, totalTime=22, numberOfNewObjects=6, numberOfInserts=1, insertTime=22, avgInsertTimePerObject=3.6666666666667d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
2014.08.07-04.03.32 UTC SAMIF.6.107-13338 1 PERF_INVFLUSH_PvProperty DB flush stats: PvProperty(totalObjects=118, totalTime=197, numberOfNewObjects=118, numberOfInserts=1, insertTime=197, avgInsertTimePerObject=1.6694915254237d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
2014.08.07-04.03.32 UTC SAMIF.6.107-13338 1 PERF_INVFLUSH_PvElement DB flush stats: PvElement(totalObjects=4, totalTime=22, numberOfNewObjects=4, numberOfInserts=1, insertTime=22, avgInsertTimePerObject=5.5d, numberOfUpdatedObjects=0, numberOfUpdates=0, updateTime=0, numberOfDeletedObjects=0, numberOfDeletes=0, deleteTime=0, avgUpdateTimePerObject=0, numberOfInsertConvertedToUpdates=0, numberOfRejectedObjects=0)
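To confirm that a flush happened and how large it was, the summary message can be reduced to its two numbers. A minimal sketch, assuming the message name appears in the log as PERF_INVFLUSH with the wording shown above; `inventory_flushes` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints, for each
# inventory flush summary, "<objects> <seconds>".
inventory_flushes() {
  sed -n 's/.*PERF_INVFLUSH Inserted\/updated \([0-9]*\) inventory objects in \([0-9.]*\) seconds.*/\1 \2/p'
}

# Usage (commented out): grep SAMIF.6.107 proviso.log | inventory_flushes
```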
To be able to process the JMS updates (realtime inventory) the SAMIF needs IDMAP files.
If you're running with the default 'inventory on startup = true' they will be generated and kept up to date. If the SAMIF state directory is removed, the following needs to be run (a samifdump is not enough to regenerate them)
dccmd debug SAMIF.6.107 "self server firstAdaptor modelInterface generateFullIdMap"
The SAMIF does not need to regenerate the IDMAP files if it is only stopped and then started (only if the state is removed).
Alarms
Get the current 5620 SAM Alarm List
<?xml version="1.0" encoding="UTF-8"?>
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <SOAP:Header>
    <header xmlns="xmlapi_1.0">
      <security>
        <user>fm_user</user>
        <password>md5hashh</password>
      </security>
      <requestID>client1:0</requestID>
    </header>
  </SOAP:Header>
  <SOAP:Body>
    <fm.FaultManager.findFaults xmlns="xmlapi_1.0">
      <faultFilter>
        <and class="fm.AlarmObject">
          <equal name="olcState" value="inService"/>
        </and>
      </faultFilter>
    </fm.FaultManager.findFaults>
  </SOAP:Body>
</SOAP:Envelope>
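The `md5hashh` placeholder in the password element suggests the password is sent as an MD5 hash. A minimal sketch for producing one, assuming the SAM expects the plain hex MD5 digest of the clear-text password (check this against the SAM XML API documentation for your release) and that `md5sum` is available; `sam_password_hash` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the hex MD5 digest of the given string.
# printf '%s' is used so that no trailing newline is hashed.
sam_password_hash() {
  printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# Usage (commented out): sam_password_hash 'clear-text-password'
```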