01/16/2000 Owl shift report


Subject: 01/16/2000 Owl shift report
From: Vivian O'Dell (odell@fnal.gov)
Date: Fri Feb 04 2000 - 15:26:34 CST


Shift Summary for owl shift, 16/Jan/2000

On Shift: BobH, Valeri.

Hours of beam available at all intensities: 7.0
Hours of beam available at nominal intensity: 6.8
Hours of E799 data (special runs not included): 4.75
Hours of special runs (laser scans, muon runs...): 0.0

Accounting of lost time:

Stop/start run, tape change, fresh_starts, pipeline crates reset and
reboot, debugging and tracking down lost permit.

========================================================================
Runs ended this shift:
Run 15541 - 8.75 M evts - BEAM_799
Run 15543 - in progress - BEAM_799

Tapes brought to FCC this shift:
       UP(A,B,C)322-324,UPS108,UPK041,UPM057
       QKE173-181, QKB167
========================================================================
Accesses during shift:

None, but we made one at the beginning of the next shift to track down
the lost permit.

========================================================================
Overview of shift:

Good beam all shift, with intensity around 9e12 and no missing spills.
But we had problems resetting and rebooting the pipeline crates (lost
1.5 hrs), as well as a lost permit in stream 5 near the end of shift
(lost 45 min). We also had several pipeline error alarms/hcc ineff due
to a stream 5 ordinal number mismatch; we found that kfrend fixes the
problem.

========================================================================
Description of Shift:

run   spill    time comment
------------------------------------------------------------------------
15541          0:30 SDAQ Epicure button is red: NM2S1 reads back
                      515+-10A, below the limit.

               1:35 Tapes are full. Page Ed to install CsI constants,
                      update FERA pedestals, change tapes and do fresh
                      start (had pipeline errors and one ktev2 CPU used
                      more time than others).

               1:50 Run hangs on init - damp server communication error
                      Also having problems initializing pipeline.

               3:00 We were fighting with pipeline for more than an hour.
                      Rebooted all crates once, and crates 11, 13, and 14
                      once more (init hung up on crate 11 originally,
                      then crates 13 and 14 joined it during various
                      pipeline resets). Paged BobT; after trying a few
                      things he decided to come over.
                      Meanwhile, we decided to try to start a new run
                      after drt_fresh_start without pipeline
                      initialization to see if DA would work. And it did.

15543          3:05 Start new run, got pipeline errors in the first
                      spill, reset pipeline (successfully this time for
                      unknown reason).

               3:15 Pipeline errors again - reset pipeline.

               3:45 SDAQ asked to call Theo and Ashkan and tell them
                      to create new swic db file (Red SDAQ button).

               4:30 Pipeline errors. Since BobT suspected that these
                      errors are caused by stream 5, which had wrong
                      sparse code and ordinal number, we try to do
                      kfrend only and it fixes the problem.

                      Found that there is always 'FFFF word is screwed up'
                      error in stream 5 right before the 'ordinal number/
                      sparse code mismatch' errors and event dump shows
                      that stream 5 is completely missing in that event,
                      but the next event has stream 5 with ord. number
                      and sparse code from the previous event. Also in
                      most cases the 'ordinal number mismatch' errors are
                      coming either from one of the filter planes or from
                      a filter plane and ktev3.

               4:35 Mounted UPF078 on ktev4.

               5:20 Pipeline errors - kfrend.

               5:30 Mounted UPT026 on ktev2.
                      Triggers stopped, DYC53 (AZLAT) busy - kfrend.

               6:15 Triggers are not running, lost permit in stream 5.
                      After that - lots of kfrends,
                      repeater power cyclings (both upstairs and
                      downstairs). Also tried to disable the triggers
                      and send one trigger at a time. Nothing seems to
                      work.

               6:45 Page Leo (run manager). More kfrends, power
                      cyclings and sending single and multiple triggers.

               7:30 BobH and Valeri go on access to look for lost
                      permit. The permit was found in crate 31 (HA/BA)
                      and the crate was power cycled, but we were still
                      losing permit in it. Closer examination has shown
                      that the crate is not getting L2 signal. The problem
                      was localized to patch panel in the same rack (RR90)
                      and L2 was moved to new connectors, labelled as BA3
                      (or something like that). Got trigger going.
                    
========================================================================
Follow up Notes on Previous Shift

========================================================================
Items to Follow up on Next Shift

In most cases kfrend will fix "pipeline errors".

DA experts might want to look into the missing stream 5 problem which
causes the "pipeline errors".


