ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, 17-19 April 2013 ICT Group Planning: Control Rafael Hiriart ICT Control Group Lead



Page 1: ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, 17-19 April 2013

ALMA Integrated Computing Team

Coordination & Planning Meeting #1, Santiago, 17-19 April 2013

ICT Group Planning: Control

Rafael Hiriart

ICT Control Group Lead

Page 2: Outline

Status since the last coordination meeting, Santiago 2012. What we have delivered, what is pending.

Features & issues.

Tentative schedule.
─ 3 month “recession” period.
─ After recession.

Page 3: Control Group Resources

ICT-CPM1 17-19 April 2013

Ralph Marson
Jorge Avarias (Control/Scheduling)
New hire, who was going to start next week, backed out yesterday. => Software Engineer II position open
Rodrigo Amestica
Jesus Perez
Matias Mora
Rafael Hiriart
Open Position: Software Engineer III – General Control Developer
Open Position: Software Engineer II – Control/ObOps Developer

5.5 FTEs currently, 8 FTEs when complete.

Currently there are really 1.5 FTEs in Control and 2.5 FTEs in Correlator for development and debugging work.

Page 4: Control Group “Components”

Control (or Antenna, Tuning & Timing)
– Includes Monitoring only up to the BACI property interface.

BL Correlator

H/W Configuration Database (a.k.a. TMCDB)
– S/W side is ACS, including S/W tmcdb-explorer plug-ins.

Metadata Capture (a.k.a. DataCapturer)

QuickLook
– Includes processing and visualization of TelCal results.
– It does not include the Real Time Filler, except for a couple of calls to activate and deactivate it. Currently it is completely deactivated, because it was suspected to cause bulkdata problems. What is the plan in this regard?

ALMA Phasing Project

Page 5: Deliveries since 2012 Coordination Meeting

Observation efficiency was identified as the highest priority.
– Substantial improvements in inter-subscan interval (1.5 seconds, can be made shorter with additional improvements in the correlator). Scan sequences allow similar timings.
– Management of LS slave lasers delivered; some HW issues are being investigated by BackEnd.
– FrontEnd optimizations are being followed up by ADC and FrontEnd IPT.
– Integration of TelCal calibration results into scan sequences pending.

Monitoring enhancements delivered (batch 1 and batch 2).
– We delivered what CIPT had agreed to deliver according to “Analysis of collated…”.
– Pending acceptance. Progress in recent meeting between Arturo and Maurizio.

DataCapturer incremental writes under testing.
– An intermediate solution for memory issues is ready for Phase V in 9.1.4.
– A prototype for a more general solution is well advanced. This will allow us to improve QuickLook as well.

Additional ASDM tables were populated, but more are coming (Ephemeris being the most important).

Page 6: Deliveries since 2012 Coordination Meeting (continuation)

4 quadrant correlator delivered, including porting the CDP Master to 64-bit Linux.
– More of a “capability” than a simple feature.

Extensive refactoring in the threading structure of the correlator,
– which allowed subscans to overlap in the correlator, improving efficiency.
– Paved the way to achieve 100% data rate.

QuickLook summaries and “alarm” reports delivered and integrated with AQUA.

Several TMCDB improvements, including adding versioning to several tables, FLOOG and DTX offsets, and support for “global” configurations. More requirements have been requested.

90 degrees Walsh functions were delivered, but this feature hasn’t been sufficiently tested yet.

Many bugs fixed.
– In general, tracking and fixing problems is taking more than 50% of our time. Not surprising given reduced FTEs.
– All Cycle 1 critical bugs (1-2 priority in Stuartt’s document) have been addressed.

Page 7: Planned features that haven’t been delivered yet

Alarms.
– Configuration issues were taken over by ADC. Some progress, but further improvements are currently stopped, waiting for an update in valid range limits from BackEnd (Rodrigo Brito?).
– Still pending is to devise a way of not sending some alarms during inter-subscan intervals (WCA not locked, for example).
– Issue with resetting alarms when components go up or down. Components need to reset the alarm state for each BACI property one by one.
– These last two issues need more discussion with ACS.

Pending correlator modes
– 2x Nyquist, 3x3 and 4x4 modes. These have been de-scoped several times; they don’t seem to have a high relative priority with respect to other correlator features.
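One way to picture the pending inter-subscan alarm suppression is a filter that drops a configurable set of "expected" alarms when their timestamp falls inside a known inter-subscan gap. The sketch below is only an illustration: the `Alarm` class, the alarm name `WCA_NOT_LOCKED`, and the gap list are hypothetical and are not taken from the ACS alarm system or from Control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    name: str         # hypothetical alarm identifier
    timestamp: float  # seconds, on any consistent clock


# Assumption: alarms that are expected while hardware retunes between subscans.
TRANSIENT_ALARMS = frozenset({"WCA_NOT_LOCKED"})


def in_any_gap(t, gaps):
    """True if time t falls inside one of the (start, end) inter-subscan gaps."""
    return any(start <= t < end for start, end in gaps)


def filter_alarms(alarms, gaps, transient=TRANSIENT_ALARMS):
    """Drop transient alarms raised during a gap; keep everything else."""
    return [
        a for a in alarms
        if not (a.name in transient and in_any_gap(a.timestamp, gaps))
    ]
```

Alarms outside the transient set are always forwarded, so a real fault raised during a gap is not hidden by the filter.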

Page 8: Current Planning Process

Feature delivery schedule is highly dynamic; plans change often according to updated priorities, including high priority problems.

The Control Software Coordination Group (CSCG) meeting is our main forum to discuss priorities and planning.
– Currently we have representatives from Control (Rafael, Ralph and Rodrigo), ADC (Tzu and Ruben), SIST (Neil), CSV (Stuartt), and ADE (Nick).
– Missing is DSO, but we usually discuss planning with DSO either in the Scheduling or AQUA meeting.
– CSCG is an effective and useful meeting. It allows us to understand what the higher priorities for the project are and adjust our plan accordingly.
– As discussed yesterday, we'll include Manabu.

Requirement management process discussed yesterday.
– Ticket priorities and changes in the development plan will be discussed in this meeting.

Page 9: Important Features – Control

Improve Correlator Calibrations Management
– Fix bug on Scan Sequences, but beyond this, it is necessary to investigate under what conditions we need to calibrate the correlator (e.g. what is the necessary input power range, etc.)

Fast Switching
– For the LS, the ball is currently on the BackEnd side, but after this is done, we may need to follow up by making adjustments in the software.
– In the FrontEnd, algorithmic optimizations are being investigated; we will also need to parallelize some operations.
– Observing Mode level software is complete.

Artificial Source
– Current plans?

Scan Sequence II
– Involves integrating management of the TelCal calibrations inside the Scan Sequences. Needed for VLBI as well.

Page 10: Important Features – Control (continuation)

Fast Scanning
– Lissajous patterns in the antenna motion.

Nutator Observing
– Device driver delivered, includes only monitor/control points.
  • A small issue with scaling factors for some monitor/control points, which at the time of development were unclear. Clarification request rejected (!???).
  • More issues with the meaning of dwell and transition periods. This is relevant for writing the observing mode. Clarification request also rejected.
  • In many cases, significant details necessary to write software have been specified in Operations Manuals (e.g., Backend sub-device initialization order, LS algorithms). This time we are asking for the Nutator Operations manual.
– Feature development pending on acceptance.
– Will require binning in the ACA correlator, not in the BL correlator. Is this still true?

Dynamic Sub-arrays

Frequency Switching

Get rid of CASA dependencies
– Request from CASA, to avoid having to maintain a build for ALMA online SW.
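For readers unfamiliar with the Lissajous patterns mentioned under Fast Scanning, the sketch below generates the pointing offsets of such a pattern as two sinusoids with different frequencies on the two axes. It is a generic illustration only; the function name, the parameters and their units are assumptions and do not describe the Control observing mode interface.

```python
import math


def lissajous_offsets(n_samples, duration_s, amp_x, amp_y,
                      freq_x_hz, freq_y_hz, phase=math.pi / 2):
    """Return (t, x, y) samples of a Lissajous curve.

    x and y are offsets from the map center, in the same (arbitrary)
    angular unit as amp_x and amp_y. All names are illustrative.
    """
    points = []
    for i in range(n_samples):
        t = duration_s * i / n_samples
        x = amp_x * math.sin(2 * math.pi * freq_x_hz * t + phase)
        y = amp_y * math.sin(2 * math.pi * freq_y_hz * t)
        points.append((t, x, y))
    return points
```

Choosing a frequency ratio that is not a simple fraction makes the curve fill the rectangle of half-widths `amp_x` and `amp_y` more evenly before it repeats.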

Page 11: Important Features – Control (continuation)

Sideband separation in autocorrelations by frequency stepping.
– Not sure if this is still a requirement.

Blanking.

DRX crossbar control.

Flag by Antenna shadowing.

Incorporate Doppler correction on ALMA-OT validation of frequency tuning.

Extend IF frequency from 4-8 GHz to 4-9 GHz to avoid (artificially) untunable gaps.
– We requested that FrontEnd specifications be updated. FrontEnd said no. Is this still a requirement?

Introduce sub-reflector movements in delay calculations.

Improvements in Array and Antenna OMC plugins.

Documentation (e.g., delay tracking / fringe rotation).

Page 12: Monitoring Enhancements

Many features delivered, but not yet accepted. Path forward discussed yesterday.

An important feature that was delivered is the ability to archive all monitor points.
– Missing types were implemented.
– Monitoring configuration is now only a matter of adjusting TMCDB records.
– “Composite” properties are being separated in the database.

Archive on-change (a pure ACS feature) is key to reducing monitoring rates.
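The effect of archiving on change can be illustrated with a small filter that keeps a sample only when it differs from the last archived value by more than a deadband. This is a generic sketch of the idea, not the ACS implementation; the function name and the `deadband` parameter are assumptions.

```python
def archive_on_change(samples, deadband=0.0):
    """Keep only (timestamp, value) samples that changed by more than deadband.

    The first sample is always archived. Each later sample is compared with
    the last *archived* value, so slow drifts are still recorded eventually.
    """
    archived = []
    last_value = None
    for timestamp, value in samples:
        if last_value is None or abs(value - last_value) > deadband:
            archived.append((timestamp, value))
            last_value = value
    return archived
```

For a monitor point that is sampled periodically but rarely changes, only the first sample and the actual changes are stored, which is where the reduction in monitoring rate comes from.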

Page 13: BL Correlator

Subarrays
– It will involve work both in Control and in Correlator.
– Design is being discussed. The main problem is how to synchronize the timing so there are no conflicts on the quadrants’ CAN bus.
– Execution in the first subarray will be optimal. A subsequent execution in a second subarray will be sub-optimal, as it will need to be scheduled in such a way as to avoid already used CAN “slots”.

Full Correlator Data Rate (60 MB/s)
– Rodrigo working on this. Interrupted to support bulkdata issues investigation.

Manage more than 16 configuration and calibration slots.
– J working on this. Interrupted to debug several issues.

Correlator Pending Modes
– FAST mode (1 ms auto-correlations) already implemented; it needs verification.

Correlator Multi-Resolution Modes

LO offsetting sideband “separation”

Correlator hardware timestamping

3x3 bit quantization correction in FDM
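The CAN “slot” constraint for subarrays can be pictured with a toy model: each command has a preferred slot inside a timing window, and a command whose preferred slot is already taken is pushed to the next free one. The sketch below is purely illustrative; the slot numbering, the window size and the function name are assumptions, not the actual correlator design under discussion.

```python
def schedule_commands(preferred_slots, used_slots, total_slots):
    """Assign each command the first free slot at or after its preferred slot.

    preferred_slots: slot index each command would ideally use.
    used_slots: slots already claimed (e.g., by another subarray).
    Raises RuntimeError if a command cannot be placed in the window.
    """
    taken = set(used_slots)
    assigned = []
    for want in preferred_slots:
        slot = want
        while slot < total_slots and slot in taken:
            slot += 1
        if slot >= total_slots:
            raise RuntimeError("no free CAN slot left in this window")
        taken.add(slot)
        assigned.append(slot)
    return assigned


# The first subarray gets its preferred slots; the second is shifted later.
first = schedule_commands([0, 1, 2], set(), total_slots=8)
second = schedule_commands([0, 1, 2], first, total_slots=8)
```

In this model the second subarray's commands land later than preferred, which corresponds to the sub-optimal execution described for a second subarray.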

Page 14: H/W Configuration Database

Improvements in utilities to better track Antenna status (high priority, included in “recession” period features).
– Complete Antenna CAI map.
– ACA specific pad delays.
– Track antenna metrology mode.
– Complete history of BaseElements and Assemblies.
– MonitoringUpdate tool. Mass update of monitoring properties.
– Access to Antenna pad position history (needed for pipeline).

Extend versioning to other tables.
– Assembly tables
– There are other tables in the S/W side (Component, BACI property), but this is outside our scope. Is ACS considering this?

Global configuration delivered, but not support for S/W side. Is this an ACS requirement?

Do we need to integrate the TMCDB with the Antenna Status Dashboard?
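Access to the Antenna pad position history amounts to an "as of" lookup: given an antenna and a time, return the pad assignment that was valid then. The sketch below shows the idea with an in-memory SQLite table. The table name, its columns and the antenna and pad names are illustrative placeholders and do not reflect the TMCDB schema.

```python
import sqlite3


def open_history():
    """Create a hypothetical antenna-to-pad history table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE antenna_pad_history ("
        " antenna TEXT NOT NULL, pad TEXT NOT NULL, start_time REAL NOT NULL)"
    )
    return conn


def pad_at(conn, antenna, when):
    """Return the pad the antenna was on at time `when`, or None if unknown."""
    row = conn.execute(
        "SELECT pad FROM antenna_pad_history"
        " WHERE antenna = ? AND start_time <= ?"
        " ORDER BY start_time DESC LIMIT 1",
        (antenna, when),
    ).fetchone()
    return row[0] if row else None
```

Because each row only stores the start of validity, a relocation is recorded by inserting a new row, and earlier assignments remain available to the pipeline.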

Page 15: DataCapturer

Full incremental writes well advanced.
– The ASDM will be written to a relational database as the observation is performed.
– Discussed with DSO, ARCHIVE, OBOPS, OFFLINE. Everybody thinks it’s a good idea.
– It will make DataCapturer more robust (no longer holding everything in memory, transaction support, etc.).
– It will allow us to improve QuickLook, and it will let other applications query metadata without having to wait for the observation to finish (e.g. some of the ERMA functionality could be integrated).

New tables.
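The incremental-write idea can be sketched with SQLite: rows are committed per subscan inside a transaction, so a failed write leaves no partial subscan behind and readers can query what has been committed while the observation is still running. This is a minimal illustration only; the table layout and function names are assumptions and are not the ASDM schema or the DataCapturer prototype.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store with a simplified, hypothetical pointing table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pointing ("
        " exec_id TEXT NOT NULL, subscan INTEGER NOT NULL,"
        " antenna TEXT NOT NULL, az REAL, el REAL)"
    )
    return conn


def write_subscan(conn, exec_id, subscan, rows):
    """Write all rows of one subscan atomically.

    rows is an iterable of (antenna, az, el). The connection context manager
    commits on success and rolls back if any insert raises.
    """
    with conn:
        conn.executemany(
            "INSERT INTO pointing VALUES (?, ?, ?, ?, ?)",
            [(exec_id, subscan, ant, az, el) for ant, az, el in rows],
        )
```

Nothing is accumulated in memory across subscans in this sketch, which is the robustness gain described above.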

Page 16: QuickLook

QuickLook functionalities:
– Plot TelCal results, aggregating and computing simple statistics from calibration tables.
– Construct AQUA summaries.
– Manage the RTF.
  • Currently deactivated. We should probably move this out of here.

It’s a model application for a three-tier system (data, logic, presentation).
– DataCapturer incremental writes enable this. Most of the application code can be replaced by queries to the relational database (using GROUP BY, etc.)
– QuickLook can now have “memory”. It can go back and plot past results. This is not currently supported.
– We can get rid of a complicated interaction between Control, DataCapturer, TelCal and QuickLook. In fact, the plugins can simply be notified of new results at the end of a subscan and query the database themselves. We could get rid of obscure freezes.

External libraries that could further simplify and enhance QuickLook capabilities are being evaluated (the R statistical package, for instance).
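As an illustration of replacing application code with database queries, the sketch below computes per-antenna summary statistics with a single GROUP BY query. The table name `cal_result` and the `tsys` column are hypothetical stand-ins for a calibration table; they are not taken from the ASDM definition.

```python
import sqlite3


def open_results():
    """Create a hypothetical calibration results table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cal_result ("
        " scan INTEGER NOT NULL, antenna TEXT NOT NULL, tsys REAL NOT NULL)"
    )
    return conn


def summarize_by_antenna(conn):
    """Return (antenna, n, mean, min, max) rows, one per antenna."""
    return conn.execute(
        "SELECT antenna, COUNT(*), AVG(tsys), MIN(tsys), MAX(tsys)"
        " FROM cal_result GROUP BY antenna ORDER BY antenna"
    ).fetchall()
```

Since results from earlier scans stay in the table, the same query also covers the “memory” use case: adding a `WHERE scan <= ?` clause would reproduce the statistics as they stood at any earlier point.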

Page 17: ALMA Phasing Project (VLBI)

Quite detailed design and planning documents are being completed for the CDR (~100 pages, last time I looked). New ASDM tables included in the Appendix.

Development will begin after the CDR. We estimate 1.5 FTE of development effort.

Richard Hills sent around a memo describing the consequences of the IF residual fringe in the phasing. We are studying possible alternatives.
– Richard also asked who will assume the “system engineering” aspects of the project for the ALMA side. Who should we talk to about these issues?

A complete test environment has been put together in Charlottesville. This is a full STE, with additional node machines. It will allow us to integrate all the components involved in the phasing loop (Correlator, TelCal and Control), and test against the hardware being developed by the correlator H/W team (2 antenna correlator rack).

Page 18: “Recession” Period Features

Startup enhancements (Tzu is organizing the requirements, from Emilio’s email).
– We should concentrate on automating the startup procedure as much as possible, and on improving the way real problems are reported to the operators.

Merge and test DataCapturer Pointing table incremental write from 9.1.4.

TMCDB improvements to facilitate tracking Antenna status.

Array and Master should be improved to avoid FSR.
– Destroy array should always work.
– Reinitialize Antennas should be extended to Central LO, AOS Timing, etc.
– Add the capability to update the Master Antenna list (I believe this was part of Tzu's Antenna plug-and-play project).

Page 19: Testing Improvements

Scalability test for DataCapturer.

Define more granular tests for OSF regression tests.

Correlator simulator improvements.

DataCapturer scalability tests.

Total Power processor tests.

Integrate correlator in Control's main integration test.

Page 20: After recession schedule

Ralph
– Calibration ID management
– Fast Scanning or Nutator (July)
– Fast switching (as needed)
– Artificial Source (October)
– Scan Sequence II (Jan. 2014)

Rodrigo
– 100% data rates (~1-2 month development left)
– LO offsetting sideband separation (July)
– Pending modes (September)
– Multi-resolution modes (Jan. 2014)

Page 21: After recession schedule (continuation)

J
– Manage more than 16 configurations (~1 month left)
– Subarrays (October)

Matias
– Correlator simulator (improvements during recession period)
– APP
– Subarrays