
sheather/RFIProposalV6.doc · Web viewWhile we plan to focus on neural network classifiers in this work, alternative classification methods exist that could also be suitable for this


Contents

1 Introduction
2 Science Background
3 Radio Frequency Interference
4 Pulsar Timing and Real-time RFI Excision and Cancellation
5 Pulsar Searching and RFI Identification using Machine Learning
6 Relevance to NASA and Jurisdiction
7 NASA Interactions
8 Project Partners, Management and Personnel
9 Tasks and Schedule
10 Partnerships and Sustainability
11 Dissemination
12 Prior NASA Research Support
13 References


1 Introduction

The National Radio Astronomy Observatory, in collaboration with West Virginia University (WVU), Brigham Young University (BYU), and JPL, proposes a research program to develop state-of-the-art radio frequency interference (RFI) mitigation techniques. These techniques will enable sensitive pulsar searches and timing that will assist NASA in addressing key science questions in the targeted research program “Physics of the Cosmos (PCOS)”. PCOS is a focused program within NASA’s Astrophysics Division which seeks to understand the basic building blocks of our existence - matter, energy, space, and time - and how they behave under extreme physical conditions. PCOS addresses central questions about the nature of complex astrophysical phenomena such as black holes, neutron stars, dark energy, and gravitational waves.

Pulsar observations with the Robert C. Byrd Green Bank Telescope (GBT), located in Green Bank, WV, can directly answer these questions. Pulsars, rapidly rotating neutron stars with clock-like timing precision, provide insights into a rich variety of physics and astrophysics. Specifically, high precision pulsar timing observations address three of the five key science goals of the PCOS Program (http://pcos.gsfc.nasa.gov/science), namely to: (i) test the validity of Einstein's theory of General Relativity and investigate the nature of space-time; (ii) understand the formation and growth of supermassive black holes and their role in the evolution of galaxies; and (iii) explore the behavior of matter and energy in their most extreme environments.

The broader bandwidths made possible by newly developed pulsar instrumentation present the opportunity to dramatically increase pulsar search sensitivity and timing precision. These will lead to dramatic advances in all of these areas. However, taking advantage of broad-band observations requires the development of improved techniques to remove RFI, which becomes a larger problem as bandwidths increase.

The goal of this proposal is to develop the advanced radio frequency interference excision and mitigation techniques necessary to allow the most sensitive pulsar observations with the GBT. Research and development will occur in two primary areas:

Active Cancellation: Active cancellation of an RFI signal may be accomplished by receiving an interfering signal with a secondary antenna, and then using this signal to cancel out its effects on the radio astronomy signal. A related approach uses a parametric estimation/subtraction technique that exploits known properties of the RFI modulation. We will experiment with combining these two approaches.

Neural Networks: Machine learning algorithms, especially neural networks, have shown great promise in automating astronomical data processing. Our research will focus on the application of neural networks to identify RFI in pulsar data.

This work is synergistic with NASA's Fermi gamma-ray telescope, which is revolutionizing our view of the Galactic neutron star population. Over the past four years, 44 millisecond pulsars have been found via targeted searches of Fermi sources. The RFI mitigation techniques to be developed in this proposal will allow us to find fainter pulsars associated with Fermi sources. In addition to providing astrophysical laboratories to study neutron stars and their environments, the long-term timing of some of these pulsars will be used to add further millisecond pulsars to timing array experiments currently being carried out to detect low-frequency gravitational waves.

2 Science Background

Pulsars are rapidly rotating, highly magnetized neutron stars. Neutron stars can have magnetic field strengths exceeding 10^12 G, rotation rates approaching 1000 Hz, central densities exceeding 10^14 g cm^-3, and normalized gravitational strengths of order 0.4. Due to their large moments of inertia, their pulses provide a highly regular clock when they are detected as radio pulsars. This timing precision makes them natural physical laboratories to test gravitational physics. For example, precise measurements of the pulse arrival times of PSR B1913+16, one member of a double neutron star system, over an interval of decades revealed that this system is losing energy at a rate that is consistent with the emission of quadrupolar gravitational waves as predicted by General Relativity (Taylor & Weisberg 1989). The discovery and subsequent timing of this pulsar resulted in the 1993 Nobel Prize in Physics. More recently, the double pulsar PSR J0737-3039A/B, with two detectable neutron stars, is proving to be a spectacular gravitational and plasma physics laboratory (Burgay et al. 2003; Lyne et al. 2004; McLaughlin et al. 2005a,b; Kramer et al. 2006; Kramer & Stairs 2008). Another exotic system that demonstrates the use of pulsars as probes of fundamental physics is PSR J1748−2446ad, in the globular cluster Terzan 5, which has a pulse period of 1.4 ms (implied rotation rate of 716 Hz; Hessels et al. 2006). For the star to maintain its structural integrity, the nuclear equation of state must allow it to withstand the centrifugal force at its equator.

High precision pulsar timing observations specifically address three of the five key science goals of the PCOS Program, namely to:

1) Test the validity of Einstein's theory of General Relativity and investigate the nature of spacetime. Current GBT observations of the double pulsar system (Figure 1) provide the most sensitive test of general relativity in the strong-field regime (Kramer et al. 2006), and anticipated discoveries of pulsar-black hole binaries and of pulsars orbiting the black hole at the center of the Milky Way are expected to make further fundamental contributions in this area.

Figure 1: Using GBT timing observations, we constrain the masses of the two pulsars and, at the same time, test the predictions of General Relativity. The diagonal lines labeled “R” represent the mass ratio based on the semi-major axes of the orbits of A and B. The shaded orange region is forbidden. The other lines illustrate the relativistic corrections to a Keplerian orbit that are measured. Two lines are plotted for all parameters to illustrate the 1-sigma errors. The masses of A and B correspond to the intersection of all lines and are measured to be 1.3381 ± 0.0007 Msun for A and 1.2489 ± 0.0007 Msun for B. Because all lines intersect at a common point, we can say that all measurements of relativistic parameters thus far are consistent with GR.


2) Understand the formation and growth of supermassive black holes and their role in the evolution of galaxies. Gravitational wave detection via pulsar timing is just over the horizon and will be accelerated by this work. Supermassive black hole binaries are the strongest expected source of nHz gravitational waves (Hobbs et al. 2010). Gravitational wave astronomy will allow us to study the cosmological population of these binaries and possibly individual sources. Current limits based on GBT data (Figure 2) are already beginning to constrain models for galaxy formation and evolution (Demorest et al. 2012).

Figure 2: current upper limits on the gravitational wave spectrum at low frequencies taken from recently published measurements by Demorest et al. (2012) as part of the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), which uses high-precision timing of an array of millisecond pulsars. The current limits are just above the expected regimes for gravitational waves from cosmic strings (green band) and an ensemble of supermassive black hole binaries at cosmological distances (pink band).

3) Explore the behavior of matter and energy in their most extreme environments. Precision measurements of neutron star masses in binary systems allow us to constrain the equation of state of superdense matter (Demorest et al. 2010) and studies of pulsar magnetospheres allow us to probe particle physics in high magnetic-field relativistic plasmas (Li et al. 2012).

Figure 3: mass-radius diagram showing constraints on various neutron star equations-of-state provided by Shapiro time delay measurements of the binary millisecond pulsar J1614-2230 (Demorest et al. 2010). The red horizontal line shows the mass determination of the pulsar (the radius of which is currently unknown). The smooth curves show various mass-radius relations for different equations of state. A number of these models are now excluded by the mass measurement of PSR J1614-2230. The orange and yellow lines indicate previous mass constraints.


Radio Frequency Interference

Radio frequency interference (RFI) is a growing problem over all radio astronomy bands due to the proliferation of wireless applications, portable electronics, and new and expanded communications technologies such as spread-spectrum digital transmission and satellite downlinks. These technologies cause strong, unwanted radio signals that vary as a function of time and frequency throughout key pulsar observing bands. Compounding the effects of the new technologies is the fact that radio receivers and instruments have improved substantially and are now much more sensitive, with wider bandwidths, so that we detect all of these signals in exquisite detail (see Figure 4).

Figure 4: Radio Frequency Interference measured by the GBT in the five receiver bands which currently cover the frequency range from 0.5 GHz to 2.5 GHz.

Most RFI is currently handled via analysis of the statistics of the raw data after it has been recorded to disk. Outlying data samples with values far from those expected as a function of time or observing frequency are removed from the analysis. Simply flagging statistically aberrant time intervals or frequency channels can result in large fractions of the data being discarded, causing a substantial loss of sensitivity, and so more sophisticated techniques are needed.
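The statistical flagging described above can be illustrated with a minimal sketch. It assumes a one-dimensional array of channel powers and uses a median and median-absolute-deviation test; the function name `flag_outliers`, the 5-sigma threshold, and the simulated spectrum are illustrative assumptions, not the procedure used in any existing GBT pipeline.

```python
import numpy as np

def flag_outliers(power, threshold=5.0):
    """Return a boolean mask of samples lying more than `threshold`
    robust standard deviations from the median (hypothetical helper)."""
    power = np.asarray(power, dtype=float)
    median = np.median(power)
    mad = np.median(np.abs(power - median))
    sigma = 1.4826 * mad  # scales the MAD to a standard deviation for Gaussian data
    if sigma == 0:
        return np.zeros(power.shape, dtype=bool)
    return np.abs(power - median) > threshold * sigma

# Simulated bandpass: Gaussian noise plus three strong narrow-band interferers.
rng = np.random.default_rng(0)
spectrum = rng.normal(1.0, 0.05, size=1024)
spectrum[[100, 101, 500]] += 5.0

mask = flag_outliers(spectrum)
fraction_lost = mask.mean()  # fraction of channels discarded by flagging
```

Here only the contaminated channels are discarded. With dense or broadband RFI the flagged fraction grows quickly, which is the sensitivity loss that motivates the more sophisticated techniques proposed here.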

Pulsar observations are particularly susceptible to RFI. Remaining, unexcised RFI impacts pulsar research in two ways. For pulsar timing, the presence of RFI can make the data unusable; more insidiously, low levels of RFI subtly distort the pulse shape, leading to errors in the measured time of arrival (Figure 5). For pulsar searching, the presence of RFI leads to a deluge of “false positives”. The best recourse at present is extremely time-consuming human inspection of each pulsar candidate to distinguish between genuine pulsars and spurious RFI artifacts.

The proposed research project will address both of these concerns via state-of-the-art but realizable signal processing techniques. Pulsar timing will be enhanced via real-time RFI excision and cancellation algorithms. Pulsar searching will be improved using neural networks, to better distinguish genuine candidates from false positives. These two aspects of the proposal are described separately below.


Figure 5: (a) and (c): radio frequency as a function of pulse phase for two different observations which show the quadratic signature of pulse dispersion in the interstellar medium, as well as the contamination by RFI on different parts of the spectrum. RFI is particularly prevalent in the observation in the left panel. (b) and (d): integrated pulse profiles formed after these data are corrected for dispersion. The impact on profile fidelity for the observation including strong RFI is clearly seen in (b) compared to the much cleaner pulse shape in (d).

3 Pulsar Timing and Real-time RFI Excision and Cancellation

High-precision pulsar timing is essential for fundamental physics experiments such as mass measurements, tests of General Relativity, and the detection of low-frequency gravitational waves. In addition, most of the Fermi pulsar detections have been enabled by folding gamma-ray data with a radio-derived ephemeris. Continued timing of radio pulsars is essential for revealing gamma-ray pulsations from more pulsars and for more sensitive gamma-ray detections of known pulsars. Improved RFI excision will increase the sensitivity of our timing observations by allowing us to use larger bandwidths and by removing spurious features in pulse profiles that can increase time-of-arrival errors. The ability to time pulsars more precisely will allow us to not only elucidate the geometries of neutron stars (by comparing the radio and gamma-ray pulse phases), but also will provide sensitivity improvements for those pulsars that are being used in timing arrays for the detection of low-frequency gravitational waves. This latter point is extremely relevant given that gravitational-wave astronomy is one of the key science priorities for NASA in the coming decade. The improvements discussed here will also result in many fewer candidate signals resulting from pulsar searches, dramatically reducing the time needed to inspect pulsar search output.

One important correction needed for high precision timing is the removal of frequency-dependent dispersive and scattering delays incurred as radio pulses traverse the interstellar medium. NRAO is in the process of developing a wide-band (approximately 0.6 – 2.4 GHz) prime-focus receiver system and matching backend to be used primarily for high-precision pulsar timing projects. Such a system will use all of the useful bandwidth available (i.e. at frequencies above those with strong interstellar scattering effects) for high-precision pulsar timing, thereby optimizing the GBT’s sensitivity for such science. In addition, it will enable cutting edge removal of frequency-dependent propagation effects which can systematically limit some pulsar timing observations. This crowded frequency range is especially prone to RFI since it includes most cell phone bands, digital TV broadcasts, UHF commercial land mobile services, the ubiquitous unlicensed industrial, scientific and medical (ISM) bands at 900 MHz and 2.4 GHz, aeronautical traffic control radar and DME services, satellite downlink transmissions, GPS, GLONASS, satellite phone, four large amateur radio bands, and many other high power licensed services. A combination of active cancellation and real-time blanking will be essential to fully realize the capabilities of the new receiver/backend system.
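To give a sense of scale for the dispersive delay mentioned above, the following sketch evaluates the standard cold-plasma dispersion relation, delay ≈ 4.149 ms × DM × (f_lo^-2 − f_hi^-2), with frequencies in GHz and the dispersion measure DM in pc cm^-3. The DM value of 10 pc cm^-3 is an arbitrary illustrative choice, and the function name is hypothetical.

```python
K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 per (pc cm^-3)

def dispersion_delay_ms(dm, f_lo_ghz, f_hi_ghz):
    """Arrival-time delay of the low-frequency band edge relative to the
    high-frequency edge, in milliseconds."""
    return K_DM_MS * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)

# Illustrative pulsar with DM = 10 pc cm^-3 observed across 0.6 to 2.4 GHz.
delay = dispersion_delay_ms(dm=10.0, f_lo_ghz=0.6, f_hi_ghz=2.4)
```

For these assumed values the delay across the band is roughly 108 ms, far longer than a millisecond pulsar's period, so spectrum lost to RFI directly reduces the frequency lever arm available for measuring and removing this effect.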

It has recently become apparent that an additional limitation on high-precision pulsar timing experiments, and therefore on the associated science that can be extracted, is that individual pulses from a pulsar have varying pulse phases. Sophisticated techniques are now being developed to minimize the impact of this “pulse jitter”. However, because these techniques rely on the ability to detect individual pulses and measure pulse shapes with high precision, the data must be essentially free from RFI.

A detailed background on RFI mitigation techniques is provided by ITU-R RA.2126, Kesteven (2010) and references therein. There are two main approaches to RFI mitigation applicable to the GBT:

1. Excision, in the sense of “cutting out” RFI. For example, RFI consisting of brief pulses might be mitigated by blanking the data when the pulse is present.

2. Cancellation, in the sense of “subtracting” RFI from the telescope output. Cancellation is potentially superior to excision in that the RFI is removed with no impact on the astronomical signal, providing a “look through” capability that is nominally free of the artifacts associated with the simple “cutting out” of data.

We have considerable expertise in the application of both of these techniques. Jeffs and co-workers (Dong et al., 2005) have demonstrated the ability to blank pulses from the ARSR-3 FAA Air Surveillance Radar on Apple Orchard Mountain near Bedford, VA, 106 km south-southeast of the GBT. This work was done by recording data to disk and processing it after the fact. Fisher (2004) has also addressed the similar problem of pulsed interference from aviation distance measuring equipment (DME).

Active cancellation of an RFI signal is accomplished by using a high-gain antenna to receive the interfering signal and using it as a reference for cancelling out its effects on the radio astronomy signal. This technique was demonstrated for the first time in radio astronomy by Barnbaum and Bradley (1998). The cancellation process must be done with an adaptive filter, since the signal characteristics change with time. Barnbaum and Bradley used the popular least mean squares (LMS) algorithm based on Wiener filter principles. A limitation of this algorithm is that it requires an input interference to noise ratio (INR) > 1 in order to achieve significant benefit. To achieve an output INR << 1 using this method, it is usually necessary to implement some means to receive the RFI with INR greater than the INR perceived by the primary instrument. Since most large dishes have approximately unity gain in the far sidelobes, the INR can be improved in proportion to the gain of the auxiliary antenna used to receive the RFI. Thus, a yagi with 20 dB gain could improve the INR available to the cancellation algorithm by about 20 dB, which could then reduce INR at the telescope output by a comparable factor. This approach was demonstrated by BYU Masters student Andrew Poulson (2003, 2005). Subsequent work (Jeffs et al., 2005) describes the extension of this “reference signal” approach to achieve better performance against RFI from satellites by using multiple auxiliary signals from dishes with gains on the order of 30 dB.
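The reference-antenna approach can be sketched in a few lines. The example below is a toy real-valued LMS canceller, not the implementation of Barnbaum and Bradley or of the BYU group: the tap count, step size `mu`, and the simulated signals (white-noise RFI, a two-sample delay, a coupling gain of 3, and a nearly noise-free reference standing in for a high-gain auxiliary antenna) are all assumptions chosen for illustration.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=8, mu=0.01):
    """Subtract an adaptively filtered copy of `reference` from `primary`
    using the least mean squares update (hypothetical helper)."""
    weights = np.zeros(n_taps)
    output = np.array(primary, dtype=float)
    for n in range(n_taps - 1, len(primary)):
        taps = reference[n - n_taps + 1:n + 1][::-1]  # most recent sample first
        estimate = weights @ taps                     # current estimate of the RFI
        error = primary[n] - estimate                 # canceller output
        weights += 2.0 * mu * error * taps            # LMS weight update
        output[n] = error
    return output, weights

rng = np.random.default_rng(3)
n = 20000
rfi = rng.normal(size=n)   # interferer as seen by the auxiliary antenna
sky = rng.normal(size=n)   # astronomical signal plus system noise
rfi_in_primary = 3.0 * np.concatenate([np.zeros(2), rfi[:-2]])  # delayed, scaled leakage
primary = sky + rfi_in_primary
reference = rfi + 0.01 * rng.normal(size=n)  # high-INR reference channel

cleaned, weights = lms_cancel(primary, reference)
half = n // 2  # evaluate after the filter has converged
residual_power = np.mean((cleaned[half:] - sky[half:]) ** 2)
input_power = np.mean(rfi_in_primary[half:] ** 2)
```

In this toy case the filter learns the delay and gain of the coupling path (the third weight approaches 3) and the residual RFI power falls well below the input RFI power. Lowering the quality of the reference channel degrades the result, consistent with the INR argument above.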

Roshi (2002) has investigated techniques to suppress interference due to synchronization signals in TV transmission. A combination of noise-free modeling of the synchronization signals and adaptive filtering is used to suppress the interference. The measured lower limit on RFI rejection using this technique on the TV synchronization signal is about 12 dB.

While all of these techniques show promise, none of them are in regular production use. Two significant factors in this have been a) existing approaches have only been available to expert users, and b) until recently, available digital signal processing hardware has only been able to handle limited bandwidths, and time and frequency resolution. This situation is rapidly changing with the advent of powerful multi-core CPU, GPU and FPGA-based radio astronomy backends.

Under the auspices of this grant, we will develop RFI blanking and cancellation techniques based on extensions of the work cited above and implement them on the production digital backends in use at the GBT. The focus of the work will be to: a) convert current off-line prototypes and simulations into production real-time implementations; and b) continue research to advance the state of the art for each of these approaches. The specific work will be as follows:

Investigate how strongly low levels of RFI affect pulsar times of arrival: Paul Demorest has constructed an RFI-free software model of pulsar timing precision for measurements in the 0.3 to 3 GHz frequency range using ranges of known pulsar parameters, telescope sensitivity as a function of frequency, and interstellar dispersion and scattering. This model will be extended to include the loss of spectrum due to known RFI in this frequency range and the improvements to pulsar timing that can be achieved by recovering various portions of lost spectrum with RFI mitigation techniques. This model will also be extended to include the effects of incomplete RFI subtraction on various types of known signals to guide the most effective use of signal processing development and to determine when each specific mitigation algorithm is “good enough”.

Implement radar and DME blankers using the approach of Fisher et al.: The parameters of short-pulse radar signals and signals from aircraft-borne distance measuring equipment (DME) have been well studied in our previous research. The main task here is to implement excision algorithms in FPGA firmware to make them effective in real time and over an unprecedentedly wide spectral bandwidth.

The spectrum allocated to the airborne portion of the DME service is 1.025 to 1.15 GHz, and is divided into 1 MHz wide independent channels. The first signal processing implementation task is to cleanly divide the spectrum into independent 1 MHz channels. There will be at least three identical spectrometers, one for each of the two radio astronomy signals that process orthogonal polarizations and at least one filter bank for a reference channel whose antenna gain is maximized on the horizon from which most DME signals arrive. The next step is to implement a matched filter in the reference channel for the double-pulse DME signature to maximize the detectability of each pulse pair and provide a blanking gate to remove each pulse pair from the two corresponding radio astronomy frequency channels with minimal loss of data. A few of the 1-MHz channels around 1.08 GHz will be discarded in this reconstruction to eliminate the cacophony of radar transponder signals from all aircraft at this frequency within the DME band.
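The matched-filter and blanking-gate step can be sketched for a single 1 MHz channel as follows. This is an illustration only: it assumes Gaussian pulses of 3.5 microsecond width separated by 12 microseconds (nominal DME values, treated here as assumptions), a 2 MHz sample rate, a detection threshold of 2.5, and a simulated reference envelope. The names `pulse_pair_template` and `blanking_gate` are hypothetical, and a production version would run in FPGA firmware rather than NumPy.

```python
import numpy as np

FS_MHZ = 2.0            # assumed sample rate in one channel (samples per microsecond)
PULSE_FWHM_US = 3.5     # assumed nominal DME pulse width
PAIR_SPACING_US = 12.0  # assumed nominal pulse-pair spacing
SIGMA = PULSE_FWHM_US / 2.355 * FS_MHZ          # Gaussian sigma in samples
SPACING = int(round(PAIR_SPACING_US * FS_MHZ))  # pair spacing in samples

def pulse_pair_template():
    """Unit-norm double-pulse envelope used as the matched filter."""
    half = int(round(4 * SIGMA))
    t = np.arange(-half, SPACING + half + 1)
    pair = np.exp(-0.5 * (t / SIGMA) ** 2) + np.exp(-0.5 * ((t - SPACING) / SIGMA) ** 2)
    return pair / np.linalg.norm(pair)

def blanking_gate(reference_envelope, threshold):
    """Correlate the reference channel with the pulse-pair template and
    return a boolean gate covering each detected pair."""
    template = pulse_pair_template()
    score = np.correlate(reference_envelope, template, mode="same")
    hits = score > threshold
    # Widen each detection to the full template length so both pulses are covered.
    widened = np.convolve(hits.astype(float), np.ones(len(template)), mode="same")
    return widened > 0.5

# Simulated reference channel: weak noise plus one pulse pair starting at sample 1000.
rng = np.random.default_rng(2)
n = 4096
t = np.arange(n)
reference = np.abs(rng.normal(0.0, 0.05, n))
for centre in (1000, 1000 + SPACING):
    reference += np.exp(-0.5 * ((t - centre) / SIGMA) ** 2)

gate = blanking_gate(reference, threshold=2.5)
astronomy = rng.normal(size=n)            # one polarization of the astronomy channel
blanked = np.where(gate, 0.0, astronomy)  # apply the gate
fraction_lost = gate.mean()
```

Because the template requires both pulses at the correct spacing, a single isolated pulse scores only about half as high as a true pair, which keeps the gate from firing on unrelated impulsive signals and limits the data lost to blanking.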

Blanking of pulses from ground-based radar is best done with a different strategy from DME pulse blanking. The timing pulses from one or at most a few radar stations near radio astronomy sites follow a repeating sequence. Any slow drifts in absolute times of individual pulse arrival and the rotation rate of a sweeping radar antenna are easily tracked with modest signal processing. For the older style short-pulse radar, blanking in the few MHz around each radar transmitting frequency begins just before the expected pulse arrival time and continues for some tens of microseconds to allow time for echoes from surrounding terrain and nearby aircraft to die away. Newer radars use chirped pulses that occupy a larger fraction of the time between pulses and sweep across a wider frequency bandwidth. The pulses can be de-chirped with the appropriate algorithms in the radio telescope signal processing and removed with not much more loss of science data than with short-pulse radar, but the signal processing design and implementation is more challenging. After pulse removal the processed signal must be re-chirped to restore the high resolution pulsar timing information in the data stream. Chirped radar blanking is a new area of development that will be supported by this research grant. It has immediate relevance because the short-pulse radar that currently affects GBT observing is slated to be replaced by a chirped radar within the period covered by this grant.
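One possible realization of the de-chirp, blank, and re-chirp sequence is sketched below. It assumes the chirp waveform is known, builds a unit-modulus (all-pass) filter from the phase of its spectrum so that the operation is exactly invertible, and locates the compressed pulse by its peak rather than by a predicted arrival time. The waveform parameters, guard width, and function names are illustrative assumptions and do not describe any particular radar.

```python
import numpy as np

def allpass_compressor(template, n):
    """Unit-modulus filter whose phase cancels the phase of the chirp spectrum."""
    spectrum = np.fft.fft(template, n)
    mag = np.abs(spectrum)
    h = np.ones(n, dtype=complex)
    strong = mag > 1e-3 * mag.max()
    h[strong] = np.conj(spectrum[strong]) / mag[strong]
    return h

def blank_chirped_pulse(x, template, guard):
    """De-chirp, blank the compressed pulse, then re-chirp (hypothetical helper)."""
    n = len(x)
    h = allpass_compressor(template, n)
    compressed = np.fft.ifft(np.fft.fft(x) * h)  # de-chirp: pulse collapses in time
    peak = int(np.argmax(np.abs(compressed)))
    gate = np.arange(peak - guard, peak + guard + 1) % n
    compressed[gate] = 0.0                       # blank a short window around it
    return np.fft.ifft(np.fft.fft(compressed) * np.conj(h))  # re-chirp

rng = np.random.default_rng(1)
n, pulse_len = 4096, 1024
t = np.arange(pulse_len)
# Assumed linear sweep from -0.25 to +0.25 cycles per sample over the pulse.
chirp = np.exp(2j * np.pi * (-0.25 * t + 0.25 * t ** 2 / pulse_len))

sky = 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))
radar = np.zeros(n, dtype=complex)
radar[1500:1500 + pulse_len] = chirp

cleaned = blank_chirped_pulse(sky + radar, chirp, guard=16)
residual_fraction = np.sum(np.abs(cleaned - sky) ** 2) / np.sum(np.abs(radar) ** 2)
```

In this sketch a pulse occupying a quarter of the record is removed by blanking 33 samples, which illustrates why the data loss can be comparable to the short-pulse case. Because the filter has unit modulus, applying its conjugate restores the unblanked part of the data stream.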

Implement an active canceller for GPS L1, L2C and GLONASS signals: Signals from navigation satellites, such as GPS and GLONASS, occupy a much wider bandwidth than is required by the rate of information transmitted from satellite to navigation receiver. They use a technique called spread spectrum that provides a great deal of immunity to interference and, in the case of GPS, the ability to allow more than one satellite to use the same frequency band. Ellingson et al. (2001), using off-line software processing, have demonstrated that the GLONASS signal can be de-spread using the published spreading code and the resulting narrow bandwidth signal removed with a narrowband filter. Other information in the broader band of the original GLONASS signal, including radio astronomy data, was recovered by re-spreading the filtered signal with the original digital spreading function. This signal processing will be implemented in FPGA firmware as part of this research grant and provided to the science and engineering communities. Removal of GPS signals from radio astronomy data presents a greater challenge because, unlike GLONASS where each satellite transmits on a different frequency band, all GPS satellites use the same frequencies. For the GPS system to work, the spreading functions for the different satellites are quite different, so that each satellite signal can be de-spread, have its resulting narrow bandwidth signal removed, and re-spread in succession. We will conduct a research study using real data to determine how effective this combined RFI excision technique is. Our initial estimate is that the residual RFI will be of little consequence to pulsar timing.
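The de-spread, filter, and re-spread sequence can be sketched with a toy baseband model. The example assumes a real ±1 spreading code whose alignment with the data is already known, data bits whose boundaries coincide with the processing blocks, and a per-block mean subtraction standing in for the narrowband filter. The random code used here is a stand-in, not a real GLONASS or GPS code, and `despread_notch_respread` is a hypothetical name.

```python
import numpy as np

def despread_notch_respread(x, code, block):
    """Multiply by the spreading code, remove the narrowband (per-block mean)
    component, and multiply by the code again to restore the other signals."""
    n_blocks = len(x) // block
    used = n_blocks * block
    despread = (x[:used] * code[:used]).reshape(n_blocks, block)
    narrowband = despread.mean(axis=1, keepdims=True)  # satellite signal estimate
    cleaned = (despread - narrowband).reshape(-1)
    return cleaned * code[:used]                       # code squared is 1

rng = np.random.default_rng(4)
n, block = 16384, 512
code = rng.choice([-1.0, 1.0], size=n)                         # stand-in spreading code
bits = np.repeat(rng.choice([-1.0, 1.0], size=n // block), block)  # slow data bits
sky = rng.normal(size=n)                                       # astronomy signal plus noise
satellite = 2.0 * bits * code                                  # spread-spectrum interferer

cleaned = despread_notch_respread(sky + satellite, code, block)
residual_power = np.mean((cleaned - sky) ** 2)
input_power = np.mean(satellite ** 2)
```

After de-spreading, the satellite signal collapses to a slowly varying level while the astronomy signal is scrambled into noise, so removing the block mean costs only about one part in `block` of the astronomy signal power. For GPS, the same step would be repeated once per visible satellite with that satellite's code.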

Implement cancellation of TV signals below 700 MHz: UHF TV occupies the 470-806 MHz frequency band with each channel occupying 6 MHz. With the advent of digital TV each TV signal occupies its entire 6 MHz band with only small guard bands between channels. Unlike GLONASS and GPS signals, the modulation (spreading) function is not deterministic. It depends on the ever-changing picture and sound content, so de-spreading algorithms cannot be applied to these signals. To remove these signals from radio astronomy data we must use adaptive cancellation techniques where a relatively clean copy of each TV signal is acquired in a reference channel with a relatively high gain antenna pointing in the direction of arrival. This clean copy is then modified in phase and amplitude using LMS or a similar algorithm implemented in DSP to match the same signal in the radio astronomy telescope channel and subtracted. Since the relative phase and amplitude of the TV signal in the reference and telescope channels is unknown and continuously changing, the subtractive process must be adaptive with the signal processing objective of minimizing the TV signal in the radio astronomy channel at all times. This technique has been well studied in experimental settings by several of the co-PIs and collaborators, so the key tasks for this project will be to implement the technique in FPGA firmware, study the unique propagation conditions for signals arriving at the GBT, and optimize the adaptive algorithms accordingly.

4 Pulsar Searching and RFI Identification using Machine Learning

Finding more millisecond pulsars is the most important way to improve the sensitivity of pulsar timing arrays for gravitational waves. Therefore, pulsar searches directly support one of NASA’s key science priorities. In addition, as the Fermi mission continues, the need for radio identification will continue to be important and, as Fermi detects more distant objects, more sensitive radio surveys will be necessary. These are important for better understanding the pulsar population in the Galaxy, crucial for interpreting the Fermi discoveries. In particular, new pulsar discoveries will improve the distance model for pulsars in our galaxy. A refined distance model will enable more accurate distances to Fermi pulsars to be calculated, enabling better interpretation of their fluxes in the context of specific emission models.

All of the large-scale and targeted searches for pulsars currently being carried out are dominated by RFI and generate many spurious pulsar candidates. These candidates are typically excised through a laborious manual review, which is currently the most significant limitation on the speed of pulsar searches. The second thrust of the proposed project will automate this candidate review process using machine learning pattern recognition techniques. This promises to be a game-changing advance that significantly improves the speed and accuracy of pulsar surveys.

The astronomy community has standardized pulsar search software that first uses a set of statistical tests to clean the worst RFI out of the data. However, even with this initial RFI filtering step, the candidate lists from real-world pulsar searches are dominated by RFI. For example, the Pulsar Search Collaboratory (PSC, http://www.pulsarsearchcollaboratory.org/, Rosen et al., 2010) database contains approximately 2 million candidates, out of which ~100 are thought to be true pulsar signals and 700,000 have been labeled as RFI. These numbers are typical of all current radio pulsar searches. The pulsar candidate data volumes are so large that manually classifying RFI is impractical. While some research into automated methods to rank candidates and/or remove likely RFI has been done, the diversity of RFI and the high noise of these observations make it very difficult to develop reliable rules in advance. The human eye is currently needed to interpret the subtle patterns that distinguish RFI from a real pulsar. For a large pulsar survey, this requires many hours of work by trained observers; the relative slowness of manual candidate inspection can cause discoveries to lag behind the original observation by several years. Human errors and review fatigue often mean events are discovered in second or third inspection rounds taking place years after the initial survey results (e.g. Mickaliger et al., 2012).

Fortunately, modern pattern recognition methods may address this situation. Machine learning algorithms, especially neural networks, have shown initial promise for automating astronomical data processing (McCarty 2011), and in particular, pulsar search data processing (Eatough et al., 2010). These techniques use examples (i.e. training data) to build a statistical model of RFI and pulsar patterns and extrapolate this to classify new cases. The models can exploit arbitrary linear or nonlinear relationships, finding distinguishing numerical features and exploiting patterns that the human user need not notice or articulate. This holds particular promise for automating the subtle pattern recognition problems of RFI labeling. The system can still defer ambiguous events for human review to minimize the risk of missing any real pulsars.

Our research on automated RFI excision for pulsar searching will proceed in several stages. The first step will be to study and catalogue the different kinds of RFI in pulsar search output. A solid understanding of the RFI population will ensure that the classifier design incorporates all the relevant attributes and training data. A detailed characterization study will be performed on observations known to contain or be free of RFI. We will employ the best available visualization and data mining techniques to characterize these populations. Clustering and statistical analysis will be used to explore the data and identify feature sets. Some of the techniques we will use include k-means clustering, Bayesian Networks (ref), Principal Component Analysis (Abdi & Williams, 2010), and frequent pattern mining (Yang et al., 2003).
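As a rough illustration of the clustering step, the following sketch runs plain k-means (Lloyd's algorithm) on a handful of made-up two-dimensional feature vectors. The feature values, the use of two features, and the choice k = 2 are assumptions for illustration only; none of them is drawn from the PSC database or from the planned feature set.

```python
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: alternate nearest-centre assignment and centre update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)     # initialise centres from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Synthetic candidates: two hypothetical, already-normalised features each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

Cluster indices are arbitrary; what matters is which candidates end up grouped together. On real candidate data the groupings would then be inspected to see whether they correspond to recognisable RFI types.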

The next critical problem in any pattern recognition task is “feature selection,” i.e. designing a numerical representation of the candidate events that encodes enough information to discriminate RFI from pulsars (Keith et al. 2009). Each event has an associated “feature vector,” a list of values serving as input to the classifier. We will start with attributes found by the initial study to be statistically discriminative. We will also conduct interviews and discussions with pulsar domain experts, working closely to understand the visual and statistical patterns that they use to recognize RFI. This will undoubtedly indicate other informative features. For example, we can incorporate the output from the PRESTO pulsar visualization software, such as the result of standard statistical tests and filtering operations, and feed these directly into the classifier. In this way we can seamlessly incorporate such derived attributes, automating, and building from, the best existing expert knowledge. We are aware of similar efforts elsewhere (e.g. I. Stairs, University of British Columbia) and will coordinate closely with them.

After developing the software to record, extract, and catalogue the features of candidate events, we will train the pattern recognition engine. We initially favor a back-propagation neural network, a classification method inspired by biological neural networks. Neural networks in particular have the advantages of a high tolerance to noisy data and the ability to handle high-dimensional feature spaces. The neural network is a classification model composed of a set of input and output nodes in which each connection has an associated weight. These weights are adjusted during a training phase to predict the correct class label of a given input. The results can be as simple as a Boolean to identify the class label (e.g., whether or not the sample contained RFI). More complex outputs produce the probability of a sample belonging to a given class; this has the distinct advantage of making the output more interpretable and useful.
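A minimal sketch of such a network is given below, assuming one hidden tanh layer, a single sigmoid output interpreted as the probability that a candidate is RFI, and per-sample back-propagation on a cross-entropy loss. The class name `TinyNet`, the layer sizes, the learning rate, and the four toy training examples are all hypothetical; a real classifier would use the feature vectors described above and far more training data.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One hidden tanh layer and one sigmoid output giving P(RFI | features)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, prob

    def train_step(self, x, label, rate):
        hidden, prob = self.forward(x)
        delta_out = prob - label        # cross-entropy gradient at the output unit
        for j, h in enumerate(hidden):
            # Back-propagate through the (not yet updated) output weight.
            delta_hidden = delta_out * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= rate * delta_out * h
            self.b1[j] -= rate * delta_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= rate * delta_hidden * xi
        self.b2 -= rate * delta_out

def mean_loss(net, data):
    total = 0.0
    for x, label in data:
        p = net.forward(x)[1]
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(data)

# Toy labelled candidates: (features, 1 = RFI, 0 = pulsar). Values are made up.
data = [((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.1, 0.9), 0), ((0.2, 0.7), 0)]
net = TinyNet(n_in=2, n_hidden=3)
loss_before = mean_loss(net, data)
for _ in range(200):
    for x, label in data:
        net.train_step(x, label, rate=0.5)
loss_after = mean_loss(net, data)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Returning a probability rather than a hard label is what allows ambiguous candidates to be deferred for human review.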

A major strength of the proposed effort is the large PSC database of human-labeled training data already available for our use. Supervised classification methods such as neural networks require a training and validation phase where the system learns from data that has been manually labeled. The PSC database along with the associated raw observational data will be used to create training and testing data sets for our work. Once constructed, the classifier will be applied to testing data from the PSC database for validation. Finally, after validation, the neural network classification system will be used to identify and flag RFI in an offline data analysis pipeline.

By applying machine learning techniques to the problem we will at minimum reduce the volume of RFI candidates that must be inspected, leading to more efficient searches. Tuning the confidence level using decision theory will allow us to accomplish this without dismissing pulsar signals as RFI (false negatives). It may also be possible to improve the sensitivity of searches by more reliably removing RFI and allowing lower significance candidates to be considered.
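One simple way to express the confidence-level tuning is as an expected-cost rule. The sketch below assumes the classifier outputs a probability that a candidate is RFI, and it assigns a cost to wrongly discarding a real pulsar and a (much smaller) cost to sending a candidate for human review. Both cost values and the function names are hypothetical. A candidate is discarded only when the expected cost of discarding it, (1 - p) × cost of a missed pulsar, is below the review cost.

```python
def rfi_threshold(cost_missed_pulsar, cost_review):
    """Probability of RFI above which discarding is cheaper than reviewing."""
    return 1.0 - cost_review / cost_missed_pulsar

def triage(prob_rfi, threshold):
    """Discard confident RFI; defer everything else to a human reviewer."""
    return "discard" if prob_rfi > threshold else "review"

# Assumed costs: missing a pulsar is taken to be 1000 times worse than one review.
threshold = rfi_threshold(cost_missed_pulsar=1000.0, cost_review=1.0)
for p in (0.9995, 0.95, 0.40):
    print(p, triage(p, threshold))
```

Raising the assumed cost of a missed pulsar pushes the threshold closer to 1, so fewer candidates are discarded automatically and more are kept for inspection.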

While we plan to focus on neural network classifiers in this work, alternative classification methods exist that could also be suitable for this application. Most of our effort will go toward constructing the training and test data sets, building the software pipeline, and integrating it into existing data reduction pipelines. Our black box classification approach will allow us to substitute alternative classifiers easily as needed, making the RFI excision pipeline a testbed resource for future research efforts in this area.

5 Relevance to NASA and Jurisdiction

The proposed work, to develop algorithms and techniques for the mitigation of interfering signals, enables the detection of gravitational waves, tests of general relativity, and the study of the extreme states of matter. As such, the proposed work is responsive to Strategic Goal 2 of the 2011 NASA Strategic Plan, specifically addressing “Discover how the Universe works, explore how it began and evolved, ….” Within the 2010 Science Plan for the Science Mission Directorate, the proposed work is responsive to two aspects. Because merging galaxies should produce supermassive black hole binaries that then generate gravitational waves, and the matter within neutron stars is at extremely high densities, the proposed work addresses the Astrophysics Science Questions “How do matter, energy, space, and time behave under the extraordinarily diverse conditions of the cosmos?” and “How did the Universe . . . evolve to produce the galaxies, stars, and planets that we see today?” From these questions, the specific Astrophysics Science Objectives addressed are “Understand . . . the nature of black holes . . . and gravity” and “Understand the many phenomena and processes associated with galaxy, stellar, and planetary system . . . evolution from the earliest epochs to today.”

The proposed work is also directly responsive to Physics of the Cosmos (PCOS) program science objectives, namely to "Test the validity of Einstein's General Theory of Relativity and investigate the nature of spacetime," "Understand the formation and growth of massive black holes and their role in the evolution of galaxies," and "Explore the behavior of matter and energy in its most extreme environments."

Finally, with respect to the New Worlds, New Horizons Decadal Survey, the proposed work cuts across all of the major science themes—Origins, Understanding the Cosmic Order, Frontiers of Knowledge, and Discovery—addressing questions such as:

• How do cosmic structures form and evolve?
• What is the fossil record of galaxy assembly and evolution . . . ?
• How do . . . black holes form?
• How do black holes work and influence their surroundings?
• What controls the masses, spins and radii of compact stellar remnants?
• Gravitational wave astronomy

Nationally, there is much interest in research relating to access to the radio spectrum. Brigham Young University has a robust research program in RFI mitigation, and such programs are of direct benefit to NASA missions, but very little, if any, research is occurring in West Virginia. The proposed work will build research capacity at NRAO and West Virginia University in radio frequency technologies, digital signal processing, and reconfigurable computing. This research has broad applicability: it carries with it potential for technology development that will command a wider audience than radio astronomy. As pressure on the spectrum (commercial and other) increases, the need for active RFI mitigation will be critical for communication technologies, not just for radio astronomy.

NRAO, throughout its history, has sought to provide astronomers with RFI-free data, traditionally by prohibiting emissions around the Observatory (with radio quiet zones). With the proliferation of land- and space-based RF technologies, this is no longer sufficient. Astronomers are left to excise corrupting RFI from increasingly large data sets through means they develop individually. It is time to build research and development capacity in RFI mitigation and excision techniques and to convert experimental techniques to a robust common-user implementation.

This project also affords an opportunity for a wider collaboration that builds needed research capacity in a third EPSCoR jurisdiction—Puerto Rico. Astronomers using the Arecibo Telescope encounter similar RFI problems (as is the case at all radio observatories).

6 NASA Interactions

This work will add value to existing collaborations between the partner organizations and NASA. NRAO, WVU and JPL are all members of the North American Nanohertz Observatory for Gravitational Waves (NANOGrav, http://nanograv.org/), which will directly benefit from the results of this work.

JPL and NRAO already have an active collaboration in machine learning via the VLBA Fast Radio Transients Experiment (V-FASTR, ref). Research and development in neural networks will benefit this partnership which seeks to develop algorithms to detect transient signals in time series data sets; transient searches are uniquely sensitive to RFI. The current proposal will further strengthen our collaboration with NASA. The research team will be working closely with the following researchers from JPL:

• Dr Joseph Lazio, Scientist
• Dr Walid Majid, Scientist
• Dr Kiri Wagstaff, Machine Learning Researcher
• Dr David Thompson, Machine Learning Researcher

The primary technical contact throughout the project will be Dr. Lazio. Please note the support letter from Dr. Lazio, which is in the attached proposal documents. The above researchers and the appropriate members of our research team already hold bi-weekly telecons to discuss areas of technical overlap. These will continue throughout the proposal period. In addition, we have budgeted four person-weeks per year of travel from JPL to WV (NRAO or WVU); this will allow two face-to-face meetings per year.

Additionally, the proposed work will improve the data products of researchers who make joint observations using the Fermi telescope and the GBT. NRAO has a cooperative agreement with Fermi that commits observing time on NRAO telescopes for coordinated observations of Fermi sources, to be awarded on a competitive basis. The scientific programs supported within this agreement are those that are enhanced by the combination of Fermi observations with investigations using the radio facilities operated by NRAO (see http://fermi.gsfc.nasa.gov/ssc/proposals/nrao.html).

7 Project Partners, Management and Personnel

The proposed research is a collaboration between the National Radio Astronomy Observatory (NRAO) and West Virginia University, both in West Virginia, and Brigham Young University in Utah. WVU will be the lead, and BYU will be a sub-awardee; we assume that NRAO, as a Federal Agency, will be funded via inter-agency transfer.

The personnel for the project are as follows:

• Dr Majid Jaridi, PI, Professor, Director of the NASA WV EPSCoR, WVU
• Dr Paul Demorest, Co-I/Science PI, Assistant Scientist, NRAO
• Dr Richard Prestage, Co-I, Institutional-PI, NRAO
• Dr Duncan Lorimer, Co-I, Professor, WVU
• Dr Brian Jeffs, Co-I, Professor, BYU
• Mr Michael McCarty, Software Engineer, NRAO
• A Post-doctoral Fellow to be based at NRAO
• An Electronic Engineer to be based at NRAO
• Two PhD students (one each at WVU and BYU)
• 12 undergraduate summer interns (over the three-year period), NRAO

In addition, key collaborators include Dr Maura McLaughlin (WVU), Dr Karl Warnick (BYU), Drs Scott Ransom, Anish Roshi, Rick Fisher and Rich Bradley (NRAO), and the JPL machine learning group. As part of its federal funding, NRAO will provide support for Prestage, Demorest and McCarty. NRAO will also provide all of the support infrastructure necessary to accomplish the project; this includes access to telescopes, the telescope backends to which firmware upgrades will be made, the CASPER development support environment, and the necessary electronics equipment and facilities. The post-doc and electronics engineer will be new hires supported by this proposal.

The Principal Investigator will be Dr. Majid Jaridi, as director of the WV NASA EPSCoR program. Dr. Jaridi will be responsible for coordinating the preparation and submission of all reports. He will also assist in undergraduate student recruitment. Dr Richard Prestage will be the Project Manager, and have overall technical responsibility for the program. He will coordinate the technical work and manage the budget. He will also be responsible for ensuring good communication and coordination between the project partners. Drs Lorimer and Jeffs will provide guidance and supervision to the WVU and BYU graduate students. The program will be split into four areas, as follows:

• Characterization of the effects of RFI and the improvements provided by the various mitigation strategies: Demorest, WVU graduate student.
• Development of active cancellation strategies: Post-doc, BYU graduate student.
• Implementation of active cancellation algorithms in production backends: Electronics Engineer assisted by Demorest.
• Research and implementation of machine learning techniques: Demorest and McCarty in collaboration with JPL (Majid, Thompson).

8 Diversity and Education

A core part of NRAO’s mission is to mentor the next generation of STEM professionals, and the organization has an outstanding fifty-year track record in this regard. A number of undergraduate and graduate students have participated in research experiences at the NRAO. The Observatory uses a spiraling apprenticeship model in which students may participate in projects immediately, as novices, while building skills and knowledge that allow them to progress to higher levels within the project as they gain expertise. In this project we will develop a co-op program that will serve up to 12 undergraduate students over the 3-year grant period. Rather than a single-summer research experience, we aim to have students return for a second summer or school term. The returning students will mentor the new cadre of students. To encourage greater participation among underrepresented groups, we will partner with Bluefield State University (an historically black college). Six students will be funded by the NSF Research Experiences for Undergraduates program and by NRAO funds set aside for co-op programs.

Finally, we note that this project can potentially involve high-school students. Currently, more than 600 high-school students from around the U.S. are engaged in the NRAO/WVU Pulsar Search Collaboratory program, in which students search GBT data for new pulsars. Students can flag RFI in the data thus creating a training data set for neural network development. In addition, the RFI excision algorithms we develop will increase the sensitivity of the PSC searches, directly benefiting this very successful outreach program.

9 Tasks, Schedule and Evaluation.

Task 1: Characterize effects of RFI and improvements provided by mitigation strategies.
    Sub-Task 1A: Extend model of pulsar timing precision (Yr 1).
    Sub-Task 1B: Characterize effectiveness of mitigation strategies (Yr 2,3).

Task 2: Development of active cancellation strategies.
    Sub-Task 2A: Develop radar and DME blankers (Yr 1).
    Sub-Task 2B: Develop active canceller for GPS and GLONASS signals (Yr 1,2).
    Sub-Task 2C: Develop cancellation techniques for digital TV signals (Yr 2,3).

Task 3: Implement active cancellation strategies.
    Sub-Task 3A: Implement radar and DME blankers (Yr 1).
    Sub-Task 3B: Implement active canceller for GPS and GLONASS signals (Yr 2).
    Sub-Task 3C: Implement cancellation techniques for digital TV signals (Yr 3).

Task 4: Research and Implementation of machine learning techniques.
    Sub-Task 4A: Identify training and testing data sets (Yr 1).
        a. Decide appropriate stage of the data processing.
        b. Identify labeled training and testing data.
    Sub-Task 4B: Characterize RFI found in training and testing data sets (Yr 1).
        a. Using unsupervised clustering and statistical methods to identify features.
        b. Using frequent pattern mining of data known to be contaminated.
        c. By inspecting data with astronomers.
    Sub-Task 4C: Identify representative features found through the characterization study that are applicable to Neural Net classification (Yr 2).
    Sub-Task 4D: Implement preprocessing procedures to extract representative features from training, testing, and observational data for offline data processing (Yr 2).
    Sub-Task 4E: Implement Neural Network (Yr 3).
        a. Design initial network topology (architecture).
        b. Interface network with training and testing data sets.
        c. Training, testing, and evaluation iterations.
    Sub-Task 4F: Incorporate classification results back into original data sets (Yr 3).

We will evaluate the project through the following metrics:

• Did project participants accomplish project milestones on time and within budget?
    • Tracking and reporting of major tasks accomplished.
• Did the project effectively train a cadre of STEM professionals?
    • Number of undergraduate students mentored through the program.
    • Pre/post surveys and interviews with undergraduate students.
    • Number of publications with undergraduates as first and second authors.
• Did the project develop new research capacity that enables the jurisdiction to seek outside support?
    • Successful grant applications to NSF, or non-EPSCoR NASA CANs.
    • Sustained collaborations with West Virginia University and Bluefield State University after project ends.

10 Partnerships and Sustainability

The project will strengthen the ties between two of West Virginia’s premier research institutions: West Virginia University and the National Radio Astronomy Observatory in Green Bank, West Virginia. This proposal aims to partner students from WVU with post-doctoral researchers and engineers at NRAO to find novel ways of eliminating radio frequency interference from radio astronomy data. The techniques themselves, using neural networks and FPGA-based technologies to remove unwanted signals from datasets, are also new to West Virginia, opening a potential new area of research to the state. This builds our capacity in West Virginia to compete for research and development work within future NASA remote sensing missions, NSF programs such as the new program “Enhancing Access to the Radio Spectrum”, and commercial research and development contracts.

The proposed work will create, for the first time in West Virginia, a cadre of researchers, and science and engineering students who are focused on research and development in active RFI mitigation strategies.

NSF EARS, SAVI

11 Dissemination

The results of this work will have broad appeal beyond its implications for pulsar research and will be disseminated widely:

• The Berkeley “Collaboration for Astronomy Signal Processing and Electronics Research” (CASPER). Hundreds of STEM professionals from universities around the world meet annually to develop and learn new digital signal processing implementations. The CASPER group also maintains a development wiki for remote collaboration. We will present this work at the annual conference and also disseminate project-related documentation to the CASPER wiki. (See https://casper.berkeley.edu/wiki/Projects).
• Remote Sensing and NASA Deep Space Network. We will visit JPL to present the work at a colloquium and interact with potential users of the algorithms developed (for example the NASA Deep Space Network, future Earth remote sensing satellites).
• URSI meetings.
• Open Source. The implementations will be open source, and freely available to the scientific/engineering community.
• Publications. The results of this work will be published in the scientific literature, including astronomy journals and also those with broader audiences such as Radio Science and the IEEE Transactions.

12 Prior NASA Research Support

In the past five years, the following projects have been awarded to researchers from West Virginia, and managed by Dr. Majid Jaridi, WV NASA EPSCoR Director.

1. Molecular and Cellular Mechanisms Underlying Skeletal Muscle and Cardiovascular Adaptation to Simulated Microgravity; NASA Award Number NNX07AT54A; NASA Funding $749,521.
2. Design, Simulation Validation and Flight Testing of Adaptive Fault Tolerant Flight Control Systems; NASA Award Number NNX07AT53A; NASA Funding $750,000.
3. NASA EPSCoR Research Infrastructure Development; NASA Award Number NNX07AL53A; NASA Funding $250,000.
4. NASA EPSCoR Augmentation; NASA Award Number NNX07AL53A; NASA Funding $50,000.
5. Control of Steady and Unsteady Separation Through Dynamic Roughness; NASA Award Number NNX09AW07A; NASA Funding $750,000.
6. NASA EPSCoR Research Infrastructure Development; NASA Award Number NNX07AL53A; NASA Funding $250,000.
7. Remote Thermal Ion Measurements and Integrated Magnetospheric Modeling; NASA Award Number NNX10AN08A; NASA Funding $748,994.
8. Spray Cooling Heat Transfer Mechanisms; NASA Award Number NNX10AN04A; NASA Funding $750,000.
9. Coherent Terahertz Acoustic Phonons: A Novel Diagnostic for Erosion in Hall Thruster Discharge Chamber Walls; NASA Award Number NNX11AM04A; NASA Funding $748,685.
10. NASA EPSCoR Research Infrastructure Development Augmentation; NASA Award Number NNX07AL53A; NASA Funding $50,000.


13 References

Abdi, H. & Williams, L. J. 2010, WIREs Comp Stat, 2, 433-459, doi: 10.1002/wics.101
Abdo, A. et al. 2010, ApJS, 187, 460
Barnbaum, C. & Bradley, R. F. 1998, Astron. J., 116, 2598
Burgay, M. et al. 2003, Nature, 426, 531
Demorest, P. et al. 2010, Nature, 467, 1081
Demorest, P. et al. 2012, ApJ, in press, arXiv:1201.6641
Dong, W. et al. 2005, Radio Science, vol. 40, no. 5, RS5S04, doi:10.1029/2004RS003130
Eatough, R. et al. 2010, MNRAS, 407, 2443
Ellingson et al. 2001, ApJS, 135, 87
Fisher, J. 2004, NRAO Green Bank Electronics Division Internal Report No. 313. http://www.gb.nrao.edu/electronics/edir/edir313.pdf
Hessels, J. et al. 2006, Science, 311, 1901
Hobbs, G. et al. 2010, Classical and Quantum Gravity, vol. 27, no. 8, 084013
ITU-R RA.2126. http://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-RA.2126-2007-PDF-E.pdf
Jeffs, B. D. et al. 2005, IEEE Transactions on Signal Processing, vol. 53, no. 2, pp. 439-451
Keith, M. et al. 2009, MNRAS, 395, 837
Kesteven, M. et al. 2005, Radio Science, vol. 40, no. 5
Kesteven, M. et al. 2010, RFI mitigation workshop, Proceedings of Science
Kramer, M. et al. 2006, Science, 314, 97
Kramer, M. & Stairs, I. H. 2008, ARAA, 46, 541
Li, J. et al. 2012, ApJ, 746, 60
Lorimer, D. R. & Kramer, M. 2005, “Handbook of Pulsar Astronomy”, Cambridge University Press
Lyne, A. G. et al. 2004, Science, 303, 1153
McCarty, M. 2011. http://www.gb.nrao.edu/~mmccarty/ann_astronomy.pdf
McLaughlin, M. A. et al. 2004a, ApJ, 613, L57
McLaughlin, M. A. et al. 2004b, ApJ, 616, L131
Mickaliger, M. et al. 2012, ApJ, submitted, arXiv:1206.2895
Poulsen, A. 2003, BYU Masters Thesis. http://ras.groups.et.byu.net/docs/poulsen_thesis.pdf
Poulsen, A. J. et al. 2005, Astronomical Journal, vol. 130, no. 6, pp. 2916-2927
Radhakrishnan, V. & Cooke, D. J. 1969, ApJ, 3, 225
Rosen, R. et al. 2010, Astron. Education Review, 9, 010106
Roshi, A. 2002, NRAO Green Bank Electronics Division Technical Report No. 193. http://www.gb.nrao.edu/electronics/edtn/edtn193.pdf
Taylor, J. H. & Weisberg, J. M. 1989, ApJ, 345, 434
Yang, J. et al. 2003, IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 3, pp. 613-628, doi: 10.1109/TKDE.2003.1198394
